Volume , Issue , 2002, Pages 1611-1618

Nonparametric Representation of Policies and Value Functions: A Trajectory-Based Approach

Author keywords

[No Author keywords available]

Indexed keywords

REINFORCEMENT LEARNING;

EID: 85156195508     PISSN: None     EISSN: None     Source Type: Conference Proceeding
DOI: None     Document Type: Conference Paper
Times cited: 14

References (15)
  • 1
    • Richard S. Sutton. Integrated architectures for learning, planning and reacting based on approximating dynamic programming. In Proceedings of the 7th International Conference on Machine Learning, 1990.
  • 3
    • Christopher G. Atkeson. Using local trajectory optimizers to speed up global optimization in dynamic programming. In Jack D. Cowan, Gerald Tesauro, and Joshua Alspector, editors, Advances in Neural Information Processing Systems, volume 6, pages 663-670. Morgan Kaufmann Publishers, Inc., 1994.
  • 10
    • M. Garcia, A. Chatterjee, and A. Ruina. Efficiency, speed, and scaling of two-dimensional passive-dynamic walking. Dynamics and Stability of Systems, 15(2):75-99, 2000.
  • 12
    • J. Morimoto and K. Doya. Robust reinforcement learning. In Todd K. Leen, Thomas G. Dietterich, and Volker Tresp, editors, Advances in Neural Information Processing Systems 13, pages 1061-1067. MIT Press, Cambridge, MA, 2001.
  • 13
    • R. Neuneier and O. Mihatsch. Risk sensitive reinforcement learning. In M. S. Kearns, S. A. Solla, and D. A. Cohn, editors, Advances in Neural Information Processing Systems 11, pages 1031-1037. MIT Press, Cambridge, MA, USA, 1998.
  • 14
    • S. P. Coraluppi and S. I. Marcus. Risk-sensitive and minmax control of discrete-time finite-state Markov decision processes. Automatica, 35:301-309, 1999.
  • 15
    • J. Morimoto and C. Atkeson. Minimax differential dynamic programming: An application to robust biped walking. In Advances in Neural Information Processing Systems 15. MIT Press, Cambridge, MA, 2002.


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.