Volume, Issue, 2012, Pages

Reinforcement learning with guided policy search using Gaussian processes

Author keywords

[No Author keywords available]

Indexed keywords

CONTINUOUS STATE-ACTION SPACES; CONTROL POLICY; CONTROL TASK; ESTIMATED STATE; GAUSSIAN PROCESSES; GRADIENT BASED; GRADIENT ESTIMATES; ONLINE LEARNING; POLICY SEARCH; PROBABILISTIC MODELS; VALUE FUNCTION APPROXIMATION; VALUE FUNCTIONS;

EID: 84865063694     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/IJCNN.2012.6252509     Document Type: Conference Paper
Times cited: 4

References (23)
  • 4
    • Steven J. Bradtke and Andrew G. Barto. Linear least-squares algorithms for temporal difference learning. Machine Learning, 22(1-3):33-57, 1996.
  • 6
    • Lehel Csató and Manfred Opper. Sparse on-line Gaussian processes. Neural Computation, 14(3):641-669, 2002.
  • 8
    • Marc Peter Deisenroth, Carl Edward Rasmussen, and Jan Peters. Gaussian process dynamic programming. Neurocomputing, 72(7-9):1508-1524, 2009.
  • 9
    • Mohammad Ghavamzadeh and Yaakov Engel. Bayesian policy gradient algorithms. In B. Schölkopf, J. Platt, and T. Hoffman, editors, NIPS '07: Advances in Neural Information Processing Systems 19, pages 457-464, Cambridge, MA, 2007. MIT Press.
  • 12
    • Hunor Jakab and Lehel Csató. Improving Gaussian process value function approximation in policy gradient algorithms. In Timo Honkela, Włodzisław Duch, Mark Girolami, and Samuel Kaski, editors, Artificial Neural Networks and Machine Learning - ICANN 2011, volume 6792 of Lecture Notes in Computer Science, pages 221-228. Springer, 2011.
  • 13
    • Sham Kakade. A natural policy gradient. In Advances in Neural Information Processing Systems 14, volume 2, pages 1531-1538, Cambridge, MA, 2002. MIT Press.
  • 16
    • Radford M. Neal. MCMC using Hamiltonian dynamics. In Steve Brooks, Andrew Gelman, Galin Jones, and Xiao-Li Meng, editors, Handbook of Markov Chain Monte Carlo. Chapman & Hall/CRC Press, 2010.
  • 17
    • Jan Peters and Stefan Schaal. Reinforcement learning of motor skills with policy gradients. Neural Networks, 21(4):682-697, 2008.
  • 22
    • Richard S. Sutton, David A. McAllester, Satinder P. Singh, and Yishay Mansour. Policy gradient methods for reinforcement learning with function approximation. In Sara A. Solla, Todd K. Leen, and Klaus-Robert Müller, editors, NIPS '99: Advances in Neural Information Processing Systems, pages 1057-1063, 1999.
  • 23
    • Ronald J. Williams. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning, 8:229-256, 1992.


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.