1. András Antos, Rémi Munos, and Csaba Szepesvári. Fitted Q-iteration in continuous action-space MDPs. In John C. Platt, Daphne Koller, Yoram Singer, and Sam T. Roweis, editors, NIPS. MIT Press, 2007.
4. Steven J. Bradtke and Andrew G. Barto. Linear least-squares algorithms for temporal difference learning. Machine Learning, 22(1-3):33-57, 1996.
6. Lehel Csató and Manfred Opper. Sparse on-line Gaussian processes. Neural Computation, 14(3):641-669, 2002.
7. Marc P. Deisenroth and Carl E. Rasmussen. PILCO: A model-based and data-efficient approach to policy search. In L. Getoor and T. Scheffer, editors, Proceedings of the 28th International Conference on Machine Learning, Bellevue, WA, USA, June 2011.
8. Marc Peter Deisenroth, Carl Edward Rasmussen, and Jan Peters. Gaussian process dynamic programming. Neurocomputing, 72(7-9):1508-1524, 2009.
9. Mohammad Ghavamzadeh and Yaakov Engel. Bayesian policy gradient algorithms. In B. Schölkopf, J. Platt, and T. Hoffman, editors, NIPS '07: Advances in Neural Information Processing Systems 19, pages 457-464, Cambridge, MA, 2007. MIT Press.
10. G.S. Hornby, S. Takamura, J. Yokono, O. Hanagata, T. Yamamoto, and M. Fujita. Evolving robust gaits with AIBO. In IEEE International Conference on Robotics and Automation (ICRA 2000), pages 3040-3045, 2000.
12. Hunor Jakab and Lehel Csató. Improving Gaussian process value function approximation in policy gradient algorithms. In Timo Honkela, Włodzisław Duch, Mark Girolami, and Samuel Kaski, editors, Artificial Neural Networks and Machine Learning - ICANN 2011, volume 6792 of Lecture Notes in Computer Science, pages 221-228. Springer, 2011.
13. Sham Kakade. A natural policy gradient. In Advances in Neural Information Processing Systems 14, pages 1531-1538, Cambridge, MA, 2002. MIT Press.
16. Radford M. Neal. MCMC using Hamiltonian dynamics. In Steve Brooks, Andrew Gelman, Galin Jones, and Xiao-Li Meng, editors, Handbook of Markov Chain Monte Carlo. Chapman & Hall/CRC Press, 2010.
17. Jan Peters and Stefan Schaal. Reinforcement learning of motor skills with policy gradients. Neural Networks, 21(4):682-697, 2008.
22. Richard S. Sutton, David A. McAllester, Satinder P. Singh, and Yishay Mansour. Policy gradient methods for reinforcement learning with function approximation. In Sara A. Solla, Todd K. Leen, and Klaus-Robert Müller, editors, NIPS '99: Advances in Neural Information Processing Systems, pages 1057-1063, 1999.
23. Ronald J. Williams. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning, 8:229-256, 1992.