SCOPUS Information Retrieval Platform

Proceedings of the 2007 IEEE Symposium on Approximate Dynamic Programming and Reinforcement Learning, ADPRL 2007

Volume , Issue , 2007, Pages 254-261

Evaluation of policy gradient methods and variants on the cart-pole benchmark

(3) Riedmiller, Martin (a); Peters, Jan (b); Schaal, Stefan (b)

(a) UNIVERSITY OF OSNABRÜCK (Germany)

(b) UNIVERSITY OF SOUTHERN CALIFORNIA (United States)

Author keywords

[No Author keywords available]

Indexed keywords

ALGORITHMS; BENCHMARKING; COMPUTATIONAL METHODS; FINITE DIFFERENCE METHOD; MATHEMATICAL MODELS; OBJECT ORIENTED PROGRAMMING; OPTIMIZATION; PUBLIC POLICY;

CART-POLE BENCHMARK; FINITE DIFFERENCE GRADIENTS; POLICY SEARCH ALGORITHMS;

GRADIENT METHODS;

EID: 34548763245; PISSN: None; EISSN: None; Source Type: Conference Proceeding
DOI: 10.1109/ADPRL.2007.368196; Document Type: Conference Paper

Times cited : (57)

References (13)

1
- 0000396062
- Natural gradient works efficiently in learning
- S. Amari. Natural gradient works efficiently in learning. Neural Computation, 10, 1998.
- (1998) Neural Computation , vol.10
- Amari, S.¹

2
- 34250653774
- Learning CPG sensory feedback with policy gradient for biped locomotion for a full-body humanoid
- G. Endo, J. Morimoto, T. Matsubara, J. Nakanishi, and G. Cheng. Learning CPG sensory feedback with policy gradient for biped locomotion for a full-body humanoid. In AAAI 2005, 2005.
- (2005) AAAI 2005
- Endo, G.¹ Morimoto, J.² Matsubara, T.³ Nakanishi, J.⁴ Cheng, G.⁵

3
- 0012260296
- Feature article: Optimization for simulation: Theory vs. practice
- M. C. Fu. Feature article: Optimization for simulation: Theory vs. practice. INFORMS Journal on Computing, 14(3): 192-215, 2002.
- (2002) INFORMS Journal on Computing , vol.14 , Issue.3 , pp. 192-215
- Fu, M.C.¹

4
- 0028381374
- V. Gullapalli, J. Franklin, and H. Benbrahim. Acquiring robot skills via reinforcement learning. IEEE Control Systems, -(39), 1994.

5
- 84898930479
- Natural policy gradient
- S. A. Kakade. Natural policy gradient. Advances in Neural Information Processing Systems 14, 2002.
- (2002) Advances in Neural Information Processing Systems , vol.14
- Kakade, S.A.¹

6
- 3042534761
- Policy gradient reinforcement learning for fast quadrupedal locomotion
- N. Kohl and P. Stone. Policy gradient reinforcement learning for fast quadrupedal locomotion. In Proceedings of the IEEE International Conference on Robotics and Automation, New Orleans, LA, May 2004.
- (2004) Proceedings of the IEEE International Conference on Robotics and Automation
- Kohl, N.¹ Stone, P.²

7
- 0141819580
- PEGASUS: A policy search method for large MDPs and POMDPs
- A. Y. Ng and M. Jordan. PEGASUS: A policy search method for large MDPs and POMDPs. In Uncertainty in Artificial Intelligence, Proceedings of the Sixteenth Conference, 2000.
- (2000) Uncertainty in Artificial Intelligence, Proceedings of the Sixteenth Conference
- Ng, A.Y.¹ Jordan, M.²

8
- 34250635407
- Policy gradient methods for robotics
- J. Peters and S. Schaal. Policy gradient methods for robotics. In Proceedings of the IEEE International Conference on Intelligent Robotics Systems (IROS 2006), 2006.
- (2006) Proceedings of the IEEE International Conference on Intelligent Robotics Systems (IROS 2006)
- Peters, J.¹ Schaal, S.²

9
- 33646413135
- Natural actor-critic
- J. Peters, S. Vijayakumar, and S. Schaal. Natural actor-critic. In Proceedings of the 16th European Conference on Machine Learning (ECML 2005), pages 280-291. Springer, 2005.
- (2005) Proceedings of the 16th European Conference on Machine Learning (ECML 2005) , pp. 280-291
- Peters, J.¹ Vijayakumar, S.² Schaal, S.³

10
- 0028466750
- M. Riedmiller. Advanced supervised learning in multi-layer perceptrons -- from backpropagation to adaptive learning algorithms. Int. Journal of Computer Standards and Interfaces, 16:265-278, 1994. Special Issue on Neural Networks.

11
- 34548720281
- Clsquare - a closed loop simulation system
- M. Riedmiller, R. Hafner, S. Lange, and S. Timmer. Clsquare - a closed loop simulation system.

12
- 34548800689
- Reinforcement learning benchmarks and bake-offs i, ii
- M. Riedmiller, M. L. Littman, M. G. Lagoudakis, N. Vlassis, S. Whiteson, and A. White. Reinforcement learning benchmarks and bake-offs i, ii. Workshop at the 2005 Neural Information Processing Systems (NIPS) Conference, December 2005.
- (2005) Workshop at the 2005 Neural Information Processing Systems (NIPS) Conference
- Riedmiller, M.¹ Littman, M.L.² Lagoudakis, M.G.³ Vlassis, N.⁴ Whiteson, S.⁵ White, A.⁶

13
- 34548757155
- R. Tedrake, T. W. Zhang, and H. S. Seung. Learning to walk in 20 minutes. In Proceedings of the Fourteenth Yale Workshop on Adaptive and Learning Systems, Yale University, New Haven, CT, 2005.

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.