SCOPUS Information Search Platform

Advances in Neural Information Processing Systems

Volume , Issue , 2013, Pages

Variational policy search via trajectory optimization

(2) Levine, Sergey ᵃ; Koltun, Vladlen ᵃ

ᵃ STANFORD UNIVERSITY (United States)

Author keywords

[No Author keywords available]

Indexed keywords

AERODYNAMICS; ALGORITHMS; DYNAMICAL SYSTEMS; SPACE RESEARCH; TRAJECTORIES;

CONTROL POLICY; DIFFERENTIAL DYNAMIC PROGRAMMING; EXPLORATION STRATEGIES; HIGH-DIMENSIONAL; PARAMETER SPACES; POLICY OBJECTIVES; POLICY SEARCH; TRAJECTORY OPTIMIZATION;

OPTIMIZATION;

EID: 84898932265 PISSN: 10495258 EISSN: None Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (116)

References (23)

1
- 29344436709
- Policy search by dynamic programming
- A. Bagnell, S. Kakade, A. Ng, and J. Schneider. Policy search by dynamic programming. In Advances in Neural Information Processing Systems (NIPS), 2003.
- (2003) Advances in Neural Information Processing Systems (NIPS)
- Bagnell, A.¹ Kakade, S.² Ng, A.³ Schneider, J.⁴

2
- 80053441894
- PILCO: A model-based and data-efficient approach to policy search
- M. Deisenroth and C. Rasmussen. PILCO: a model-based and data-efficient approach to policy search. In International Conference on Machine Learning (ICML), 2011.
- (2011) International Conference on Machine Learning (ICML)
- Deisenroth, M.¹ Rasmussen, C.²

3
- 84862273812
- Variational methods for reinforcement learning
- T. Furmston and D. Barber. Variational methods for reinforcement learning. Journal of Machine Learning Research, 9:241-248, 2010.
- (2010) Journal of Machine Learning Research , vol.9 , pp. 241-248
- Furmston, T.¹ Barber, D.²

4
- 0004291983
- Differential Dynamic Programming
- D. Jacobson and D. Mayne. Differential Dynamic Programming. Elsevier, 1970.
- (1970) Differential Dynamic Programming
- Jacobson, D.¹ Mayne, D.²

5
- 0003195240
- A new extension of the Kalman filter to nonlinear systems
- S. Julier and J. Uhlmann. A new extension of the Kalman filter to nonlinear systems. In International Symposium on Aerospace/Defense Sensing, Simulation, and Control, 1997.
- (1997) International Symposium on Aerospace/Defense Sensing, Simulation, and Control
- Julier, S.¹ Uhlmann, J.²

6
- 1942514728
- Approximately optimal approximate reinforcement learning
- S. Kakade and J. Langford. Approximately optimal approximate reinforcement learning. In International Conference on Machine Learning (ICML), 2002.
- (2002) International Conference on Machine Learning (ICML)
- Kakade, S.¹ Langford, J.²

7
- 84871705710
- STOMP: Stochastic trajectory optimization for motion planning
- M. Kalakrishnan, S. Chitta, E. Theodorou, P. Pastor, and S. Schaal. STOMP: stochastic trajectory optimization for motion planning. In International Conference on Robotics and Automation, 2011.
- (2011) International Conference on Robotics and Automation
- Kalakrishnan, M.¹ Chitta, S.² Theodorou, E.³ Pastor, P.⁴ Schaal, S.⁵

8
- 85060321083
- Learning motor primitives for robotics
- J. Kober and J. Peters. Learning motor primitives for robotics. In International Conference on Robotics and Automation, 2009.
- (2009) International Conference on Robotics and Automation
- Kober, J.¹ Peters, J.²

9
- 70649111792
- Probabilistic Graphical Models: Principles and Techniques
- D. Koller and N. Friedman. Probabilistic Graphical Models: Principles and Techniques. MIT Press, 2009.
- (2009) Probabilistic Graphical Models: Principles and Techniques
- Koller, D.¹ Friedman, N.²

10
- 84897529781
- Guided policy search
- S. Levine and V. Koltun. Guided policy search. In International Conference on Machine Learning (ICML), 2013.
- (2013) International Conference on Machine Learning (ICML)
- Levine, S.¹ Koltun, V.²

11
- 80053459456
- Variational inference for policy search in changing situations
- G. Neumann. Variational inference for policy search in changing situations. In International Conference on Machine Learning (ICML), 2011.
- (2011) International Conference on Machine Learning (ICML)
- Neumann, G.¹

12
- 44949241322
- Reinforcement learning of motor skills with policy gradients
- J. Peters and S. Schaal. Reinforcement learning of motor skills with policy gradients. Neural Networks, 21(4):682-697, 2008.
- (2008) Neural Networks , vol.21 , Issue.4 , pp. 682-697
- Peters, J.¹ Schaal, S.²

13
- 84877282363
- On stochastic optimal control and reinforcement learning by approximate inference
- K. Rawlik, M. Toussaint, and S. Vijayakumar. On stochastic optimal control and reinforcement learning by approximate inference. In Robotics: Science and Systems, 2012.
- (2012) Robotics: Science and Systems
- Rawlik, K.¹ Toussaint, M.² Vijayakumar, S.³

14
- 84862273266
- A reduction of imitation learning and structured prediction to no-regret online learning
- S. Ross, G. Gordon, and A. Bagnell. A reduction of imitation learning and structured prediction to no-regret online learning. Journal of Machine Learning Research, 15:627-635, 2011.
- (2011) Journal of Machine Learning Research , vol.15 , pp. 627-635
- Ross, S.¹ Gordon, G.² Bagnell, A.³

15
- 84950871099
- Accurate approximations for posterior moments and marginal densities
- L. Tierney and J. B. Kadane. Accurate approximations for posterior moments and marginal densities. Journal of the American Statistical Association, 81(393):82-86, 1986.
- (1986) Journal of the American Statistical Association , vol.81 , Issue.393 , pp. 82-86
- Tierney, L.¹ Kadane, J.B.²

16
- 85162021468
- Policy gradients in linearly-solvable MDPs
- E. Todorov. Policy gradients in linearly-solvable MDPs. In Advances in Neural Information Processing Systems (NIPS 23), 2010.
- (2010) Advances in Neural Information Processing Systems (NIPS 23)
- Todorov, E.¹

17
- 23944452693
- A generalized iterative LQG method for locally-optimal feedback control of constrained nonlinear stochastic systems
- E. Todorov and W. Li. A generalized iterative LQG method for locally-optimal feedback control of constrained nonlinear stochastic systems. In American Control Conference, 2005.
- (2005) American Control Conference
- Todorov, E.¹ Li, W.²

18
- 67650502124
- Iterative local dynamic programming
- E. Todorov and Y. Tassa. Iterative local dynamic programming. In IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL), 2009.
- (2009) IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
- Todorov, E.¹ Tassa, Y.²

19
- 71149083296
- Robot trajectory optimization using approximate inference
- M. Toussaint. Robot trajectory optimization using approximate inference. In International Conference on Machine Learning (ICML), 2009.
- (2009) International Conference on Machine Learning (ICML)
- Toussaint, M.¹

20
- 67349102783
- Hierarchical POMDP controller optimization by likelihood maximization
- M. Toussaint, L. Charlin, and P. Poupart. Hierarchical POMDP controller optimization by likelihood maximization. In Uncertainty in Artificial Intelligence (UAI), 2008.
- (2008) Uncertainty in Artificial Intelligence (UAI)
- Toussaint, M.¹ Charlin, L.² Poupart, P.³

21
- 70349327392
- Learning model-free robot control by a Monte Carlo EM algorithm
- N. Vlassis, M. Toussaint, G. Kontes, and S. Piperidis. Learning model-free robot control by a Monte Carlo EM algorithm. Autonomous Robots, 27(2):123-130, 2009.
- (2009) Autonomous Robots , vol.27 , Issue.2 , pp. 123-130
- Vlassis, N.¹ Toussaint, M.² Kontes, G.³ Piperidis, S.⁴

22
- 34547691027
- SIMBICON: Simple biped locomotion control
- K. Yin, K. Loken, and M. van de Panne. SIMBICON: simple biped locomotion control. ACM Transactions on Graphics, 26(3), 2007.
- (2007) ACM Transactions on Graphics , vol.26 , Issue.3
- Yin, K.¹ Loken, K.² Van De Panne, M.³

23
- 84856113877
- Modeling purposeful adaptive behavior with the principle of maximum causal entropy
- B. Ziebart. Modeling purposeful adaptive behavior with the principle of maximum causal entropy. PhD thesis, Carnegie Mellon University, 2010.
- (2010) Modeling Purposeful Adaptive Behavior with the Principle of Maximum Causal Entropy
- Ziebart, B.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.