



Volume 8, 2013, Pages 353-360

On stochastic optimal control and reinforcement learning by approximate inference

Author keywords

[No Author keywords available]

EID: 84959314908     PISSN: None     EISSN: 2330-765X     Source Type: Conference Proceeding
DOI: 10.15607/rss.2012.viii.045     Document Type: Conference Paper
Times cited: 29

References (26)
  • 2. D. Braun, M. Howard, and S. Vijayakumar. Exploiting variable stiffness in explosive movement tasks. In R:SS, 2011.
  • 4. P. Dayan and G. E. Hinton. Using EM for reinforcement learning. Neural Computation, 9:271-278, 1997.
  • 5. A. M. Gheshlaghi et al. Dynamic policy programming with function approximation. In AISTATS, 2011.
  • 6. D. Mitrovic et al. Optimal feedback control for anthropomorphic manipulators. In ICRA, 2010.
  • 7. E. A. Theodorou et al. Learning policy improvements with path integrals. In AISTATS, 2010.
  • 9. A. Gunawardana and W. Byrne. Convergence theorems for generalized alternating minimization procedures. J. of Machine Learning Research, 6:2049-2073, 2005.
  • 12. J. Kivinen and M. Warmuth. Exponentiated gradient versus gradient descent for linear predictors. Information and Computation, 132:1-64, 1997.
  • 13. W. Li and E. Todorov. An iterative optimal control and estimation design for nonlinear stochastic system. In CDC, 2006.
  • 14. D. Mitrovic, S. Klanke, and S. Vijayakumar. Adaptive optimal control for redundantly actuated arms. In SAB, 2008.
  • 15. J. Nakanishi, K. Rawlik, and S. Vijayakumar. Stiffness and temporal optimization in periodic movements: An optimal control approach. In IROS, 2011.
  • 17. K. Rawlik, M. Toussaint, and S. Vijayakumar. An approximate inference approach to temporal optimization in optimal control. In NIPS, 2010.
  • 18. M. Riedmiller, J. Peters, and S. Schaal. Evaluation of policy gradient methods and variants on the cart-pole benchmark. In IEEE ADPRL, 2007.
  • 19. P. N. Sabes and M. I. Jordan. Reinforcement learning by probability matching. In NIPS, 1996.
  • 22. E. Todorov. Efficient computation of optimal actions. PNAS, 106:11478-11483, 2009.
  • 23. E. Todorov and M. Jordan. Optimal feedback control as a theory of motor coordination. Nature Neuroscience, 5:1226-1235, 2002.
  • 24. M. Toussaint. Robot trajectory optimization using approximate inference. In ICML, 2009.
  • 25. M. Toussaint and A. Storkey. Probabilistic inference for solving discrete and continuous state Markov decision processes. In ICML, 2006.


* This information was extracted by KISTI from analysis of Elsevier's SCOPUS database.