SCOPUS Information Search Platform - View Paper

Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010, NIPS 2010

Volume , Issue , 2010, Pages

Policy gradients in linearly-solvable MDPs

(1) Todorov, Emanuel a

a UNIVERSITY OF WASHINGTON (United States)

Author keywords

[No Author keywords available]

Indexed keywords

CONTINUOUS TIME SYSTEMS;

COMPATIBLE FUNCTIONS; CONTINUOUS TIME; COST TO GO; FUNCTION APPROXIMATORS; POLICY GRADIENT;

STOCHASTIC SYSTEMS;

EID: 85162021468 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited: 15

References (18)

1
- 0000396062
- Natural gradient works efficiently in learning
- S. Amari. Natural gradient works efficiently in learning. Neural Computation, 10:251-276, 1998.
- (1998) Neural Computation , vol.10 , pp. 251-276
- Amari, S.¹

2
- 84858765598
- Covariant policy search
- J. Bagnell and J. Schneider. Covariant policy search. In International Joint Conference on Artificial Intelligence, 2003.
- (2003) International Joint Conference on Artificial Intelligence
- Bagnell, J.¹ Schneider, J.²

3
- 0038595396
- Least-squares temporal difference learning
- J. Boyan. Least-squares temporal difference learning. In International Conference on Machine Learning, 1999.
- (1999) International Conference on Machine Learning
- Boyan, J.¹

4
- 0020203191
- Optimal control and nonlinear filtering for nondegenerate diffusion processes
- W. Fleming and S. Mitter. Optimal control and nonlinear filtering for nondegenerate diffusion processes. Stochastics, 8:226-261, 1982.
- (1982) Stochastics , vol.8 , pp. 226-261
- Fleming, W.¹ Mitter, S.²

5
- 84898930479
- A natural policy gradient
- S. Kakade. A natural policy gradient. In Advances in Neural Information Processing Systems, 2002.
- (2002) Advances in Neural Information Processing Systems
- Kakade, S.¹

6
- 23244466805
- PhD thesis, University College London
- S. Kakade. On the Sample Complexity of Reinforcement Learning. PhD thesis, University College London, 2003.
- (2003) On the Sample Complexity of Reinforcement Learning
- Kakade, S.¹

7
- 28844435646
- Linear theory for control of nonlinear stochastic systems
- H. Kappen. Linear theory for control of nonlinear stochastic systems. Physical Review Letters, 95, 2005.
- (2005) Physical Review Letters , vol.95
- Kappen, H.¹

8
- 84898938510
- Actor-critic algorithms
- V. Konda and J. Tsitsiklis. Actor-critic algorithms. SIAM Journal on Control and Optimization, pages 1008-1014, 2001.
- (2001) SIAM Journal on Control and Optimization , pp. 1008-1014
- Konda, V.¹ Tsitsiklis, J.²

9
- 33646399442
- Policy gradient in continuous time
- R. Munos. Policy gradient in continuous time. The Journal of Machine Learning Research, 7:771-791, 2006.
- (2006) The Journal of Machine Learning Research , vol.7 , pp. 771-791
- Munos, R.¹

10
- 0003722979
- Springer-Verlag, Berlin
- B. Oksendal. Stochastic Differential Equations (4th Ed). Springer-Verlag, Berlin, 1995.
- (1995) Stochastic Differential Equations (4th Ed)
- Oksendal, B.¹

11
- 40649106649
- Natural actor-critic
- J. Peters and S. Schaal. Natural actor-critic. Neurocomputing, 71:1180-1190, 2008.
- (2008) Neurocomputing , vol.71 , pp. 1180-1190
- Peters, J.¹ Schaal, S.²

12
- 85162068532
- M. Schmidt. minfunc. online material, 2005.
- (2005) Minfunc. Online Material
- Schmidt, M.¹

13
- 0004294973
- Dover, New York
- R. Stengel. Optimal Control and Estimation. Dover, New York, 1994.
- (1994) Optimal Control and Estimation
- Stengel, R.¹

14
- 84898939480
- Policy gradient methods for reinforcement learning with function approximation
- R. Sutton, D. McAllester, S. Singh, and Y. Mansour. Policy gradient methods for reinforcement learning with function approximation. In Advances in Neural Information Processing Systems, 2000.
- (2000) Advances in Neural Information Processing Systems
- Sutton, R.¹ McAllester, D.² Singh, S.³ Mansour, Y.⁴

15
- 84923382376
- Linearly-solvable Markov decision problems
- E. Todorov. Linearly-solvable Markov decision problems. Advances in Neural Information Processing Systems, 2006.
- (2006) Advances in Neural Information Processing Systems
- Todorov, E.¹

16
- 67650915125
- Efficient computation of optimal actions
- E. Todorov. Efficient computation of optimal actions. PNAS, 106:11478-11483, 2009.
- (2009) PNAS , vol.106 , pp. 11478-11483
- Todorov, E.¹

17
- 85162042971
- Eigen-function approximation methods for linearly-solvable optimal control problems
- E. Todorov. Eigen-function approximation methods for linearly-solvable optimal control problems. IEEE ADPRL, 2009.
- (2009) IEEE ADPRL
- Todorov, E.¹

18
- 0000337576
- Simple statistical gradient following algorithms for connectionist reinforcement learning
- R. Williams. Simple statistical gradient following algorithms for connectionist reinforcement learning. Machine Learning, pages 229-256, 1992.
- (1992) Machine Learning , pp. 229-256
- Williams, R.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.