Volume 7, 2006, Pages 771-791

Policy gradient in continuous time

Author keywords

Gradient estimate; Likelihood ratio method; Optimization; Pathwise derivation

Indexed keywords

APPROXIMATION THEORY; DECISION MAKING; OPTIMIZATION; PARAMETER ESTIMATION; PROBLEM SOLVING; SEARCH ENGINES;

EID: 33646399442     PISSN: 1533-7928     EISSN: 1533-7928     Source Type: Journal
DOI: None     Document Type: Article
Times cited: 75

References (19)
  • 2. A. Bensoussan. Perturbation Methods in Optimal Control. Wiley/Gauthier-Villars Series in Modern Applied Mathematics. John Wiley & Sons Ltd., Chichester, 1988. Translated from the French by C. Tomson.
  • 4. P. W. Glynn. Likelihood ratio gradient estimation: an overview. In A. Thesen, H. Grant, and W. D. Kelton, editors, Proceedings of the 1987 Winter Simulation Conference, pages 366-375, 1987.
  • 5. E. Gobet and R. Munos. Sensitivity analysis using Itô-Malliavin calculus and martingales, application to stochastic optimal control. SIAM Journal on Control and Optimization, 43(5):1676-1713, 2005.
  • 12. P. Marbach and J. N. Tsitsiklis. Approximate gradient methods in policy-space optimization of Markov reward processes. Journal of Discrete Event Dynamical Systems, 13:111-148, 2003.
  • 14. M. I. Reiman and A. Weiss. Sensitivity analysis via likelihood ratios. In J. Wilson, J. Henriksen, and S. Roberts, editors, Proceedings of the 1986 Winter Simulation Conference, pages 285-289, 1986.
  • 15. R. S. Sutton and A. G. Barto. Reinforcement Learning: An Introduction. Bradford Book, 1998.
  • 17. M. Talagrand. A new look at independence. Annals of Probability, 24:1-34, 1996.
  • 18. R. J. Williams. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning, 8:229-256, 1992.
  • 19. J. Yang and H. J. Kushner. A Monte Carlo method for sensitivity analysis and parametric optimization of nonlinear stochastic systems. SIAM Journal on Control and Optimization, 29(5):1216-1249, 1991.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.