SCOPUS Information Retrieval Platform

2010 International Congress on Ultra Modern Telecommunications and Control Systems and Workshops, ICUMT 2010

2010, Pages 450-457

Statistically linearized least-squares temporal differences

Authors (2): Geist, Matthieu (a); Pietquin, Olivier (a)

(a) UMI Georgia Tech CNRS 2958 (France)

Author keywords

Neural networks; Reinforcement learning; Statistical linearization; Value function approximation

Indexed keywords

ITERATIVE METHODS; LINEARIZATION; NEURAL NETWORKS; REINFORCEMENT LEARNING;

LEAST-SQUARES TEMPORAL DIFFERENCES; LINEAR PARAMETRIZATION; NONLINEAR PARAMETERIZATIONS; POLICY ITERATION; REAL-WORLD PROBLEM; STATISTICAL LINEARIZATION; VALUE FUNCTION APPROXIMATION; VALUE ITERATION;

LEARNING ALGORITHMS;

EID: 79951499926
PISSN: None
EISSN: None
Source Type: Conference Proceeding
DOI: 10.1109/ICUMT.2010.5676598
Document Type: Conference Paper

Times cited: 9

References (20)

1
- 0004102479
- 3rd ed. The MIT Press, March
- R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction (Adaptive Computation and Machine Learning), 3rd ed. The MIT Press, March 1998.
- (1998) Reinforcement Learning: An Introduction (Adaptive Computation and Machine Learning)
- Sutton, R.S.¹ Barto, A.G.²

2
- 0031143730
- An analysis of temporal-difference learning with function approximation
- J. N. Tsitsiklis and B. Van Roy, "An analysis of temporal-difference learning with function approximation," IEEE Transactions on Automatic Control, vol. 42, pp. 674-690, 1997.
- (1997) IEEE Transactions on Automatic Control , vol.42 , pp. 674-690
- Tsitsiklis, J.N.¹ Van Roy, B.²

3
- 0001771345
- Linear Least-Squares algorithms for temporal difference learning
- S. J. Bradtke and A. G. Barto, "Linear Least-Squares algorithms for temporal difference learning," Machine Learning, vol. 22, no. 1-3, pp. 33-57, 1996.
- (1996) Machine Learning , vol.22 , Issue.1-3 , pp. 33-57
- Bradtke, S.J.¹ Barto, A.G.²

4
- 79951481923
- Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation
- H. Maei, C. Szepesvári, S. Bhatnagar, D. Precup, D. Silver, and R. Sutton, "Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation," in Advances in Neural Information Processing Systems 22, 2009, pp. 1204-1212.
- (2009) Advances in Neural Information Processing Systems , vol.22 , pp. 1204-1212
- Maei, H.¹ Szepesvári, C.² Bhatnagar, S.³ Precup, D.⁴ Silver, D.⁵ Sutton, R.⁶

5
- 33646435300
- A Generalized Kalman Filter for Fixed Point Approximation and Efficient Temporal-Difference Learning
- D. Choi and B. Van Roy, "A Generalized Kalman Filter for Fixed Point Approximation and Efficient Temporal-Difference Learning," Discrete Event Dynamic Systems, vol. 16, pp. 207-239, 2006.
- (2006) Discrete Event Dynamic Systems , vol.16 , pp. 207-239
- Choi, D.¹ Van Roy, B.²

6
- 85151728371
- Residual Algorithms: Reinforcement Learning with Function Approximation
- L. C. Baird, "Residual Algorithms: Reinforcement Learning with Function Approximation," in Proceedings of the International Conference on Machine Learning (ICML 95), 1995, pp. 30-37.
- Proceedings of the International Conference on Machine Learning (ICML 95), 1995 , pp. 30-37
- Baird, L.C.¹

7
- 1942421151
- Bayes Meets Bellman: The Gaussian Process Approach to Temporal Difference Learning
- Y. Engel, S. Mannor, and R. Meir, "Bayes Meets Bellman: The Gaussian Process Approach to Temporal Difference Learning," in Proceedings of the International Conference on Machine Learning (ICML 03), 2003, pp. 154-161.
- Proceedings of the International Conference on Machine Learning (ICML 03), 2003 , pp. 154-161
- Engel, Y.¹ Mannor, S.² Meir, R.³

8
- 67650458797
- Kalman Temporal Differences: The deterministic case
- M. Geist, O. Pietquin, and G. Fricout, "Kalman Temporal Differences: the deterministic case," in IEEE International Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL 2009), Nashville, TN, USA, April 2009.
- IEEE International Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL 2009), Nashville, TN, USA, April 2009
- Geist, M.¹ Pietquin, O.² Fricout, G.³

9
- 31844451013
- Reinforcement Learning with Gaussian Processes
- Y. Engel, S. Mannor, and R. Meir, "Reinforcement Learning with Gaussian Processes," in Proceedings of International Conference on Machine Learning (ICML-05), 2005.
- Proceedings of International Conference on Machine Learning (ICML-05), 2005
- Engel, Y.¹ Mannor, S.² Meir, R.³

10
- 79951485912
- Eligibility Traces through Colored Noises
- M. Geist and O. Pietquin, "Eligibility Traces through Colored Noises," in International Conference on Ultra Modern Control Systems (ICUMT 2010 (Control Systems)), Moscow, Russia, October 2010.
- International Conference on Ultra Modern Control Systems (ICUMT 2010 (Control Systems)), Moscow, Russia, October 2010
- Geist, M.¹ Pietquin, O.²

11
- 0037288398
- Least Squares Policy Evaluation Algorithms with Linear Function Approximation
- A. Nedić and D. P. Bertsekas, "Least Squares Policy Evaluation Algorithms with Linear Function Approximation," Discrete Event Dynamic Systems: Theory and Applications, vol. 13, pp. 79-110, 2003.
- (2003) Discrete Event Dynamic Systems: Theory and Applications , vol.13 , pp. 79-110
- Nedić, A.¹ Bertsekas, D.P.²

12
- 4644323293
- Least-Squares Policy Iteration
- M. G. Lagoudakis and R. Parr, "Least-Squares Policy Iteration," Journal of Machine Learning Research, vol. 4, pp. 1107-1149, 2003.
- (2003) Journal of Machine Learning Research , vol.4 , pp. 1107-1149
- Lagoudakis, M.G.¹ Parr, R.²

13
- 34447553096
- Reinforcement Learning for Humanoid Robotics
- J. Peters, S. Vijayakumar, and S. Schaal, "Reinforcement Learning for Humanoid Robotics," in Third IEEE-RAS International Conference on Humanoid Robots (Humanoids 2003), 2003.
- Third IEEE-RAS International Conference on Humanoid Robots (Humanoids 2003), 2003
- Peters, J.¹ Vijayakumar, S.² Schaal, S.³

14
- 0003648234
- Wiley
- T. W. Anderson, An Introduction to Multivariate Statistical Analysis. Wiley, 1984.
- (1984) An Introduction to Multivariate Statistical Analysis
- Anderson, T.W.¹

15
- 21244437999
- Unscented filtering and nonlinear estimation
- S. J. Julier and J. K. Uhlmann, "Unscented filtering and nonlinear estimation," Proceedings of the IEEE, vol. 92, no. 3, pp. 401-422, 2004.
- (2004) Proceedings of the IEEE , vol.92 , Issue.3 , pp. 401-422
- Julier, S.J.¹ Uhlmann, J.K.²

16
- 0034326226
- New developments in state estimation for nonlinear systems
- P. Nørgård, N. Poulsen, and O. Ravn, "New developments in state estimation for nonlinear systems," Automatica, vol. 36, no. 11, pp. 1627-1638, 2000.
- (2000) Automatica , vol.36 , Issue.11 , pp. 1627-1638
- Nørgård, P.¹ Poulsen, N.² Ravn, O.³

17
- 78449267579
- Statistically Linearized Recursive Least Squares
- to appear
- M. Geist and O. Pietquin, "Statistically Linearized Recursive Least Squares," in Proceedings of the IEEE International Workshop on Machine Learning for Signal Processing (MLSP 2010), Kittilä (Finland), August-September 2010, 5 pages, to appear.
- Proceedings of the IEEE International Workshop on Machine Learning for Signal Processing (MLSP 2010), Kittilä (Finland), August-September 2010 , pp. 5
- Geist, M.¹ Pietquin, O.²

18
- 84966204836
- Methods for Modifying Matrix Factorization
- April
- P. E. Gill, G. H. Golub, W. Murray, and M. A. Saunders, "Methods for Modifying Matrix Factorization," Mathematics of Computation, vol. 28, no. 126, pp. 505-535, April 1974.
- (1974) Mathematics of Computation , vol.28 , Issue.126 , pp. 505-535
- Gill, P.E.¹ Golub, G.H.² Murray, W.³ Saunders, M.A.⁴

19
- 40849145988
- Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
- A. Antos, C. Szepesvári, and R. Munos, "Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path," Machine Learning, vol. 71, no. 1, pp. 89-129, 2008.
- (2008) Machine Learning , vol.71 , Issue.1 , pp. 89-129
- Antos, A.¹ Szepesvári, C.² Munos, R.³

20
- 33646398129
- Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method
- M. Riedmiller, "Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method," in European Conference on Machine Learning, 2005, pp. 317-328.
- European Conference on Machine Learning, 2005 , pp. 317-328
- Riedmiller, M.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.