SCOPUS Information Retrieval Platform

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Volume 6408 LNAI, 2010, Pages 207-218

Revisiting natural actor-critics with value function approximation

(2) Geist, Matthieu (a); Pietquin, Olivier (a)

(a) UMI Georgia Tech CNRS 2958 (France)

Author keywords

[No Author keywords available]

Indexed keywords

REINFORCEMENT LEARNING; DYNAMIC PROGRAMMING;

ACTOR-CRITIC ALGORITHM; ACTOR-CRITIC ARCHITECTURES; BELLMAN EQUATIONS; FUNCTION APPROXIMATION; LARGE-SCALE PROBLEM; RECENT RESEARCHES; VALUE FUNCTION APPROXIMATION; VALUE FUNCTIONS;

DYNAMIC PROGRAMMING; ARTIFICIAL INTELLIGENCE;

EID: 79956274048 PISSN: 03029743 EISSN: 16113349 Source Type: Book Series
DOI: 10.1007/978-3-642-16292-3_21 Document Type: Conference Paper

Times cited: (10)

References (16)

1
- 0004140522
- Barto, A.G., Sutton, R.S., Anderson, C.W.: Neuronlike adaptive elements that can solve difficult learning control problems, pp. 535-549 (1988)
- (1988) Neuronlike Adaptive Elements That Can Solve Difficult Learning Control Problems , pp. 535-549
- Barto, A.G.¹ Sutton, R.S.² Anderson, C.W.³

2
- 0004049893
- PhD thesis, Cambridge University, Cambridge, England
- Watkins, C.: Learning from Delayed Rewards. PhD thesis, Cambridge University, Cambridge, England (1989)
- (1989) Learning from Delayed Rewards
- Watkins, C.¹

3
- 84898939480
- Policy Gradient Methods for Reinforcement Learning with Function Approximation
- Sutton, R.S., McAllester, D.A., Singh, S.P., Mansour, Y.: Policy Gradient Methods for Reinforcement Learning with Function Approximation. In: Advances in Neural Information Processing Systems (NIPS 12), pp. 1057-1063 (2000)
- (2000) Advances in Neural Information Processing Systems (NIPS 12) , pp. 1057-1063
- Sutton, R.S.¹ McAllester, D.A.² Singh, S.P.³ Mansour, Y.⁴

4
- 84898938510
- Actor-Critic Algorithms
- Konda, V.R., Tsitsiklis, J.N.: Actor-Critic Algorithms. In: Advances in Neural Information Processing Systems, NIPS 12 (2000)
- (2000) Advances in Neural Information Processing Systems, NIPS 12
- Konda, V.R.¹ Tsitsiklis, J.N.²

5
- 34447553096
- Reinforcement Learning for Humanoid Robotics
- Peters, J., Vijayakumar, S., Schaal, S.: Reinforcement Learning for Humanoid Robotics. In: Third IEEE-RAS International Conference on Humanoid Robots, Humanoids 2003 (2003)
- Third IEEE-RAS International Conference on Humanoid Robots, Humanoids 2003 (2003)
- Peters, J.¹ Vijayakumar, S.² Schaal, S.³

6
- 4143121578
- Reinforcement Learning: An Introduction
- 3rd edn. The MIT Press, Cambridge
- Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. In: Adaptive Computation and Machine Learning, 3rd edn. The MIT Press, Cambridge (1998)
- (1998) Adaptive Computation and Machine Learning
- Sutton, R.S.¹ Barto, A.G.²

7
- 85162049326
- Incremental Natural Actor-Critic Algorithms
- Vancouver, Canada
- Bhatnagar, S., Sutton, R.S., Ghavamzadeh, M., Lee, M.: Incremental Natural Actor-Critic Algorithms. In: Advances in Neural Information Processing Systems (NIPS 21), Vancouver, Canada (2007)
- (2007) Advances in Neural Information Processing Systems (NIPS 21)
- Bhatnagar, S.¹ Sutton, R.S.² Ghavamzadeh, M.³ Lee, M.⁴

8
- 0000396062
- Natural gradient works efficiently in learning
- Amari, S.I.: Natural gradient works efficiently in learning. Neural Computation 10, 251-276 (1998)
- (1998) Neural Computation , vol.10 , pp. 251-276
- Amari, S.I.¹

9
- 84898930479
- A Natural Policy Gradient
- Kakade, S.: A Natural Policy Gradient. In: Advances in Neural Information Processing Systems (NIPS 14), pp. 1531-1538 (2002)
- (2002) Advances in Neural Information Processing Systems (NIPS 14) , pp. 1531-1538
- Kakade, S.¹

10
- 67650458797
- Kalman Temporal Differences: The deterministic case
- Geist, M., Pietquin, O., Fricout, G.: Kalman Temporal Differences: the deterministic case. In: Proceedings of the IEEE International Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL 2009), Nashville, TN, USA (2009)
- Proceedings of the IEEE International Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL 2009), Nashville, TN, USA (2009)
- Geist, M.¹ Pietquin, O.² Fricout, G.³

11
- 50849108789
- Utilizing the Natural Gradient in Temporal Difference Reinforcement Learning with Eligibility Traces
- Morimura, T., Uchibe, E., Doya, K.: Utilizing the Natural Gradient in Temporal Difference Reinforcement Learning with Eligibility Traces. In: 2nd International Symposium on Information Geometry and its Applications, Tokyo, Japan, pp. 256-263 (2005)
- (2005) 2nd International Symposium on Information Geometry and Its Applications, Tokyo, Japan , pp. 256-263
- Morimura, T.¹ Uchibe, E.² Doya, K.³

12
- 67650505326
- The QV Family Compared to Other Reinforcement Learning Algorithms
- Wiering, M., van Hasselt, H.: The QV Family Compared to Other Reinforcement Learning Algorithms. In: IEEE International Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL 2009), Nashville, TN, USA (2009)
- IEEE International Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL 2009), Nashville, TN, USA (2009)
- Wiering, M.¹ Van Hasselt, H.²

13
- 0001771345
- Linear Least-Squares algorithms for temporal difference learning
- Bradtke, S.J., Barto, A.G.: Linear Least-Squares algorithms for temporal difference learning. Machine Learning 22, 33-57 (1996)
- (1996) Machine Learning , vol.22 , pp. 33-57
- Bradtke, S.J.¹ Barto, A.G.²

14
- 76649127744
- Tracking in reinforcement learning
- Leung, C.S., Lee, M., Chan, J.H. (eds.) ICONIP 2009. Springer, Heidelberg
- Geist, M., Pietquin, O., Fricout, G.: Tracking in reinforcement learning. In: Leung, C.S., Lee, M., Chan, J.H. (eds.) ICONIP 2009. LNCS, vol. 5863, pp. 502-511. Springer, Heidelberg (2009)
- (2009) LNCS , vol.5863 , pp. 502-511
- Geist, M.¹ Pietquin, O.² Fricout, G.³

15
- 33646831159
- An RLS-Based Natural Actor-Critic Algorithm for Locomotion of a Two-Linked Robot Arm
- Hao, Y., Liu, J., Wang, Y.-P., Cheung, Y.-m., Yin, H., Jiao, L., Ma, J., Jiao, Y.-C. (eds.) CIS 2005. Springer, Heidelberg
- Park, J., Kim, J., Kang, D.: An RLS-Based Natural Actor-Critic Algorithm for Locomotion of a Two-Linked Robot Arm. In: Hao, Y., Liu, J., Wang, Y.-P., Cheung, Y.-m., Yin, H., Jiao, L., Ma, J., Jiao, Y.-C. (eds.) CIS 2005. LNCS (LNAI), vol. 3801, pp. 65-72. Springer, Heidelberg (2005)
- (2005) LNCS (LNAI) , vol.3801 , pp. 65-72
- Park, J.¹ Kim, J.² Kang, D.³

16
- 4644323293
- Least-Squares Policy Iteration
- Lagoudakis, M.G., Parr, R.: Least-Squares Policy Iteration. Journal of Machine Learning Research 4, 1107-1149 (2003)
- (2003) Journal of Machine Learning Research , vol.4 , pp. 1107-1149
- Lagoudakis, M.G.¹ Parr, R.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.