SCOPUS Information Retrieval Platform

Advances in Neural Information Processing Systems 20 - Proceedings of the 2007 Conference

Volume , Issue , 2008, Pages

Stable dual dynamic programming

(4) Wang, Tao a,b; Lizotte, Daniel a; Bowling, Michael a; Schuurmans, Dale a

a UNIVERSITY OF ALBERTA (Canada)

b AUSTRALIAN NATIONAL UNIVERSITY (Australia)

Author keywords

[No Author keywords available]

Indexed keywords

APPROXIMATION ALGORITHMS; REINFORCEMENT LEARNING;

CONVERGENCE PROPERTIES; DUAL ALGORITHM; DUAL DYNAMIC PROGRAMMING; EXPLICIT REPRESENTATION; FUNCTIONS APPROXIMATIONS; PROGRAMMING LEARNING; REINFORCEMENT LEARNINGS; SCALED-UP; STATIONARY DISTRIBUTION; VALUE FUNCTIONS;

DYNAMIC PROGRAMMING;

EID: 85161971158
PISSN: None
EISSN: None
Source Type: Conference Proceeding
DOI: None
Document Type: Conference Paper

Times cited : (9)

References (10)

1
- 0003998452
- Wiley
- M. Puterman. Markov Decision Processes: Discrete Dynamic Programming. Wiley, 1994.
- (1994) Markov Decision Processes: Discrete Dynamic Programming
- Puterman, M.¹

2
- 0003565783
- Athena Scientific
- D. Bertsekas. Dynamic Programming and Optimal Control, volume 2. Athena Scientific, 1995.
- (1995) Dynamic Programming and Optimal Control , vol.2
- Bertsekas, D.¹

3
- 0003487482
- Athena Scientific
- D. Bertsekas and J. Tsitsiklis. Neuro-Dynamic Programming. Athena Scientific, 1996.
- (1996) Neuro-Dynamic Programming
- Bertsekas, D.¹ Tsitsiklis, J.²

4
- 34548784027
- Dual representations for dynamic programming and reinforcement learning
- DOI 10.1109/ADPRL.2007.368168, 4220813, Proceedings of the 2007 IEEE Symposium on Approximate Dynamic Programming and Reinforcement Learning, ADPRL 2007
- T. Wang, M. Bowling, and D. Schuurmans. Dual representations for dynamic programming and reinforcement learning. In Proceeding of the IEEE International Symposium on ADPRL, pages 44-51, 2007. (Pubitemid 47431365)
- (2007) Proceedings of the 2007 IEEE Symposium on Approximate Dynamic Programming and Reinforcement Learning, ADPRL 2007 , pp. 44-51
- Wang, T.¹ Bowling, M.² Schuurmans, D.³

5
- 85151728371
- Residual algorithms: Reinforcement learning with function approximation
- L. C. Baird. Residual algorithms: Reinforcement learning with function approximation. In International Conference on Machine Learning, pages 30-37, 1995.
- (1995) International Conference on Machine Learning , pp. 30-37
- Baird, L.C.¹

6
- 0004102479
- MIT Press
- R. Sutton and A. Barto. Reinforcement Learning: An Introduction. MIT Press, 1998.
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.¹ Barto, A.²

7
- 0031143730
- An analysis of temporal-difference learning with function approximation
- PII S0018928697034375
- J. Tsitsiklis and B. Van Roy. An analysis of temporal-difference learning with function approximation. IEEE Trans. Automat. Control, 42(5):674-690, 1997. (Pubitemid 127760263)
- (1997) IEEE Transactions on Automatic Control , vol.42 , Issue.5 , pp. 674-690
- Tsitsiklis, J.N.¹ Van Roy, B.²

8
- 0034342516
- On the existence of fixed points for approximate value iteration and temporal-difference learning
- D. de Farias and B. Van Roy. On the existence of fixed points for approximate value iteration and temporal-difference learning. J. Optimization Theory and Applic., 105(3):589-608, 2000.
- (2000) J. Optimization Theory and Applic. , vol.105 , Issue.3 , pp. 589-608
- De Farias, D.¹ Van Roy, B.²

9
- 85153940465
- Generalization in reinforcement learning: Safely approximating the value function
- J. A. Boyan and A. W. Moore. Generalization in reinforcement learning: Safely approximating the value function. In NIPS 7, pages 369-376, 1995.
- (1995) NIPS , vol.7 , pp. 369-376
- Boyan, J.A.¹ Moore, A.W.²

10
- 85156221438
- Generalization in reinforcement learning: Successful examples using sparse coarse coding
- R. S. Sutton. Generalization in reinforcement learning: Successful examples using sparse coarse coding. In Advances in Neural Information Processing Systems, pages 1038-1044, 1996.
- (1996) Advances in Neural Information Processing Systems , pp. 1038-1044
- Sutton, R.S.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.