SCOPUS Information Search Platform

Volume , Issue , 2007, Pages 44-51

Dual representations for dynamic programming and reinforcement learning

Author keywords

[No Author keywords available]

Indexed keywords

ITERATIVE METHODS; LEARNING ALGORITHMS; PROBLEM SOLVING; REINFORCEMENT LEARNING; Q-LEARNING; VALUE FUNCTION ESTIMATION; VALUE FUNCTIONS; VALUE ITERATION; DYNAMIC PROGRAMMING

EID: 34548784027
PISSN: None
EISSN: None
Source Type: Conference Proceeding
DOI: 10.1109/ADPRL.2007.368168
Document Type: Conference Paper

Times cited: 44

References (10)

1
- 0004102479
- MIT Press
- R. Sutton and A. Barto, Reinforcement Learning: An Introduction. MIT Press, 1998.
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.¹ Barto, A.²

2
- 0003998452
- Wiley
- M. Puterman, Markov Decision Processes: Discrete Dynamic Programming. Wiley, 1994.
- (1994) Markov Decision Processes: Discrete Dynamic Programming
- Puterman, M.¹

3
- 0003565783
- Athena Scientific
- D. Bertsekas, Dynamic Programming and Optimal Control. Athena Scientific, 1995, vol. 2.
- (1995) Dynamic Programming and Optimal Control, vol. 2
- Bertsekas, D.¹

4
- 0003487482
- Athena Scientific
- D. Bertsekas and J. Tsitsiklis, Neuro-Dynamic Programming. Athena Scientific, 1996.
- (1996) Neuro-Dynamic Programming
- Bertsekas, D.¹ Tsitsiklis, J.²

5
- 0001158047
- Improving generalisation for temporal difference learning: The successor representation
- P. Dayan, "Improving generalisation for temporal difference learning: The successor representation," Neural Computation, vol. 5, pp. 613-624, 1993.
- (1993) Neural Computation, vol. 5, pp. 613-624
- Dayan, P.¹

6
- 34548779576
- Policy search via density estimation
- A. Ng, R. Parr, and D. Koller, "Policy search via density estimation," in Proceedings NIPS, 1999.
- (1999) Proceedings NIPS
- Ng, A.¹ Parr, R.² Koller, D.³

7
- 0012255582
- D. de Farias and B. Van Roy, "The linear programming approach to approximate dynamic programming," 2001.
- (2001) The linear programming approach to approximate dynamic programming
- de Farias, D.¹ Van Roy, B.²

8
- 0003897447
- 6th ed. Academic Press
- S. Ross, Introduction to Probability Models, 6th ed. Academic Press, 1997.
- (1997) Introduction to Probability Models
- Ross, S.¹

9
- 0031143730
- An analysis of temporal-difference learning with function approximation
- J. Tsitsiklis and B. Van Roy, "An analysis of temporal-difference learning with function approximation," IEEE Transactions on Automatic Control, vol. 42, no. 5, pp. 674-690, 1997.
- (1997) IEEE Transactions on Automatic Control, vol. 42, no. 5, pp. 674-690
- Tsitsiklis, J.¹ Van Roy, B.²

10
- 85151728371
- Residual algorithms: Reinforcement learning with function approximation
- L. Baird, "Residual algorithms: Reinforcement learning with function approximation," in Proceedings ICML, 1995.
- (1995) Proceedings ICML
- Baird, L.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.