SCOPUS Information Search Platform

Advances in Neural Information Processing Systems

Volume , Issue , 1997, Pages 1075-1081

Analysis of temporal-difference learning with function approximation

(2) Tsitsiklis, John N.ᵃ; Van Roy, Benjaminᵃ

ᵃ MASSACHUSETTS INSTITUTE OF TECHNOLOGY (United States)

Author keywords

[No Author keywords available]

Indexed keywords

APPROXIMATION ERRORS; FINITE STATE; FUNCTION APPROXIMATION; FUNCTION APPROXIMATORS; LINEAR FUNCTIONS; PARAMETER VECTORS; PARAMETERIZED; TEMPORAL DIFFERENCE LEARNING;

MARKOV PROCESSES;

EID: 84887003012 PISSN: 10495258 EISSN: None Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (261)

References (10)

1
- 85151728371
- Residual algorithms: Reinforcement learning with function approximation
- Prieditis & Russell, eds 9-12 July, Morgan Kaufmann Publishers, San Francisco, CA
- Baird, L. C. (1995). "Residual Algorithms: Reinforcement Learning with Function Approximation," in Prieditis & Russell, eds. Machine Learning: Proceedings of the Twelfth International Conference, 9-12 July, Morgan Kaufmann Publishers, San Francisco, CA.
- (1995) Machine Learning: Proceedings of the Twelfth International Conference
- Baird, L.C.¹

2
- 0000268954
- A counter-example to temporal-difference learning
- Bertsekas, D. P. (1994) "A Counter-Example to Temporal-Difference Learning," Neural Computation, vol. 7, pp. 270-279.
- (1994) Neural Computation , vol.7 , pp. 270-279
- Bertsekas, D.P.¹

3
- 0003565783
- Athena Scientific, Belmont, MA
- Bertsekas, D. P. (1995) Dynamic Programming and Optimal Control, Athena Scientific, Belmont, MA.
- (1995) Dynamic Programming and Optimal Control
- Bertsekas, D.P.¹

4
- 0003487482
- Athena Scientific, Belmont, MA
- Bertsekas, D. P. & Tsitsiklis, J. N. (1996) Neuro-Dynamic Programming, Athena Scientific, Belmont, MA.
- (1996) Neuro-Dynamic Programming
- Bertsekas, D.P.¹ Tsitsiklis, J.N.²

5
- 0003778897
- Springer-Verlag, Berlin
- Benveniste, A., Metivier, M., & Priouret, P., (1990) Adaptive Algorithms and Stochastic Approximations, Springer-Verlag, Berlin.
- (1990) Adaptive Algorithms and Stochastic Approximations
- Benveniste, A.¹ Metivier, M.² Priouret, P.³

6
- 84898963260
- preprint
- Dayan, P. D. & Singh, S. P. (1996) "Mean Squared Error Curves in Temporal Difference Learning," preprint.
- (1996) Mean Squared Error Curves in Temporal Difference Learning
- Dayan, P.D.¹ Singh, S.P.²

7
- 84899012767
- personal communication
- Gurvits, L. (1996) personal communication.
- (1996)
- Gurvits, L.¹

8
- 33847202724
- Learning to predict by the method of temporal differences
- Sutton, R. S., (1988) "Learning to Predict by the Method of Temporal Differences," Machine Learning, vol. 3, pp. 9-44.
- (1988) Machine Learning , vol.3 , pp. 9-44
- Sutton, R.S.¹

9
- 33746944751
- On the virtues of linear learning and trajectory distributions
- Boyan, Moore, and Sutton, Eds., Technical Report CMU-CS-95-206, Carnegie Mellon University, Pittsburgh, PA 15213
- Sutton, R.S. (1995) "On the Virtues of Linear Learning and Trajectory Distributions," Proceedings of the Workshop on Value Function Approximation, Machine Learning Conference 1995, Boyan, Moore, and Sutton, Eds., p. 85. Technical Report CMU-CS-95-206, Carnegie Mellon University, Pittsburgh, PA 15213.
- (1995) Proceedings of the Workshop on Value Function Approximation, Machine Learning Conference 1995 , pp. 85
- Sutton, R.S.¹

10
- 0008813539
- An analysis of temporal-difference learning with function approximation
- to appear in the
- Tsitsiklis, J. N. & Van Roy, B. (1996) "An Analysis of Temporal-Difference Learning with Function Approximation," to appear in the IEEE Transactions on Automatic Control.
- (1996) IEEE Transactions on Automatic Control
- Tsitsiklis, J.N.¹ Van Roy, B.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.