SCOPUS Information Retrieval Platform

ICML 2010 - Proceedings, 27th International Conference on Machine Learning

2010, Pages 1207-1214

Convergence of least squares temporal difference methods under general conditions

Author keywords

[No Author keywords available]

Indexed keywords

BOUNDEDNESS PROPERTIES; DISCOUNTED COST CRITERION; FINITE STATE; LEARNING CONTEXT; LEAST SQUARE; MARKOV CHAIN; MARKOV DECISION PROCESSES; POLICY EVALUATION; PRACTICAL IMPLEMENTATION; SIMULATION-BASED; TEMPORAL DIFFERENCES; TEMPORAL-DIFFERENCE ALGORITHM; TOPOLOGICAL SPACES;

CHAINS; LEARNING ALGORITHMS; LEARNING SYSTEMS; MARKOV PROCESSES; TOPOLOGY;

CONVERGENCE OF NUMERICAL METHODS;

EID: 77956517288
PISSN: None
EISSN: None
Source Type: Conference Proceeding
DOI: None
Document Type: Conference Paper

Times cited : (41)

References (21)

1
- 33744819512
- Adaptive importance sampling technique for Markov chains using stochastic approximation
- Ahamed, T. P., Borkar, V. S., and Juneja, S. Adaptive importance sampling technique for Markov chains using stochastic approximation. Operations Research, 54:489-504, 2006.
- (2006) Operations Research , vol.54 , pp. 489-504
- Ahamed, T.P.¹ Borkar, V.S.² Juneja, S.³

2
- 0003565783
- Athena Scientific, Belmont, MA, third edition
- Bertsekas, D. P. Dynamic Programming and Optimal Control, volume II. Athena Scientific, Belmont, MA, third edition, 2007.
- (2007) Dynamic programming and optimal control , vol.2
- Bertsekas, D.P.¹

3
- 77956540624
- Projected equations, variational inequalities, and temporal difference methods
- to appear
- Bertsekas, D. P. Projected equations, variational inequalities, and temporal difference methods. IEEE Trans. Automat. Contr., 2009. to appear.
- (2009) IEEE Trans. Automat. Contr.
- Bertsekas, D.P.¹

4
- 0003487482
- Athena Scientific, Belmont, MA
- Bertsekas, D. P. and Tsitsiklis, J. N. Neuro-Dynamic Programming. Athena Scientific, Belmont, MA, 1996.
- (1996) Neuro-Dynamic Programming
- Bertsekas, D.P.¹ Tsitsiklis, J.N.²

5
- 61849106433
- Projected equation methods for approximate solution of large linear systems
- Bertsekas, D. P. and Yu, H. Projected equation methods for approximate solution of large linear systems. J. Computational and Applied Mathematics, 227(1):27-50, 2009.
- (2009) J. Computational and Applied Mathematics , vol.227 , Issue.1 , pp. 27-50
- Bertsekas, D.P.¹ Yu, H.²

6
- 58849087743
- Hindustan Book Agency, New Delhi
- Borkar, V. S. Stochastic Approximation: A Dynamic Viewpoint. Hindustan Book Agency, New Delhi, 2008.
- (2008) Stochastic Approximation: A Dynamic Viewpoint
- Borkar, V.S.¹

7
- 0038595396
- Least-squares temporal difference learning
- Boyan, J. A. Least-squares temporal difference learning. In Proc. the 16th ICML, pp. 49-56, 1999.
- (1999) Proc. the 16th ICML , pp. 49-56
- Boyan, J.A.¹

8
- 0001771345
- Linear least-squares algorithms for temporal difference learning
- Bradtke, S. J. and Barto, A. G. Linear least-squares algorithms for temporal difference learning. Machine Learning, 22(2):33-57, 1996.
- (1996) Machine Learning , vol.22 , Issue.2 , pp. 33-57
- Bradtke, S.J.¹ Barto, A.G.²

9
- 32944469001
- Probability
- Philadelphia, PA
- Breiman, L. Probability. SIAM, Philadelphia, PA, 1992.
- (1992) SIAM
- Breiman, L.¹

10
- 0003954462
- John Wiley & Sons, New York
- Doob, J. L. Stochastic Processes. John Wiley & Sons, New York, 1953.
- (1953) Stochastic Processes
- Doob, J.L.¹

11
- 0001240715
- Importance sampling for stochastic simulations
- Glynn, P. W. and Iglehart, D. L. Importance sampling for stochastic simulations. Management Science, 35:1367-1392, 1989.
- (1989) Management Science , vol.35 , pp. 1367-1392
- Glynn, P.W.¹ Iglehart, D.L.²

12
- 9944258743
- Springer-Verlag, New York, 2nd edition
- Kushner, H. J. and Yin, G. G. Stochastic Approximation and Recursive Algorithms and Applications. Springer-Verlag, New York, 2nd edition, 2003.
- (2003) Stochastic Approximation and Recursive Algorithms and Applications
- Kushner, H.J.¹ Yin, G.G.²

13
- 84925067999
- Cambridge University Press, Cambridge, UK
- Meyn, S. Control Techniques for Complex Networks. Cambridge University Press, Cambridge, UK, 2007.
- (2007) Control Techniques for Complex Networks
- Meyn, S.¹

14
- 70350302258
- Cambridge University Press, Cambridge, UK, 2nd edition
- Meyn, S. and Tweedie, R. L. Markov Chains and Stochastic Stability. Cambridge University Press, Cambridge, UK, 2nd edition, 2009.
- (2009) Markov Chains and Stochastic Stability
- Meyn, S.¹ Tweedie, R.L.²

15
- 0037288398
- Least squares policy evaluation algorithms with linear function approximation
- Nedic, A. and Bertsekas, D. P. Least squares policy evaluation algorithms with linear function approximation. Discrete Event Dyn. Syst., 13:79-110, 2003.
- (2003) Discrete Event Dyn. Syst. , vol.13 , pp. 79-110
- Nedic, A.¹ Bertsekas, D.P.²

16
- 4644328593
- Off-policy temporal-difference learning with function approximation
- Precup, D., Sutton, R. S., and Dasgupta, S. Off-policy temporal-difference learning with function approximation. In Proc. the 18th ICML, pp. 417-424, 2001.
- (2001) Proc. the 18th ICML , pp. 417-424
- Precup, D.¹ Sutton, R.S.² Dasgupta, S.³

17
- 33847202724
- Learning to predict by the methods of temporal differences
- Sutton, R. S. Learning to predict by the methods of temporal differences. Machine Learning, 3:9-44, 1988.
- (1988) Machine Learning , vol.3 , pp. 9-44
- Sutton, R.S.¹

18
- 0004007508
- MIT Press, Cambridge, MA
- Sutton, R. S. and Barto, A. G. Reinforcement Learning. MIT Press, Cambridge, MA, 1998.
- (1998) Reinforcement Learning
- Sutton, R.S.¹ Barto, A.G.²

19
- 0031143730
- An analysis of temporal-difference learning with function approximation
- Tsitsiklis, J. N. and Van Roy, B. An analysis of temporal-difference learning with function approximation. IEEE Trans. Automat. Contr., 42(5):674-690, 1997.
- (1997) IEEE Trans. Automat. Contr. , vol.42 , Issue.5 , pp. 674-690
- Tsitsiklis, J.N.¹ Van Roy, B.²

20
- 56449123618
- Preconditioned temporal difference learning
- Yao, H. S. and Liu, Z. Q. Preconditioned temporal difference learning. In Proc. the 25th ICML, pp. 1208-1215, 2008.
- (2008) Proc. the 25th ICML , pp. 1208-1215
- Yao, H.S.¹ Liu, Z.Q.²

21
- 77956506470
- Convergence of least squares temporal difference methods under general conditions
- Yu, H. Convergence of least squares temporal difference methods under general conditions. Tech. Report C-2010-1, Dept. CS, Univ. of Helsinki, 2010.
- (2010) Tech. Report C-2010-1, Dept. CS, Univ. of Helsinki
- Yu, H.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.