SCOPUS Information Search Platform

International Journal of Innovative Computing, Information and Control

Volume 5, Issue 4, 2009, Pages 913-922

Convergence analysis on temporal difference learning

(3) Leng, Jinsong a Jain, Lakhmi a Fyfe, Colin b

a UNIVERSITY OF SOUTH AUSTRALIA (Australia)

b UNIVERSITY OF THE WEST OF SCOTLAND (United Kingdom)

Author keywords

Agent; Convergence analysis; Temporal difference learning

Indexed keywords

EID: 64349090183 PISSN: 13494198 EISSN: None Source Type: Journal
DOI: None Document Type: Article

Times cited : (3)

References (27)

1
- 0001700171
- A markovian decision process
- R. Bellman, A markovian decision process, Journal of Mathematics and Mechanics, vol.6, no.3, pp.679-693, 1957.
- (1957) Journal of Mathematics and Mechanics , vol.6 , Issue.3 , pp. 679-693
- Bellman, R.¹

2
- 84972263711
- Intelligent agents: Theory and practice
- M. Wooldridge and N. Jennings, Intelligent agents: Theory and practice, Knowledge Engineering Review, vol.10, no.2, pp.115-152, 1995.
- (1995) Knowledge Engineering Review , vol.10 , Issue.2 , pp. 115-152
- Wooldridge, M.¹ Jennings, N.²

3
- 0003787146
- Princeton University Press, Princeton, NJ
- R. Bellman, Dynamic Programming, Princeton University Press, Princeton, NJ, 1957.
- (1957) Dynamic Programming
- Bellman, R.¹

4
- 77956317471
- Heuristic
- Prentice-Hall, Englewood, NJ
- S. Russell and P. Norvig, Heuristic: Intelligent Search Strategies for Computer Problem Solving, A Modern Approach, Prentice-Hall, Englewood, NJ, 1995.
- (1995) Intelligent Search Strategies for Computer Problem Solving, A Modern Approach
- Russell, S.¹ Norvig, P.²

5
- 33847202724
- Learning to predict by the method of temporal differences
- R. S. Sutton, Learning to predict by the method of temporal differences, Machine Learning, vol.3, no.1, pp.9-44, 1988.
- (1988) Machine Learning , vol.3 , Issue.1 , pp. 9-44
- Sutton, R.S.¹

6
- 26944466214
- Function approximation via tile coding: Automating parameter choice
- Proc. of the SARA 2005, Berlin
- A. A. Sherstov and P. Stone, Function approximation via tile coding: Automating parameter choice, Proc. of the SARA 2005, Berlin, LNCS, no.3607, pp.194-205, 2005.
- (2005) LNCS , vol.3607 , pp. 194-205
- Sherstov, A.A.¹ Stone, P.²

7
- 84988783053
- Convergence of reinforcement learning algorithms and acceleration of learning
- A. Potapov and M. K. Ali, Convergence of reinforcement learning algorithms and acceleration of learning, Physical Review E, vol.67, no.2, 2003.
- (2003) Physical Review E , vol.67 , Issue.2
- Potapov, A.¹ Ali, M.K.²

8
- 64349089159
- Teambots, http://www.cs.cmu.edu/~trb/Teambots/Domains/SoccerBots, 2000.

9
- 49649148257
- A theory of cerebellar function
- J. S. Albus, A theory of cerebellar function, Mathematical Biosciences, vol.10, pp.25-61, 1971.
- (1971) Mathematical Biosciences , vol.10 , pp. 25-61
- Albus, J.S.¹

10
- 64349111008
- Ph.D. Thesis, Cambridge University, Cambridge, England
- C. J. C. H. Watkins, Learning from Delayed Rewards, Ph.D. Thesis, Cambridge University, Cambridge, England, 1989.
- (1989) Learning from Delayed Rewards
- Watkins, C.J.C.H.¹

11
- 0004102479
- MIT Press
- R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, MIT Press, 1998.
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.S.¹ Barto, A.G.²

12
- 0028388685
- TD(λ) converges with probability 1
- P. Dayan and T. J. Sejnowski, TD(λ) converges with probability 1, Machine Learning, vol.14, no.1, pp.295-301, 1994.
- (1994) Machine Learning , vol.14 , Issue.1 , pp. 295-301
- Dayan, P.¹ Sejnowski, T.J.²

13
- 0028497630
- Asynchronous stochastic approximation and Q-learning
- J. N. Tsitsiklis, Asynchronous stochastic approximation and Q-learning, Machine Learning, vol.16, no.1, pp.185-202, 1994.
- (1994) Machine Learning , vol.16 , Issue.1 , pp. 185-202
- Tsitsiklis, J.N.¹

14
- 0003786198
- Incremental Learning of Evaluation Functions for Absorbing Markov Chains: New Methods and Theorems
- Preprint
- L. Gurvits, L. J. Lin and S. J. Hanson, Incremental Learning of Evaluation Functions for Absorbing Markov Chains: New Methods and Theorems, Preprint, 1994.
- (1994)
- Gurvits, L.¹ Lin, L.J.² Hanson, S.J.³

15
- 0031143730
- An analysis of temporal-difference learning with function approximation
- J. N. Tsitsiklis and B. Van Roy, An analysis of temporal-difference learning with function approximation, IEEE Transactions on Automatic Control, vol.42, no.5, pp.674-690, 1997.
- (1997) IEEE Transactions on Automatic Control , vol.42 , Issue.5 , pp. 674-690
- Tsitsiklis, J.N.¹ Van Roy, B.²

16
- 38049144717
- Reinforcement learning of competitive skills with soccer agents
- Proc. of the 11th International Conference on Knowledge-Based and Intelligent Information and Engineering Systems
- J. Leng, C. Fyfe and L. Jain, Reinforcement learning of competitive skills with soccer agents, Proc. of the 11th International Conference on Knowledge-Based and Intelligent Information and Engineering Systems, LNAI 4692, pp.572-579, 2007.
- (2007) LNAI , vol.4692 , pp. 572-579
- Leng, J.¹ Fyfe, C.² Jain, L.³

17
- 0010495476
- On step-size and bias in temporal-difference learning
- Proc. of the Eighth Yale Workshop on Adaptive and Learning Systems, New Haven, CT
- R. S. Sutton and S. P. Singh, On step-size and bias in temporal-difference learning, Proc. of the Eighth Yale Workshop on Adaptive and Learning Systems, New Haven, CT, pp.91-96, 1994.
- (1994) Proc. of the Eighth Yale Workshop on Adaptive and Learning Systems , pp. 91-96
- Sutton, R.S.¹ Singh, S.P.²

18
- 84888630832
- Kluwer Academic Publishers
- A. Gosavi, Simulation-based Optimization: Parametric Optimization Techniques and Reinforcement Learning, Kluwer Academic Publishers, 2003.
- (2003) Simulation-based Optimization: Parametric Optimization Techniques and Reinforcement Learning
- Gosavi, A.¹

19
- 63649137867
- X. Cai, Z. Cui, J. Zeng and Y. Tan, Performance-dependent adaptive particle swarm optimization, International Journal of Innovative Computing, Information and Control, vol.3, no.6(B), pp.1697-1706, 2007.

20
- 48249095357
- S.-C. Chu and P.-W. Tsai, Computational intelligence based on the behavior of cats, International Journal of Innovative Computing, Information and Control, vol.3, no.1, pp.163-173, 2007.

21
- 5744249209
- Equation of state calculations by fast computing machines
- N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller and E. Teller, Equation of state calculations by fast computing machines, Journal of Chemical Physics, vol.21, pp.1087-1091, 1953.
- (1953) Journal of Chemical Physics , vol.21 , pp. 1087-1091
- Metropolis, N.¹ Rosenbluth, A.W.² Rosenbluth, M.N.³ Teller, A.H.⁴ Teller, E.⁵

22
- 0024900644
- Very fast simulated re-annealing
- L. Ingber, Very fast simulated re-annealing, Mathematical and Computer Modelling, vol.12, no.8, pp.967-973, 1989.
- (1989) Mathematical and Computer Modelling , vol.12 , Issue.8 , pp. 967-973
- Ingber, L.¹

23
- 38049168425
- A reinforcement learning method based on adaptive simulated annealing
- A. F. Atiya, A. G. Parlos and L. Ingber, A reinforcement learning method based on adaptive simulated annealing, Proc. of the 46th IEEE International Midwest Symposium on, vol.1, pp.121-124, 2003.
- (2003) Proc. of the 46th IEEE International Midwest Symposium on , vol.1 , pp. 121-124
- Atiya, A.F.¹ Parlos, A.G.² Ingber, L.³

24
- 64349111926
- Ph.D. Thesis, University of Miskolc, Hungary
- P. Stefan, Combined Use of Reinforcement Learning and Simulated Annealing: Algorithms and Applications, Ph.D. Thesis, University of Miskolc, Hungary, 2003.
- (2003) Combined Use of Reinforcement Learning and Simulated Annealing: Algorithms and Applications
- Stefan, P.¹

25
- 0002363078
- On the experimental attainment of optimum conditions (with discussion)
- G. E. P. Box and K. B. Wilson, On the experimental attainment of optimum conditions (with discussion), Journal of the Royal Statistical Society Series B, vol.13, no.1, pp.1-45, 1951.
- (1951) Journal of the Royal Statistical Society Series B , vol.13 , Issue.1 , pp. 1-45
- Box, G.E.P.¹ Wilson, K.B.²

26
- 0003737234
- John Wiley & Sons: New York
- G. A. F. Seber, Linear Regression Analysis, John Wiley & Sons: New York, 1977.
- (1977) Linear Regression Analysis
- Seber, G.A.F.¹

27
- 84869265512
- The MathWorks
- The MathWorks. http://www.mathworks.com.

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.