Volume 5, Issue 4, 2009, Pages 913-922

Convergence analysis on temporal difference learning

Author keywords

Agent; Convergence analysis; Temporal difference learning

Indexed keywords


EID: 64349090183     PISSN: 13494198     EISSN: None     Source Type: Journal    
DOI: None     Document Type: Article
Times cited : (3)

References (27)
  • 2
    • 84972263711
    • Intelligent agents: Theory and practice
    • M. Wooldridge and N. Jennings, Intelligent agents: Theory and practice, Knowledge Engineering Review, vol.10, no.2, pp.115-152, 1995.
    • (1995) Knowledge Engineering Review , vol.10 , Issue.2 , pp. 115-152
    • Wooldridge, M.1    Jennings, N.2
  • 3
    • 0003787146
    • Princeton University Press, Princeton, NJ
    • R. Bellman, Dynamic Programming, Princeton University Press, Princeton, NJ, 1957.
    • (1957) Dynamic Programming
    • Bellman, R.1
  • 5
    • 33847202724
    • Learning to predict by the method of temporal differences
    • R. S. Sutton, Learning to predict by the method of temporal differences, Machine Learning, vol.3, no.9, pp.9-44, 1988.
    • (1988) Machine Learning , vol.3 , Issue.9 , pp. 9-44
    • Sutton, R.S.1
  • 6
    • 26944466214
    • Function approximation via tile coding: Automating parameter choice
    • Proc. of the SARA 2005, Berlin
    • A. A. Sherstov and P. Stone, Function approximation via tile coding: Automating parameter choice, Proc. of the SARA 2005, Berlin, LNCS, no.3607, pp.194-205, 2005.
    • (2005) LNCS , vol.3607 , pp. 194-205
    • Sherstov, A.A.1    Stone, P.2
  • 7
    • 84988783053
    • Convergence of reinforcement learning algorithms and acceleration of learning
    • A. Potapov and M. K. Ali, Convergence of reinforcement learning algorithms and acceleration of learning, Physical Review E, vol.67, no.2, 2003.
    • (2003) Physical Review E , vol.67 , Issue.2
    • Potapov, A.1    Ali, M.K.2
  • 8
    • 64349089159
    • Teambots, http://www.cs.cmu.edu/~trb/Teambots/Domains/SoccerBots, 2000.
  • 9
    • 49649148257
    • A theory of cerebellar function
    • J. S. Albus, A theory of cerebellar function, Mathematical Biosciences, vol.10, pp.25-61, 1971.
    • (1971) Mathematical Biosciences , vol.10 , pp. 25-61
    • Albus, J.S.1
  • 10
    • 64349111008
    • Learning from Delayed Rewards
    • Ph.D. Thesis, Cambridge University, Cambridge, England
    • C. J. C. H. Watkins, Learning from Delayed Rewards, Ph.D. Thesis, Cambridge University, Cambridge, England, 1989.
    • (1989) Learning from Delayed Rewards
    • Watkins, C.J.C.H.1
  • 12
    • 0028388685
    • TD(λ) converges with probability 1
    • P. Dayan and T. J. Sejnowski, TD(λ) converges with probability 1, Machine Learning, vol.14, no.1, pp.295-301, 1994.
    • (1994) Machine Learning , vol.14 , Issue.1 , pp. 295-301
    • Dayan, P.1    Sejnowski, T.J.2
  • 13
    • 0028497630
    • Asynchronous stochastic approximation and Q-learning
    • J. N. Tsitsiklis, Asynchronous stochastic approximation and Q-learning, Machine Learning, vol.16, no.1, pp.185-202, 1994.
    • (1994) Machine Learning , vol.16 , Issue.1 , pp. 185-202
    • Tsitsiklis, J.N.1
  • 14
    • 0003786198
    • Incremental Learning of Evaluation Functions for Absorbing Markov Chains: New Methods and Theorems
    • Preprint
    • L. Gurvits, L. J. Lin and S. J. Hanson, Incremental Learning of Evaluation Functions for Absorbing Markov Chains: New Methods and Theorems, Preprint, 1994.
    • (1994)
    • Gurvits, L.1    Lin, L.J.2    Hanson, S.J.3
  • 15
    • 0031143730
    • An analysis of temporal-difference learning with function approximation
    • J. N. Tsitsiklis and B. Van Roy, An analysis of temporal-difference learning with function approximation, IEEE Transactions on Automatic Control, vol.42, no.5, pp.674-690, 1997.
    • (1997) IEEE Transactions on Automatic Control , vol.42 , Issue.5 , pp. 674-690
    • Tsitsiklis, J.N.1    Van Roy, B.2
  • 16
    • 38049144717
    • Reinforcement learning of competitive skills with soccer agents
    • Proc. of the 11th International Conference on Knowledge-Based and Intelligent Information and Engineering Systems
    • J. Leng, C. Fyfe and L. Jain, Reinforcement learning of competitive skills with soccer agents, Proc. of the 11th International Conference on Knowledge-Based and Intelligent Information and Engineering Systems, LNAI 4692, pp.572-579, 2007.
    • (2007) LNAI , vol.4692 , pp. 572-579
    • Leng, J.1    Fyfe, C.2    Jain, L.3
  • 19
    • 63649137867
    • Performance-dependent adaptive particle swarm optimization
    • X. Cai, Z. Cui, J. Zeng and Y. Tan, Performance-dependent adaptive particle swarm optimization, International Journal of Innovative Computing, Information and Control, vol.3, no.6(B), pp.1697-1706, 2007.
  • 20
    • 48249095357
    • Computational intelligence based on the behavior of cats
    • S.-C. Chu and P.-W. Tsai, Computational intelligence based on the behavior of cats, International Journal of Innovative Computing, Information and Control, vol.3, no.1, pp.163-173, 2007.
  • 22
    • 0024900644
    • Very fast simulated re-annealing
    • L. Ingber, Very fast simulated re-annealing, Mathematical and Computer Modelling, vol.12, no.8, pp.967-973, 1989.
    • (1989) Mathematical and Computer Modelling , vol.12 , Issue.8 , pp. 967-973
    • Ingber, L.1
  • 25
    • 0002363078
    • On the experimental attainment of optimum conditions (with discussion)
    • G. E. P. Box and K. B. Wilson, On the experimental attainment of optimum conditions (with discussion), Journal of the Royal Statistical Society Series B, vol.13, no.1, pp.1-45, 1951.
    • (1951) Journal of the Royal Statistical Society Series B , vol.13 , Issue.1 , pp. 1-45
    • Box, G.E.P.1    Wilson, K.B.2
  • 27
    • 84869265512
    • The MathWorks
    • The MathWorks. http://www.mathworks.com.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.