



Volume 2063, 2001, Pages 133-150

Chess neighborhoods, function combination, and reinforcement learning

Author keywords

Computer chess; Exponentiated gradient; Gradient descent; Linear regression; Multi layer neural nets; Reinforcement learning; Temporal difference learning; Value function approximation

Indexed keywords

COMPUTER GAMES; GRAPHIC METHODS; LEARNING SYSTEMS; LINEAR REGRESSION;

EID: 84898646291     PISSN: 03029743     EISSN: 16113349     Source Type: Book Series    
DOI: 10.1007/3-540-45579-5_9     Document Type: Conference Paper
Times cited: 6

References (29)
  • 1
    • 84958768821
    • New Advances in Adaptive Pattern-Oriented Chess
    • H.J. van den Herik and J.W.H. Uiterwijk (eds.), Universiteit Maastricht, The Netherlands
    • Allen, J., Hamilton, E., and Levinson, R. (1997). New Advances in Adaptive Pattern-Oriented Chess. In H.J. van den Herik and J.W.H. Uiterwijk (eds.), Advances in Computer Chess 8, pp. 312-233, Universiteit Maastricht, The Netherlands.
    • (1997) Advances in Computer Chess , vol.8 , pp. 312-233
    • Allen, J.1    Hamilton, E.2    Levinson, R.3
  • 4
    • 0003140349
    • Random Evaluation in Chess
    • Beal, D. F., & Smith, M.C. (1994). Random Evaluation in Chess. ICCA Journal, Vol. 17, No. 1, pp. 3-9 (A).
    • (1994) ICCA Journal , vol.17 , Issue.1 , pp. 3-9
    • Beal, D.F.1    Smith, M.C.2
  • 6
    • 56349117542
    • First results from using temporal difference learning in Shogi
    • H. J. van den Herik and H. Iida, editors, volume 1558 of Lecture Notes in Computer Science, Tsukuba, Japan, Springer-Verlag
    • Beal, D. F., & Smith, M.C. First results from using temporal difference learning in Shogi. In H. J. van den Herik and H. Iida, editors, Proceedings of the First International Conference on Computers and Games (CG-98), volume 1558 of Lecture Notes in Computer Science, page 114, Tsukuba, Japan, 1998. Springer-Verlag.
    • (1998) Proceedings of the First International Conference on Computers and Games (CG-98) , pp. 114
    • Beal, D.F.1    Smith, M.C.2
  • 8
    • 0001771345
    • Linear least-squares algorithms for temporal difference learning
    • Bradtke, S. J., and Barto, A. G. (1996). Linear least-squares algorithms for temporal difference learning. Machine Learning, 22, 33-57.
    • (1996) Machine Learning , vol.22 , pp. 33-57
    • Bradtke, S.J.1    Barto, A.G.2
  • 9
    • 85168770830
    • A unified theory of heuristic evaluation functions and its applications to learning
    • Christensen, J. and Korf, R. (1986). A unified theory of heuristic evaluation functions and its applications to learning. Proceedings of AAAI-86 (pp. 148-152).
    • (1986) Proceedings of AAAI-86 , pp. 148-152
    • Christensen, J.1    Korf, R.2
  • 10
    • 0007943864
    • Machine learning in computer chess: The next generation
    • September
    • Fürnkranz, J., Machine learning in computer chess: The next generation. International Computer Chess Association Journal, 19(3): 147-160, September (1996).
    • (1996) International Computer Chess Association Journal , vol.19 , Issue.3 , pp. 147-160
    • Fürnkranz, J.1
  • 11
    • 5844312285
    • Ph.D thesis. University of California, San Diego. San Diego, CA
    • Gherrity, M. A Game-Learning Machine. Ph.D thesis. University of California, San Diego. San Diego, CA. 1993.
    • (1993) A Game-Learning Machine
    • Gherrity, M.1
  • 14
    • 84969380755
    • Additive versus exponentiated gradient updates for linear prediction
    • Kivinen, J. and Warmuth, M. K. Additive versus exponentiated gradient updates for linear prediction. Information and Computation. Vol. 2, pp. 285-318, 1998.
    • (1998) Information and Computation , vol.2 , pp. 285-318
    • Kivinen, J.1    Warmuth, M.K.2
  • 21
    • 0001201756
    • Some studies in machine learning using the game of checkers
    • Samuel, A. (1959). Some studies in machine learning using the game of checkers. IBM J. of Research and Development, 3, 210-229.
    • (1959) IBM J. of Research and Development , vol.3 , pp. 210-229
    • Samuel, A.1
  • 23
    • 77956735234
    • A chess program that uses its transposition table to learn from experience
    • Slate, D.J., A chess program that uses its transposition table to learn from experience. International Computer Chess Association Journal 10(2): 59-71, 1987.
    • (1987) International Computer Chess Association Journal , vol.10 , Issue.2 , pp. 59-71
    • Slate, D.J.1
  • 24
    • 33847202724
    • Learning to predict by the methods of temporal differences
    • Sutton, R. S. (1988). Learning to predict by the methods of temporal differences. Machine Learning, 3, 9-44.
    • (1988) Machine Learning , vol.3 , pp. 9-44
    • Sutton, R.S.1
  • 26
    • 0029276036
    • Temporal Difference Learning and TD-Gammon
    • Tesauro, G. Temporal Difference Learning and TD-Gammon. Communications of the ACM, Vol 38, No 3, March 1995.
    • (1995) Communications of the ACM , vol.38 , Issue.3
    • Tesauro, G.1
  • 27
    • 0001046225
    • Practical Issues in Temporal Difference Learning
    • Tesauro, G. Practical Issues in Temporal Difference Learning. Machine Learning, 8: 257-278, 1992.
    • (1992) Machine Learning , vol.8 , pp. 257-278
    • Tesauro, G.1
  • 28
    • 0003215153
    • Learning to Play the Game of Chess
    • G. Tesauro, D. Touretzky, and T. Leen (eds.), MIT Press
    • Thrun, S., 1995. Learning to Play the Game of Chess. In Advances in Neural Information Processing Systems (NIPS) 7, G. Tesauro, D. Touretzky, and T. Leen (eds.), MIT Press.
    • (1995) Advances in Neural Information Processing Systems (NIPS) 7
    • Thrun, S.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.