Volume 8, Issue 3, 1992, Pages 257-277

Practical Issues in Temporal Difference Learning

Author keywords

backgammon; connectionist methods; feature discovery; games; neural networks; temporal difference learning

EID: 0001046225     PISSN: 08856125     EISSN: 15730565     Source Type: Journal    
DOI: 10.1023/A:1022624705476     Document Type: Article
Times cited: 476

References (29)
  • 1
    • 85025877891
    • Anderson, C.W. (1987). Strategy learning with multilayer connectionist representations. Proceedings of the Fourth International Workshop on Machine Learning (pp. 103–114).
  • 3
    • 85025864128
    • Berliner, H. (1977). Experiences in evaluation with BKG—a program that plays backgammon. Proceedings of IJCAI (pp. 428–433).
  • 4
    • 85025863536
    • Berliner, H. (1979). On the construction of evaluation functions for large domains. Proceedings of IJCAI (pp. 53–55).
  • 6
    • 84951519316
    • Christensen, J. & Korf, R. (1986). A unified theory of heuristic evaluation functions and its application to learning. Proceedings of AAAI-86 (pp. 148–152).
  • 15
    • 84951519317
    • Mitchell, D.H. (1984). Using features to evaluate positions in experts' and novices' Othello games. Master's Thesis, Northwestern Univ., Evanston, IL.
  • 16
    • 0001857179
    • Quinlan, J.R. (1983). Learning efficient classification procedures and their application to chess end games. In R.S. Michalski, J.G. Carbonell, & T.M. Mitchell (Eds.), Machine learning. Palo Alto, CA: Tioga.
  • 19
    • 84951519318
    • Samuel, A. (1959). Some studies in machine learning using the game of checkers. IBM J. of Research and Development, 3, 210–229.
  • 20
    • 84951519319
    • Samuel, A. (1967). Some studies in machine learning using the game of checkers, II—recent progress. IBM J. of Research and Development, 11, 601–617.
  • 21
    • 84951519320
    • Sutton, R.S. (1984). Temporal credit assignment in reinforcement learning. Doctoral Dissertation, Dept. of Computer and Information Science, Univ. of Massachusetts, Amherst.
  • 24
    • 84951519321
    • Tesauro, G. (1989). Connectionist learning of expert preferences by comparison training. In D. Touretzky (Ed.), Advances in neural information processing, 1, 99–106.
  • 26
    • 84951519322
    • Utgoff, P.E. & Clouse, J.A. (1991). Two kinds of training information for evaluation function training. To appear in: Proceedings of AAAI-91.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.