SCOPUS Information Search Platform - View Article

Machine Learning

Volume 8, Issue 3, 1992, Pages 257-277

Practical Issues in Temporal Difference Learning

(1) Tesauro, Gerald a

a IBM T. J. Watson Research Center (United States)

Author keywords

backgammon; connectionist methods; feature discovery; games; neural networks; temporal difference learning

EID: 0001046225 PISSN: 08856125 EISSN: 15730565 Source Type: Journal
DOI: 10.1023/A:1022624705476 Document Type: Article

Times cited: 476

References (29)

1
- 85025877891
- Anderson, C.W. (1987). Strategy learning with multilayer connectionist representations. Proceedings of the Fourth International Workshop on Machine Learning (pp. 103–114).

2
- 0020970738
- Neuronlike adaptive elements that can solve difficult learning control problems
- (1983) IEEE Transactions on Systems, Man and Cybernetics , vol.13 , pp. 835-846
- Barto, A.G.¹ Sutton, R.S.² Anderson, C.W.³

3
- 85025864128
- Berliner, H. (1977). Experiences in evaluation with BKG—a program that plays backgammon. Proceedings of IJCAI (pp. 428–433).

4
- 85025863536
- Berliner, H. (1979). On the construction of evaluation functions for large domains. Proceedings of IJCAI (pp. 53–55).

5
- 0024750852
- Learnability and the Vapnik-Chervonenkis dimension
- (1989) JACM , vol.36 , pp. 929-965
- Blumer, A.¹ Ehrenfeucht, A.² Haussler, D.³ Warmuth, M.⁴

6
- 84951519316
- Christensen, J. & Korf, R. (1986). A unified theory of heuristic evaluation functions and its application to learning. Proceedings of AAAI-86 (pp. 148–152).

7
- 0000430514
- The convergence of TD(λ)
- (1992) Machine Learning , vol.8 , pp. 341-362
- Dayan, P.¹

8
- 84913371210
- Algorithmic strategies for improving the performance of game playing programs
- D. Farmer et al. (Eds.), North Holland, Amsterdam
- (1986) Evolution, games and learning
- Frey, P.W.¹

9
- 0016071909
- A comparison and evaluation of three machine learning procedures as applied to the game of checkers
- (1974) Artificial Intelligence , vol.5 , pp. 137-148
- Griffith, A.K.¹

10
- 84934665895
- Escaping brittleness: The possibilities of general-purpose learning algorithms applied to parallel rule-based systems
- R.S. Michalski, J.G. Carbonell, T.M. Mitchell (Eds.), Morgan Kaufmann, Los Altos, CA
- (1986) Machine learning: An artificial intelligence approach
- Holland, J.H.¹

11
- 0024880831
- Multilayer feedforward networks are universal approximators
- (1989) Neural Networks , vol.2 , pp. 359-366
- Hornik, K.¹ Stinchcombe, M.² White, H.³

12
- 0024064183
- A pattern classification approach to evaluation function learning
- (1988) Artificial Intelligence , vol.36 , pp. 1-25
- Lee, K.-F.¹ Mahajan, S.²

13
- 0007994760
- Times Books, New York
- (1976) Backgammon
- Magriel, P.¹

14
- 0004290881
- MIT Press, Cambridge, MA
- (1969) Perceptrons
- Minsky, M.L.¹ Papert, S.A.²

15
- 84951519317
- Mitchell, D.H. (1984). Using features to evaluate positions in experts' and novices' Othello games. Master's Thesis, Northwestern Univ., Evanston, IL.

16
- 0001857179
- Learning efficient classification procedures and their application to chess end games
- R.S. Michalski, J.G. Carbonell, T.M. Mitchell (Eds.), Tioga, Palo Alto, CA
- (1983) Machine learning
- Quinlan, J.R.¹

17
- 0000016172
- A stochastic approximation method
- (1951) Annals of Mathematical Statistics , vol.22 , pp. 400-407
- Robbins, H.¹ Monro, S.²

18
- 0000646059
- Learning internal representations by error propagation
- D. Rumelhart, J. McClelland (Eds.), MIT Press, Cambridge, MA
- (1986) Parallel distributed processing
- Rumelhart, D.E.¹ Hinton, G.E.² Williams, R.J.³

19
- 84951519318
- Samuel, A. (1959). Some studies in machine learning using the game of checkers. IBM J. of Research and Development, 3, 210–229.

20
- 84951519319
- Samuel, A. (1967). Some studies in machine learning using the game of checkers, II—recent progress. IBM J. of Research and Development, 11, 601–617.

21
- 84951519320
- Sutton, R.S. (1984). Temporal credit assignment in reinforcement learning. Doctoral Dissertation, Dept. of Computer and Information Science, Univ. of Massachusetts, Amherst.

22
- 33847202724
- Learning to predict by the methods of temporal differences
- (1988) Machine Learning , vol.3 , pp. 9-44
- Sutton, R.S.¹

23
- 0024702037
- A parallel network that learns to play backgammon
- (1989) Artificial Intelligence , vol.39 , pp. 357-390
- Tesauro, G.¹ Sejnowski, T.J.²

24
- 84951519321
- Tesauro, G. (1989). Connectionist learning of expert preferences by comparison training. In D. Touretzky (Ed.), Advances in neural information processing, 1, 99–106.

25
- 0025559238
- Neurogammon: a neural network backgammon program
- (1990) IJCNN Proceedings , vol.3 , pp. 33-39
- Tesauro, G.¹

26
- 84951519322
- Utgoff, P.E. & Clouse, J.A. (1991). Two kinds of training information for evaluation function training. To appear in: Proceedings of AAAI-91.

27
- 0001024505
- On the uniform convergence of relative frequencies of events to their probabilities
- (1971) Theory Prob. Appl. , vol.16 , pp. 264-280
- Vapnik, V.N.¹ Chervonenkis, A.Ya.²

28
- 0016987049
- Stationary and nonstationary learning characteristics of the LMS adaptive filter
- (1976) Proceedings of the IEEE , vol.64 , pp. 1151-1162
- Widrow, B.¹

29
- 0007902192
- On optimal doubling in backgammon
- (1977) Management Science , vol.23 , pp. 853-858
- Zadeh, N.¹ Kobliska, G.²

* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.