SCOPUS Information Retrieval Platform

Volume 35, Issue 6, 2008, Pages 1999-2017

Application of reinforcement learning to the game of Othello

(2) van Eck, Nees Jan ᵃ; van Wezel, Michiel ᵃ

ᵃ ERASMUS UNIVERSITY ROTTERDAM (Netherlands)

Author keywords

Dynamic programming; Game playing; Markov decision processes; Multiagent learning; Neural networks; Othello; Q learning; Reinforcement learning

Indexed keywords

DECISION MAKING; DYNAMIC PROGRAMMING; FUNCTION EVALUATION; MANAGEMENT SCIENCE; MARKOV PROCESSES; NEURAL NETWORKS; OPERATIONS RESEARCH; PROBLEM SOLVING

GAME PLAYING; MULTIAGENT LEARNING; OTHELLO; Q-LEARNING; REINFORCEMENT LEARNING

EID: 35349027192 PISSN: 03050548 EISSN: None Source Type: Journal
DOI: 10.1016/j.cor.2006.10.004 Document Type: Article

Times cited: 33

References (31)

1
- 84974870136
- A survey of applications of Markov decision processes
- White D.J. A survey of applications of Markov decision processes. The Journal of the Operational Research Society 44 11 (1993) 1073-1096
- (1993) The Journal of the Operational Research Society , vol.44 , Issue.11 , pp. 1073-1096
- White, D.J.¹

2
- 0003487482
- Athena Scientific, Belmont, MA, USA
- Bertsekas D.P., and Tsitsiklis J. Neuro-dynamic programming (1996), Athena Scientific, Belmont, MA, USA
- (1996) Neuro-dynamic programming
- Bertsekas, D.P.¹ Tsitsiklis, J.²

3
- 0029679044
- Reinforcement learning: a survey
- Kaelbling L.P., Littman M.L., and Moore A.W. Reinforcement learning: a survey. Journal of Artificial Intelligence Research 4 (1996) 237-285
- (1996) Journal of Artificial Intelligence Research , vol.4 , pp. 237-285
- Kaelbling, L.P.¹ Littman, M.L.² Moore, A.W.³

4
- 0004102479
- MIT Press, Cambridge, MA, USA
- Sutton R.S., and Barto A.G. Reinforcement learning: an introduction (1998), MIT Press, Cambridge, MA, USA
- (1998) Reinforcement learning: an introduction
- Sutton, R.S.¹ Barto, A.G.²

5
- 84888630832
- Kluwer Academic Publishers, Boston, MA, USA
- Gosavi A. Simulation-based optimization: parametric optimization techniques and reinforcement learning (2003), Kluwer Academic Publishers, Boston, MA, USA
- (2003) Simulation-based optimization: parametric optimization techniques and reinforcement learning
- Gosavi, A.¹

6
- 0000985504
- TD-gammon, a self-teaching backgammon program, achieves master-level play
- Tesauro G. TD-gammon, a self-teaching backgammon program, achieves master-level play. Neural Computation 6 2 (1994) 215-219
- (1994) Neural Computation , vol.6 , Issue.2 , pp. 215-219
- Tesauro, G.¹

7
- 0029276036
- Temporal difference learning and TD-gammon
- Tesauro G. Temporal difference learning and TD-gammon. Communications of the ACM 38 3 (1995) 58-68
- (1995) Communications of the ACM , vol.38 , Issue.3 , pp. 58-68
- Tesauro, G.¹

8
- 0036147771
- Programming backgammon using self-teaching neural nets
- Tesauro G. Programming backgammon using self-teaching neural nets. Artificial Intelligence 134 1-2 (2002) 181-199
- (2002) Artificial Intelligence , vol.134 , Issue.1-2 , pp. 181-199
- Tesauro, G.¹

9
- 0036722536
- A reinforcement learning approach to a single leg airline revenue management problem with multiple fare classes and overbooking
- Gosavi A., Bandla N., and Das T.K. A reinforcement learning approach to a single leg airline revenue management problem with multiple fare classes and overbooking. IIE Transactions 34 9 (2002) 729-742
- (2002) IIE Transactions , vol.34 , Issue.9 , pp. 729-742
- Gosavi, A.¹ Bandla, N.² Das, T.K.³

10
- 85156187730
- Improving elevator performance using reinforcement learning
- Touretzky D.S., Mozer M.C., and Hasselmo M.E. (Eds), The MIT Press, Cambridge, MA
- Crites R.H., and Barto A.G. Improving elevator performance using reinforcement learning. In: Touretzky D.S., Mozer M.C., and Hasselmo M.E. (Eds). Advances in neural information processing systems vol. 8 (1996), The MIT Press, Cambridge, MA 1017-1023
- (1996) Advances in neural information processing systems , vol.8 , pp. 1017-1023
- Crites, R.H.¹ Barto, A.G.²

11
- 0032208335
- Elevator group control using multiple reinforcement learning agents
- Crites R.H., and Barto A.G. Elevator group control using multiple reinforcement learning agents. Machine Learning 33 2 (1998) 235-262
- (1998) Machine Learning , vol.33 , Issue.2 , pp. 235-262
- Crites, R.H.¹ Barto, A.G.²

12
- 0242540456
- Pednault E, Abe N, Zadrozny B. Sequential cost-sensitive decision-making with reinforcement learning. In: Proceedings of the eighth ACM SIGKDD international conference on knowledge discovery and data mining. Edmonton, Alberta, Canada: ACM Press; 2002. p. 259-68.

13
- 0032643313
- Solving semi-Markov decision problems using average reward reinforcement learning
- Das T.K., Gosavi A., Mahadevan S., and Marchalleck N. Solving semi-Markov decision problems using average reward reinforcement learning. Management Science 45 4 (1999) 560-574
- (1999) Management Science , vol.45 , Issue.4 , pp. 560-574
- Das, T.K.¹ Gosavi, A.² Mahadevan, S.³ Marchalleck, N.⁴

14
- 0742319170
- Reinforcement learning for long run average cost
- Gosavi A. Reinforcement learning for long run average cost. European Journal of Operational Research 144 (2004) 654-674
- (2004) European Journal of Operational Research , vol.144 , pp. 654-674
- Gosavi, A.¹

15
- 24544450341
- Machine learning in games: a survey
- Fürnkranz J., and Kubat M. (Eds), Nova Science Publishers, Huntington, NY, USA [chapter 2]
- Fürnkranz J. Machine learning in games: a survey. In: Fürnkranz J., and Kubat M. (Eds). Machines that learn to play games (2001), Nova Science Publishers, Huntington, NY, USA 11-59 [chapter 2]
- (2001) Machines that learn to play games , pp. 11-59
- Fürnkranz, J.¹

16
- 21844502480
- Discovering complex Othello strategies through evolutionary neural networks
- Moriarty D.E., and Miikkulainen R. Discovering complex Othello strategies through evolutionary neural networks. Connection Science 7 3 (1995) 195-210
- (1995) Connection Science , vol.7 , Issue.3 , pp. 195-210
- Moriarty, D.E.¹ Miikkulainen, R.²

17
- 21044442867
- Observing the evolution of neural networks learning to play the game of Othello
- Chong S.Y., Tan M.K., and White J.D. Observing the evolution of neural networks learning to play the game of Othello. IEEE Transactions on Evolutionary Computation 9 3 (2005) 240-251
- (2005) IEEE Transactions on Evolutionary Computation , vol.9 , Issue.3 , pp. 240-251
- Chong, S.Y.¹ Tan, M.K.² White, J.D.³

18
- 0003584577
- Prentice-Hall, Englewood Cliffs, NJ, USA
- Russell S., and Norvig P. Artificial intelligence: a modern approach. 2nd ed. (2003), Prentice-Hall, Englewood Cliffs, NJ, USA
- (2003) Artificial intelligence: a modern approach. 2nd ed.
- Russell, S.¹ Norvig, P.²

19
- 0003787146
- Princeton University Press, Princeton, NJ, USA
- Bellman R.E. Dynamic programming (1957), Princeton University Press, Princeton, NJ, USA
- (1957) Dynamic programming
- Bellman, R.E.¹

20
- 35348954680
- Watkins CJCH. Learning from delayed rewards. PhD thesis, Cambridge University, Cambridge, England; 1989.

21
- 34249833101
- Q-learning
- Watkins C.J.C.H., and Dayan P. Q-learning. Machine Learning 8 3 (1992) 279-292
- (1992) Machine Learning , vol.8 , Issue.3 , pp. 279-292
- Watkins, C.J.C.H.¹ Dayan, P.²

22
- 35348961891
- Allis LV. Searching for solutions in games and artificial intelligence. PhD thesis, University of Limburg, Maastricht, The Netherlands; 1994.

23
- 0036149663
- Games solved: now and in the future
- van den Herik H.J., Uiterwijk J.W.H.M., and van Rijswijck J. Games solved: now and in the future. Artificial Intelligence 134 1-2 (2002) 277-311
- (2002) Artificial Intelligence , vol.134 , Issue.1-2 , pp. 277-311
- van den Herik, H.J.¹ Uiterwijk, J.W.H.M.² van Rijswijck, J.³

24
- 0003487601
- Oxford University Press, New York, NY, USA
- Bishop C.M. Neural networks for pattern recognition (1995), Oxford University Press, New York, NY, USA
- (1995) Neural networks for pattern recognition
- Bishop, C.M.¹

25
- 85149834820
- Markov games as a framework for multi-agent reinforcement learning
- Morgan Kaufmann, San Francisco, CA, USA
- Littman M.L. Markov games as a framework for multi-agent reinforcement learning. Proceedings of the eleventh international conference on machine learning (1994), Morgan Kaufmann, San Francisco, CA, USA 157-163
- (1994) Proceedings of the eleventh international conference on machine learning , pp. 157-163
- Littman, M.L.¹

26
- 0001547175
- Value-function reinforcement learning in Markov games
- Littman M.L. Value-function reinforcement learning in Markov games. Journal of Cognitive Systems Research 2 1 (2001) 55-66
- (2001) Journal of Cognitive Systems Research , vol.2 , Issue.1 , pp. 55-66
- Littman, M.L.¹

27
- 4644369748
- Nash Q-learning for general-sum stochastic games
- Hu J., and Wellman M.P. Nash Q-learning for general-sum stochastic games. Journal of Machine Learning Research 4 (2003) 1039-1069
- (2003) Journal of Machine Learning Research , vol.4 , pp. 1039-1069
- Hu, J.¹ Wellman, M.P.²

28
- 35349018883
- le Comte M. Introduction to Othello; 2000.

29
- 35349017772
- Rose B. Othello: a minute to learn-a lifetime to master; 2005.

30
- 35349010953
- Doucette MJ. Wipeout: the engineering of an Othello program. Project report, Acadia University, Wolfville, NS, Canada; 1998.

31
- 35349007860
- Leouski AV, Utgoff PE. What a neural network can learn about Othello. Technical report UM-CS-1996-010, Computer Science Department, Lederle Graduate Research Center, University of Massachusetts, Amherst, MA, USA; 1996.

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.