-
1
-
-
84974870136
-
A survey of applications of Markov decision processes
-
White D.J. A survey of applications of Markov decision processes. The Journal of the Operational Research Society 44 11 (1993) 1073-1096
-
(1993)
The Journal of the Operational Research Society
, vol.44
, Issue.11
, pp. 1073-1096
-
-
White, D.J.1
-
2
-
-
0003487482
-
-
Athena Scientific, Belmont, MA, USA
-
Bertsekas D.P., and Tsitsiklis J. Neuro-dynamic programming (1996), Athena Scientific, Belmont, MA, USA
-
(1996)
Neuro-dynamic programming
-
-
Bertsekas, D.P.1
Tsitsiklis, J.2
-
4
-
-
0004102479
-
-
MIT Press, Cambridge, MA, USA
-
Sutton R.S., and Barto A.G. Reinforcement learning: an introduction (1998), MIT Press, Cambridge, MA, USA
-
(1998)
Reinforcement learning: an introduction
-
-
Sutton, R.S.1
Barto, A.G.2
-
6
-
-
0000985504
-
TD-gammon, a self-teaching backgammon program, achieves master-level play
-
Tesauro G. TD-gammon, a self-teaching backgammon program, achieves master-level play. Neural Computation 6 2 (1994) 215-219
-
(1994)
Neural Computation
, vol.6
, Issue.2
, pp. 215-219
-
-
Tesauro, G.1
-
7
-
-
0029276036
-
Temporal difference learning and TD-gammon
-
Tesauro G. Temporal difference learning and TD-gammon. Communications of the ACM 38 3 (1995) 58-68
-
(1995)
Communications of the ACM
, vol.38
, Issue.3
, pp. 58-68
-
-
Tesauro, G.1
-
8
-
-
0036147771
-
Programming backgammon using self-teaching neural nets
-
Tesauro G. Programming backgammon using self-teaching neural nets. Artificial Intelligence 134 1-2 (2002) 181-199
-
(2002)
Artificial Intelligence
, vol.134
, Issue.1-2
, pp. 181-199
-
-
Tesauro, G.1
-
9
-
-
0036722536
-
A reinforcement learning approach to a single leg airline revenue management problem with multiple fare classes and overbooking
-
Gosavi A., Bandla N., and Das T.K. A reinforcement learning approach to a single leg airline revenue management problem with multiple fare classes and overbooking. IIE Transactions 34 9 (2002) 729-742
-
(2002)
IIE Transactions
, vol.34
, Issue.9
, pp. 729-742
-
-
Gosavi, A.1
Bandla, N.2
Das, T.K.3
-
10
-
-
85156187730
-
Improving elevator performance using reinforcement learning
-
Touretzky D.S., Mozer M.C., and Hasselmo M.E. (Eds), The MIT Press, Cambridge, MA
-
Crites R.H., and Barto A.G. Improving elevator performance using reinforcement learning. In: Touretzky D.S., Mozer M.C., and Hasselmo M.E. (Eds). Advances in neural information processing systems vol. 8 (1996), The MIT Press, Cambridge, MA 1017-1023
-
(1996)
Advances in neural information processing systems
, vol.8
, pp. 1017-1023
-
-
Crites, R.H.1
Barto, A.G.2
-
11
-
-
0032208335
-
Elevator group control using multiple reinforcement learning agents
-
Crites R.H., and Barto A.G. Elevator group control using multiple reinforcement learning agents. Machine Learning 33 2 (1998) 235-262
-
(1998)
Machine Learning
, vol.33
, Issue.2
, pp. 235-262
-
-
Crites, R.H.1
Barto, A.G.2
-
12
-
-
0242540456
-
-
Pednault E, Abe N, Zadrozny B. Sequential cost-sensitive decision-making with reinforcement learning. In: Proceedings of the eighth ACM SIGKDD international conference on knowledge discovery and data mining. Edmonton, Alberta, Canada: ACM Press; 2002. p. 259-68.
-
-
-
-
13
-
-
0032643313
-
Solving semi-Markov decision problems using average reward reinforcement learning
-
Das T.K., Gosavi A., Mahadevan S., and Marchalleck N. Solving semi-Markov decision problems using average reward reinforcement learning. Management Science 45 4 (1999) 560-574
-
(1999)
Management Science
, vol.45
, Issue.4
, pp. 560-574
-
-
Das, T.K.1
Gosavi, A.2
Mahadevan, S.3
Marchalleck, N.4
-
14
-
-
0742319170
-
Reinforcement learning for long run average cost
-
Gosavi A. Reinforcement learning for long run average cost. European Journal of Operational Research 144 (2004) 654-674
-
(2004)
European Journal of Operational Research
, vol.144
, pp. 654-674
-
-
Gosavi, A.1
-
15
-
-
24544450341
-
Machine learning in games: a survey
-
Fürnkranz J., and Kubat M. (Eds), Nova Science Publishers, Huntington, NY, USA [chapter 2]
-
Fürnkranz J. Machine learning in games: a survey. In: Fürnkranz J., and Kubat M. (Eds). Machines that learn to play games (2001), Nova Science Publishers, Huntington, NY, USA 11-59 [chapter 2]
-
(2001)
Machines that learn to play games
, pp. 11-59
-
-
Fürnkranz, J.1
-
16
-
-
21844502480
-
Discovering complex Othello strategies through evolutionary neural networks
-
Moriarty D.E., and Miikkulainen R. Discovering complex Othello strategies through evolutionary neural networks. Connection Science 7 3 (1995) 195-210
-
(1995)
Connection Science
, vol.7
, Issue.3
, pp. 195-210
-
-
Moriarty, D.E.1
Miikkulainen, R.2
-
17
-
-
21044442867
-
Observing the evolution of neural networks learning to play the game of Othello
-
Chong S.Y., Tan M.K., and White J.D. Observing the evolution of neural networks learning to play the game of Othello. IEEE Transactions on Evolutionary Computation 9 3 (2005) 240-251
-
(2005)
IEEE Transactions on Evolutionary Computation
, vol.9
, Issue.3
, pp. 240-251
-
-
Chong, S.Y.1
Tan, M.K.2
White, J.D.3
-
18
-
-
0003584577
-
-
Prentice-Hall, Englewood Cliffs, NJ, USA
-
Russell S., and Norvig P. Artificial intelligence: a modern approach. 2nd ed. (2003), Prentice-Hall, Englewood Cliffs, NJ, USA
-
(2003)
Artificial intelligence: a modern approach. 2nd ed.
-
-
Russell, S.1
Norvig, P.2
-
19
-
-
0003787146
-
-
Princeton University Press, Princeton, NJ, USA
-
Bellman R.E. Dynamic programming (1957), Princeton University Press, Princeton, NJ, USA
-
(1957)
Dynamic programming
-
-
Bellman, R.E.1
-
20
-
-
35348954680
-
-
Watkins CJCH. Learning from delayed rewards. PhD thesis, Cambridge University, Cambridge, England; 1989.
-
-
-
-
22
-
-
35348961891
-
-
Allis LV. Searching for solutions in games and artificial intelligence. PhD thesis, University of Limburg, Maastricht, The Netherlands; 1994.
-
-
-
-
25
-
-
85149834820
-
Markov games as a framework for multi-agent reinforcement learning
-
Morgan Kaufmann, San Francisco, CA, USA
-
Littman M.L. Markov games as a framework for multi-agent reinforcement learning. Proceedings of the eleventh international conference on machine learning (1994), Morgan Kaufmann, San Francisco, CA, USA 157-163
-
(1994)
Proceedings of the eleventh international conference on machine learning
, pp. 157-163
-
-
Littman, M.L.1
-
26
-
-
0001547175
-
Value-function reinforcement learning in Markov games
-
Littman M.L. Value-function reinforcement learning in Markov games. Journal of Cognitive Systems Research 2 1 (2001) 55-66
-
(2001)
Journal of Cognitive Systems Research
, vol.2
, Issue.1
, pp. 55-66
-
-
Littman, M.L.1
-
28
-
-
35349018883
-
-
le Comte M. Introduction to Othello; 2000.
-
-
-
-
29
-
-
35349017772
-
-
Rose B. Othello: a minute to learn... a lifetime to master; 2005.
-
-
-
-
30
-
-
35349010953
-
-
Doucette MJ. Wipeout: the engineering of an Othello program. Project report, Acadia University, Wolfville, NS, Canada; 1998.
-
-
-
-
31
-
-
35349007860
-
-
Leouski AV, Utgoff PE. What a neural network can learn about Othello. Technical report UM-CS-1996-010, Computer Science Department, Lederle Graduate Research Center, University of Massachusetts, Amherst, MA, USA; 1996.
-
-
-