-
2
-
-
0029679044
-
Reinforcement learning: A survey
-
L. P. Kaelbling, M. Littman, and A. Moore, "Reinforcement learning: A survey," Journal of Artificial Intelligence Research, vol. 4, pp. 237-285, 1996. [Online]. Available: citeseer.ist.psu.edu/kaelbling96reinforcement.html
-
(1996)
Journal of Artificial Intelligence Research
, vol.4
, pp. 237-285
-
-
Kaelbling, L.P.1
Littman, M.2
Moore, A.3
-
3
-
-
33847202724
-
Learning to predict by the method of temporal differences
-
R. Sutton, "Learning to predict by the method of temporal differences," Machine Learning, vol. 3, pp. 9-44, 1988.
-
(1988)
Machine Learning
, vol.3
, pp. 9-44
-
-
Sutton, R.1
-
4
-
-
0004049893
-
Learning from delayed rewards
-
Ph.D. dissertation, Cambridge University
-
C. Watkins, "Learning from delayed rewards," Ph.D. dissertation, Cambridge University, 1989.
-
(1989)
-
-
Watkins, C.1
-
5
-
-
45149092016
-
-
home page
-
B. Bouzy, "Indigo home page," www.math-info.univ-paris5.fr/~bouzy/INDIGO.html, 2005.
-
(2005)
Indigo
-
-
Bouzy, B.1
-
6
-
-
0036145791
-
Games, Computers, and Artificial Intelligence
-
J. Schaeffer and J. van den Herik, "Games, Computers, and Artificial Intelligence," Artificial Intelligence, vol. 134, pp. 1-7, 2002.
-
(2002)
Artificial Intelligence
, vol.134
, pp. 1-7
-
-
Schaeffer, J.1
van den Herik, J.2
-
7
-
-
0036149663
-
Games solved: Now and in the future
-
H. van den Herik, J. Uiterwijk, and J. van Rijswijck, "Games solved: Now and in the future," Artificial Intelligence, vol. 134, pp. 277-311, 2002.
-
(2002)
Artificial Intelligence
, vol.134
, pp. 277-311
-
-
van den Herik, H.1
Uiterwijk, J.2
van Rijswijck, J.3
-
9
-
-
0036149522
-
Deep Blue
-
M. Campbell, A. Hoane, and F.-H. Hsu, "Deep Blue," Artificial Intelligence, vol. 134, pp. 57-83, 2002.
-
(2002)
Artificial Intelligence
, vol.134
, pp. 57-83
-
-
Campbell, M.1
Hoane, A.2
Hsu, F.-H.3
-
10
-
-
84880710441
-
Solving checkers
-
J. Schaeffer, Y. Björnsson, N. Burch, A. Kishimoto, M. Müller, R. Lake, P. Lu, and S. Sutphen, "Solving checkers," in IJCAI, 2005, pp. 292-297.
-
(2005)
IJCAI
, pp. 292-297
-
-
Schaeffer, J.1
Björnsson, Y.2
Burch, N.3
Kishimoto, A.4
Müller, M.5
Lake, R.6
Lu, P.7
Sutphen, S.8
-
11
-
-
0036148118
-
-
Improving heuristic mini-max search by supervised learning
-
M. Buro, "Improving heuristic mini-max search by supervised learning," Artificial Intelligence Journal, vol. 134, pp. 85-99, 2002.
-
-
-
-
12
-
-
24944583230
-
Position evaluation in computer go
-
December
-
M. Müller, "Position evaluation in computer go," ICGA Journal, vol. 25, no. 4, pp. 219-228, December 2002.
-
(2002)
ICGA Journal
, vol.25
, Issue.4
, pp. 219-228
-
-
Müller, M.1
-
14
-
-
0035479281
-
Computer go: An AI oriented survey
-
B. Bouzy and T. Cazenave, "Computer go: An AI oriented survey," Artificial Intelligence, vol. 132, pp. 39-103, 2001.
-
(2001)
Artificial Intelligence
, vol.132
, pp. 39-103
-
-
Bouzy, B.1
Cazenave, T.2
-
16
-
-
45149130544
-
-
M. Reiss, "Go++," www.goplusplus.com/.
-
Go++
-
-
Reiss, M.1
-
17
-
-
45149134550
-
-
home page
-
D. Bump, "Gnugo home page," www.gnu.org/software/gnugo/devel.html, 2006.
-
(2006)
Gnugo
-
-
Bump, D.1
-
19
-
-
45149085972
-
Explorer
-
M. Müller, "Explorer," web.cs.ualberta.ca/~mmueller/cgo/explorer.html, 2005.
-
(2005)
-
-
Müller, M.1
-
20
-
-
45149093374
-
-
T. Cazenave, "Golois," www.ai.univ-paris8.fr/~cazenave/Golois.html.
-
Golois
-
-
Cazenave, T.1
-
21
-
-
0001798654
-
Some practical techniques for global search in go
-
K. Chen, "Some practical techniques for global search in go," ICGA Journal, vol. 23, no. 2, pp. 67-74, 2000.
-
(2000)
ICGA Journal
, vol.23
, Issue.2
, pp. 67-74
-
-
Chen, K.1
-
22
-
-
45149134549
-
-
Evaluation in go by a neural network using soft segmentation
-
M. Enzenberger, "Evaluation in go by a neural network using soft segmentation," in 10th Advances in Computer Games, H. J. van den Herik, H. Iida, and E. A. Heinz, Eds. Graz: Kluwer Academic Publishers, 2003, pp. 97-108.
-
-
-
-
23
-
-
0001580774
-
Decomposition search: A combinatorial games approach to game tree search, with applications to solving go endgame
-
M. Müller, "Decomposition search: A combinatorial games approach to game tree search, with applications to solving go endgame," in IJCAI, 1999, pp. 578-583.
-
(1999)
IJCAI
, pp. 578-583
-
-
Müller, M.1
-
24
-
-
84958743851
-
Abstract proof search
-
Computers and Games, T. Marsland and I. Frank, Eds, Springer
-
T. Cazenave, "Abstract proof search," in Computers and Games, ser. Lecture Notes in Computer Science, T. Marsland and I. Frank, Eds., no. 2063. Springer, 2000, pp. 39-54.
-
(2000)
ser. Lecture Notes in Computer Science
, vol.1
, Issue.2063
, pp. 39-54
-
-
Cazenave, T.1
-
25
-
-
85085780301
-
Learning to score final positions in the game of go
-
H. J. van den Herik, H. Iida, and E. A. Heinz, Eds, Kluwer Academic Publishers
-
E. van der Werf, J. Uiterwijk, and J. van den Herik, "Learning to score final positions in the game of go," in Advances in Computer Games, Many Games, Many Challenges, H. J. van den Herik, H. Iida, and E. A. Heinz, Eds., vol. 10. Kluwer Academic Publishers, 2003, pp. 143-158.
-
(2003)
Advances in Computer Games, Many Games, Many Challenges
, vol.10
, pp. 143-158
-
-
van der Werf, E.1
Uiterwijk, J.2
van den Herik, J.3
-
27
-
-
84898992015
-
On-line policy improvement using Monte Carlo search
-
Cambridge MA: MIT Press
-
G. Tesauro and G. Galperin, "On-line policy improvement using Monte Carlo search," in Advances in Neural Information Processing Systems. Cambridge MA: MIT Press, 1996, pp. 1068-1074.
-
(1996)
Advances in Neural Information Processing Systems
, pp. 1068-1074
-
-
Tesauro, G.1
Galperin, G.2
-
28
-
-
0036149710
-
The challenge of poker
-
D. Billings, A. Davidson, J. Schaeffer, and D. Szafron, "The challenge of poker," Artificial Intelligence, vol. 134, pp. 201-240, 2002.
-
(2002)
Artificial Intelligence
, vol.134
, pp. 201-240
-
-
Billings, D.1
Davidson, A.2
Schaeffer, J.3
Szafron, D.4
-
29
-
-
0036146034
-
World-championship-caliber scrabble
-
B. Sheppard, "World-championship-caliber scrabble," Artificial Intelligence, vol. 134, pp. 241-275, 2002.
-
(2002)
Artificial Intelligence
, vol.134
, pp. 241-275
-
-
Sheppard, B.1
-
30
-
-
0025386231
-
Expected-outcome: A general model of static evaluation
-
B. Abramson, "Expected-outcome: A general model of static evaluation," IEEE Transactions on PAMI, vol. 12, pp. 182-193, 1990.
-
(1990)
IEEE Transactions on PAMI
, vol.12
, pp. 182-193
-
-
Abramson, B.1
-
33
-
-
84902513084
-
-
Monte Carlo go developments
-
B. Bouzy and B. Helmstetter, "Monte Carlo go developments," in 10th Advances in Computer Games, H. J. van den Herik, H. Iida, and E. A. Heinz, Eds. Graz: Kluwer Academic Publishers, 2003, pp. 159-174.
-
-
-
-
34
-
-
0004280606
-
Learning in embedded systems
-
Ph.D. dissertation, MIT
-
L. P. Kaelbling, "Learning in embedded systems," Ph.D. dissertation, MIT, 1993.
-
(1993)
-
-
Kaelbling, L.P.1
-
35
-
-
24944572334
-
The move decision process of Indigo
-
March
-
B. Bouzy, "The move decision process of Indigo," International Computer Game Association Journal, vol. 26, no. 1, pp. 14-27, March 2003.
-
(2003)
International Computer Game Association Journal
, vol.26
, Issue.1
, pp. 14-27
-
-
Bouzy, B.1
-
36
-
-
45149121322
-
-
Associating shallow and selective global tree search with Monte Carlo for 9×9 go
-
B. Bouzy, "Associating shallow and selective global tree search with Monte Carlo for 9×9 go," in Computers and Games: 4th International Conference, CG 2004, ser. Lecture Notes in Computer Science, H. J. van den Herik, Y. Björnsson, and N. S. Netanyahu, Eds., vol. 3846 / 2006. Ramat-Gan, Israel: Springer Verlag, July 2004, pp. 67-80.
-
-
-
-
37
-
-
40649089044
-
-
home page
-
P. Kaminski, "Vegos home page," www.ideanest.com/vegos/, 2003.
-
(2003)
Vegos
-
-
Kaminski, P.1
-
38
-
-
45149122181
-
Seven year itch
-
J. Hamlen, "Seven year itch," ICGA Journal, vol. 27, no. 4, pp. 255-258, 2004.
-
(2004)
ICGA Journal
, vol.27
, Issue.4
, pp. 255-258
-
-
Hamlen, J.1
-
39
-
-
34547971839
-
Efficient selectivity and back-up operators in Monte-Carlo tree search
-
Torino, Italy, paper currently submitted
-
R. Coulom, "Efficient selectivity and back-up operators in Monte-Carlo tree search," in Computers and Games, Torino, Italy, 2006, paper currently submitted.
-
(2006)
Computers and Games
-
-
Coulom, R.1
-
40
-
-
33646238098
-
The go-playing program called go81
-
Helsinki, Finland, September
-
T. Raiko, "The go-playing program called go81," in Finnish Artificial Intelligence Conference, Helsinki, Finland, September 2004, pp. 197-206.
-
(2004)
Finnish Artificial Intelligence Conference
, pp. 197-206
-
-
Raiko, T.1
-
41
-
-
24944478740
-
Associating knowledge and Monte Carlo approaches within a go program
-
November
-
B. Bouzy, "Associating knowledge and Monte Carlo approaches within a go program," Information Sciences, vol. 175, no. 4, pp. 247-257, November 2005.
-
(2005)
Information Sciences
, vol.175
, Issue.4
, pp. 247-257
-
-
Bouzy, B.1
-
45
-
-
0004370245
-
-
L. Baird, "Advantage updating," 1993. [Online]. Available: citeseer.ist.psu.edu/baird93advantage.html
-
(1993)
Advantage updating
-
-
Baird, L.1
-
47
-
-
85153938292
-
Reinforcement learning algorithm for partially observable Markov decision problems
-
G. Tesauro, D. Touretzky, and T. Leen, Eds, The MIT Press
-
T. Jaakkola, S. P. Singh, and M. I. Jordan, "Reinforcement learning algorithm for partially observable Markov decision problems," in Advances in Neural Information Processing Systems, G. Tesauro, D. Touretzky, and T. Leen, Eds., vol. 7, The MIT Press, 1995, pp. 345-352. [Online]. Available: citeseer.ist.psu.edu/jaakkola95reinforcement.html
-
(1995)
Advances in Neural Information Processing Systems
, vol.7
, pp. 345-352
-
-
Jaakkola, T.1
Singh, S.P.2
Jordan, M.I.3
-
48
-
-
29244474089
-
Co-evolution versus self-play temporal difference learning for acquiring position evaluation in small-board go
-
December
-
T. P. Runarsson and S. Lucas, "Co-evolution versus self-play temporal difference learning for acquiring position evaluation in small-board go," IEEE Transactions on Evolutionary Computation, vol. 9, no. 6, pp. 628-640, December 2005.
-
(2005)
IEEE Transactions on Evolutionary Computation
, vol.9
, Issue.6
, pp. 628-640
-
-
Runarsson, T.P.1
Lucas, S.2
-
49
-
-
85149834820
-
Markov games as a framework for multi-agent reinforcement learning
-
New Brunswick, NJ: Morgan Kaufmann
-
M. L. Littman, "Markov games as a framework for multi-agent reinforcement learning," in Proceedings of the 11th International Conference on Machine Learning (ML-94). New Brunswick, NJ: Morgan Kaufmann, 1994, pp. 157-163. [Online]. Available: citeseer.ist.psu.edu/littman94markov.html
-
(1994)
Proceedings of the 11th International Conference on Machine Learning (ML-94)
, pp. 157-163
-
-
Littman, M.L.1