-
1
-
-
0026391196
-
-
Experience-based learning experiments using Gomoku, in: Proceedings of IEEE International Conference on Systems, Man, and Cybernetics, Charlottesville, Virginia, USA, October 13-16
-
T.K. William, S. Pham, Experience-based learning experiments using Gomoku, in: Proceedings of IEEE International Conference on Systems, Man, and Cybernetics, Charlottesville, Virginia, USA, vol. 2, October 13-16, 1991, pp. 1405-1410.
-
(1991)
, vol.2
, pp. 1405-1410
-
-
William, T.K.1
Pham, S.2
-
2
-
-
0024767179
-
The history heuristic and alpha-beta search enhancements in practice
-
Schaeffer J. The history heuristic and alpha-beta search enhancements in practice. IEEE Trans. Pattern Anal. Mach. Intell. 1989, 11(11):1203-1212.
-
(1989)
IEEE Trans. Pattern Anal. Mach. Intell.
, vol.11
, Issue.11
, pp. 1203-1212
-
-
Schaeffer, J.1
-
3
-
-
84855338829
-
-
Gomoku and Threat-Space Search
-
L.V. Allis, H.J. van den Herik, M.P.H. Huntjens, Gomoku and Threat-Space Search, 2010. http://10.1.1.96.5836.
-
(2010)
-
-
Allis, L.V.1
van den Herik, H.J.2
Huntjens, M.P.H.3
-
5
-
-
0007993990
-
Connectionist learning of expert preferences by comparison training
-
Morgan Kaufmann, San Francisco
-
Tesauro G. Connectionist learning of expert preferences by comparison training. Advances in Neural Information Processing Systems 1989, vol. 1:99-106. Morgan Kaufmann, San Francisco.
-
(1989)
Advances in Neural Information Processing Systems
, vol.1
, pp. 99-106
-
-
Tesauro, G.1
-
6
-
-
61849147871
-
Reinforcement-learning agents with different temperature parameters explain the variety of human action-selection behavior in a Markov decision process task
-
Ishida F., Sasaki T., Sakaguchi Y., Shimai H. Reinforcement-learning agents with different temperature parameters explain the variety of human action-selection behavior in a Markov decision process task. Neurocomputing 2009, 72:1979-1984.
-
(2009)
Neurocomputing
, vol.72
, pp. 1979-1984
-
-
Ishida, F.1
Sasaki, T.2
Sakaguchi, Y.3
Shimai, H.4
-
7
-
-
0025559238
-
Neurogammon: a neural-network backgammon program
-
Proceedings of International Joint Conference Neural Networks, San Diego, California, USA, June 17-21
-
G. Tesauro, Neurogammon: a neural-network backgammon program, in: Proceedings of International Joint Conference Neural Networks, San Diego, California, USA, June 17-21, 1990, pp. 33-40.
-
(1990)
, pp. 33-40
-
-
Tesauro, G.1
-
8
-
-
0001046225
-
Practical issues in temporal difference learning
-
Tesauro G. Practical issues in temporal difference learning. Mach. Learn. 1992, 8:257-277.
-
(1992)
Mach. Learn.
, vol.8
, pp. 257-277
-
-
Tesauro, G.1
-
9
-
-
0000985504
-
TD-Gammon, a self-teaching backgammon program, achieves master-level play
-
Tesauro G. TD-Gammon, a self-teaching backgammon program, achieves master-level play. Neural Comput. 1994, 6:215-219.
-
(1994)
Neural Comput.
, vol.6
, pp. 215-219
-
-
Tesauro, G.1
-
10
-
-
82655187342
-
-
Study and Practice on Machine Self-Learning of Game-Playing. Master Thesis, Guangxi Normal University
-
J.M. Mo, Study and Practice on Machine Self-Learning of Game-Playing. Master Thesis, Guangxi Normal University, 2003.
-
(2003)
-
-
Mo, J.M.1
-
11
-
-
0034275416
-
Learning to play chess using temporal differences
-
Baxter J., Tridgell A., Weaver L. Learning to play chess using temporal differences. Mach. Learn. 2000, 40:243-263.
-
(2000)
Mach. Learn.
, vol.40
, pp. 243-263
-
-
Baxter, J.1
Tridgell, A.2
Weaver, L.3
-
12
-
-
77949562818
-
Knowledge-free and learning-based methods in intelligent game playing
-
Mańdziuk J. Knowledge-free and learning-based methods in intelligent game playing. Stud. Comput. Intell. 2010, 276:71-89.
-
(2010)
Stud. Comput. Intell.
, vol.276
, pp. 71-89
-
-
Mańdziuk, J.1
-
13
-
-
1542471417
-
Mini-max initialization for function approximation
-
Zhang X.M., Chen Y.Q., Ansari N., Shi Y.Q. Mini-max initialization for function approximation. Neurocomputing 2004, 57:389-409.
-
(2004)
Neurocomputing
, vol.57
, pp. 389-409
-
-
Zhang, X.M.1
Chen, Y.Q.2
Ansari, N.3
Shi, Y.Q.4
-
14
-
-
79952619657
-
Robust high performance reinforcement learning through weighted k-nearest neighbors
-
Martin H J.A., Lope J., Maravall D. Robust high performance reinforcement learning through weighted k-nearest neighbors. Neurocomputing 2011, 74:1251-1259.
-
(2011)
Neurocomputing
, vol.74
, pp. 1251-1259
-
-
Martin H, J.A.1
Lope, J.2
Maravall, D.3
-
15
-
-
0020970738
-
Neuronlike adaptive elements that can solve difficult learning control problems
-
Barto A.G., Sutton R.S., Anderson C.W. Neuronlike adaptive elements that can solve difficult learning control problems. IEEE Trans. Syst. Man Cybern. 1983, 13:834-847.
-
(1983)
IEEE Trans. Syst. Man Cybern.
, vol.13
, pp. 834-847
-
-
Barto, A.G.1
Sutton, R.S.2
Anderson, C.W.3
-
16
-
-
0002011091
-
A Menu of Designs for Reinforcement Learning Over Time
-
MIT Press, Cambridge
-
Werbos P.J. A Menu of Designs for Reinforcement Learning Over Time. Neural Networks for Control 1990, MIT Press, Cambridge.
-
(1990)
Neural Networks for Control
-
-
Werbos, P.J.1
-
18
-
-
0001192446
-
A neighboring optimal adaptive critic for missile guidance
-
Dalton J., Balakrishnan S.N. A neighboring optimal adaptive critic for missile guidance. Math. Comput. Model. 1996, 23(1):175-188.
-
(1996)
Math. Comput. Model.
, vol.23
, Issue.1
, pp. 175-188
-
-
Dalton, J.1
Balakrishnan, S.N.2
-
19
-
-
34548766226
-
Particle swarm optimized adaptive dynamic programming
-
Proceedings of the 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning, Honolulu, Hawaiian Islands, USA, April 1-5
-
D.B. Zhao, J.Q. Yi, D.R. Liu, Particle swarm optimized adaptive dynamic programming, in: Proceedings of the 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning, Honolulu, Hawaiian Islands, USA, April 1-5, 2007, pp. 32-37.
-
(2007)
, pp. 32-37
-
-
Zhao, D.B.1
Yi, J.Q.2
Liu, D.R.3
-
20
-
-
82655164054
-
Self-play and using an expert to learn to play backgammon with temporal difference learning
-
Wiering M.A. Self-play and using an expert to learn to play backgammon with temporal difference learning. J. Intell. Learn. Syst. Appl. 2010, 2:57-68.
-
(2010)
J. Intell. Learn. Syst. Appl.
, vol.2
, pp. 57-68
-
-
Wiering, M.A.1
-
22
-
-
84855331502
-
-
2010. http://gomocup.wz.cz/gomoku/download.php.
-
(2010)
-
-