-
1
-
-
0020970738
-
Neuronlike adaptive elements that can solve difficult learning control problems
-
Barto, A. G., Sutton, R. S., & Anderson, C. W. (1983). Neuronlike adaptive elements that can solve difficult learning control problems. IEEE Trans. Syst., Man, & Cybern., 13, 834-846.
-
(1983)
IEEE Trans. Syst., Man, & Cybern.
, vol.13
, pp. 834-846
-
-
Barto, A.G.1
Sutton, R.S.2
Anderson, C.W.3
-
2
-
-
0030284259
-
Perfect recall and pruning in games with imperfect information
-
Blair, J. R. S., Mutchler, D., & Lent, M. (1995). Perfect recall and pruning in games with imperfect information. Computational Intelligence, 12, 131-154.
-
(1995)
Computational Intelligence
, vol.12
, pp. 131-154
-
-
Blair, J.R.S.1
Mutchler, D.2
Lent, M.3
-
4
-
-
0032208335
-
Elevator group control using multiple reinforcement learning agents
-
Crites, R. H., & Barto, A. G. (1998). Elevator group control using multiple reinforcement learning agents. Machine Learning, 33, 235-262.
-
(1998)
Machine Learning
, vol.33
, pp. 235-262
-
-
Crites, R.H.1
Barto, A.G.2
-
5
-
-
0036374294
-
GIB: Imperfect information in a computationally challenging game
-
Ginsberg, M. (2001). GIB: Imperfect information in a computationally challenging game. Journal of Artificial Intelligence Research, 14, 303-358.
-
(2001)
Journal of Artificial Intelligence Research
, vol.14
, pp. 303-358
-
-
Ginsberg, M.1
-
7
-
-
0036592028
-
Control of exploitation-exploration meta-parameter in reinforcement learning
-
Ishii, S., Yoshida, W., & Yoshimoto, J. (2002). Control of exploitation-exploration meta-parameter in reinforcement learning. Neural Networks, 15, 665-687.
-
(2002)
Neural Networks
, vol.15
, pp. 665-687
-
-
Ishii, S.1
Yoshida, W.2
Yoshimoto, J.3
-
8
-
-
0032073263
-
Planning and acting in partially observable stochastic domains
-
Kaelbling, L. P., Littman, M. L., & Cassandra, A. (1998). Planning and acting in partially observable stochastic domains. Artificial Intelligence, 101, 99-134.
-
(1998)
Artificial Intelligence
, vol.101
, pp. 99-134
-
-
Kaelbling, L.P.1
Littman, M.L.2
Cassandra, A.3
-
9
-
-
0012331016
-
Memory approaches to reinforcement learning in non-Markovian domains
-
Lin, L.-J., & Mitchell, T. (1992). Memory approaches to reinforcement learning in non-Markovian domains. Tech. rep., CMU-CS-92-138.
-
(1992)
Tech. Rep.
, vol.CMU-CS-92-138
-
-
Lin, L.-J.1
Mitchell, T.2
-
11
-
-
0034819983
-
A multi-agent reinforcement learning method for a partially-observable competitive game
-
Matsuno, Y., Yamazaki, T., Matsuda, J., & Ishii, S. (2001). A multi-agent reinforcement learning method for a partially-observable competitive game. In Proceedings of the Fifth International Conference on Autonomous Agents (pp. 39-40).
-
(2001)
Proceedings of the Fifth International Conference on Autonomous Agents
, pp. 39-40
-
-
Matsuno, Y.1
Yamazaki, T.2
Matsuda, J.3
Ishii, S.4
-
13
-
-
0000672424
-
Fast learning in networks of locally-tuned processing units
-
Moody, J., & Darken, C. J. (1989). Fast learning in networks of locally-tuned processing units. Neural Computation, 1, 281-294.
-
(1989)
Neural Computation
, vol.1
, pp. 281-294
-
-
Moody, J.1
Darken, C.J.2
-
14
-
-
0027684215
-
Prioritized sweeping: Reinforcement learning with less data and less real time
-
Moore, A., & Atkeson, C. (1993). Prioritized sweeping: Reinforcement learning with less data and less real time. Machine Learning, 13, 103-130.
-
(1993)
Machine Learning
, vol.13
, pp. 103-130
-
-
Moore, A.1
Atkeson, C.2
-
16
-
-
0032208296
-
Learning team strategies: Soccer case studies
-
Salustowicz, K. P., Wiering, M. A., & Schmidhuber, J. (1998). Learning team strategies: Soccer case studies. Machine Learning, 33, 263-282.
-
(1998)
Machine Learning
, vol.33
, pp. 263-282
-
-
Salustowicz, K.P.1
Wiering, M.A.2
Schmidhuber, J.3
-
17
-
-
0030050933
-
Multiagent reinforcement learning in the iterated prisoner's dilemma
-
Sandholm, T. W., & Crites, R. H. (1995). Multiagent reinforcement learning in the iterated prisoner's dilemma. Biosystems, 37, 147-166.
-
(1995)
Biosystems
, vol.37
, pp. 147-166
-
-
Sandholm, T.W.1
Crites, R.H.2
-
18
-
-
0034131785
-
On-line EM algorithm for the normalized Gaussian network
-
Sato, M., & Ishii, S. (2000). On-line EM algorithm for the normalized Gaussian network. Neural Computation, 12, 407-432.
-
(2000)
Neural Computation
, vol.12
, pp. 407-432
-
-
Sato, M.1
Ishii, S.2
-
22
-
-
0000985504
-
TD-Gammon, a self-teaching backgammon program, achieves master-level play
-
Tesauro, G. J. (1994). TD-Gammon, a self-teaching backgammon program, achieves master-level play. Neural Computation, 6, 215-219.
-
(1994)
Neural Computation
, vol.6
, pp. 215-219
-
-
Tesauro, G.J.1
-
23
-
-
0029250080
-
Reinforcement learning of non-Markov decision processes
-
Whitehead, S., & Lin, L.-J. (1995). Reinforcement learning of non-Markov decision processes. Artificial Intelligence, 73, 271-306.
-
(1995)
Artificial Intelligence
, vol.73
, pp. 271-306
-
-
Whitehead, S.1
Lin, L.-J.2