-
1
-
-
0039816976
-
Using local trajectory optimizers to speed up global optimization in dynamic programming
-
Atkeson, C. (1993). Using local trajectory optimizers to speed up global optimization in dynamic programming. Advances in Neural Information Processing Systems, 5:663-670.
-
(1993)
Advances in Neural Information Processing Systems
, vol.5
, pp. 663-670
-
-
Atkeson, C.1
-
4
-
-
0034248853
-
Stochastic dynamic programming with factored representations
-
Boutilier, C., Dearden, R., Goldszmidt, M. (2000). Stochastic dynamic programming with factored representations. Artificial Intelligence, 121:49-107.
-
(2000)
Artificial Intelligence
, vol.121
, pp. 49-107
-
-
Boutilier, C.1
Dearden, R.2
Goldszmidt, M.3
-
7
-
-
0036832950
-
Technical update: Least-squares temporal difference learning
-
DOI 10.1023/A:1017936530646
-
Boyan, J. A. (2002). Technical update: Least-squares temporal difference learning. Machine Learning, 49:233-246. (Pubitemid 34325688)
-
(2002)
Machine Learning
, vol.49
, Issue.2-3
, pp. 233-246
-
-
Boyan, J.A.1
-
8
-
-
0001771345
-
Linear least-squares algorithms for temporal difference learning
-
Bradtke, S., Barto, A. G. (1996). Linear least-squares algorithms for temporal difference learning. Machine Learning, 22:33-57. (Pubitemid 126724362)
-
(1996)
Machine Learning
, vol.22
, Issue.1-3
, pp. 33-57
-
-
Bradtke, S.J.1
-
10
-
-
0030242092
-
General results on the convergence of stochastic algorithms
-
PII S0018928696067748
-
Delyon, B. (1996). General results on the convergence of stochastic algorithms. IEEE Transactions on Automatic Control, 41:1245-1255. (Pubitemid 126768500)
-
(1996)
IEEE Transactions on Automatic Control
, vol.41
, Issue.9
, pp. 1245-1255
-
-
Delyon, B.1
-
11
-
-
33750737011
-
Incremental least-squares temporal difference learning
-
Proceedings of the 21st National Conference on Artificial Intelligence and the 18th Innovative Applications of Artificial Intelligence Conference, AAAI-06/IAAI-06
-
Geramifard, A., Bowling, M., Sutton, R. S. (2006). Incremental least-squares temporal difference learning. Proceedings of the National Conference on Artificial Intelligence, pp. 356-361. (Pubitemid 44705310)
-
(2006)
Proceedings of the National Conference on Artificial Intelligence
, vol.1
, pp. 356-361
-
-
Geramifard, A.1
Bowling, M.2
Sutton, R.S.3
-
12
-
-
0037631834
-
Model-based reinforcement learning with an approximate, learned model
-
Yale University, New Haven, CT
-
Kuvayev, L., Sutton, R. S. (1996). Model-based reinforcement learning with an approximate, learned model. Proceedings of the Ninth Yale Workshop on Adaptive and Learning Systems, pp. 101-105, Yale University, New Haven, CT.
-
(1996)
Proceedings of the Ninth Yale Workshop on Adaptive and Learning Systems
, pp. 101-105
-
-
Kuvayev, L.1
Sutton, R.S.2
-
15
-
-
0027684215
-
Prioritized sweeping: Reinforcement learning with less data and less real time
-
Moore, A. W., Atkeson, C. G. (1993). Prioritized sweeping: Reinforcement learning with less data and less real time. Machine Learning, 13:103-130.
-
(1993)
Machine Learning
, vol.13
, pp. 103-130
-
-
Moore, A.W.1
Atkeson, C.G.2
-
17
-
-
84977063352
-
Efficient learning and planning within the Dyna framework
-
Peng, J., Williams, R. J. (1993). Efficient learning and planning within the Dyna framework. Adaptive Behavior, 1:437-454.
-
(1993)
Adaptive Behavior
, vol.1
, pp. 437-454
-
-
Peng, J.1
Williams, R.J.2
-
19
-
-
0038145011
-
Temporal difference learning applied to a high-performance game-playing program
-
Schaeffer, J., Hlynka, M., Jussila, V. (2001). Temporal difference learning applied to a high-performance game-playing program. Proceedings of the International Joint Conference on Artificial Intelligence, pp. 529-534.
-
(2001)
Proceedings of the International Joint Conference on Artificial Intelligence
, pp. 529-534
-
-
Schaeffer, J.1
Hlynka, M.2
Jussila, V.3
-
20
-
-
84880900542
-
Reinforcement learning of local shape in the game of Go
-
Silver, D., Sutton, R. S., Müller, M. (2007). Reinforcement learning of local shape in the game of Go. Proceedings of the 20th International Joint Conference on Artificial Intelligence, pp. 1053-1058.
-
(2007)
Proceedings of the 20th International Joint Conference on Artificial Intelligence
, pp. 1053-1058
-
-
Silver, D.1
Sutton, R.S.2
Müller, M.3
-
22
-
-
33847202724
-
Learning to predict by the method of temporal differences
-
Sutton, R. S. (1988). Learning to predict by the method of temporal differences. Machine Learning, 3:9-44.
-
(1988)
Machine Learning
, vol.3
, pp. 9-44
-
-
Sutton, R.S.1
-
23
-
-
85132026293
-
Integrated architectures for learning, planning, and reacting based on approximating dynamic programming
-
Sutton, R. S. (1990). Integrated architectures for learning, planning, and reacting based on approximating dynamic programming. Proceedings of the Seventh International Conference on Machine Learning, pp. 216-224.
-
(1990)
Proceedings of the Seventh International Conference on Machine Learning
, pp. 216-224
-
-
Sutton, R.S.1