-
2
-
-
85156187730
-
Improving elevator performance using reinforcement learning
-
Touretzky, D.S., Mozer, M.C., Hasselmo, M.E. (Eds.), MIT Press, Cambridge, MA
-
Crites, R.H., Barto, A.G. 1996. Improving elevator performance using reinforcement learning. In: Touretzky, D.S., Mozer, M.C., Hasselmo, M.E. (Eds.), Advances in Neural Information Processing Systems 8. MIT Press, Cambridge, MA, pp. 1017-1023.
-
(1996)
Advances in Neural Information Processing Systems 8
, pp. 1017-1023
-
-
Crites, R.H.1
Barto, A.G.2
-
3
-
-
0029679044
-
Reinforcement learning: a survey
-
Kaelbling, L.P., Littman, M.L., Moore, A.W., 1996. Reinforcement learning: a survey. Journal of Artificial Intelligence Research 4, 237-285.
-
(1996)
Journal of Artificial Intelligence Research
, vol.4
, pp. 237-285
-
-
Kaelbling, L.P.1
Littman, M.L.2
Moore, A.W.3
-
5
-
-
85040497000
-
-
AAAI-98 Proceedings
-
Meuleau, N., Hauskrecht, M., Kim, K., Peshkin, L., Kaelbling, L.P., Dean, T., Boutilier, C., 1998. Solving very large weakly coupled Markov decision processes. In: AAAI-98 Proceedings.
-
(1998)
Solving very large weakly coupled Markov decision processes
-
-
Meuleau, N.1
Hauskrecht, M.2
Kim, K.3
Peshkin, L.4
Kaelbling, L.P.5
Dean, T.6
Boutilier, C.7
-
7
-
-
0029332288
-
CABINS: a framework of knowledge acquisition and iterative revision for schedule improvement and reactive repair
-
Miyashita, K., Sycara, K., 1995. CABINS: a framework of knowledge acquisition and iterative revision for schedule improvement and reactive repair. Artificial Intelligence 76 (1-2), 377-426.
-
(1995)
Artificial Intelligence
, vol.76
, Issue.1-2
, pp. 377-426
-
-
Miyashita, K.1
Sycara, K.2
-
9
-
-
0031231885
-
Experiments with reinforcement learning in problems with continuous state and action spaces
-
Santamaría, J.C., Sutton, R.S., Ram, A., 1998. Experiments with reinforcement learning in problems with continuous state and action spaces. Adaptive Behavior 6 (2), 163-218.
-
(1998)
Adaptive Behavior
, vol.6
, Issue.2
, pp. 163-218
-
-
Santamaría, J.C.1
Sutton, R.S.2
Ram, A.3
-
10
-
-
33847202724
-
Learning to predict by the method of temporal differences
-
Sutton, R.S., 1988. Learning to predict by the method of temporal differences. Machine Learning 3 (1), 9-44.
-
(1988)
Machine Learning
, vol.3
, Issue.1
, pp. 9-44
-
-
Sutton, R.S.1
-
12
-
-
0004049893
-
Learning from delayed rewards
-
Cambridge University, UK
-
Watkins, C.J.C.H., 1989. Learning from delayed rewards. Ph.D. Thesis. Cambridge University, UK.
-
(1989)
Ph.D. Thesis
-
-
Watkins, C.J.C.H.1