-
2
-
-
0003787146
-
-
Princeton University Press. Princeton, NJ. June
-
R. E. Bellman. Dynamic Programming. Princeton University Press. Princeton, NJ. June 1957.
-
(1957)
Dynamic Programming
-
-
Bellman, R.E.1
-
3
-
-
85162049326
-
Incremental natural actor-critic algorithms
-
J. Platt, D. Koller, Y. Singer, and S. Roweis, editors, MIT Press, Cambridge, MA
-
S. Bhatnagar, R. Sutton, M. Ghavamzadeh, and M. Lee. Incremental natural actor-critic algorithms. In J. Platt, D. Koller, Y. Singer, and S. Roweis, editors, Advances in Neural Information Processing Systems 20, pages 105-112. MIT Press, Cambridge, MA, 2008.
-
(2008)
Advances in Neural Information Processing Systems
, vol.20
, pp. 105-112
-
-
Bhatnagar, S.1
Sutton, R.2
Ghavamzadeh, M.3
Lee, M.4
-
4
-
-
0028564629
-
Acting optimally in partially observable stochastic domains
-
Seattle, Washington, USA, AAAI Press/MIT Press
-
A. R. Cassandra, L. P. Kaelbling, and M. L. Littman. Acting optimally in partially observable stochastic domains. In Proceedings of the Twelfth National Conference on Artificial Intelligence, volume 2, pages 1023-1028, Seattle, Washington, USA, 1994. AAAI Press/MIT Press.
-
(1994)
Proceedings of the Twelfth National Conference on Artificial Intelligence
, vol.2
, pp. 1023-1028
-
-
Cassandra, A.R.1
Kaelbling, L.P.2
Littman, M.L.3
-
5
-
-
85156187730
-
Improving elevator performance using reinforcement learning
-
D. S. Touretzky, M. Mozer, and M. E. Hasselmo, editors, NIPS, Denver, CO, November 27-30, 1995, MIT Press
-
R. H. Crites and A. G. Barto. Improving elevator performance using reinforcement learning. In D. S. Touretzky, M. Mozer, and M. E. Hasselmo, editors, Advances in Neural Information Processing Systems 8, NIPS, Denver, CO, November 27-30, 1995, pages 1017-1023. MIT Press, 1996.
-
(1996)
Advances in Neural Information Processing Systems
, vol.8
, pp. 1017-1023
-
-
Crites, R.H.1
Barto, A.G.2
-
6
-
-
17444409624
-
A tutorial on the cross-entropy method
-
P. T. De Boer, D. P. Kroese, S. Mannor, and R. Rubinstein. A tutorial on the cross-entropy method. Annals of Operations Research, 134(1):19-67, 2005.
-
(2005)
Annals of Operations Research
, vol.134
, Issue.1
, pp. 19-67
-
-
De Boer, P.T.1
Kroese, D.P.2
Mannor, S.3
Rubinstein, R.4
-
7
-
-
33646243319
-
A natural policy gradient
-
T. G. Dietterich, S. Becker, and Z. Ghahramani, editors, MIT Press
-
S. Kakade. A natural policy gradient. In T. G. Dietterich, S. Becker, and Z. Ghahramani, editors, Advances in Neural Information Processing Systems 14, pages 1531-1538. MIT Press, 2001.
-
(2001)
Advances in Neural Information Processing Systems
, vol.14
, pp. 1531-1538
-
-
Kakade, S.1
-
10
-
-
0012327484
-
Using eligibility traces to find the best memoryless policy in partially observable Markov decision processes
-
Morgan Kaufmann
-
J. Loch and S. Singh. Using eligibility traces to find the best memoryless policy in partially observable Markov decision processes. In Proceedings of the Fifteenth International Conference on Machine Learning, pages 323-331. Morgan Kaufmann, 1998.
-
(1998)
Proceedings of the Fifteenth International Conference on Machine Learning
, pp. 323-331
-
-
Loch, J.1
Singh, S.2
-
11
-
-
84898980684
-
Autonomous helicopter flight via reinforcement learning
-
S. Thrun, L. Saul, and B. Schölkopf, editors, MIT Press, Cambridge, MA
-
A. Y. Ng, H. J. Kim, M. I. Jordan, and S. Sastry. Autonomous helicopter flight via reinforcement learning. In S. Thrun, L. Saul, and B. Schölkopf, editors, Advances in Neural Information Processing Systems 16. MIT Press, Cambridge, MA, 2004.
-
(2004)
Advances in Neural Information Processing Systems
, vol.16
-
-
Ng, A.Y.1
Kim, H.J.2
Jordan, M.I.3
Sastry, S.4
-
12
-
-
84898960655
-
A convergent form of approximate policy iteration
-
S. Becker, S. Thrun, and K. Obermayer, editors, MIT Press, Cambridge, MA
-
T. J. Perkins and D. Precup. A convergent form of approximate policy iteration. In S. Becker, S. Thrun, and K. Obermayer, editors, Advances in Neural Information Processing Systems 15, pages 1595-1602. MIT Press, Cambridge, MA, 2003.
-
(2003)
Advances in Neural Information Processing Systems
, vol.15
, pp. 1595-1602
-
-
Perkins, T.J.1
Precup, D.2
-
13
-
-
40649106649
-
Natural actor-critic
-
J. Peters and S. Schaal. Natural actor-critic. Neurocomputing, 71(7-9):1180-1190, 2008.
-
(2008)
Neurocomputing
, vol.71
, Issue.7-9
, pp. 1180-1190
-
-
Peters, J.1
Schaal, S.2
-
16
-
-
27544506565
-
Reinforcement learning for RoboCup-soccer keepaway
-
P. Stone, R. S. Sutton, and G. Kuhlmann. Reinforcement learning for RoboCup-soccer keepaway. Adaptive Behavior, 13(3):165-188, 2005.
-
(2005)
Adaptive Behavior
, vol.13
, Issue.3
, pp. 165-188
-
-
Stone, P.1
Sutton, R.S.2
Kuhlmann, G.3
-
18
-
-
33845344721
-
Learning Tetris using the noisy cross-entropy method
-
I. Szita and A. Lorincz. Learning Tetris using the noisy cross-entropy method. Neural Computation, 18:2936-2941, 2006.
-
(2006)
Neural Computation
, vol.18
, pp. 2936-2941
-
-
Szita, I.1
Lorincz, A.2
-
20
-
-
27544473171
-
Behavior transfer for value-function-based reinforcement learning
-
F. Dignum, V. Dignum, S. Koenig, S. Kraus, M. P. Singh, and M. Wooldridge, editors, New York, NY, July, ACM Press
-
M. E. Taylor and P. Stone. Behavior transfer for value-function-based reinforcement learning. In F. Dignum, V. Dignum, S. Koenig, S. Kraus, M. P. Singh, and M. Wooldridge, editors, The Fourth International Joint Conference on Autonomous Agents and Multiagent Systems, pages 53-59, New York, NY, July 2005. ACM Press.
-
(2005)
The Fourth International Joint Conference on Autonomous Agents and Multiagent Systems
, pp. 53-59
-
-
Taylor, M.E.1
Stone, P.2
-
21
-
-
34548031419
-
On the use of hybrid reinforcement learning for autonomic resource allocation
-
G. Tesauro, N. K. Jong, R. Das, and M. N. Bennani. On the use of hybrid reinforcement learning for autonomic resource allocation. Cluster Computing, 10(3):287-299, 2007.
-
(2007)
Cluster Computing
, vol.10
, Issue.3
, pp. 287-299
-
-
Tesauro, G.1
Jong, N.K.2
Das, R.3
Bennani, M.N.4
-
23
-
-
33646714634
-
Evolutionary function approximation for reinforcement learning
-
May
-
S. Whiteson and P. Stone. Evolutionary function approximation for reinforcement learning. Journal of Machine Learning Research, 7:877-917, May 2006.
-
(2006)
Journal of Machine Learning Research
, vol.7
, pp. 877-917
-
-
Whiteson, S.1
Stone, P.2