1. P. Abbeel, A. Coates, M. Quigley, and A. Y. Ng. An application of reinforcement learning to aerobatic helicopter flight. In NIPS 19, 2007.
4. C. Claus and C. Boutilier. The dynamics of reinforcement learning in cooperative multiagent systems. In Proceedings of the 15th National Conference on Artificial Intelligence, pages 746-752, Menlo Park, CA, 1998. AAAI Press/MIT Press.
7. A. Lazaric, M. Restelli, and A. Bonarini. Reinforcement learning in continuous action spaces through sequential Monte Carlo methods. In J. Platt, D. Koller, Y. Singer, and S. Roweis, editors, Advances in Neural Information Processing Systems 20, pages 833-840, Cambridge, MA, 2008. MIT Press.
8. M. L. Littman. Markov games as a framework for multi-agent reinforcement learning. In Proc. of the 11th Int. Conf. on Machine Learning, pages 157-163, San Mateo, CA, 1994. Morgan Kaufmann.
9. A. Y. Ng, D. Harada, and S. Russell. Policy invariance under reward transformations: Theory and application to reward shaping. In Proc. 16th International Conf. on Machine Learning, pages 278-287. Morgan Kaufmann, 1999.
10. T. J. Perkins and A. G. Barto. Lyapunov-constrained action sets for reinforcement learning. In Proceedings of the ICML, pages 409-416, 2001.
13. R. Sutton, D. Precup, and S. Singh. Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence, 112:181-211, 1999.
14. G. Tesauro. Temporal difference learning and TD-gammon. Communications of the ACM, 38(3):58-68, 1995.
15. E. Wiewiora. Potential based shaping and Q-value initialization are equivalent. Journal of Artificial Intelligence Research, pages 205-208, 2003.