-
7
-
-
79951481923
-
Convergent temporal-difference learning with arbitrary smooth function approximation
-
MIT Press
-
Maei, H.R., Szepesvári, C., Bhatnagar, S., Precup, D., Silver, D., Sutton, R.S.: Convergent temporal-difference learning with arbitrary smooth function approximation. In: Advances in Neural Information Processing Systems, vol. 22. MIT Press (2009)
-
(2009)
Advances in Neural Information Processing Systems
, vol.22
-
-
Maei, H.R.1
Szepesvári, C.2
Bhatnagar, S.3
Precup, D.4
Silver, D.5
Sutton, R.S.6
-
8
-
-
77956541799
-
Toward off-policy learning control with function approximation
-
Maei, H.R., Szepesvári, C., Bhatnagar, S., Sutton, R.S.: Toward off-policy learning control with function approximation. In: Proceedings of the 27th International Conference on Machine Learning (2010)
-
(2010)
Proceedings of the 27th International Conference on Machine Learning
-
-
Maei, H.R.1
Szepesvári, C.2
Bhatnagar, S.3
Sutton, R.S.4
-
9
-
-
14344250635
-
Dynamic abstraction in reinforcement learning via clustering
-
Mannor, S., Menache, I., Hoze, A., Klein, U.: Dynamic abstraction in reinforcement learning via clustering. In: Proceedings of the Twenty-First International Conference on Machine Learning (2004)
-
(2004)
Proceedings of the Twenty-First International Conference on Machine Learning
-
-
Mannor, S.1
Menache, I.2
Hoze, A.3
Klein, U.4
-
11
-
-
84864861073
-
Multi-timescale nexting in a reinforcement learning robot
-
Modayil, J., White, A., Sutton, R.S.: Multi-timescale nexting in a reinforcement learning robot. In: Proceedings of the 2012 Conference on Simulation of Adaptive Behaviour (to appear, 2012)
-
Proceedings of the 2012 Conference on Simulation of Adaptive Behaviour (to appear, 2012)
-
-
Modayil, J.1
White, A.2
Sutton, R.S.3
-
15
-
-
84899031920
-
Intrinsically motivated reinforcement learning
-
Singh, S., Barto, A.G., Chentanez, N.: Intrinsically motivated reinforcement learning. In: Advances in Neural Information Processing Systems, vol. 17, pp. 1281-1288 (2005)
-
(2005)
Advances in Neural Information Processing Systems
, vol.17
, pp. 1281-1288
-
-
Singh, S.1
Barto, A.G.2
Chentanez, N.3
-
16
-
-
84912073624
-
Learning options in reinforcement learning
-
In: Koenig, S., Holte, R.C. (eds.) Springer, Heidelberg
-
Stolle, M., Precup, D.: Learning Options in Reinforcement Learning. In: Koenig, S., Holte, R.C. (eds.) SARA 2002. LNCS (LNAI), vol. 2371, pp. 212-223. Springer, Heidelberg (2002)
-
(2002)
SARA 2002. LNCS (LNAI)
, vol.2371
, pp. 212-223
-
-
Stolle, M.1
Precup, D.2
-
17
-
-
84864837762
-
-
http://richsutton.com/IncIdeas/KeytoAI.html
-
Sutton, R.S.: "Verification" and "Verification, the key to AI" (2001), http://richsutton.com/IncIdeas/Verification.html, http://richsutton.com/IncIdeas/KeytoAI.html
-
(2001)
Verification and Verification, the key to AI
-
-
Sutton, R.S.1
-
20
-
-
71149099079
-
Fast gradient-descent methods for temporal-difference learning with linear function approximation
-
Sutton, R.S., Maei, H.R., Precup, D., Bhatnagar, S., Silver, D., Szepesvári, C., Wiewiora, E.: Fast gradient-descent methods for temporal-difference learning with linear function approximation. In: Proceedings of the 26th International Conference on Machine Learning (2009)
-
(2009)
Proceedings of the 26th International Conference on Machine Learning
-
-
Sutton, R.S.1
Maei, H.R.2
Precup, D.3
Bhatnagar, S.4
Silver, D.5
Szepesvári, C.6
Wiewiora, E.7
-
21
-
-
84899464022
-
Horde: A scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction
-
Sutton, R.S., Modayil, J., Delp, M., Degris, T., Pilarski, P.M., White, A., Precup, D.: Horde: A scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction. In: Proceedings of the 10th International Conference on Autonomous Agents and Multiagent Systems, AAMAS (2011)
-
(2011)
Proceedings of the 10th International Conference on Autonomous Agents and Multiagent Systems, AAMAS
-
-
Sutton, R.S.1
Modayil, J.2
Delp, M.3
Degris, T.4
Pilarski, P.M.5
White, A.6
Precup, D.7
-
22
-
-
0033170372
-
Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning
-
Sutton, R.S., Precup, D., Singh, S.: Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence 112, 181-211 (1999)
-
(1999)
Artificial Intelligence
, vol.112
, pp. 181-211
-
-
Sutton, R.S.1
Precup, D.2
Singh, S.3
-
23
-
-
77956513316
-
A convergent O(n) algorithm for off-policy temporal-difference learning with linear function approximation
-
MIT Press
-
Sutton, R.S., Szepesvári, C., Maei, H.R.: A convergent O(n) algorithm for off-policy temporal-difference learning with linear function approximation. In: Advances in Neural Information Processing Systems, vol. 21. MIT Press (2009)
-
(2009)
Advances in Neural Information Processing Systems
, vol.21
-
-
Sutton, R.S.1
Szepesvári, C.2
Maei, H.R.3
-
24
-
-
0031143730
-
An analysis of temporal-difference learning with function approximation
-
Tsitsiklis, J.N., Van Roy, B.: An analysis of temporal-difference learning with function approximation. IEEE Transactions on Automatic Control 42, 674-690 (1997)
-
(1997)
IEEE Transactions on Automatic Control
, vol.42
, pp. 674-690
-
-
Tsitsiklis, J.N.1
Van Roy, B.2