-
2
-
-
84879976780
-
The arcade learning environment: An evaluation platform for general agents
-
Bellemare, M. G.; Naddaf, Y.; Veness, J.; and Bowling, M. 2013. The arcade learning environment: An evaluation platform for general agents. Journal of Artificial Intelligence Research 47: 253-279.
-
(2013)
Journal of Artificial Intelligence Research
, vol.47
, pp. 253-279
-
-
Bellemare, M.G.1
Naddaf, Y.2
Veness, J.3
Bowling, M.4
-
3
-
-
80053022338
-
Optimal policy switching algorithms for reinforcement learning
-
Comanici, G., and Precup, D. 2010. Optimal policy switching algorithms for reinforcement learning. In AAMAS, 709-714.
-
(2010)
AAMAS
, pp. 709-714
-
-
Comanici, G.1
Precup, D.2
-
4
-
-
78651097494
-
Skill characterization based on betweenness
-
Şimşek, O., and Barto, A. G. 2009. Skill characterization based on betweenness. In NIPS 21, 1497-1504.
-
(2009)
NIPS
, vol.21
, pp. 1497-1504
-
-
Şimşek, O.1
Barto, A.G.2
-
5
-
-
84982855450
-
Probabilistic inference for determining options in reinforcement learning
-
Daniel, C.; van Hoof, H.; Peters, J.; and Neumann, G. 2016. Probabilistic inference for determining options in reinforcement learning. Machine Learning, Special Issue 104 (2): 337-357.
-
(2016)
Machine Learning, Special Issue
, vol.104
, Issue.2
, pp. 337-357
-
-
Daniel, C.1
Van Hoof, H.2
Peters, J.3
Neumann, G.4
-
7
-
-
84898938510
-
Actor-critic algorithms
-
Konda, V. R., and Tsitsiklis, J. N. 2000. Actor-critic algorithms. In NIPS 12, 1008-1014.
-
(2000)
NIPS
, vol.12
, pp. 1008-1014
-
-
Konda, V.R.1
Tsitsiklis, J.N.2
-
8
-
-
80055032021
-
Skill discovery in continuous reinforcement learning domains using skill chaining
-
Konidaris, G., and Barto, A. 2009. Skill discovery in continuous reinforcement learning domains using skill chaining. In NIPS 22, 1015-1023.
-
(2009)
NIPS
, vol.22
, pp. 1015-1023
-
-
Konidaris, G.1
Barto, A.2
-
11
-
-
85019246453
-
Hierarchical deep reinforcement learning: Integrating temporal abstraction and intrinsic motivation
-
Kulkarni, T.; Narasimhan, K.; Saeedi, A.; and Tenenbaum, J. 2016. Hierarchical deep reinforcement learning: Integrating temporal abstraction and intrinsic motivation. In NIPS 29.
-
(2016)
NIPS
, vol.29
-
-
Kulkarni, T.1
Narasimhan, K.2
Saeedi, A.3
Tenenbaum, J.4
-
12
-
-
84861674687
-
Unified inter and intra options learning using policy gradient methods
-
Levy, K. Y., and Shimkin, N. 2011. Unified inter and intra options learning using policy gradient methods. In EWRL, 153-164.
-
(2011)
EWRL
, pp. 153-164
-
-
Levy, K.Y.1
Shimkin, N.2
-
13
-
-
85018897525
-
Adaptive skills, adaptive partitions (ASAP)
-
Mankowitz, D. J.; Mann, T. A.; and Mannor, S. 2016. Adaptive skills, adaptive partitions (ASAP). In NIPS 29.
-
(2016)
NIPS
, vol.29
-
-
Mankowitz, D.J.1
Mann, T.A.2
Mannor, S.3
-
14
-
-
84919807958
-
Time-regularized interrupting options (TRIO)
-
Mann, T. A.; Mankowitz, D. J.; and Mannor, S. 2014. Time-regularized interrupting options (TRIO). In ICML, 1350-1358.
-
(2014)
ICML
, pp. 1350-1358
-
-
Mann, T.A.1
Mankowitz, D.J.2
Mannor, S.3
-
16
-
-
0013465187
-
Automatic discovery of subgoals in reinforcement learning using diverse density
-
McGovern, A., and Barto, A. G. 2001. Automatic discovery of subgoals in reinforcement learning using diverse density. In ICML, 361-368.
-
(2001)
ICML
, pp. 361-368
-
-
McGovern, A.1
Barto, A.G.2
-
17
-
-
84945250000
-
Q-cut - dynamic discovery of sub-goals in reinforcement learning
-
Menache, I.; Mannor, S.; and Shimkin, N. 2002. Q-cut - dynamic discovery of sub-goals in reinforcement learning. In ECML, 295-306.
-
(2002)
ECML
, pp. 295-306
-
-
Menache, I.1
Mannor, S.2
Shimkin, N.3
-
18
-
-
84904867557
-
-
Playing Atari with deep reinforcement learning
-
Mnih, V.; Kavukcuoglu, K.; Silver, D.; Graves, A.; Antonoglou, I.; Wierstra, D.; and Riedmiller, M. A. 2013. Playing Atari with deep reinforcement learning. CoRR abs/1312.5602.
-
(2013)
CoRR abs/1312.5602
-
-
Mnih, V.1
Kavukcuoglu, K.2
Silver, D.3
Graves, A.4
Antonoglou, I.5
Wierstra, D.6
Riedmiller, M.A.7
-
19
-
-
84999036937
-
Asynchronous methods for deep reinforcement learning
-
Mnih, V.; Badia, A. P.; Mirza, M.; Graves, A.; Lillicrap, T. P.; Harley, T.; Silver, D.; and Kavukcuoglu, K. 2016. Asynchronous methods for deep reinforcement learning. In ICML.
-
(2016)
ICML
-
-
Mnih, V.1
Badia, A.P.2
Mirza, M.3
Graves, A.4
Lillicrap, T.P.5
Harley, T.6
Silver, D.7
Kavukcuoglu, K.8
-
23
-
-
84867135062
-
Compositional planning using optimal option models
-
Silver, D., and Ciosek, K. 2012. Compositional planning using optimal option models. In ICML.
-
(2012)
ICML
-
-
Silver, D.1
Ciosek, K.2
-
24
-
-
84868298774
-
Linear options
-
Sorg, J., and Singh, S. P. 2010. Linear options. In AAMAS, 31-38.
-
(2010)
AAMAS
, pp. 31-38
-
-
Sorg, J.1
Singh, S.P.2
-
25
-
-
84912073624
-
Learning options in reinforcement learning
-
Stolle, M., and Precup, D. 2002. Learning options in reinforcement learning. In Abstraction, Reformulation and Approximation, 5th International Symposium, SARA Proceedings, 212-223.
-
(2002)
Abstraction, Reformulation and Approximation, 5th International Symposium, SARA Proceedings
, pp. 212-223
-
-
Stolle, M.1
Precup, D.2
-
26
-
-
84898939480
-
Policy gradient methods for reinforcement learning with function approximation
-
Sutton, R. S.; McAllester, D. A.; Singh, S. P.; and Mansour, Y. 2000. Policy gradient methods for reinforcement learning with function approximation. In NIPS 12, 1057-1063.
-
(2000)
NIPS
, vol.12
, pp. 1057-1063
-
-
Sutton, R.S.1
McAllester, D.A.2
Singh, S.P.3
Mansour, Y.4
-
27
-
-
0033170372
-
Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning
-
Sutton, R. S.; Precup, D.; and Singh, S. P. 1999. Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence 112 (1-2): 181-211.
-
(1999)
Artificial Intelligence
, vol.112
, Issue.1-2
, pp. 181-211
-
-
Sutton, R.S.1
Precup, D.2
Singh, S.P.3
-
29
-
-
84919794661
-
Bias in natural actor-critic algorithms
-
Thomas, P. 2014. Bias in natural actor-critic algorithms. In ICML, 441-448.
-
(2014)
ICML
, pp. 441-448
-
-
Thomas, P.1
-
30
-
-
85017563254
-
Strategic attentive writer for learning macro-actions
-
Vezhnevets, A. S.; Mnih, V.; Agapiou, J.; Osindero, S.; Graves, A.; Vinyals, O.; and Kavukcuoglu, K. 2016. Strategic attentive writer for learning macro-actions. In NIPS 29.
-
(2016)
NIPS
, vol.29
-
-
Vezhnevets, A.S.1
Mnih, V.2
Agapiou, J.3
Osindero, S.4
Graves, A.5
Vinyals, O.6
Kavukcuoglu, K.7