3. OpenArena contributors. The OpenArena Manual, 2005. URL http://openarena.wikia.com/wiki/Manual.
4. Peter Dayan. Improving generalization for temporal difference learning: The successor representation. Neural Computation, 5(4):613-624, 1993.
5. Felix A. Gers, Jürgen Schmidhuber, and Fred Cummins. Learning to forget: Continual prediction with LSTM. Neural Computation, 12(10):2451-2471, 2000.
6. id Software. Quake3, 1999. URL https://github.com/id-Software/Quake-III-Arena.
7. Michal Kempka, Marek Wydmuch, Grzegorz Runc, Jakub Toczek, and Wojciech Jaśkowski. ViZDoom: A Doom-based AI research platform for visual reinforcement learning. arXiv preprint arXiv:1605.02097, 2016.
8. George Konidaris and Andrew G. Barto. Skill discovery in continuous reinforcement learning domains using skill chaining. In Advances in Neural Information Processing Systems, pp. 1015-1023, 2009.
11. Xiujun Li, Lihong Li, Jianfeng Gao, Xiaodong He, Jianshu Chen, Li Deng, and Ji He. Recurrent reinforcement learning: A hybrid approach. arXiv preprint arXiv:1509.03044, 2015.
13. Piotr Mirowski, Razvan Pascanu, Fabio Viola, Andrea Banino, Hubert Soyer, Andy Ballard, Misha Denil, Ross Goroshin, Laurent Sifre, Koray Kavukcuoglu, Dharshan Kumaran, and Raia Hadsell. Learning to navigate in complex environments, 2016.
14. Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, and Martin Riedmiller. Playing Atari with deep reinforcement learning. In NIPS Deep Learning Workshop, 2013.
15. Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, and Demis Hassabis. Human-level control through deep reinforcement learning. Nature, 518(7540):529-533, February 2015. URL http://dx.doi.org/10.1038/nature14236.
16. Volodymyr Mnih, Adrià Puigdomènech Badia, Mehdi Mirza, Alex Graves, Timothy P. Lillicrap, Tim Harley, David Silver, and Koray Kavukcuoglu. Asynchronous methods for deep reinforcement learning. In Proceedings of the 33rd International Conference on Machine Learning (ICML), pp. 1928-1937, 2016.
17. Junhyuk Oh, Xiaoxiao Guo, Honglak Lee, Richard L. Lewis, and Satinder Singh. Action-conditional video prediction using deep networks in Atari games. In Advances in Neural Information Processing Systems, pp. 2863-2871, 2015.
18. Junhyuk Oh, Valliappa Chockalingam, Satinder Singh, and Honglak Lee. Control of memory, active perception, and action in Minecraft. arXiv preprint arXiv:1605.09128, 2016.
19. H. Freyja Olafsdottir, Caswell Barry, Aman B. Saleem, Demis Hassabis, and Hugo J. Spiers. Hippocampal place cells construct reward related sequences through unexplored space. eLife, 4:e06063, 2015.
20. Jing Peng and Ronald J. Williams. Incremental multi-step Q-learning. Machine Learning, 22(1-3):283-290, 1996.
21. Daniel L. Schacter, Donna Rose Addis, Demis Hassabis, Victoria C. Martin, R. Nathan Spreng, and Karl K. Szpunar. The future of memory: Remembering, imagining, and the brain. Neuron, 76(4):677-694, 2012.
22. Tom Schaul, Daniel Horgan, Karol Gregor, and David Silver. Universal value function approximators. In Proceedings of the 32nd International Conference on Machine Learning (ICML-15), pp. 1312-1320, 2015a.
24. Jürgen Schmidhuber. Formal theory of creativity, fun, and intrinsic motivation (1990-2010). IEEE Transactions on Autonomous Mental Development, 2(3):230-247, 2010.
26. David Silver, Aja Huang, Chris J. Maddison, Arthur Guez, Laurent Sifre, George van den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, Marc Lanctot, et al. Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587):484-489, 2016.
27. Richard S. Sutton, David A. McAllester, Satinder P. Singh, Yishay Mansour, et al. Policy gradient methods for reinforcement learning with function approximation. In NIPS, volume 99, pp. 1057-1063, 1999a.
28. Richard S. Sutton, Doina Precup, and Satinder Singh. Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence, 1999b.
29. Richard S. Sutton, Joseph Modayil, Michael Delp, Thomas Degris, Patrick M. Pilarski, Adam White, and Doina Precup. Horde: A scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction. In The 10th International Conference on Autonomous Agents and Multiagent Systems, Volume 2, pp. 761-768. International Foundation for Autonomous Agents and Multiagent Systems, 2011.
30. Chen Tessler, Shahar Givony, Tom Zahavy, Daniel J. Mankowitz, and Shie Mannor. A deep hierarchical approach to lifelong learning in Minecraft. arXiv preprint arXiv:1604.07255, 2016.
33. Christopher Xie, Sachin Patil, Teodor Mihai Moldovan, Sergey Levine, and Pieter Abbeel. Model-based reinforcement learning with parametrized physical models and optimism-driven exploration. CoRR, abs/1509.06824, 2015.