-
1
-
-
0036568025
-
Finite-time analysis of the multiarmed bandit problem
-
P. Auer, N. Cesa-Bianchi, and P. Fischer. Finite-time analysis of the multiarmed bandit problem. Machine Learning, 47:235-256, 2002.
-
(2002)
Machine Learning, vol. 47, pp. 235-256
-
-
Auer, P.
Cesa-Bianchi, N.
Fischer, P.
-
2
-
-
85031088945
-
-
CoRR, abs/1612.03801
-
C. Beattie, J. Z. Leibo, D. Teplyashin, T. Ward, M. Wainwright, H. Küttler, A. Lefrancq, S. Green, V. Valdés, A. Sadik, J. Schrittwieser, K. Anderson, S. York, M. Cant, A. Cain, A. Bolton, S. Gaffney, H. King, D. Hassabis, S. Legg, and S. Petersen. DeepMind Lab. CoRR, abs/1612.03801, 2016. URL http://arxiv.org/abs/1612.03801.
-
(2016)
DeepMind Lab
-
-
Beattie, C.
Leibo, J.Z.
Teplyashin, D.
Ward, T.
Wainwright, M.
Küttler, H.
Lefrancq, A.
Green, S.
Valdés, V.
Sadik, A.
Schrittwieser, J.
Anderson, K.
York, S.
Cant, M.
Cain, A.
Bolton, A.
Gaffney, S.
King, H.
Hassabis, D.
Legg, S.
Petersen, S.
-
3
-
-
84879976780
-
The arcade learning environment: An evaluation platform for general agents
-
M. G. Bellemare, Y. Naddaf, J. Veness, and M. Bowling. The Arcade Learning Environment: An evaluation platform for general agents. Journal of Artificial Intelligence Research, 47:253-279, 2013.
-
(2013)
Journal of Artificial Intelligence Research, vol. 47, pp. 253-279
-
-
Bellemare, M.G.
Naddaf, Y.
Veness, J.
Bowling, M.
-
9
-
-
0001596874
-
Intuitive physics
-
M. McCloskey. Intuitive physics. Scientific American, 248(4):122-130, 1983.
-
(1983)
Scientific American, vol. 248, issue 4, pp. 122-130
-
-
McCloskey, M.
-
10
-
-
84924051598
-
Human-level control through deep reinforcement learning
-
V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski, S. Petersen, C. Beattie, A. Sadik, I. Antonoglou, H. King, D. Kumaran, D. Wierstra, S. Legg, and D. Hassabis. Human-level control through deep reinforcement learning. Nature, 518(7540):529-533, 2015. URL http://dx.doi.org/10.1038/nature14236.
-
(2015)
Nature, vol. 518, issue 7540, pp. 529-533
-
-
Mnih, V.
Kavukcuoglu, K.
Silver, D.
Rusu, A.A.
Veness, J.
Bellemare, M.G.
Graves, A.
Riedmiller, M.
Fidjeland, A.K.
Ostrovski, G.
Petersen, S.
Beattie, C.
Sadik, A.
Antonoglou, I.
King, H.
Kumaran, D.
Wierstra, D.
Legg, S.
Hassabis, D.
-
11
-
-
84999036937
-
Asynchronous methods for deep reinforcement learning
-
V. Mnih, A. Puigdomènech Badia, M. Mirza, A. Graves, T. P. Lillicrap, T. Harley, D. Silver, and K. Kavukcuoglu. Asynchronous methods for deep reinforcement learning. In Proceedings of the 33rd International Conference on Machine Learning (ICML), 2016.
-
(2016)
Proceedings of the 33rd International Conference on Machine Learning (ICML)
-
-
Mnih, V.
Puigdomènech Badia, A.
Mirza, M.
Graves, A.
Lillicrap, T.P.
Harley, T.
Silver, D.
Kavukcuoglu, K.
-
12
-
-
67349283062
-
Reinforcement learning in the brain
-
Y. Niv. Reinforcement learning in the brain. Journal of Mathematical Psychology, 53(3):139-154, 2009.
-
(2009)
Journal of Mathematical Psychology, vol. 53, issue 3, pp. 139-154
-
-
Niv, Y.
-
13
-
-
84965178314
-
Action-conditional video prediction using deep networks in Atari games
-
J. Oh, X. Guo, H. Lee, R. L. Lewis, and S. P. Singh. Action-conditional video prediction using deep networks in Atari games. In Advances in Neural Information Processing Systems 28 (NIPS), pp. 2863-2871, 2015. URL http://arxiv.org/abs/1507.08750.
-
(2015)
Advances in Neural Information Processing Systems 28 (NIPS), pp. 2863-2871
-
-
Oh, J.
Guo, X.
Lee, H.
Lewis, R.L.
Singh, S.P.
-
14
-
-
0035495009
-
A sensorimotor account of vision and visual consciousness
-
J. K. O'Regan and A. Noë. A sensorimotor account of vision and visual consciousness. Behavioral and Brain Sciences, 24(5):939-973, 2001.
-
(2001)
Behavioral and Brain Sciences, vol. 24, pp. 939-973
-
-
O'Regan, J.K.
Noë, A.
-
15
-
-
34047267520
-
Intrinsic motivation systems for autonomous mental development
-
P.-Y. Oudeyer, F. Kaplan, and V. V. Hafner. Intrinsic motivation systems for autonomous mental development. IEEE Transactions on Evolutionary Computation, 11(2):265-286, 2007.
-
(2007)
IEEE Transactions on Evolutionary Computation, vol. 11, issue 2, pp. 265-286
-
-
Oudeyer, P.-Y.
Kaplan, F.
Hafner, V.V.
-
17
-
-
70349349170
-
-
Cambridge University Press
-
J. Pearl. Causality. Cambridge University Press, 2009.
-
(2009)
Causality
-
-
Pearl, J.
-
19
-
-
85006142438
-
-
CoRR, abs/1512.08836
-
W. Sun, A. Venkatraman, B. Boots, and J. A. Bagnell. Learning to filter with predictive state inference machines. CoRR, abs/1512.08836, 2015. URL http://arxiv.org/abs/1512.08836.
-
(2015)
Learning to Filter with Predictive State Inference Machines
-
-
Sun, W.
Venkatraman, A.
Boots, B.
Bagnell, J.A.
-
23
-
-
84965129327
-
Embed to control: A locally linear latent dynamics model for control from raw images
-
M. Watter, J. Springenberg, J. Boedecker, and M. Riedmiller. Embed to control: A locally linear latent dynamics model for control from raw images. In Advances in Neural Information Processing Systems 28 (NIPS), pp. 2728-2736, 2015.
-
(2015)
Advances in Neural Information Processing Systems 28 (NIPS), pp. 2728-2736
-
-
Watter, M.
Springenberg, J.
Boedecker, J.
Riedmiller, M.
-
24
-
-
0001765578
-
Gradient-based learning algorithms for recurrent networks and their computational complexity
-
R. J. Williams and D. Zipser. Gradient-based learning algorithms for recurrent networks and their computational complexity. In Backpropagation: Theory, Architectures, and Applications, pp. 433-486, 1995.
-
(1995)
Backpropagation: Theory, Architectures, and Applications, pp. 433-486
-
-
Williams, R.J.
Zipser, D.
-
25
-
-
76249122848
-
-
v1.3.5
-
B. Wymann, E. Espié, C. Guionneau, C. Dimitrakakis, R. Coulom, and A. Sumner. TORCS: The Open Racing Car Simulator, v1.3.5, 2013. URL http://www.torcs.org.
-
(2013)
TORCS: The Open Racing Car Simulator
-
-
Wymann, B.
Espié, E.
Guionneau, C.
Dimitrakakis, C.
Coulom, R.
Sumner, A.