-
2
-
-
0346149797
-
A robot that reinforcement-learns to identify and memorize important previous observations
-
Bakker, Bram, Zhumatiy, Viktor, Gruener, Gabriel, and Schmidhuber, Jürgen. A robot that reinforcement-learns to identify and memorize important previous observations. In Intelligent Robots and Systems, 2003.
-
(2003)
Intelligent Robots and Systems
-
-
Bakker, B.1
Zhumatiy, V.2
Gruener, G.3
Schmidhuber, J.4
-
3
-
-
84879976780
-
The arcade learning environment: An evaluation platform for general agents
-
06
-
Bellemare, M. G., Naddaf, Y., Veness, J., and Bowling, M. The arcade learning environment: An evaluation platform for general agents. Journal of Artificial Intelligence Research, 47:253-279, June 2013.
-
(2013)
Journal of Artificial Intelligence Research
, vol.47
, pp. 253-279
-
-
Bellemare, M.G.1
Naddaf, Y.2
Veness, J.3
Bowling, M.4
-
4
-
-
84888340666
-
Torch7: A Matlab-like environment for machine learning
-
Collobert, Ronan, Kavukcuoglu, Koray, and Farabet, Clement. Torch7: A Matlab-like environment for machine learning. In BigLearn, Advances in Neural Information Processing Systems Workshop, 2011.
-
(2011)
BigLearn, Advances in Neural Information Processing Systems Workshop
-
-
Collobert, R.1
Kavukcuoglu, K.2
Farabet, C.3
-
5
-
-
84911400494
-
Rich feature hierarchies for accurate object detection and semantic segmentation
-
Girshick, Ross, Donahue, Jeff, Darrell, Trevor, and Malik, Jitendra. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014.
-
(2014)
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
-
-
Girshick, R.1
Donahue, J.2
Darrell, T.3
Malik, J.4
-
8
-
-
84937779024
-
Deep learning for real-time Atari game play using offline Monte-Carlo tree search planning
-
Guo, Xiaoxiao, Singh, Satinder, Lee, Honglak, Lewis, Richard L, and Wang, Xiaoshi. Deep learning for real-time Atari game play using offline Monte-Carlo tree search planning. In Advances in Neural Information Processing Systems, 2014.
-
(2014)
Advances in Neural Information Processing Systems
-
-
Guo, X.1
Singh, S.2
Lee, H.3
Lewis, R.L.4
Wang, X.5
-
15
-
-
84930630277
-
Deep learning
-
LeCun, Yann, Bengio, Yoshua, and Hinton, Geoffrey. Deep learning. Nature, 521(7553):436-444, 2015.
-
(2015)
Nature
, vol.521
, Issue.7553
, pp. 436-444
-
-
LeCun, Y.1
Bengio, Y.2
Hinton, G.3
-
17
-
-
84979924150
-
End-to-end training of deep visuomotor policies
-
Levine, Sergey, Finn, Chelsea, Darrell, Trevor, and Abbeel, Pieter. End-to-end training of deep visuomotor policies. Journal of Machine Learning Research, 2016.
-
(2016)
Journal of Machine Learning Research
-
-
Levine, S.1
Finn, C.2
Darrell, T.3
Abbeel, P.4
-
18
-
-
85083953657
-
Continuous control with deep reinforcement learning
-
Lillicrap, Timothy P, Hunt, Jonathan J, Pritzel, Alexander, Heess, Nicolas, Erez, Tom, Tassa, Yuval, Silver, David, and Wierstra, Daan. Continuous control with deep reinforcement learning. In International Conference on Learning Representations, 2016.
-
(2016)
International Conference on Learning Representations
-
-
Lillicrap, T.P.1
Hunt, J.J.2
Pritzel, A.3
Heess, N.4
Erez, T.5
Tassa, Y.6
Silver, D.7
Wierstra, D.8
-
19
-
-
84924051598
-
Human-level control through deep reinforcement learning
-
Mnih, Volodymyr, Kavukcuoglu, Koray, Silver, David, Rusu, Andrei A, Veness, Joel, Bellemare, Marc G, Graves, Alex, Riedmiller, Martin, Fidjeland, Andreas K, Ostrovski, Georg, Petersen, Stig, Beattie, Charles, Sadik, Amir, Antonoglou, Ioannis, King, Helen, Kumaran, Dharshan, Wierstra, Daan, Legg, Shane, and Hassabis, Demis. Human-level control through deep reinforcement learning. Nature, 518(7540):529-533, 2015.
-
(2015)
Nature
, vol.518
, Issue.7540
, pp. 529-533
-
-
Mnih, V.1
Kavukcuoglu, K.2
Silver, D.3
Rusu, A.A.4
Veness, J.5
Bellemare, M.G.6
Graves, A.7
Riedmiller, M.8
Fidjeland, A.K.9
Ostrovski, G.10
Petersen, S.11
Beattie, C.12
Sadik, A.13
Antonoglou, I.14
King, H.15
Kumaran, D.16
Wierstra, D.17
Legg, S.18
Hassabis, D.19
-
20
-
-
84999036937
-
Asynchronous methods for deep reinforcement learning
-
Mnih, Volodymyr, Badia, Adria Puigdomenech, Mirza, Mehdi, Graves, Alex, Lillicrap, Timothy P, Harley, Tim, Silver, David, and Kavukcuoglu, Koray. Asynchronous methods for deep reinforcement learning. In Proceedings of the International Conference on Machine Learning, 2016.
-
(2016)
Proceedings of the International Conference on Machine Learning
-
-
Mnih, V.1
Badia, A.P.2
Mirza, M.3
Graves, A.4
Lillicrap, T.P.5
Harley, T.6
Silver, D.7
Kavukcuoglu, K.8
-
23
-
-
84965178314
-
Action-conditional video prediction using deep networks in Atari games
-
Oh, Junhyuk, Guo, Xiaoxiao, Lee, Honglak, Lewis, Richard L, and Singh, Satinder. Action-conditional video prediction using deep networks in Atari games. In Advances in Neural Information Processing Systems, 2015.
-
(2015)
Advances in Neural Information Processing Systems
-
-
Oh, J.1
Guo, X.2
Lee, H.3
Lewis, R.L.4
Singh, S.5
-
24
-
-
0018496708
-
Mazes, maps, and memory
-
Olton, David S. Mazes, maps, and memory. American Psychologist, 34(7):583, 1979.
-
(1979)
American Psychologist
, vol.34
, Issue.7
, pp. 583
-
-
Olton, D.S.1
-
25
-
-
84969760283
-
Universal value function approximators
-
Schaul, Tom, Horgan, Daniel, Gregor, Karol, and Silver, David. Universal value function approximators. In Proceedings of the International Conference on Machine Learning, 2015.
-
(2015)
Proceedings of the International Conference on Machine Learning
-
-
Schaul, T.1
Horgan, D.2
Gregor, K.3
Silver, D.4
-
26
-
-
84910651844
-
Deep learning in neural networks: An overview
-
Schmidhuber, Jürgen. Deep learning in neural networks: An overview. Neural Networks, 61:85-117, 2015.
-
(2015)
Neural Networks
, vol.61
, pp. 85-117
-
-
Schmidhuber, J.1
-
27
-
-
84969963490
-
Trust region policy optimization
-
Schulman, John, Levine, Sergey, Moritz, Philipp, Jordan, Michael I, and Abbeel, Pieter. Trust region policy optimization. In Proceedings of the International Conference on Machine Learning, 2015.
-
(2015)
Proceedings of the International Conference on Machine Learning
-
-
Schulman, J.1
Levine, S.2
Moritz, P.3
Jordan, M.I.4
Abbeel, P.5
-
30
-
-
84999059347
-
-
arXiv preprint arXiv:1511.07401
-
Sukhbaatar, Sainbayar, Szlam, Arthur, Synnaeve, Gabriel, Chintala, Soumith, and Fergus, Rob. Mazebase: A sandbox for learning from games. arXiv preprint arXiv:1511.07401, 2015.
-
(2015)
Mazebase: A Sandbox for Learning from Games
-
-
Sukhbaatar, S.1
Szlam, A.2
Synnaeve, G.3
Chintala, S.4
Fergus, R.5
-
32
-
-
0029276036
-
Temporal difference learning and TD-Gammon
-
Tesauro, Gerald. Temporal difference learning and TD-Gammon. Communications of the ACM, 38(3):58-68, 1995.
-
(1995)
Communications of the ACM
, vol.38
, Issue.3
, pp. 58-68
-
-
Tesauro, G.1
-
35
-
-
77957283019
-
Recurrent policy gradients
-
Wierstra, Daan, Forster, Alexander, Peters, Jan, and Schmidhuber, Jürgen. Recurrent policy gradients. Logic Journal of IGPL, 18(5):620-634, 2010.
-
(2010)
Logic Journal of IGPL
, vol.18
, Issue.5
, pp. 620-634
-
-
Wierstra, D.1
Forster, A.2
Peters, J.3
Schmidhuber, J.4
-
37
-
-
84997831765
-
Learning simple algorithms from examples
-
Zaremba, Wojciech, Mikolov, Tomas, Joulin, Armand, and Fergus, Rob. Learning simple algorithms from examples. In Proceedings of the International Conference on Machine Learning, 2016.
-
(2016)
Proceedings of the International Conference on Machine Learning
-
-
Zaremba, W.1
Mikolov, T.2
Joulin, A.3
Fergus, R.4
-
38
-
-
84999062914
-
Policy learning with continuous memory states for partially observed robotic control
-
Zhang, Marvin, Levine, Sergey, McCarthy, Zoe, Finn, Chelsea, and Abbeel, Pieter. Policy learning with continuous memory states for partially observed robotic control. In International Conference on Robotics and Automation, 2016.
-
(2016)
International Conference on Robotics and Automation
-
-
Zhang, M.1
Levine, S.2
McCarthy, Z.3
Finn, C.4
Abbeel, P.5