2. Agrawal, Pulkit, Nair, Ashvin, Abbeel, Pieter, Malik, Jitendra, and Levine, Sergey. Learning to poke by poking: Experiential learning of intuitive physics. In NIPS, 2016.
3. Bellemare, Marc, Srinivasan, Sriram, Ostrovski, Georg, Schaul, Tom, Saxton, David, and Munos, Remi. Unifying count-based exploration and intrinsic motivation. In NIPS, 2016.
4. Brafman, Ronen I and Tennenholtz, Moshe. R-max - a general polynomial time algorithm for near-optimal reinforcement learning. JMLR, 2002.
5. Brockman, Greg, Cheung, Vicki, Pettersson, Ludwig, Schneider, Jonas, Schulman, John, Tang, Jie, and Zaremba, Wojciech. OpenAI Gym. arXiv:1606.01540, 2016.
6. Doersch, Carl, Gupta, Abhinav, and Efros, Alexei A. Unsupervised visual representation learning by context prediction. In ICCV, 2015.
7. Dosovitskiy, Alexey and Koltun, Vladlen. Learning to act by predicting the future. In ICLR, 2016.
9. Goroshin, Ross, Bruna, Joan, Tompson, Jonathan, Eigen, David, and LeCun, Yann. Unsupervised feature learning from temporal data. arXiv:1504.02518, 2015.
11. Houthooft, Rein, Chen, Xi, Duan, Yan, Schulman, John, De Turck, Filip, and Abbeel, Pieter. VIME: Variational information maximizing exploration. In NIPS, 2016.
12. Jaderberg, Max, Mnih, Volodymyr, Czarnecki, Wojciech Marian, Schaul, Tom, Leibo, Joel Z, Silver, David, and Kavukcuoglu, Koray. Reinforcement learning with unsupervised auxiliary tasks. In ICLR, 2017.
13. Jayaraman, Dinesh and Grauman, Kristen. Learning image representations tied to ego-motion. In ICCV, 2015.
14. Jordan, Michael I and Rumelhart, David E. Forward models: Supervised learning with a distal teacher. Cognitive Science, 1992.
15. Kearns, Michael and Koller, Daphne. Efficient reinforcement learning in factored MDPs. In IJCAI, 1999.
16. Kempka, Michal, Wydmuch, Marek, Runc, Grzegorz, Toczek, Jakub, and Jaskowski, Wojciech. ViZDoom: A Doom-based AI research platform for visual reinforcement learning. arXiv:1605.02097, 2016.
18. Lillicrap, Timothy P, Hunt, Jonathan J, Pritzel, Alexander, Heess, Nicolas, Erez, Tom, Tassa, Yuval, Silver, David, and Wierstra, Daan. Continuous control with deep reinforcement learning. In ICLR, 2016.
20. Lopes, Manuel, Lang, Tobias, Toussaint, Marc, and Oudeyer, Pierre-Yves. Exploration in model-based reinforcement learning by empirically estimating learning progress. In NIPS, 2012.
21. Mirowski, Piotr, Pascanu, Razvan, Viola, Fabio, Soyer, Hubert, Ballard, Andy, Banino, Andrea, Denil, Misha, Goroshin, Ross, Sifre, Laurent, Kavukcuoglu, Koray, et al. Learning to navigate in complex environments. In ICLR, 2017.
22. Mnih, Volodymyr, Kavukcuoglu, Koray, Silver, David, Rusu, Andrei A, Veness, Joel, Bellemare, Marc G, Graves, Alex, Riedmiller, Martin, Fidjeland, Andreas K, Ostrovski, Georg, et al. Human-level control through deep reinforcement learning. Nature, 2015.
23. Mnih, Volodymyr, Badia, Adria Puigdomenech, Mirza, Mehdi, Graves, Alex, Lillicrap, Timothy P, Harley, Tim, Silver, David, and Kavukcuoglu, Koray. Asynchronous methods for deep reinforcement learning. In ICML, 2016.
24. Mohamed, Shakir and Rezende, Danilo Jimenez. Variational information maximisation for intrinsically motivated reinforcement learning. In NIPS, 2015.
25. Oh, Junhyuk, Guo, Xiaoxiao, Lee, Honglak, Lewis, Richard L, and Singh, Satinder. Action-conditional video prediction using deep networks in Atari games. In NIPS, 2015.
26. Osband, Ian, Blundell, Charles, Pritzel, Alexander, and Van Roy, Benjamin. Deep exploration via bootstrapped DQN. In NIPS, 2016.
27. Oudeyer, Pierre-Yves and Kaplan, Frederic. What is intrinsic motivation? A typology of computational approaches. Frontiers in Neurorobotics, 2009.
29. Paquette, Philip. Super Mario Bros in OpenAI Gym. github:ppaquette/gym-super-mario, 2016.
30. Pathak, Deepak, Krahenbuhl, Philipp, Donahue, Jeff, Darrell, Trevor, and Efros, Alexei A. Context encoders: Feature learning by inpainting. In CVPR, 2016.
31. Poupart, Pascal, Vlassis, Nikos, Hoey, Jesse, and Regan, Kevin. An analytic solution to discrete Bayesian reinforcement learning. In ICML, 2006.
32. Ryan, Richard and Deci, Edward L. Intrinsic and extrinsic motivations: Classic definitions and new directions. Contemporary Educational Psychology, 2000.
35. Shelhamer, Evan, Mahmoudieh, Parsa, Argus, Max, and Darrell, Trevor. Loss is its own reward: Self-supervision for reinforcement learning. arXiv:1612.07307, 2017.
37. Singh, Satinder P, Barto, Andrew G, and Chentanez, Nuttapong. Intrinsically motivated reinforcement learning. In NIPS, 2005.
38. Stadie, Bradly C, Levine, Sergey, and Abbeel, Pieter. Incentivizing exploration in reinforcement learning with deep predictive models. NIPS Workshop, 2015.
39. Still, Susanne and Precup, Doina. An information-theoretic approach to curiosity-driven reinforcement learning. Theory in Biosciences, 2012.
40. Storck, Jan, Hochreiter, Sepp, and Schmidhuber, Jurgen. Reinforcement driven information acquisition in non-deterministic environments. In ICANN, 1995.
41. Sukhbaatar, Sainbayar, Kostrikov, Ilya, Szlam, Arthur, and Fergus, Rob. Intrinsic motivation and automatic curricula via asymmetric self-play. arXiv:1703.05407, 2017.
42. Sun, Yi, Gomez, Faustino, and Schmidhuber, Jurgen. Planning to be surprised: Optimal Bayesian exploration in dynamic environments. In AGI, 2011.
43. Tang, Haoran, Houthooft, Rein, Foote, Davis, Stooke, Adam, Chen, Xi, Duan, Yan, Schulman, John, De Turck, Filip, and Abbeel, Pieter. #Exploration: A study of count-based exploration for deep reinforcement learning. arXiv:1611.04717, 2016.
44. Wang, Xiaolong and Gupta, Abhinav. Unsupervised learning of visual representations using videos. In ICCV, 2015.