-
1
-
-
84879678310
-
-
arXiv preprint arXiv:1207.4708
-
Bellemare, Marc G, Naddaf, Yavar, Veness, Joel, and Bowling, Michael. The arcade learning environment: An evaluation platform for general agents. arXiv preprint arXiv:1207.4708, 2012.
-
(2012)
The Arcade Learning Environment: An Evaluation Platform for General Agents
-
-
Bellemare, M.G.1
Naddaf, Y.2
Veness, J.3
Bowling, M.4
-
2
-
-
84998591851
-
-
arXiv preprint arXiv:1512.04860
-
Bellemare, Marc G, Ostrovski, Georg, Guez, Arthur, Thomas, Philip S, and Munos, Remi. Increasing the action gap: New operators for reinforcement learning. arXiv preprint arXiv:1512.04860, 2015.
-
(2015)
Increasing the Action Gap: New Operators for Reinforcement Learning
-
-
Bellemare, M.G.1
Ostrovski, G.2
Guez, A.3
Thomas, P.S.4
Munos, R.5
-
5
-
-
0002278788
-
Hierarchical reinforcement learning with the MAXQ value function decomposition
-
Dietterich, Thomas G. Hierarchical reinforcement learning with the MAXQ value function decomposition. J. Artif. Intell. Res.(JAIR), 13:227-303, 2000.
-
(2000)
J. Artif. Intell. Res.(JAIR)
, vol.13
, pp. 227-303
-
-
Dietterich, T.G.1
-
6
-
-
84998993668
-
Learning embedded maps of Markov processes
-
Citeseer
-
Engel, Yaakov and Mannor, Shie. Learning embedded maps of Markov processes. In Proceedings of ICML 2001. Citeseer, 2001.
-
(2001)
Proceedings of ICML 2001
-
-
Engel, Y.1
Mannor, S.2
-
7
-
-
77949524387
-
-
Dept. IRO, Université de Montréal, Tech. Rep, 4323
-
Erhan, Dumitru, Bengio, Yoshua, Courville, Aaron, and Vincent, Pascal. Visualizing higher-layer features of a deep network. Dept. IRO, Université de Montréal, Tech. Rep, 4323, 2009.
-
(2009)
Visualizing Higher-layer Features of a Deep Network
-
-
Erhan, D.1
Bengio, Y.2
Courville, A.3
Vincent, P.4
-
9
-
-
0006419533
-
Hierarchical solution of Markov decision processes using macro-actions
-
Morgan Kaufmann Publishers Inc
-
Hauskrecht, Milos, Meuleau, Nicolas, Kaelbling, Leslie Pack, Dean, Thomas, and Boutilier, Craig. Hierarchical solution of Markov decision processes using macro-actions. In Proceedings of the Fourteenth conference on Uncertainty in artificial intelligence, pp. 220-229. Morgan Kaufmann Publishers Inc., 1998.
-
(1998)
Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence
, pp. 220-229
-
-
Hauskrecht, M.1
Meuleau, N.2
Kaelbling, L.P.3
Dean, T.4
Boutilier, C.5
-
10
-
-
84943767635
-
-
arXiv preprint arXiv:1504.00702
-
Levine, Sergey, Finn, Chelsea, Darrell, Trevor, and Abbeel, Pieter. End-to-end training of deep visuomotor policies. arXiv preprint arXiv:1504.00702, 2015.
-
(2015)
End-to-end Training of Deep Visuomotor Policies
-
-
Levine, S.1
Finn, C.2
Darrell, T.3
Abbeel, P.4
-
12
-
-
85032751123
-
Manifold-learning-based feature extraction for classification of hyperspectral data: A review of advances in manifold learning
-
IEEE
-
Lunga, Dalton, Prasad, Santasriya, Crawford, Melba M, and Ersoy, Ozan. Manifold-learning-based feature extraction for classification of hyperspectral data: A review of advances in manifold learning. Signal Processing Magazine, IEEE, 31(1):55-66, 2014.
-
(2014)
Signal Processing Magazine
, vol.31
, Issue.1
, pp. 55-66
-
-
Lunga, D.1
Prasad, S.2
Crawford, M.M.3
Ersoy, O.4
-
13
-
-
84938498958
-
Approximate value iteration with temporally extended actions
-
Mann, Timothy A, Mannor, Shie, and Precup, Doina. Approximate value iteration with temporally extended actions. Journal of Artificial Intelligence Research, 53(1):375-438, 2015.
-
(2015)
Journal of Artificial Intelligence Research
, vol.53
, Issue.1
, pp. 375-438
-
-
Mann, T.A.1
Mannor, S.2
Precup, D.3
-
14
-
-
14344250635
-
Dynamic abstraction in reinforcement learning via clustering
-
ACM
-
Mannor, Shie, Menache, Ishai, Hoze, Amit, and Klein, Uri. Dynamic abstraction in reinforcement learning via clustering. In Proceedings of the twenty-first international conference on Machine learning, pp. 71. ACM, 2004.
-
(2004)
Proceedings of the Twenty-first International Conference on Machine Learning
, pp. 71
-
-
Mannor, S.1
Menache, I.2
Hoze, A.3
Klein, U.4
-
15
-
-
84945250000
-
Q-cut - dynamic discovery of sub-goals in reinforcement learning
-
Springer
-
Menache, Ishai, Mannor, Shie, and Shimkin, Nahum. Q-cut - dynamic discovery of sub-goals in reinforcement learning. In Machine Learning: ECML 2002, pp. 295-306. Springer, 2002.
-
(2002)
Machine Learning: ECML 2002
, pp. 295-306
-
-
Menache, I.1
Mannor, S.2
Shimkin, N.3
-
16
-
-
84904867557
-
-
arXiv preprint arXiv:1312.5602
-
Mnih, Volodymyr, Kavukcuoglu, Koray, Silver, David, Graves, Alex, Antonoglou, Ioannis, Wierstra, Daan, and Riedmiller, Martin. Playing Atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602, 2013.
-
(2013)
Playing Atari with Deep Reinforcement Learning
-
-
Mnih, V.1
Kavukcuoglu, K.2
Silver, D.3
Graves, A.4
Antonoglou, I.5
Wierstra, D.6
Riedmiller, M.7
-
17
-
-
84924051598
-
Human-level control through deep reinforcement learning
-
Mnih, Volodymyr, Kavukcuoglu, Koray, Silver, David, Rusu, Andrei A, Veness, Joel, Bellemare, Marc G, Graves, Alex, Riedmiller, Martin, Fidjeland, Andreas K, Ostrovski, Georg, et al. Human-level control through deep reinforcement learning. Nature, 518(7540):529-533, 2015.
-
(2015)
Nature
, vol.518
, Issue.7540
, pp. 529-533
-
-
Mnih, V.1
Kavukcuoglu, K.2
Silver, D.3
Rusu, A.A.4
Veness, J.5
Bellemare, M.G.6
Graves, A.7
Riedmiller, M.8
Fidjeland, A.K.9
Ostrovski, G.10
-
18
-
-
84980007683
-
-
arXiv preprint arXiv:1507.04296
-
Nair, Arun, Srinivasan, Praveen, Blackwell, Sam, Alcicek, Cagdas, Fearon, Rory, De Maria, Alessandro, Panneershelvam, Vedavyas, Suleyman, Mustafa, Beattie, Charles, Petersen, Stig, et al. Massively parallel methods for deep reinforcement learning. arXiv preprint arXiv:1507.04296, 2015.
-
(2015)
Massively Parallel Methods for Deep Reinforcement Learning
-
-
Nair, A.1
Srinivasan, P.2
Blackwell, S.3
Alcicek, C.4
Fearon, R.5
De Maria, A.6
Panneershelvam, V.7
Suleyman, M.8
Beattie, C.9
Petersen, S.10
-
21
-
-
0346738900
-
Flexible decomposition algorithms for weakly coupled Markov decision problems
-
Morgan Kaufmann Publishers Inc
-
Parr, Ronald. Flexible decomposition algorithms for weakly coupled Markov decision problems. In Proceedings of the Fourteenth conference on Uncertainty in artificial intelligence, pp. 422-430. Morgan Kaufmann Publishers Inc., 1998.
-
(1998)
Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence
, pp. 422-430
-
-
Parr, R.1
-
22
-
-
21344435992
-
Invariant visual representation by single neurons in the human brain
-
Quiroga, R Quian, Reddy, Leila, Kreiman, Gabriel, Koch, Christof, and Fried, Itzhak. Invariant visual representation by single neurons in the human brain. Nature, 435(7045):1102-1107, 2005.
-
(2005)
Nature
, vol.435
, Issue.7045
, pp. 1102-1107
-
-
Quiroga, R.Q.1
Reddy, L.2
Kreiman, G.3
Koch, C.4
Fried, I.5
-
24
-
-
33646398129
-
Neural fitted Q iteration-first experiences with a data efficient neural reinforcement learning method
-
Springer
-
Riedmiller, Martin. Neural fitted Q iteration-first experiences with a data efficient neural reinforcement learning method. In Machine Learning: ECML 2005, pp. 317-328. Springer, 2005.
-
(2005)
Machine Learning: ECML 2005
, pp. 317-328
-
-
Riedmiller, M.1
-
25
-
-
84980003817
-
-
Rusu, Andrei A., Colmenarejo, Sergio Gomez, Gulcehre, Caglar, Desjardins, Guillaume, Kirkpatrick, James, Pascanu, Razvan, Mnih, Volodymyr, Kavukcuoglu, Koray, and Hadsell, Raia. Policy distillation, 2015.
-
(2015)
Policy Distillation
-
-
Rusu, A.A.1
Colmenarejo, S.G.2
Gulcehre, C.3
Desjardins, G.4
Kirkpatrick, J.5
Pascanu, R.6
Mnih, V.7
Kavukcuoglu, K.8
Hadsell, R.9
-
26
-
-
84980041049
-
-
arXiv preprint arXiv:1511.05952
-
Schaul, Tom, Quan, John, Antonoglou, Ioannis, and Silver, David. Prioritized experience replay. arXiv preprint arXiv:1511.05952, 2015.
-
(2015)
Prioritized Experience Replay
-
-
Schaul, T.1
Quan, J.2
Antonoglou, I.3
Silver, D.4
-
28
-
-
31844447221
-
Identifying useful subgoals in reinforcement learning by local graph partitioning
-
ACM
-
Simsek, Ozgur, Wolfe, Alicia P, and Barto, Andrew G. Identifying useful subgoals in reinforcement learning by local graph partitioning. In Proceedings of the 22nd international conference on Machine learning, pp. 816-823. ACM, 2005.
-
(2005)
Proceedings of the 22nd International Conference on Machine Learning
, pp. 816-823
-
-
Simsek, O.1
Wolfe, A.P.2
Barto, A.G.3
-
29
-
-
85153965130
-
Reinforcement learning with soft state aggregation
-
Singh, Satinder P, Jaakkola, Tommi, and Jordan, Michael I. Reinforcement learning with soft state aggregation. Advances in neural information processing systems, pp. 361-368, 1995.
-
(1995)
Advances in Neural Information Processing Systems
, pp. 361-368
-
-
Singh, S.P.1
Jaakkola, T.2
Jordan, M.I.3
-
31
-
-
0033170372
-
Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning
-
Sutton, Richard S, Precup, Doina, and Singh, Satinder. Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial intelligence, 112(1):181-211, 1999.
-
(1999)
Artificial Intelligence
, vol.112
, Issue.1
, pp. 181-211
-
-
Sutton, R.S.1
Precup, D.2
Singh, S.3
-
32
-
-
84925331214
-
-
arXiv preprint arXiv:1312.6199
-
Szegedy, Christian, Zaremba, Wojciech, Sutskever, Ilya, Bruna, Joan, Erhan, Dumitru, Goodfellow, Ian, and Fergus, Rob. Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199, 2013.
-
(2013)
Intriguing Properties of Neural Networks
-
-
Szegedy, C.1
Zaremba, W.2
Sutskever, I.3
Bruna, J.4
Erhan, D.5
Goodfellow, I.6
Fergus, R.7
-
33
-
-
0034704229
-
A global geometric framework for nonlinear dimensionality reduction
-
Tenenbaum, Joshua B, De Silva, Vin, and Langford, John C. A global geometric framework for nonlinear dimensionality reduction. Science, 290(5500):2319-2323, 2000.
-
(2000)
Science
, vol.290
, Issue.5500
, pp. 2319-2323
-
-
Tenenbaum, J.B.1
De Silva, V.2
Langford, J.C.3
-
34
-
-
0029276036
-
Temporal difference learning and TD-Gammon
-
Tesauro, Gerald. Temporal difference learning and TD-Gammon. Communications of the ACM, 38(3):58-68, 1995.
-
(1995)
Communications of the ACM
, vol.38
, Issue.3
, pp. 58-68
-
-
Tesauro, G.1
-
35
-
-
0031998630
-
Learning metric-topological maps for indoor mobile robot navigation
-
Thrun, Sebastian. Learning metric-topological maps for indoor mobile robot navigation. Artificial Intelligence, 99(1):21-71, 1998.
-
(1998)
Artificial Intelligence
, vol.99
, Issue.1
, pp. 21-71
-
-
Thrun, S.1
-
36
-
-
0031143730
-
An analysis of temporal-difference learning with function approximation
-
Tsitsiklis, John N and Van Roy, Benjamin. An analysis of temporal-difference learning with function approximation. Automatic Control, IEEE Transactions on, 42(5):674-690, 1997.
-
(1997)
Automatic Control, IEEE Transactions on
, vol.42
, Issue.5
, pp. 674-690
-
-
Tsitsiklis, J.N.1
Van Roy, B.2
-
37
-
-
84919775831
-
Accelerating t-SNE using tree-based algorithms
-
Van Der Maaten, Laurens. Accelerating t-SNE using tree-based algorithms. The Journal of Machine Learning Research, 15(1):3221-3245, 2014.
-
(2014)
The Journal of Machine Learning Research
, vol.15
, Issue.1
, pp. 3221-3245
-
-
Van Der Maaten, L.1
-
41
-
-
84937508363
-
How transferable are features in deep neural networks?
-
Yosinski, Jason, Clune, Jeff, Bengio, Yoshua, and Lipson, Hod. How transferable are features in deep neural networks? In Advances in Neural Information Processing Systems, pp. 3320-3328, 2014.
-
(2014)
Advances in Neural Information Processing Systems
, pp. 3320-3328
-
-
Yosinski, J.1
Clune, J.2
Bengio, Y.3
Lipson, H.4
-
42
-
-
84906489074
-
Visualizing and understanding convolutional networks
-
Springer
-
Zeiler, Matthew D and Fergus, Rob. Visualizing and understanding convolutional networks. In Computer Vision - ECCV 2014, pp. 818-833. Springer, 2014.
-
(2014)
Computer Vision - ECCV 2014
, pp. 818-833
-
-
Zeiler, M.D.1
Fergus, R.2