[3] Bellemare, Marc G., Naddaf, Yavar, Veness, Joel, and Bowling, Michael. The Arcade Learning Environment: An evaluation platform for general agents. Journal of Artificial Intelligence Research, 47:253–279, 2013.
[5] Guo, Xiaoxiao, Singh, Satinder, Lee, Honglak, Lewis, Richard L., and Wang, Xiaoshi. Deep learning for real-time Atari game play using offline Monte-Carlo tree search planning. In Advances in Neural Information Processing Systems 27, pp. 3338–3346, 2014.
[10] Levine, Sergey, Finn, Chelsea, Darrell, Trevor, and Abbeel, Pieter. End-to-end training of deep visuomotor policies. CoRR, abs/1504.00702, 2015.
[11] Lillicrap, Timothy P., Hunt, Jonathan J., Pritzel, Alexander, Heess, Nicholas, Erez, Tom, Tassa, Yuval, Silver, David, and Wierstra, Daan. Continuous control with deep reinforcement learning. CoRR, abs/1509.02971, 2015.
[12] Mnih, Volodymyr, Kavukcuoglu, Koray, Silver, David, Rusu, Andrei A., Veness, Joel, Bellemare, Marc G., Graves, Alex, Riedmiller, Martin, Fidjeland, Andreas K., Ostrovski, Georg, Petersen, Stig, Beattie, Charles, Sadik, Amir, Antonoglou, Ioannis, King, Helen, Kumaran, Dharshan, Wierstra, Daan, Legg, Shane, and Hassabis, Demis. Human-level control through deep reinforcement learning. Nature, 518(7540):529–533, 2015.
[15] Romero, Adriana, Ballas, Nicolas, Kahou, Samira Ebrahimi, Chassang, Antoine, Gatta, Carlo, and Bengio, Yoshua. FitNets: Hints for thin deep nets. In International Conference on Learning Representations, 2015.
[16] Ross, Stephane, Gordon, Geoffrey, and Bagnell, Andrew. A reduction of imitation learning and structured prediction to no-regret online learning. Journal of Machine Learning Research, 15:627–635, 2011.
[17] Seneta, E. Sensitivity analysis, ergodicity coefficients, and rank-one updates for finite Markov chains. Numerical Solution of Markov Chains, 8:121–129, 1991.
[19] Taylor, Matthew E. and Stone, Peter. Transfer learning for reinforcement learning domains: A survey. The Journal of Machine Learning Research, 10:1633–1685, 2009.