-
1
-
-
84887843747
-
Motion editing with independent component analysis
-
Yong Cao, Ari Shapiro, Petros Faloutsos, and Frédéric Pighin. Motion editing with independent component analysis. Visual Computer, 2, 2007.
-
(2007)
Visual Computer
, vol.2
-
-
Cao, Y.1
Shapiro, A.2
Faloutsos, P.3
Pighin, F.4
-
2
-
-
84999018287
-
Benchmarking deep reinforcement learning for continuous control
-
Yan Duan, Xi Chen, Rein Houthooft, John Schulman, and Pieter Abbeel. Benchmarking deep reinforcement learning for continuous control. In Proceedings of the 33rd International Conference on Machine Learning (ICML), 2016.
-
(2016)
Proceedings of the 33rd International Conference on Machine Learning (ICML)
-
-
Duan, Y.1
Chen, X.2
Houthooft, R.3
Schulman, J.4
Abbeel, P.5
-
3
-
-
85046125163
-
-
arXiv preprint
-
Jakob Foerster, Gregory Farquhar, Triantafyllos Afouras, Nantas Nardelli, and Shimon Whiteson. Counterfactual multi-agent policy gradients. arXiv preprint arXiv:1705.08926, 2017.
-
(2017)
Counterfactual Multi-Agent Policy Gradients
-
-
Foerster, J.1
Farquhar, G.2
Afouras, T.3
Nardelli, N.4
Whiteson, S.5
-
4
-
-
84897694817
-
Variance reduction techniques for gradient estimates in reinforcement learning
-
Nov
-
Evan Greensmith, Peter L Bartlett, and Jonathan Baxter. Variance reduction techniques for gradient estimates in reinforcement learning. Journal of Machine Learning Research, 5(Nov):1471–1530, 2004.
-
(2004)
Journal of Machine Learning Research
, vol.5
, pp. 1471-1530
-
-
Greensmith, E.1
Bartlett, P.L.2
Baxter, J.3
-
5
-
-
85041942380
-
Q-prop: Sample-efficient policy gradient with an off-policy critic
-
Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E Turner, and Sergey Levine. Q-prop: Sample-efficient policy gradient with an off-policy critic. In International Conference on Learning Representations (ICLR2017), 2017.
-
(2017)
International Conference on Learning Representations (ICLR2017)
-
-
Gu, S.1
Lillicrap, T.2
Ghahramani, Z.3
Turner, R.E.4
Levine, S.5
-
8
-
-
84979924150
-
End-to-end training of deep visuo-motor policies
-
Sergey Levine, Chelsea Finn, Trevor Darrell, and Pieter Abbeel. End-to-end training of deep visuo-motor policies. Journal of Machine Learning Research, 17(39):1–40, 2016.
-
(2016)
Journal of Machine Learning Research
, vol.17
, Issue.39
, pp. 1-40
-
-
Levine, S.1
Finn, C.2
Darrell, T.3
Abbeel, P.4
-
9
-
-
85083953657
-
Continuous control with deep reinforcement learning
-
Timothy P Lillicrap, Jonathan J Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, and Daan Wierstra. Continuous control with deep reinforcement learning. In International Conference on Learning Representations (ICLR2016), 2016.
-
(2016)
International Conference on Learning Representations (ICLR2016)
-
-
Lillicrap, T.P.1
Hunt, J.J.2
Pritzel, A.3
Heess, N.4
Erez, T.5
Tassa, Y.6
Silver, D.7
Wierstra, D.8
-
10
-
-
85041351193
-
-
arXiv preprint
-
Ryan Lowe, Yi Wu, Aviv Tamar, Jean Harb, Pieter Abbeel, and Igor Mordatch. Multi-agent actor-critic for mixed cooperative-competitive environments. arXiv preprint arXiv:1706.02275, 2017.
-
(2017)
Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments
-
-
Lowe, R.1
Wu, Y.2
Tamar, A.3
Harb, J.4
Abbeel, P.5
Mordatch, I.6
-
11
-
-
84924051598
-
Human-level control through deep reinforcement learning
-
Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A Rusu, Joel Veness, Marc G Bellemare, Alex Graves, Martin Riedmiller, Andreas K Fidjeland, Georg Ostrovski, et al. Human-level control through deep reinforcement learning. Nature, 518(7540):529–533, 2015.
-
(2015)
Nature
, vol.518
, Issue.7540
, pp. 529-533
-
-
Mnih, V.1
Kavukcuoglu, K.2
Silver, D.3
Rusu, A.A.4
Veness, J.5
Bellemare, M.G.6
Graves, A.7
Riedmiller, M.8
Fidjeland, A.K.9
Ostrovski, G.10
-
12
-
-
84971448181
-
Asynchronous methods for deep reinforcement learning
-
Volodymyr Mnih, Adria Puigdomenech Badia, Mehdi Mirza, Alex Graves, Timothy Lillicrap, Tim Harley, David Silver, and Koray Kavukcuoglu. Asynchronous methods for deep reinforcement learning. In International Conference on Machine Learning, pp. 1928–1937, 2016.
-
(2016)
International Conference on Machine Learning
, pp. 1928-1937
-
-
Mnih, V.1
Badia, A.P.2
Mirza, M.3
Graves, A.4
Lillicrap, T.5
Harley, T.6
Silver, D.7
Kavukcuoglu, K.8
-
13
-
-
85041963017
-
Guided policy search as approximate mirror descent
-
William Montgomery and Sergey Levine. Guided policy search as approximate mirror descent. In NIPS, 2016.
-
(2016)
NIPS
-
-
Montgomery, W.1
Levine, S.2
-
14
-
-
84965182099
-
Interactive control of diverse complex characters with neural networks
-
Igor Mordatch, Kendall Lowrey, Galen Andrew, Zoran Popovic, and Emanuel Todorov. Interactive Control of Diverse Complex Characters with Neural Networks. In NIPS, 2015.
-
(2015)
NIPS
-
-
Mordatch, I.1
Lowrey, K.2
Andrew, G.3
Popovic, Z.4
Todorov, E.5
-
15
-
-
40649106649
-
Natural actor-critic
-
Jan Peters and Stefan Schaal. Natural actor-critic. Neurocomputing, 71(7):1180–1190, 2008.
-
(2008)
Neurocomputing
, vol.71
, Issue.7
, pp. 1180-1190
-
-
Peters, J.1
Schaal, S.2
-
16
-
-
77953218689
-
Random features for large-scale kernel machines
-
Ali Rahimi and Benjamin Recht. Random Features for Large-Scale Kernel Machines. In NIPS, 2007.
-
(2007)
NIPS
-
-
Rahimi, A.1
Recht, B.2
-
17
-
-
85049877180
-
Learning complex dexterous manipulation with deep reinforcement learning and demonstrations
-
abs/1709.10087
-
Aravind Rajeswaran, Vikash Kumar, Abhishek Gupta, John Schulman, Emanuel Todorov, and Sergey Levine. Learning complex dexterous manipulation with deep reinforcement learning and demonstrations. CoRR, abs/1709.10087, 2017a.
-
(2017)
CoRR
-
-
Rajeswaran, A.1
Kumar, V.2
Gupta, A.3
Schulman, J.4
Todorov, E.5
Levine, S.6
-
18
-
-
85044996392
-
Towards generalization and simplicity in continuous control
-
Aravind Rajeswaran, Kendall Lowrey, Emanuel Todorov, and Sham Kakade. Towards Generalization and Simplicity in Continuous Control. In NIPS, 2017b.
-
(2017)
NIPS
-
-
Rajeswaran, A.1
Lowrey, K.2
Todorov, E.3
Kakade, S.4
-
19
-
-
84969963490
-
Trust region policy optimization
-
John Schulman, Sergey Levine, Pieter Abbeel, Michael Jordan, and Philipp Moritz. Trust region policy optimization. In Proceedings of the 32nd International Conference on Machine Learning (ICML-15), pp. 1889–1897, 2015.
-
(2015)
Proceedings of the 32nd International Conference on Machine Learning (ICML-15)
, pp. 1889-1897
-
-
Schulman, J.1
Levine, S.2
Abbeel, P.3
Jordan, M.4
Moritz, P.5
-
20
-
-
85083954383
-
High-dimensional continuous control using generalized advantage estimation
-
John Schulman, Philipp Moritz, Sergey Levine, Michael Jordan, and Pieter Abbeel. High-dimensional continuous control using generalized advantage estimation. In International Conference on Learning Representations (ICLR2016), 2016.
-
(2016)
International Conference on Learning Representations (ICLR2016)
-
-
Schulman, J.1
Moritz, P.2
Levine, S.3
Jordan, M.4
Abbeel, P.5
-
21
-
-
84963949906
-
Mastering the game of Go with deep neural networks and tree search
-
David Silver, Aja Huang, Chris J Maddison, Arthur Guez, Laurent Sifre, George Van Den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, Marc Lanctot, et al. Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587):484–489, 2016.
-
(2016)
Nature
, vol.529
, Issue.7587
, pp. 484-489
-
-
Silver, D.1
Huang, A.2
Maddison, C.J.3
Guez, A.4
Sifre, L.5
Van Den Driessche, G.6
Schrittwieser, J.7
Antonoglou, I.8
Panneershelvam, V.9
Lanctot, M.10
-
23
-
-
84898939480
-
Policy gradient methods for reinforcement learning with function approximation
-
Richard S Sutton, David A McAllester, Satinder P Singh, and Yishay Mansour. Policy gradient methods for reinforcement learning with function approximation. In Advances in neural information processing systems, pp. 1057–1063, 2000.
-
(2000)
Advances in Neural Information Processing Systems
, pp. 1057-1063
-
-
Sutton, R.S.1
McAllester, D.A.2
Singh, S.P.3
Mansour, Y.4
-
25
-
-
28044474086
-
From task parameters to motor synergies: A hierarchical framework for approximately optimal control of redundant manipulators
-
Emanuel Todorov, Weiwei Li, and Xiuchuan Pan. From task parameters to motor synergies: A hierarchical framework for approximately optimal control of redundant manipulators. Journal of Field Robotics, 22(11):691–710, 2005.
-
(2005)
Journal of Field Robotics
, vol.22
, Issue.11
, pp. 691-710
-
-
Todorov, E.1
Li, W.2
Pan, X.3
-
29
-
-
0000337576
-
Simple statistical gradient-following algorithms for connectionist reinforcement learning
-
Ronald J Williams. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine learning, 8(3-4):229–256, 1992.
-
(1992)
Machine Learning
, vol.8
, Issue.3-4
, pp. 229-256
-
-
Williams, R.J.1