-
3
-
-
0020970738
-
Neuronlike adaptive elements that can solve difficult learning control problems
-
Barto, A., Sutton, R., and Anderson, C. Neuronlike adaptive elements that can solve difficult learning control problems. IEEE Transactions on Systems, Man and Cybernetics, (5):834-846, 1983.
-
(1983)
IEEE Transactions on Systems, Man and Cybernetics
, Issue.5
, pp. 834-846
-
-
Barto, A.1
Sutton, R.2
Anderson, C.3
-
5
-
-
84903590417
-
A survey on policy search for robotics
-
Deisenroth, M., Neumann, G., and Peters, J. A survey on policy search for robotics. Foundations and Trends in Robotics, 2(1-2): 1-142, 2013.
-
(2013)
Foundations and Trends in Robotics
, vol.2
, Issue.1-2
, pp. 1-142
-
-
Deisenroth, M.1
Neumann, G.2
Peters, J.3
-
6
-
-
33846679442
-
Simulation optimization: A review, new developments, and applications
-
Winter Simulation Conference
-
Fu, Michael C, Glover, Fred W, and April, Jay. Simulation optimization: a review, new developments, and applications. In Proceedings of the 37th Conference on Winter Simulation, pp. 83-95. Winter Simulation Conference, 2005.
-
(2005)
Proceedings of the 37th Conference on Winter Simulation
, pp. 83-95
-
-
Fu, M.C.1
Glover, F.W.2
April, J.3
-
9
-
-
84937779024
-
Deep learning for real-time Atari game play using offline Monte-Carlo tree search planning
-
Guo, X., Singh, S., Lee, H., Lewis, R. L., and Wang, X. Deep learning for real-time Atari game play using offline Monte-Carlo tree search planning. In Advances in Neural Information Processing Systems, pp. 3338-3346, 2014.
-
(2014)
Advances in Neural Information Processing Systems
, pp. 3338-3346
-
-
Guo, X.1
Singh, S.2
Lee, H.3
Lewis, R.L.4
Wang, X.5
-
10
-
-
0029722015
-
Adapting arbitrary normal mutation distributions in evolution strategies: The covariance matrix adaptation
-
IEEE
-
Hansen, Nikolaus and Ostermeier, Andreas. Adapting arbitrary normal mutation distributions in evolution strategies: The covariance matrix adaptation. In Evolutionary Computation, 1996., Proceedings of IEEE International Conference on, pp. 312-317. IEEE, 1996.
-
(1996)
Evolutionary Computation, 1996., Proceedings of IEEE International Conference on
, pp. 312-317
-
-
Hansen, N.1
Ostermeier, A.2
-
13
-
-
1942514728
-
Approximately optimal approximate reinforcement learning
-
Kakade, Sham and Langford, John. Approximately optimal approximate reinforcement learning. In ICML, volume 2, pp. 267-274, 2002.
-
(2002)
ICML
, vol.2
, pp. 267-274
-
-
Kakade, S.1
Langford, J.2
-
14
-
-
1942420814
-
Reinforcement learning as classification: Leveraging modern classifiers
-
Lagoudakis, Michail G and Parr, Ronald. Reinforcement learning as classification: Leveraging modern classifiers. In ICML, volume 3, pp. 424-431, 2003.
-
(2003)
ICML
, vol.3
, pp. 424-431
-
-
Lagoudakis, M.G.1
Parr, R.2
-
16
-
-
84937822296
-
Learning neural network policies with guided policy search under unknown dynamics
-
Levine, Sergey and Abbeel, Pieter. Learning neural network policies with guided policy search under unknown dynamics. In Advances in Neural Information Processing Systems, pp. 1071-1079, 2014.
-
(2014)
Advances in Neural Information Processing Systems
, pp. 1071-1079
-
-
Levine, S.1
Abbeel, P.2
-
17
-
-
84872565347
-
Training deep and recurrent networks with Hessian-free optimization
-
Springer
-
Martens, J. and Sutskever, I. Training deep and recurrent networks with Hessian-free optimization. In Neural Networks: Tricks of the Trade, pp. 479-535. Springer, 2012.
-
(2012)
Neural Networks: Tricks of the Trade
, pp. 479-535
-
-
Martens, J.1
Sutskever, I.2
-
18
-
-
84904867557
-
-
arXiv preprint arXiv:1312.5602
-
Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., and Riedmiller, M. Playing Atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602, 2013.
-
(2013)
Playing Atari with Deep Reinforcement Learning
-
-
Mnih, V.1
Kavukcuoglu, K.2
Silver, D.3
Graves, A.4
Antonoglou, I.5
Wierstra, D.6
Riedmiller, M.7
-
23
-
-
44949241322
-
Reinforcement learning of motor skills with policy gradients
-
Peters, J. and Schaal, S. Reinforcement learning of motor skills with policy gradients. Neural Networks, 21(4): 682-697, 2008a.
-
(2008)
Neural Networks
, vol.21
, Issue.4
, pp. 682-697
-
-
Peters, J.1
Schaal, S.2
-
26
-
-
40649106649
-
Natural actor-critic
-
Peters, Jan and Schaal, Stefan. Natural actor-critic. Neurocomputing, 71(7): 1180-1190, 2008b.
-
(2008)
Neurocomputing
, vol.71
, Issue.7
, pp. 1180-1190
-
-
Peters, J.1
Schaal, S.2
-
27
-
-
84897496610
-
Safe policy iteration
-
Pirotta, Matteo, Restelli, Marcello, Pecorino, Alessio, and Calandriello, Daniele. Safe policy iteration. In Proceedings of the 30th International Conference on Machine Learning, pp. 307-315, 2013.
-
(2013)
Proceedings of the 30th International Conference on Machine Learning
, pp. 307-315
-
-
Pirotta, M.1
Restelli, M.2
Pecorino, A.3
Calandriello, D.4
-
29
-
-
33845344721
-
Learning Tetris using the noisy cross-entropy method
-
Szita, István and Lörincz, András. Learning Tetris using the noisy cross-entropy method. Neural Computation, 18(12): 2936-2941, 2006.
-
(2006)
Neural Computation
, vol.18
, Issue.12
, pp. 2936-2941
-
-
Szita, I.1
Lörincz, A.2
-
31
-
-
84872292044
-
MuJoCo: A physics engine for model-based control
-
IEEE
-
Todorov, Emanuel, Erez, Tom, and Tassa, Yuval. MuJoCo: A physics engine for model-based control. In Intelligent Robots and Systems (IROS), 2012 IEEE/RSJ International Conference on, pp. 5026-5033. IEEE, 2012.
-
(2012)
Intelligent Robots and Systems (IROS), 2012 IEEE/RSJ International Conference on
, pp. 5026-5033
-
-
Todorov, E.1
Erez, T.2
Tassa, Y.3
-
32
-
-
70349668763
-
Optimal gait and form for animal locomotion
-
ACM
-
Wampler, Kevin and Popović, Zoran. Optimal gait and form for animal locomotion. In ACM Transactions on Graphics (TOG), volume 28, pp. 60. ACM, 2009.
-
(2009)
ACM Transactions on Graphics (TOG)
, vol.28
, pp. 60
-
-
Wampler, K.1
Popović, Z.2