-
2
-
-
84898939480
-
Policy gradient methods for reinforcement learning with function approximation
-
R. Sutton, D. McAllester, S. Singh, and Y. Mansour, "Policy gradient methods for reinforcement learning with function approximation," Advances in Neural Information Processing Systems, vol.12, pp. 1057-1063, 2000.
-
(2000)
Advances in Neural Information Processing Systems
, vol.12
, pp. 1057-1063
-
-
Sutton, R.1
McAllester, D.2
Singh, S.3
Mansour, Y.4
-
8
-
-
33847238318
-
-
Center for Communications Systems Research, University of Cambridge, Tech. Rep., March
-
P. Marbach and J. N. Tsitsiklis, "Gradient-based optimization of Markov reward processes: Practical variants," Center for Communications Systems Research, University of Cambridge, Tech. Rep., March 2000.
-
(2000)
Gradient-based Optimization of Markov Reward Processes: Practical Variants
-
-
Marbach, P.1
Tsitsiklis, J.N.2
-
9
-
-
4043069840
-
On actor-critic algorithms
-
V. Konda and J. Tsitsiklis, "On actor-critic algorithms," SIAM Journal on Control and Optimization, vol.42, no. 4, pp. 1143-1166, 2003.
-
(2003)
SIAM Journal on Control and Optimization
, vol.42
, Issue.4
, pp. 1143-1166
-
-
Konda, V.1
Tsitsiklis, J.2
-
10
-
-
33746878798
-
-
Massachusetts Institute of Technology, AI Memo Tech. Rep., April
-
N. Meuleau, L. Peshkin, and K. Kim, "Exploration in gradient based reinforcement learning," Massachusetts Institute of Technology, AI Memo 2001-2003, Tech. Rep., April 2001.
-
(2001)
Exploration in Gradient Based Reinforcement Learning
-
-
Meuleau, N.1
Peshkin, L.2
Kim, K.3
-
11
-
-
14044262287
-
Stochastic policy gradient reinforcement learning on a simple 3D biped
-
R. Tedrake, T. W. Zhang, and H. S. Seung, "Stochastic policy gradient reinforcement learning on a simple 3D biped," in IEEE/RSJ International Conference on Intelligent Robots and Systems IROS'04, Sendai, Japan, September 28 - October 2 2004.
-
IEEE/RSJ International Conference on Intelligent Robots and Systems IROS'04, Sendai, Japan, September 28 - October 2 2004
-
-
Tedrake, R.1
Zhang, T.W.2
Seung, H.S.3
-
12
-
-
33846174631
-
Learning sensory feedback to CPG with policy gradient for biped locomotion
-
T. Matsubara, J. Morimoto, J. Nakanishi, M. Sato, and K. Doya, "Learning sensory feedback to CPG with policy gradient for biped locomotion," in Proceedings of the International Conference on Robotics and Automation ICRA, Barcelona, Spain, April 2005.
-
Proceedings of the International Conference on Robotics and Automation ICRA, Barcelona, Spain, April 2005
-
-
Matsubara, T.1
Morimoto, J.2
Nakanishi, J.3
Sato, M.4
Doya, K.5
-
13
-
-
0000123778
-
Self-improving reactive agents based on reinforcement learning, planning and teaching
-
L. Lin, "Self-improving reactive agents based on reinforcement learning, planning and teaching," Machine Learning, vol.8(3/4), pp. 293-321, 1992.
-
(1992)
Machine Learning
, vol.8
, Issue.3-4
, pp. 293-321
-
-
Lin, L.1
-
14
-
-
0004090962
-
-
Ph.D. dissertation, Department of Computer Science at Brown University, Rhode Island, May
-
W. Smart, "Making reinforcement learning work on real robots," Ph.D. dissertation, Department of Computer Science at Brown University, Rhode Island, May 2002.
-
(2002)
Making Reinforcement Learning Work on Real Robots
-
-
Smart, W.1
-
15
-
-
0031074521
-
Locally weighted learning
-
C. Atkeson, A. Moore, and S. Schaal, "Locally weighted learning," Artificial Intelligence Review, vol.11, pp. 11-73, 1997.
-
(1997)
Artificial Intelligence Review
, vol.11
, pp. 11-73
-
-
Atkeson, C.1
Moore, A.2
Schaal, S.3
-
16
-
-
33846984231
-
Learning obstacle avoidance parameters from operator behavior
-
December
-
B. Hamner, S. Singh, and S. Scherer, "Learning obstacle avoidance parameters from operator behavior," Journal of Field Robotics, Special Issue on Machine Learning Based Robotics in Unstructured Environments, vol.23 (11/12), December 2006.
-
(2006)
Journal of Field Robotics, Special Issue on Machine Learning Based Robotics in Unstructured Environments
, vol.23
, Issue.11-12
-
-
Hamner, B.1
Singh, S.2
Scherer, S.3
-
20
-
-
36348971779
-
Ictineu AUV wins the first SAUC-E competition
-
D. Ribas, N. Palomeras, P. Ridao, M. Carreras, and E. Hernandez, "Ictineu AUV wins the first SAUC-E competition," in IEEE International Conference on Robotics and Automation, 2007.
-
(2007)
IEEE International Conference on Robotics and Automation
-
-
Ribas, D.1
Palomeras, N.2
Ridao, P.3
Carreras, M.4
Hernandez, E.5
-
21
-
-
3342922286
-
On the identification of non-linear models of unmanned underwater vehicles
-
DOI 10.1016/j.conengprac.2004.01.004, PII S0967066104000152
-
P. Ridao, A. Tiano, A. El-Fakdi, M. Carreras, and A. Zirilli, "On the identification of non-linear models of unmanned underwater vehicles," Control Engineering Practice, vol.12, pp. 1483-1499, 2004. (Pubitemid 38994782)
-
(2004)
Control Engineering Practice
, vol.12
, Issue.12 SPEC. ISS
, pp. 1483-1499
-
-
Ridao, P.1
Tiano, A.2
El-Fakdi, A.3
Carreras, M.4
Zirilli, A.5
-
22
-
-
8844227781
-
A vision system for an underwater cable tracker
-
DOI 10.1007/s001380100065
-
A. Ortiz, M. Simo, and G. Oliver, "A vision system for an underwater cable tracker," Machine Vision and Applications, vol.13 (3), pp. 129-140, 2002. (Pubitemid 41200797)
-
(2002)
Machine Vision and Applications
, vol.13
, Issue.3
, pp. 129-140
-
-
Ortiz, A.1
Simo, M.2
Oliver, G.3
-
23
-
-
35248838766
-
Underwater cable tracking by visual feedback
-
J. Antich and A. Ortiz, "Underwater cable tracking by visual feedback," in First Iberian Conference on Pattern Recognition and Image Analysis (IbPRIA, LNCS 2652), Port d'Andratx, Spain, 2003.
-
First Iberian Conference on Pattern Recognition and Image Analysis (IbPRIA, LNCS 2652), Port d'Andratx, Spain, 2003
-
-
Antich, J.1
Ortiz, A.2