[4] M. Carreras, P. Ridao, R. Garcia, and T. Nicosevici, "Vision-based localization of an underwater robot in a structured environment," in IEEE International Conference on Robotics and Automation, Taipei, Taiwan, 2003.
[6] R. Sutton, D. McAllester, S. Singh, and Y. Mansour, "Policy gradient methods for reinforcement learning with function approximation," Advances in Neural Information Processing Systems, vol. 12, pp. 1057-1063, 2000.
[7] C. Anderson, "Approximating a policy can be easier than approximating a value function," Colorado State University, Computer Science Technical Report, 2000.
[8] D. A. Aberdeen, "Policy-gradient algorithms for partially observable Markov decision processes," Ph.D. dissertation, Australian National University, April 2003.
[10] N. Meuleau, K. E. Kim, L. P. Kaelbling, and A. R. Cassandra, "Solving POMDPs by searching the space of finite policies," in Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence. Morgan Kaufmann, July 1999, pp. 127-136.
[11] S. Singh, T. Jaakkola, and M. Jordan, "Learning without state-estimation in partially observable Markovian decision processes," in Proceedings of the Eleventh International Conference on Machine Learning, New Jersey, USA, 1994.
[14] P. Marbach and J. N. Tsitsiklis, "Gradient-based optimization of Markov reward processes: Practical variants," Center for Communications Systems Research, University of Cambridge, Tech. Rep., March 2000.
[15] V. Konda and J. Tsitsiklis, "On actor-critic algorithms," SIAM Journal on Control and Optimization, vol. 42, no. 4, pp. 1143-1166, 2003.
[16] N. Meuleau, L. Peshkin, and K. Kim, "Exploration in gradient-based reinforcement learning," Massachusetts Institute of Technology, AI Memo 2001-003, Tech. Rep., April 2001.
[17] R. Williams, "Simple statistical gradient-following algorithms for connectionist reinforcement learning," Machine Learning, vol. 8, pp. 229-256, 1992.
[18] H. Kimura, K. Miyazaki, and S. Kobayashi, "Reinforcement learning in POMDPs with function approximation," in Fourteenth International Conference on Machine Learning (ICML'97), D. H. Fisher, Ed., 1997, pp. 152-160.
[19] T. Jaakkola, S. Singh, and M. Jordan, "Reinforcement learning algorithms for partially observable Markov decision problems," in Advances in Neural Information Processing Systems, vol. 7. Morgan Kaufmann, 1995, pp. 345-352.
[20] J. Baxter and P. Bartlett, "Direct gradient-based reinforcement learning: I. Gradient estimation algorithms," Australian National University, Tech. Rep., 1999.
[21] P. Marbach and J. N. Tsitsiklis, "Simulation-based optimization of Markov reward processes," Massachusetts Institute of Technology, Tech. Rep. LIDS-P-2411, 1998.
[22] P. Marbach, "Simulation-based methods for Markov decision processes," Ph.D. dissertation, Laboratory for Information and Decision Systems, MIT, 1998.
[26] R. Tedrake, T. W. Zhang, and H. S. Seung, "Stochastic policy gradient reinforcement learning on a simple 3D biped," in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS'04), Sendai, Japan, September 28 - October 2, 2004.
[27] T. Matsubara, J. Morimoto, J. Nakanishi, M. Sato, and K. Doya, "Learning sensory feedback to CPG with policy gradient for biped locomotion," in Proceedings of the International Conference on Robotics and Automation (ICRA), Barcelona, Spain, April 2005.