1. Andrieu, C., de Freitas, N., Doucet, A., & Jordan, M. I. (2003). An introduction to MCMC for machine learning. Machine Learning, 50(1), 5-43. doi:10.1023/A:1020281327116
2. Atkeson, C. G. (1994). Using local trajectory optimizers to speed up global optimization in dynamic programming. In Advances in neural information processing systems (Vol. 6, pp. 503-521), Denver, CO, USA.
5. Bagnell, J., Kakade, S., Ng, A., & Schneider, J. (2004). Policy search by dynamic programming. In Advances in neural information processing systems (Vol. 16), Vancouver, BC, CA.
6. Binder, J., Koller, D., Russell, S., & Kanazawa, K. (1997). Adaptive probabilistic networks with hidden variables. Machine Learning, 29(2-3), 213-244. doi:10.1023/A:1007421730016
7. Chiappa, S., Kober, J., & Peters, J. (2009). Using Bayesian dynamical systems for motion template libraries. In D. Koller, D. Schuurmans, Y. Bengio, & L. Bottou (Eds.), Advances in neural information processing systems (Vol. 21, pp. 297-304).
8. DARPA (2010a). Learning locomotion (L2). http://www.darpa.mil/ipto/programs/ll/ll.asp.
11. Dayan, P., & Hinton, G. E. (1997). Using expectation-maximization for reinforcement learning. Neural Computation, 9(2), 271-278. doi:10.1162/neco.1997.9.2.271
13. El-Fakdi, A., Carreras, M., & Ridao, P. (2006). Towards direct policy search reinforcement learning for robot control. In Proceedings of the IEEE/RSJ 2006 international conference on intelligent robots and systems (IROS), Beijing, China.
15. Guenter, F., Hersch, M., Calinon, S., & Billard, A. (2007). Reinforcement learning for imitating constrained reaching movements. Advanced Robotics, Special Issue on Imitative Robots, 21(13), 1521-1544.
17. Hoffman, M., Doucet, A., de Freitas, N., & Jasra, A. (2007). Bayesian policy learning with trans-dimensional MCMC. In Advances in neural information processing systems (Vol. 20), Vancouver, BC, CA.
18. Ijspeert, A. J., Nakanishi, J., & Schaal, S. (2002). Movement imitation with nonlinear dynamical systems in humanoid robots. In Proceedings of the IEEE international conference on robotics and automation (ICRA) (Vol. 2, pp. 1398-1403), Washington, DC.
19. Ijspeert, A. J., Nakanishi, J., & Schaal, S. (2003). Learning attractor landscapes for learning motor primitives. In Advances in neural information processing systems (Vol. 15, pp. 1547-1554), Vancouver, BC, CA.
20. Jaakkola, T., Jordan, M. I., & Singh, S. P. (1994). Convergence of stochastic iterative dynamic programming algorithms. In J. D. Cowan, G. Tesauro, & J. Alspector (Eds.), Advances in neural information processing systems (Vol. 6, pp. 703-710). San Mateo: Morgan Kaufmann.
23. Kober, J., & Peters, J. (2009b). Policy search for motor primitives in robotics. In D. Koller, D. Schuurmans, Y. Bengio, & L. Bottou (Eds.), Advances in neural information processing systems (Vol. 21, pp. 849-856).
24. Kober, J., Mohler, B., & Peters, J. (2008). Learning perceptual coupling for motor primitives. In Proceedings of the IEEE/RSJ 2008 international conference on intelligent robots and systems (IROS) (pp. 834-839), Nice, France. doi:10.1109/IROS.2008.4650953
26. Kwee, I., Hutter, M., & Schmidhuber, J. (2001). Gradient-based reinforcement planning in policy-search methods. In M. A. Wiering (Ed.), Cognitieve Kunstmatige Intelligentie: Vol. 27. Proceedings of the 5th European workshop on reinforcement learning (EWRL) (pp. 27-29), Lugano. Manno: Onderwijsinsituut CKI, Utrecht University.
27. Lawrence, G., Cowan, N., & Russell, S. (2003). Efficient gradient estimation for motor control learning. In Proceedings of the international conference on uncertainty in artificial intelligence (UAI) (pp. 354-361), Acapulco, Mexico.
30. Miyamoto, H., Schaal, S., Gandolfo, F., Gomi, H., Koike, Y., Osu, R., Nakano, E., Wada, Y., & Kawato, M. (1996). A Kendama learning robot based on bi-directional theory. Neural Networks, 9(8), 1281-1302. doi:10.1016/S0893-6080(96)00043-3
32. Ng, A. Y., Kim, H. J., Jordan, M. I., & Sastry, S. (2004). Inverted autonomous helicopter flight via reinforcement learning. In Proceedings of the international symposium on experimental robotics (ISER). Cambridge: MIT Press.
33. Park, D. H., Hoffmann, H., Pastor, P., & Schaal, S. (2008). Movement reproduction and obstacle avoidance with dynamic movement primitives and potential fields. In IEEE international conference on humanoid robots (HUMANOIDS) (pp. 91-98).
34. PASCAL2 (2010). Challenges. http://pascallin2.ecs.soton.ac.uk/Challenges/.
36. Peters, J. (2007). Machine learning of motor skills for robotics. PhD thesis, University of Southern California, Los Angeles, CA, USA.
37. Peters, J., & Schaal, S. (2006). Policy gradient methods for robotics. In Proceedings of the IEEE/RSJ 2006 international conference on intelligent robots and systems (IROS) (pp. 2219-2225), Beijing, China. doi:10.1109/IROS.2006.282564
38. Peters, J., & Schaal, S. (2007). Reinforcement learning by reward-weighted regression for operational space control. In Proceedings of the international conference on machine learning (ICML), Corvallis, OR, USA.
39. Peters, J., Vijayakumar, S., & Schaal, S. (2003). Reinforcement learning for humanoid robotics. In Proceedings of the IEEE-RAS international conference on humanoid robots (HUMANOIDS) (pp. 103-123), Karlsruhe, Germany.
40. Peters, J., Vijayakumar, S., & Schaal, S. (2005). Natural actor-critic. In Proceedings of the European conference on machine learning (ECML) (pp. 280-291), Porto, Portugal.
41. Rückstieß, T., Felder, M., & Schmidhuber, J. (2008). State-dependent exploration for policy gradient methods. In Proceedings of the European conference on machine learning (ECML) (pp. 234-249), Antwerp, Belgium.
43. Schaal, S., Atkeson, C. G., & Vijayakumar, S. (2002). Scalable techniques from nonparametric statistics for real-time robot learning. Applied Intelligence, 17(1), 49-60. doi:10.1023/A:1015727715131
44. Schaal, S., Peters, J., Nakanishi, J., & Ijspeert, A. J. (2003). Control, planning, learning, and imitation with dynamic movement primitives. In Proceedings of the workshop on bilateral paradigms on humans and humanoids, IEEE international conference on intelligent robots and systems (IROS), Las Vegas, NV, October 27-31, 2003.
45. Schaal, S., Mohajerian, P., & Ijspeert, A. J. (2007). Dynamics systems vs. optimal control - a unifying view. Progress in Brain Research, 165, 425-445. doi:10.1016/S0079-6123(06)65027-9
51. Sutton, R. S. (1990). Integrated architectures for learning, planning, and reacting based on approximating dynamic programming. In Proceedings of the international machine learning conference (pp. 9-44).
52. Sutton, R. S., McAllester, D., Singh, S., & Mansour, Y. (2000). Policy gradient methods for reinforcement learning with function approximation. In Advances in neural information processing systems (NIPS) (Vol. 13, pp. 1057-1063), Denver, CO, USA.
55. Tedrake, R., Zhang, T. W., & Seung, H. S. (2004). Stochastic policy gradient reinforcement learning on a simple 3D biped. In Proceedings of the 2004 IEEE/RSJ international conference on intelligent robots and systems (IROS) (Vol. 3, pp. 2849-2854).
57. Toussaint, M., & Goerick, C. (2007). Probabilistic inference for structured planning in robotics. In Proceedings of the IEEE/RSJ 2007 international conference on intelligent robots and systems (IROS), San Diego, CA, USA.
59. Vlassis, N., Toussaint, M., Kontes, G., & Piperidis, S. (2009). Learning model-free robot control by a Monte Carlo EM algorithm. Autonomous Robots, 27(2), 123-130. doi:10.1007/s10514-009-9132-0
60. Williams, R. J. (1992). Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning, 8, 229-256.