Boyan, J. A. (2002). Technical update: least-squares temporal difference learning. Mach. Learn. 49, 233-246.
Burdelis, M. A. P., and Ikeda, K. (2011). "Estimating passive dynamics distributions in linearly solvable Markov decision processes from measured immediate costs in reinforcement learning problems," in Proceedings of the 21st Annual Conference of the Japanese Neural Network Society (Okinawa).
da Silva, M., Durand, F., and Popovic, J. (2009). Linear Bellman combination for control of character animation. ACM Trans. Graph. 28. doi: 10.1145/1531326.1531388.
Daw, N. D., Gershman, S. J., Seymour, B., Dayan, P., and Dolan, R. J. (2011). Model-based influences on humans' choices and striatal prediction errors. Neuron 69, 1204-1215.
Deisenroth, M. P., and Rasmussen, C. E. (2011). "PILCO: a model-based and data-efficient approach to policy search," in Proceedings of the 28th International Conference on Machine Learning, eds L. Getoor and T. Scheffer (Bellevue, WA, USA).
Deisenroth, M. P., Rasmussen, C. E., and Peters, J. (2009). Gaussian process dynamic programming. Neurocomputing 72, 1508-1524.
Doll, B. B., Simon, D. A., and Daw, N. D. (2012). The ubiquity of model-based reinforcement learning. Curr. Opin. Neurobiol. 22, 1075-1081.
Doya, K. (2009). How can we learn efficiently to act optimally and flexibly? Proc. Natl. Acad. Sci. U.S.A. 106, 11429-11430.
Fleming, W., and Soner, H. (eds.). (2006). "Logarithmic transformations and risk sensitivity," in Controlled Markov Processes and Viscosity Solutions, Chapter 6 (New York, NY: Springer Science + Business Media, Inc.), 227-260.
Hester, T., Quinlan, M., and Stone, P. (2010). "Generalized model learning for reinforcement learning on a humanoid robot," in Proceedings of IEEE International Conference on Robotics and Automation (Anchorage, AK: IEEE), 2369-2374.
Kappen, H. (2005a). Linear theory for control of nonlinear stochastic systems. Phys. Rev. Lett. 95, 200201-200204.
Kappen, H. (2005b). Path integrals and symmetry breaking for optimal control theory. J. Stat. Mech. Theor. Exp. 11, P11011.
Nguyen-Tuong, D., and Peters, J. (2011). Model learning for robot control: a survey. Cogn. Proc. 12, 319-340.
Niv, Y. (2009). Reinforcement learning in the brain. J. Math. Psychol. 53, 139-154.
Sigaud, O., Salaün, C., and Padois, V. (2011). On-line regression algorithms for learning mechanical models of robots: a survey. Robot. Auton. Syst. 59, 1115-1129.
Sugiyama, M., Takeuchi, I., Suzuki, T., Kanamori, T., and Hachiya, H. (2010). Least-squares conditional density estimation. IEICE Trans. Inform. Syst. E93-D, 583-594.
Theodorou, E., Buchli, J., and Schaal, S. (2010). A generalized path integral control approach to reinforcement learning. J. Mach. Learn. Res. 11, 3137-3181.
Theodorou, E., and Todorov, E. (2012). "Relative entropy and free energy dualities: connections to path integral and KL control," in the 51st IEEE Conference on Decision and Control (Maui), 1466-1473.
Todorov, E. (2006). "Optimal control theory," in Bayesian Brain: Probabilistic Approaches to Neural Coding, Chapter 12, eds K. Doya, S. Ishii, A. Pouget, and R. P. Rao (Cambridge, MA: MIT Press), 269-298.
Todorov, E. (2007). Linearly-solvable Markov decision problems. Adv. Neural Inform. Proc. Syst. 19, 1369-1379.
Todorov, E. (2009a). Compositionality of optimal control laws. Adv. Neural Inform. Proc. Syst. 22, 1856-1864.
Todorov, E. (2009b). Efficient computation of optimal actions. Proc. Natl. Acad. Sci. U.S.A. 106, 11478-11483.
Todorov, E. (2009c). "Eigenfunction approximation methods for linearly-solvable optimal control problems," in Proceedings of the 2nd IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (Nashville, TN), 161-168.
Zhong, M., and Todorov, E. (2011). "Aggregation methods for linearly-solvable Markov decision process," in Proceedings of the World Congress of the International Federation of Automatic Control (Milano).