Frontiers in Neurorobotics, Volume 7, Issue APR, 2013

Evaluation of linearly solvable Markov decision process with dynamic model learning in a mobile robot navigation task

Author keywords

Linearly solvable Markov decision process; Model learning; Model based reinforcement learning; Optimal control; Robot navigation

Indexed keywords

Article; Camera; Computer simulation; Decision making; Learning; Linearly solvable Markov decision process; Mathematical computing; Robotics; Statistical model; Task performance

EID: 84904902685     PISSN: None     EISSN: 16625218     Source Type: Journal    
DOI: 10.3389/fnbot.2013.00007     Document Type: Article
Times cited: 18

References (28)
  • 2. Boyan, J. A. (2002). Technical update: least-squares temporal difference learning. Mach. Learn. 49, 233-246.
  • 3. Burdelis, M. A. P., and Ikeda, K. (2011). "Estimating passive dynamics distributions in linearly solvable Markov decision processes from measured immediate costs in reinforcement learning problems," in Proceedings of the 21st Annual Conference of the Japanese Neural Network Society (Okinawa).
  • 4. da Silva, M., Durand, F., and Popovic, J. (2009). Linear Bellman combination for control of character animation. ACM Trans. Graph. 28. doi: 10.1145/1531326.1531388.
  • 5. Daw, N. D., Gershman, S. J., Seymour, B., Dayan, P., and Dolan, R. J. (2011). Model-based influences on humans' choices and striatal prediction errors. Neuron 69, 1204-1215.
  • 8. Doll, B. B., Simon, D. A., and Daw, N. D. (2012). The ubiquity of model-based reinforcement learning. Curr. Opin. Neurobiol. 22, 1075-1081.
  • 9. Doya, K. (2009). How can we learn efficiently to act optimally and flexibly? Proc. Natl. Acad. Sci. U.S.A. 106, 11429-11430.
  • 10. Fleming, W., and Soner, H. (eds.). (2006). "Logarithmic transformations and risk sensitivity," in Controlled Markov Processes and Viscosity Solutions, Chapter 6 (New York, NY: Springer Science + Business Media, Inc.), 227-260.
  • 12. Kappen, H. (2005a). Linear theory for control of nonlinear stochastic systems. Phys. Rev. Lett. 95, 200201-200204.
  • 13. Kappen, H. (2005b). Path integrals and symmetry breaking for optimal control theory. J. Stat. Mech. Theor. Exp. 11, P11011.
  • 14. Lagoudakis, M. G., and Parr, R. (2003). Least-squares policy iteration. J. Mach. Learn. Res. 4, 1107-1149.
  • 15. Nguyen-Tuong, D., and Peters, J. (2011). Model learning for robot control: a survey. Cogn. Proc. 12, 319-340.
  • 16. Niv, Y. (2009). Reinforcement learning in the brain. J. Math. Psychol. 53, 139-154.
  • 17. Sigaud, O., Salaün, C., and Padois, V. (2011). On-line regression algorithms for learning mechanical models of robots: a survey. Robot. Auton. Syst. 59, 1115-1129.
  • 21. Theodorou, E., Buchli, J., and Schaal, S. (2010). A generalized path integral control approach to reinforcement learning. J. Mach. Learn. Res. 11, 3137-3181.
  • 22. Theodorou, E., and Todorov, E. (2012). "Relative entropy and free energy dualities: connections to path integral and KL control," in the 51st IEEE Conference on Decision and Control (Maui), 1466-1473.
  • 23. Todorov, E. (2006). "Optimal control theory," in Bayesian Brain: Probabilistic Approaches to Neural Coding, Chapter 12, eds K. Doya, S. Ishii, A. Pouget, and R. P. Rao (Cambridge, MA: MIT Press), 269-298.
  • 24. Todorov, E. (2007). Linearly-solvable Markov decision problems. Adv. Neural Inform. Proc. Syst. 19, 1369-1379.
  • 25. Todorov, E. (2009a). Compositionality of optimal control laws. Adv. Neural Inform. Proc. Syst. 22, 1856-1864.
  • 26. Todorov, E. (2009b). Efficient computation of optimal actions. Proc. Natl. Acad. Sci. U.S.A. 106, 11478-11483.
  • 27. Todorov, E. (2009c). "Eigenfunction approximation methods for linearly-solvable optimal control problems," in Proceedings of the 2nd IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (Nashville, TN), 161-168.
  • 28. Zhong, M., and Todorov, E. (2011). "Aggregation methods for linearly-solvable Markov decision process," in Proceedings of the World Congress of the International Federation of Automatic Control (Milano).


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.