Frontiers in Neurorobotics, Volume 7, Issue APR, 2013

Evaluation of linearly solvable Markov decision process with dynamic model learning in a mobile robot navigation task

Author keywords

Linearly solvable Markov decision process; Model learning; Model based reinforcement learning; Optimal control; Robot navigation

Indexed keywords

Article; Camera; Computer simulation; Decision making; Learning; Linearly solvable Markov decision process; Mathematical computing; Robotics; Statistical model; Task performance

EID: 84904902685     PISSN: None     EISSN: 16625218     Source Type: Journal    
DOI: 10.3389/fnbot.2013.00007     Document Type: Article
Times cited: 18

References (28)
  • 2. Boyan, J. A. (2002). Technical update: least-squares temporal difference learning. Mach. Learn. 49, 233-246.
  • 3. Burdelis, M. A. P., and Ikeda, K. (2011). "Estimating passive dynamics distributions in linearly solvable Markov decision processes from measured immediate costs in reinforcement learning problems," in Proceedings of the 21st Annual Conference of the Japanese Neural Network Society (Okinawa).
  • 4. da Silva, M., Durand, F., and Popovic, J. (2009). Linear Bellman combination for control of character animation. ACM Trans. Graph. 28. doi: 10.1145/1531326.1531388.
  • 5. Daw, N. D., Gershman, S. J., Seymour, B., Dayan, P., and Dolan, R. J. (2011). Model-based influences on humans' choices and striatal prediction errors. Neuron 69, 1204-1215.
  • 8. Doll, B. B., Simon, D. A., and Daw, N. D. (2012). The ubiquity of model-based reinforcement learning. Curr. Opin. Neurobiol. 22, 1075-1081.
  • 9. Doya, K. (2009). How can we learn efficiently to act optimally and flexibly? Proc. Natl. Acad. Sci. U.S.A. 106, 11429-11430.
  • 10. Fleming, W., and Soner, H. (eds.). (2006). "Logarithmic transformations and risk sensitivity," in Controlled Markov Processes and Viscosity Solutions, Chapter 6 (New York, NY: Springer Science + Business Media, Inc.), 227-260.
  • 12. Kappen, H. (2005a). Linear theory for control of nonlinear stochastic systems. Phys. Rev. Lett. 95, 200201-200204.
  • 13. Kappen, H. (2005b). Path integrals and symmetry breaking for optimal control theory. J. Stat. Mech. Theor. Exp. 11, P11011.
  • 14. Lagoudakis, M. G., and Parr, R. (2003). Least-squares policy iteration. J. Mach. Learn. Res. 4, 1107-1149.
  • 15. Nguyen-Tuong, D., and Peters, J. (2011). Model learning for robot control: a survey. Cogn. Proc. 12, 319-340.
  • 16. Niv, Y. (2009). Reinforcement learning in the brain. J. Math. Psychol. 53, 139-154.
  • 17. Sigaud, O., Salaün, C., and Padois, V. (2011). On-line regression algorithms for learning mechanical models of robots: a survey. Robot. Auton. Syst. 59, 1115-1129.
  • 21. Theodorou, E., Buchli, J., and Schaal, S. (2010). A generalized path integral control approach to reinforcement learning. J. Mach. Learn. Res. 11, 3137-3181.
  • 22. Theodorou, E., and Todorov, E. (2012). "Relative entropy and free energy dualities: connections to path integral and KL control," in the 51st IEEE Conference on Decision and Control (Maui), 1466-1473.
  • 23. Todorov, E. (2006). "Optimal control theory," in Bayesian Brain: Probabilistic Approaches to Neural Coding, Chapter 12, eds K. Doya, S. Ishii, A. Pouget, and R. P. Rao (Cambridge, MA: MIT Press), 269-298.
  • 24. Todorov, E. (2007). Linearly-solvable Markov decision problems. Adv. Neural Inform. Proc. Syst. 19, 1369-1379.
  • 25. Todorov, E. (2009a). Compositionality of optimal control laws. Adv. Neural Inform. Proc. Syst. 22, 1856-1864.
  • 26. Todorov, E. (2009b). Efficient computation of optimal actions. Proc. Natl. Acad. Sci. U.S.A. 106, 11478-11483.
  • 27. Todorov, E. (2009c). "Eigenfunction approximation methods for linearly-solvable optimal control problems," in Proceedings of the 2nd IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (Nashville, TN), 161-168.
  • 28. Zhong, M., and Todorov, E. (2011). "Aggregation methods for linearly-solvable Markov decision process," in Proceedings of the World Congress of the International Federation of Automatic Control (Milano).


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.