



Volume 7, Issue FEB, 2013, Pages

Scaled free-energy based reinforcement learning for robust and efficient learning in high-dimensional state spaces

Author keywords

Free energy; Function approximation; Reinforcement learning; Restricted Boltzmann machine; Robot navigation

Indexed keywords

ARTICLE; CAMERA; COLOR; DELAY DISCOUNTING; FREE ENERGY BASED REINFORCEMENT LEARNING; LEARNING ALGORITHM; MACHINE LEARNING; NONHUMAN; REWARD; ROBOTICS; SENSOR; SIMULATION; TEMPERATURE

EID: 84892587881     PISSN: None     EISSN: 16625218     Source Type: Journal    
DOI: 10.3389/fnbot.2013.00003     Document Type: Article
Times cited : (9)

References (21)
  • 1
    • 0034276719
    • Spatial cognition and neuro-mimetic navigation: a model of hippocampal place cell activity
    • Arleo, A., and Gerstner, W. (2000). Spatial cognition and neuro-mimetic navigation: a model of hippocampal place cell activity. Biol. Cybern. 83, 287-299.
    • (2000) Biol. Cybern. , vol.83 , pp. 287-299
    • Arleo, A.1    Gerstner, W.2
  • 2
    • 42149165123
    • Biologically-inspired robot spatial cognition based on rat neurophysiological studies
    • Barrera, A., and Weitzenfeld, A. (2008). Biologically-inspired robot spatial cognition based on rat neurophysiological studies. Auton. Robots 25, 147-169.
    • (2008) Auton. Robots , vol.25 , pp. 147-169
    • Barrera, A.1    Weitzenfeld, A.2
  • 4
    • 0036618011
    • Multiple model-based reinforcement learning
    • Doya, K., Samejima, K., Katagiri, K., and Kawato, M. (2002). Multiple model-based reinforcement learning. Neural Comput. 14, 1347-1369.
    • (2002) Neural Comput. , vol.14 , pp. 1347-1369
    • Doya, K.1    Samejima, K.2    Katagiri, K.3    Kawato, M.4
  • 5
    • 21344434798
    • The cyber rodent project: exploration of adaptive mechanisms for self-preservation and self-reproduction
    • Doya, K., and Uchibe, E. (2005). The cyber rodent project: exploration of adaptive mechanisms for self-preservation and self-reproduction. Adapt. Behav. 13, 149-160.
    • (2005) Adapt. Behav. , vol.13 , pp. 149-160
    • Doya, K.1    Uchibe, E.2
  • 6
    • 78650216238
    • "Free-energy based reinforcement learning for vision-based navigation with high-dimensional sensory inputs,"
    • in Proceedings of the International Conference on Neural Information Processing (ICONIP2010) (Berlin, Heidelberg)
    • Elfwing, S., Otsuka, M., Uchibe, E., and Doya, K. (2010). "Free-energy based reinforcement learning for vision-based navigation with high-dimensional sensory inputs," in Proceedings of the International Conference on Neural Information Processing (ICONIP2010) (Berlin, Heidelberg), 215-222.
    • (2010) , pp. 215-222
    • Elfwing, S.1    Otsuka, M.2    Uchibe, E.3    Doya, K.4
  • 7
    • 33847689778
    • Retrospective and prospective responses arising in a modeled hippocampus during maze navigation by a brain-based device
    • Fleischer, J. G., Gally, J. A., Edelman, G. M., and Krichmar, J. L. (2007). Retrospective and prospective responses arising in a modeled hippocampus during maze navigation by a brain-based device. Proc. Natl. Acad. Sci. U.S.A. 104, 3556-3561.
    • (2007) Proc. Natl. Acad. Sci. U.S.A. , vol.104 , pp. 3556-3561
    • Fleischer, J.G.1    Gally, J.A.2    Edelman, G.M.3    Krichmar, J.L.4
  • 8
    • 0345368881
    • "Unsupervised learning of distributions on binary vectors using two layer networks,"
    • in Advances in Neural Information Processing Systems 4, eds J.E. Moody, S.J. Hanson, and R.P. Lippmann (Denver, CO: Morgan Kaufmann)
    • Freund, Y., and Haussler, D. (1992). "Unsupervised learning of distributions on binary vectors using two layer networks," in Advances in Neural Information Processing Systems 4, eds J. E. Moody, S. J. Hanson, and R. P. Lippmann (Denver, CO: Morgan Kaufmann), 912-919.
    • (1992) , pp. 912-919
    • Freund, Y.1    Haussler, D.2
  • 9
    • 69549110079
    • "Autonomous vision-based navigation: goal-oriented action planning by transient states prediction, cognitive map building, and sensory-motor learning,"
    • in Proceedings of the International Conference on Intelligent Robots and Systems (IROS2008) (Nice, France)
    • Giovannangeli, C., and Gaussier, P. (2008). "Autonomous vision-based navigation: goal-oriented action planning by transient states prediction, cognitive map building, and sensory-motor learning," in Proceedings of the International Conference on Intelligent Robots and Systems (IROS2008) (Nice, France), 676-683.
    • (2008) , pp. 676-683
    • Giovannangeli, C.1    Gaussier, P.2
  • 10
    • 0013344078
    • Training products of experts by minimizing contrastive divergence
    • Hinton, G. E. (2002). Training products of experts by minimizing contrastive divergence. Neural Comput. 14, 1771-1800.
    • (2002) Neural Comput. , vol.14 , pp. 1771-1800
    • Hinton, G.E.1
  • 11
    • 23444445493
    • Spatial navigation and causal analysis in a brain-based device modeling cortical-hippocampal interactions
    • Krichmar, J. L., Seth, A. K., Nitz, D. A., Fleischer, J., and Edelman, G. M. (2005). Spatial navigation and causal analysis in a brain-based device modeling cortical-hippocampal interactions. Neuroinformatics 3, 197-221.
    • (2005) Neuroinformatics , vol.3 , pp. 197-221
    • Krichmar, J.L.1    Seth, A.K.2    Nitz, D.A.3    Fleischer, J.4    Edelman, G.M.5
  • 12
    • 0032203257
    • Gradient-based learning applied to document recognition
    • LeCun, Y., Bottou, L., Bengio, Y., and Haffner, P. (1998). Gradient-based learning applied to document recognition. Proc. IEEE 86, 2278-2324.
    • (1998) Proc. IEEE , vol.86 , pp. 2278-2324
    • LeCun, Y.1    Bottou, L.2    Bengio, Y.3    Haffner, P.4
  • 13
    • 84904884798
    • version 7.14.0 (R2012a).
    • MATLAB, Natick, MA: The MathWorks Inc.
    • MATLAB. (2010). version 7.14.0 (R2012a). Natick, MA: The MathWorks Inc.
    • (2010)
  • 14
    • 77955596781
    • Persistent navigation and mapping using a biologically inspired slam system
    • Milford, M., and Wyeth, G. (2010). Persistent navigation and mapping using a biologically inspired slam system. Int. J. Rob. Res. 29, 1131-1153.
    • (2010) Int. J. Rob. Res. , vol.29 , pp. 1131-1153
    • Milford, M.1    Wyeth, G.2
  • 15
    • 84887013392
    • "Free-energy-based reinforcement learning in a partially observable environments,"
    • in Proceedings of the European Symposium on Artificial Neural Networks (ESANN2010) (Bruges, Belgium)
    • Otsuka, M., Yoshimoto, J., and Doya, K. (2010). "Free-energy-based reinforcement learning in a partially observable environments," in Proceedings of the European Symposium on Artificial Neural Networks (ESANN2010) (Bruges, Belgium), 541-545.
    • (2010) , pp. 541-545
    • Otsuka, M.1    Yoshimoto, J.2    Doya, K.3
  • 16
    • 0003636089
    • On-line Q-learning Using Connectionist Systems
    • Cambridge University Engineering Department
    • Rummery, G. A., and Niranjan, M. (1994). On-line Q-learning Using Connectionist Systems. Technical Report CUED/F-INFENG/TR 166, Cambridge University Engineering Department.
    • (1994) Technical Report CUED/F-INFENG/TR , vol.166
    • Rummery, G.A.1    Niranjan, M.2
  • 17
    • 32844474095
    • Reinforcement learning with factored states and actions
    • Sallans, B., and Hinton, G. E. (2004). Reinforcement learning with factored states and actions. J. Mach. Learn. Res. 5, 1063-1088.
    • (2004) J. Mach. Learn. Res. , vol.5 , pp. 1063-1088
    • Sallans, B.1    Hinton, G.E.2
  • 18
    • 0000329993
    • "Information processing in dynamical systems: foundations of harmony theory,"
    • in Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Volume 1: Foundations, eds D.E. Rumelhart and J.L. McClelland (Cambridge, MA: MIT Press)
    • Smolensky, P. (1986). "Information processing in dynamical systems: foundations of harmony theory," in Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Volume 1: Foundations, eds D. E. Rumelhart and J. L. McClelland (Cambridge, MA: MIT Press), 194-281.
    • (1986) , pp. 194-281
    • Smolensky, P.1
  • 19
    • 85156221438
    • "Generalization in reinforcement learning: successful examples using sparse coarse coding,"
    • in Advances in Neural Information Processing Systems 8, eds D.S. Touretzky, M.C. Mozer, and M.E. Hasselmo (Denver, CO: MIT Press)
    • Sutton, R. S. (1996). "Generalization in reinforcement learning: successful examples using sparse coarse coding," in Advances in Neural Information Processing Systems 8, eds D. S. Touretzky, M. C. Mozer, and M. E. Hasselmo (Denver, CO: MIT Press), 1038-1044.
    • (1996) , pp. 1038-1044
    • Sutton, R.S.1
  • 20
    • 0004102479
    • Reinforcement Learning: An Introduction.
    • Cambridge, MA: MIT Press.
    • Sutton, R. S., and Barto, A. (1998). Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press.
    • (1998)
    • Sutton, R.S.1    Barto, A.2
  • 21
    • 14044256851
    • "Competitive-cooperative-concurrent reinforcement learning with importance sampling,"
    • in Proceedings of the International Conference on Simulation of Adaptive Behavior: From Animals and Animats (SAB2004) (Santa Monica, CA: MIT Press)
    • Uchibe, E., and Doya, K. (2004). "Competitive-cooperative-concurrent reinforcement learning with importance sampling," in Proceedings of the International Conference on Simulation of Adaptive Behavior: From Animals and Animats (SAB2004) (Santa Monica, CA: MIT Press), 287-296.
    • (2004) , pp. 287-296
    • Uchibe, E.1    Doya, K.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.