



Volume 3141, 2004, Pages 80-94

Biologically inspired reinforcement learning: Reward-Based decomposition for multi-goal environments

Author keywords

[No Author keywords available]

Indexed keywords

ALGORITHMS; DECISION MAKING; REINFORCEMENT LEARNING

EID: 35048843384     PISSN: 03029743     EISSN: 16113349     Source Type: Book Series    
DOI: 10.1007/978-3-540-27835-1_7     Document Type: Article
Times cited: 5

References (12)
  • 1
    • 0002278788
    • Hierarchical reinforcement learning with the MAXQ value function decomposition
    • T. G. Dietterich. Hierarchical reinforcement learning with the MAXQ value function decomposition. Journal of Artificial Intelligence Research, 13:227-303, 2000.
    • (2000) Journal of Artificial Intelligence Research , vol.13 , pp. 227-303
    • Dietterich, T.G.1
  • 2
    • 0033170372
    • Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning
    • R. S. Sutton, D. Precup, and S. Singh. Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence, 112(1):181-211, 1999.
    • (1999) Artificial Intelligence , vol.112 , Issue.1 , pp. 181-211
    • Sutton, R.S.1    Precup, D.2    Singh, S.3
  • 3
    • 0013465036
    • Discovering hierarchy in reinforcement learning with HEXQ
    • Sydney, Australia
    • B. Hengst. Discovering hierarchy in reinforcement learning with HEXQ. In Claude Sammut and Achim Hoffmann, editors, the Nineteenth International Conference on Machine Learning, pages 243-250, Sydney, Australia, 2002.
    • (2002) The Nineteenth International Conference on Machine Learning , pp. 243-250
    • Hengst, B.1
  • 4
    • 84899028619
    • Balancing multiple sources of reward in reinforcement learning
    • MIT Press
    • C. R. Shelton. Balancing multiple sources of reward in reinforcement learning. In Advances in Neural Information Processing Systems, volume 13, pages 1082-1088. MIT Press, 2001.
    • (2001) Advances in Neural Information Processing Systems , vol.13 , pp. 1082-1088
    • Shelton, C.R.1
  • 6
    • 0034061495
    • Reward processing in primate orbitofrontal cortex and basal ganglia
    • W. L. Schultz, L. Tremblay, and J. R. Hollerman. Reward processing in primate orbitofrontal cortex and basal ganglia. Cerebral Cortex, 10:272-283, 2000.
    • (2000) Cerebral Cortex , vol.10 , pp. 272-283
    • Schultz, W.L.1    Tremblay, L.2    Hollerman, J.R.3
  • 7
    • 84958795157
    • A biologically inspired hierarchical reinforcement learning system
    • to appear
    • W. Zhou and R. Coggins. A biologically inspired hierarchical reinforcement learning system. Cybernetics and Systems, to appear, 2004.
    • (2004) Cybernetics and Systems
    • Zhou, W.1    Coggins, R.2
  • 8
    • 0026847155
    • Brain mechanisms of emotion and emotional learning
    • J. E. LeDoux. Brain mechanisms of emotion and emotional learning. Current Opinion in Neurobiology, 2:191-197, 1992.
    • (1992) Current Opinion in Neurobiology , vol.2 , pp. 191-197
    • LeDoux, J.E.1
  • 9
    • 0000541213
    • Adaptive critics and the basal ganglia
    • J. C. Houk, J. L. Davis, and D. G. Beiser, editors, MIT Press
    • A. G. Barto. Adaptive critics and the basal ganglia. In J. C. Houk, J. L. Davis, and D. G. Beiser, editors, Models of information processing in the basal ganglia, pages 215-232. MIT Press, 1995.
    • (1995) Models of Information Processing in the Basal Ganglia , pp. 215-232
    • Barto, A.G.1
  • 10
    • 33847202724
    • Learning to predict by the methods of temporal differences
    • R. S. Sutton. Learning to predict by the methods of temporal differences. Machine Learning, 3:9-44, 1988.
    • (1988) Machine Learning , vol.3 , pp. 9-44
    • Sutton, R.S.1
  • 11
    • 85150714688
    • Reinforcement learning methods for continuous-time Markov decision problems
    • MIT Press
    • S. J. Bradtke and M. O. Duff. Reinforcement learning methods for continuous-time Markov decision problems. In Advances in Neural Information Processing Systems, volume 7, pages 393-400. MIT Press, 1995.
    • (1995) Advances in Neural Information Processing Systems , vol.7 , pp. 393-400
    • Bradtke, S.J.1    Duff, M.O.2
  • 12
    • 85047698537
    • Emotion-triggered learning in autonomous robot control
    • S. C. Gadanho and J. Hallam. Emotion-triggered learning in autonomous robot control. Cybernetics and Systems, 32(5):531-559, 2001.
    • (2001) Cybernetics and Systems , vol.32 , Issue.5 , pp. 531-559
    • Gadanho, S.C.1    Hallam, J.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.