Volume 36, Issue 1, 2005, Pages 1-44

A biologically inspired hierarchical reinforcement learning system

Author keywords

[No Author keywords available]

Indexed keywords

FUNCTIONS; HIERARCHICAL SYSTEMS; LEARNING ALGORITHMS; NEUROLOGY; PROBLEM SOLVING; PUBLIC POLICY;

EID: 13844298400     PISSN: 01969722     EISSN: None     Source Type: Journal    
DOI: 10.1080/01969720590887270     Document Type: Article
Times cited: 3

References (52)
  • 1
    • 13844265451
    • Emotional learning: A computational model of the amygdala
    • Balkenius, C. and J. Moren. 2001. Emotional learning: A computational model of the amygdala. Cybernetics and Systems, 32:611-636.
    • (2001) Cybernetics and Systems , vol.32 , pp. 611-636
    • Balkenius, C.1    Moren, J.2
  • 2
    • 0000541213
    • Adaptive critics and the basal ganglia
    • edited by J. C. Houk, J. L. Davis, and D. G. Beiser. Cambridge, MA: MIT Press
    • Barto, A. G. 1995. Adaptive critics and the basal ganglia. In Models of information processing in the basal ganglia, pp. 215-232, edited by J. C. Houk, J. L. Davis, and D. G. Beiser. Cambridge, MA: MIT Press.
    • (1995) Models of Information Processing in the Basal Ganglia , pp. 215-232
    • Barto, A.G.1
  • 3
    • 0034214397
    • Control of response selection by reinforcer value requires interaction of amygdala and orbital prefrontal cortex
    • Baxter, M. G., A. Parker, C. C. Lindner, A. D. Izquierdo, and E. A. Murray. 2000. Control of response selection by reinforcer value requires interaction of amygdala and orbital prefrontal cortex. Journal of Neuroscience, 20:4311-4319.
    • (2000) Journal of Neuroscience , vol.20 , pp. 4311-4319
    • Baxter, M.G.1    Parker, A.2    Lindner, C.C.3    Izquierdo, A.D.4    Murray, E.A.5
  • 4
    • 0034059124
    • Emotion, decision making and the orbitofrontal cortex
    • Bechara, A., H. Damasio, and A. R. Damasio. 2000. Emotion, decision making and the orbitofrontal cortex. Cerebral Cortex, 10:295-307.
    • (2000) Cerebral Cortex , vol.10 , pp. 295-307
    • Bechara, A.1    Damasio, H.2    Damasio, A.R.3
  • 5
    • 0344063724
    • Self-learning agents: A connectionist theory of emotion based on crossbar value judgment
    • Bozinovski, S. 2001. Self-learning agents: A connectionist theory of emotion based on crossbar value judgment. Cybernetics and Systems, 32:637-669.
    • (2001) Cybernetics and Systems , vol.32 , pp. 637-669
    • Bozinovski, S.1
  • 9
    • 0001234682
    • Advances in Neural Information Processing Systems. San Mateo, CA: Morgan Kaufmann
    • Dayan, P. and G. Hinton. 1993. Feudal reinforcement learning. Advances in Neural Information Processing Systems. San Mateo, CA: Morgan Kaufmann.
    • (1993) Feudal Reinforcement Learning
    • Dayan, P.1    Hinton, G.2
  • 11
    • 0002278788
    • Hierarchical reinforcement learning with the MAXQ value function decomposition
    • Dietterich, T. G. 2000a. Hierarchical reinforcement learning with the MAXQ value function decomposition. Journal of Artificial Intelligence Research, 13:227-303.
    • (2000) Journal of Artificial Intelligence Research , vol.13 , pp. 227-303
    • Dietterich, T.G.1
  • 13
    • 0000701302
    • Differential involvement of amygdala subsystems in appetitive conditioning and drug addiction
    • edited by J. P. Aggleton. New York: Oxford University Press
    • Everitt, B. J., R. N. Cardinal, J. Hall, J. A. Parkinson, and T. W. Robbins. 2000. Differential involvement of amygdala subsystems in appetitive conditioning and drug addiction. In The amygdala: A functional analysis, pp. 253-390, edited by J. P. Aggleton. New York: Oxford University Press.
    • (2000) The Amygdala: A Functional Analysis , pp. 253-390
    • Everitt, B.J.1    Cardinal, R.N.2    Hall, J.3    Parkinson, J.A.4    Robbins, T.W.5
  • 14
    • 0034030805
    • The central nucleus of the amygdala projection to dopamine subpopulations in primates
    • Fudge, J. L. and S. N. Haber. 2000. The central nucleus of the amygdala projection to dopamine subpopulations in primates. Neuroscience, 97(3):479-494.
    • (2000) Neuroscience , vol.97 , Issue.3 , pp. 479-494
    • Fudge, J.L.1    Haber, S.N.2
  • 15
    • 85047698537
    • Emotion-triggered learning in autonomous robot control
    • Gadanho, S. C. and J. Hallam. 2001. Emotion-triggered learning in autonomous robot control. Cybernetics and Systems, 32(5):531-559.
    • (2001) Cybernetics and Systems , vol.32 , Issue.5 , pp. 531-559
    • Gadanho, S.C.1    Hallam, J.2
  • 16
    • 0033535628
    • Amygdaloid D1 dopamine receptor involvement in Pavlovian fear conditioning
    • Guarraci, F. A., R. J. Frohardt, and B. S. Kapp. 1999. Amygdaloid D1 dopamine receptor involvement in Pavlovian fear conditioning. Brain Research, 827:28-40.
    • (1999) Brain Research , vol.827 , pp. 28-40
    • Guarraci, F.A.1    Frohardt, R.J.2    Kapp, B.S.3
  • 17
    • 0029758993
    • Neurotoxic lesions of basolateral, but not central, amygdala interfere with Pavlovian second-order conditioning and reinforcer devaluation effects
    • Hatfield, T., J.-S. Han, M. Conley, M. Gallagher, and P. Holland. 1996. Neurotoxic lesions of basolateral, but not central, amygdala interfere with Pavlovian second-order conditioning and reinforcer devaluation effects. Journal of Neuroscience, 16:5256-5265.
    • (1996) Journal of Neuroscience , vol.16 , pp. 5256-5265
    • Hatfield, T.1    Han, J.-S.2    Conley, M.3    Gallagher, M.4    Holland, P.5
  • 18
    • 0013465036
    • Discovering hierarchy in reinforcement learning with HEXQ
    • Sydney, Australia, July
    • Hengst, B. 2002. Discovering hierarchy in reinforcement learning with HEXQ. Nineteenth International Conference on Machine Learning, Sydney, Australia, July, pp. 8-12.
    • (2002) Nineteenth International Conference on Machine Learning , pp. 8-12
    • Hengst, B.1
  • 19
    • 0002861883
    • A model of how the basal ganglia generate and use neural signals that predict reinforcement
    • edited by J. C. Houk, J. L. Davis, and D. G. Beiser. Cambridge, MA: MIT Press
    • Houk, J. C., J. L. Adams, and A. G. Barto. 1995. A model of how the basal ganglia generate and use neural signals that predict reinforcement. In Models of information processing in the basal ganglia, pp. 249-270, edited by J. C. Houk, J. L. Davis, and D. G. Beiser. Cambridge, MA: MIT Press.
    • (1995) Models of Information Processing in the Basal Ganglia , pp. 249-270
    • Houk, J.C.1    Adams, J.L.2    Barto, A.G.3
  • 21
    • 0030795741
    • Different types of fear-conditioned behavior mediated by separate nuclei within amygdala
    • Killcross, S., T. W. Robbins, and B. J. Everitt. 1997. Different types of fear-conditioned behavior mediated by separate nuclei within amygdala. Nature, 388:377-380.
    • (1997) Nature , vol.388 , pp. 377-380
    • Killcross, S.1    Robbins, T.W.2    Everitt, B.J.3
  • 22
    • 0026847155
    • Brain mechanisms of emotion and emotional learning
    • LeDoux, J. E. 1992. Brain mechanisms of emotion and emotional learning. Current Opinion in Neurobiology, 2:191-197.
    • (1992) Current Opinion in Neurobiology , vol.2 , pp. 191-197
    • Ledoux, J.E.1
  • 23
    • 0034076609
    • Emotion circuits in the brain
    • LeDoux, J. E. 2000. Emotion circuits in the brain. Annual Review of Neuroscience, 23:155-184.
    • (2000) Annual Review of Neuroscience , vol.23 , pp. 155-184
    • Ledoux, J.E.1
  • 24
    • 0000123778
    • Self-improving reactive agents based on reinforcement learning, planning and teaching
    • Lin, L.-J. 1992. Self-improving reactive agents based on reinforcement learning, planning and teaching. Machine Learning, 8:293-321.
    • (1992) Machine Learning , vol.8 , pp. 293-321
    • Lin, L.-J.1
  • 25
    • 0002224896
    • Scaling up reinforcement learning for robot control
    • Amherst, MA, USA, 27-29 June
    • Lin, L.-J. 1993. Scaling up reinforcement learning for robot control. Tenth International Conference on Machine Learning, Amherst, MA, USA, 27-29 June, pp. 182-189.
    • (1993) Tenth International Conference on Machine Learning , pp. 182-189
    • Lin, L.-J.1
  • 26
    • 0343632381
    • Excitotoxic lesions of the amygdala fail to produce impairment in visual learning for auditory secondary reinforcement but interfere with reinforcer devaluation effects in rhesus monkeys
    • Malkova, L., D. Gaffan, and E. A. Murray. 1997. Excitotoxic lesions of the amygdala fail to produce impairment in visual learning for auditory secondary reinforcement but interfere with reinforcer devaluation effects in rhesus monkeys. Journal of Neuroscience, 17:6011-6020.
    • (1997) Journal of Neuroscience , vol.17 , pp. 6011-6020
    • Malkova, L.1    Gaffan, D.2    Murray, E.A.3
  • 27
    • 14344264466
    • Q-Cut - Dynamic discovery of sub-goals in reinforcement learning
    • Helsinki, Finland. Berlin, Heidelberg: Springer-Verlag
    • Menache, I., S. Mannor, and N. Shimkin. 2002. Q-Cut - Dynamic discovery of sub-goals in reinforcement learning. 13th European Conference on Machine Learning, Helsinki, Finland. Berlin, Heidelberg: Springer-Verlag.
    • (2002) 13th European Conference on Machine Learning
    • Menache, I.1    Mannor, S.2    Shimkin, N.3
  • 29
    • 0035152958
    • Abstract reward and punishment representations in the human orbitofrontal cortex
    • O'Doherty, J. K., M. L. Kringelbach, E. T. Rolls, J. Hornak, and C. Andrews. 2001. Abstract reward and punishment representations in the human orbitofrontal cortex. Nature Neuroscience, 4(1):95-102.
    • (2001) Nature Neuroscience , vol.4 , Issue.1 , pp. 95-102
    • O'Doherty, J.K.1    Kringelbach, M.L.2    Rolls, E.T.3    Hornak, J.4    Andrews, C.5
  • 30
    • 0034175654
    • Neuronal correlates of fear in the lateral amygdala: Multiple extracellular recordings in conscious cats
    • Paré, D. and D. R. Collins. 2000. Neuronal correlates of fear in the lateral amygdala: Multiple extracellular recordings in conscious cats. Journal of Neuroscience, 20:2701-2710.
    • (2000) Journal of Neuroscience , vol.20 , pp. 2701-2710
    • Paré, D.1    Collins, D.R.2
  • 32
    • 0002055710
    • Connectivity of the rat amygdaloid complex
    • edited by J. P. Aggleton. New York: Oxford University Press
    • Pitkanen, A. 2000. Connectivity of the rat amygdaloid complex. In The amygdala: A functional analysis, pp. 31-115, edited by J. P. Aggleton. New York: Oxford University Press.
    • (2000) The Amygdala: a Functional Analysis , pp. 31-115
    • Pitkanen, A.1
  • 33
    • 0029093419
    • LTP is accompanied by commensurate enhancement of auditory-evoked responses in a fear conditioning circuit
    • Rogan, M. T. and J. E. LeDoux. 1995. LTP is accompanied by commensurate enhancement of auditory-evoked responses in a fear conditioning circuit. Neuron, 15:127-136.
    • (1995) Neuron , vol.15 , pp. 127-136
    • Rogan, M.T.1    Ledoux, J.E.2
  • 34
  • 36
    • 0034017604
    • Orbitofrontal cortex and reward
    • Rolls, E. T. 2000. Orbitofrontal cortex and reward. Cerebral Cortex, 10:284-294.
    • (2000) Cerebral Cortex , vol.10 , pp. 284-294
    • Rolls, E.T.1
  • 37
    • 13844270304
    • The monoaminergic innervation of the amygdala in squirrel monkey brain: An immunohistochemical study
    • Sadikot, A. F. and A. Parent. 1990. The monoaminergic innervation of the amygdala in squirrel monkey brain: An immunohistochemical study. Experimental Brain Research, 81:443-446.
    • (1990) Experimental Brain Research , vol.81 , pp. 443-446
    • Sadikot, A.F.1    Parent, A.2
  • 38
    • 0032081988
    • Orbitofrontal cortex and basolateral amygdala encode expected outcome during learning
    • Schoenbaum, G., A. A. Chiba, and M. Gallagher. 1998. Orbitofrontal cortex and basolateral amygdala encode expected outcome during learning. Nature Neuroscience, 1:155-159.
    • (1998) Nature Neuroscience , vol.1 , pp. 155-159
    • Schoenbaum, G.1    Chiba, A.A.2    Gallagher, M.3
  • 39
    • 0033103761
    • Neural encoding in orbitofrontal cortex and basolateral amygdala during olfactory discrimination learning
    • Schoenbaum, G., A. A. Chiba, and M. Gallagher. 1999. Neural encoding in orbitofrontal cortex and basolateral amygdala during olfactory discrimination learning. Journal of Neuroscience, 19(5):1876-1884.
    • (1999) Journal of Neuroscience , vol.19 , Issue.5 , pp. 1876-1884
    • Schoenbaum, G.1    Chiba, A.A.2    Gallagher, M.3
  • 40
    • 0031867046
    • Predictive reward signal of dopamine neurons
    • Schultz, W. 1998. Predictive reward signal of dopamine neurons. Journal of Neurophysiology, 80:1-27.
    • (1998) Journal of Neurophysiology , vol.80 , pp. 1-27
    • Schultz, W.1
  • 41
    • 0030896968
    • A neural substrate of prediction and reward
    • Schultz, W., P. Dayan, and P. R. Montague. 1997. A neural substrate of prediction and reward. Science, 275:1593-1599.
    • (1997) Science , vol.275 , pp. 1593-1599
    • Schultz, W.1    Dayan, P.2    Montague, P.R.3
  • 42
    • 0034061495
    • Reward processing in primate orbitofrontal cortex and basal ganglia
    • Schultz, W. L., L. Tremblay, and J. R. Hollerman. 2000. Reward processing in primate orbitofrontal cortex and basal ganglia. Cerebral Cortex, 10:272-283.
    • (2000) Cerebral Cortex , vol.10 , pp. 272-283
    • Schultz, W.L.1    Tremblay, L.2    Hollerman, J.R.3
  • 44
    • 0014041964
    • Motivational and emotional control of cognition
    • Simon, H. A. 1967. Motivational and emotional control of cognition. Psychological Review, 74:29-39.
    • (1967) Psychological Review , vol.74 , pp. 29-39
    • Simon, H.A.1
  • 45
    • 0001027894
    • Transfer of learning by composing solutions of elemental sequential tasks
    • Singh, S. P. 1992. Transfer of learning by composing solutions of elemental sequential tasks. Machine Learning, 8:323-339.
    • (1992) Machine Learning , vol.8 , pp. 323-339
    • Singh, S.P.1
  • 46
    • 33847202724
    • Learning to predict by the methods of temporal differences
    • Sutton, R. S. 1988. Learning to predict by the methods of temporal differences. Machine Learning, 3:9-44.
    • (1988) Machine Learning , vol.3 , pp. 9-44
    • Sutton, R.S.1
  • 47
    • 0019537951
    • Toward a modern theory of adaptive networks: Expectation and prediction
    • Sutton, R. S. and A. G. Barto. 1981. Toward a modern theory of adaptive networks: Expectation and prediction. Psychological Review, 88:135-140.
    • (1981) Psychological Review , vol.88 , pp. 135-140
    • Sutton, R.S.1    Barto, A.G.2
  • 49
    • 0033170372
    • Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning
    • Sutton, R. S., D. Precup, and S. Singh. 1999. Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence, 112(1):181-211.
    • (1999) Artificial Intelligence , vol.112 , Issue.1 , pp. 181-211
    • Sutton, R.S.1    Precup, D.2    Singh, S.3
  • 50
    • 0033594310
    • Relative reward preference in primate orbitofrontal cortex
    • Tremblay, L. and W. Schultz. 1999. Relative reward preference in primate orbitofrontal cortex. Nature, 398:704-708.
    • (1999) Nature , vol.398 , pp. 704-708
    • Tremblay, L.1    Schultz, W.2
  • 51
    • 0034036369
    • Reward-related neuronal activity during go-nogo task performance in primate orbitofrontal cortex
    • Tremblay, L. and W. Schultz. 2000. Reward-related neuronal activity during go-nogo task performance in primate orbitofrontal cortex. Journal of Neurophysiology, 83:1864-1876.
    • (2000) Journal of Neurophysiology , vol.83 , pp. 1864-1876
    • Tremblay, L.1    Schultz, W.2
  • 52
    • 35248870803
    • When robots weep: Emotional memories and decision-making
    • Madison, WI: MIT Press
    • Velásquez, J. D. 1998. When robots weep: Emotional memories and decision-making. AAAI'98. Madison, WI: MIT Press.
    • (1998) AAAI'98
    • Velásquez, J.D.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.