SCOPUS Information Retrieval Platform

Cybernetics and Systems

Volume 36, Issue 1, 2005, Pages 1-44

A biologically inspired hierarchical reinforcement learning system

Authors (2): Zhou, Weidong (a); Coggins, Richard (a)

(a) University of Sydney (Australia)

Author keywords

[No Author keywords available]

Indexed keywords

FUNCTIONS; HIERARCHICAL SYSTEMS; LEARNING ALGORITHMS; NEUROLOGY; PROBLEM SOLVING; PUBLIC POLICY;

ARTIFICIAL EMOTION INDICATION (AEI); HIERARCHICAL REINFORCEMENT LEARNING (HRL); LEARNING PROCESSES; SUBTASKS;

LEARNING SYSTEMS;

EID: 13844298400 PISSN: 01969722 EISSN: None Source Type: Journal
DOI: 10.1080/01969720590887270 Document Type: Article

Times cited : (3)

References (52)

1
- 13844265451
- Emotional learning: A computational model of the amygdala
- Balkenius, C. and J. Moren. 2001. Emotional learning: A computational model of the amygdala. Cybernetics and Systems, 32:611-636.
- (2001) Cybernetics and Systems , vol.32 , pp. 611-636
- Balkenius, C.¹ Moren, J.²

2
- 0000541213
- Adaptive critics and the basal ganglia
- edited by J. C. Houk, J. L. Davis, and D. G. Beiser. Cambridge, MA: MIT Press
- Barto, A. G. 1995. Adaptive critics and the basal ganglia. In Models of information processing in the basal ganglia, pp. 215-232, edited by J. C. Houk, J. L. Davis, and D. G. Beiser. Cambridge, MA: MIT Press.
- (1995) In Models of Information Processing in the Basal Ganglia , pp. 215-232
- Barto, A.G.¹

3
- 0034214397
- Control of response selection by reinforcer value requires interaction of amygdala and orbital prefrontal cortex
- Baxter, M. G., A. Parker, C. C. Lindner, A. D. Izquierdo, and E. A. Murray. 2000. Control of response selection by reinforcer value requires interaction of amygdala and orbital prefrontal cortex. Journal of Neuroscience, 20:4311-4319.
- (2000) Journal of Neuroscience , vol.20 , pp. 4311-4319
- Baxter, M.G.¹ Parker, A.² Lindner, C.C.³ Izquierdo, A.D.⁴ Murray, E.A.⁵

4
- 0034059124
- Emotion, decision making and the orbitofrontal cortex
- Bechara, A., H. Damasio, and A. R. Damasio. 2000. Emotion, decision making and the orbitofrontal cortex. Cerebral Cortex, 10:295-307.
- (2000) Cerebral Cortex , vol.10 , pp. 295-307
- Bechara, A.¹ Damasio, H.² Damasio, A.R.³

5
- 0344063724
- Self-learning agents: A connectionist theory of emotion based on crossbar value judgment
- Bozinovski, S. 2001. Self-learning agents: A connectionist theory of emotion based on crossbar value judgment. Cybernetics and Systems, 32:637-669.
- (2001) Cybernetics and Systems , vol.32 , pp. 637-669
- Bozinovski, S.¹

6
- 0000409272
- Advances in Neural Information Processing Systems. Cambridge, MA: MIT Press
- Bradtke, S. J. and M. O. Duff. 1995. Reinforcement learning methods for continuous-time Markov decision problems. Advances in Neural Information Processing Systems. Cambridge, MA: MIT Press.
- (1995) Reinforcement Learning Methods for Continuous-time Markov Decision Problems
- Bradtke, S.J.¹ Duff, M.O.²

7
- 0034103541
- The anatomical connections of the macaque monkey orbitofrontal cortex, a review
- Cavada, C., T. Compañy, J. Tejedor, R. J. Cruz-Rizzolo, and F. Reinoso-Suárez. 2000. The anatomical connections of the macaque monkey orbitofrontal cortex, A review. Cerebral Cortex, 10:220-242.
- (2000) Cerebral Cortex , vol.10 , pp. 220-242
- Cavada, C.¹ Compañy, T.² Tejedor, J.³ Cruz-Rizzolo, R.J.⁴ Reinoso-Suárez, F.⁵

8
- 0003900941
- London: Vintage
- Damasio, A. 1999. The feeling of what happens: Body & emotion in the making of consciousness. London: Vintage.
- (1999) The Feeling of What Happens: Body & Emotion in the Making of Consciousness
- Damasio, A.¹

9
- 0001234682
- Advances in Neural Information Processing Systems. San Mateo, CA: Morgan Kaufmann
- Dayan, P. and G. Hinton. 1993. Feudal reinforcement learning. Advances in Neural Information Processing Systems. San Mateo, CA: Morgan Kaufmann.
- (1993) Feudal Reinforcement Learning
- Dayan, P.¹ Hinton, G.²

10
- 0003793529
- New York: Cambridge University Press
- Dickinson, A. 1980. Contemporary animal learning theory. New York: Cambridge University Press.
- (1980) Contemporary Animal Learning Theory
- Dickinson, A.¹

11
- 0002278788
- Hierarchical reinforcement learning with the MAXQ value function decomposition
- Dietterich, T. G. 2000a. Hierarchical reinforcement learning with the MAXQ value function decomposition. Journal of Artificial Intelligence Research, 13:227-303.
- (2000) Journal of Artificial Intelligence Research , vol.13 , pp. 227-303
- Dietterich, T.G.¹

12
- 0003506152
- Advances in Neural Information Processing Systems. Cambridge, MA: MIT Press
- Dietterich, T. G. 2000b. State abstraction in MAXQ hierarchical reinforcement learning. Advances in Neural Information Processing Systems. Cambridge, MA: MIT Press.
- (2000) State Abstraction in MAXQ Hierarchical Reinforcement Learning
- Dietterich, T.G.¹

13
- 0000701302
- Differential involvement of amygdala subsystems in appetitive conditioning and drug addiction
- edited by J. P. Aggleton. New York: Oxford University Press
- Everitt, B. J., R. N. Cardinal, J. Hall, J. A. Parkinson, and T. W. Robbins. 2000. Differential involvement of amygdala subsystems in appetitive conditioning and drug addiction. In The amygdala: A functional analysis, pp. 253-390, edited by J. P. Aggleton. New York: Oxford University Press.
- (2000) The Amygdala: a Functional Analysis , pp. 253-390
- Everitt, B.J.¹ Cardinal, R.N.² Hall, J.³ Parkinson, J.A.⁴ Robbins, T.W.⁵

14
- 0034030805
- The central nucleus of the amygdala projection to dopamine subpopulations in primates
- Fudge, J. L. and S. N. Haber. 2000. The central nucleus of the amygdala projection to dopamine subpopulations in primates. Neuroscience, 97(3):479-494.
- (2000) Neuroscience , vol.97 , Issue.3 , pp. 479-494
- Fudge, J.L.¹ Haber, S.N.²

15
- 85047698537
- Emotion-triggered learning in autonomous robot control
- Gadanho, S. C. and J. Hallam. 2001. Emotion-triggered learning in autonomous robot control. Cybernetics and Systems, 32(5):531-559.
- (2001) Cybernetics and Systems , vol.32 , Issue.5 , pp. 531-559
- Gadanho, S.C.¹ Hallam, J.²

16
- 0033535628
- Amygdaloid D1 dopamine receptor involvement in Pavlovian fear conditioning
- Guarraci, F. A., R. J. Frohardt, and B. S. Kapp. 1999. Amygdaloid D1 dopamine receptor involvement in Pavlovian fear conditioning. Brain Research, 827:28-40.
- (1999) Brain Research , vol.827 , pp. 28-40
- Guarraci, F.A.¹ Frohardt, R.J.² Kapp, B.S.³

17
- 0029758993
- Neurotoxic lesions of basolateral, but not central, amygdala interfere with pavlovian second-order conditioning and reinforcer devaluation effects
- Hatfield, T., J.-S. Han, M. Conley, M. Gallagher, and P. Holland. 1996. Neurotoxic lesions of basolateral, but not central, amygdala interfere with pavlovian second-order conditioning and reinforcer devaluation effects. Journal of Neuroscience, 16:5256-5265.
- (1996) Journal of Neuroscience , vol.16 , pp. 5256-5265
- Hatfield, T.¹ Han, J.-S.² Conley, M.³ Gallagher, M.⁴ Holland, P.⁵

18
- 0013465036
- Discovering hierarchy in reinforcement learning with HEXQ
- Sydney, Australia, July
- Hengst, B. 2002. Discovering hierarchy in reinforcement learning with HEXQ. Nineteenth International Conference on Machine Learning, Sydney, Australia, July, pp. 8-12.
- (2002) Nineteenth International Conference on Machine Learning , pp. 8-12
- Hengst, B.¹

19
- 0002861883
- A model of how the basal ganglia generate and use neural signals that predict reinforcement
- edited by J. C. Houk, J. L. Davis, and D. G. Beiser. Cambridge, MA: MIT Press
- Houk, J. C., J. L. Adams, and A. G. Barto. 1995. A model of how the basal ganglia generate and use neural signals that predict reinforcement. In Models of information processing in the basal ganglia, pp. 249-270, edited by J. C. Houk, J. L. Davis, and D. G. Beiser. Cambridge, MA: MIT Press.
- (1995) Models of Information Processing in the Basal Ganglia , pp. 249-270
- Houk, J.C.¹ Adams, J.L.² Barto, A.G.³

20
- 85143168613
- Hierarchical learning in stochastic domains
- Amherst, MA, USA, 27-29 June
- Kaelbling, L. P. 1993. Hierarchical learning in stochastic domains. Proceedings of the Tenth International Conference on Machine Learning, Amherst, MA, USA, 27-29 June, pp. 167-173.
- (1993) Proceedings of the Tenth International Conference on Machine Learning , pp. 167-173
- Kaelbling, L.P.¹

21
- 0030795741
- Different types of fear-conditioned behavior mediated by separate nuclei within amygdala
- Killcross, S., T. W. Robbins, and B. J. Everitt. 1997. Different types of fear-conditioned behavior mediated by separate nuclei within amygdala. Nature, 388:377-380.
- (1997) Nature , vol.388 , pp. 377-380
- Killcross, S.¹ Robbins, T.W.² Everitt, B.J.³

22
- 0026847155
- Brain mechanisms of emotion and emotional learning
- LeDoux, J. E. 1992. Brain mechanisms of emotion and emotional learning. Current Opinion in Neurobiology, 2:191-197.
- (1992) Current Opinion in Neurobiology , vol.2 , pp. 191-197
- LeDoux, J.E.¹

23
- 0034076609
- Emotion circuits in the brain
- LeDoux, J. E. 2000. Emotion circuits in the brain. Annual Review of Neuroscience, 23:155-184.
- (2000) Annual Review of Neuroscience , vol.23 , pp. 155-184
- LeDoux, J.E.¹

24
- 0000123778
- Self-improving reactive agents based on reinforcement learning, planning and teaching
- Lin, L.-J. 1992. Self-improving reactive agents based on reinforcement learning, planning and teaching. Machine Learning, 8:293-321.
- (1992) Machine Learning , vol.8 , pp. 293-321
- Lin, L.-J.¹

25
- 0002224896
- Scaling up reinforcement learning for robot control
- Amherst, MA, USA, 27-29 June
- Lin, L.-J. 1993. Scaling up reinforcement learning for robot control. Tenth International Conference on Machine Learning, Amherst, MA, USA, 27-29 June, pp. 182-189.
- (1993) Tenth International Conference on Machine Learning , pp. 182-189
- Lin, L.-J.¹

26
- 0343632381
- Excitotoxic lesions of the amygdala fail to produce impairment in visual learning for auditory secondary reinforcement but interfere with reinforcer devaluation effects in rhesus monkeys
- Malkova, L., D. Gaffan, and E. A. Murray. 1997. Excitotoxic lesions of the amygdala fail to produce impairment in visual learning for auditory secondary reinforcement but interfere with reinforcer devaluation effects in rhesus monkeys. Journal of Neuroscience, 17:6011-6020.
- (1997) Journal of Neuroscience , vol.17 , pp. 6011-6020
- Malkova, L.¹ Gaffan, D.² Murray, E.A.³

27
- 14344264466
- Q-Cut - Dynamic discovery of sub-goals in reinforcement learning
- Helsinki, Finland. Berlin, Heidelberg: Springer-Verlag
- Menache, I., S. Mannor, and N. Shimkin. 2002. Q-Cut - Dynamic discovery of sub-goals in reinforcement learning. 13th European Conference on Machine Learning, Helsinki, Finland. Berlin, Heidelberg: Springer-Verlag.
- (2002) 13th European Conference on Machine Learning
- Menache, I.¹ Mannor, S.² Shimkin, N.³

28
- 0013500961
- Ph.D. dissertation, Princeton University, Department of Mathematics
- Minsky, M. L. 1954. Theory of neural-analog reinforcement systems and its application to the brain-model problem. Ph.D. dissertation, Princeton University, Department of Mathematics.
- (1954) Theory of Neural-analog Reinforcement Systems and Its Application to the Brain-model Problem
- Minsky, M.L.¹

29
- 0035152958
- Abstract reward and punishment representations in the human orbitofrontal cortex
- O'Doherty, J. K., M. L. Kringelbach, E. T. Rolls, J. Hornak, and C. Andrews. 2001. Abstract reward and punishment representations in the human orbitofrontal cortex. Nature Neuroscience, 4(1):95-102.
- (2001) Nature Neuroscience , vol.4 , Issue.1 , pp. 95-102
- O'Doherty, J.K.¹ Kringelbach, M.L.² Rolls, E.T.³ Hornak, J.⁴ Andrews, C.⁵

30
- 0034175654
- Neuronal correlates of fear in the lateral amygdala: Multiple extracellular recordings in conscious cats
- Paré, D. and D. R. Collins. 2000. Neuronal correlates of fear in the lateral amygdala: Multiple extracellular recordings in conscious cats. Journal of Neuroscience, 20:2701-2710.
- (2000) Journal of Neuroscience , vol.20 , pp. 2701-2710
- Paré, D.¹ Collins, D.R.²

31
- 84898956770
- Advances in Neural Information Processing Systems. Cambridge, MA: MIT Press
- Parr, R. and S. Russell. 1998. Reinforcement learning with hierarchical machines. Advances in Neural Information Processing Systems. Cambridge, MA: MIT Press.
- (1998) Reinforcement Learning with Hierarchical Machines
- Parr, R.¹ Russell, S.²

32
- 0002055710
- Connectivity of the rat amygdaloid complex
- edited by J. P. Aggleton. New York: Oxford University Press
- Pitkanen, A. 2000. Connectivity of the rat amygdaloid complex. In The amygdala: A functional analysis, pp. 31-115, edited by J. P. Aggleton. New York: Oxford University Press.
- (2000) The Amygdala: a Functional Analysis , pp. 31-115
- Pitkanen, A.¹

33
- 0029093419
- LTP is accompanied by commensurate enhancement of auditory-evoked responses in a fear conditioning circuit
- Rogan, M. T. and J. E. LeDoux. 1995. LTP is accompanied by commensurate enhancement of auditory-evoked responses in a fear conditioning circuit. Neuron, 15:127-136.
- (1995) Neuron , vol.15 , pp. 127-136
- Rogan, M.T.¹ LeDoux, J.E.²

34
- 0033570330
- Choosing between small, likely rewards and large, unlikely rewards activates inferior and orbital prefrontal cortex
- Rogers, R. D., A. M. Owen, H. C. Middleton, E. J. Williams, J. D. Pickard, B. J. Sahakian, and T. W. Robbins. 1999. Choosing between small, likely rewards and large, unlikely rewards activates inferior and orbital prefrontal cortex. Journal of Neuroscience, 19(20):9029-9038.
- (1999) Journal of Neuroscience , vol.19 , Issue.20 , pp. 9029-9038
- Rogers, R.D.¹ Owen, A.M.² Middleton, H.C.³ Williams, E.J.⁴ Pickard, J.D.⁵ Sahakian, B.J.⁶ Robbins, T.W.⁷

35
- 0004248679
- New York: Oxford University Press
- Rolls, E. T. 1999. The brain and emotion. New York: Oxford University Press.
- (1999) The Brain and Emotion
- Rolls, E.T.¹

36
- 0034017604
- Orbitofrontal cortex and reward
- Rolls, E. T. 2000. Orbitofrontal cortex and reward. Cerebral Cortex, 10:284-294.
- (2000) Cerebral Cortex , vol.10 , pp. 284-294
- Rolls, E.T.¹

37
- 13844270304
- The monoaminergic innervation of the amygdala in squirrel monkey brain: An immunohistochemical study
- Sadikot, A. F. and A. Parent. 1990. The monoaminergic innervation of the amygdala in squirrel monkey brain: An immunohistochemical study. Experimental Brain Research, 81:443-446.
- (1990) Experimental Brain Research , vol.81 , pp. 443-446
- Sadikot, A.F.¹ Parent, A.²

38
- 0032081988
- Orbitofrontal cortex and basolateral amygdala encode expected outcome during learning
- Schoenbaum, G., A. A. Chiba, and M. Gallagher. 1998. Orbitofrontal cortex and basolateral amygdala encode expected outcome during learning. Nature Neuroscience, 1:155-159.
- (1998) Nature Neuroscience , vol.1 , pp. 155-159
- Schoenbaum, G.¹ Chiba, A.A.² Gallagher, M.³

39
- 0033103761
- Neural encoding in orbitofrontal cortex and basolateral amygdala during olfactory discrimination learning
- Schoenbaum, G., A. A. Chiba, and M. Gallagher. 1999. Neural encoding in orbitofrontal cortex and basolateral amygdala during olfactory discrimination learning. Journal of Neuroscience, 19(5):1876-1884.
- (1999) Journal of Neuroscience , vol.19 , Issue.5 , pp. 1876-1884
- Schoenbaum, G.¹ Chiba, A.A.² Gallagher, M.³

40
- 0031867046
- Predictive reward signal of dopamine neurons
- Schultz, W. 1998. Predictive reward signal of dopamine neurons. Journal of Neurophysiology, 80:1-27.
- (1998) Journal of Neurophysiology , vol.80 , pp. 1-27
- Schultz, W.¹

41
- 0030896968
- A neural substrate of prediction and reward
- Schultz, W., P. Dayan, and P. R. Montague. 1997. A neural substrate of prediction and reward. Science, 275:1593-1599.
- (1997) Science , vol.275 , pp. 1593-1599
- Schultz, W.¹ Dayan, P.² Montague, P.R.³

42
- 0034061495
- Reward processing in primate orbitofrontal cortex and basal ganglia
- Schultz, W. L., L. Tremblay, and J. R. Hollerman. 2000. Reward processing in primate orbitofrontal cortex and basal ganglia. Cerebral Cortex, 10:272-283.
- (2000) Cerebral Cortex , vol.10 , pp. 272-283
- Schultz, W.L.¹ Tremblay, L.² Hollerman, J.R.³

43
- 84899028619
- Advances in Neural Information Processing Systems. Cambridge, MA: MIT Press
- Shelton, C. R. 2001. Balancing multiple sources of reward in reinforcement learning. Advances in Neural Information Processing Systems. Cambridge, MA: MIT Press.
- (2001) Balancing Multiple Sources of Reward in Reinforcement Learning
- Shelton, C.R.¹

44
- 0014041964
- Motivational and emotional control of cognition
- Simon, H. A. 1967. Motivational and emotional control of cognition. Psychological Review, 74:29-39.
- (1967) Psychological Review , vol.74 , pp. 29-39
- Simon, H.A.¹

45
- 0001027894
- Transfer of learning by composing solutions of elemental sequential tasks
- Singh, S. P. 1992. Transfer of learning by composing solutions of elemental sequential tasks. Machine Learning, 8:323-339.
- (1992) Machine Learning , vol.8 , pp. 323-339
- Singh, S.P.¹

46
- 33847202724
- Learning to predict by the methods of temporal differences
- Sutton, R. S. 1988. Learning to predict by the methods of temporal differences. Machine Learning, 3:9-44.
- (1988) Machine Learning , vol.3 , pp. 9-44
- Sutton, R.S.¹

47
- 0019537951
- Toward a modern theory of adaptive networks: Expectation and prediction
- Sutton, R. S. and A. G. Barto. 1981. Toward a modern theory of adaptive networks: Expectation and prediction. Psychological Review, 88:135-140.
- (1981) Psychological Review , vol.88 , pp. 135-140
- Sutton, R.S.¹ Barto, A.G.²

48
- 13844262427
- Improved switching among temporally abstract actions
- Cambridge, MA: MIT Press
- Sutton, R. S., S. Singh, D. Precup, and B. Ravindran. 1998. Improved switching among temporally abstract actions. Advances in Neural Information Processing Systems. Cambridge, MA: MIT Press.
- (1998) Advances in Neural Information Processing Systems
- Sutton, R.S.¹ Singh, S.² Precup, D.³ Ravindran, B.⁴

49
- 0033170372
- Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning
- Sutton, R. S., D. Precup, and S. Singh. 1999. Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence, 112(1):181-211.
- (1999) Artificial Intelligence , vol.112 , Issue.1 , pp. 181-211
- Sutton, R.S.¹ Precup, D.² Singh, S.³

50
- 0033594310
- Relative reward preference in primate orbitofrontal cortex
- Tremblay, L. and W. Schultz. 1999. Relative reward preference in primate orbitofrontal cortex. Nature, 398:704-708.
- (1999) Nature , vol.398 , pp. 704-708
- Tremblay, L.¹ Schultz, W.²

51
- 0034036369
- Reward-related neuronal activity during go-nogo task performance in primate orbitofrontal cortex
- Tremblay, L. and W. Schultz. 2000. Reward-related neuronal activity during go-nogo task performance in primate orbitofrontal cortex. Journal of Neurophysiology, 83:1864-1876.
- (2000) Journal of Neurophysiology , vol.83 , pp. 1864-1876
- Tremblay, L.¹ Schultz, W.²

52
- 35248870803
- When robots weep: Emotional memories and decision-making
- Madison, WI: MIT Press
- Velásquez, J. D. 1998. When robots weep: Emotional memories and decision-making. AAAI'98. Madison, WI: MIT Press.
- (1998) AAAI'98
- Velásquez, J.D.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.