Volume 71, Issue 2, 2011, Pages 370-379

A Neural Signature of Hierarchical Reinforcement Learning

Author keywords

[No Author keywords available]

Indexed keywords

ADULT; ARTICLE; BEHAVIORAL SCIENCE; BRAIN NERVE CELL; CELL ACTIVITY; CONCEPTUAL FRAMEWORK; CONTROLLED STUDY; DOMINANCE BEHAVIOR; ELECTROENCEPHALOGRAM; FEMALE; FUNCTIONAL MAGNETIC RESONANCE IMAGING; HUMAN; HUMAN EXPERIMENT; LEARNING; LEARNING ALGORITHM; MACHINE LEARNING; MALE; MATHEMATICAL COMPUTING; NERVE CELL; NEUROIMAGING; PREDICTION; PRIORITY JOURNAL; REINFORCEMENT; TASK PERFORMANCE;

EID: 79960637995     PISSN: 08966273     EISSN: 10974199     Source Type: Journal    
DOI: 10.1016/j.neuron.2011.05.042     Document Type: Article
Times cited: 152

References (60)
  • 1
    • 42749096312
    • Cognitive control, hierarchy, and the rostro-caudal organization of the frontal lobes
    • Badre D. Cognitive control, hierarchy, and the rostro-caudal organization of the frontal lobes. Trends Cogn. Sci. (Regul. Ed.) 2008, 12:193-200.
    • (2008) Trends Cogn. Sci. (Regul. Ed.) , vol.12 , pp. 193-200
    • Badre, D.1
  • 2
    • 84962118342
    • Mechanisms of hierarchical reinforcement learning in cortico-striatal circuits 2: evidence from fMRI
    • in press. Published online June 21, 2011
    • Badre D., Frank M.J. Mechanisms of hierarchical reinforcement learning in cortico-striatal circuits 2: evidence from fMRI. Cereb. Cortex 2011, in press. Published online June 21, 2011.
    • (2011) Cereb. Cortex
    • Badre, D.1    Frank, M.J.2
  • 3
    • 63649111959
    • Hierarchical cognitive control deficits following damage to the human frontal lobe
    • Badre D., Hoffman J., Cooney J.W., D'Esposito M. Hierarchical cognitive control deficits following damage to the human frontal lobe. Nat. Neurosci. 2009, 12:515-522.
    • (2009) Nat. Neurosci. , vol.12 , pp. 515-522
    • Badre, D.1    Hoffman, J.2    Cooney, J.W.3    D'Esposito, M.4
  • 4
    • 79953298359
    • Dissociated roles of the anterior cingulate cortex in reward and conflict processing as revealed by the feedback error-related negativity and N200
    • Baker T.E., Holroyd C.B. Dissociated roles of the anterior cingulate cortex in reward and conflict processing as revealed by the feedback error-related negativity and N200. Biol. Psychol. 2011, 87:25-34.
    • (2011) Biol. Psychol. , vol.87 , pp. 25-34
    • Baker, T.E.1    Holroyd, C.B.2
  • 5
    • 0000541213
    • Adaptive critics and the basal ganglia
    • MIT Press, Cambridge, MA, J.C. Houk, J. Davis, D. Beiser (Eds.)
    • Barto A.G. Adaptive critics and the basal ganglia. Models of Information Processing in the Basal Ganglia 1995, 215-232. MIT Press, Cambridge, MA. J.C. Houk, J. Davis, D. Beiser (Eds.).
    • (1995) Models of Information Processing in the Basal Ganglia , pp. 215-232
    • Barto, A.G.1
  • 6
    • 0141988716
    • Recent advances in hierarchical reinforcement learning
    • Barto A.G., Mahadevan S. Recent advances in hierarchical reinforcement learning. Discrete Event Dyn. Syst. 2003, 13:341-379.
    • (2003) Discrete Event Dyn. Syst. , vol.13 , pp. 341-379
    • Barto, A.G.1    Mahadevan, S.2
  • 7
    • 43049099970
    • Hierarchical models of behavior and prefrontal function
    • Botvinick M.M. Hierarchical models of behavior and prefrontal function. Trends Cogn. Sci. (Regul. Ed.) 2008, 12:201-208.
    • (2008) Trends Cogn. Sci. (Regul. Ed.) , vol.12 , pp. 201-208
    • Botvinick, M.M.1
  • 8
    • 0033547441
    • Conflict monitoring versus selection-for-action in anterior cingulate cortex
    • Botvinick M., Nystrom L.E., Fissell K., Carter C.S., Cohen J.D. Conflict monitoring versus selection-for-action in anterior cingulate cortex. Nature 1999, 402:179-181.
    • (1999) Nature , vol.402 , pp. 179-181
    • Botvinick, M.1    Nystrom, L.E.2    Fissell, K.3    Carter, C.S.4    Cohen, J.D.5
  • 9
    • 70350566799
    • Hierarchically organized behavior and its neural foundations: a reinforcement learning perspective
    • Botvinick M.M., Niv Y., Barto A.G. Hierarchically organized behavior and its neural foundations: a reinforcement learning perspective. Cognition 2009, 113:262-280.
    • (2009) Cognition , vol.113 , pp. 262-280
    • Botvinick, M.M.1    Niv, Y.2    Barto, A.G.3
  • 10
    • 0030612822
    • The psychophysics toolbox
    • Brainard D.H. The psychophysics toolbox. Spat. Vis. 1997, 10:433-436.
    • (1997) Spat. Vis. , vol.10 , pp. 433-436
    • Brainard, D.H.1
  • 11
    • 0034988599
    • Functional imaging of neural responses to expectancy and experience of monetary gains and losses
    • Breiter H.C., Aharon I., Kahneman D., Dale A., Shizgal P. Functional imaging of neural responses to expectancy and experience of monetary gains and losses. Neuron 2001, 30:619-639.
    • (2001) Neuron , vol.30 , pp. 619-639
    • Breiter, H.C.1    Aharon, I.2    Kahneman, D.3    Dale, A.4    Shizgal, P.5
  • 12
    • 67949115526
    • Prefrontal organization of cognitive control according to levels of abstraction
    • Christoff K., Keramatian K., Gordon A.M., Smith R., Mädler B. Prefrontal organization of cognitive control according to levels of abstraction. Brain Res. 2009, 1286:94-105.
    • (2009) Brain Res. , vol.1286 , pp. 94-105
    • Christoff, K.1    Keramatian, K.2    Gordon, A.M.3    Smith, R.4    Mädler, B.5
  • 13
    • 0022644104
    • Stimulation of the lateral habenula inhibits dopamine-containing neurons in the substantia nigra and ventral tegmental area of the rat
    • Christoph G.R., Leonzio R.J., Wilcox K.S. Stimulation of the lateral habenula inhibits dopamine-containing neurons in the substantia nigra and ventral tegmental area of the rat. J. Neurosci. 1986, 6:613-619.
    • (1986) J. Neurosci. , vol.6 , pp. 613-619
    • Christoph, G.R.1    Leonzio, R.J.2    Wilcox, K.S.3
  • 14
    • 0034075310
    • Contention scheduling and the control of routine activities
    • Cooper R., Shallice T. Contention scheduling and the control of routine activities. Cogn. Neuropsychol. 2000, 17:297-338.
    • (2000) Cogn. Neuropsychol. , vol.17 , pp. 297-338
    • Cooper, R.1    Shallice, T.2
  • 15
    • 36048965534
    • Valence and salience contribute to nucleus accumbens activation
    • Cooper J.C., Knutson B. Valence and salience contribute to nucleus accumbens activation. Neuroimage 2008, 39:538-547.
    • (2008) Neuroimage , vol.39 , pp. 538-547
    • Cooper, J.C.1    Knutson, B.2
  • 16
    • 42949117916
    • The reorienting system of the human brain: from environment to theory of mind
    • Corbetta M., Patel G., Shulman G.L. The reorienting system of the human brain: from environment to theory of mind. Neuron 2008, 58:306-324.
    • (2008) Neuron , vol.58 , pp. 306-324
    • Corbetta, M.1    Patel, G.2    Shulman, G.L.3
  • 17
    • 0030175198
    • AFNI: software for analysis and visualization of functional magnetic resonance neuroimages
    • Cox R.W. AFNI: software for analysis and visualization of functional magnetic resonance neuroimages. Comput. Biomed. Res. 1996, 29:162-173.
    • (1996) Comput. Biomed. Res. , vol.29 , pp. 162-173
    • Cox, R.W.1
  • 18
    • 70350566659
    • Reinforcement learning and higher level cognition: introduction to special issue
    • Daw N.D., Frank M.J. Reinforcement learning and higher level cognition: introduction to special issue. Cognition 2009, 113:259-261.
    • (2009) Cognition , vol.113 , pp. 259-261
    • Daw, N.D.1    Frank, M.J.2
  • 19
    • 52049107354
    • Reinforcement learning: the good, the bad and the ugly
    • Dayan P., Niv Y. Reinforcement learning: the good, the bad and the ugly. Curr. Opin. Neurobiol. 2008, 18:185-196.
    • (2008) Curr. Opin. Neurobiol. , vol.18 , pp. 185-196
    • Dayan, P.1    Niv, Y.2
  • 21
    • 13844322032
    • The human prefrontal cortex has evolved to represent components of structured event complexes
    • Elsevier, Amsterdam, J. Grafman (Ed.)
    • Grafman J. The human prefrontal cortex has evolved to represent components of structured event complexes. Handbook of Neuropsychology 2002, Elsevier, Amsterdam. J. Grafman (Ed.).
    • (2002) Handbook of Neuropsychology
    • Grafman, J.1
  • 22
    • 33846374385
    • Timing and sequence of brain activity in top-down control of visual-spatial attention
    • Grent-'t-Jong T., Woldorff M.G. Timing and sequence of brain activity in top-down control of visual-spatial attention. PLoS Biol. 2007, 5:e12.
    • (2007) PLoS Biol. , vol.5
    • Grent-'t-Jong, T.1    Woldorff, M.G.2
  • 23
    • 45949091429
    • Dissociating the role of the orbitofrontal cortex and the striatum in the computation of goal values and prediction errors
    • Hare T.A., O'Doherty J., Camerer C.F., Schultz W., Rangel A. Dissociating the role of the orbitofrontal cortex and the striatum in the computation of goal values and prediction errors. J. Neurosci. 2008, 28:5623-5630.
    • (2008) J. Neurosci. , vol.28 , pp. 5623-5630
    • Hare, T.A.1    O'Doherty, J.2    Camerer, C.F.3    Schultz, W.4    Rangel, A.5
  • 24
    • 33749080272
    • Heterarchical reinforcement-learning model for integration of multiple cortico-striatal loops: fMRI examination in stimulus-action-reward association learning
    • Haruno M., Kawato M. Heterarchical reinforcement-learning model for integration of multiple cortico-striatal loops: fMRI examination in stimulus-action-reward association learning. Neural Netw. 2006, 19:1242-1254.
    • (2006) Neural Netw. , vol.19 , pp. 1242-1254
    • Haruno, M.1    Kawato, M.2
  • 26
    • 85047670409
    • The neural basis of human error processing: reinforcement learning, dopamine, and the error-related negativity
    • Holroyd C.B., Coles M.G.H. The neural basis of human error processing: reinforcement learning, dopamine, and the error-related negativity. Psychol. Rev. 2002, 109:679-709.
    • (2002) Psychol. Rev. , vol.109 , pp. 679-709
    • Holroyd, C.B.1    Coles, M.G.H.2
  • 27
    • 85026140894
    • Errors in reward prediction are reflected in the event-related brain potential
    • Holroyd C.B., Nieuwenhuis S., Yeung N., Cohen J.D. Errors in reward prediction are reflected in the event-related brain potential. Neuroreport 2003, 14:2481-2484.
    • (2003) Neuroreport , vol.14 , pp. 2481-2484
    • Holroyd, C.B.1    Nieuwenhuis, S.2    Yeung, N.3    Cohen, J.D.4
  • 29
    • 54049091454
    • Neuropharmacology of performance monitoring
    • Jocham G., Ullsperger M. Neuropharmacology of performance monitoring. Neurosci. Biobehav. Rev. 2009, 33:48-60.
    • (2009) Neurosci. Biobehav. Rev. , vol.33 , pp. 48-60
    • Jocham, G.1    Ullsperger, M.2
  • 31
    • 67649813444
    • Motivation and cognitive control in the human prefrontal cortex
    • Kouneiher F., Charron S., Koechlin E. Motivation and cognitive control in the human prefrontal cortex. Nat. Neurosci. 2009, 12:939-945.
    • (2009) Nat. Neurosci. , vol.12 , pp. 939-945
    • Kouneiher, F.1    Charron, S.2    Koechlin, E.3
  • 32
    • 29744453937
    • Evidence for hierarchical error processing in the human brain
    • Krigolson O.E., Holroyd C.B. Evidence for hierarchical error processing in the human brain. Neuroscience 2006, 137:13-17.
    • (2006) Neuroscience , vol.137 , pp. 13-17
    • Krigolson, O.E.1    Holroyd, C.B.2
  • 33
    • 0001990073
    • The problem of serial order in behavior
    • Wiley, New York, L.A. Jeffress (Ed.)
    • Lashley K.S. The problem of serial order in behavior. Cerebral Mechanisms in Behavior: The Hixon Symposium 1951, 112-136. Wiley, New York. L.A. Jeffress (Ed.).
    • (1951) Cerebral Mechanisms in Behavior: The Hixon Symposium , pp. 112-136
    • Lashley, K.S.1
  • 34
    • 34347343926
    • Lateral habenula as a source of negative reward signals in dopamine neurons
    • Matsumoto M., Hikosaka O. Lateral habenula as a source of negative reward signals in dopamine neurons. Nature 2007, 447:1111-1115.
    • (2007) Nature , vol.447 , pp. 1111-1115
    • Matsumoto, M.1    Hikosaka, O.2
  • 35
    • 0031436055
    • Event-related brain potentials following incorrect feedback in a time-estimation task: evidence for a "generic" neural system for error detection
    • Miltner W.H.R., Braun C.H., Coles M.G.H. Event-related brain potentials following incorrect feedback in a time-estimation task: evidence for a "generic" neural system for error detection. J. Cogn. Neurosci. 1997, 9:788-798.
    • (1997) J. Cogn. Neurosci. , vol.9 , pp. 788-798
    • Miltner, W.H.R.1    Braun, C.H.2    Coles, M.G.H.3
  • 36
    • 0029981543
    • A framework for mesencephalic dopamine systems based on predictive Hebbian learning
    • Montague P.R., Dayan P., Sejnowski T.J. A framework for mesencephalic dopamine systems based on predictive Hebbian learning. J. Neurosci. 1996, 16:1936-1947.
    • (1996) J. Neurosci. , vol.16 , pp. 1936-1947
    • Montague, P.R.1    Dayan, P.2    Sejnowski, T.J.3
  • 37
    • 17544368654
    • Dopaminergic modulation of neuronal excitability in the striatum and nucleus accumbens
    • Nicola S.M., Surmeier J., Malenka R.C. Dopaminergic modulation of neuronal excitability in the striatum and nucleus accumbens. Annu. Rev. Neurosci. 2000, 23:185-215.
    • (2000) Annu. Rev. Neurosci. , vol.23 , pp. 185-215
    • Nicola, S.M.1    Surmeier, J.2    Malenka, R.C.3
  • 38
    • 21444453526
    • Knowing good from bad: differential activation of human cortical areas by positive and negative outcomes
    • Nieuwenhuis S., Slagter H.A., von Geusau N.J.A., Heslenfeld D.J., Holroyd C.B. Knowing good from bad: differential activation of human cortical areas by positive and negative outcomes. Eur. J. Neurosci. 2005, 21:3161-3168.
    • (2005) Eur. J. Neurosci. , vol.21 , pp. 3161-3168
    • Nieuwenhuis, S.1    Slagter, H.A.2    von Geusau, N.J.A.3    Heslenfeld, D.J.4    Holroyd, C.B.5
  • 39
    • 67349283062
    • Reinforcement learning in the brain
    • Niv Y. Reinforcement learning in the brain. J. Math. Psychol. 2009, 53:139-154.
    • (2009) J. Math. Psychol. , vol.53 , pp. 139-154
    • Niv, Y.1
  • 40
    • 34447643062
    • Model-based fMRI and its application to reward learning and decision making
    • O'Doherty J.P., Hampton A., Kim H. Model-based fMRI and its application to reward learning and decision making. Ann. N Y Acad. Sci. 2007, 1104:35-53.
    • (2007) Ann. N Y Acad. Sci. , vol.1104 , pp. 35-53
    • O'Doherty, J.P.1    Hampton, A.2    Kim, H.3
  • 41
    • 0037987978
    • Temporal difference models and reward-related learning in the human brain
    • O'Doherty J.P., Dayan P., Friston K.J., Critchley H.D., Dolan R.J. Temporal difference models and reward-related learning in the human brain. Neuron 2003, 38:329-337.
    • (2003) Neuron , vol.38 , pp. 329-337
    • O'Doherty, J.P.1    Dayan, P.2    Friston, K.J.3    Critchley, H.D.4    Dolan, R.J.5
  • 42
    • 29544443602
    • Predictive neural coding of reward preference involves dissociable responses in human ventral midbrain and ventral striatum
    • O'Doherty J.P., Buchanan T.W., Seymour B., Dolan R.J. Predictive neural coding of reward preference involves dissociable responses in human ventral midbrain and ventral striatum. Neuron 2006, 49:157-166.
    • (2006) Neuron , vol.49 , pp. 157-166
    • O'Doherty, J.P.1    Buchanan, T.W.2    Seymour, B.3    Dolan, R.J.4
  • 43
    • 84898956770
    • Reinforcement learning with hierarchies of machines
    • Parr R., Russell S. Reinforcement learning with hierarchies of machines. Adv. Neural Inf. Process Sys. 1998, 10:1043-1049.
    • (1998) Adv. Neural Inf. Process Sys. , vol.10 , pp. 1043-1049
    • Parr, R.1    Russell, S.2
  • 45
    • 31744432037
    • Can cognitive processes be inferred from neuroimaging data?
    • Poldrack R.A. Can cognitive processes be inferred from neuroimaging data? Trends Cogn. Sci. (Regul. Ed.) 2006, 10:59-63.
    • (2006) Trends Cogn. Sci. (Regul. Ed.) , vol.10 , pp. 59-63
    • Poldrack, R.A.1
  • 46
    • 70350574601
    • Developing PFC representations using reinforcement learning
    • Reynolds J.R., O'Reilly R.C. Developing PFC representations using reinforcement learning. Cognition 2009, 113:281-292.
    • (2009) Cognition , vol.113 , pp. 281-292
    • Reynolds, J.R.1    O'Reilly, R.C.2
  • 48
    • 0030896968
    • A neural substrate of prediction and reward
    • Schultz W., Dayan P., Montague P.R. A neural substrate of prediction and reward. Science 1997, 275:1593-1599.
    • (1997) Science , vol.275 , pp. 1593-1599
    • Schultz, W.1    Dayan, P.2    Montague, P.R.3
  • 50
    • 34247869023
    • Differential encoding of losses and gains in the human striatum
    • Seymour B., Daw N., Dayan P., Singer T., Dolan R. Differential encoding of losses and gains in the human striatum. J. Neurosci. 2007, 27:4826-4831.
    • (2007) J. Neurosci. , vol.27 , pp. 4826-4831
    • Seymour, B.1    Daw, N.2    Dayan, P.3    Singer, T.4    Dolan, R.5
  • 53
    • 0033170372
    • Between MDPs and semi-MDPs: a framework for temporal abstraction in reinforcement learning
    • Sutton R.S., Precup D., Singh S. Between MDPs and semi-MDPs: a framework for temporal abstraction in reinforcement learning. Artif. Intell. 1999, 112:181-211.
    • (1999) Artif. Intell. , vol.112 , pp. 181-211
    • Sutton, R.S.1    Precup, D.2    Singh, S.3
  • 55
    • 0038718773
    • Error monitoring using external feedback: specific roles of the habenular complex, the reward system, and the cingulate motor area revealed by functional magnetic resonance imaging
    • Ullsperger M., von Cramon D.Y. Error monitoring using external feedback: specific roles of the habenular complex, the reward system, and the cingulate motor area revealed by functional magnetic resonance imaging. J. Neurosci. 2003, 23:4308-4314.
    • (2003) J. Neurosci. , vol.23 , pp. 4308-4314
    • Ullsperger, M.1    von Cramon, D.Y.2
  • 56
    • 7044253460
    • Errors without conflict: implications for performance monitoring theories of anterior cingulate cortex
    • van Veen V., Holroyd C.B., Cohen J.D., Stenger V.A., Carter C.S. Errors without conflict: implications for performance monitoring theories of anterior cingulate cortex. Brain Cogn. 2004, 56:267-276.
    • (2004) Brain Cogn. , vol.56 , pp. 267-276
    • van Veen, V.1    Holroyd, C.B.2    Cohen, J.D.3    Stenger, V.A.4    Carter, C.S.5
  • 57
    • 33748709841
    • Dissociable systems for gain- and loss-related value predictions and errors of prediction in the human brain
    • Yacubian J., Gläscher J., Schroeder K., Sommer T., Braus D.F., Büchel C. Dissociable systems for gain- and loss-related value predictions and errors of prediction in the human brain. J. Neurosci. 2006, 26:9530-9537.
    • (2006) J. Neurosci. , vol.26 , pp. 9530-9537
    • Yacubian, J.1    Gläscher, J.2    Schroeder, K.3    Sommer, T.4    Braus, D.F.5    Büchel, C.6
  • 59
    • 3042570744
    • The neural basis of error detection: conflict monitoring and the error-related negativity
    • Yeung N., Botvinick M.M., Cohen J.D. The neural basis of error detection: conflict monitoring and the error-related negativity. Psychol. Rev. 2004, 111:931-959.
    • (2004) Psychol. Rev. , vol.111 , pp. 931-959
    • Yeung, N.1    Botvinick, M.M.2    Cohen, J.D.3
  • 60
    • 17744363898
    • ERP correlates of feedback and reward processing in the presence and absence of response choice
    • Yeung N., Holroyd C.B., Cohen J.D. ERP correlates of feedback and reward processing in the presence and absence of response choice. Cereb. Cortex 2005, 15:535-544.
    • (2005) Cereb. Cortex , vol.15 , pp. 535-544
    • Yeung, N.1    Holroyd, C.B.2    Cohen, J.D.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.