Volume 1, 2015, Pages 94-100

The structure of reinforcement-learning mechanisms in the human brain

Author keywords

[No Author keywords available]

Indexed keywords

BRAIN FUNCTION; CONCEPTUAL FRAMEWORK; DOPAMINERGIC NERVE CELL; HIPPOCAMPUS; HUMAN; LEARNING ALGORITHM; LEARNING THEORY; NEUROIMAGING; POSTERIOR PARIETAL CORTEX; PREDICTION; PREFRONTAL CORTEX; PROBABILITY; REINFORCEMENT LEARNING THEORY; REVIEW

EID: 84920069670     PISSN: None     EISSN: 23521546     Source Type: Journal    
DOI: 10.1016/j.cobeha.2014.10.004     Document Type: Review
Times cited: 63

References (70)
  • 2
    • 0031867046
    • Predictive reward signal of dopamine neurons
    • Schultz W. Predictive reward signal of dopamine neurons. J Neurophysiol 1998, 80:1-27.
    • (1998) J Neurophysiol , vol.80 , pp. 1-27
    • Schultz, W.1
  • 3
    • 0029981543
    • A framework for mesencephalic dopamine systems based on predictive Hebbian learning
    • Montague P.R., Dayan P., Sejnowski T.J. A framework for mesencephalic dopamine systems based on predictive Hebbian learning. J Neurosci 1996, 16:1936-1947.
    • (1996) J Neurosci , vol.16 , pp. 1936-1947
    • Montague, P.R.1    Dayan, P.2    Sejnowski, T.J.3
  • 4
    • 0030896968
    • A neural substrate of prediction and reward
    • Schultz W., Dayan P., Montague P.R. A neural substrate of prediction and reward. Science 1997, 275:1593-1599.
    • (1997) Science , vol.275 , pp. 1593-1599
    • Schultz, W.1    Dayan, P.2    Montague, P.R.3
  • 5
    • 36348966690
    • Reinforcement learning signals in the human striatum distinguish learners from nonlearners during reward-based decision making
    • Schonberg T., Daw N.D., Joel D., O'Doherty J.P. Reinforcement learning signals in the human striatum distinguish learners from nonlearners during reward-based decision making. J Neurosci 2007, 27:12860-12867.
    • (2007) J Neurosci , vol.27 , pp. 12860-12867
    • Schonberg, T.1    Daw, N.D.2    Joel, D.3    O'Doherty, J.P.4
  • 6
    • 0037650217
    • Temporal prediction errors in a passive learning task activate human striatum
    • McClure S.M., Berns G.S., Montague P.R. Temporal prediction errors in a passive learning task activate human striatum. Neuron 2003, 38:339-346.
    • (2003) Neuron , vol.38 , pp. 339-346
    • McClure, S.M.1    Berns, G.S.2    Montague, P.R.3
  • 7
    • 84855688852
    • Neural prediction errors reveal a risk-sensitive reinforcement-learning process in the human brain
    • Niv Y., Edlund J.A., Dayan P., O'Doherty J.P. Neural prediction errors reveal a risk-sensitive reinforcement-learning process in the human brain. J Neurosci 2012, 32:551-562.
    • (2012) J Neurosci , vol.32 , pp. 551-562
    • Niv, Y.1    Edlund, J.A.2    Dayan, P.3    O'Doherty, J.P.4
  • 8
    • 40049086223
    • BOLD responses reflecting dopaminergic signals in the human ventral tegmental area
    • D'Ardenne K., McClure S.M., Nystrom L.E., Cohen J.D. BOLD responses reflecting dopaminergic signals in the human ventral tegmental area. Science 2008, 319:1264-1267.
    • (2008) Science , vol.319 , pp. 1264-1267
    • D'Ardenne, K.1    McClure, S.M.2    Nystrom, L.E.3    Cohen, J.D.4
  • 9
    • 33748188120
    • The role of the ventromedial prefrontal cortex in abstract state-based inference during decision making in humans
    • Hampton A.N., Bossaerts P., O'Doherty J.P. The role of the ventromedial prefrontal cortex in abstract state-based inference during decision making in humans. J Neurosci 2006, 26:8360-8367.
    • (2006) J Neurosci , vol.26 , pp. 8360-8367
    • Hampton, A.N.1    Bossaerts, P.2    O'Doherty, J.P.3
  • 10
    • 58449113882
    • Determining a role for ventromedial prefrontal cortex in encoding action-based value signals during reward-related decision making
    • Glascher J., Hampton A.N., O'Doherty J.P. Determining a role for ventromedial prefrontal cortex in encoding action-based value signals during reward-related decision making. Cereb Cortex 2009, 19:483-495.
    • (2009) Cereb Cortex , vol.19 , pp. 483-495
    • Glascher, J.1    Hampton, A.N.2    O'Doherty, J.P.3
  • 11
    • 66449094746
    • How green is the grass on the other side? Frontopolar cortex and the evidence in favor of alternative courses of action
    • Boorman E.D., Behrens T.E., Woolrich M.W., Rushworth M.F. How green is the grass on the other side? Frontopolar cortex and the evidence in favor of alternative courses of action. Neuron 2009, 62:733-743.
    • (2009) Neuron , vol.62 , pp. 733-743
    • Boorman, E.D.1    Behrens, T.E.2    Woolrich, M.W.3    Rushworth, M.F.4
  • 14
    • 28044450875
    • Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control
    • Daw N.D., Niv Y., Dayan P. Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nat Neurosci 2005, 8:1704-1711.
    • (2005) Nat Neurosci , vol.8 , pp. 1704-1711
    • Daw, N.D.1    Niv, Y.2    Dayan, P.3
  • 15
    • 0002692217
    • Actions and habits: the development of a behavioural autonomy
    • Dickinson A. Actions and habits: the development of a behavioural autonomy. Philos Trans R Soc Lond B Biol Sci 1985, 308:67-78.
    • (1985) Philos Trans R Soc Lond B Biol Sci , vol.308 , pp. 67-78
    • Dickinson, A.1
  • 16
    • 0031801210
    • Goal-directed instrumental action: contingency and incentive learning and their cortical substrates
    • Balleine B.W., Dickinson A. Goal-directed instrumental action: contingency and incentive learning and their cortical substrates. Neuropharmacology 1998, 37:407-419.
    • (1998) Neuropharmacology , vol.37 , pp. 407-419
    • Balleine, B.W.1    Dickinson, A.2
  • 17
    • 80052143971
    • Separate encoding of model-based and model-free valuations in the human brain
    • Beierholm U.R., Anen C., Quartz S., Bossaerts P. Separate encoding of model-based and model-free valuations in the human brain. Neuroimage 2011, 58:955-962.
    • (2011) Neuroimage , vol.58 , pp. 955-962
    • Beierholm, U.R.1    Anen, C.2    Quartz, S.3    Bossaerts, P.4
  • 18
    • 84893508813
    • Neural computations underlying arbitration between model-based and model-free learning
    • Lee S.W., Shimojo S., O'Doherty J.P. Neural computations underlying arbitration between model-based and model-free learning. Neuron 2014, 81:687-699.
    • (2014) Neuron , vol.81 , pp. 687-699
    • Lee, S.W.1    Shimojo, S.2    O'Doherty, J.P.3
  • 19
    • 77953260848
    • States versus rewards: dissociable neural prediction error signals underlying model-based and model-free reinforcement learning
    • Glascher J., Daw N., Dayan P., O'Doherty J.P. States versus rewards: dissociable neural prediction error signals underlying model-based and model-free reinforcement learning. Neuron 2010, 66:585-595.
    • (2010) Neuron , vol.66 , pp. 585-595
    • Glascher, J.1    Daw, N.2    Dayan, P.3    O'Doherty, J.P.4
  • 20
    • 84880601439
    • Neural correlates of the divergence of instrumental probability distributions
    • Liljeholm M., Wang S., Zhang J., O'Doherty J.P. Neural correlates of the divergence of instrumental probability distributions. J Neurosci 2013, 33:12519-12527.
    • (2013) J Neurosci , vol.33 , pp. 12519-12527
    • Liljeholm, M.1    Wang, S.2    Zhang, J.3    O'Doherty, J.P.4
  • 21
    • 79955709936
    • Neural correlates of forward planning in a spatial decision task in humans
    • Simon D.A., Daw N.D. Neural correlates of forward planning in a spatial decision task in humans. J Neurosci 2011, 31:5526-5539.
    • (2011) J Neurosci , vol.31 , pp. 5526-5539
    • Simon, D.A.1    Daw, N.D.2
  • 22
    • 84860307045
    • Mapping value based planning and extensively trained choice in the human brain
    • Wunderlich K., Dayan P., Dolan R.J. Mapping value based planning and extensively trained choice in the human brain. Nat Neurosci 2012, 15:786-791.
    • (2012) Nat Neurosci , vol.15 , pp. 786-791
    • Wunderlich, K.1    Dayan, P.2    Dolan, R.J.3
  • 23
    • 1642580578
    • Lesions of dorsolateral striatum preserve outcome expectancy but disrupt habit formation in instrumental learning
    • Yin H.H., Knowlton B.J., Balleine B.W. Lesions of dorsolateral striatum preserve outcome expectancy but disrupt habit formation in instrumental learning. Eur J Neurosci 2004, 19:181-189.
    • (2004) Eur J Neurosci , vol.19 , pp. 181-189
    • Yin, H.H.1    Knowlton, B.J.2    Balleine, B.W.3
  • 24
    • 66449119919
    • A specific role for posterior dorsolateral striatum in human habit learning
    • Tricomi E., Balleine B.W., O'Doherty J.P. A specific role for posterior dorsolateral striatum in human habit learning. Eur J Neurosci 2009, 29:2225-2232.
    • (2009) Eur J Neurosci , vol.29 , pp. 2225-2232
    • Tricomi, E.1    Balleine, B.W.2    O'Doherty, J.P.3
  • 25
    • 84872761547
    • The ubiquity of model-based reinforcement learning
    • Doll B.B., Simon D.A., Daw N.D. The ubiquity of model-based reinforcement learning. Curr Opin Neurobiol 2012, 22:1075-1081.
    • (2012) Curr Opin Neurobiol , vol.22 , pp. 1075-1081
    • Doll, B.B.1    Simon, D.A.2    Daw, N.D.3
  • 26
    • 79952746011
    • Model-based influences on humans' choices and striatal prediction errors
    • Daw N.D., Gershman S.J., Seymour B., Dayan P., Dolan R.J. Model-based influences on humans' choices and striatal prediction errors. Neuron 2011, 69:1204-1215.
    • (2011) Neuron , vol.69 , pp. 1204-1215
    • Daw, N.D.1    Gershman, S.J.2    Seymour, B.3    Dayan, P.4    Dolan, R.J.5
  • 27
    • 77952541839
    • Learning latent structure: carving nature at its joints
    • Gershman S.J., Niv Y. Learning latent structure: carving nature at its joints. Curr Opin Neurobiol 2010, 20:251-256.
    • (2010) Curr Opin Neurobiol , vol.20 , pp. 251-256
    • Gershman, S.J.1    Niv, Y.2
  • 30
    • 84889690418
    • Vision: are models of object recognition catching up with the brain?
    • Poggio T., Ullman S. Vision: are models of object recognition catching up with the brain? Ann N Y Acad Sci 2013, 1305:72-82.
    • (2013) Ann N Y Acad Sci , vol.1305 , pp. 72-82
    • Poggio, T.1    Ullman, S.2
  • 31
    • 84887004840
    • Decision making as a window on cognition
    • Shadlen M.N., Kiani R. Decision making as a window on cognition. Neuron 2013, 80:791-806.
    • (2013) Neuron , vol.80 , pp. 791-806
    • Shadlen, M.N.1    Kiani, R.2
  • 33
    • 84908673742
    • Neurocognitive mechanisms of perception-action coordination: a review and theoretical integration
    • Ridderinkhof K.R. Neurocognitive mechanisms of perception-action coordination: a review and theoretical integration. Neurosci Biobehav Rev 2014.
    • (2014) Neurosci Biobehav Rev
    • Ridderinkhof, K.R.1
  • 34
    • 0030606007
    • Dissociating executive functions of the prefrontal cortex
    • Robbins T.W. Dissociating executive functions of the prefrontal cortex. Philos Trans R Soc Lond B Biol Sci 1996, 351:1463-1470; discussion 1470-1471.
    • (1996) Philos Trans R Soc Lond B Biol Sci , vol.351 , pp. 1463-1470
    • Robbins, T.W.1
  • 35
    • 84887059933
    • Effects of different brain lesions on card sorting: the role of the frontal lobes
    • Milner B. Effects of different brain lesions on card sorting: the role of the frontal lobes. Arch Neurol 1963, 9:90-100.
    • (1963) Arch Neurol , vol.9 , pp. 90-100
    • Milner, B.1
  • 36
    • 80052561940
    • The human prefrontal cortex mediates integration of potential causes behind observed outcomes
    • Wunderlich K., Beierholm U.R., Bossaerts P., O'Doherty J.P. The human prefrontal cortex mediates integration of potential causes behind observed outcomes. J Neurophysiol 2011, 106:1558-1569.
    • (2011) J Neurophysiol , vol.106 , pp. 1558-1569
    • Wunderlich, K.1    Beierholm, U.R.2    Bossaerts, P.3    O'Doherty, J.P.4
  • 37
    • 84891901728
    • Inferring relevance in a changing world
    • Wilson R.C., Niv Y. Inferring relevance in a changing world. Front Hum Neurosci 2011, 5:189.
    • (2011) Front Hum Neurosci , vol.5 , pp. 189
    • Wilson, R.C.1    Niv, Y.2
  • 38
    • 58749086002
    • Novelty signals: a window into hippocampal information processing
    • Kumaran D., Maguire E.A. Novelty signals: a window into hippocampal information processing. Trends Cogn Sci 2009, 13:47-54.
    • (2009) Trends Cogn Sci , vol.13 , pp. 47-54
    • Kumaran, D.1    Maguire, E.A.2
  • 40
    • 84862001711
    • Transfer in reinforcement learning via shared features
    • Konidaris G., Scheidwasser I., Barto A.G. Transfer in reinforcement learning via shared features. J Mach Learn Res 2012, 13:1333-1371.
    • (2012) J Mach Learn Res , vol.13 , pp. 1333-1371
    • Konidaris, G.1    Scheidwasser, I.2    Barto, A.G.3
  • 42
    • 0141988716
    • Recent advances in hierarchical reinforcement learning
    • Barto A.G., Mahadevan S. Recent advances in hierarchical reinforcement learning. Discrete Event Dynamic Systems 2003, 13:341-379.
    • (2003) Discrete Event Dynamic Systems , vol.13 , pp. 341-379
    • Barto, A.G.1    Mahadevan, S.2
  • 43
    • 0033170372
    • Between MDPs and semi-MDPs: a framework for temporal abstraction in reinforcement learning
    • Sutton R.S., Precup D., Singh S. Between MDPs and semi-MDPs: a framework for temporal abstraction in reinforcement learning. Artif Intell 1999, 112:181-211.
    • (1999) Artif Intell , vol.112 , pp. 181-211
    • Sutton, R.S.1    Precup, D.2    Singh, S.3
  • 44
    • 70350566799
    • Hierarchically organized behavior and its neural foundations: a reinforcement learning perspective
    • Botvinick M.M., Niv Y., Barto A.C. Hierarchically organized behavior and its neural foundations: a reinforcement learning perspective. Cognition 2009, 113:262-280.
    • (2009) Cognition , vol.113 , pp. 262-280
    • Botvinick, M.M.1    Niv, Y.2    Barto, A.C.3
  • 45
    • 69249240301
    • Is the rostro-caudal axis of the frontal lobe hierarchical?
    • Badre D., D'Esposito M. Is the rostro-caudal axis of the frontal lobe hierarchical? Nat Rev Neurosci 2009, 10:659-669.
    • (2009) Nat Rev Neurosci , vol.10 , pp. 659-669
    • Badre, D.1    D'Esposito, M.2
  • 46
    • 0242497620
    • The architecture of cognitive control in the human prefrontal cortex
    • Koechlin E., Ody C., Kouneiher F. The architecture of cognitive control in the human prefrontal cortex. Science 2003, 302:1181-1185.
    • (2003) Science , vol.302 , pp. 1181-1185
    • Koechlin, E.1    Ody, C.2    Kouneiher, F.3
  • 49
    • 84859343955
    • How can a Bayesian approach inform neuroscience?
    • O'Reilly J.X., Jbabdi S., Behrens T.E. How can a Bayesian approach inform neuroscience? Eur J Neurosci 2012, 35:1169-1179.
    • (2012) Eur J Neurosci , vol.35 , pp. 1169-1179
    • O'Reilly, J.X.1    Jbabdi, S.2    Behrens, T.E.3
  • 52
    • 79551573880
    • Risk, unexpected uncertainty, and estimation uncertainty: Bayesian learning in unstable settings
    • Payzan-LeNestour E., Bossaerts P. Risk, unexpected uncertainty, and estimation uncertainty: Bayesian learning in unstable settings. PLoS Comput Biol 2011, 7:e1001048.
    • (2011) PLoS Comput Biol , vol.7 , pp. e1001048
    • Payzan-LeNestour, E.1    Bossaerts, P.2
  • 53
    • 84856734754
    • Rostrolateral prefrontal cortex and individual differences in uncertainty-driven exploration
    • Badre D., Doll B.B., Long N.M., Frank M.J. Rostrolateral prefrontal cortex and individual differences in uncertainty-driven exploration. Neuron 2012, 73:595-607.
    • (2012) Neuron , vol.73 , pp. 595-607
    • Badre, D.1    Doll, B.B.2    Long, N.M.3    Frank, M.J.4
  • 54
    • 84870898601
    • Do not bet on the unknown versus try to find out more: estimation uncertainty and 'unexpected uncertainty' both modulate exploration
    • Payzan-LeNestour E., Bossaerts P. Do not bet on the unknown versus try to find out more: estimation uncertainty and 'unexpected uncertainty' both modulate exploration. Front Neurosci 2012, 6:150.
    • (2012) Front Neurosci , vol.6 , pp. 150
    • Payzan-LeNestour, E.1    Bossaerts, P.2
  • 56
    • 84903295903
    • Human cognition. Foundations of human reasoning in the prefrontal cortex
    • Donoso M., Collins A.G., Koechlin E. Human cognition. Foundations of human reasoning in the prefrontal cortex. Science 2014, 344:1481-1486.
    • (2014) Science , vol.344 , pp. 1481-1486
    • Donoso, M.1    Collins, A.G.2    Koechlin, E.3
  • 57
    • 79955766151
    • Differentiable contributions of human amygdalar subregions in the computations underlying reward and avoidance learning
    • Prevost C., McCabe J.A., Jessup R.K., Bossaerts P., O'Doherty J.P. Differentiable contributions of human amygdalar subregions in the computations underlying reward and avoidance learning. Eur J Neurosci 2011, 34:134-145.
    • (2011) Eur J Neurosci , vol.34 , pp. 134-145
    • Prevost, C.1    McCabe, J.A.2    Jessup, R.K.3    Bossaerts, P.4    O'Doherty, J.P.5
  • 58
    • 84879996099
    • The neural representation of unexpected uncertainty during value-based decision making
    • Payzan-LeNestour E., Dunne S., Bossaerts P., O'Doherty J.P. The neural representation of unexpected uncertainty during value-based decision making. Neuron 2013, 79:191-201.
    • (2013) Neuron , vol.79 , pp. 191-201
    • Payzan-LeNestour, E.1    Dunne, S.2    Bossaerts, P.3    O'Doherty, J.P.4
  • 59
    • 84874778074
    • Evidence for model-based computations in the human amygdala during Pavlovian conditioning
    • Prevost C., McNamee D., Jessup R.K., Bossaerts P., O'Doherty J.P. Evidence for model-based computations in the human amygdala during Pavlovian conditioning. PLoS Comput Biol 2013, 9:e1002918.
    • (2013) PLoS Comput Biol , vol.9 , pp. e1002918
    • Prevost, C.1    McNamee, D.2    Jessup, R.K.3    Bossaerts, P.4    O'Doherty, J.P.5
  • 61
    • 0019089514
    • A model for Pavlovian learning: variations in the effectiveness of conditioned but not of unconditioned stimuli
    • Pearce J.M., Hall G. A model for Pavlovian learning: variations in the effectiveness of conditioned but not of unconditioned stimuli. Psychol Rev 1980, 87:532-552.
    • (1980) Psychol Rev , vol.87 , pp. 532-552
    • Pearce, J.M.1    Hall, G.2
  • 62
    • 34447623582
    • Adding prediction risk to the theory of reward learning
    • Preuschoff K., Bossaerts P. Adding prediction risk to the theory of reward learning. Ann N Y Acad Sci 2007.
    • (2007) Ann N Y Acad Sci
    • Preuschoff, K.1    Bossaerts, P.2
  • 63
    • 80053236449
    • Differential roles of human striatum and amygdala in associative learning
    • Li J., Schiller D., Schoenbaum G., Phelps E.A., Daw N.D. Differential roles of human striatum and amygdala in associative learning. Nat Neurosci 2011, 14:1250-1252.
    • (2011) Nat Neurosci , vol.14 , pp. 1250-1252
    • Li, J.1    Schiller, D.2    Schoenbaum, G.3    Phelps, E.A.4    Daw, N.D.5
  • 68
    • 84859341150
    • Habits, action sequences and reinforcement learning
    • Dezfouli A., Balleine B.W. Habits, action sequences and reinforcement learning. Eur J Neurosci 2012, 35:1036-1051.
    • (2012) Eur J Neurosci , vol.35 , pp. 1036-1051
    • Dezfouli, A.1    Balleine, B.W.2
  • 69
    • 77649267492
    • A nonsupervised learning framework of human behavior patterns based on sequential actions
    • Lee S.W., Kim Y.S., Bien Z. A nonsupervised learning framework of human behavior patterns based on sequential actions. IEEE Trans Knowledge Data Eng 2010, 22:479-492.
    • (2010) IEEE Trans Knowledge Data Eng , vol.22 , pp. 479-492
    • Lee, S.W.1    Kim, Y.S.2    Bien, Z.3
  • 70
    • 84873628793
    • Applying human learning principles to user-centered IoT systems
    • Lee S.W., Prenzel O., Bien Z. Applying human learning principles to user-centered IoT systems. IEEE Computer 2013, 46:46-52.
    • (2013) IEEE Computer , vol.46 , pp. 46-52
    • Lee, S.W.1    Prenzel, O.2    Bien, Z.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.