



Volume 33, Issue 13, 2013, Pages 5797-5805

Hierarchical learning induces two simultaneous, but separable, prediction errors in human basal ganglia

Author keywords

[No Author keywords available]

Indexed keywords

ADULT; ANALYTICAL ERROR; ARTICLE; BASAL GANGLION; BEHAVIOR; CORPUS STRIATUM; DECISION MAKING; DOPAMINERGIC NERVE CELL; FEMALE; FUNCTIONAL MAGNETIC RESONANCE IMAGING; HUMAN; HUMAN EXPERIMENT; MALE; OUTCOME ASSESSMENT; PREDICTION; PRIORITY JOURNAL; PROBABILITY; SPIKE; STATE DEPENDENT LEARNING; TASK PERFORMANCE; VENTRAL TEGMENTUM

EID: 84875468581     PISSN: 02706474     EISSN: 15292401     Source Type: Journal    
DOI: 10.1523/JNEUROSCI.5445-12.2013     Document Type: Article
Times cited: (70)

References (42)
  • 1
    • 33646833114
    • Prediction error as a linear function of reward probability is coded in human nucleus accumbens
    • Abler B, Walter H, Erk S, Kammerer H, Spitzer M (2006) Prediction error as a linear function of reward probability is coded in human nucleus accumbens. Neuroimage 31:790-795.
    • (2006) Neuroimage , vol.31 , pp. 790-795
    • Abler, B.1    Walter, H.2    Erk, S.3    Kammerer, H.4    Spitzer, M.5
  • 2
    • 0000541213
    • Adaptive critics and the basal ganglia
    • (Houk JC, Davis J, Beiser D, eds), Cambridge, MA: MIT
    • Barto AG (1995) Adaptive critics and the basal ganglia. In: Models of information processing in the basal ganglia (Houk JC, Davis J, Beiser D, eds), pp 215-232. Cambridge, MA: MIT.
    • (1995) Models of Information Processing In the Basal Ganglia , pp. 215-232
    • Barto, A.G.1
  • 3
    • 0141988716
    • Recent advances in hierarchical reinforcement learning
    • Barto AG, Mahadevan S (2003) Recent advances in hierarchical reinforcement learning. Discrete Event Dynamic Systems 13:341-379.
    • (2003) Discrete Event Dynamic Systems , vol.13 , pp. 341-379
    • Barto, A.G.1    Mahadevan, S.2
  • 5
    • 84878190351
    • Hierarchical reinforcement learning and decision making
    • Botvinick MM (2012) Hierarchical reinforcement learning and decision making. Curr Opin Neurobiol 22:956-962.
    • (2012) Curr Opin Neurobiol , vol.22 , pp. 956-962
    • Botvinick, M.M.1
  • 6
    • 70350566799
    • Hierarchically organized behavior and its neural foundations: A reinforcement learning perspective
    • Botvinick MM, Niv Y, Barto AC (2009) Hierarchically organized behavior and its neural foundations: a reinforcement learning perspective. Cognition 113:262-280.
    • (2009) Cognition , vol.113 , pp. 262-280
    • Botvinick, M.M.1    Niv, Y.2    Barto, A.C.3
  • 7
    • 0030612822
    • The psychophysics toolbox
    • Brainard DH (1997) The psychophysics toolbox. Spat Vis 10:443-446.
    • (1997) Spat Vis , vol.10 , pp. 443-446
    • Brainard, D.H.1
  • 8
    • 63849268432
    • Phasic excitation of dopamine neurons in ventral VTA by noxious stimuli
    • Brischoux F, Chakraborty S, Brierley DI, Ungless MA (2009) Phasic excitation of dopamine neurons in ventral VTA by noxious stimuli. Proc Natl Acad Sci U S A 106:4894-4899.
    • (2009) Proc Natl Acad Sci U S A , vol.106 , pp. 4894-4899
    • Brischoux, F.1    Chakraborty, S.2    Brierley, D.I.3    Ungless, M.A.4
  • 10
    • 34250348767
    • Should I stay or should I go? How the human brain manages the trade-off between exploitation and exploration
    • Cohen JD, McClure SM, Yu AJ (2007) Should I stay or should I go? How the human brain manages the trade-off between exploitation and exploration. Philos Trans R Soc Lond B Biol Sci 362:933-942.
    • (2007) Philos Trans R Soc Lond B Biol Sci , vol.362 , pp. 933-942
    • Cohen, J.D.1    McClure, S.M.2    Yu, A.J.3
  • 11
    • 40049086223
    • BOLD responses reflecting dopaminergic signals in the human ventral tegmental area
    • D'Ardenne K, McClure SM, Nystrom LE, Cohen JD (2008) BOLD responses reflecting dopaminergic signals in the human ventral tegmental area. Science 319:1264-1267.
    • (2008) Science , vol.319 , pp. 1264-1267
    • D'Ardenne, K.1    McClure, S.M.2    Nystrom, L.E.3    Cohen, J.D.4
  • 12
    • 80052600884
    • Trial-by-trial data analysis using computational models
    • Daw ND (2009) Trial-by-trial data analysis using computational models. Attention and Performance (1-26).
    • (2009) Attention and Performance , pp. 1-26
    • Daw, N.D.1
  • 13
    • 33745223257
    • Cortical substrates for exploratory decisions in humans
    • Daw ND, O'Doherty JP, Dayan P, Seymour B, Dolan RJ (2006) Cortical substrates for exploratory decisions in humans. Nature 441:876-879.
    • (2006) Nature , vol.441 , pp. 876-879
    • Daw, N.D.1    O'Doherty, J.P.2    Dayan, P.3    Seymour, B.4    Dolan, R.J.5
  • 14
    • 79952746011
    • Model-based influences on humans' choices and striatal prediction errors
    • Daw ND, Gershman SJ, Seymour B, Dayan P, Dolan RJ (2011) Model-based influences on humans' choices and striatal prediction errors. Neuron 69:1204-1215.
    • (2011) Neuron , vol.69 , pp. 1204-1215
    • Daw, N.D.1    Gershman, S.J.2    Seymour, B.3    Dayan, P.4    Dolan, R.J.5
  • 15
    • 0001158047
    • Improving generalization for temporal difference learning: The successor representation
    • Dayan P (1993) Improving generalization for temporal difference learning: the successor representation. Neural Computation 5:613-624.
    • (1993) Neural Computation , vol.5 , pp. 613-624
    • Dayan, P.1
  • 16
    • 11844296013
    • An fMRI study of reward-related probability learning
    • Delgado MR, Miller MM, Inati S, Phelps EA (2005) An fMRI study of reward-related probability learning. Neuroimage 24:862-873.
    • (2005) Neuroimage , vol.24 , pp. 862-873
    • Delgado, M.R.1    Miller, M.M.2    Inati, S.3    Phelps, E.A.4
  • 17
    • 0002278788
    • Hierarchical reinforcement learning with the MAXQ value function decomposition
    • Dietterich TG (2000) Hierarchical reinforcement learning with the MAXQ value function decomposition. Journal of Artificial Intelligence Research 13:227-303.
    • (2000) Journal of Artificial Intelligence Research , vol.13 , pp. 227-303
    • Dietterich, T.G.1
  • 19
    • 70350521769
    • Human reinforcement learning subdivides structured action spaces by learning effector-specific values
    • Gershman SJ, Pesaran B, Daw ND (2009) Human reinforcement learning subdivides structured action spaces by learning effector-specific values. J Neurosci 29:13524-13531.
    • (2009) J Neurosci , vol.29 , pp. 13524-13531
    • Gershman, S.J.1    Pesaran, B.2    Daw, N.D.3
  • 20
    • 77953260848
    • States versus rewards: Dissociable neural prediction error signals underlying model-based and model-free reinforcement learning
    • Gläscher J, Daw N, Dayan P, O'Doherty JP (2010) States versus rewards: dissociable neural prediction error signals underlying model-based and model-free reinforcement learning. Neuron 66:585-595.
    • (2010) Neuron , vol.66 , pp. 585-595
    • Gläscher, J.1    Daw, N.2    Dayan, P.3    O'Doherty, J.P.4
  • 21
    • 80053152388
    • Understanding dopamine and reinforcement learning: The dopamine reward prediction error hypothesis
    • Glimcher PW (2011) Understanding dopamine and reinforcement learning: the dopamine reward prediction error hypothesis. Proc Natl Acad Sci U S A 108.
    • (2011) Proc Natl Acad Sci U S A , vol.108
    • Glimcher, P.W.1
  • 22
    • 45949091429
    • Dissociating the role of the orbitofrontal cortex and the striatum in the computation of goal values and prediction errors
    • Hare TA, O'Doherty J, Camerer CF, Schultz W, Rangel A (2008) Dissociating the role of the orbitofrontal cortex and the striatum in the computation of goal values and prediction errors. J Neurosci 28:5623-5630.
    • (2008) J Neurosci , vol.28 , pp. 5623-5630
    • Hare, T.A.1    O'Doherty, J.2    Camerer, C.F.3    Schultz, W.4    Rangel, A.5
  • 23
    • 80054695119
    • A novel method for analyzing sequential eye movements reveals strategic influence on Raven's advanced progressive matrices
    • Hayes TR, Petrov AA, Sederberg PB (2011) A novel method for analyzing sequential eye movements reveals strategic influence on Raven's advanced progressive matrices. J Vis 11:10.
    • (2011) J Vis , vol.11 , pp. 10
    • Hayes, T.R.1    Petrov, A.A.2    Sederberg, P.B.3
  • 24
    • 81355153880
    • Dissociable reward and timing signals in human midbrain and ventral striatum
    • Klein-Flugge MC, Hunt LT, Bach DR, Dolan RJ, Behrens TE (2011) Dissociable reward and timing signals in human midbrain and ventral striatum. Neuron 72:654-664.
    • (2011) Neuron , vol.72 , pp. 654-664
    • Klein-Flugge, M.C.1    Hunt, L.T.2    Bach, D.R.3    Dolan, R.J.4    Behrens, T.E.5
  • 26
    • 54949094339
    • Policy adjustment in a dynamic economic game
    • Li J, McClure SM, King-Casas B, Montague PR (2006) Policy adjustment in a dynamic economic game. PLoS ONE 1:e103.
    • (2006) PLoS ONE , vol.1
    • Li, J.1    McClure, S.M.2    King-Casas, B.3    Montague, P.R.4
  • 27
    • 79955721719
    • Signals in human striatum are appropriate for policy update rather than value prediction
    • Li J, Daw ND (2011) Signals in human striatum are appropriate for policy update rather than value prediction. J Neurosci 31:5504-5511.
    • (2011) J Neurosci , vol.31 , pp. 5504-5511
    • Li, J.1    Daw, N.D.2
  • 28
    • 34547144795
    • Neural signature of fictive learning signals in a sequential investment task
    • Lohrenz T, McCabe K, Camerer CF, Montague PR (2007) Neural signature of fictive learning signals in a sequential investment task. Proc Natl Acad Sci U S A 104:9493-9498.
    • (2007) Proc Natl Acad Sci U S A , vol.104 , pp. 9493-9498
    • Lohrenz, T.1    McCabe, K.2    Camerer, C.F.3    Montague, P.R.4
  • 29
    • 0037650217
    • Temporal prediction errors in a passive learning task activate human striatum
    • McClure SM, Berns GS, Montague PR (2003) Temporal prediction errors in a passive learning task activate human striatum. Neuron 38:339-346.
    • (2003) Neuron , vol.38 , pp. 339-346
    • McClure, S.M.1    Berns, G.S.2    Montague, P.R.3
  • 30
    • 0029981543
    • A Framework for Mesencephalic Predictive Hebbian Learning
    • Montague PR, Dayan P, Sejnowski TJ (1996) A Framework for Mesencephalic Predictive Hebbian Learning. J Neurosci 16:1936-1947.
    • (1996) J Neurosci , vol.16 , pp. 1936-1947
    • Montague, P.R.1    Dayan, P.2    Sejnowski, T.J.3
  • 31
    • 84855688852
    • Neural prediction errors reveal a risk-sensitive reinforcement-learning process in the human brain
    • Niv Y, Edlund JA, Dayan P, O'Doherty JP (2012) Neural prediction errors reveal a risk-sensitive reinforcement-learning process in the human brain. J Neurosci 32:551-562.
    • (2012) J Neurosci , vol.32 , pp. 551-562
    • Niv, Y.1    Edlund, J.A.2    Dayan, P.3    O'Doherty, J.P.4
  • 32
    • 0037987978
    • Temporal difference models and reward-related learning in the human brain
    • O'Doherty JP, Dayan P, Friston K, Critchley H, Dolan RJ (2003) Temporal difference models and reward-related learning in the human brain. Neuron 38:329-337.
    • (2003) Neuron , vol.38 , pp. 329-337
    • O'Doherty, J.P.1    Dayan, P.2    Friston, K.3    Critchley, H.4    Dolan, R.J.5
  • 33
    • 1942520195
    • Dissociable roles of ventral and dorsal striatum in instrumental conditioning
    • O'Doherty J, Dayan P, Schultz J, Deichmann R, Friston K, Dolan RJ (2004) Dissociable roles of ventral and dorsal striatum in instrumental conditioning. Science 304:452-454.
    • (2004) Science , vol.304 , pp. 452-454
    • O'Doherty, J.1    Dayan, P.2    Schultz, J.3    Deichmann, R.4    Friston, K.5    Dolan, R.J.6
  • 34
    • 33746711623
    • Neural differentiation of expected reward and risk in human subcortical structures
    • Preuschoff K, Bossaerts P, Quartz SR (2006) Neural differentiation of expected reward and risk in human subcortical structures. Neuron.
    • (2006) Neuron
    • Preuschoff, K.1    Bossaerts, P.2    Quartz, S.R.3
  • 36
    • 36348966690
    • Reinforcement learning signals in the human striatum distinguish learners from non-learners during reward-based decision making
    • Schönberg T, Daw ND, Joel D, O'Doherty JP (2007) Reinforcement learning signals in the human striatum distinguish learners from non-learners during reward-based decision making. J Neurosci 27:12860-12867.
    • (2007) J Neurosci , vol.27 , pp. 12860-12867
    • Schönberg, T.1    Daw, N.D.2    Joel, D.3    O'Doherty, J.P.4
  • 37
    • 0037057755
    • Getting formal with dopamine and reward
    • Schultz W (2002) Getting formal with dopamine and reward. Neuron 36:241-263.
    • (2002) Neuron , vol.36 , pp. 241-263
    • Schultz, W.1
  • 38
    • 0030896968
    • A neural substrate of prediction and reward
    • Schultz W, Dayan P, Montague PR (1997) A neural substrate of prediction and reward. Science 275:1593-1599.
    • (1997) Science , vol.275 , pp. 1593-1599
    • Schultz, W.1    Dayan, P.2    Montague, P.R.3
  • 40
    • 0033170372
    • Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning
    • Sutton RS, Precup D, Singh S (1999) Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence 112:181-211.
    • (1999) Artificial Intelligence , vol.112 , pp. 181-211
    • Sutton, R.S.1    Precup, D.2    Singh, S.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.