-
1
-
-
33646833114
-
Prediction error as a linear function of reward probability is coded in human nucleus accumbens
-
CrossRef Medline
-
Abler B, Walter H, Erk S, Kammerer H, Spitzer M (2006) Prediction error as a linear function of reward probability is coded in human nucleus accumbens. Neuroimage 31:790-795. CrossRef Medline
-
(2006)
Neuroimage
, vol.31
, pp. 790-795
-
-
Abler, B.1
Walter, H.2
Erk, S.3
Kammerer, H.4
Spitzer, M.5
-
2
-
-
0000541213
-
Adaptive critics and the basal ganglia
-
(Houk JC, Davis J, Beiser D, eds), Cambridge, MA: MIT
-
Barto AG (1995) Adaptive critics and the basal ganglia. In: Models of information processing in the basal ganglia (Houk JC, Davis J, Beiser D, eds), pp 215-232. Cambridge, MA: MIT.
-
(1995)
Models of Information Processing In the Basal Ganglia
, pp. 215-232
-
-
Barto, A.G.1
-
3
-
-
0141988716
-
Recent advances in hierarchical reinforcement learning
-
CrossRef
-
Barto AG, Mahadevan S (2003) Recent advances in hierarchical reinforcement learning. Discrete Event Dynamic Systems 13:341-379. CrossRef
-
(2003)
Discrete Event Dynamic Systems
, vol.13
, pp. 341-379
-
-
Barto, A.G.1
Mahadevan, S.2
-
5
-
-
84878190351
-
Hierarchical reinforcement learning and decision making
-
CrossRef Medline
-
Botvinick MM (2012) Hierarchical reinforcement learning and decision making. Curr Opin Neurobiol 22:956-962. CrossRef Medline
-
(2012)
Curr Opin Neurobiol
, vol.22
, pp. 956-962
-
-
Botvinick, M.M.1
-
6
-
-
70350566799
-
Hierarchically organized behavior and its neural foundations: A reinforcement learning perspective
-
CrossRef Medline
-
Botvinick MM, Niv Y, Barto AG (2009) Hierarchically organized behavior and its neural foundations: a reinforcement learning perspective. Cognition 113:262-280. CrossRef Medline
-
(2009)
Cognition
, vol.113
, pp. 262-280
-
-
Botvinick, M.M.1
Niv, Y.2
Barto, A.G.3
-
7
-
-
0030612822
-
The psychophysics toolbox
-
CrossRef Medline
-
Brainard DH (1997) The psychophysics toolbox. Spat Vis 10:443-446. CrossRef Medline
-
(1997)
Spat Vis
, vol.10
, pp. 443-446
-
-
Brainard, D.H.1
-
10
-
-
34250348767
-
Should I stay or should I go? How the human brain manages the trade-off between exploitation and exploration
-
CrossRef Medline
-
Cohen JD, McClure SM, Yu AJ (2007) Should I stay or should I go? How the human brain manages the trade-off between exploitation and exploration. Philos Trans R Soc Lond B Biol Sci 362:933-942. CrossRef Medline
-
(2007)
Philos Trans R Soc Lond B Biol Sci
, vol.362
, pp. 933-942
-
-
Cohen, J.D.1
McClure, S.M.2
Yu, A.J.3
-
11
-
-
40049086223
-
BOLD responses reflecting dopaminergic signals in the human ventral tegmental area
-
CrossRef Medline
-
D'Ardenne K, McClure SM, Nystrom LE, Cohen JD (2008) BOLD responses reflecting dopaminergic signals in the human ventral tegmental area. Science 319:1264-1267. CrossRef Medline
-
(2008)
Science
, vol.319
, pp. 1264-1267
-
-
D'Ardenne, K.1
McClure, S.M.2
Nystrom, L.E.3
Cohen, J.D.4
-
12
-
-
80052600884
-
Trial-by-trial data analysis using computational models
-
Daw ND (2009) Trial-by-trial data analysis using computational models. Attention and Performance (1-26).
-
(2009)
Attention and Performance
, pp. 1-26
-
-
Daw, N.D.1
-
13
-
-
33745223257
-
Cortical substrates for exploratory decisions in humans
-
CrossRef Medline
-
Daw ND, O'Doherty JP, Dayan P, Seymour B, Dolan RJ (2006) Cortical substrates for exploratory decisions in humans. Nature 441:876-879. CrossRef Medline
-
(2006)
Nature
, vol.441
, pp. 876-879
-
-
Daw, N.D.1
O'Doherty, J.P.2
Dayan, P.3
Seymour, B.4
Dolan, R.J.5
-
14
-
-
79952746011
-
Model-based influences on humans' choices and striatal prediction errors
-
CrossRef Medline
-
Daw ND, Gershman SJ, Seymour B, Dayan P, Dolan RJ (2011) Model-based influences on humans' choices and striatal prediction errors. Neuron 69:1204-1215. CrossRef Medline
-
(2011)
Neuron
, vol.69
, pp. 1204-1215
-
-
Daw, N.D.1
Gershman, S.J.2
Seymour, B.3
Dayan, P.4
Dolan, R.J.5
-
15
-
-
0001158047
-
Improving generalization for temporal difference learning: The successor representation
-
CrossRef
-
Dayan P (1993) Improving generalization for temporal difference learning: the successor representation. Neural Computation 5:613-624. CrossRef
-
(1993)
Neural Computation
, vol.5
, pp. 613-624
-
-
Dayan, P.1
-
16
-
-
11844296013
-
An fMRI study of reward-related probability learning
-
CrossRef Medline
-
Delgado MR, Miller MM, Inati S, Phelps EA (2005) An fMRI study of reward-related probability learning. Neuroimage 24:862-873. CrossRef Medline
-
(2005)
Neuroimage
, vol.24
, pp. 862-873
-
-
Delgado, M.R.1
Miller, M.M.2
Inati, S.3
Phelps, E.A.4
-
17
-
-
0002278788
-
Hierarchical reinforcement learning with the MAXQ value function decomposition
-
Dietterich TG (2000) Hierarchical reinforcement learning with the MAXQ value function decomposition. Journal of Artificial Intelligence Research 13:227-303.
-
(2000)
Journal of Artificial Intelligence Research
, vol.13
, pp. 227-303
-
-
Dietterich, T.G.1
-
18
-
-
84875466587
-
-
2010 Neuroscience Meeting Planner, San Diego, CA: Society for Neuroscience
-
Diuk C, Botvinick MM, Barto AG, Niv Y (2010) Program No. 36:907.13. 2010 Neuroscience Meeting Planner, San Diego, CA: Society for Neuroscience.
-
(2010)
Program No. 36:907.13
-
-
Diuk, C.1
Botvinick, M.M.2
Barto, A.G.3
Niv, Y.4
-
19
-
-
70350521769
-
Human reinforcement learning subdivides structured action spaces by learning effector-specific values
-
CrossRef Medline
-
Gershman SJ, Pesaran B, Daw ND (2009) Human reinforcement learning subdivides structured action spaces by learning effector-specific values. J Neurosci 29:13524-13531. CrossRef Medline
-
(2009)
J Neurosci
, vol.29
, pp. 13524-13531
-
-
Gershman, S.J.1
Pesaran, B.2
Daw, N.D.3
-
20
-
-
77953260848
-
States versus rewards: Dissociable neural prediction error signals underlying model-based and model-free reinforcement learning
-
CrossRef Medline
-
Gläscher J, Daw N, Dayan P, O'Doherty JP (2010) States versus rewards: dissociable neural prediction error signals underlying model-based and model-free reinforcement learning. Neuron 66:585-595. CrossRef Medline
-
(2010)
Neuron
, vol.66
, pp. 585-595
-
-
Gläscher, J.1
Daw, N.2
Dayan, P.3
O'Doherty, J.P.4
-
21
-
-
80053152388
-
Understanding dopamine and reinforcement learning: The dopamine reward prediction error hypothesis
-
Glimcher PW (2011) Understanding dopamine and reinforcement learning: the dopamine reward prediction error hypothesis. Proc Natl Acad Sci U S A 108.
-
(2011)
Proc Natl Acad Sci U S A
, vol.108
-
-
Glimcher, P.W.1
-
22
-
-
45949091429
-
Dissociating the role of the orbitofrontal cortex and the striatum in the computation of goal values and prediction errors
-
CrossRef Medline
-
Hare TA, O'Doherty J, Camerer CF, Schultz W, Rangel A (2008) Dissociating the role of the orbitofrontal cortex and the striatum in the computation of goal values and prediction errors. J Neurosci 28:5623-5630. CrossRef Medline
-
(2008)
J Neurosci
, vol.28
, pp. 5623-5630
-
-
Hare, T.A.1
O'Doherty, J.2
Camerer, C.F.3
Schultz, W.4
Rangel, A.5
-
23
-
-
80054695119
-
A novel method for analyzing sequential eye movements reveals strategic influence on Raven's advanced progressive matrices
-
CrossRef Medline
-
Hayes TR, Petrov AA, Sederberg PB (2011) A novel method for analyzing sequential eye movements reveals strategic influence on Raven's advanced progressive matrices. J Vis 11:10. CrossRef Medline
-
(2011)
J Vis
, vol.11
, pp. 10
-
-
Hayes, T.R.1
Petrov, A.A.2
Sederberg, P.B.3
-
24
-
-
81355153880
-
Dissociable reward and timing signals in human midbrain and ventral striatum
-
CrossRef Medline
-
Klein-Flügge MC, Hunt LT, Bach DR, Dolan RJ, Behrens TE (2011) Dissociable reward and timing signals in human midbrain and ventral striatum. Neuron 72:654-664. CrossRef Medline
-
(2011)
Neuron
, vol.72
, pp. 654-664
-
-
Klein-Flügge, M.C.1
Hunt, L.T.2
Bach, D.R.3
Dolan, R.J.4
Behrens, T.E.5
-
27
-
-
79955721719
-
Signals in human striatum are appropriate for policy update rather than value prediction
-
CrossRef Medline
-
Li J, Daw ND (2011) Signals in human striatum are appropriate for policy update rather than value prediction. J Neurosci 31:5504-5511. CrossRef Medline
-
(2011)
J Neurosci
, vol.31
, pp. 5504-5511
-
-
Li, J.1
Daw, N.D.2
-
28
-
-
34547144795
-
Neural signature of fictive learning signals in a sequential investment task
-
CrossRef Medline
-
Lohrenz T, McCabe K, Camerer CF, Montague PR (2007) Neural signature of fictive learning signals in a sequential investment task. Proc Natl Acad Sci U S A 104:9493-9498. CrossRef Medline
-
(2007)
Proc Natl Acad Sci U S A
, vol.104
, pp. 9493-9498
-
-
Lohrenz, T.1
McCabe, K.2
Camerer, C.F.3
Montague, P.R.4
-
29
-
-
0037650217
-
Temporal prediction errors in a passive learning task activate human striatum
-
CrossRef Medline
-
McClure SM, Berns GS, Montague PR (2003) Temporal prediction errors in a passive learning task activate human striatum. Neuron 38:339-346. CrossRef Medline
-
(2003)
Neuron
, vol.38
, pp. 339-346
-
-
McClure, S.M.1
Berns, G.S.2
Montague, P.R.3
-
30
-
-
0029981543
-
A Framework for Mesencephalic Predictive Hebbian Learning
-
Medline
-
Montague PR, Dayan P, Sejnowski TJ (1996) A Framework for Mesencephalic Predictive Hebbian Learning. J Neurosci 16:1936-1947. Medline
-
(1996)
J Neurosci
, vol.16
, pp. 1936-1947
-
-
Montague, P.R.1
Dayan, P.2
Sejnowski, T.J.3
-
31
-
-
84855688852
-
Neural prediction errors reveal a risk-sensitive reinforcement-learning process in the human brain
-
CrossRef Medline
-
Niv Y, Edlund JA, Dayan P, O'Doherty JP (2012) Neural prediction errors reveal a risk-sensitive reinforcement-learning process in the human brain. J Neurosci 32:551-562. CrossRef Medline
-
(2012)
J Neurosci
, vol.32
, pp. 551-562
-
-
Niv, Y.1
Edlund, J.A.2
Dayan, P.3
O'Doherty, J.P.4
-
32
-
-
0037987978
-
Temporal difference models and reward-related learning in the human brain
-
CrossRef Medline
-
O'Doherty JP, Dayan P, Friston K, Critchley H, Dolan RJ (2003) Temporal difference models and reward-related learning in the human brain. Neuron 38:329-337. CrossRef Medline
-
(2003)
Neuron
, vol.38
, pp. 329-337
-
-
O'Doherty, J.P.1
Dayan, P.2
Friston, K.3
Critchley, H.4
Dolan, R.J.5
-
33
-
-
1942520195
-
Dissociable roles of ventral and dorsal striatum in instrumental conditioning
-
CrossRef Medline
-
O'Doherty J, Dayan P, Schultz J, Deichmann R, Friston K, Dolan RJ (2004) Dissociable roles of ventral and dorsal striatum in instrumental conditioning. Science 304:452-454. CrossRef Medline
-
(2004)
Science
, vol.304
, pp. 452-454
-
-
O'Doherty, J.1
Dayan, P.2
Schultz, J.3
Deichmann, R.4
Friston, K.5
Dolan, R.J.6
-
34
-
-
33746711623
-
Neural differentiation of expected reward and risk in human subcortical structures
-
Preuschoff K, Bossaerts P, Quartz SR (2006) Neural differentiation of expected reward and risk in human subcortical structures. Neuron.
-
(2006)
Neuron
-
-
Preuschoff, K.1
Bossaerts, P.2
Quartz, S.R.3
-
35
-
-
79960637995
-
A neural signature of hierarchical reinforcement learning
-
CrossRef Medline
-
Ribas-Fernandes JJ, Solway A, Diuk C, McGuire JT, Barto AG, Niv Y, Botvinick MM (2011) A neural signature of hierarchical reinforcement learning. Neuron 71:370-379. CrossRef Medline
-
(2011)
Neuron
, vol.71
, pp. 370-379
-
-
Ribas-Fernandes, J.J.1
Solway, A.2
Diuk, C.3
McGuire, J.T.4
Barto, A.G.5
Niv, Y.6
Botvinick, M.M.7
-
36
-
-
36348966690
-
Reinforcement learning signals in the human striatum distinguish learners from non-learners during reward-based decision making
-
CrossRef Medline
-
Schönberg T, Daw ND, Joel D, O'Doherty JP (2007) Reinforcement learning signals in the human striatum distinguish learners from non-learners during reward-based decision making. J Neurosci 27:12860-12867. CrossRef Medline
-
(2007)
J Neurosci
, vol.27
, pp. 12860-12867
-
-
Schönberg, T.1
Daw, N.D.2
Joel, D.3
O'Doherty, J.P.4
-
37
-
-
0037057755
-
Getting formal with dopamine and reward
-
CrossRef Medline
-
Schultz W (2002) Getting formal with dopamine and reward. Neuron 36:241-263. CrossRef Medline
-
(2002)
Neuron
, vol.36
, pp. 241-263
-
-
Schultz, W.1
-
38
-
-
0030896968
-
A neural substrate of prediction and reward
-
CrossRef Medline
-
Schultz W, Dayan P, Montague PR (1997) A neural substrate of prediction and reward. Science 275:1593-1599. CrossRef Medline
-
(1997)
Science
, vol.275
, pp. 1593-1599
-
-
Schultz, W.1
Dayan, P.2
Montague, P.R.3
-
40
-
-
0033170372
-
Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning
-
CrossRef
-
Sutton RS, Precup D, Singh S (1999) Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence 112:181-211. CrossRef
-
(1999)
Artificial Intelligence
, vol.112
, pp. 181-211
-
-
Sutton, R.S.1
Precup, D.2
Singh, S.3
-
41
-
-
84862673530
-
Learning to simulate others' decisions
-
CrossRef Medline
-
Suzuki S, Harasawa N, Ueno K, Gardner JL, Ichinohe N, Haruno M, Cheng K, Nakahara H (2012) Learning to simulate others' decisions. Neuron 74:1125-1137. CrossRef Medline
-
(2012)
Neuron
, vol.74
, pp. 1125-1137
-
-
Suzuki, S.1
Harasawa, N.2
Ueno, K.3
Gardner, J.L.4
Ichinohe, N.5
Haruno, M.6
Cheng, K.7
Nakahara, H.8