1. JC Houk, XL Adams, and AG Barto. A model of how the basal ganglia generate and use neural signals that predict reinforcement. In JC Houk, XL Davis, and DG Beiser, editors, Models of Information Processing in the Basal Ganglia, pages 249-270. MIT Press, 1995.
2. PR Montague, P Dayan, and TJ Sejnowski. A framework for mesencephalic dopamine systems based on predictive Hebbian learning. J Neurosci, 16:1936-1947, 1996.
3. W Schultz, P Dayan, and PR Montague. A neural substrate of prediction and reward. Science, 275:1593-1599, 1997.
4. RE Suri and W Schultz. A neural network with dopamine-like reinforcement signal that learns a spatial delayed response task. Neurosci, 91:871-890, 1999.
5. ND Daw and DS Touretzky. Long-term reward prediction in TD models of the dopamine system. Neural Comp, 14:2567-2583, 2002.
6. RS Sutton. Learning to predict by the method of temporal differences. Machine Learning, 3:9-44, 1988.
7. W Schultz. Predictive reward signal of dopamine neurons. J Neurophys, 80:1-27, 1998.
9. JR Hollerman and W Schultz. Dopamine neurons report an error in the temporal prediction of reward during learning. Nature Neurosci, 1:304-309, 1998.
10. DS Touretzky, ND Daw, and EJ Tira-Thompson. Combining configural and TD learning on a robot. In ICDL 2, pages 47-52. IEEE Computer Society, 2002.
11. CD Fiorillo and W Schultz. The reward responses of dopamine neurons persist when prediction of reward is probabilistic with respect to time or occurrence. In Soc. Neurosci. Abstracts, volume 27: 827.5, 2001.
12. LP Kaelbling, ML Littman, and AR Cassandra. Planning and acting in partially observable stochastic domains. Artif Intell, 101:99-134, 1998.
13. SJ Bradtke and MO Duff. Reinforcement learning methods for continuous-time Markov decision problems. In NIPS 7, pages 393-400. MIT Press, 1995.
14. L Chrisman. Reinforcement learning with perceptual aliasing: The perceptual distinctions approach. In AAAI 10, pages 183-188, 1992.
15. AC Courville and DS Touretzky. Modeling temporal structure in classical conditioning. In NIPS 14, pages 3-10. MIT Press, 2001.
16. S Kakade and P Dayan. Acquisition in autoshaping. In NIPS 12, pages 24-30. MIT Press, 2000.
17. Y Guedon and C Cocozza-Thivent. Explicit state occupancy modeling by hidden semi-Markov models: Application of Derin's scheme. Comp Speech and Lang, 4:161-192, 1990.
18. CR Gallistel and J Gibbon. Time, rate and conditioning. Psych Rev, 107(2):289-344, 2000.
19. RE Suri. Anticipatory responses of dopamine neurons and cortical neurons reproduced by internal model. Exp Brain Research, 140:234-240, 2001.
20. P Dayan. Motivated reinforcement learning. In NIPS 14, pages 11-18. MIT Press, 2001.