-
2
-
-
78649507911
-
A Bayesian sampling approach to exploration in reinforcement learning
-
Montreal, Canada
-
Asmuth, J., Li, L., Littman, M., Nouri, A., Wingate, D.: A Bayesian sampling approach to exploration in reinforcement learning. In: UAI, Montreal, Canada (2009)
-
(2009)
UAI
-
-
Asmuth, J.1
Li, L.2
Littman, M.3
Nouri, A.4
Wingate, D.5
-
3
-
-
23244432007
-
An integrative theory of locus coeruleus-norepinephrine function: Adaptive gain and optimal performance
-
Aston-Jones, G., Cohen, J.D.: An integrative theory of locus coeruleus-norepinephrine function: Adaptive gain and optimal performance. Annu. Rev. Neurosci. 28, 403-450 (2005)
-
(2005)
Annu. Rev. Neurosci.
, vol.28
, pp. 403-450
-
-
Aston-Jones, G.1
Cohen, J.D.2
-
4
-
-
0036568025
-
Finite-time analysis of the multiarmed bandit problem
-
Auer, P., Cesa-Bianchi, N., Fischer, P.: Finite-time analysis of the multiarmed bandit problem. Mach. Learn. 47(2), 235-256 (2002a)
-
(2002)
Mach. Learn.
, vol.47
, Issue.2
, pp. 235-256
-
-
Auer, P.1
Cesa-Bianchi, N.2
Fischer, P.3
-
5
-
-
0037709910
-
The nonstochastic multiarmed bandit problem
-
Auer, P., Cesa-Bianchi, N., Freund, Y., Schapire, R.: The nonstochastic multiarmed bandit problem. SIAM J. Comput. 32(1), 48-77 (2002b)
-
(2002)
SIAM J. Comput.
, vol.32
, Issue.1
, pp. 48-77
-
-
Auer, P.1
Cesa-Bianchi, N.2
Freund, Y.3
Schapire, R.4
-
6
-
-
28444472936
-
Neural bases of food-seeking: Affect, arousal and reward in corticostriatolimbic circuits
-
Balleine, B.W.: Neural bases of food-seeking: Affect, arousal and reward in corticostriatolimbic circuits. Physiol. Behav. 86(5), 717-730 (2005)
-
(2005)
Physiol. Behav.
, vol.86
, Issue.5
, pp. 717-730
-
-
Balleine, B.W.1
-
7
-
-
0028101302
-
Columnar organization in the midbrain periaqueductal gray: Modules for emotional expression?
-
Bandler, R., Shipley, M.T.: Columnar organization in the midbrain periaqueductal gray: Modules for emotional expression? Trends Neurosci. 17(9), 379-389 (1994)
-
(1994)
Trends Neurosci.
, vol.17
, Issue.9
, pp. 379-389
-
-
Bandler, R.1
Shipley, M.T.2
-
8
-
-
0000541213
-
Adaptive critics and the basal ganglia
-
Houk, J., Davis, J., Beiser, D. (eds.) MIT, Cambridge
-
Barto, A.: Adaptive critics and the basal ganglia. In: Houk, J., Davis, J., Beiser, D. (eds.) Models of Information Processing in the Basal Ganglia, pp. 215-232. MIT, Cambridge (1995)
-
(1995)
Models of Information Processing in the Basal Ganglia
, pp. 215-232
-
-
Barto, A.1
-
9
-
-
0141988716
-
Recent advances in hierarchical reinforcement learning
-
Barto, A., Mahadevan, S.: Recent advances in hierarchical reinforcement learning. Discr. Event Dyn. Syst. 13(4), 341-379 (2003)
-
(2003)
Discr. Event Dyn. Syst.
, vol.13
, Issue.4
, pp. 341-379
-
-
Barto, A.1
Mahadevan, S.2
-
10
-
-
33749651693
-
Intrinsically motivated learning of hierarchical collections of skills
-
La Jolla, CA
-
Barto, A., Singh, S., Chentanez, N.: Intrinsically motivated learning of hierarchical collections of skills. In: ICDL 2004, La Jolla, CA (2004)
-
(2004)
ICDL 2004
-
-
Barto, A.1
Singh, S.2
Chentanez, N.3
-
11
-
-
0020970738
-
Neuronlike elements that can solve difficult learning control problems
-
Barto, A., Sutton, R., Anderson, C.: Neuronlike elements that can solve difficult learning control problems. IEEE Trans. Syst. Man Cybern. 13(5), 834-846 (1983)
-
(1983)
IEEE Trans. Syst. Man Cybern.
, vol.13
, Issue.5
, pp. 834-846
-
-
Barto, A.1
Sutton, R.2
Anderson, C.3
-
12
-
-
84929046579
-
Intrinsic motivation and reinforcement learning
-
Baldassarre, G., Mirolli, M. (eds.) Springer, Berlin
-
Barto, A.G.: Intrinsic motivation and reinforcement learning. In: Baldassarre, G., Mirolli, M. (eds.) Intrinsically Motivated Learning in Natural and Artificial Systems, pp. 17-47. Springer, Berlin (2012)
-
(2012)
Intrinsically Motivated Learning in Natural and Artificial Systems
, pp. 17-47
-
-
Barto, A.G.1
-
13
-
-
84898936541
-
The infinite hidden Markov model
-
Vancouver, Canada
-
Beal, M., Ghahramani, Z., Rasmussen, C.: The infinite hidden Markov model. In: NIPS, pp. 577-584, Vancouver, Canada (2002)
-
(2002)
NIPS
, pp. 577-584
-
-
Beal, M.1
Ghahramani, Z.2
Rasmussen, C.3
-
14
-
-
34548295327
-
Learning the value of information in an uncertain world
-
Behrens, T.E.J., Woolrich, M.W., Walton, M.E., Rushworth, M.F.S.: Learning the value of information in an uncertain world. Nat. Neurosci. 10(9), 1214-1221 (2007)
-
(2007)
Nat. Neurosci.
, vol.10
, Issue.9
, pp. 1214-1221
-
-
Behrens, T.E.J.1
Woolrich, M.W.2
Walton, M.E.3
Rushworth, M.F.S.4
-
15
-
-
85012688561
-
-
Princeton University Press, Princeton
-
Bellman, R.E.: Dynamic Programming. Princeton University Press, Princeton (1957)
-
(1957)
Dynamic Programming
-
-
Bellman, R.E.1
-
16
-
-
2442701355
-
Motivation concepts in behavioral neuroscience
-
Berridge, K.C.: Motivation concepts in behavioral neuroscience. Physiol. Behav. 81, 179-209 (2004)
-
(2004)
Physiol. Behav.
, vol.81
, pp. 179-209
-
-
Berridge, K.C.1
-
18
-
-
0023800192
-
Ethoexperimental approaches to the biology of emotion
-
Blanchard, D.C., Blanchard, R.J.: Ethoexperimental approaches to the biology of emotion. Annu. Rev. Psychol. 39, 43-68 (1988)
-
(1988)
Annu. Rev. Psychol.
, vol.39
, pp. 43-68
-
-
Blanchard, D.C.1
Blanchard, R.J.2
-
19
-
-
13844281871
-
Bringing up robot: Fundamental mechanisms for creating a self-motivated, self-organizing architecture
-
Blank, D., Kumar, D., Meeden, L., Marshall, J.: Bringing up robot: Fundamental mechanisms for creating a self-motivated, self-organizing architecture. Cybern. Syst. 36(2), 125-150 (2005)
-
(2005)
Cybern. Syst.
, vol.36
, Issue.2
, pp. 125-150
-
-
Blank, D.1
Kumar, D.2
Meeden, L.3
Marshall, J.4
-
20
-
-
58149417523
-
Species-specific defense reactions and avoidance learning
-
Bolles, R.C.: Species-specific defense reactions and avoidance learning. Psychol. Rev. 77, 32-48 (1970)
-
(1970)
Psychol. Rev.
, vol.77
, pp. 32-48
-
-
Bolles, R.C.1
-
21
-
-
70350566799
-
Hierarchically organized behavior and its neural foundations: A reinforcement learning perspective
-
Botvinick, M.M., Niv, Y., Barto, A.G.: Hierarchically organized behavior and its neural foundations: A reinforcement learning perspective. Cognition 113(3), 262-280 (2009)
-
(2009)
Cognition
, vol.113
, Issue.3
, pp. 262-280
-
-
Botvinick, M.M.1
Niv, Y.2
Barto, A.G.3
-
22
-
-
78649651245
-
Opponency revisited: Competition and cooperation between dopamine and serotonin
-
Boureau, Y.-L., Dayan, P.: Opponency revisited: Competition and cooperation between dopamine and serotonin. Neuropsychopharmacology 36(1), 74-97 (2011)
-
(2011)
Neuropsychopharmacology
, vol.36
, Issue.1
, pp. 74-97
-
-
Boureau, Y.-L.1
Dayan, P.2
-
23
-
-
0041965975
-
R-max - A general polynomial time algorithm for near-optimal reinforcement learning
-
Brafman, R., Tennenholtz, M.: R-max - A general polynomial time algorithm for near-optimal reinforcement learning. J. Mach. Learn. Res. 3, 213-231 (2003)
-
(2003)
J. Mach. Learn. Res.
, vol.3
, pp. 213-231
-
-
Brafman, R.1
Tennenholtz, M.2
-
24
-
-
0000696066
-
The misbehavior of organisms
-
Breland, K., Breland, M.: The misbehavior of organisms. Am. Psychol. 16(9), 681-684 (1961)
-
(1961)
Am. Psychol.
, vol.16
, Issue.9
, pp. 681-684
-
-
Breland, K.1
Breland, M.2
-
25
-
-
0023981451
-
The ART of adaptive pattern recognition by a self-organizing neural network
-
Carpenter, G., Grossberg, S.: The ART of adaptive pattern recognition by a self-organizing neural network. Computer 21, 77-88 (1988)
-
(1988)
Computer
, vol.21
, pp. 77-88
-
-
Carpenter, G.1
Grossberg, S.2
-
26
-
-
0031189914
-
Multitask learning
-
Caruana, R.: Multitask learning. Mach. Learn. 28(1), 41-75 (1997)
-
(1997)
Mach. Learn.
, vol.28
, Issue.1
, pp. 41-75
-
-
Caruana, R.1
-
28
-
-
33750189183
-
Similarity and discrimination in classical conditioning: A latent variable account
-
Vancouver, Canada
-
Courville, A., Daw, N., Touretzky, D.: Similarity and discrimination in classical conditioning: A latent variable account. In: NIPS, pp. 313-320, Vancouver, Canada (2004)
-
(2004)
NIPS
, pp. 313-320
-
-
Courville, A.1
Daw, N.2
Touretzky, D.3
-
29
-
-
33646492363
-
The computational neurobiology of learning and reward
-
Daw, N.D., Doya, K.: The computational neurobiology of learning and reward. Curr. Opin. Neurobiol. 16(2), 199-204 (2006)
-
(2006)
Curr. Opin. Neurobiol.
, vol.16
, Issue.2
, pp. 199-204
-
-
Daw, N.D.1
Doya, K.2
-
30
-
-
0036592008
-
Opponent interactions between serotonin and dopamine
-
Daw, N.D., Kakade, S., Dayan, P.: Opponent interactions between serotonin and dopamine. Neural Netw. 15, 603-616 (2002)
-
(2002)
Neural Netw.
, vol.15
, pp. 603-616
-
-
Daw, N.D.1
Kakade, S.2
Dayan, P.3
-
31
-
-
28044450875
-
Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control
-
Daw, N.D., Niv, Y., Dayan, P.: Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nat. Neurosci. 8(12), 1704-1711 (2005)
-
(2005)
Nat. Neurosci.
, vol.8
, Issue.12
, pp. 1704-1711
-
-
Daw, N.D.1
Niv, Y.2
Dayan, P.3
-
32
-
-
33745223257
-
Cortical substrates for exploratory decisions in humans
-
Daw, N.D., O'Doherty, J.P., Dayan, P., Seymour, B., Dolan, R.J.: Cortical substrates for exploratory decisions in humans. Nature 441(7095), 876-879 (2006)
-
(2006)
Nature
, vol.441
, Issue.7095
, pp. 876-879
-
-
Daw, N.D.1
O'Doherty, J.P.2
Dayan, P.3
Seymour, B.4
Dolan, R.J.5
-
33
-
-
50549094930
-
Bilinearity, rules, and prefrontal cortex
-
Dayan, P.: Bilinearity, rules, and prefrontal cortex. Front. Comput. Neurosci. 1, 1 (2007)
-
(2007)
Front. Comput. Neurosci.
, vol.1
, pp. 1
-
-
Dayan, P.1
-
34
-
-
0001234682
-
Feudal reinforcement learning
-
Hanson, S.J., Cowan, J.D., Giles, C.L. (eds.) MIT, Cambridge
-
Dayan, P., Hinton, G.: Feudal reinforcement learning. In: Hanson, S.J., Cowan, J.D., Giles, C.L. (eds.) Advances in Neural Information Processing Systems (NIPS) 5. MIT, Cambridge (1993)
-
(1993)
Advances in Neural Information Processing Systems (NIPS) 5
-
-
Dayan, P.1
Hinton, G.2
-
35
-
-
40149109071
-
Serotonin, inhibition, and negative mood
-
Dayan, P., Huys, Q.J.M.: Serotonin, inhibition, and negative mood. PLoS Comput. Biol. 4(2), e4 (2008)
-
(2008)
PLoS Comput. Biol.
, vol.4
, Issue.2
, pp. e4
-
-
Dayan, P.1
Huys, Q.J.M.2
-
37
-
-
33749055062
-
The misbehavior of value and the discipline of the will
-
Dayan, P., Niv, Y., Seymour, B., Daw, N.D.: The misbehavior of value and the discipline of the will. Neural Netw. 19(8), 1153-1160 (2006)
-
(2006)
Neural Netw.
, vol.19
, Issue.8
, pp. 1153-1160
-
-
Dayan, P.1
Niv, Y.2
Seymour, B.3
Daw, N.D.4
-
38
-
-
0030260201
-
Exploration bonuses and dual control
-
Dayan, P., Sejnowski, T.: Exploration bonuses and dual control. Mach. Learn. 25(1), 5-22 (1996)
-
(1996)
Mach. Learn.
, vol.25
, Issue.1
, pp. 5-22
-
-
Dayan, P.1
Sejnowski, T.2
-
40
-
-
1142281527
-
Model based Bayesian exploration
-
Stockholm, Sweden
-
Dearden, R., Friedman, N., Andre, D.: Model based Bayesian exploration. In: UAI, pp. 150-159, Stockholm, Sweden (1999)
-
(1999)
UAI
, pp. 150-159
-
-
Dearden, R.1
Friedman, N.2
Andre, D.3
-
43
-
-
0043250430
-
The role of learning in motivation
-
Gallistel, C. (ed.) Wiley, New York
-
Dickinson, A., Balleine, B.: The role of learning in motivation. In: Gallistel, C. (ed.) Stevens' Handbook of Experimental Psychology, Vol. 3, pp. 497-533. Wiley, New York (2002)
-
(2002)
Stevens' Handbook of Experimental Psychology
, vol.3
, pp. 497-533
-
-
Dickinson, A.1
Balleine, B.2
-
44
-
-
0001806701
-
The MAXQ method for hierarchical reinforcement learning
-
Madison, Wisconsin
-
Dietterich, T.: The MAXQ method for hierarchical reinforcement learning. In: ICML, pp. 118-126, Madison, Wisconsin (1998)
-
(1998)
ICML
, pp. 118-126
-
-
Dietterich, T.1
-
45
-
-
0002278788
-
Hierarchical reinforcement learning with the MAXQ value function decomposition
-
Dietterich, T.: Hierarchical reinforcement learning with the MAXQ value function decomposition. J. Artif. Intell. Res. 13(1), 227-303 (2000)
-
(2000)
J. Artif. Intell. Res.
, vol.13
, Issue.1
, pp. 227-303
-
-
Dietterich, T.1
-
46
-
-
0036592023
-
Metalearning and neuromodulation
-
Doya, K.: Metalearning and neuromodulation. Neural Netw. 15(4-6), 495-506 (2002)
-
(2002)
Neural Netw.
, vol.15
, Issue.4-6
, pp. 495-506
-
-
Doya, K.1
-
47
-
-
0036618011
-
Multiple model-based reinforcement learning
-
Doya, K., Samejima, K., Katagiri, K., Kawato, M.: Multiple model-based reinforcement learning. Neural Comput. 14(6), 1347-1369 (2002)
-
(2002)
Neural Comput.
, vol.14
, Issue.6
, pp. 1347-1369
-
-
Doya, K.1
Samejima, K.2
Katagiri, K.3
Kawato, M.4
-
49
-
-
0036832959
-
Structure in the space of value functions
-
Foster, D., Dayan, P.: Structure in the space of value functions. Mach. Learn. 49(2), 325-346 (2002)
-
(2002)
Mach. Learn.
, vol.49
, Issue.2
, pp. 325-346
-
-
Foster, D.1
Dayan, P.2
-
50
-
-
84900513897
-
Learning to selectively attend
-
Portland, Oregon
-
Gershman, S., Cohen, J., Niv, Y.: Learning to selectively attend. In: Proceedings of the 32nd Annual Conference of the Cognitive Science Society, Portland, Oregon (2010a)
-
(2010)
Proceedings of the 32nd Annual Conference of the Cognitive Science Society
-
-
Gershman, S.1
Cohen, J.2
Niv, Y.3
-
51
-
-
77952541839
-
Learning latent structure: Carving nature at its joints
-
Gershman, S., Niv, Y.: Learning latent structure: Carving nature at its joints. Curr. Opin. Neurobiol. (2010)
-
(2010)
Curr. Opin. Neurobiol.
-
-
Gershman, S.1
Niv, Y.2
-
52
-
-
74049117596
-
Context, learning, and extinction
-
Gershman, S.J., Blei, D.M., Niv, Y.: Context, learning, and extinction. Psychol. Rev. 117(1), 197-209 (2010b)
-
(2010)
Psychol. Rev.
, vol.117
, Issue.1
, pp. 197-209
-
-
Gershman, S.J.1
Blei, D.M.2
Niv, Y.3
-
54
-
-
0010966147
-
Rats learn the relationship between responding and environmental events: An expansion of the learned helplessness hypothesis
-
Goodkin, F.: Rats learn the relationship between responding and environmental events: An expansion of the learned helplessness hypothesis. Learn. Motiv. 7, 382-393 (1976)
-
(1976)
Learn. Motiv.
, vol.7
, pp. 382-393
-
-
Goodkin, F.1
-
57
-
-
34548566262
-
Towards an executive without a homunculus: Computational models of the prefrontal cortex/basal ganglia system
-
Hazy, T.E., Frank, M.J., O'Reilly, R.C.: Towards an executive without a homunculus: Computational models of the prefrontal cortex/basal ganglia system. Philos. Trans. R. Soc. Lond. B Biol. Sci. 362(1485), 1601-1613 (2007)
-
(2007)
Philos. Trans. R. Soc. Lond. B Biol. Sci.
, vol.362
, Issue.1485
, pp. 1601-1613
-
-
Hazy, T.E.1
Frank, M.J.2
O'Reilly, R.C.3
-
58
-
-
0034031837
-
Multiple forms of short-term plasticity at excitatory synapses in rat medial prefrontal cortex
-
Hempel, C.M., Hartman, K.H., Wang, X.J., Turrigiano, G.G., Nelson, S.B.: Multiple forms of short-term plasticity at excitatory synapses in rat medial prefrontal cortex. J. Neurophysiol. 83(5), 3031-3041 (2000)
-
(2000)
J. Neurophysiol.
, vol.83
, Issue.5
, pp. 3031-3041
-
-
Hempel, C.M.1
Hartman, K.H.2
Wang, X.J.3
Turrigiano, G.G.4
Nelson, S.B.5
-
59
-
-
0022979089
-
An approach through the looking-glass
-
Hershberger, W.A.: An approach through the looking-glass. Anim. Learn. Behav. 14, 443-451 (1986)
-
(1986)
Anim. Learn. Behav.
, vol.14
, pp. 443-451
-
-
Hershberger, W.A.1
-
60
-
-
0029652445
-
The "wake-sleep" algorithm for unsupervised neural networks
-
Hinton, G.E., Dayan, P., Frey, B.J., Neal, R.M.: The "wake-sleep" algorithm for unsupervised neural networks. Science 268(5214), 1158-1161 (1995)
-
(1995)
Science
, vol.268
, Issue.5214
, pp. 1158-1161
-
-
Hinton, G.E.1
Dayan, P.2
Frey, B.J.3
Neal, R.M.4
-
61
-
-
0031590130
-
Generative models for discovering sparse distributed representations
-
Hinton, G.E., Ghahramani, Z.: Generative models for discovering sparse distributed representations. Philos. Trans. R. Soc. Lond. B Biol. Sci. 352(1358), 1177-1190 (1997)
-
(1997)
Philos. Trans. R. Soc. Lond. B Biol. Sci.
, vol.352
, Issue.1358
, pp. 1177-1190
-
-
Hinton, G.E.1
Ghahramani, Z.2
-
62
-
-
33746600649
-
Reducing the dimensionality of data with neural networks
-
Hinton, G.E., Salakhutdinov, R.R.: Reducing the dimensionality of data with neural networks. Science 313(5786), 504-507 (2006)
-
(2006)
Science
, vol.313
, Issue.5786
, pp. 504-507
-
-
Hinton, G.E.1
Salakhutdinov, R.R.2
-
63
-
-
0031747058
-
Amount of training affects associatively-activated event representation
-
Holland, P.: Amount of training affects associatively-activated event representation. Neuropharmacology 37(4-5), 461-469 (1998)
-
(1998)
Neuropharmacology
, vol.37
, Issue.4-5
, pp. 461-469
-
-
Holland, P.1
-
64
-
-
0034061668
-
Mesolimbocortical and nigrostriatal dopamine responses to salient non-reward events
-
Horvitz, J.C.: Mesolimbocortical and nigrostriatal dopamine responses to salient non-reward events. Neuroscience 96(4), 651-656 (2000)
-
(2000)
Neuroscience
, vol.96
, Issue.4
, pp. 651-656
-
-
Horvitz, J.C.1
-
65
-
-
0030757872
-
Burst activity of ventral tegmental dopamine neurons is elicited by sensory stimuli in the awake cat
-
Horvitz, J.C., Stewart, T., Jacobs, B.L.: Burst activity of ventral tegmental dopamine neurons is elicited by sensory stimuli in the awake cat. Brain Res. 759(2), 251-258 (1997)
-
(1997)
Brain Res.
, vol.759
, Issue.2
, pp. 251-258
-
-
Horvitz, J.C.1
Stewart, T.2
Jacobs, B.L.3
-
67
-
-
34447328072
-
Inherent value systems for autonomous mental development
-
Huang, X., Weng, J.: Inherent value systems for autonomous mental development. Int. J. Human. Robot. 4, 407-433 (2007)
-
(2007)
Int. J. Human. Robot.
, vol.4
, pp. 407-433
-
-
Huang, X.1
Weng, J.2
-
69
-
-
67651041654
-
Reinforcers and control
-
Ph.D. Thesis, Gatsby Computational Neuroscience Unit, UCL
-
Huys, Q.: Reinforcers and control. Towards a computational ætiology of depression. Ph.D. Thesis, Gatsby Computational Neuroscience Unit, UCL (2007)
-
(2007)
Towards a Computational ætiology of Depression
-
-
Huys, Q.1
-
70
-
-
70350570499
-
A Bayesian formulation of behavioral control
-
Huys, Q.J.M., Dayan, P.: A Bayesian formulation of behavioral control. Cognition 113, 314-328 (2009)
-
(2009)
Cognition
, vol.113
, pp. 314-328
-
-
Huys, Q.J.M.1
Dayan, P.2
-
71
-
-
0036592028
-
Control of exploitation-exploration meta-parameter in reinforcement learning
-
Ishii, S., Yoshida, W., Yoshimoto, J.: Control of exploitation-exploration meta-parameter in reinforcement learning. Neural Netw. 15(4-6), 665-687 (2002)
-
(2002)
Neural Netw.
, vol.15
, Issue.4-6
, pp. 665-687
-
-
Ishii, S.1
Yoshida, W.2
Yoshimoto, J.3
-
72
-
-
0032073263
-
Planning and acting in partially observable stochastic domains
-
Kaelbling, L., Littman, M., Cassandra, A.: Planning and acting in partially observable stochastic domains. Artif. Intell. 101(1-2), 99-134 (1998)
-
(1998)
Artif. Intell.
, vol.101
, Issue.1-2
, pp. 99-134
-
-
Kaelbling, L.1
Littman, M.2
Cassandra, A.3
-
73
-
-
0036592029
-
Dopamine: Generalization and bonuses
-
Kakade, S., Dayan, P.: Dopamine: Generalization and bonuses. Neural Netw. 15(4-6), 549-559 (2002)
-
(2002)
Neural Netw.
, vol.15
, Issue.4-6
, pp. 549-559
-
-
Kakade, S.1
Dayan, P.2
-
74
-
-
0036832954
-
Near-optimal reinforcement learning in polynomial time
-
Kearns, M., Singh, S.: Near-optimal reinforcement learning in polynomial time. Mach. Learn. 49(2), 209-232 (2002)
-
(2002)
Mach. Learn.
, vol.49
, Issue.2
, pp. 209-232
-
-
Kearns, M.1
Singh, S.2
-
75
-
-
0035712679
-
Parallel circuits mediating distinct emotional coping reactions to different types of stress
-
Keay, K.A., Bandler, R.: Parallel circuits mediating distinct emotional coping reactions to different types of stress. Neurosci. Biobehav. Rev. 25(7-8), 669-678 (2001)
-
(2001)
Neurosci. Biobehav. Rev.
, vol.25
, Issue.7-8
, pp. 669-678
-
-
Keay, K.A.1
Bandler, R.2
-
76
-
-
0037382264
-
Coordination of actions and habits in the medial prefrontal cortex of rats
-
Killcross, S., Coutureau, E.: Coordination of actions and habits in the medial prefrontal cortex of rats. Cereb. Cortex 13(4), 400-408 (2003)
-
(2003)
Cereb. Cortex
, vol.13
, Issue.4
, pp. 400-408
-
-
Killcross, S.1
Coutureau, E.2
-
77
-
-
84880873347
-
Building portable options: Skill transfer in reinforcement learning
-
Hyderabad, India
-
Konidaris, G., Barto, A.: Building portable options: Skill transfer in reinforcement learning. In: IJCAI, pp. 895-900, Hyderabad, India (2007)
-
(2007)
IJCAI
, pp. 895-900
-
-
Konidaris, G.1
Barto, A.2
-
78
-
-
78751681641
-
Efficient skill learning using abstraction selection
-
Pasadena, California
-
Konidaris, G., Barto, A.: Efficient skill learning using abstraction selection. In: IJCAI, pp. 1107-1112, Pasadena, California (2009)
-
(2009)
IJCAI
, pp. 1107-1112
-
-
Konidaris, G.1
Barto, A.2
-
79
-
-
59649113160
-
Flexible shaping: How learning in small steps helps
-
Krueger, K.A., Dayan, P.: Flexible shaping: How learning in small steps helps. Cognition 110(3), 380-394 (2009)
-
(2009)
Cognition
, vol.110
, Issue.3
, pp. 380-394
-
-
Krueger, K.A.1
Dayan, P.2
-
81
-
-
0037840849
-
On the undecidability of probabilistic planning and related stochastic optimization problems
-
Madani, O., Hanks, S., Condon, A.: On the undecidability of probabilistic planning and related stochastic optimization problems. Artif. Intell. 147(1-2), 5-34 (2003)
-
(2003)
Artif. Intell.
, vol.147
, Issue.1-2
, pp. 5-34
-
-
Madani, O.1
Hanks, S.2
Condon, A.3
-
82
-
-
33846261103
-
Behavioral control, the medial prefrontal cortex, and resilience
-
Maier, S.F., Amat, J., Baratta, M.V., Paul, E., Watkins, L.R.: Behavioral control, the medial prefrontal cortex, and resilience. Dialogues Clin. Neurosci. 8(4), 397-406 (2006)
-
(2006)
Dialogues Clin. Neurosci.
, vol.8
, Issue.4
, pp. 397-406
-
-
Maier, S.F.1
Amat, J.2
Baratta, M.V.3
Paul, E.4
Watkins, L.R.5
-
83
-
-
19844365569
-
Stressor controllability and learned helplessness: The roles of the dorsal raphe nucleus, serotonin, and corticotropin-releasing factor
-
Maier, S.F., Watkins, L.R.: Stressor controllability and learned helplessness: The roles of the dorsal raphe nucleus, serotonin, and corticotropin-releasing factor. Neurosci. Biobehav. Rev. 29(4-5), 829-841 (2005)
-
(2005)
Neurosci. Biobehav. Rev.
, vol.29
, Issue.4-5
, pp. 829-841
-
-
Maier, S.F.1
Watkins, L.R.2
-
84
-
-
3042590043
-
A two-dimensional neuropsychology of defense: Fear/anxiety and defensive distance
-
McNaughton, N., Corr, P.J.: A two-dimensional neuropsychology of defense: Fear/anxiety and defensive distance. Neurosci. Biobehav. Rev. 28(3), 285-305 (2004)
-
(2004)
Neurosci. Biobehav. Rev.
, vol.28
, Issue.3
, pp. 285-305
-
-
McNaughton, N.1
Corr, P.J.2
-
85
-
-
84906736869
-
Functions and mechanisms of intrinsic motivations: The knowledge versus competence distinction
-
Baldassarre, G., Mirolli, M. (eds.) Springer, Berlin
-
Mirolli, M., Baldassarre, G.: Functions and mechanisms of intrinsic motivations: The knowledge versus competence distinction. In: Baldassarre, G., Mirolli, M. (eds.) Intrinsically Motivated Learning in Natural and Artificial Systems, pp. 49-72. Springer, Berlin (2012)
-
(2012)
Intrinsically Motivated Learning in Natural and Artificial Systems
, pp. 49-72
-
-
Mirolli, M.1
Baldassarre, G.2
-
86
-
-
40849102598
-
Synaptic theory of working memory
-
Mongillo, G., Barak, O., Tsodyks, M.: Synaptic theory of working memory. Science 319(5869), 1543-1546 (2008)
-
(2008)
Science
, vol.319
, Issue.5869
, pp. 1543-1546
-
-
Mongillo, G.1
Barak, O.2
Tsodyks, M.3
-
87
-
-
0029981543
-
A framework for mesencephalic dopamine systems based on predictive Hebbian learning
-
Montague, P.R., Dayan, P., Sejnowski, T.J.: A framework for mesencephalic dopamine systems based on predictive Hebbian learning. J. Neurosci. 16(5), 1936-1947 (1996)
-
(1996)
J. Neurosci.
, vol.16
, Issue.5
, pp. 1936-1947
-
-
Montague, P.R.1
Dayan, P.2
Sejnowski, T.J.3
-
88
-
-
77950032550
-
Markov chain sampling methods for Dirichlet process mixture models
-
Neal, R.: Markov chain sampling methods for Dirichlet process mixture models. J. Comput. Graph. Stat. 9(2), 249-265 (2000)
-
(2000)
J. Comput. Graph. Stat.
, vol.9
, Issue.2
, pp. 249-265
-
-
Neal, R.1
-
89
-
-
0141596576
-
Policy invariance under reward transformations: Theory and application to reward shaping
-
Bled, Slovenia
-
Ng, A., Harada, D., Russell, S.: Policy invariance under reward transformations: Theory and application to reward shaping. In: ICML, pp. 278-287, Bled, Slovenia (1999)
-
(1999)
ICML
, pp. 278-287
-
-
Ng, A.1
Harada, D.2
Russell, S.3
-
90
-
-
84858776393
-
Multi-resolution exploration in continuous spaces
-
Nouri, A., Littman, M.: Multi-resolution exploration in continuous spaces. In: NIPS, pp. 1209-1216 (2009)
-
(2009)
NIPS
, pp. 1209-1216
-
-
Nouri, A.1
Littman, M.2
-
91
-
-
33644927837
-
Making working memory work: A computational model of learning in the prefrontal cortex and basal ganglia
-
O'Reilly, R.C., Frank, M.J.: Making working memory work: A computational model of learning in the prefrontal cortex and basal ganglia. Neural Comput. 18(2), 283-328 (2006)
-
(2006)
Neural Comput.
, vol.18
, Issue.2
, pp. 283-328
-
-
O'Reilly, R.C.1
Frank, M.J.2
-
92
-
-
34047267520
-
Intrinsic motivation systems for autonomous mental development
-
Oudeyer, P., Kaplan, F., Hafner, V.: Intrinsic motivation systems for autonomous mental development. IEEE Trans. Evol. Comput. 11(2), 265-286 (2007)
-
(2007)
IEEE Trans. Evol. Comput.
, vol.11
, Issue.2
, pp. 265-286
-
-
Oudeyer, P.1
Kaplan, F.2
Hafner, V.3
-
93
-
-
33748408630
-
Affective neuroscience
-
New York
-
Panksepp, J.: Affective Neuroscience. OUP, New York (1998)
-
(1998)
OUP
-
-
Panksepp, J.1
-
94
-
-
0000977910
-
The complexity of Markov decision processes
-
Papadimitriou, C., Tsitsiklis, J.: The complexity of Markov decision processes. Math. Oper. Res. 12(3), 441-450 (1987)
-
(1987)
Math. Oper. Res.
, vol.12
, Issue.3
, pp. 441-450
-
-
Papadimitriou, C.1
Tsitsiklis, J.2
-
95
-
-
84898956770
-
Reinforcement learning with hierarchies of machines
-
Denver, Colorado
-
Parr, R., Russell, S.: Reinforcement learning with hierarchies of machines. In: NIPS, pp. 1043-1049, Denver, Colorado (1998)
-
(1998)
NIPS
, pp. 1043-1049
-
-
Parr, R.1
Russell, S.2
-
96
-
-
33749251297
-
An analytic solution to discrete Bayesian reinforcement learning
-
Pittsburgh, Pennsylvania
-
Poupart, P., Vlassis, N., Hoey, J., Regan, K.: An analytic solution to discrete Bayesian reinforcement learning. In: ICML, pp. 697-704, Pittsburgh, Pennsylvania (2006)
-
(2006)
ICML
, pp. 697-704
-
-
Poupart, P.1
Vlassis, N.2
Hoey, J.3
Regan, K.4
-
97
-
-
0012586376
-
-
MIT, Cambridge
-
Rao, R.P.N., Olshausen, B.A., Lewicki, M.S. (eds.): Probabilistic Models of the Brain: Perception and Neural Function. MIT, Cambridge (2002)
-
(2002)
Probabilistic Models of the Brain: Perception and Neural Function
-
-
Rao, R.P.N.1
Olshausen, B.A.2
Lewicki, M.S.3
-
98
-
-
84929046152
-
The role of the basal ganglia in discovering novel actions
-
Baldassarre, G., Mirolli, M. (eds.) Springer, Berlin
-
Redgrave, P., Gurney, K., Stafford, T., Thirkettle, M., Lewis, J.: The role of the basal ganglia in discovering novel actions. In: Baldassarre, G., Mirolli, M. (eds.) Intrinsically Motivated Learning in Natural and Artificial Systems, pp. 129-149. Springer, Berlin (2012)
-
(2012)
Intrinsically Motivated Learning in Natural and Artificial Systems
, pp. 129-149
-
-
Redgrave, P.1
Gurney, K.2
Stafford, T.3
Thirkettle, M.4
Lewis, J.5
-
99
-
-
0033119561
-
Is the short-latency dopamine response too short to signal reward error?
-
Redgrave, P., Prescott, T.J., Gurney, K.: Is the short-latency dopamine response too short to signal reward error? Trends Neurosci. 22(4), 146-151 (1999)
-
(1999)
Trends Neurosci.
, vol.22
, Issue.4
, pp. 146-151
-
-
Redgrave, P.1
Prescott, T.J.2
Gurney, K.3
-
100
-
-
0035341482
-
Fear and feeding in the nucleus accumbens shell: Rostrocaudal segregation of GABA-elicited defensive behavior versus eating behavior
-
Reynolds, S.M., Berridge, K.C.: Fear and feeding in the nucleus accumbens shell: Rostrocaudal segregation of GABA-elicited defensive behavior versus eating behavior. J. Neurosci. 21(9), 3261-3270 (2001)
-
(2001)
J. Neurosci.
, vol.21
, Issue.9
, pp. 3261-3270
-
-
Reynolds, S.M.1
Berridge, K.C.2
-
101
-
-
0037104732
-
Positive and negative motivation in nucleus accumbens shell: Bivalent rostrocaudal gradients for GABA-elicited eating, taste "liking"/"disliking" reactions, place preference/avoidance, and fear
-
Reynolds, S.M., Berridge, K.C.: Positive and negative motivation in nucleus accumbens shell: Bivalent rostrocaudal gradients for GABA-elicited eating, taste "liking"/"disliking" reactions, place preference/avoidance, and fear. J. Neurosci. 22(16), 7308-7320 (2002)
-
(2002)
J. Neurosci.
, vol.22
, Issue.16
, pp. 7308-7320
-
-
Reynolds, S.M.1
Berridge, K.C.2
-
102
-
-
41149151266
-
Emotional environments retune the valence of appetitive versus fearful functions in nucleus accumbens
-
Reynolds, S.M., Berridge, K.C.: Emotional environments retune the valence of appetitive versus fearful functions in nucleus accumbens. Nat. Neurosci. 11(4), 423-425 (2008)
-
(2008)
Nat. Neurosci.
, vol.11
, Issue.4
, pp. 423-425
-
-
Reynolds, S.M.1
Berridge, K.C.2
-
103
-
-
0031189347
-
CHILD: A first step towards continual learning
-
Ring, M.: CHILD: A first step towards continual learning. Mach. Learn. 28(1), 77-104 (1997)
-
(1997)
Mach. Learn.
, vol.28
, Issue.1
, pp. 77-104
-
-
Ring, M.1
-
104
-
-
84929054210
-
Toward a formal framework for continual learning
-
Whistler, Canada
-
Ring, M.: Toward a formal framework for continual learning. In: NIPS Workshop on Inductive Transfer, Whistler, Canada (2005)
-
(2005)
NIPS Workshop on Inductive Transfer
-
-
Ring, M.1
-
105
-
-
41149161631
-
Choice, uncertainty and value in prefrontal and cingulate cortex
-
Rushworth, M.F.S., Behrens, T.E.J.: Choice, uncertainty and value in prefrontal and cingulate cortex. Nat. Neurosci. 11(4), 389-397 (2008)
-
(2008)
Nat. Neurosci.
, vol.11
, Issue.4
, pp. 389-397
-
-
Rushworth, M.F.S.1
Behrens, T.E.J.2
-
106
-
-
0002209063
-
Intrinsic and extrinsic motivations: Classic definitions and new directions
-
Ryan, R., Deci, E.: Intrinsic and extrinsic motivations: Classic definitions and new directions. Contemp. Educ. Psychol. 25(1), 54-67 (2000)
-
(2000)
Contemp. Educ. Psychol.
, vol.25
, Issue.1
, pp. 54-67
-
-
Ryan, R.1
Deci, E.2
-
107
-
-
0742324926
-
Inter-module credit assignment in modular reinforcement learning
-
Samejima, K., Doya, K., Kawato, M.: Inter-module credit assignment in modular reinforcement learning. Neural Netw. 16(7), 985-994 (2003)
-
(2003)
Neural Netw.
, vol.16
, Issue.7
, pp. 985-994
-
-
Samejima, K.1
Doya, K.2
Kawato, M.3
-
108
-
-
0001201756
-
Some studies in machine learning using the game of checkers
-
Samuel, A.: Some studies in machine learning using the game of checkers. IBM J. Res. Dev. 3, 210-229 (1959)
-
(1959)
IBM J. Res. Dev.
, vol.3
, pp. 210-229
-
-
Samuel, A.1
-
109
-
-
79958838807
-
Evolving childhood's length and learning parameters in an intrinsically motivated reinforcement learning robot
-
Piscataway, New Jersey
-
Schembri, M., Mirolli, M., Baldassarre, G.: Evolving childhood's length and learning parameters in an intrinsically motivated reinforcement learning robot. In: Proceedings of the Seventh International Conference on Epigenetic Robotics, pp. 141-148, Piscataway, New Jersey (2007)
-
(2007)
Proceedings of the Seventh International Conference on Epigenetic Robotics
, pp. 141-148
-
-
Schembri, M.1
Mirolli, M.2
Baldassarre, G.3
-
110
-
-
0026306990
-
Curious model-building control systems
-
Seattle, Washington State IEEE
-
Schmidhuber, J.: Curious model-building control systems. In: IJCNN, pp. 1458-1463, Seattle, Washington State IEEE (1991)
-
(1991)
IJCNN
, pp. 1458-1463
-
-
Schmidhuber, J.1
-
111
-
-
84880251870
-
Gödel machines: Fully self-referential optimal universal self-improvers
-
Schmidhuber, J.: Gödel machines: Fully self-referential optimal universal self-improvers. Artif. Gen. Intell., pp. 199-226 (2006)
-
(2006)
Artif. Gen. Intell.
, pp. 199-226
-
-
Schmidhuber, J.1
-
112
-
-
70349334569
-
Ultimate cognition à la Gödel
-
Schmidhuber, J.: Ultimate cognition à la Gödel. Cogn. Comput. 1, 117-193 (2009)
-
(2009)
Cogn. Comput.
, vol.1
, pp. 117-193
-
-
Schmidhuber, J.1
-
114
-
-
0002193484
-
Relation between classical conditioning and instrumental learning
-
Prokasy, W. (ed.) Appleton-Century-Crofts, New York
-
Sheffield, F.: Relation between classical conditioning and instrumental learning. In: Prokasy, W. (ed.) Classical Conditioning, pp. 302-322. Appleton-Century-Crofts, New York (1965)
-
(1965)
Classical Conditioning
, pp. 302-322
-
-
Sheffield, F.1
-
115
-
-
33749261645
-
An intrinsic reward mechanism for efficient exploration
-
Pittsburgh, Pennsylvania
-
Şimşek, Ö., Barto, A.G.: An intrinsic reward mechanism for efficient exploration. In: ICML, pp. 833-840, Pittsburgh, Pennsylvania (2006)
-
(2006)
ICML
, pp. 833-840
-
-
Şimşek, Ö.1
Barto, A.G.2
-
116
-
-
0001027894
-
Transfer of learning by composing solutions of elemental sequential tasks
-
Singh, S.: Transfer of learning by composing solutions of elemental sequential tasks. Mach. Learn. 8(3), 323-339 (1992)
-
(1992)
Mach. Learn.
, vol.8
, Issue.3
, pp. 323-339
-
-
Singh, S.1
-
117
-
-
84899031920
-
Intrinsically motivated reinforcement learning
-
Vancouver, Canada
-
Singh, S., Barto, A., Chentanez, N.: Intrinsically motivated reinforcement learning. In: NIPS, pp. 1281-1288, Vancouver, Canada (2005)
-
(2005)
NIPS
, pp. 1281-1288
-
-
Singh, S.1
Barto, A.2
Chentanez, N.3
-
118
-
-
0030240189
-
A guide to constructs of control
-
Skinner, E.A.: A guide to constructs of control. J. Pers. Soc. Psychol. 71(3), 549-570 (1996)
-
(1996)
J. Pers. Soc. Psychol.
, vol.71
, Issue.3
, pp. 549-570
-
-
Skinner, E.A.1
-
119
-
-
33646230819
-
Dopamine, prediction error and associative learning: A model-based account
-
Smith, A., Li, M., Becker, S., Kapur, S.: Dopamine, prediction error and associative learning: A model-based account. Network 17(1), 61-84 (2006)
-
(2006)
Network
, vol.17
, Issue.1
, pp. 61-84
-
-
Smith, A.1
Li, M.2
Becker, S.3
Kapur, S.4
-
120
-
-
0001425882
-
Reconciling the role of central serotonin neurons in human and animal behaviour
-
Soubrié, P.: Reconciling the role of central serotonin neurons in human and animal behaviour. Behav. Brain Sci. 9, 319-364 (1986)
-
(1986)
Behav. Brain Sci.
, vol.9
, pp. 319-364
-
-
Soubrié, P.1
-
121
-
-
14344258433
-
A Bayesian framework for reinforcement learning
-
Stanford, California
-
Strens, M.: A Bayesian framework for reinforcement learning. In: ICML, pp. 943-950, Stanford, California (2000)
-
(2000)
ICML
, pp. 943-950
-
-
Strens, M.1
-
122
-
-
0032930935
-
A neural network model with dopamine-like reinforcement signal that learns a spatial delayed response task
-
Suri, R.E., Schultz, W.: A neural network model with dopamine-like reinforcement signal that learns a spatial delayed response task. Neuroscience 91(3), 871-890 (1999)
-
(1999)
Neuroscience
, vol.91
, Issue.3
, pp. 871-890
-
-
Suri, R.E.1
Schultz, W.2
-
123
-
-
33847202724
-
Learning to predict by the methods of temporal differences
-
Sutton, R.: Learning to predict by the methods of temporal differences. Mach. Learn. 3(1), 9-44 (1988)
-
(1988)
Mach. Learn.
, vol.3
, Issue.1
, pp. 9-44
-
-
Sutton, R.1
-
124
-
-
85132026293
-
Integrated architectures for learning, planning, and reacting based on approximating dynamic programming
-
Sutton, R.: Integrated architectures for learning, planning, and reacting based on approximating dynamic programming. In: ICML, pp. 216-224, Austin, Texas (1990)
-
(1990)
ICML
, pp. 216-224
-
-
Sutton, R.1
-
125
-
-
0033170372
-
Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning
-
Sutton, R., Precup, D., Singh, S.: Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artif. Intell. 112(1), 181-211 (1999)
-
(1999)
Artif. Intell.
, vol.112
, Issue.1
, pp. 181-211
-
-
Sutton, R.1
Precup, D.2
Singh, S.3
-
127
-
-
56049088540
-
Multitask reinforcement learning on the distribution of MDPs
-
Tanaka, F., Yamamura, M.: Multitask reinforcement learning on the distribution of MDPs. IEEJ Trans. Electron. Inform. Syst. C 123(5), 1004-1011 (2003)
-
(2003)
IEEJ Trans. Electron. Inform. Syst. C
, vol.123
, Issue.5
, pp. 1004-1011
-
-
Tanaka, F.1
Yamamura, M.2
-
128
-
-
33749249312
-
Hierarchical Dirichlet processes
-
Teh, Y., Jordan, M., Beal, M., Blei, D.: Hierarchical Dirichlet processes. J. Am. Stat. Assoc. 101(476), 1566-1581 (2006)
-
(2006)
J. Am. Stat. Assoc.
, vol.101
, Issue.476
, pp. 1566-1581
-
-
Teh, Y.1
Jordan, M.2
Beal, M.3
Blei, D.4
-
129
-
-
33746260413
-
Theory-based Bayesian models of inductive learning and reasoning
-
Tenenbaum, J., Griffiths, T., Kemp, C.: Theory-based Bayesian models of inductive learning and reasoning. Trends Cogn. Sci. 10(7), 309-318 (2006)
-
(2006)
Trends Cogn. Sci.
, vol.10
, Issue.7
, pp. 309-318
-
-
Tenenbaum, J.1
Griffiths, T.2
Kemp, C.3
-
130
-
-
84862302350
-
Hierarchical beta processes and the Indian buffet process
-
San Juan, Puerto Rico
-
Thibaux, R., Jordan, M.: Hierarchical beta processes and the Indian buffet process. In: AIStats, pp. 564-571, San Juan, Puerto Rico (2007)
-
(2007)
AIStats
, pp. 564-571
-
-
Thibaux, R.1
Jordan, M.2
-
132
-
-
33749882712
-
Finding structure in reinforcement learning
-
Denver, Colorado
-
Thrun, S., Schwartz, A.: Finding structure in reinforcement learning. In: NIPS, pp. 385-392, Denver, Colorado (1995)
-
(1995)
NIPS
, pp. 385-392
-
-
Thrun, S.1
Schwartz, A.2
-
133
-
-
58149442669
-
Cognitive maps in rats and men
-
Tolman, E.C.: Cognitive maps in rats and men. Psychol. Rev. 55(4), 189-208 (1948)
-
(1948)
Psychol. Rev.
, vol.55
, Issue.4
, pp. 189-208
-
-
Tolman, E.C.1
-
134
-
-
66449119919
-
A specific role for posterior dorsolateral striatum in human habit learning
-
Tricomi, E., Balleine, B.W., O'Doherty, J.P.: A specific role for posterior dorsolateral striatum in human habit learning. Eur. J. Neurosci. 29(11), 2225-2232 (2009)
-
(2009)
Eur. J. Neurosci.
, vol.29
, Issue.11
, pp. 2225-2232
-
-
Tricomi, E.1
Balleine, B.W.2
O'Doherty, J.P.3
-
135
-
-
34247147767
-
Determining the neural substrates of goal-directed learning in the human brain
-
Valentin, V.V., Dickinson, A., O'Doherty, J.P.: Determining the neural substrates of goal-directed learning in the human brain. J. Neurosci. 27(15), 4019-4026 (2007)
-
(2007)
J. Neurosci.
, vol.27
, Issue.15
, pp. 4019-4026
-
-
Valentin, V.V.1
Dickinson, A.2
O'Doherty, J.P.3
-
136
-
-
63149146163
-
Learning flexible sensori-motor mappings in a complex network
-
Vasilaki, E., Fusi, S., Wang, X.-J., Senn, W.: Learning flexible sensori-motor mappings in a complex network. Biol. Cybern. 100(2), 147-158 (2009)
-
(2009)
Biol. Cybern.
, vol.100
, Issue.2
, pp. 147-158
-
-
Vasilaki, E.1
Fusi, S.2
Wang, X.-J.3
Senn, W.4
-
137
-
-
31844436266
-
Bayesian sparse sampling for on-line reward optimization
-
Bonn, Germany
-
Wang, T., Lizotte, D., Bowling, M., Schuurmans, D.: Bayesian sparse sampling for on-line reward optimization. In: ICML, pp. 956-963, Bonn, Germany (2005)
-
(2005)
ICML
, pp. 956-963
-
-
Wang, T.1
Lizotte, D.2
Bowling, M.3
Schuurmans, D.4
-
139
-
-
0345161973
-
Efficient model-based exploration
-
Zurich, Switzerland
-
Wiering, M., Schmidhuber, J.: Efficient model-based exploration. In: Simulation of Adaptive Behavior, pp. 223-228, Zurich, Switzerland (1998)
-
(1998)
Simulation of Adaptive Behavior
, pp. 223-228
-
-
Wiering, M.1
Schmidhuber, J.2
-
140
-
-
84989993724
-
Auto-maintenance in the pigeon: Sustained pecking despite contingent non-reinforcement
-
Williams, D.R., Williams, H.: Auto-maintenance in the pigeon: Sustained pecking despite contingent non-reinforcement. J. Exp. Anal. Behav. 12(4), 511-520 (1969)
-
(1969)
J. Exp. Anal. Behav.
, vol.12
, Issue.4
, pp. 511-520
-
-
Williams, D.R.1
Williams, H.2
-
141
-
-
34547994508
-
Multi-task reinforcement learning: A hierarchical Bayesian approach
-
Corvallis, Oregon
-
Wilson, A., Fern, A., Ray, S., Tadepalli, P.: Multi-task reinforcement learning: A hierarchical Bayesian approach. In: ICML, pp. 1015-1022, Corvallis, Oregon (2007)
-
(2007)
ICML
, pp. 1015-1022
-
-
Wilson, A.1
Fern, A.2
Ray, S.3
Tadepalli, P.4
-
142
-
-
84881042664
-
Bayesian policy search with policy priors
-
AAAI Press, Menlo Park
-
Wingate, D., Goodman, N.D., Roy, D.M., Kaelbling, L.P., Tenenbaum, J.B.: Bayesian policy search with policy priors. In: Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence, vol. 2, pp. 1565-1570. AAAI Press, Menlo Park (2011)
-
(2011)
Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence
, vol.2
, pp. 1565-1570
-
-
Wingate, D.1
Goodman, N.D.2
Roy, D.M.3
Kaelbling, L.P.4
Tenenbaum, J.B.5
-
143
-
-
0032192424
-
Multiple paired forward and inverse models for motor control
-
Wolpert, D.M., Kawato, M.: Multiple paired forward and inverse models for motor control. Neural Netw. 11(7-8), 1317-1329 (1998)
-
(1998)
Neural Netw.
, vol.11
, Issue.7-8
, pp. 1317-1329
-
-
Wolpert, D.M.1
Kawato, M.2
-
144
-
-
33646853495
-
Resolution of uncertainty in prefrontal cortex
-
Yoshida, W., Ishii, S.: Resolution of uncertainty in prefrontal cortex. Neuron 50(5), 781-789 (2006)
-
(2006)
Neuron
, vol.50
, Issue.5
, pp. 781-789
-
-
Yoshida, W.1
Ishii, S.2
-
145
-
-
20444388016
-
Uncertainty, neuromodulation, and attention
-
Yu, A.J., Dayan, P.: Uncertainty, neuromodulation, and attention. Neuron 46(4), 681-692 (2005)
-
(2005)
Neuron
, vol.46
, Issue.4
, pp. 681-692
-
-
Yu, A.J.1
Dayan, P.2