SCOPUS information search platform

Intrinsically Motivated Learning in Natural and Artificial Systems

Volume 9783642323751, 2013, Pages 73-91

Exploration from generalization mediated by multiple controllers

(1) Dayan, Peter a

a UNIVERSITY COLLEGE LONDON (United Kingdom)

Author keywords

[No Author keywords available]

Indexed keywords

GENERALISATION; INTRINSIC MOTIVATION; LEARN+; LEARNING PROCESS; MULTIPLE CONTROLLERS; POTENTIAL BENEFITS;

EID: 84886641205 PISSN: None EISSN: None Source Type: Book
DOI: 10.1007/978-3-642-32375-1_4 Document Type: Chapter

Times cited : (11)

References (145)

1
- 84966722327
- Improving Bayesian reinforcement learning using transition abstraction
- Montreal, Canada
- Acuna, D., Schrater, P.: Improving Bayesian reinforcement learning using transition abstraction. In: ICML/UAI/COLT Workshop on Abstraction in Reinforcement Learning. Montreal, Canada (2009)
- (2009) ICML/UAI/COLT Workshop on Abstraction in Reinforcement Learning
- Acuna, D.¹ Schrater, P.²

2
- 78649507911
- A Bayesian sampling approach to exploration in reinforcement learning
- Montreal, Canada
- Asmuth, J., Li, L., Littman, M., Nouri, A., Wingate, D.: A Bayesian sampling approach to exploration in reinforcement learning. In: UAI, Montreal, Canada (2009)
- (2009) UAI
- Asmuth, J.¹ Li, L.² Littman, M.³ Nouri, A.⁴ Wingate, D.⁵

3
- 23244432007
- An integrative theory of locus coeruleus-norepinephrine function: Adaptive gain and optimal performance
- Aston-Jones, G., Cohen, J.D.: An integrative theory of locus coeruleus-norepinephrine function: Adaptive gain and optimal performance. Annu. Rev. Neurosci. 28, 403-450 (2005)
- (2005) Annu. Rev. Neurosci. , vol.28 , pp. 403-450
- Aston-Jones, G.¹ Cohen, J.D.²

4
- 0036568025
- Finite-time analysis of the multiarmed bandit problem
- Auer, P., Cesa-Bianchi, N., Fischer, P.: Finite-time analysis of the multiarmed bandit problem. Mach. Learn. 47(2), 235-256 (2002a)
- (2002) Mach. Learn. , vol.47 , Issue.2 , pp. 235-256
- Auer, P.¹ Cesa-Bianchi, N.² Fischer, P.³

5
- 0037709910
- The nonstochastic multiarmed bandit problem
- Auer, P., Cesa-Bianchi, N., Freund, Y., Schapire, R.: The nonstochastic multiarmed bandit problem. SIAM J. Comput. 32(1), 48-77 (2002b)
- (2002) SIAM J. Comput. , vol.32 , Issue.1 , pp. 48-77
- Auer, P.¹ Cesa-Bianchi, N.² Freund, Y.³ Schapire, R.⁴

6
- 28444472936
- Neural bases of food-seeking: Affect, arousal and reward in corticostriatolimbic circuits
- Balleine, B.W.: Neural bases of food-seeking: Affect, arousal and reward in corticostriatolimbic circuits. Physiol. Behav. 86(5), 717-730 (2005)
- (2005) Physiol. Behav. , vol.86 , Issue.5 , pp. 717-730
- Balleine, B.W.¹

7
- 0028101302
- Columnar organization in the midbrain periaqueductal gray: Modules for emotional expression?
- Bandler, R., Shipley, M.T.: Columnar organization in the midbrain periaqueductal gray: Modules for emotional expression? Trends Neurosci. 17(9), 379-389 (1994)
- (1994) Trends Neurosci. , vol.17 , Issue.9 , pp. 379-389
- Bandler, R.¹ Shipley, M.T.²

8
- 0000541213
- Adaptive critics and the basal ganglia
- Houk, J., Davis, J., Beiser, D. (eds.) MIT, Cambridge
- Barto, A.: Adaptive critics and the basal ganglia. In: Houk, J., Davis, J., Beiser, D. (eds.) Models of Information Processing in the Basal Ganglia, pp. 215-232. MIT, Cambridge (1995)
- (1995) Models of Information Processing in the Basal Ganglia , pp. 215-232
- Barto, A.¹

9
- 0141988716
- Recent advances in hierarchical reinforcement learning
- Barto, A., Mahadevan, S.: Recent advances in hierarchical reinforcement learning. Discr. Event Dyn. Syst. 13(4), 341-379 (2003)
- (2003) Discr. Event Dyn. Syst. , vol.13 , Issue.4 , pp. 341-379
- Barto, A.¹ Mahadevan, S.²

10
- 33749651693
- Intrinsically motivated learning of hierarchical collections of skills
- La Jolla, CA
- Barto, A., Singh, S., Chentanez, N.: Intrinsically motivated learning of hierarchical collections of skills. In: ICDL 2004, La Jolla, CA (2004)
- (2004) ICDL 2004
- Barto, A.¹ Singh, S.² Chentanez, N.³

11
- 0020970738
- Neuronlike elements that can solve difficult learning control problems
- Barto, A., Sutton, R., Anderson, C.: Neuronlike elements that can solve difficult learning control problems. IEEE Trans. Syst. Man Cybern. 13(5), 834-846 (1983)
- (1983) IEEE Trans. Syst. Man Cybern. , vol.13 , Issue.5 , pp. 834-846
- Barto, A.¹ Sutton, R.² Anderson, C.³

12
- 84929046579
- Intrinsic motivation and reinforcement learning
- Baldassarre, G., Mirolli, M. (eds.) Springer, Berlin
- Barto, A.G.: Intrinsic motivation and reinforcement learning. In: Baldassarre, G., Mirolli, M. (eds.) Intrinsically Motivated Learning in Natural and Artificial Systems, pp. 17-47. Springer, Berlin (2012)
- (2012) Intrinsically Motivated Learning in Natural and Artificial Systems , pp. 17-47
- Barto, A.G.¹

13
- 84898936541
- The infinite hidden Markov model
- Vancouver, Canada
- Beal, M., Ghahramani, Z., Rasmussen, C.: The infinite hidden Markov model. In: NIPS, pp. 577-584, Vancouver, Canada (2002)
- (2002) NIPS , pp. 577-584
- Beal, M.¹ Ghahramani, Z.² Rasmussen, C.³

14
- 34548295327
- Learning the value of information in an uncertain world
- Behrens, T.E.J., Woolrich, M.W., Walton, M.E., Rushworth, M.F.S.: Learning the value of information in an uncertain world. Nat. Neurosci. 10(9), 1214-1221 (2007)
- (2007) Nat. Neurosci. , vol.10 , Issue.9 , pp. 1214-1221
- Behrens, T.E.J.¹ Woolrich, M.W.² Walton, M.E.³ Rushworth, M.F.S.⁴

15
- 85012688561
- Princeton University Press, Princeton
- Bellman, R.E.: Dynamic Programming. Princeton University Press, Princeton (1957)
- (1957) Dynamic Programming
- Bellman, R.E.¹

16
- 2442701355
- Motivation concepts in behavioral neuroscience
- Berridge, K.C.: Motivation concepts in behavioral neuroscience. Physiol. Behav. 81, 179-209 (2004)
- (2004) Physiol. Behav. , vol.81 , pp. 179-209
- Berridge, K.C.¹

17
- 0004181906
- Springer, Berlin
- Berry, D.A., Fristedt, B.: Bandit Problems: Sequential Allocation of Experiments. Springer, Berlin (1985)
- (1985) Bandit Problems: Sequential Allocation of Experiments
- Berry, D.A.¹ Fristedt, B.²

18
- 0023800192
- Ethoexperimental approaches to the biology of emotion
- Blanchard, D.C., Blanchard, R.J.: Ethoexperimental approaches to the biology of emotion. Annu. Rev. Psychol. 39, 43-68 (1988)
- (1988) Annu. Rev. Psychol. , vol.39 , pp. 43-68
- Blanchard, D.C.¹ Blanchard, R.J.²

19
- 13844281871
- Bringing up robot: Fundamental mechanisms for creating a self-motivated, self-organizing architecture
- Blank, D., Kumar, D., Meeden, L., Marshall, J.: Bringing up robot: Fundamental mechanisms for creating a self-motivated, self-organizing architecture. Cybern. Syst. 36(2), 125-150 (2005)
- (2005) Cybern. Syst. , vol.36 , Issue.2 , pp. 125-150
- Blank, D.¹ Kumar, D.² Meeden, L.³ Marshall, J.⁴

20
- 58149417523
- Species-specific defense reactions and avoidance learning
- Bolles, R.C.: Species-specific defense reactions and avoidance learning. Psychol. Rev. 77, 32-48 (1970)
- (1970) Psychol. Rev. , vol.77 , pp. 32-48
- Bolles, R.C.¹

21
- 70350566799
- Hierarchically organized behavior and its neural foundations: A reinforcement learning perspective
- Botvinick, M.M., Niv, Y., Barto, A.C.: Hierarchically organized behavior and its neural foundations: A reinforcement learning perspective. Cognition 113(3), 262-280 (2009)
- (2009) Cognition , vol.113 , Issue.3 , pp. 262-280
- Botvinick, M.M.¹ Niv, Y.² Barto, A.C.³

22
- 78649651245
- Opponency revisited: Competition and cooperation between dopamine and serotonin
- Boureau, Y.-L., Dayan, P.: Opponency revisited: Competition and cooperation between dopamine and serotonin. Neuropsychopharmacology 36(1), 74-97 (2011)
- (2011) Neuropsychopharmacology , vol.36 , Issue.1 , pp. 74-97
- Boureau, Y.-L.¹ Dayan, P.²

23
- 0041965975
- R-max - A general polynomial time algorithm for near-optimal reinforcement learning
- Brafman, R., Tennenholtz, M.: R-max - A general polynomial time algorithm for near-optimal reinforcement learning. J. Mach. Learn. Res. 3, 213-231 (2003)
- (2003) J. Mach. Learn. Res. , vol.3 , pp. 213-231
- Brafman, R.¹ Tennenholtz, M.²

24
- 0000696066
- The misbehavior of organisms
- Breland, K., Breland, M.: The misbehavior of organisms. Am. Psychol. 16(9), 681-684 (1961)
- (1961) Am. Psychol. , vol.16 , Issue.9 , pp. 681-684
- Breland, K.¹ Breland, M.²

25
- 0023981451
- The ART of adaptive pattern recognition by a self-organizing neural network
- Carpenter, G., Grossberg, S.: The ART of adaptive pattern recognition by a self-organizing neural network. Computer 21, 77-88 (1988)
- (1988) Computer , vol.21 , pp. 77-88
- Carpenter, G.¹ Grossberg, S.²

26
- 0031189914
- Multitask learning
- Caruana, R.: Multitask learning. Mach. Learn. 28(1), 41-75 (1997)
- (1997) Mach. Learn. , vol.28 , Issue.1 , pp. 41-75
- Caruana, R.¹

27
- 84929054208
- Ph.D. Thesis, Université Pierre et Marie Curie, Paris
- Collins, A.: Apprentissage et Contrôle Cognitif: Une Théorie de la Fonction Executive Préfrontale Humaine. Ph.D. Thesis, Université Pierre et Marie Curie, Paris (2010)
- (2010) Apprentissage et Contrôle Cognitif: Une Théorie de la Fonction Executive Préfrontale Humaine
- Collins, A.¹

28
- 33750189183
- Similarity and discrimination in classical conditioning: A latent variable account
- Vancouver, Canada
- Courville, A., Daw, N., Touretzky, D.: Similarity and discrimination in classical conditioning: A latent variable account. In: NIPS, pp. 313-320, Vancouver, Canada (2004)
- (2004) NIPS , pp. 313-320
- Courville, A.¹ Daw, N.² Touretzky, D.³

29
- 33646492363
- The computational neurobiology of learning and reward
- Daw, N.D., Doya, K.: The computational neurobiology of learning and reward. Curr. Opin. Neurobiol. 16(2), 199-204 (2006)
- (2006) Curr. Opin. Neurobiol. , vol.16 , Issue.2 , pp. 199-204
- Daw, N.D.¹ Doya, K.²

30
- 0036592008
- Opponent interactions between serotonin and dopamine
- Daw, N.D., Kakade, S., Dayan, P.: Opponent interactions between serotonin and dopamine. Neural Netw. 15, 603-616 (2002)
- (2002) Neural Netw. , vol.15 , pp. 603-616
- Daw, N.D.¹ Kakade, S.² Dayan, P.³

31
- 28044450875
- Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control
- Daw, N.D., Niv, Y., Dayan, P.: Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nat. Neurosci. 8(12), 1704-1711 (2005)
- (2005) Nat. Neurosci. , vol.8 , Issue.12 , pp. 1704-1711
- Daw, N.D.¹ Niv, Y.² Dayan, P.³

32
- 33745223257
- Cortical substrates for exploratory decisions in humans
- Daw, N.D., O'Doherty, J.P., Dayan, P., Seymour, B., Dolan, R.J.: Cortical substrates for exploratory decisions in humans. Nature 441(7095), 876-879 (2006)
- (2006) Nature , vol.441 , Issue.7095 , pp. 876-879
- Daw, N.D.¹ O'Doherty, J.P.² Dayan, P.³ Seymour, B.⁴ Dolan, R.J.⁵

33
- 50549094930
- Bilinearity, rules, and prefrontal cortex
- Dayan, P.: Bilinearity, rules, and prefrontal cortex. Front. Comput. Neurosci. 1, 1 (2007)
- (2007) Front. Comput. Neurosci. , vol.1 , pp. 1
- Dayan, P.¹

34
- 0001234682
- Feudal reinforcement learning
- Hanson, S.J., Cowan, J.D., Giles, C.L. (eds.) MIT, Cambridge
- Dayan, P., Hinton, G.: Feudal reinforcement learning. In: Hanson, S.J., Cowan, J.D., Giles, C.L. (eds.) Advances in Neural Information Processing Systems (NIPS) 5. MIT, Cambridge (1993)
- (1993) Advances in Neural Information Processing Systems (NIPS) 5
- Dayan, P.¹ Hinton, G.²

35
- 40149109071
- Serotonin, inhibition, and negative mood
- Dayan, P., Huys, Q.J.M.: Serotonin, inhibition, and negative mood. PLoS Comput. Biol. 4(2), e4 (2008)
- (2008) PLoS Comput. Biol. , vol.4 , Issue.2 , pp. e4
- Dayan, P.¹ Huys, Q.J.M.²

36
- 67349100969
- Serotonin in affective control
- Dayan, P., Huys, Q.J.M.: Serotonin in affective control. Annu. Rev. Neurosci. 32, 95-126 (2009)
- (2009) Annu. Rev. Neurosci. , vol.32 , pp. 95-126
- Dayan, P.¹ Huys, Q.J.M.²

37
- 33749055062
- The misbehavior of value and the discipline of the will
- Dayan, P., Niv, Y., Seymour, B., Daw, N.D.: The misbehavior of value and the discipline of the will. Neural Netw. 19(8), 1153-1160 (2006)
- (2006) Neural Netw. , vol.19 , Issue.8 , pp. 1153-1160
- Dayan, P.¹ Niv, Y.² Seymour, B.³ Daw, N.D.⁴

38
- 0030260201
- Exploration bonuses and dual control
- Dayan, P., Sejnowski, T.: Exploration bonuses and dual control. Mach. Learn. 25(1), 5-22 (1996)
- (1996) Mach. Learn. , vol.25 , Issue.1 , pp. 5-22
- Dayan, P.¹ Sejnowski, T.²

39
- 0026340713
- 5-HT and mechanisms of defence
- Deakin, J.F.W., Graeff, F.G.: 5-HT and mechanisms of defence. J. Psychopharmacol. 5, 305-316 (1991)
- (1991) J. Psychopharmacol. , vol.5 , pp. 305-316
- Deakin, J.F.W.¹ Graeff, F.G.²

40
- 1142281527
- Model based Bayesian exploration
- Stockholm, Sweden
- Dearden, R., Friedman, N., Andre, D.: Model based Bayesian exploration. In: UAI, pp. 150-159, Stockholm, Sweden (1999)
- (1999) UAI , pp. 150-159
- Dearden, R.¹ Friedman, N.² Andre, D.³

41
- 0003631043
- Plenum, New York
- Deci, E., Ryan, R.: Intrinsic Motivation and Self-determination in Human Behavior. Plenum, New York (1985)
- (1985) Intrinsic Motivation and Self-determination in Human Behavior
- Deci, E.¹ Ryan, R.²

42
- 0003793529
- Cambridge University Press, Cambridge
- Dickinson, A.: Contemporary Animal Learning Theory. Cambridge University Press, Cambridge (1980)
- (1980) Contemporary Animal Learning Theory
- Dickinson, A.¹

43
- 0043250430
- The role of learning in motivation
- Gallistel, C. (ed.) Wiley, New York
- Dickinson, A., Balleine, B.: The role of learning in motivation. In: Gallistel, C. (ed.) Stevens' Handbook of Experimental Psychology, Vol. 3, pp. 497-533. Wiley, New York (2002)
- (2002) Stevens' Handbook of Experimental Psychology , vol.3 , pp. 497-533
- Dickinson, A.¹ Balleine, B.²

44
- 0001806701
- The MAXQ method for hierarchical reinforcement learning
- Madison, Wisconsin
- Dietterich, T.: The MAXQ method for hierarchical reinforcement learning. In: ICML, pp. 118-126, Madison, Wisconsin (1998)
- (1998) ICML , pp. 118-126
- Dietterich, T.¹

45
- 0002278788
- Hierarchical reinforcement learning with the MAXQ value function decomposition
- Dietterich, T.: Hierarchical reinforcement learning with the MAXQ value function decomposition. J. Artif. Intell. Res. 13(1), 227-303 (2000)
- (2000) J. Artif. Intell. Res. , vol.13 , Issue.1 , pp. 227-303
- Dietterich, T.¹

46
- 0036592023
- Metalearning and neuromodulation
- Doya, K.: Metalearning and neuromodulation. Neural Netw. 15(4-6), 495-506 (2002)
- (2002) Neural Netw. , vol.15 , Issue.4-6 , pp. 495-506
- Doya, K.¹

47
- 0036618011
- Multiple model-based reinforcement learning
- Doya, K., Samejima, K., Katagiri, K., Kawato, M.: Multiple model-based reinforcement learning. Neural Comput. 14(6), 1347-1369 (2002)
- (2002) Neural Comput. , vol.14 , Issue.6 , pp. 1347-1369
- Doya, K.¹ Samejima, K.² Katagiri, K.³ Kawato, M.⁴

48
- 1942450858
- Ph.D. Thesis, Computer Science Department, University of Massachusetts, Amherst
- Duff, M.: Optimal Learning: Computational approaches for Bayes-adaptive Markov decision processes. Ph.D. Thesis, Computer Science Department, University of Massachusetts, Amherst (2000)
- (2000) Optimal Learning: Computational Approaches for Bayes-adaptive Markov Decision Processes
- Duff, M.¹

49
- 0036832959
- Structure in the space of value functions
- Foster, D., Dayan, P.: Structure in the space of value functions. Mach. Learn. 49(2), 325-346 (2002)
- (2002) Mach. Learn. , vol.49 , Issue.2 , pp. 325-346
- Foster, D.¹ Dayan, P.²

50
- 84900513897
- Learning to selectively attend
- Portland, Oregon
- Gershman, S., Cohen, J., Niv, Y.: Learning to selectively attend. In: Proceedings of the 32nd Annual Conference of the Cognitive Science Society, Portland, Oregon (2010a)
- (2010) Proceedings of the 32nd Annual Conference of the Cognitive Science Society
- Gershman, S.¹ Cohen, J.² Niv, Y.³

51
- 77952541839
- Learning latent structure: Carving nature at its joints
- Gershman, S., Niv, Y.: Learning latent structure: Carving nature at its joints. Curr. Opin. Neurobiol. (2010)
- (2010) Curr. Opin. Neurobiol.
- Gershman, S.¹ Niv, Y.²

52
- 74049117596
- Context, learning, and extinction
- Gershman, S.J., Blei, D.M., Niv, Y.: Context, learning, and extinction. Psychol. Rev. 117(1), 197-209 (2010b)
- (2010) Psychol. Rev. , vol.117 , Issue.1 , pp. 197-209
- Gershman, S.J.¹ Blei, D.M.² Niv, Y.³

53
- 84891584370
- Wiley, New York
- Gittins, J.C.: Multi-Armed Bandit Allocation Indices. Wiley, New York (1989)
- (1989) Multi-Armed Bandit Allocation Indices
- Gittins, J.C.¹

54
- 0010966147
- Rats learn the relationship between responding and environmental events: An expansion of the learned helplessness hypothesis
- Goodkin, F.: Rats learn the relationship between responding and environmental events: An expansion of the learned helplessness hypothesis. Learn. Motiv. 7, 382-393 (1976)
- (1976) Learn. Motiv. , vol.7 , pp. 382-393
- Goodkin, F.¹

55
- 0004084439
- 2nd edn. OUP, Oxford
- Gray, J.A., McNaughton, N.: The Neuropsychology of Anxiety, 2nd edn. OUP, Oxford (2003)
- (2003) The Neuropsychology of Anxiety
- Gray, J.A.¹ McNaughton, N.²

56
- 0004145775
- Harper & Row, New York
- Guthrie, E.: The Psychology of Learning. Harper & Row, New York (1952)
- (1952) The Psychology of Learning
- Guthrie, E.¹

57
- 34548566262
- Towards an executive without a homunculus: Computational models of the prefrontal cortex/basal ganglia system
- Hazy, T.E., Frank, M.J., O'Reilly, R.C.: Towards an executive without a homunculus: Computational models of the prefrontal cortex/basal ganglia system. Philos. Trans. R. Soc. Lond. B Biol. Sci. 362(1485), 1601-1613 (2007)
- (2007) Philos. Trans. R. Soc. Lond. B Biol. Sci. , vol.362 , Issue.1485 , pp. 1601-1613
- Hazy, T.E.¹ Frank, M.J.² O'Reilly, R.C.³

58
- 0034031837
- Multiple forms of short-term plasticity at excitatory synapses in rat medial prefrontal cortex
- Hempel, C.M., Hartman, K.H., Wang, X.J., Turrigiano, G.G., Nelson, S.B.: Multiple forms of short-term plasticity at excitatory synapses in rat medial prefrontal cortex. J. Neurophysiol. 83(5), 3031-3041 (2000)
- (2000) J. Neurophysiol. , vol.83 , Issue.5 , pp. 3031-3041
- Hempel, C.M.¹ Hartman, K.H.² Wang, X.J.³ Turrigiano, G.G.⁴ Nelson, S.B.⁵

59
- 0022979089
- An approach through the looking-glass
- Hershberger, W.A.: An approach through the looking-glass. Anim. Learn. Behav. 14, 443-451 (1986)
- (1986) Anim. Learn. Behav. , vol.14 , pp. 443-451
- Hershberger, W.A.¹

60
- 0029652445
- The "wake-sleep" algorithm for unsupervised neural networks
- Hinton, G.E., Dayan, P., Frey, B.J., Neal, R.M.: The "wake-sleep" algorithm for unsupervised neural networks. Science 268(5214), 1158-1161 (1995)
- (1995) Science , vol.268 , Issue.5214 , pp. 1158-1161
- Hinton, G.E.¹ Dayan, P.² Frey, B.J.³ Neal, R.M.⁴

61
- 0031590130
- Generative models for discovering sparse distributed representations
- Hinton, G.E., Ghahramani, Z.: Generative models for discovering sparse distributed representations. Philos. Trans. R. Soc. Lond. B Biol. Sci. 352(1358), 1177-1190 (1997)
- (1997) Philos. Trans. R. Soc. Lond. B Biol. Sci. , vol.352 , Issue.1358 , pp. 1177-1190
- Hinton, G.E.¹ Ghahramani, Z.²

62
- 33746600649
- Reducing the dimensionality of data with neural networks
- Hinton, G.E., Salakhutdinov, R.R.: Reducing the dimensionality of data with neural networks. Science 313(5786), 504-507 (2006)
- (2006) Science , vol.313 , Issue.5786 , pp. 504-507
- Hinton, G.E.¹ Salakhutdinov, R.R.²

63
- 0031747058
- Amount of training affects associatively-activated event representation
- Holland, P.: Amount of training affects associatively-activated event representation. Neuropharmacology 37(4-5), 461-469 (1998)
- (1998) Neuropharmacology , vol.37 , Issue.4-5 , pp. 461-469
- Holland, P.¹

64
- 0034061668
- Mesolimbocortical and nigrostriatal dopamine responses to salient non-reward events
- Horvitz, J.C.: Mesolimbocortical and nigrostriatal dopamine responses to salient non-reward events. Neuroscience 96(4), 651-656 (2000)
- (2000) Neuroscience , vol.96 , Issue.4 , pp. 651-656
- Horvitz, J.C.¹

65
- 0030757872
- Burst activity of ventral tegmental dopamine neurons is elicited by sensory stimuli in the awake cat
- Horvitz, J.C., Stewart, T., Jacobs, B.L.: Burst activity of ventral tegmental dopamine neurons is elicited by sensory stimuli in the awake cat. Brain Res. 759(2), 251-258 (1997)
- (1997) Brain Res. , vol.759 , Issue.2 , pp. 251-258
- Horvitz, J.C.¹ Stewart, T.² Jacobs, B.L.³

66
- 84939003870
- Information value theory
- Howard, R.: Information value theory. IEEE Trans. Syst. Sci. Cybern. 2(1), 22-26 (1966)
- (1966) IEEE Trans. Syst. Sci. Cybern. , vol.2 , Issue.1 , pp. 22-26
- Howard, R.¹

67
- 34447328072
- Inherent value systems for autonomous mental development
- Huang, X., Weng, J.: Inherent value systems for autonomous mental development. Int. J. Human. Robot. 4, 407-433 (2007)
- (2007) Int. J. Human. Robot. , vol.4 , pp. 407-433
- Huang, X.¹ Weng, J.²

68
- 21844479189
- Springer, Berlin
- Hutter, M.: Universal Artificial Intelligence: Sequential Decisions Based on Algorithmic Probability. Springer, Berlin (2005)
- (2005) Universal Artificial Intelligence: Sequential Decisions Based on Algorithmic Probability
- Hutter, M.¹

69
- 67651041654
- Reinforcers and control
- Ph.D. Thesis, Gatsby Computational Neuroscience Unit, UCL
- Huys, Q.: Reinforcers and control. Towards a computational ætiology of depression. Ph.D. Thesis, Gatsby Computational Neuroscience Unit, UCL (2007)
- (2007) Towards a Computational ætiology of Depression
- Huys, Q.¹

70
- 70350570499
- A Bayesian formulation of behavioral control
- Huys, Q.J.M., Dayan, P.: A Bayesian formulation of behavioral control. Cognition 113, 314-328 (2009)
- (2009) Cognition , vol.113 , pp. 314-328
- Huys, Q.J.M.¹ Dayan, P.²

71
- 0036592028
- Control of exploitation-exploration meta-parameter in reinforcement learning
- Ishii, S., Yoshida, W., Yoshimoto, J.: Control of exploitation-exploration meta-parameter in reinforcement learning. Neural Netw. 15(4-6), 665-687 (2002)
- (2002) Neural Netw. , vol.15 , Issue.4-6 , pp. 665-687
- Ishii, S.¹ Yoshida, W.² Yoshimoto, J.³

72
- 0032073263
- Planning and acting in partially observable stochastic domains
- Kaelbling, L., Littman, M., Cassandra, A.: Planning and acting in partially observable stochastic domains. Artif. Intell. 101(1-2), 99-134 (1998)
- (1998) Artif. Intell. , vol.101 , Issue.1-2 , pp. 99-134
- Kaelbling, L.¹ Littman, M.² Cassandra, A.³

73
- 0036592029
- Dopamine: Generalization and bonuses
- Kakade, S., Dayan, P.: Dopamine: Generalization and bonuses. Neural Netw. 15(4-6), 549-559 (2002)
- (2002) Neural Netw. , vol.15 , Issue.4-6 , pp. 549-559
- Kakade, S.¹ Dayan, P.²

74
- 0036832954
- Near-optimal reinforcement learning in polynomial time
- Kearns, M., Singh, S.: Near-optimal reinforcement learning in polynomial time. Mach. Learn. 49(2), 209-232 (2002)
- (2002) Mach. Learn. , vol.49 , Issue.2 , pp. 209-232
- Kearns, M.¹ Singh, S.²

75
- 0035712679
- Parallel circuits mediating distinct emotional coping reactions to different types of stress
- Keay, K.A., Bandler, R.: Parallel circuits mediating distinct emotional coping reactions to different types of stress. Neurosci. Biobehav. Rev. 25(7-8), 669-678 (2001)
- (2001) Neurosci. Biobehav. Rev. , vol.25 , Issue.7-8 , pp. 669-678
- Keay, K.A.¹ Bandler, R.²

76
- 0037382264
- Coordination of actions and habits in the medial prefrontal cortex of rats
- Killcross, S., Coutureau, E.: Coordination of actions and habits in the medial prefrontal cortex of rats. Cereb. Cortex 13(4), 400-408 (2003)
- (2003) Cereb. Cortex , vol.13 , Issue.4 , pp. 400-408
- Killcross, S.¹ Coutureau, E.²

77
- 84880873347
- Building portable options: Skill transfer in reinforcement learning
- Hyderabad, India
- Konidaris, G., Barto, A.: Building portable options: Skill transfer in reinforcement learning. In: IJCAI, pp. 895-900, Hyderabad, India (2007)
- (2007) IJCAI , pp. 895-900
- Konidaris, G.¹ Barto, A.²

78
- 78751681641
- Efficient skill learning using abstraction selection
- Pasadena, California
- Konidaris, G., Barto, A.: Efficient skill learning using abstraction selection. In: IJCAI, pp. 1107-1112, Pasadena, California (2009)
- (2009) IJCAI , pp. 1107-1112
- Konidaris, G.¹ Barto, A.²

79
- 59649113160
- Flexible shaping: How learning in small steps helps
- Krueger, K.A., Dayan, P.: Flexible shaping: How learning in small steps helps. Cognition 110(3), 380-394 (2009)
- (2009) Cognition , vol.110 , Issue.3 , pp. 380-394
- Krueger, K.A.¹ Dayan, P.²

80
- 0004282622
- Oxford University Press, Oxford
- Mackintosh, N.J.: Conditioning and Associative Learning. Oxford University Press, Oxford (1983)
- (1983) Conditioning and Associative Learning
- Mackintosh, N.J.¹

81
- 0037840849
- On the undecidability of probabilistic planning and related stochastic optimization problems
- Madani, O., Hanks, S., Condon, A.: On the undecidability of probabilistic planning and related stochastic optimization problems. Artif. Intell. 147(1-2), 5-34 (2003)
- (2003) Artif. Intell. , vol.147 , Issue.1-2 , pp. 5-34
- Madani, O.¹ Hanks, S.² Condon, A.³

82
- 33846261103
- Behavioral control, the medial prefrontal cortex, and resilience
- Maier, S.F., Amat, J., Baratta, M.V., Paul, E., Watkins, L.R.: Behavioral control, the medial prefrontal cortex, and resilience. Dialogues Clin. Neurosci. 8(4), 397-406 (2006)
- (2006) Dialogues Clin. Neurosci. , vol.8 , Issue.4 , pp. 397-406
- Maier, S.F.¹ Amat, J.² Baratta, M.V.³ Paul, E.⁴ Watkins, L.R.⁵

83
- 19844365569
- Stressor controllability and learned helplessness: The roles of the dorsal raphe nucleus, serotonin, and corticotropin-releasing factor
- Maier, S.F., Watkins, L.R.: Stressor controllability and learned helplessness: The roles of the dorsal raphe nucleus, serotonin, and corticotropin-releasing factor. Neurosci. Biobehav. Rev. 29(4-5), 829-841 (2005)
- (2005) Neurosci. Biobehav. Rev. , vol.29 , Issue.4-5 , pp. 829-841
- Maier, S.F.¹ Watkins, L.R.²

84
- 3042590043
- A two-dimensional neuropsychology of defense: Fear/anxiety and defensive distance
- McNaughton, N., Corr, P.J.: A two-dimensional neuropsychology of defense: Fear/anxiety and defensive distance. Neurosci. Biobehav. Rev. 28(3), 285-305 (2004)
- (2004) Neurosci. Biobehav. Rev. , vol.28 , Issue.3 , pp. 285-305
- McNaughton, N.¹ Corr, P.J.²

85
- 84906736869
- Functions and mechanisms of intrinsic motivations: The knowledge versus competence distinction
- Baldassarre, G., Mirolli, M. (eds.) Springer, Berlin
- Mirolli, M., Baldassarre, G.: Functions and mechanisms of intrinsic motivations: The knowledge versus competence distinction. In: Baldassarre, G., Mirolli, M. (eds.) Intrinsically Motivated Learning in Natural and Artificial Systems, pp. 49-72. Springer, Berlin (2012)
- (2012) Intrinsically Motivated Learning in Natural and Artificial Systems , pp. 49-72
- Mirolli, M.¹ Baldassarre, G.²

86
- 40849102598
- Synaptic theory of working memory
- Mongillo, G., Barak, O., Tsodyks, M.: Synaptic theory of working memory. Science 319(5869), 1543-1546 (2008)
- (2008) Science , vol.319 , Issue.5869 , pp. 1543-1546
- Mongillo, G.¹ Barak, O.² Tsodyks, M.³

87
- 0029981543
- A framework for mesencephalic dopamine systems based on predictive Hebbian learning
- Montague, P.R., Dayan, P., Sejnowski, T.J.: A framework for mesencephalic dopamine systems based on predictive Hebbian learning. J. Neurosci. 16(5), 1936-1947 (1996)
- (1996) J. Neurosci. , vol.16 , Issue.5 , pp. 1936-1947
- Montague, P.R.¹ Dayan, P.² Sejnowski, T.J.³

88
- 77950032550
- Markov chain sampling methods for Dirichlet process mixture models
- Neal, R.: Markov chain sampling methods for Dirichlet process mixture models. J. Comput. Graph. Stat. 9(2), 249-265 (2000)
- (2000) J. Comput. Graph. Stat. , vol.9 , Issue.2 , pp. 249-265
- Neal, R.¹

89
- 0141596576
- Policy invariance under reward transformations: Theory and application to reward shaping
- Bled, Slovenia
- Ng, A., Harada, D., Russell, S.: Policy invariance under reward transformations: Theory and application to reward shaping. In: ICML, pp. 278-287, Bled, Slovenia (1999)
- (1999) ICML , pp. 278-287
- Ng, A.¹ Harada, D.² Russell, S.³

90
- 84858776393
- Multi-resolution exploration in continuous spaces
- Nouri, A., Littman, M.: Multi-resolution exploration in continuous spaces. In: NIPS, pp. 1209-1216 (2009)
- (2009) NIPS , pp. 1209-1216
- Nouri, A.¹ Littman, M.²

91
- 33644927837
- Making working memory work: A computational model of learning in the prefrontal cortex and basal ganglia
- O'Reilly, R.C., Frank, M.J.: Making working memory work: A computational model of learning in the prefrontal cortex and basal ganglia. Neural Comput. 18(2), 283-328 (2006)
- (2006) Neural Comput. , vol.18 , Issue.2 , pp. 283-328
- O'Reilly, R.C.¹ Frank, M.J.²

92
- 34047267520
- Intrinsic motivation systems for autonomous mental development
- Oudeyer, P., Kaplan, F., Hafner, V.: Intrinsic motivation systems for autonomous mental development. IEEE Trans. Evol. Comput. 11(2), 265-286 (2007)
- (2007) IEEE Trans. Evol. Comput. , vol.11 , Issue.2 , pp. 265-286
- Oudeyer, P.¹ Kaplan, F.² Hafner, V.³

93
- 33748408630
- OUP, New York
- Panksepp, J.: Affective Neuroscience. OUP, New York (1998)
- (1998) Affective Neuroscience
- Panksepp, J.¹

94
- 0000977910
- The complexity of Markov decision processes
- Papadimitriou, C., Tsitsiklis, J.: The complexity of Markov decision processes. Math. Oper. Res. 12(3), 441-450 (1987)
- (1987) Math. Oper. Res. , vol.12 , Issue.3 , pp. 441-450
- Papadimitriou, C.¹ Tsitsiklis, J.²

95
- 84898956770
- Reinforcement learning with hierarchies of machines
- Denver, Colorado
- Parr, R., Russell, S.: Reinforcement learning with hierarchies of machines. In: NIPS, pp. 1043-1049, Denver, Colorado (1998)
- (1998) NIPS , pp. 1043-1049
- Parr, R.¹ Russell, S.²

96
- 33749251297
- An analytic solution to discrete Bayesian reinforcement learning
- Pittsburgh, Pennsylvania
- Poupart, P., Vlassis, N., Hoey, J., Regan, K.: An analytic solution to discrete Bayesian reinforcement learning. In: ICML, pp. 697-704, Pittsburgh, Pennsylvania (2006)
- (2006) ICML , pp. 697-704
- Poupart, P.¹ Vlassis, N.² Hoey, J.³ Regan, K.⁴

97
- 0012586376
- MIT, Cambridge
- Rao, R.P.N., Olshausen, B.A., Lewicki, M.S. (eds.): Probabilistic Models of the Brain: Perception and Neural Function. MIT, Cambridge (2002)
- (2002) Probabilistic Models of the Brain: Perception and Neural Function
- Rao, R.P.N.¹ Olshausen, B.A.² Lewicki, M.S.³

98
- 84929046152
- The role of the basal ganglia in discovering novel actions
- Baldassarre, G., Mirolli, M. (eds.) Springer, Berlin
- Redgrave, P., Gurney, K., Stafford, T., Thirkettle, M., Lewis, J.: The role of the basal ganglia in discovering novel actions. In: Baldassarre, G., Mirolli, M. (eds.) Intrinsically Motivated Learning in Natural and Artificial Systems, pp. 129-149. Springer, Berlin (2012)
- (2012) Intrinsically Motivated Learning in Natural and Artificial Systems , pp. 129-149
- Redgrave, P.¹ Gurney, K.² Stafford, T.³ Thirkettle, M.⁴ Lewis, J.⁵

99
- 0033119561
- Is the short-latency dopamine response too short to signal reward error?
- Redgrave, P., Prescott, T.J., Gurney, K.: Is the short-latency dopamine response too short to signal reward error? Trends Neurosci. 22(4), 146-151 (1999)
- (1999) Trends Neurosci. , vol.22 , Issue.4 , pp. 146-151
- Redgrave, P.¹ Prescott, T.J.² Gurney, K.³

100
- 0035341482
- Fear and feeding in the nucleus accumbens shell: Rostrocaudal segregation of GABA-elicited defensive behavior versus eating behavior
- Reynolds, S.M., Berridge, K.C.: Fear and feeding in the nucleus accumbens shell: Rostrocaudal segregation of GABA-elicited defensive behavior versus eating behavior. J. Neurosci. 21(9), 3261-3270 (2001)
- (2001) J. Neurosci. , vol.21 , Issue.9 , pp. 3261-3270
- Reynolds, S.M.¹ Berridge, K.C.²

101
- 0037104732
- Positive and negative motivation in nucleus accumbens shell: Bivalent rostrocaudal gradients for GABA-elicited eating, taste "liking"/"disliking" reactions, place preference/avoidance, and fear
- Reynolds, S.M., Berridge, K.C.: Positive and negative motivation in nucleus accumbens shell: Bivalent rostrocaudal gradients for GABA-elicited eating, taste "liking"/"disliking" reactions, place preference/avoidance, and fear. J. Neurosci. 22(16), 7308-7320 (2002)
- (2002) J. Neurosci. , vol.22 , Issue.16 , pp. 7308-7320
- Reynolds, S.M.¹ Berridge, K.C.²

102
- 41149151266
- Emotional environments retune the valence of appetitive versus fearful functions in nucleus accumbens
- Reynolds, S.M., Berridge, K.C.: Emotional environments retune the valence of appetitive versus fearful functions in nucleus accumbens. Nat. Neurosci. 11(4), 423-425 (2008)
- (2008) Nat. Neurosci. , vol.11 , Issue.4 , pp. 423-425
- Reynolds, S.M.¹ Berridge, K.C.²

103
- 0031189347
- CHILD: A first step towards continual learning
- Ring, M.: CHILD: A first step towards continual learning. Mach. Learn. 28(1), 77-104 (1997)
- (1997) Mach. Learn. , vol.28 , Issue.1 , pp. 77-104
- Ring, M.¹

104
- 84929054210
- Toward a formal framework for continual learning
- Whistler, Canada
- Ring, M.: Toward a formal framework for continual learning. In: NIPS Workshop on Inductive Transfer, Whistler, Canada (2005)
- (2005) NIPS Workshop on Inductive Transfer
- Ring, M.¹

105
- 41149161631
- Choice, uncertainty and value in prefrontal and cingulate cortex
- Rushworth, M.F.S., Behrens, T.E.J.: Choice, uncertainty and value in prefrontal and cingulate cortex. Nat. Neurosci. 11(4), 389-397 (2008)
- (2008) Nat. Neurosci. , vol.11 , Issue.4 , pp. 389-397
- Rushworth, M.F.S.¹ Behrens, T.E.J.²

106
- 0002209063
- Intrinsic and extrinsic motivations: Classic definitions and new directions
- Ryan, R., Deci, E.: Intrinsic and extrinsic motivations: Classic definitions and new directions. Contemp. Educ. Psychol. 25(1), 54-67 (2000)
- (2000) Contemp. Educ. Psychol. , vol.25 , Issue.1 , pp. 54-67
- Ryan, R.¹ Deci, E.²

107
- 0742324926
- Inter-module credit assignment in modular reinforcement learning
- Samejima, K., Doya, K., Kawato, M.: Inter-module credit assignment in modular reinforcement learning. Neural Netw. 16(7), 985-994 (2003)
- (2003) Neural Netw. , vol.16 , Issue.7 , pp. 985-994
- Samejima, K.¹ Doya, K.² Kawato, M.³

108
- 0001201756
- Some studies in machine learning using the game of checkers
- Samuel, A.: Some studies in machine learning using the game of checkers. IBM J. Res. Dev. 3, 210-229 (1959)
- (1959) IBM J. Res. Dev. , vol.3 , pp. 210-229
- Samuel, A.¹

109
- 79958838807
- Evolving childhood's length and learning parameters in an intrinsically motivated reinforcement learning robot
- Piscataway, New Jersey
- Schembri, M., Mirolli, M., Baldassarre, G.: Evolving childhood's length and learning parameters in an intrinsically motivated reinforcement learning robot. In: Proceedings of the Seventh International Conference on Epigenetic Robotics, pp. 141-148, Piscataway, New Jersey (2007)
- (2007) Proceedings of the Seventh International Conference on Epigenetic Robotics , pp. 141-148
- Schembri, M.¹ Mirolli, M.² Baldassarre, G.³

110
- 0026306990
- Curious model-building control systems
- Seattle, Washington State. IEEE
- Schmidhuber, J.: Curious model-building control systems. In: IJCNN, pp. 1458-1463, Seattle, Washington State. IEEE (1991)
- (1991) IJCNN , pp. 1458-1463
- Schmidhuber, J.¹

111
- 84880251870
- Gödel machines: Fully self-referential optimal universal self-improvers
- Schmidhuber, J.: Gödel machines: Fully self-referential optimal universal self-improvers. Artif. Gen. Intell., pp. 199-226 (2006)
- (2006) Artif. Gen. Intell. , pp. 199-226
- Schmidhuber, J.¹

112
- 70349334569
- Ultimate cognition à la Gödel
- Schmidhuber, J.: Ultimate cognition à la Gödel. Cogn. Comput. 1, 117-193 (2009)
- (2009) Cogn. Comput. , vol.1 , pp. 117-193
- Schmidhuber, J.¹

113
- 0003554731
- WH Freeman, San Francisco
- Seligman, M.: Helplessness: On Depression, Development, and Death. WH Freeman, San Francisco (1975)
- (1975) Helplessness: On Depression, Development, and Death
- Seligman, M.¹

114
- 0002193484
- Relation between classical conditioning and instrumental learning
- Prokasy, W. (ed.) Appleton-Century-Crofts, New York
- Sheffield, F.: Relation between classical conditioning and instrumental learning. In: Prokasy, W. (ed.) Classical Conditioning, pp. 302-322. Appleton-Century-Crofts, New York (1965)
- (1965) Classical Conditioning , pp. 302-322
- Sheffield, F.¹

115
- 33749261645
- An intrinsic reward mechanism for efficient exploration
- Pittsburgh, Pennsylvania
- Şimşek, Ö., Barto, A.G.: An intrinsic reward mechanism for efficient exploration. In: ICML, pp. 833-840, Pittsburgh, Pennsylvania (2006)
- (2006) ICML , pp. 833-840
- Şimşek, Ö.¹ Barto, A.G.²

116
- 0001027894
- Transfer of learning by composing solutions of elemental sequential tasks
- Singh, S.: Transfer of learning by composing solutions of elemental sequential tasks. Mach. Learn. 8(3), 323-339 (1992)
- (1992) Mach. Learn. , vol.8 , Issue.3 , pp. 323-339
- Singh, S.¹

117
- 84899031920
- Intrinsically motivated reinforcement learning
- Vancouver, Canada
- Singh, S., Barto, A., Chentanez, N.: Intrinsically motivated reinforcement learning. In: NIPS, pp. 1281-1288, Vancouver, Canada (2005)
- (2005) NIPS , pp. 1281-1288
- Singh, S.¹ Barto, A.² Chentanez, N.³

118
- 0030240189
- A guide to constructs of control
- Skinner, E.A.: A guide to constructs of control. J. Pers. Soc. Psychol. 71(3), 549-570 (1996)
- (1996) J. Pers. Soc. Psychol. , vol.71 , Issue.3 , pp. 549-570
- Skinner, E.A.¹

119
- 33646230819
- Dopamine, prediction error and associative learning: A model-based account
- Smith, A., Li, M., Becker, S., Kapur, S.: Dopamine, prediction error and associative learning: A model-based account. Network 17(1), 61-84 (2006)
- (2006) Network , vol.17 , Issue.1 , pp. 61-84
- Smith, A.¹ Li, M.² Becker, S.³ Kapur, S.⁴

120
- 0001425882
- Reconciling the role of central serotonin neurons in human and animal behaviour
- Soubrié, P.: Reconciling the role of central serotonin neurons in human and animal behaviour. Behav. Brain Sci. 9, 319-364 (1986)
- (1986) Behav. Brain Sci. , vol.9 , pp. 319-364
- Soubrié, P.¹

121
- 14344258433
- A Bayesian framework for reinforcement learning
- Stanford, California
- Strens, M.: A Bayesian framework for reinforcement learning. In: ICML, pp. 943-950, Stanford, California (2000)
- (2000) ICML , pp. 943-950
- Strens, M.¹

122
- 0032930935
- A neural network model with dopamine-like reinforcement signal that learns a spatial delayed response task
- Suri, R.E., Schultz, W.: A neural network model with dopamine-like reinforcement signal that learns a spatial delayed response task. Neuroscience 91(3), 871-890 (1999)
- (1999) Neuroscience , vol.91 , Issue.3 , pp. 871-890
- Suri, R.E.¹ Schultz, W.²

123
- 33847202724
- Learning to predict by the methods of temporal differences
- Sutton, R.: Learning to predict by the methods of temporal differences. Mach. Learn. 3(1), 9-44 (1988)
- (1988) Mach. Learn. , vol.3 , Issue.1 , pp. 9-44
- Sutton, R.¹

124
- 85132026293
- Integrated architectures for learning, planning, and reacting based on approximating dynamic programming
- Sutton, R.: Integrated architectures for learning, planning, and reacting based on approximating dynamic programming. In: ICML, pp. 216-224, Austin, Texas (1990)
- (1990) ICML , pp. 216-224
- Sutton, R.¹

125
- 0033170372
- Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning
- Sutton, R., Precup, D., Singh, S.: Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artif. Intell. 112(1), 181-211 (1999)
- (1999) Artif. Intell. , vol.112 , Issue.1 , pp. 181-211
- Sutton, R.¹ Precup, D.² Singh, S.³

126
- 0004102479
- MIT, Cambridge
- Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction (Adaptive Computation and Machine Learning). MIT, Cambridge (1998)
- (1998) Reinforcement Learning: An Introduction (Adaptive Computation and Machine Learning)
- Sutton, R.S.¹ Barto, A.G.²

127
- 56049088540
- Multitask reinforcement learning on the distribution of MDPs
- Tanaka, F., Yamamura, M.: Multitask reinforcement learning on the distribution of MDPs. IEEJ Trans. Electron. Inform. Syst. C 123(5), 1004-1011 (2003)
- (2003) IEEJ Trans. Electron. Inform. Syst. C , vol.123 , Issue.5 , pp. 1004-1011
- Tanaka, F.¹ Yamamura, M.²

128
- 33749249312
- Hierarchical Dirichlet processes
- Teh, Y., Jordan, M., Beal, M., Blei, D.: Hierarchical Dirichlet processes. J. Am. Stat. Assoc. 101(476), 1566-1581 (2006)
- (2006) J. Am. Stat. Assoc. , vol.101 , Issue.476 , pp. 1566-1581
- Teh, Y.¹ Jordan, M.² Beal, M.³ Blei, D.⁴

129
- 33746260413
- Theory-based Bayesian models of inductive learning and reasoning
- Tenenbaum, J., Griffiths, T., Kemp, C.: Theory-based Bayesian models of inductive learning and reasoning. Trends Cogn. Sci. 10(7), 309-318 (2006)
- (2006) Trends Cogn. Sci. , vol.10 , Issue.7 , pp. 309-318
- Tenenbaum, J.¹ Griffiths, T.² Kemp, C.³

130
- 84862302350
- Hierarchical beta processes and the Indian buffet process
- San Juan, Puerto Rico
- Thibaux, R., Jordan, M.: Hierarchical beta processes and the Indian buffet process. In: AIStats, pp. 564-571, San Juan, Puerto Rico (2007)
- (2007) AIStats , pp. 564-571
- Thibaux, R.¹ Jordan, M.²

131
- 0003998491
- MacMillan, New York
- Thorndike, E.: Animal Intelligence. MacMillan, New York (1911)
- (1911) Animal Intelligence
- Thorndike, E.¹

132
- 33749882712
- Finding structure in reinforcement learning
- Denver, Colorado
- Thrun, S., Schwartz, A.: Finding structure in reinforcement learning. In: NIPS, pp. 385-392, Denver, Colorado (1995)
- (1995) NIPS , pp. 385-392
- Thrun, S.¹ Schwartz, A.²

133
- 58149442669
- Cognitive maps in rats and men
- Tolman, E.C.: Cognitive maps in rats and men. Psychol. Rev. 55(4), 189-208 (1948)
- (1948) Psychol. Rev. , vol.55 , Issue.4 , pp. 189-208
- Tolman, E.C.¹

134
- 66449119919
- A specific role for posterior dorsolateral striatum in human habit learning
- Tricomi, E., Balleine, B.W., O'Doherty, J.P.: A specific role for posterior dorsolateral striatum in human habit learning. Eur. J. Neurosci. 29(11), 2225-2232 (2009)
- (2009) Eur. J. Neurosci. , vol.29 , Issue.11 , pp. 2225-2232
- Tricomi, E.¹ Balleine, B.W.² O'Doherty, J.P.³

135
- 34247147767
- Determining the neural substrates of goal-directed learning in the human brain
- Valentin, V.V., Dickinson, A., O'Doherty, J.P.: Determining the neural substrates of goal-directed learning in the human brain. J. Neurosci. 27(15), 4019-4026 (2007)
- (2007) J. Neurosci. , vol.27 , Issue.15 , pp. 4019-4026
- Valentin, V.V.¹ Dickinson, A.² O'Doherty, J.P.³

136
- 63149146163
- Learning flexible sensori-motor mappings in a complex network
- Vasilaki, E., Fusi, S., Wang, X.-J., Senn, W.: Learning flexible sensori-motor mappings in a complex network. Biol. Cybern. 100(2), 147-158 (2009)
- (2009) Biol. Cybern. , vol.100 , Issue.2 , pp. 147-158
- Vasilaki, E.¹ Fusi, S.² Wang, X.-J.³ Senn, W.⁴

137
- 31844436266
- Bayesian sparse sampling for on-line reward optimization
- Bonn, Germany
- Wang, T., Lizotte, D., Bowling, M., Schuurmans, D.: Bayesian sparse sampling for on-line reward optimization. In: ICML, pp. 956-963, Bonn, Germany (2005)
- (2005) ICML , pp. 956-963
- Wang, T.¹ Lizotte, D.² Bowling, M.³ Schuurmans, D.⁴

138
- 0004049893
- Ph.D. Thesis, University of Cambridge
- Watkins, C.: Learning from delayed rewards. Ph.D. Thesis, University of Cambridge (1989)
- (1989) Learning from Delayed Rewards
- Watkins, C.¹

139
- 0345161973
- Efficient model-based exploration
- Zurich, Switzerland
- Wiering, M., Schmidhuber, J.: Efficient model-based exploration. In: Simulation of Adaptive Behavior, pp. 223-228, Zurich, Switzerland (1998)
- (1998) Simulation of Adaptive Behavior , pp. 223-228
- Wiering, M.¹ Schmidhuber, J.²

140
- 84989993724
- Auto-maintenance in the pigeon: Sustained pecking despite contingent non-reinforcement
- Williams, D.R., Williams, H.: Auto-maintenance in the pigeon: Sustained pecking despite contingent non-reinforcement. J. Exp. Anal. Behav. 12(4), 511-520 (1969)
- (1969) J. Exp. Anal. Behav. , vol.12 , Issue.4 , pp. 511-520
- Williams, D.R.¹ Williams, H.²

141
- 34547994508
- Multi-task reinforcement learning: A hierarchical Bayesian approach
- Corvallis, Oregon
- Wilson, A., Fern, A., Ray, S., Tadepalli, P.: Multi-task reinforcement learning: A hierarchical Bayesian approach. In: ICML, pp. 1015-1022, Corvallis, Oregon (2007)
- (2007) ICML , pp. 1015-1022
- Wilson, A.¹ Fern, A.² Ray, S.³ Tadepalli, P.⁴

142
- 84881042664
- Bayesian policy search with policy priors
- AAAI Press, Menlo Park
- Wingate, D., Goodman, N.D., Roy, D.M., Kaelbling, L.P., Tenenbaum, J.B.: Bayesian policy search with policy priors. In: Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence, vol. 2, pp. 1565-1570. AAAI Press, Menlo Park (2011)
- (2011) Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence , vol.2 , pp. 1565-1570
- Wingate, D.¹ Goodman, N.D.² Roy, D.M.³ Kaelbling, L.P.⁴ Tenenbaum, J.B.⁵

143
- 0032192424
- Multiple paired forward and inverse models for motor control
- Wolpert, D.M., Kawato, M.: Multiple paired forward and inverse models for motor control. Neural Netw. 11(7-8), 1317-1329 (1998)
- (1998) Neural Netw. , vol.11 , Issue.7-8 , pp. 1317-1329
- Wolpert, D.M.¹ Kawato, M.²

144
- 33646853495
- Resolution of uncertainty in prefrontal cortex
- Yoshida, W., Ishii, S.: Resolution of uncertainty in prefrontal cortex. Neuron 50(5), 781-789 (2006)
- (2006) Neuron , vol.50 , Issue.5 , pp. 781-789
- Yoshida, W.¹ Ishii, S.²

145
- 20444388016
- Uncertainty, neuromodulation, and attention
- Yu, A.J., Dayan, P.: Uncertainty, neuromodulation, and attention. Neuron 46(4), 681-692 (2005)
- (2005) Neuron , vol.46 , Issue.4 , pp. 681-692
- Yu, A.J.¹ Dayan, P.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.