Volume 9783642323751, 2013, Pages 73-91

Exploration from generalization mediated by multiple controllers

Author keywords

[No Author keywords available]

Indexed keywords

GENERALISATION; INTRINSIC MOTIVATION; LEARN+; LEARNING PROCESS; MULTIPLE CONTROLLERS; POTENTIAL BENEFITS;

EID: 84886641205     PISSN: None     EISSN: None     Source Type: Book    
DOI: 10.1007/978-3-642-32375-1_4     Document Type: Chapter
Times cited : (11)

References (145)
  • 2
    • 78649507911
    • A Bayesian sampling approach to exploration in reinforcement learning
    • Montreal, Canada
    • Asmuth, J., Li, L., Littman, M., Nouri, A., Wingate, D.: A Bayesian sampling approach to exploration in reinforcement learning. In: UAI, Montreal, Canada (2009)
    • (2009) UAI
    • Asmuth, J.1    Li, L.2    Littman, M.3    Nouri, A.4    Wingate, D.5
  • 3
    • 23244432007
    • An integrative theory of locus coeruleus-norepinephrine function: Adaptive gain and optimal performance
    • Aston-Jones, G., Cohen, J.D.: An integrative theory of locus coeruleus-norepinephrine function: Adaptive gain and optimal performance. Annu. Rev. Neurosci. 28, 403-450 (2005)
    • (2005) Annu. Rev. Neurosci. , vol.28 , pp. 403-450
    • Aston-Jones, G.1    Cohen, J.D.2
  • 4
    • 0036568025
    • Finite-time analysis of the multiarmed bandit problem
    • Auer, P., Cesa-Bianchi, N., Fischer, P.: Finite-time analysis of the multiarmed bandit problem. Mach. Learn. 47(2), 235-256 (2002a)
    • (2002) Mach. Learn. , vol.47 , Issue.2 , pp. 235-256
    • Auer, P.1    Cesa-Bianchi, N.2    Fischer, P.3
  • 6
    • 28444472936
    • Neural bases of food-seeking: Affect, arousal and reward in corticostriatolimbic circuits
    • Balleine, B.W.: Neural bases of food-seeking: Affect, arousal and reward in corticostriatolimbic circuits. Physiol. Behav. 86(5), 717-730 (2005)
    • (2005) Physiol. Behav. , vol.86 , Issue.5 , pp. 717-730
    • Balleine, B.W.1
  • 7
    • 0028101302
    • Columnar organization in the midbrain periaqueductal gray: Modules for emotional expression?
    • Bandler, R., Shipley, M.T.: Columnar organization in the midbrain periaqueductal gray: Modules for emotional expression? Trends Neurosci. 17(9), 379-389 (1994)
    • (1994) Trends Neurosci. , vol.17 , Issue.9 , pp. 379-389
    • Bandler, R.1    Shipley, M.T.2
  • 8
    • 0000541213
    • Adaptive critics and the basal ganglia
    • Houk, J., Davis, J., Beiser, D. (eds.) MIT, Cambridge
    • Barto, A.: Adaptive critics and the basal ganglia. In: Houk, J., Davis, J., Beiser, D. (eds.) Models of Information Processing in the Basal Ganglia, pp. 215-232. MIT, Cambridge (1995)
    • (1995) Models of Information Processing in the Basal Ganglia , pp. 215-232
    • Barto, A.1
  • 9
    • 0141988716
    • Recent advances in hierarchical reinforcement learning
    • Barto, A., Mahadevan, S.: Recent advances in hierarchical reinforcement learning. Discr. Event Dyn. Syst. 13(4), 341-379 (2003)
    • (2003) Discr. Event Dyn. Syst. , vol.13 , Issue.4 , pp. 341-379
    • Barto, A.1    Mahadevan, S.2
  • 10
    • 33749651693
    • Intrinsically motivated learning of hierarchical collections of skills
    • La Jolla, CA
    • Barto, A., Singh, S., Chentanez, N.: Intrinsically motivated learning of hierarchical collections of skills. In: ICDL 2004, La Jolla, CA (2004)
    • (2004) ICDL 2004
    • Barto, A.1    Singh, S.2    Chentanez, N.3
  • 11
    • 0020970738
    • Neuronlike elements that can solve difficult learning control problems
    • Barto, A., Sutton, R., Anderson, C.: Neuronlike elements that can solve difficult learning control problems. IEEE Trans. Syst. Man Cybern. 13(5), 834-846 (1983)
    • (1983) IEEE Trans. Syst. Man Cybern. , vol.13 , Issue.5 , pp. 834-846
    • Barto, A.1    Sutton, R.2    Anderson, C.3
  • 12
    • 84929046579
    • Intrinsic motivation and reinforcement learning
    • Baldassarre, G., Mirolli, M. (eds.) Springer, Berlin
    • Barto, A.G.: Intrinsic motivation and reinforcement learning. In: Baldassarre, G., Mirolli, M. (eds.) Intrinsically Motivated Learning in Natural and Artificial Systems, pp. 17-47. Springer, Berlin (2012)
    • (2012) Intrinsically Motivated Learning in Natural and Artificial Systems , pp. 17-47
    • Barto, A.G.1
  • 13
    • 84898936541
    • The infinite hidden Markov model
    • Vancouver, Canada
    • Beal, M., Ghahramani, Z., Rasmussen, C.: The infinite hidden Markov model. In: NIPS, pp. 577-584, Vancouver, Canada (2002)
    • (2002) NIPS , pp. 577-584
    • Beal, M.1    Ghahramani, Z.2    Rasmussen, C.3
  • 15
    • 85012688561
    • Princeton University Press, Princeton
    • Bellman, R.E.: Dynamic Programming. Princeton University Press, Princeton (1957)
    • (1957) Dynamic Programming
    • Bellman, R.E.1
  • 16
    • 2442701355
    • Motivation concepts in behavioral neuroscience
    • Berridge, K.C.: Motivation concepts in behavioral neuroscience. Physiol. Behav. 81, 179-209 (2004)
    • (2004) Physiol. Behav. , vol.81 , pp. 179-209
    • Berridge, K.C.1
  • 18
    • 0023800192
    • Ethoexperimental approaches to the biology of emotion
    • Blanchard, D.C., Blanchard, R.J.: Ethoexperimental approaches to the biology of emotion. Annu. Rev. Psychol. 39, 43-68 (1988)
    • (1988) Annu. Rev. Psychol. , vol.39 , pp. 43-68
    • Blanchard, D.C.1    Blanchard, R.J.2
  • 19
    • 13844281871
    • Bringing up robot: Fundamental mechanisms for creating a self-motivated, self-organizing architecture
    • Blank, D., Kumar, D., Meeden, L., Marshall, J.: Bringing up robot: Fundamental mechanisms for creating a self-motivated, self-organizing architecture. Cybern. Syst. 36(2), 125-150 (2005)
    • (2005) Cybern. Syst. , vol.36 , Issue.2 , pp. 125-150
    • Blank, D.1    Kumar, D.2    Meeden, L.3    Marshall, J.4
  • 20
    • 58149417523
    • Species-specific defense reactions and avoidance learning
    • Bolles, R.C.: Species-specific defense reactions and avoidance learning. Psychol. Rev. 77, 32-48 (1970)
    • (1970) Psychol. Rev. , vol.77 , pp. 32-48
    • Bolles, R.C.1
  • 21
    • 70350566799
    • Hierarchically organized behavior and its neural foundations: A reinforcement learning perspective
    • Botvinick, M.M., Niv, Y., Barto, A.C.: Hierarchically organized behavior and its neural foundations: A reinforcement learning perspective. Cognition 113(3), 262-280 (2009)
    • (2009) Cognition , vol.113 , Issue.3 , pp. 262-280
    • Botvinick, M.M.1    Niv, Y.2    Barto, A.C.3
  • 22
    • 78649651245
    • Opponency revisited: Competition and cooperation between dopamine and serotonin
    • Boureau, Y.-L., Dayan, P.: Opponency revisited: Competition and cooperation between dopamine and serotonin. Neuropsychopharmacology 36(1), 74-97 (2011)
    • (2011) Neuropsychopharmacology , vol.36 , Issue.1 , pp. 74-97
    • Boureau, Y.-L.1    Dayan, P.2
  • 23
    • 0041965975
    • R-max - A general polynomial time algorithm for near-optimal reinforcement learning
    • Brafman, R., Tennenholtz, M.: R-max-a general polynomial time algorithm for near-optimal reinforcement learning. J. Mach. Learn. Res. 3, 213-231 (2003)
    • (2003) J. Mach. Learn. Res. , vol.3 , pp. 213-231
    • Brafman, R.1    Tennenholtz, M.2
  • 24
    • 0000696066
    • The misbehavior of organisms
    • Breland, K., Breland, M.: The misbehavior of organisms. Am. Psychol. 16(9), 681-684 (1961)
    • (1961) Am. Psychol. , vol.16 , Issue.9 , pp. 681-684
    • Breland, K.1    Breland, M.2
  • 25
    • 0023981451
    • The ART of adaptive pattern recognition by a self-organizing neural network
    • Carpenter, G., Grossberg, S.: The ART of adaptive pattern recognition by a self-organizing neural network. Computer 21, 77-88 (1988)
    • (1988) Computer , vol.21 , pp. 77-88
    • Carpenter, G.1    Grossberg, S.2
  • 26
    • 0031189914
    • Multitask learning
    • Caruana, R.: Multitask learning. Mach. Learn. 28(1), 41-75 (1997)
    • (1997) Mach. Learn. , vol.28 , Issue.1 , pp. 41-75
    • Caruana, R.1
  • 28
    • 33750189183
    • Similarity and discrimination in classical conditioning: A latent variable account
    • Vancouver, Canada
    • Courville, A., Daw, N., Touretzky, D.: Similarity and discrimination in classical conditioning: A latent variable account. In: NIPS, pp. 313-320, Vancouver, Canada (2004)
    • (2004) NIPS , pp. 313-320
    • Courville, A.1    Daw, N.2    Touretzky, D.3
  • 29
    • 33646492363
    • The computational neurobiology of learning and reward
    • Daw, N.D., Doya, K.: The computational neurobiology of learning and reward. Curr. Opin. Neurobiol. 16(2), 199-204 (2006)
    • (2006) Curr. Opin. Neurobiol. , vol.16 , Issue.2 , pp. 199-204
    • Daw, N.D.1    Doya, K.2
  • 30
    • 0036592008
    • Opponent interactions between serotonin and dopamine
    • Daw, N.D., Kakade, S., Dayan, P.: Opponent interactions between serotonin and dopamine. Neural Netw. 15, 603-616 (2002)
    • (2002) Neural Netw. , vol.15 , pp. 603-616
    • Daw, N.D.1    Kakade, S.2    Dayan, P.3
  • 31
    • 28044450875
    • Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control
    • Daw, N.D., Niv, Y., Dayan, P.: Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nat. Neurosci. 8(12), 1704-1711 (2005)
    • (2005) Nat. Neurosci. , vol.8 , Issue.12 , pp. 1704-1711
    • Daw, N.D.1    Niv, Y.2    Dayan, P.3
  • 32
    • 33745223257
    • Cortical substrates for exploratory decisions in humans
    • Daw, N.D., O'Doherty, J.P., Dayan, P., Seymour, B., Dolan, R.J.: Cortical substrates for exploratory decisions in humans. Nature 441 (7095), 876-879 (2006)
    • (2006) Nature , vol.441 , Issue.7095 , pp. 876-879
    • Daw, N.D.1    O'Doherty, J.P.2    Dayan, P.3    Seymour, B.4    Dolan, R.J.5
  • 33
    • 50549094930
    • Bilinearity, rules, and prefrontal cortex
    • Dayan, P.: Bilinearity, rules, and prefrontal cortex. Front. Comput. Neurosci. 1, 1 (2007)
    • (2007) Front. Comput. Neurosci. , vol.1 , pp. 1
    • Dayan, P.1
  • 35
    • 40149109071
    • Serotonin, inhibition, and negative mood
    • Dayan, P., Huys, Q.J.M.: Serotonin, inhibition, and negative mood. PLoS Comput. Biol. 4(2), e4 (2008)
    • (2008) PLoS Comput. Biol. , vol.4 , Issue.2 , pp. e4
    • Dayan, P.1    Huys, Q.J.M.2
  • 36
    • 67349100969
    • Serotonin in affective control
    • Dayan, P., Huys, Q.J.M.: Serotonin in affective control. Annu. Rev. Neurosci. 32, 95-126 (2009)
    • (2009) Annu. Rev. Neurosci. , vol.32 , pp. 95-126
    • Dayan, P.1    Huys, Q.J.M.2
  • 37
    • 33749055062
    • The misbehavior of value and the discipline of the will
    • Dayan, P., Niv, Y., Seymour, B., Daw, N.D.: The misbehavior of value and the discipline of the will. Neural Netw. 19(8), 1153-1160 (2006)
    • (2006) Neural Netw. , vol.19 , Issue.8 , pp. 1153-1160
    • Dayan, P.1    Niv, Y.2    Seymour, B.3    Daw, N.D.4
  • 38
    • 0030260201
    • Exploration bonuses and dual control
    • Dayan, P., Sejnowski, T.: Exploration bonuses and dual control. Mach. Learn. 25(1), 5-22 (1996)
    • (1996) Mach. Learn. , vol.25 , Issue.1 , pp. 5-22
    • Dayan, P.1    Sejnowski, T.2
  • 40
    • 1142281527
    • Model based Bayesian exploration
    • Stockholm, Sweden
    • Dearden, R., Friedman, N., Andre, D.: Model based Bayesian exploration. In: UAI, pp. 150-159, Stockholm, Sweden (1999)
    • (1999) UAI , pp. 150-159
    • Dearden, R.1    Friedman, N.2    Andre, D.3
  • 43
    • 0043250430
    • The role of learning in motivation
    • Gallistel, C. (ed.) Wiley, New York
    • Dickinson, A., Balleine, B.: The role of learning in motivation. In: Gallistel, C. (ed.) Stevens' Handbook of Experimental Psychology, Vol. 3, pp. 497-533. Wiley, New York (2002)
    • (2002) Stevens' Handbook of Experimental Psychology , vol.3 , pp. 497-533
    • Dickinson, A.1    Balleine, B.2
  • 44
    • 0001806701
    • The MAXQ method for hierarchical reinforcement learning
    • Madison, Wisconsin
    • Dietterich, T.: The MAXQ method for hierarchical reinforcement learning. In: ICML, pp. 118-126, Madison, Wisconsin (1998)
    • (1998) ICML , pp. 118-126
    • Dietterich, T.1
  • 45
    • 0002278788
    • Hierarchical reinforcement learning with the MAXQ value function decomposition
    • Dietterich, T.: Hierarchical reinforcement learning with the MAXQ value function decomposition. J. Artif. Intell. Res. 13(1), 227-303 (2000)
    • (2000) J. Artif. Intell. Res. , vol.13 , Issue.1 , pp. 227-303
    • Dietterich, T.1
  • 46
    • 0036592023
    • Metalearning and neuromodulation
    • Doya, K.: Metalearning and neuromodulation. Neural Netw. 15(4-6), 495-506 (2002)
    • (2002) Neural Netw. , vol.15 , Issue.4-6 , pp. 495-506
    • Doya, K.1
  • 49
    • 0036832959
    • Structure in the space of value functions
    • Foster, D., Dayan, P.: Structure in the space of value functions. Mach. Learn. 49(2), 325-346 (2002)
    • (2002) Mach. Learn. , vol.49 , Issue.2 , pp. 325-346
    • Foster, D.1    Dayan, P.2
  • 51
    • 77952541839
    • Learning latent structure: Carving nature at its joints
    • Gershman, S., Niv, Y.: Learning latent structure: Carving nature at its joints. Curr. Opin. Neurobiol. (2010)
    • (2010) Curr. Opin. Neurobiol.
    • Gershman, S.1    Niv, Y.2
  • 52
    • 74049117596
    • Context, learning, and extinction
    • Gershman, S.J., Blei, D.M., Niv, Y.: Context, learning, and extinction. Psychol. Rev. 117(1), 197-209 (2010b)
    • (2010) Psychol. Rev. , vol.117 , Issue.1 , pp. 197-209
    • Gershman, S.J.1    Blei, D.M.2    Niv, Y.3
  • 54
    • 0010966147
    • Rats learn the relationship between responding and environmental events: An expansion of the learned helplessness hypothesis
    • Goodkin, F.: Rats learn the relationship between responding and environmental events: An expansion of the learned helplessness hypothesis. Learn. Motiv. 7, 382-393 (1976)
    • (1976) Learn. Motiv. , vol.7 , pp. 382-393
    • Goodkin, F.1
  • 57
    • 34548566262
    • Towards an executive without a homunculus: Computational models of the prefrontal cortex/basal ganglia system
    • Hazy, T.E., Frank, M.J., O'Reilly, R.C.: Towards an executive without a homunculus: Computational models of the prefrontal cortex/basal ganglia system. Philos. Trans. R. Soc. Lond. B Biol. Sci. 362 (1485), 1601-1613 (2007)
    • (2007) Philos. Trans. R. Soc. Lond. B Biol. Sci. , vol.362 , Issue.1485 , pp. 1601-1613
    • Hazy, T.E.1    Frank, M.J.2    O'Reilly, R.C.3
  • 58
    • 0034031837
    • Multiple forms of short-term plasticity at excitatory synapses in rat medial prefrontal cortex
    • Hempel, C.M., Hartman, K.H., Wang, X.J., Turrigiano, G.G., Nelson, S.B.: Multiple forms of short-term plasticity at excitatory synapses in rat medial prefrontal cortex. J. Neurophysiol. 83(5), 3031-3041 (2000)
    • (2000) J. Neurophysiol. , vol.83 , Issue.5 , pp. 3031-3041
    • Hempel, C.M.1    Hartman, K.H.2    Wang, X.J.3    Turrigiano, G.G.4    Nelson, S.B.5
  • 59
    • 0022979089
    • An approach through the looking-glass
    • Hershberger, W.A.: An approach through the looking-glass. Anim. Learn. Behav. 14, 443-451 (1986)
    • (1986) Anim. Learn. Behav. , vol.14 , pp. 443-451
    • Hershberger, W.A.1
  • 60
    • 0029652445
    • The "wake-sleep" algorithm for unsupervised neural networks
    • Hinton, G.E., Dayan, P., Frey, B.J., Neal, R.M.: The "wake-sleep" algorithm for unsupervised neural networks. Science 268 (5214), 1158-1161 (1995)
    • (1995) Science , vol.268 , Issue.5214 , pp. 1158-1161
    • Hinton, G.E.1    Dayan, P.2    Frey, B.J.3    Neal, R.M.4
  • 61
    • 0031590130
    • Generative models for discovering sparse distributed representations
    • Hinton, G.E., Ghahramani, Z.: Generative models for discovering sparse distributed representations. Philos. Trans. R. Soc. Lond. B Biol. Sci. 352 (1358), 1177-1190 (1997)
    • (1997) Philos. Trans. R. Soc. Lond. B Biol. Sci. , vol.352 , Issue.1358 , pp. 1177-1190
    • Hinton, G.E.1    Ghahramani, Z.2
  • 62
    • 33746600649
    • Reducing the dimensionality of data with neural networks
    • Hinton, G.E., Salakhutdinov, R.R.: Reducing the dimensionality of data with neural networks. Science 313 (5786), 504-507 (2006)
    • (2006) Science , vol.313 , Issue.5786 , pp. 504-507
    • Hinton, G.E.1    Salakhutdinov, R.R.2
  • 63
    • 0031747058
    • Amount of training affects associatively-activated event representation
    • Holland, P.: Amount of training affects associatively-activated event representation. Neuropharmacology 37(4-5), 461-469 (1998)
    • (1998) Neuropharmacology , vol.37 , Issue.4-5 , pp. 461-469
    • Holland, P.1
  • 64
    • 0034061668
    • Mesolimbocortical and nigrostriatal dopamine responses to salient non-reward events
    • Horvitz, J.C.: Mesolimbocortical and nigrostriatal dopamine responses to salient non-reward events. Neuroscience 96(4), 651-656 (2000)
    • (2000) Neuroscience , vol.96 , Issue.4 , pp. 651-656
    • Horvitz, J.C.1
  • 65
    • 0030757872
    • Burst activity of ventral tegmental dopamine neurons is elicited by sensory stimuli in the awake cat
    • Horvitz, J.C., Stewart, T., Jacobs, B.L.: Burst activity of ventral tegmental dopamine neurons is elicited by sensory stimuli in the awake cat. Brain Res. 759(2), 251-258 (1997)
    • (1997) Brain Res. , vol.759 , Issue.2 , pp. 251-258
    • Horvitz, J.C.1    Stewart, T.2    Jacobs, B.L.3
  • 67
    • 34447328072
    • Inherent value systems for autonomous mental development
    • Huang, X., Weng, J.: Inherent value systems for autonomous mental development. Int. J. Human. Robot. 4, 407-433 (2007)
    • (2007) Int. J. Human. Robot. , vol.4 , pp. 407-433
    • Huang, X.1    Weng, J.2
  • 69
    • 67651041654
    • Reinforcers and control
    • Ph.D. Thesis, Gatsby Computational Neuroscience Unit, UCL
    • Huys, Q.: Reinforcers and control. Towards a computational ætiology of depression. Ph.D. Thesis, Gatsby Computational Neuroscience Unit, UCL (2007)
    • (2007) Towards a Computational ætiology of Depression
    • Huys, Q.1
  • 70
    • 70350570499
    • A Bayesian formulation of behavioral control
    • Huys, Q.J.M., Dayan, P.: A Bayesian formulation of behavioral control. Cognition 113, 314-328 (2009)
    • (2009) Cognition , vol.113 , pp. 314-328
    • Huys, Q.J.M.1    Dayan, P.2
  • 71
    • 0036592028
    • Control of exploitation-exploration meta-parameter in reinforcement learning
    • Ishii, S., Yoshida, W., Yoshimoto, J.: Control of exploitation-exploration meta-parameter in reinforcement learning. Neural Netw. 15(4-6), 665-687 (2002)
    • (2002) Neural Netw. , vol.15 , Issue.4-6 , pp. 665-687
    • Ishii, S.1    Yoshida, W.2    Yoshimoto, J.3
  • 72
    • 0032073263
    • Planning and acting in partially observable stochastic domains
    • Kaelbling, L., Littman, M., Cassandra, A.: Planning and acting in partially observable stochastic domains. Artif. Intell. 101(1-2), 99-134 (1998)
    • (1998) Artif. Intell. , vol.101 , Issue.1-2 , pp. 99-134
    • Kaelbling, L.1    Littman, M.2    Cassandra, A.3
  • 73
    • 0036592029
    • Dopamine: Generalization and bonuses
    • Kakade, S., Dayan, P.: Dopamine: Generalization and bonuses. Neural Netw. 15(4-6), 549-559 (2002)
    • (2002) Neural Netw. , vol.15 , Issue.4-6 , pp. 549-559
    • Kakade, S.1    Dayan, P.2
  • 74
    • 0036832954
    • Near-optimal reinforcement learning in polynomial time
    • Kearns, M., Singh, S.: Near-optimal reinforcement learning in polynomial time. Mach. Learn. 49(2), 209-232 (2002)
    • (2002) Mach. Learn. , vol.49 , Issue.2 , pp. 209-232
    • Kearns, M.1    Singh, S.2
  • 75
    • 0035712679
    • Parallel circuits mediating distinct emotional coping reactions to different types of stress
    • Keay, K.A., Bandler, R.: Parallel circuits mediating distinct emotional coping reactions to different types of stress. Neurosci. Biobehav. Rev. 25(7-8), 669-678 (2001)
    • (2001) Neurosci. Biobehav. Rev. , vol.25 , Issue.7-8 , pp. 669-678
    • Keay, K.A.1    Bandler, R.2
  • 76
    • 0037382264
    • Coordination of actions and habits in the medial prefrontal cortex of rats
    • Killcross, S., Coutureau, E.: Coordination of actions and habits in the medial prefrontal cortex of rats. Cereb. Cortex 13(4), 400-408 (2003)
    • (2003) Cereb. Cortex , vol.13 , Issue.4 , pp. 400-408
    • Killcross, S.1    Coutureau, E.2
  • 77
    • 84880873347
    • Building portable options: Skill transfer in reinforcement learning
    • Hyderabad, India
    • Konidaris, G., Barto, A.: Building portable options: Skill transfer in reinforcement learning. In: IJCAI, pp. 895-900, Hyderabad, India (2007)
    • (2007) IJCAI , pp. 895-900
    • Konidaris, G.1    Barto, A.2
  • 78
    • 78751681641
    • Efficient skill learning using abstraction selection
    • Pasadena, California
    • Konidaris, G., Barto, A.: Efficient skill learning using abstraction selection. In: IJCAI, pp. 1107-1112, Pasadena, California (2009)
    • (2009) IJCAI , pp. 1107-1112
    • Konidaris, G.1    Barto, A.2
  • 79
    • 59649113160
    • Flexible shaping: How learning in small steps helps
    • Krueger, K.A., Dayan, P.: Flexible shaping: How learning in small steps helps. Cognition 110(3), 380-394 (2009)
    • (2009) Cognition , vol.110 , Issue.3 , pp. 380-394
    • Krueger, K.A.1    Dayan, P.2
  • 81
    • 0037840849
    • On the undecidability of probabilistic planning and related stochastic optimization problems
    • Madani, O., Hanks, S., Condon, A.: On the undecidability of probabilistic planning and related stochastic optimization problems. Artif. Intell. 147(1-2), 5-34 (2003)
    • (2003) Artif. Intell. , vol.147 , Issue.1-2 , pp. 5-34
    • Madani, O.1    Hanks, S.2    Condon, A.3
  • 83
    • 19844365569
    • Stressor controllability and learned helplessness: The roles of the dorsal raphe nucleus, serotonin, and corticotropin-releasing factor
    • Maier, S.F., Watkins, L.R.: Stressor controllability and learned helplessness: The roles of the dorsal raphe nucleus, serotonin, and corticotropin-releasing factor. Neurosci. Biobehav. Rev. 29(4-5), 829-841 (2005)
    • (2005) Neurosci. Biobehav. Rev. , vol.29 , Issue.4-5 , pp. 829-841
    • Maier, S.F.1    Watkins, L.R.2
  • 84
    • 3042590043
    • A two-dimensional neuropsychology of defense: Fear/anxiety and defensive distance
    • McNaughton, N., Corr, P.J.: A two-dimensional neuropsychology of defense: Fear/anxiety and defensive distance. Neurosci. Biobehav. Rev. 28(3), 285-305 (2004)
    • (2004) Neurosci. Biobehav. Rev. , vol.28 , Issue.3 , pp. 285-305
    • McNaughton, N.1    Corr, P.J.2
  • 85
    • 84906736869
    • Functions and mechanisms of intrinsic motivations: The knowledge versus competence distinction
    • Baldassarre, G., Mirolli, M. (eds.) Springer, Berlin
    • Mirolli, M., Baldassarre, G.: Functions and mechanisms of intrinsic motivations: The knowledge versus competence distinction. In: Baldassarre, G., Mirolli, M. (eds.) Intrinsically Motivated Learning in Natural and Artificial Systems, pp. 49-72. Springer, Berlin (2012)
    • (2012) Intrinsically Motivated Learning in Natural and Artificial Systems , pp. 49-72
    • Mirolli, M.1    Baldassarre, G.2
  • 86
    • 40849102598
    • Synaptic theory of working memory
    • Mongillo, G., Barak, O., Tsodyks, M.: Synaptic theory of working memory. Science 319 (5869), 1543-1546 (2008)
    • (2008) Science , vol.319 , Issue.5869 , pp. 1543-1546
    • Mongillo, G.1    Barak, O.2    Tsodyks, M.3
  • 87
    • 0029981543
    • A framework for mesencephalic dopamine systems based on predictive Hebbian learning
    • Montague, P.R., Dayan, P., Sejnowski, T.J.: A framework for mesencephalic dopamine systems based on predictive Hebbian learning. J. Neurosci. 16(5), 1936-1947 (1996)
    • (1996) J. Neurosci. , vol.16 , Issue.5 , pp. 1936-1947
    • Montague, P.R.1    Dayan, P.2    Sejnowski, T.J.3
  • 88
    • 77950032550
    • Markov chain sampling methods for Dirichlet process mixture models
    • Neal, R.: Markov chain sampling methods for Dirichlet process mixture models. J. Comput. Graph. Stat. 9(2), 249-265 (2000)
    • (2000) J. Comput. Graph. Stat. , vol.9 , Issue.2 , pp. 249-265
    • Neal, R.1
  • 89
    • 0141596576
    • Policy invariance under reward transformations: Theory and application to reward shaping
    • Bled, Slovenia
    • Ng, A., Harada, D., Russell, S.: Policy invariance under reward transformations: Theory and application to reward shaping. In: ICML, pp. 278-287, Bled, Slovenia (1999)
    • (1999) ICML , pp. 278-287
    • Ng, A.1    Harada, D.2    Russell, S.3
  • 90
    • 84858776393
    • Multi-resolution exploration in continuous spaces
    • Nouri, A., Littman, M.: Multi-resolution exploration in continuous spaces. NIPS, pp. 1209-1216 (2009)
    • (2009) NIPS , pp. 1209-1216
    • Nouri, A.1    Littman, M.2
  • 91
    • 33644927837
    • Making working memory work: A computational model of learning in the prefrontal cortex and basal ganglia
    • O'Reilly, R.C., Frank, M.J.: Making working memory work: A computational model of learning in the prefrontal cortex and basal ganglia. Neural Comput. 18(2), 283-328 (2006)
    • (2006) Neural Comput. , vol.18 , Issue.2 , pp. 283-328
    • O'Reilly, R.C.1    Frank, M.J.2
  • 92
    • 34047267520
    • Intrinsic motivation systems for autonomous mental development
    • Oudeyer, P., Kaplan, F., Hafner, V.: Intrinsic motivation systems for autonomous mental development. IEEE Trans. Evol. Comput. 11(2), 265-286 (2007)
    • (2007) IEEE Trans. Evol. Comput. , vol.11 , Issue.2 , pp. 265-286
    • Oudeyer, P.1    Kaplan, F.2    Hafner, V.3
  • 93
    • 33748408630
    • Affective neuroscience
    • New York
    • Panksepp, J.: Affective Neuroscience. OUP, New York (1998)
    • (1998) Affective Neuroscience
    • Panksepp, J.1
  • 94
    • 0000977910
    • The complexity of Markov decision processes
    • Papadimitriou, C., Tsitsiklis, J.: The complexity of Markov decision processes. Math. Oper. Res. 12(3), 441-450 (1987)
    • (1987) Math. Oper. Res. , vol.12 , Issue.3 , pp. 441-450
    • Papadimitriou, C.1    Tsitsiklis, J.2
  • 95
    • 84898956770
    • Reinforcement learning with hierarchies of machines
    • Denver, Colorado
    • Parr, R., Russell, S.: Reinforcement learning with hierarchies of machines. In: NIPS, pp. 1043-1049, Denver, Colorado (1998)
    • (1998) NIPS , pp. 1043-1049
    • Parr, R.1    Russell, S.2
  • 96
    • 33749251297
    • An analytic solution to discrete Bayesian reinforcement learning
    • Pittsburgh, Pennsylvania
    • Poupart, P., Vlassis, N., Hoey, J., Regan, K.: An analytic solution to discrete Bayesian reinforcement learning. In: ICML, pp. 697-704, Pittsburgh, Pennsylvania (2006)
    • (2006) ICML , pp. 697-704
    • Poupart, P.1    Vlassis, N.2    Hoey, J.3    Regan, K.4
  • 99
    • 0033119561
    • Is the short-latency dopamine response too short to signal reward error?
    • Redgrave, P., Prescott, T.J., Gurney, K.: Is the short-latency dopamine response too short to signal reward error? Trends Neurosci. 22(4), 146-151 (1999)
    • (1999) Trends Neurosci. , vol.22 , Issue.4 , pp. 146-151
    • Redgrave, P.1    Prescott, T.J.2    Gurney, K.3
  • 100
    • 0035341482
    • Fear and feeding in the nucleus accumbens shell: Rostrocaudal segregation of GABA-elicited defensive behavior versus eating behavior
    • Reynolds, S.M., Berridge, K.C.: Fear and feeding in the nucleus accumbens shell: Rostrocaudal segregation of GABA-elicited defensive behavior versus eating behavior. J. Neurosci. 21(9), 3261-3270 (2001)
    • (2001) J. Neurosci. , vol.21 , Issue.9 , pp. 3261-3270
    • Reynolds, S.M.1    Berridge, K.C.2
  • 101
    • 0037104732
    • Positive and negative motivation in nucleus accumbens shell: Bivalent rostrocaudal gradients for GABA-elicited eating, taste "liking"/"disliking" reactions, place preference/avoidance, and fear
    • Reynolds, S.M., Berridge, K.C.: Positive and negative motivation in nucleus accumbens shell: Bivalent rostrocaudal gradients for GABA-elicited eating, taste "liking"/"disliking" reactions, place preference/avoidance, and fear. J. Neurosci. 22(16), 7308-7320 (2002)
    • (2002) J. Neurosci. , vol.22 , Issue.16 , pp. 7308-7320
    • Reynolds, S.M.1    Berridge, K.C.2
  • 102
    • 41149151266
    • Emotional environments retune the valence of appetitive versus fearful functions in nucleus accumbens
    • Reynolds, S.M., Berridge, K.C.: Emotional environments retune the valence of appetitive versus fearful functions in nucleus accumbens. Nat. Neurosci. 11(4), 423-425 (2008)
    • (2008) Nat. Neurosci. , vol.11 , Issue.4 , pp. 423-425
    • Reynolds, S.M.1    Berridge, K.C.2
  • 103
    • 0031189347
    • CHILD: A first step towards continual learning
    • Ring, M.: CHILD: A first step towards continual learning. Mach. Learn. 28(1), 77-104 (1997)
    • (1997) Mach. Learn. , vol.28 , Issue.1 , pp. 77-104
    • Ring, M.1
  • 104
    • 84929054210
    • Toward a formal framework for continual learning
    • Whistler, Canada
    • Ring, M.: Toward a formal framework for continual learning. In: NIPS Workshop on Inductive Transfer, Whistler, Canada (2005)
    • (2005) NIPS Workshop on Inductive Transfer
    • Ring, M.1
  • 105
    • 41149161631
    • Choice, uncertainty and value in prefrontal and cingulate cortex
    • Rushworth, M.F.S., Behrens, T.E.J.: Choice, uncertainty and value in prefrontal and cingulate cortex. Nat. Neurosci. 11(4), 389-397 (2008)
    • (2008) Nat. Neurosci. , vol.11 , Issue.4 , pp. 389-397
    • Rushworth, M.F.S.1    Behrens, T.E.J.2
  • 106
    • 0002209063
    • Intrinsic and extrinsic motivations: Classic definitions and new directions
    • Ryan, R., Deci, E.: Intrinsic and extrinsic motivations: Classic definitions and new directions. Contemp. Educ. Psychol. 25(1), 54-67 (2000)
    • (2000) Contemp. Educ. Psychol. , vol.25 , Issue.1 , pp. 54-67
    • Ryan, R.1    Deci, E.2
  • 107
    • 0742324926
    • Inter-module credit assignment in modular reinforcement learning
    • Samejima, K., Doya, K., Kawato, M.: Inter-module credit assignment in modular reinforcement learning. Neural Netw. 16(7), 985-994 (2003)
    • (2003) Neural Netw. , vol.16 , Issue.7 , pp. 985-994
    • Samejima, K.1    Doya, K.2    Kawato, M.3
  • 108
    • 0001201756
    • Some studies in machine learning using the game of checkers
    • Samuel, A.: Some studies in machine learning using the game of checkers. IBM J. Res. Dev. 3, 210-229 (1959)
    • (1959) IBM J. Res. Dev. , vol.3 , pp. 210-229
    • Samuel, A.1
  • 109
  • 110
    • 0026306990
    • Curious model-building control systems
    • Seattle, Washington State IEEE
    • Schmidhuber, J.: Curious model-building control systems. In: IJCNN, pp. 1458-1463, Seattle, Washington State IEEE (1991)
    • (1991) IJCNN , pp. 1458-1463
    • Schmidhuber, J.1
  • 111
    • 84880251870
    • Gödel machines: Fully self-referential optimal universal self-improvers
    • Schmidhuber, J.: Gödel machines: Fully self-referential optimal universal self-improvers. Artif. Gen. Intell., pp. 199-226 (2006)
    • (2006) Artif. Gen. Intell. , pp. 199-226
    • Schmidhuber, J.1
  • 112
    • 70349334569
    • Ultimate cognition à la Gödel
    • Schmidhuber, J.: Ultimate cognition à la Gödel. Cogn. Comput. 1, 117-193 (2009)
    • (2009) Cogn. Comput. , vol.1 , pp. 117-193
    • Schmidhuber, J.1
  • 114
    • 0002193484
    • Relation between classical conditioning and instrumental learning
    • Prokasy, W. (ed.) Appleton-Century-Crofts, New York
    • Sheffield, F.: Relation between classical conditioning and instrumental learning. In: Prokasy, W. (ed.) Classical Conditioning, pp. 302-322. Appleton-Century-Crofts, New York (1965)
    • (1965) Classical Conditioning , pp. 302-322
    • Sheffield, F.1
  • 115
    • 33749261645
    • An intrinsic reward mechanism for efficient exploration
    • Pittsburgh, Pennsylvania
    • Şimşek, Ö., Barto, A.G.: An intrinsic reward mechanism for efficient exploration. In: ICML, pp. 833-840, Pittsburgh, Pennsylvania (2006)
    • (2006) ICML , pp. 833-840
    • Şimşek, Ö.1    Barto, A.G.2
  • 116
    • 0001027894
    • Transfer of learning by composing solutions of elemental sequential tasks
    • Singh, S.: Transfer of learning by composing solutions of elemental sequential tasks. Mach. Learn. 8(3), 323-339 (1992)
    • (1992) Mach. Learn. , vol.8 , Issue.3 , pp. 323-339
    • Singh, S.1
  • 117
    • 84899031920
    • Intrinsically motivated reinforcement learning
    • Vancouver, Canada
    • Singh, S., Barto, A., Chentanez, N.: Intrinsically motivated reinforcement learning. In: NIPS, pp. 1281-1288, Vancouver, Canada (2005)
    • (2005) NIPS , pp. 1281-1288
    • Singh, S.1    Barto, A.2    Chentanez, N.3
  • 118
    • 0030240189
    • A guide to constructs of control
    • Skinner, E.A.: A guide to constructs of control. J. Pers. Soc. Psychol. 71(3), 549-570 (1996)
    • (1996) J. Pers. Soc. Psychol. , vol.71 , Issue.3 , pp. 549-570
    • Skinner, E.A.1
  • 119
    • 33646230819
    • Dopamine, prediction error and associative learning: A model-based account
    • Smith, A., Li, M., Becker, S., Kapur, S.: Dopamine, prediction error and associative learning: A model-based account. Network 17(1), 61-84 (2006)
    • (2006) Network , vol.17 , Issue.1 , pp. 61-84
    • Smith, A.1    Li, M.2    Becker, S.3    Kapur, S.4
  • 120
    • 0001425882
    • Reconciling the role of central serotonin neurons in human and animal behaviour
    • Soubrié, P.: Reconciling the role of central serotonin neurons in human and animal behaviour. Behav. Brain Sci. 9, 319-364 (1986)
    • (1986) Behav. Brain Sci. , vol.9 , pp. 319-364
    • Soubrié, P.1
  • 121
    • 14344258433
    • A Bayesian framework for reinforcement learning
    • Stanford, California
    • Strens, M.: A Bayesian framework for reinforcement learning. In: ICML, pp. 943-950, Stanford, California (2000)
    • (2000) ICML , pp. 943-950
    • Strens, M.1
  • 122
    • 0032930935
    • A neural network model with dopamine-like reinforcement signal that learns a spatial delayed response task
    • Suri, R.E., Schultz, W.: A neural network model with dopamine-like reinforcement signal that learns a spatial delayed response task. Neuroscience 91(3), 871-890 (1999)
    • (1999) Neuroscience , vol.91 , Issue.3 , pp. 871-890
    • Suri, R.E.1    Schultz, W.2
  • 123
    • 33847202724
    • Learning to predict by the methods of temporal differences
    • Sutton, R.: Learning to predict by the methods of temporal differences. Mach. Learn. 3(1), 9-44 (1988)
    • (1988) Mach. Learn. , vol.3 , Issue.1 , pp. 9-44
    • Sutton, R.1
  • 124
    • 85132026293
    • Integrated architectures for learning, planning, and reacting based on approximating dynamic programming
    • Sutton, R.: Integrated architectures for learning, planning, and reacting based on approximating dynamic programming. In: ICML, pp. 216-224, Austin, Texas (1990)
    • (1990) ICML , pp. 216-224
    • Sutton, R.1
  • 125
    • 0033170372
    • Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning
    • Sutton, R., Precup, D., Singh, S.: Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artif. Intell. 112(1), 181-211 (1999)
    • (1999) Artif. Intell. , vol.112 , Issue.1 , pp. 181-211
    • Sutton, R.1    Precup, D.2    Singh, S.3
  • 127
    • 56049088540
    • Multitask reinforcement learning on the distribution of MDPs
    • Tanaka, F., Yamamura, M.: Multitask reinforcement learning on the distribution of MDPs. IEEJ Trans. Electron. Inform. Syst. C 123(5), 1004-1011 (2003)
    • (2003) IEEJ Trans. Electron. Inform. Syst. C , vol.123 , Issue.5 , pp. 1004-1011
    • Tanaka, F.1    Yamamura, M.2
  • 128
    • 33749249312
    • Hierarchical Dirichlet processes
    • Teh, Y., Jordan, M., Beal, M., Blei, D.: Hierarchical Dirichlet processes. J. Am. Stat. Assoc. 101(476), 1566-1581 (2006)
    • (2006) J. Am. Stat. Assoc. , vol.101 , Issue.476 , pp. 1566-1581
    • Teh, Y.1    Jordan, M.2    Beal, M.3    Blei, D.4
  • 129
    • 33746260413
    • Theory-based Bayesian models of inductive learning and reasoning
    • Tenenbaum, J., Griffiths, T., Kemp, C.: Theory-based Bayesian models of inductive learning and reasoning. Trends Cogn. Sci. 10(7), 309-318 (2006)
    • (2006) Trends Cogn. Sci. , vol.10 , Issue.7 , pp. 309-318
    • Tenenbaum, J.1    Griffiths, T.2    Kemp, C.3
  • 130
    • 84862302350
    • Hierarchical beta processes and the Indian buffet process
    • San Juan, Puerto Rico
    • Thibaux, R., Jordan, M.: Hierarchical beta processes and the Indian buffet process. In: AIStats, pp. 564-571, San Juan, Puerto Rico (2007)
    • (2007) AIStats , pp. 564-571
    • Thibaux, R.1    Jordan, M.2
  • 132
    • 33749882712
    • Finding structure in reinforcement learning
    • Denver, Colorado
    • Thrun, S., Schwartz, A.: Finding structure in reinforcement learning. In: NIPS, pp. 385-392, Denver, Colorado (1995)
    • (1995) NIPS , pp. 385-392
    • Thrun, S.1    Schwartz, A.2
  • 133
    • 58149442669
    • Cognitive maps in rats and men
    • Tolman, E.C.: Cognitive maps in rats and men. Psychol. Rev. 55(4), 189-208 (1948)
    • (1948) Psychol. Rev. , vol.55 , Issue.4 , pp. 189-208
    • Tolman, E.C.1
  • 134
    • 66449119919
    • A specific role for posterior dorsolateral striatum in human habit learning
    • Tricomi, E., Balleine, B.W., O'Doherty, J.P.: A specific role for posterior dorsolateral striatum in human habit learning. Eur. J. Neurosci. 29(11), 2225-2232 (2009)
    • (2009) Eur. J. Neurosci. , vol.29 , Issue.11 , pp. 2225-2232
    • Tricomi, E.1    Balleine, B.W.2    O'Doherty, J.P.3
  • 135
    • 34247147767
    • Determining the neural substrates of goal-directed learning in the human brain
    • Valentin, V.V., Dickinson, A., O'Doherty, J.P.: Determining the neural substrates of goal-directed learning in the human brain. J. Neurosci. 27(15), 4019-4026 (2007)
    • (2007) J. Neurosci. , vol.27 , Issue.15 , pp. 4019-4026
    • Valentin, V.V.1    Dickinson, A.2    O'Doherty, J.P.3
  • 136
    • 63149146163
    • Learning flexible sensori-motor mappings in a complex network
    • Vasilaki, E., Fusi, S., Wang, X.-J., Senn, W.: Learning flexible sensori-motor mappings in a complex network. Biol. Cybern. 100(2), 147-158 (2009)
    • (2009) Biol. Cybern. , vol.100 , Issue.2 , pp. 147-158
    • Vasilaki, E.1    Fusi, S.2    Wang, X.-J.3    Senn, W.4
  • 137
    • 31844436266
    • Bayesian sparse sampling for on-line reward optimization
    • Bonn, Germany
    • Wang, T., Lizotte, D., Bowling, M., Schuurmans, D.: Bayesian sparse sampling for on-line reward optimization. In: ICML, pp. 956-963, Bonn, Germany (2005)
    • (2005) ICML , pp. 956-963
    • Wang, T.1    Lizotte, D.2    Bowling, M.3    Schuurmans, D.4
  • 139
  • 140
    • 84989993724
    • Auto-maintenance in the pigeon: Sustained pecking despite contingent non-reinforcement
    • Williams, D.R., Williams, H.: Auto-maintenance in the pigeon: Sustained pecking despite contingent non-reinforcement. J. Exp. Anal. Behav. 12(4), 511-520 (1969)
    • (1969) J. Exp. Anal. Behav. , vol.12 , Issue.4 , pp. 511-520
    • Williams, D.R.1    Williams, H.2
  • 141
    • 34547994508
    • Multi-task reinforcement learning: A hierarchical Bayesian approach
    • Corvallis, Oregon
    • Wilson, A., Fern, A., Ray, S., Tadepalli, P.: Multi-task reinforcement learning: A hierarchical Bayesian approach. In: ICML, pp. 1015-1022, Corvallis, Oregon (2007)
    • (2007) ICML , pp. 1015-1022
    • Wilson, A.1    Fern, A.2    Ray, S.3    Tadepalli, P.4
  • 143
    • 0032192424
    • Multiple paired forward and inverse models for motor control
    • Wolpert, D.M., Kawato, M.: Multiple paired forward and inverse models for motor control. Neural Netw. 11(7-8), 1317-1329 (1998)
    • (1998) Neural Netw. , vol.11 , Issue.7-8 , pp. 1317-1329
    • Wolpert, D.M.1    Kawato, M.2
  • 144
    • 33646853495
    • Resolution of uncertainty in prefrontal cortex
    • Yoshida, W., Ishii, S.: Resolution of uncertainty in prefrontal cortex. Neuron 50(5), 781-789 (2006)
    • (2006) Neuron , vol.50 , Issue.5 , pp. 781-789
    • Yoshida, W.1    Ishii, S.2
  • 145
    • 20444388016
    • Uncertainty, neuromodulation, and attention
    • Yu, A.J., Dayan, P.: Uncertainty, neuromodulation, and attention. Neuron 46(4), 681-692 (2005)
    • (2005) Neuron , vol.46 , Issue.4 , pp. 681-692
    • Yu, A.J.1    Dayan, P.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.