2013, Pages 299-320

Advanced Reinforcement Learning

Author keywords

Dopamine; Hierarchical reinforcement learning; Reinforcement learning; Uncertainty

EID: 84897397355     PISSN: None     EISSN: None     Source Type: Book    
DOI: 10.1016/B978-0-12-416008-8.00016-4     Document Type: Chapter
Times cited : (31)

References (131)
  • 1
    • 0025321039
    • Functional architecture of basal ganglia circuits: neural substrates of parallel processing
    • Alexander G.E., Crutcher M.D. Functional architecture of basal ganglia circuits: neural substrates of parallel processing. Trends Neurosci. 1990, 13:266-271.
    • (1990) Trends Neurosci. , vol.13 , pp. 266-271
    • Alexander, G.E.1    Crutcher, M.D.2
  • 3
    • 84857207526
    • Mechanisms of hierarchical reinforcement learning in cortico-striatal circuits 2: evidence from fMRI
    • Badre D., Frank M.J. Mechanisms of hierarchical reinforcement learning in cortico-striatal circuits 2: evidence from fMRI. Cereb. Cortex 2012, 22:527-536.
    • (2012) Cereb. Cortex , vol.22 , pp. 527-536
    • Badre, D.1    Frank, M.J.2
  • 4
    • 77952341346
    • Frontal cortex and the discovery of abstract action rules
    • Badre D., Kayser A.S., D'Esposito M. Frontal cortex and the discovery of abstract action rules. Neuron 2010, 66:315-326.
    • (2010) Neuron , vol.66 , pp. 315-326
    • Badre, D.1    Kayser, A.S.2    D'Esposito, M.3
  • 5
    • 0000541213
    • Adaptive critics and the basal ganglia
    • MIT Press, Cambridge, MA, J.C. Houk, J.L. Davis, D.G. Beiser (Eds.)
    • Barto A.G. Adaptive critics and the basal ganglia. Models of Information Processing in the Basal Ganglia 1995, 215-232. MIT Press, Cambridge, MA. J.C. Houk, J.L. Davis, D.G. Beiser (Eds.).
    • (1995) Models of Information Processing in the Basal Ganglia , pp. 215-232
    • Barto, A.G.1
  • 6
    • 0141988716
    • Recent advances in hierarchical reinforcement learning
    • Barto A.G., Mahadevan S. Recent advances in hierarchical reinforcement learning. Discrete Event Dyn. Syst. 2003, 13:341-379.
    • (2003) Discrete Event Dyn. Syst. , vol.13 , pp. 341-379
    • Barto, A.G.1    Mahadevan, S.2
  • 7
    • 0019519039
    • Associative search network - a reinforcement learning associative memory
    • Barto A.G., Sutton R.S., Brouwer P.S. Associative search network - a reinforcement learning associative memory. Biol. Cybern. 1981, 40(3):201-211.
    • (1981) Biol. Cybern. , vol.40 , Issue.3 , pp. 201-211
    • Barto, A.G.1    Sutton, R.S.2    Brouwer, P.S.3
  • 8
    • 34548778113
    • Statistics of midbrain dopamine neuron spike trains in the awake primate
    • Bayer H.M., Lau B., Glimcher P.W. Statistics of midbrain dopamine neuron spike trains in the awake primate. J. Neurophysiol. 2007, 98:1428-1439.
    • (2007) J. Neurophysiol. , vol.98 , pp. 1428-1439
    • Bayer, H.M.1    Lau, B.2    Glimcher, P.W.3
  • 9
    • 34548295327
    • Learning the value of information in an uncertain world
    • Behrens T., Woolrich M., Walton M., Rushworth M. Learning the value of information in an uncertain world. Nat. Neurosci. 2007, 10:1214-1221.
    • (2007) Nat. Neurosci. , vol.10 , pp. 1214-1221
    • Behrens, T.1    Woolrich, M.2    Walton, M.3    Rushworth, M.4
  • 10
    • 85012688561
    • Princeton University Press, Princeton
    • Bellman R. Dynamic Programming 1957, Princeton University Press, Princeton.
    • (1957) Dynamic Programming
    • Bellman, R.1
  • 11
    • 33847634405
    • The debate over dopamine's role in reward: the case for incentive salience
    • Berridge K.C. The debate over dopamine's role in reward: the case for incentive salience. Psychopharmacology 2007, 191:391-431.
    • (2007) Psychopharmacology , vol.191 , pp. 391-431
    • Berridge, K.C.1
  • 13
    • 34248999741
    • Short-term memory traces for action bias in human reinforcement learning
    • Bogacz R., McClure S.M., Li J., Cohen J.D., Montague P.R. Short-term memory traces for action bias in human reinforcement learning. Brain Res. 2007, 1153:111-121.
    • (2007) Brain Res. , vol.1153 , pp. 111-121
    • Bogacz, R.1    McClure, S.M.2    Li, J.3    Cohen, J.D.4    Montague, P.R.5
  • 14
    • 79960203266
    • Multiplicity of control in the basal ganglia: computational roles of striatal subregions
    • Bornstein A.M., Daw N.D. Multiplicity of control in the basal ganglia: computational roles of striatal subregions. Curr. Opin. Neurobiol. 2011, 21:374-380.
    • (2011) Curr. Opin. Neurobiol. , vol.21 , pp. 374-380
    • Bornstein, A.M.1    Daw, N.D.2
  • 15
    • 70350566799
    • Hierarchically organized behavior and its neural foundations: a reinforcement learning perspective
    • Botvinick M.M., Niv Y., Barto A.G. Hierarchically organized behavior and its neural foundations: a reinforcement learning perspective. Cognition 2009, 113:262-280.
    • (2009) Cognition , vol.113 , pp. 262-280
    • Botvinick, M.M.1    Niv, Y.2    Barto, A.G.3
  • 16
    • 0021329392
    • Supplementary motor area of the monkey's cerebral cortex: short- and long-term deficits after unilateral ablation and the effects of subsequent callosal section
    • Brinkman C. Supplementary motor area of the monkey's cerebral cortex: short- and long-term deficits after unilateral ablation and the effects of subsequent callosal section. J. Neurosci. 1984, 4:918-929.
    • (1984) J. Neurosci. , vol.4 , pp. 918-929
    • Brinkman, C.1
  • 18
    • 42149177173
    • Dopamine, reward prediction error, and economics
    • Caplin A., Dean M. Dopamine, reward prediction error, and economics. Q. J. Econ. 2008, 123:663-701.
    • (2008) Q. J. Econ. , vol.123 , pp. 663-701
    • Caplin, A.1    Dean, M.2
  • 19
    • 84899032145
    • All learning is local: Multi-agent learning in global reward games
    • Chang Y.H., Ho T., Kaelbling L.P. All learning is local: Multi-agent learning in global reward games. Adv. Neural Inf. Process. Syst. 2003, 16:807-814.
    • (2003) Adv. Neural Inf. Process. Syst. , vol.16 , pp. 807-814
    • Chang, Y.H.1    Ho, T.2    Kaelbling, L.P.3
  • 20
    • 0016948894
    • Optimal foraging, the marginal value theorem
    • Charnov E.L. Optimal foraging, the marginal value theorem. Theor. Popul. Biol. 1976, 9:129-136.
    • (1976) Theor. Popul. Biol. , vol.9 , pp. 129-136
    • Charnov, E.L.1
  • 21
    • 84856431209
    • Neuron-type-specific signals for reward and punishment in the ventral tegmental area
    • Cohen J.Y., Haesler S., Vong L., Lowell B.B., Uchida N. Neuron-type-specific signals for reward and punishment in the ventral tegmental area. Nature 2012, 482:85-88.
    • (2012) Nature , vol.482 , pp. 85-88
    • Cohen, J.Y.1    Haesler, S.2    Vong, L.3    Lowell, B.B.4    Uchida, N.5
  • 23
    • 33746365099
    • Bayesian theories of conditioning in a changing world
    • Courville A.C., Daw N.D., Touretzky D.S. Bayesian theories of conditioning in a changing world. Trends Cogn. Sci. 2006, 10:294-300.
    • (2006) Trends Cogn. Sci. , vol.10 , pp. 294-300
    • Courville, A.C.1    Daw, N.D.2    Touretzky, D.S.3
  • 24
    • 0014289343
    • Anticipatory responding and avoidance discrimination as factors in avoidance conditioning
    • D'amato M., Fazzaro J., Etkin M. Anticipatory responding and avoidance discrimination as factors in avoidance conditioning. J. Exp. Psychol. 1968, 77:41.
    • (1968) J. Exp. Psychol. , vol.77 , pp. 41
    • D'amato, M.1    Fazzaro, J.2    Etkin, M.3
  • 25
    • 33745787929
    • Representation and timing in theories of the dopamine system
    • Daw N.D., Courville A.C., Touretzky D.S. Representation and timing in theories of the dopamine system. Neural Comput. 2006, 18:1637-1677.
    • (2006) Neural Comput. , vol.18 , pp. 1637-1677
    • Daw, N.D.1    Courville, A.C.2    Touretzky, D.S.3
  • 26
    • 0036592008
    • Opponent interactions between serotonin and dopamine
    • Daw N.D., Kakade S., Dayan P. Opponent interactions between serotonin and dopamine. Neural Netw. 2002, 15:603-616.
    • (2002) Neural Netw. , vol.15 , pp. 603-616
    • Daw, N.D.1    Kakade, S.2    Dayan, P.3
  • 27
    • 28044450875
    • Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control
    • Daw N.D., Niv Y., Dayan P. Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nat. Neurosci. 2005, 8:1704-1711.
    • (2005) Nat. Neurosci. , vol.8 , pp. 1704-1711
    • Daw, N.D.1    Niv, Y.2    Dayan, P.3
  • 28
    • 60749114870
    • Decision theory, reinforcement learning, and the brain
    • Dayan P., Daw N.D. Decision theory, reinforcement learning, and the brain. Cogn. Affect. Behav. Neurosci. 2008, 8:429-453.
    • (2008) Cogn. Affect. Behav. Neurosci. , vol.8 , pp. 429-453
    • Dayan, P.1    Daw, N.D.2
  • 31
    • 0031619316
    • Bayesian Q-learning. In: John Wiley & Sons Ltd
    • Dearden R., Friedman N., Russell S., 1998. Bayesian Q-learning. In: John Wiley & Sons Ltd, pp. 761-768.
    • (1998) , pp. 761-768
    • Dearden, R.1    Friedman, N.2    Russell, S.3
  • 32
    • 84859341150
    • Habits, action sequences and reinforcement learning
    • Dezfouli A., Balleine B.W. Habits, action sequences and reinforcement learning. Eur. J. Neurosci. 2012, 35:1036-1051.
    • (2012) Eur. J. Neurosci. , vol.35 , pp. 1036-1051
    • Dezfouli, A.1    Balleine, B.W.2
  • 34
    • 0041859307
    • Afferent modulation of dopamine neuron firing differentially regulates tonic and phasic dopamine transmission
    • Floresco S.B., West A.R., Ash B., Moore H., Grace A.A. Afferent modulation of dopamine neuron firing differentially regulates tonic and phasic dopamine transmission. Nat. Neurosci. 2003, 6:968-973.
    • (2003) Nat. Neurosci. , vol.6 , pp. 968-973
    • Floresco, S.B.1    West, A.R.2    Ash, B.3    Moore, H.4    Grace, A.A.5
  • 35
    • 84857211334
    • Mechanisms of hierarchical reinforcement learning in corticostriatal circuits 1: computational analysis
    • Frank M.J., Badre D. Mechanisms of hierarchical reinforcement learning in corticostriatal circuits 1: computational analysis. Cereb. Cortex 2012, 22:509-526.
    • (2012) Cereb. Cortex , vol.22 , pp. 509-526
    • Frank, M.J.1    Badre, D.2
  • 36
    • 10344250993
    • By carrot or by stick: cognitive reinforcement learning in Parkinsonism
    • Frank M.J., Seeberger L.C., O'Reilly R.C. By carrot or by stick: cognitive reinforcement learning in Parkinsonism. Science 2004, 306:1940-1943.
    • (2004) Science , vol.306 , pp. 1940-1943
    • Frank, M.J.1    Seeberger, L.C.2    O'Reilly, R.C.3
  • 37
    • 0016341973
    • Parametric analysis of brain stimulation reward in the rat: I. The transient process and the memory-containing process
    • Gallistel C., Stellar J.R., Bubis E. Parametric analysis of brain stimulation reward in the rat: I. The transient process and the memory-containing process. J. Comp. Physiol. Psychol. 1974, 87:848.
    • (1974) J. Comp. Physiol. Psychol. , vol.87 , pp. 848
    • Gallistel, C.1    Stellar, J.R.2    Bubis, E.3
  • 38
    • 0031025660
    • Real-time measurement of electrically evoked extracellular dopamine in the striatum of freely moving rats
    • Garris P.A., Christensen J.R.C., Rebec G.V., Wightman R.M. Real-time measurement of electrically evoked extracellular dopamine in the striatum of freely moving rats. J. Neurochem. 1997, 68:152-161.
    • (1997) J. Neurochem. , vol.68 , pp. 152-161
    • Garris, P.A.1    Christensen, J.R.C.2    Rebec, G.V.3    Wightman, R.M.4
  • 39
    • 70350521769
    • Human reinforcement learning subdivides structured action spaces by learning effector-specific values
    • Gershman S., Pesaran B., Daw N. Human reinforcement learning subdivides structured action spaces by learning effector-specific values. J. Neurosci. 2009, 29:13524-13531.
    • (2009) J. Neurosci. , vol.29 , pp. 13524-13531
    • Gershman, S.1    Pesaran, B.2    Daw, N.3
  • 41
    • 84900513897
    • Learning to selectively attend. Proceedings of the 32nd Annual Conference of the Cognitive Science Society
    • Gershman S.J., Cohen J.D., Niv, Y., 2010b. Learning to selectively attend. Proceedings of the 32nd Annual Conference of the Cognitive Science Society, pp. 1270-1275.
    • (2010) , pp. 1270-1275
    • Gershman, S.J.1    Cohen, J.D.2    Niv, Y.3
  • 42
    • 77952541839
    • Learning latent structure: carving nature at its joints
    • Gershman S.J., Niv Y. Learning latent structure: carving nature at its joints. Curr. Opin. Neurobiol. 2010, 20:251.
    • (2010) Curr. Opin. Neurobiol. , vol.20 , pp. 251
    • Gershman, S.J.1    Niv, Y.2
  • 43
    • 0037057757
    • Banburismus and the brain: decoding the relationship between sensory stimuli, decisions, and reward
    • Gold J., Shadlen M. Banburismus and the brain: decoding the relationship between sensory stimuli, decisions, and reward. Neuron 2002, 36:299-308.
    • (2002) Neuron , vol.36 , pp. 299-308
    • Gold, J.1    Shadlen, M.2
  • 44
  • 45
    • 0034654526
    • Striatonigrostriatal pathways in primates form an ascending spiral from the shell to the dorsolateral striatum
    • Haber S.N., Fudge J.L., McFarland N.R. Striatonigrostriatal pathways in primates form an ascending spiral from the shell to the dorsolateral striatum. J. Neurosci. 2000, 20:2369-2382.
    • (2000) J. Neurosci. , vol.20 , pp. 2369-2382
    • Haber, S.N.1    Fudge, J.L.2    McFarland, N.R.3
  • 46
    • 84865218296
    • Evidence for hyperbolic temporal discounting of reward in control of movements
    • Haith A.M., Reppert T.R., Shadmehr R. Evidence for hyperbolic temporal discounting of reward in control of movements. J. Neurosci. 2012, 32:11727-11736.
    • (2012) J. Neurosci. , vol.32 , pp. 11727-11736
    • Haith, A.M.1    Reppert, T.R.2    Shadmehr, R.3
  • 48
    • 0030695932
    • Brain mechanisms for changes in processing of conditioned stimuli in Pavlovian conditioning: Implications for behavior theory
    • Holland P.C. Brain mechanisms for changes in processing of conditioned stimuli in Pavlovian conditioning: Implications for behavior theory. Learn. Behav. 1997, 25:373-399.
    • (1997) Learn. Behav. , vol.25 , pp. 373-399
    • Holland, P.C.1
  • 49
    • 0034061668
    • Mesolimbocortical and nigrostriatal dopamine responses to salient non-reward events
    • Horvitz J. Mesolimbocortical and nigrostriatal dopamine responses to salient non-reward events. Neuroscience 2000, 96:651-656.
    • (2000) Neuroscience , vol.96 , pp. 651-656
    • Horvitz, J.1
  • 51
    • 34948906745
    • Solving the distal reward problem through linkage of STDP and dopamine signaling
    • Izhikevich E.M. Solving the distal reward problem through linkage of STDP and dopamine signaling. Cereb. Cortex 2007, 17(10):2443-2452.
    • (2007) Cereb. Cortex , vol.17 , Issue.10 , pp. 2443-2452
    • Izhikevich, E.M.1
  • 53
    • 58149236469
    • Midbrain dopaminergic neurons and striatal cholinergic interneurons encode the difference between reward and aversive events at different epochs of probabilistic classical conditioning trials
    • Joshua M., Adler A., Mitelman R., Vaadia E., Bergman H. Midbrain dopaminergic neurons and striatal cholinergic interneurons encode the difference between reward and aversive events at different epochs of probabilistic classical conditioning trials. J. Neurosci. 2008, 28:11673-11684.
    • (2008) J. Neurosci. , vol.28 , pp. 11673-11684
    • Joshua, M.1    Adler, A.2    Mitelman, R.3    Vaadia, E.4    Bergman, H.5
  • 54
    • 0032073263
    • Planning and acting in partially observable stochastic domains
    • Kaelbling L.P., Littman M.L., Cassandra A.R. Planning and acting in partially observable stochastic domains. Artif. Intell. 1998, 101:99-134.
    • (1998) Artif. Intell. , vol.101 , pp. 99-134
    • Kaelbling, L.P.1    Littman, M.L.2    Cassandra, A.R.3
  • 55
    • 85047672086
    • Acquisition and extinction in autoshaping
    • Kakade S., Dayan P. Acquisition and extinction in autoshaping. Psychol. Rev. 2002, 109:533.
    • (2002) Psychol. Rev. , vol.109 , pp. 533
    • Kakade, S.1    Dayan, P.2
  • 56
    • 85024429815
    • A new approach to linear filtering and prediction problems
    • Kalman R.E. A new approach to linear filtering and prediction problems. J. Basic Eng. 1960, 82:35-45.
    • (1960) J. Basic Eng. , vol.82 , pp. 35-45
    • Kalman, R.E.1
  • 57
    • 79954735886
    • Models of trace decay, eligibility for reinforcement, and delay of reinforcement gradients, from exponential to hyperboloid
    • Killeen P.R. Models of trace decay, eligibility for reinforcement, and delay of reinforcement gradients, from exponential to hyperboloid. Behav. Process. 2011, 87(1):57-63.
    • (2011) Behav. Process. , vol.87 , Issue.1 , pp. 57-63
    • Killeen, P.R.1
  • 58
    • 8444239052
    • The Bayesian brain: the role of uncertainty in neural coding and computation
    • Knill D.C., Pouget A. The Bayesian brain: the role of uncertainty in neural coding and computation. Trends Neurosci. 2004, 27:712-719.
    • (2004) Trends Neurosci. , vol.27 , pp. 712-719
    • Knill, D.C.1    Pouget, A.2
  • 59
    • 84902425386
    • Integrative activity of the brain. Leopold Voss, Leipzig.
    • Konorski, J., 1967. Integrative activity of the brain. Leopold Voss, Leipzig.
    • (1967)
    • Konorski, J.1
  • 60
    • 0017719091
    • Clinical consequences of corticectomies involving the supplementary motor area in man
    • Laplane D., Talairach J., Meininger V., Bancaud J., Orgogozo J. Clinical consequences of corticectomies involving the supplementary motor area in man. J. Neurol. Sci. 1977, 34:301-314.
    • (1977) J. Neurol. Sci. , vol.34 , pp. 301-314
    • Laplane, D.1    Talairach, J.2    Meininger, V.3    Bancaud, J.4    Orgogozo, J.5
  • 61
    • 0032606945
    • Probabilistic framework for the adaptation and comparison of image codes
    • Lewicki M.S., Olshausen B.A. Probabilistic framework for the adaptation and comparison of image codes. JOSA A 1999, 16:1587-1601.
    • (1999) JOSA A , vol.16 , pp. 1587-1601
    • Lewicki, M.S.1    Olshausen, B.A.2
  • 62
    • 80053236449
    • Differential roles of human striatum and amygdala in associative learning
    • Li J., Schiller D., Schoenbaum G., Phelps E.A., Daw N.D. Differential roles of human striatum and amygdala in associative learning. Nat. Neurosci. 2011, 14:1250-1252.
    • (2011) Nat. Neurosci. , vol.14 , pp. 1250-1252
    • Li, J.1    Schiller, D.2    Schoenbaum, G.3    Phelps, E.A.4    Daw, N.D.5
  • 63
    • 0012327484
    • Using eligibility traces to find the best memoryless policy in partially observable Markov decision processes
    • Morgan Kaufmann, San Francisco, CA.
    • Loch J., Singh S., 1998. Using eligibility traces to find the best memoryless policy in partially observable Markov decision processes. In: Proceedings of the 15th International Conference on Machine Learning. Morgan Kaufmann, San Francisco, CA.
    • (1998) In: Proceedings of the 15th International Conference on Machine Learning
    • Loch, J.1    Singh, S.2
  • 64
    • 0002303119
    • The action of central nervous system stimulant drugs: a general theory concerning amphetamine effects
    • Lyon M., Robbins T. The action of central nervous system stimulant drugs: a general theory concerning amphetamine effects. Curr. Dev. Psychopharmacol. 1975, 2:79-163.
    • (1975) Curr. Dev. Psychopharmacol. , vol.2 , pp. 79-163
    • Lyon, M.1    Robbins, T.2
  • 65
    • 33750437292
    • Bayesian inference with probabilistic population codes
    • Ma W.J., Beck J.M., Latham P.E., Pouget A. Bayesian inference with probabilistic population codes. Nat. Neurosci. 2006, 9:1432-1438.
    • (2006) Nat. Neurosci. , vol.9 , pp. 1432-1438
    • Ma, W.J.1    Beck, J.M.2    Latham, P.E.3    Pouget, A.4
  • 66
    • 77949897253
    • Two-factor theory, the actor-critic model, and conditioned avoidance
    • Maia T. Two-factor theory, the actor-critic model, and conditioned avoidance. Learn Behav. 2010, 38:50-67.
    • (2010) Learn Behav. , vol.38 , pp. 50-67
    • Maia, T.1
  • 67
    • 33845305449
    • The ventral tegmental area revisited: is there an electrophysiological marker for dopaminergic neurons?
    • Margolis E.B., Lock H., Hjelmstad G.O., Fields H.L. The ventral tegmental area revisited: is there an electrophysiological marker for dopaminergic neurons? J. Physiol. 2006, 577:907-924.
    • (2006) J. Physiol. , vol.577 , pp. 907-924
    • Margolis, E.B.1    Lock, H.2    Hjelmstad, G.O.3    Fields, H.L.4
  • 68
    • 67349098495
    • Two types of dopamine neuron distinctly convey positive and negative motivational signals
    • Matsumoto M., Hikosaka O. Two types of dopamine neuron distinctly convey positive and negative motivational signals. Nature 2009, 459:837-841.
    • (2009) Nature , vol.459 , pp. 837-841
    • Matsumoto, M.1    Hikosaka, O.2
  • 69
    • 34447136107
    • Why don't we move faster? Parkinson's disease, movement vigor, and implicit motivation
    • Mazzoni P., Hristova A., Krakauer J.W. Why don't we move faster? Parkinson's disease, movement vigor, and implicit motivation. J. Neurosci. 2007, 27:7105-7116.
    • (2007) J. Neurosci. , vol.27 , pp. 7105-7116
    • Mazzoni, P.1    Hristova, A.2    Krakauer, J.W.3
  • 70
    • 33644772461
    • A cerebellar model for predictive motor control tested in a brain-based device
    • McKinstry J.L., Edelman G.M., Krichmar J.L. A cerebellar model for predictive motor control tested in a brain-based device. Proc. Natl. Acad. Sci. U.S.A 2006, 103(9):3387-3392.
    • (2006) Proc. Natl. Acad. Sci. U.S.A , vol.103 , Issue.9 , pp. 3387-3392
    • McKinstry, J.L.1    Edelman, G.M.2    Krichmar, J.L.3
  • 71
    • 0030026069
    • Preferential activation of midbrain dopamine neurons by appetitive rather than aversive stimuli
    • Mirenowicz J., Schultz W. Preferential activation of midbrain dopamine neurons by appetitive rather than aversive stimuli. Nature 1996, 379:449-451.
    • (1996) Nature , vol.379 , pp. 449-451
    • Mirenowicz, J.1    Schultz, W.2
  • 72
    • 0029981543
    • A framework for mesencephalic dopamine systems based on predictive Hebbian learning
    • Montague P.R., Dayan P., Sejnowski T.J. A framework for mesencephalic dopamine systems based on predictive Hebbian learning. J. Neurosci. 1996, 16:1936-1947.
    • (1996) J. Neurosci. , vol.16 , pp. 1936-1947
    • Montague, P.R.1    Dayan, P.2    Sejnowski, T.J.3
  • 73
  • 75
    • 0347164639
    • Two-factor learning theory: summary and comment
    • Mowrer O.H. Two-factor learning theory: summary and comment. Psychol. Rev. 1951, 58:350.
    • (1951) Psychol. Rev. , vol.58 , pp. 350
    • Mowrer, O.H.1
  • 76
    • 0023881335
    • A selective impairment of motion perception following lesions of the middle temporal visual area (MT)
    • Newsome W.T., Pare E.B. A selective impairment of motion perception following lesions of the middle temporal visual area (MT). J. Neurosci. 1988, 8:2201-2211.
    • (1988) J. Neurosci. , vol.8 , pp. 2201-2211
    • Newsome, W.T.1    Pare, E.B.2
  • 77
    • 33745774340
    • How fast to work: response vigor, motivation and tonic dopamine
    • Niv Y., Daw N., Dayan P. How fast to work: response vigor, motivation and tonic dopamine. Adv. Neural Inf. Process. Syst. 2006, 18:1019.
    • (2006) Adv. Neural Inf. Process. Syst. , vol.18 , pp. 1019
    • Niv, Y.1    Daw, N.2    Dayan, P.3
  • 78
    • 33847675011
    • Tonic dopamine: opportunity costs and the control of response vigor
    • Niv Y., Daw N.D., Joel D., Dayan P. Tonic dopamine: opportunity costs and the control of response vigor. Psychopharmacology (Berl) 2007, 191:507-520.
    • (2007) Psychopharmacology (Berl) , vol.191 , pp. 507-520
    • Niv, Y.1    Daw, N.D.2    Joel, D.3    Dayan, P.4
  • 79
    • 77956209239
    • Temporally extended dopamine responses to perceptually demanding reward-predictive stimuli
    • Nomoto K., Schultz W., Watanabe T., Sakagami M. Temporally extended dopamine responses to perceptually demanding reward-predictive stimuli. J. Neurosci. 2010, 30:10692-10702.
    • (2010) J. Neurosci. , vol.30 , pp. 10692-10702
    • Nomoto, K.1    Schultz, W.2    Watanabe, T.3    Sakagami, M.4
  • 80
    • 0029938380
    • Emergence of simple-cell receptive field properties by learning a sparse code for natural images
    • Olshausen B.A. Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature 1996, 381:607-609.
    • (1996) Nature , vol.381 , pp. 607-609
    • Olshausen, B.A.1
  • 81
    • 70350558451
    • Brain hemispheres selectively track the expected value of contralateral options
    • Palminteri S., Boraud T., Lafargue G., Dubois B., Pessiglione M. Brain hemispheres selectively track the expected value of contralateral options. J. Neurosci. 2009, 29:13465-13472.
    • (2009) J. Neurosci. , vol.29 , pp. 13465-13472
    • Palminteri, S.1    Boraud, T.2    Lafargue, G.3    Dubois, B.4    Pessiglione, M.5
  • 82
    • 21544455210
    • Dopamine cells respond to predicted events during classical conditioning: evidence for eligibility traces in the reward-learning network
    • Pan W.-X., Schmidt R., Wickens J.R., Hyland B.I. Dopamine cells respond to predicted events during classical conditioning: evidence for eligibility traces in the reward-learning network. J. Neurosci. 2005, 25(26):6235-6242.
    • (2005) J. Neurosci. , vol.25 , Issue.26 , pp. 6235-6242
    • Pan, W.-X.1    Schmidt, R.2    Wickens, J.R.3    Hyland, B.I.4
  • 83
    • 84898956770
    • Reinforcement learning with hierarchies of machines
    • Parr R., Russell S. Reinforcement learning with hierarchies of machines. Adv. Neural Inf. Process. Syst. 1998, 1043-1049.
    • (1998) Adv. Neural Inf. Process. Syst. , pp. 1043-1049
    • Parr, R.1    Russell, S.2
  • 84
    • 0019089514
    • A model for Pavlovian learning: variations in the effectiveness of conditioned but not of unconditioned stimuli
    • Pearce J.M., Hall G. A model for Pavlovian learning: variations in the effectiveness of conditioned but not of unconditioned stimuli. Psychol. Rev. 1980, 87:532.
    • (1980) Psychol. Rev. , vol.87 , pp. 532
    • Pearce, J.M.1    Hall, G.2
  • 86
    • 0033566079
    • Neural correlates of decision variables in parietal cortex
    • Platt M.L., Glimcher P.W. Neural correlates of decision variables in parietal cortex. Nature 1999, 400:233-238.
    • (1999) Nature , vol.400 , pp. 233-238
    • Platt, M.L.1    Glimcher, P.W.2
  • 87
    • 34447623582
    • Adding prediction risk to the theory of reward learning
    • Preuschoff K., Bossaerts P. Adding prediction risk to the theory of reward learning. Ann. N. Y. Acad. Sci. 2007, 1104:135-146.
    • (2007) Ann. N. Y. Acad. Sci. , vol.1104 , pp. 135-146
    • Preuschoff, K.1    Bossaerts, P.2
  • 89
    • 79960241771
    • Decision making under uncertainty: a neural model based on partially observable Markov decision processes
    • Rao R.P.N. Decision making under uncertainty: a neural model based on partially observable Markov decision processes. Front. Comput. Neurosci. 2010, 4:146.
    • (2010) Front. Comput. Neurosci. , vol.4 , pp. 146
    • Rao, R.P.N.1
  • 91
    • 0036592025
    • Dopamine-dependent plasticity of corticostriatal synapses
    • Reynolds J.N., Wickens J.R. Dopamine-dependent plasticity of corticostriatal synapses. Neural Netw. 2002, 15:507-521.
    • (2002) Neural Netw. , vol.15 , pp. 507-521
    • Reynolds, J.N.1    Wickens, J.R.2
  • 93
    • 33847662975
    • A role for mesencephalic dopamine in activation: commentary on Berridge (2006)
    • Robbins T.W., Everitt B.J. A role for mesencephalic dopamine in activation: commentary on Berridge (2006). Psychopharmacology (Berl) 2007, 191:433-437.
    • (2007) Psychopharmacology (Berl) , vol.191 , pp. 433-437
    • Robbins, T.W.1    Everitt, B.J.2
  • 94
    • 77249084637
    • Neural correlates of variations in event processing during learning in basolateral amygdala
    • Roesch M.R., Calu D.J., Esber G.R., Schoenbaum G. Neural correlates of variations in event processing during learning in basolateral amygdala. J. Neurosci. 2010, 30:2464-2471.
    • (2010) J. Neurosci. , vol.30 , pp. 2464-2471
    • Roesch, M.R.1    Calu, D.J.2    Esber, G.R.3    Schoenbaum, G.4
  • 95
    • 36448968271
    • Dopamine neurons encode the better option in rats deciding between differently delayed or sized rewards
    • Roesch M.R., Calu D.J., Schoenbaum G. Dopamine neurons encode the better option in rats deciding between differently delayed or sized rewards. Nat. Neurosci. 2007, 10:1615-1624.
    • (2007) Nat. Neurosci. , vol.10 , pp. 1615-1624
    • Roesch, M.R.1    Calu, D.J.2    Schoenbaum, G.3
  • 96
    • 0036850727
    • Response of neurons in the lateral intraparietal area during a combined visual discrimination reaction time task
    • Roitman J.D., Shadlen M.N. Response of neurons in the lateral intraparietal area during a combined visual discrimination reaction time task. J. Neurosci. 2002, 22:9475-9489.
    • (2002) J. Neurosci. , vol.22 , pp. 9475-9489
    • Roitman, J.D.1    Shadlen, M.N.2
  • 97
    • 80054091681
    • Credit assignment in multiple goal embodied visuomotor behavior
    • Rothkopf C.A., Ballard D.H. Credit assignment in multiple goal embodied visuomotor behavior. Front. Psychol. 2010, 1:173.
    • (2010) Front. Psychol. , vol.1 , pp. 173
    • Rothkopf, C.A.1    Ballard, D.H.2
  • 98
    • 0003636089
    • On-line Q-learning using connectionist systems
    • Cambridge University.
    • Rummery, G., Niranjan, M., 1994. On-line Q-learning using connectionist systems, Cambridge University.
    • (1994)
    • Rummery, G.1    Niranjan, M.2
  • 100
    • 77957728784
    • Testing the reward prediction error hypothesis with an axiomatic model
    • Rutledge R.B., Dean M., Caplin A., Glimcher P.W. Testing the reward prediction error hypothesis with an axiomatic model. J. Neurosci. 2010, 30:13525-13536.
    • (2010) J. Neurosci. , vol.30 , pp. 13525-13536
    • Rutledge, R.B.1    Dean, M.2    Caplin, A.3    Glimcher, P.W.4
  • 101
    • 72849112662
    • Dopaminergic drugs modulate learning rates and perseveration in Parkinson's patients in a dynamic foraging task
    • Rutledge R.B., Lazzaro S.C., Lau B., Myers C.E., Gluck M.A., Glimcher P.W. Dopaminergic drugs modulate learning rates and perseveration in Parkinson's patients in a dynamic foraging task. J. Neurosci. 2009, 29:15104-15114.
    • (2009) J. Neurosci. , vol.29 , pp. 15104-15114
    • Rutledge, R.B.1    Lazzaro, S.C.2    Lau, B.3    Myers, C.E.4    Gluck, M.A.5    Glimcher, P.W.6
  • 102
    • 33847619341
    • Effort-related functions of nucleus accumbens dopamine and associated forebrain circuits
    • Salamone J.D., Correa M., Farrar A., Mingote S.M. Effort-related functions of nucleus accumbens dopamine and associated forebrain circuits. Psychopharmacology 2007, 191:461-482.
    • (2007) Psychopharmacology , vol.191 , pp. 461-482
    • Salamone, J.D.1    Correa, M.2    Farrar, A.3    Mingote, S.M.4
  • 103
    • 0030896968
    • A neural substrate of prediction and reward
    • Schultz W., Dayan P., Montague P.R. A neural substrate of prediction and reward. Science 1997, 275:1593-1599.
    • (1997) Science , vol.275 , pp. 1593-1599
    • Schultz, W.1    Dayan, P.2    Montague, P.R.3
  • 104
    • 34447632392
    • Dynamic signals related to choices and outcomes in the dorsolateral prefrontal cortex
    • Seo H., Barraclough D.J., Lee D. Dynamic signals related to choices and outcomes in the dorsolateral prefrontal cortex. Cereb. Cortex 2007, 17(Suppl. 1):i110-117.
    • (2007) Cereb. Cortex , vol.17 , Issue.SUPPL. 1
    • Seo, H.1    Barraclough, D.J.2    Lee, D.3
  • 106
    • 0016045280
    • An opponent-process theory of motivation: I. Temporal dynamics of affect
    • Solomon R.L., Corbit J.D. An opponent-process theory of motivation: I. Temporal dynamics of affect. Psychol. Rev. 1974, 81:119.
    • (1974) Psychol. Rev. , vol.81 , pp. 119
    • Solomon, R.L.1    Corbit, J.D.2
  • 108
    • 33847202724
    • Learning to predict by the methods of temporal differences
    • Sutton R.S. Learning to predict by the methods of temporal differences. Mach. Learn. 1988, 3:9-44.
    • (1988) Mach. Learn. , vol.3 , pp. 9-44
    • Sutton, R.S.1
  • 112
    • 0033170372
    • Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning
    • Sutton R.S., Precup D., Singh S. Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artif. Intell. 1999, 112:181-211.
    • (1999) Artif. Intell. , vol.112 , pp. 181-211
    • Sutton, R.S.1    Precup, D.2    Singh, S.3
  • 113
    • 82255179147
    • Expectancy-related changes in firing of dopamine neurons depend on orbitofrontal cortex
    • Takahashi Y.K., Roesch M.R., Wilson R.C., Toreson K., O'Donnell P., Niv Y., et al. Expectancy-related changes in firing of dopamine neurons depend on orbitofrontal cortex. Nat. Neurosci. 2011, 14(12):1590-1597.
    • (2011) Nat. Neurosci. , vol.14 , Issue.12 , pp. 1590-1597
    • Takahashi, Y.K.1    Roesch, M.R.2    Wilson, R.C.3    Toreson, K.4    O'Donnell, P.5    Niv, Y.6
  • 114
    • 0023726188
    • Neuronal activity in cortical motor areas related to ipsilateral, contralateral, and bilateral digit movements of the monkey
    • Tanji J., Okano K., Sato K.C. Neuronal activity in cortical motor areas related to ipsilateral, contralateral, and bilateral digit movements of the monkey. J. Neurophysiol. 1988, 60:325-343.
    • (1988) J. Neurophysiol. , vol.60 , pp. 325-343
    • Tanji, J.1    Okano, K.2    Sato, K.C.3
  • 116
    • 77953675717
    • Differential dynamics of activity changes in dorsolateral and dorsomedial striatal loops during learning
    • Thorn C.A., Atallah H., Howe M., Graybiel A.M. Differential dynamics of activity changes in dorsolateral and dorsomedial striatal loops during learning. Neuron 2010, 66:781-795.
    • (2010) Neuron , vol.66 , pp. 781-795
    • Thorn, C.A.1    Atallah, H.2    Howe, M.3    Graybiel, A.M.4
  • 117
    • 14844349975
    • Adaptive coding of reward value by dopamine neurons
    • Tobler P.N., Fiorillo C.D., Schultz W. Adaptive coding of reward value by dopamine neurons. Science 2005, 307:1642-1645.
    • (2005) Science , vol.307 , pp. 1642-1645
    • Tobler, P.N.1    Fiorillo, C.D.2    Schultz, W.3
  • 119
    • 66249125042
    • Phasic firing in dopaminergic neurons is sufficient for behavioral conditioning
    • Tsai H.C., Zhang F., Adamantidis A., Stuber G.D., Bonci A., de Lecea L., et al. Phasic firing in dopaminergic neurons is sufficient for behavioral conditioning. Science 2009, 324:1080-1084.
    • (2009) Science , vol.324 , pp. 1080-1084
    • Tsai, H.C.1    Zhang, F.2    Adamantidis, A.3    Stuber, G.D.4    Bonci, A.5    de Lecea, L.6
  • 120
    • 84862766564
    • Are you or aren't you? Challenges associated with physiologically identifying dopamine neurons
    • Ungless M.A., Grace A.A. Are you or aren't you? Challenges associated with physiologically identifying dopamine neurons. Trends Neurosci. 2012, 35(7):422-430.
    • (2012) Trends Neurosci. , vol.35 , Issue.7 , pp. 422-430
    • Ungless, M.A.1    Grace, A.A.2
  • 121
    • 1642404961
    • Uniform inhibition of dopamine neurons in the ventral tegmental area by aversive stimuli
    • Ungless M.A., Magill P.J., Bolam J.P. Uniform inhibition of dopamine neurons in the ventral tegmental area by aversive stimuli. Science 2004, 303:2040-2042.
    • (2004) Science , vol.303 , pp. 2040-2042
    • Ungless, M.A.1    Magill, P.J.2    Bolam, J.P.3
  • 123
    • 0033680563
    • Coincidence detection in single dendritic spines mediated by calcium release
    • Wang S.S., Denk W., Hausser M. Coincidence detection in single dendritic spines mediated by calcium release. Nat. Neurosci. 2000, 3(12):1266-1273.
    • (2000) Nat. Neurosci. , vol.3 , Issue.12 , pp. 1266-1273
    • Wang, S.S.1    Denk, W.2    Hausser, M.3
  • 124
    • 84155183278
    • NMDA receptors in dopaminergic neurons are crucial for habit learning
    • Wang L.P., Li F., Wang D., Xie K., Shen X., Tsien J.Z. NMDA receptors in dopaminergic neurons are crucial for habit learning. Neuron 2011, 72:1055-1066.
    • (2011) Neuron , vol.72 , pp. 1055-1066
    • Wang, L.P.1    Li, F.2    Wang, D.3    Xie, K.4    Shen, X.5    Tsien, J.Z.6
  • 127
    • 84891901728
    • Inferring relevance in a changing world
    • Wilson R.C., Niv Y. Inferring relevance in a changing world. Front. Human Neurosci. 2011, 5:189.
    • (2011) Front. Human Neurosci. , vol.5 , pp. 189
    • Wilson, R.C.1    Niv, Y.2
  • 128
    • 70350125547
    • Neural computations underlying action-based decision making in the human brain
    • Wunderlich K., Rangel A., O'Doherty J.P. Neural computations underlying action-based decision making in the human brain. Proc. Natl. Acad. Sci. U.S.A. 2009, 106:17199-17204.
    • (2009) Proc. Natl. Acad. Sci. U.S.A. , vol.106 , pp. 17199-17204
    • Wunderlich, K.1    Rangel, A.2    O'Doherty, J.P.3
  • 129
    • 34347347169
    • Probabilistic reasoning by neurons
    • Yang T., Shadlen M.N. Probabilistic reasoning by neurons. Nature 2007, 447:1075-1080.
    • (2007) Nature , vol.447 , pp. 1075-1080
    • Yang, T.1    Shadlen, M.N.2
  • 130
    • 20444388016
    • Uncertainty, neuromodulation, and attention
    • Yu A.J., Dayan P. Uncertainty, neuromodulation, and attention. Neuron 2005, 46:681-692.
    • (2005) Neuron , vol.46 , pp. 681-692
    • Yu, A.J.1    Dayan, P.2
  • 131
    • 33746220445
    • Vision as Bayesian inference: analysis by synthesis?
    • Yuille A., Kersten D. Vision as Bayesian inference: analysis by synthesis? Trends Cogn. Sci. 2006, 10:301-308.
    • (2006) Trends Cogn. Sci. , vol.10 , pp. 301-308
    • Yuille, A.1    Kersten, D.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.