Volume 4, Issue 10, 2009, Pages

Temporal-difference reinforcement learning with distributed representations

Author keywords

[No Author keywords available]

Indexed keywords

DOPAMINE;

EID: 70449382577     PISSN: None     EISSN: 19326203     Source Type: Journal    
DOI: 10.1371/journal.pone.0007362     Document Type: Article
Times cited : (56)

References (113)
  • 1
    • 0029981543
    • A framework for mesencephalic dopamine systems based on predictive Hebbian learning
    • Montague PR, Dayan P, Sejnowski TJ (1996) A framework for mesencephalic dopamine systems based on predictive Hebbian learning. Journal of Neuroscience 16: 1936-1947.
    • (1996) Journal of Neuroscience , vol.16 , pp. 1936-1947
    • Montague, P.R.1    Dayan, P.2    Sejnowski, T.J.3
  • 2
    • 0030896968
    • A neural substrate of prediction and reward
    • Schultz W, Dayan P, Montague R (1997) A neural substrate of prediction and reward. Science 275: 1593-1599.
    • (1997) Science , vol.275 , pp. 1593-1599
    • Schultz, W.1    Dayan, P.2    Montague, R.3
  • 3
    • 0002337786
    • Metalearning, neuromodulation, and emotion
    • Hatano G, Okada N, Tanabe H, eds, Elsevier
    • Doya K (2000) Metalearning, neuromodulation, and emotion. In: Hatano G, Okada N, Tanabe H, eds. Affective Minds, Elsevier.
    • (2000) Affective Minds
    • Doya, K.1
  • 5
    • 33745787929
    • Representation and timing in theories of the dopamine system
    • Daw ND, Courville AC, Touretzky DS (2006) Representation and timing in theories of the dopamine system. Neural Computation 18: 1637-1677.
    • (2006) Neural Computation , vol.18 , pp. 1637-1677
    • Daw, N.D.1    Courville, A.C.2    Touretzky, D.S.3
  • 6
    • 34548837994
    • Reconciling reinforcement learning models with behavioral extinction and renewal: Implications for addiction, relapse, and problem gambling
    • Redish AD, Jensen S, Johnson A, Kurth-Nelson Z (2007) Reconciling reinforcement learning models with behavioral extinction and renewal: Implications for addiction, relapse, and problem gambling. Psychological Review 114: 784-805.
    • (2007) Psychological Review , vol.114 , pp. 784-805
    • Redish, A.D.1    Jensen, S.2    Johnson, A.3    Kurth-Nelson, Z.4
  • 7
    • 70449457869
    • Special issue on reinforcement learning
    • Sutton RS, ed (1992) Special issue on reinforcement learning, volume 8(3/4) of Machine Learning. Boston: Kluwer Academic Publishers.
  • 9
    • 0002109138
    • A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement
    • Black AH, Prokasy WF, eds. Classical Conditioning II: Current Research and Theory. New York: Appleton Century Crofts
    • Rescorla RA, Wagner AR (1972) A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement. In: Black AH, Prokasy WF, eds. Classical Conditioning II: Current Research and Theory. New York: Appleton Century Crofts. pp 64-99.
    • (1972) Classical Conditioning II , pp. 64-99
    • Rescorla, R.A.1    Wagner, A.R.2
  • 10
    • 0019537951
    • Toward a modern theory of adaptive networks: Expectation and prediction
    • Sutton RS, Barto AG (1981) Toward a modern theory of adaptive networks: Expectation and prediction. Psychological Review 88: 135-170.
    • (1981) Psychological Review , vol.88 , pp. 135-170
    • Sutton, R.S.1    Barto, A.G.2
  • 11
    • 0000541213
    • Adaptive critics and the basal ganglia
    • Houk JC, Davis JL, Beiser DG, eds. Models of Information Processing in the Basal Ganglia. Cambridge MA: MIT Press
    • Barto AG (1995) Adaptive critics and the basal ganglia. In: Houk JC, Davis JL, Beiser DG, eds. Models of Information Processing in the Basal Ganglia. Cambridge MA: MIT Press. pp 215-232.
    • (1995) Models of Information Processing in the Basal Ganglia , pp. 215-232
    • Barto, A.G.1
  • 13
    • 0037057755
    • Getting formal with dopamine and reward
    • Schultz W (2002) Getting formal with dopamine and reward. Neuron 36: 241-263.
    • (2002) Neuron , vol.36 , pp. 241-263
    • Schultz, W.1
  • 14
    • 10344225664
    • Addiction as a computational process gone awry
    • Redish AD (2004) Addiction as a computational process gone awry. Science 306: 1944-1947.
    • (2004) Science , vol.306 , pp. 1944-1947
    • Redish, A.D.1
  • 15
    • 0036592029
    • Dopamine: Generalization and bonuses
    • Kakade S, Dayan P (2002) Dopamine: generalization and bonuses. Neural Networks 15: 549-559.
    • (2002) Neural Networks , vol.15 , pp. 549-559
    • Kakade, S.1    Dayan, P.2
  • 16
    • 0037987978
    • Temporal difference models and reward-related learning in the human brain
    • O'Doherty JP, Dayan P, Friston K, Critchley H, Dolan RJ (2003) Temporal difference models and reward-related learning in the human brain. Neuron 38: 329-337.
    • (2003) Neuron , vol.38 , pp. 329-337
    • O'Doherty, J.P.1    Dayan, P.2    Friston, K.3    Critchley, H.4    Dolan, R.J.5
  • 17
    • 9644310472
    • Reward representations and reward-related learning in the human brain: Insights from neuroimaging
    • O'Doherty JP (2004) Reward representations and reward-related learning in the human brain: insights from neuroimaging. Current Opinion in Neurobiology 14: 769-776.
    • (2004) Current Opinion in Neurobiology , vol.14 , pp. 769-776
    • O'Doherty, J.P.1
  • 18
    • 21544435722
    • Midbrain dopamine neurons encode a quantitative reward prediction error signal
    • Bayer HM, Glimcher P (2005) Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron 47: 129-141.
    • (2005) Neuron , vol.47 , pp. 129-141
    • Bayer, H.M.1    Glimcher, P.2
  • 19
    • 21544455210
    • Dopamine Cells Respond to Predicted Events during Classical Conditioning: Evidence for Eligibility Traces in the Reward-Learning Network
    • Pan WX, Schmidt R, Wickens JR, Hyland BI (2005) Dopamine Cells Respond to Predicted Events during Classical Conditioning: Evidence for Eligibility Traces in the Reward-Learning Network. J Neurosci 25: 6235-6242.
    • (2005) J Neurosci , vol.25 , pp. 6235-6242
    • Pan, W.X.1    Schmidt, R.2    Wickens, J.R.3    Hyland, B.I.4
  • 20
    • 20444397095
    • Extinction of cocaine self-administration reveals functionally and temporally distinct dopaminergic signals in the nucleus accumbens
    • Stuber GD, Wightman RM, Carelli RM (2005) Extinction of cocaine self-administration reveals functionally and temporally distinct dopaminergic signals in the nucleus accumbens. Neuron 46: 661-669.
    • (2005) Neuron , vol.46 , pp. 661-669
    • Stuber, G.D.1    Wightman, R.M.2    Carelli, R.M.3
  • 21
    • 34547536392
    • Associative learning mediates dynamic shifts in dopamine signaling in the nucleus accumbens
    • Day JJ, Roitman MF, Wightman RM, Carelli RM (2007) Associative learning mediates dynamic shifts in dopamine signaling in the nucleus accumbens. Nature Neuroscience 10: 1020-1028.
    • (2007) Nature Neuroscience , vol.10 , pp. 1020-1028
    • Day, J.J.1    Roitman, M.F.2    Wightman, R.M.3    Carelli, R.M.4
  • 22
    • 34548778113
    • Statistics of midbrain dopamine neuron spike trains in the awake primate
    • Bayer HM, Lau B, Glimcher PW (2007) Statistics of midbrain dopamine neuron spike trains in the awake primate. J Neurophysiol 98: 1428-1439.
    • (2007) J Neurophysiol , vol.98 , pp. 1428-1439
    • Bayer, H.M.1    Lau, B.2    Glimcher, P.W.3
  • 24
    • 34547742206
    • Multiple model-based reinforcement learning explains dopamine neuronal activity
    • Bertin M, Schweighofer N, Doya K (2007) Multiple model-based reinforcement learning explains dopamine neuronal activity. Neural Networks 20: 668-675.
    • (2007) Neural Networks , vol.20 , pp. 668-675
    • Bertin, M.1    Schweighofer, N.2    Doya, K.3
  • 25
    • 57349130536
    • Stimulus representation and the timing of reward-prediction errors in models of the dopamine system
    • Ludvig EA, Sutton RS, Kehoe EJ (2008) Stimulus representation and the timing of reward-prediction errors in models of the dopamine system. Neural Computation 20: 3034-3054.
    • (2008) Neural Computation , vol.20 , pp. 3034-3054
    • Ludvig, E.A.1    Sutton, R.S.2    Kehoe, E.J.3
  • 27
    • 0022930826
    • Parallel organization of functionally segregated circuits linking basal ganglia and cortex
    • Alexander GE, DeLong MR, Strick PL (1986) Parallel organization of functionally segregated circuits linking basal ganglia and cortex. Annual Review of Neuroscience 9: 357-381.
    • (1986) Annual Review of Neuroscience , vol.9 , pp. 357-381
    • Alexander, G.E.1    DeLong, M.R.2    Strick, P.L.3
  • 29
    • 0034654526
    • Striatonigrostriatal pathways in primates form an ascending spiral from the shell to the dorsolateral striatum
    • Haber SN, Fudge JL, McFarland NR (2000) Striatonigrostriatal pathways in primates form an ascending spiral from the shell to the dorsolateral striatum. Journal of Neuroscience 20: 2369-2382.
    • (2000) Journal of Neuroscience , vol.20 , pp. 2369-2382
    • Haber, S.N.1    Fudge, J.L.2    McFarland, N.R.3
  • 30
    • 3343026029
    • Prediction of immediate and future rewards differentially recruits cortico-basal ganglia loops
    • Tanaka SC, Doya K, Okada G, Ueda K, Okamoto Y, et al. (2004) Prediction of immediate and future rewards differentially recruits cortico-basal ganglia loops. Nature Neuroscience 7: 887-893.
    • (2004) Nature Neuroscience , vol.7 , pp. 887-893
    • Tanaka, S.C.1    Doya, K.2    Okada, G.3    Ueda, K.4    Okamoto, Y.5
  • 32
    • 0026505520
    • Responses of monkey dopamine neurons during learning of behavioral reactions
    • Ljungberg T, Apicella P, Schultz W (1992) Responses of monkey dopamine neurons during learning of behavioral reactions. Journal of Neurophysiology 67: 145-163.
    • (1992) Journal of Neurophysiology , vol.67 , pp. 145-163
    • Ljungberg, T.1    Apicella, P.2    Schultz, W.3
  • 33
    • 33644688754
    • Dopamine neurons report an error in the temporal prediction of reward during learning
    • Hollerman JR, Schultz W (1998) Dopamine neurons report an error in the temporal prediction of reward during learning. Nature Neuroscience 1: 304-309.
    • (1998) Nature Neuroscience , vol.1 , pp. 304-309
    • Hollerman, J.R.1    Schultz, W.2
  • 34
    • 0031867046
    • Predictive reward signal of dopamine neurons
    • Schultz W (1998) Predictive reward signal of dopamine neurons. Journal of Neurophysiology 80: 1-27.
    • (1998) Journal of Neurophysiology , vol.80 , pp. 1-27
    • Schultz, W.1
  • 35
    • 1842684992
    • Neural coding of basic reward terms of animal learning theory, game theory, microeconomics and behavioural ecology
    • Schultz W (2004) Neural coding of basic reward terms of animal learning theory, game theory, microeconomics and behavioural ecology. Current Opinion in Neurobiology 14: 139-147.
    • (2004) Current Opinion in Neurobiology , vol.14 , pp. 139-147
    • Schultz, W.1
  • 37
    • 70449450852
    • Neural models of temporal discounting
    • Redish AD, Kurth-Nelson Z (2010) Neural models of temporal discounting. In: Madden G, Bickel W, eds. Impulsivity: The Behavioral and Neurological Science of Discounting, APA books. pp 123-158.
  • 39
    • 0030920737
    • Choice, delay, probability and conditioned reinforcement
    • Mazur J (1997) Choice, delay, probability and conditioned reinforcement. Animal Learning and Behavior 25: 131-147.
    • (1997) Animal Learning and Behavior , vol.25 , pp. 131-147
    • Mazur, J.1
  • 40
    • 0035229091
    • Hyperbolic value addition and general models of animal choice
    • Mazur JE (2001) Hyperbolic value addition and general models of animal choice. Psychological Review 108: 96-112.
    • (2001) Psychological Review , vol.108 , pp. 96-112
    • Mazur, J.E.1
  • 45
    • 0346706306
    • One hundred years of forgetting: A quantitative description of retention
    • Rubin DC, Wenzel AE (1996) One hundred years of forgetting: A quantitative description of retention. Psychological Review 103: 734-760.
    • (1996) Psychological Review , vol.103 , pp. 734-760
    • Rubin, D.C.1    Wenzel, A.E.2
  • 48
    • 0032812326
    • Discounting of delayed rewards in opioid-dependent outpatients: Exponential or hyperbolic discounting functions?
    • Madden GJ, Bickel WK, Jacobs EA (1999) Discounting of delayed rewards in opioid-dependent outpatients: Exponential or hyperbolic discounting functions? Experimental and Clinical Psychopharmacology 7: 284-293.
    • (1999) Experimental and Clinical Psychopharmacology , vol.7 , pp. 284-293
    • Madden, G.J.1    Bickel, W.K.2    Jacobs, E.A.3
  • 49
    • 0035638928
    • Is time-discounting hyperbolic or subadditive?
    • Read D (2001) Is time-discounting hyperbolic or subadditive? Journal of Risk and Uncertainty 23: 5-32.
    • (2001) Journal of Risk and Uncertainty , vol.23 , pp. 5-32
    • Read, D.1
  • 50
    • 0031912057
    • Polydrug abuse in heroin addicts: A behavioral economic analysis
    • Petry NM, Bickel WK (1998) Polydrug abuse in heroin addicts: a behavioral economic analysis. Addiction 93: 321-335.
    • (1998) Addiction , vol.93 , pp. 321-335
    • Petry, N.M.1    Bickel, W.K.2
  • 51
    • 0032743153
    • Measures of impulsivity in cigarette smokers and non-smokers
    • Mitchell SH (1999) Measures of impulsivity in cigarette smokers and non-smokers. Psychopharmacology 146: 455-464.
    • (1999) Psychopharmacology , vol.146 , pp. 455-464
    • Mitchell, S.H.1
  • 52
    • 0036672359
    • Discounting of delayed health gains and losses by current, never- and ex-smokers of cigarettes
    • Odum AL, Madden GJ, Bickel WK (2002) Discounting of delayed health gains and losses by current, never- and ex-smokers of cigarettes. Nicotine and Tobacco Research 4: 295-303.
    • (2002) Nicotine and Tobacco Research , vol.4 , pp. 295-303
    • Odum, A.L.1    Madden, G.J.2    Bickel, W.K.3
  • 53
    • 0142155119
    • Pathological gambling severity is associated with impulsivity in a delay discounting procedure
    • Alessi SM, Petry NM (2003) Pathological gambling severity is associated with impulsivity in a delay discounting procedure. Behavioural Processes 64: 345-354.
    • (2003) Behavioural Processes , vol.64 , pp. 345-354
    • Alessi, S.M.1    Petry, N.M.2
  • 54
    • 33751168257
    • A review of delay-discounting research with humans: Relations to drug use and gambling
    • Reynolds B (2006) A review of delay-discounting research with humans: relations to drug use and gambling. Behavioural Pharmacology 17: 651-667.
    • (2006) Behavioural Pharmacology , vol.17 , pp. 651-667
    • Reynolds, B.1
  • 55
    • 1942436827
    • Memory traces of trace memories: Neurogenesis, synaptogenesis and awareness
    • Shors TJ (2004) Memory traces of trace memories: neurogenesis, synaptogenesis and awareness. Trends in Neurosciences 27: 250-256.
    • (2004) Trends in Neurosciences , vol.27 , pp. 250-256
    • Shors, T.J.1
  • 59
    • 0023035964
    • Hippocampus and trace conditioning of the rabbit's classically conditioned nictitating membrane response
    • Solomon PR, Schaaf ERV, Thompson RF, Weisz DJ (1986) Hippocampus and trace conditioning of the rabbit's classically conditioned nictitating membrane response. Behavioral Neuroscience 100: 729-744.
    • (1986) Behavioral Neuroscience , vol.100 , pp. 729-744
    • Solomon, P.R.1    Schaaf, E.R.V.2    Thompson, R.F.3    Weisz, D.J.4
  • 61
    • 35148864530
    • Dorsal, ventral, and complete excitotoxic lesions of the hippocampus in rats failed to impair appetitive trace conditioning
    • Thibaudeau G, Potvin O, Allen K, Dore FY, Goulet S (2007) Dorsal, ventral, and complete excitotoxic lesions of the hippocampus in rats failed to impair appetitive trace conditioning. Behavioural Brain Research 185: 9-20.
    • (2007) Behavioural Brain Research , vol.185 , pp. 9-20
    • Thibaudeau, G.1    Potvin, O.2    Allen, K.3    Dore, F.Y.4    Goulet, S.5
  • 62
    • 20644435564
    • The formation of neural codes in the hippocampus: Trace conditioning as a prototypical paradigm for studying the random recoding hypothesis
    • Levy WB, Sanyal A, Rodriguez P, Sullivan DW, Wu XB (2005) The formation of neural codes in the hippocampus: trace conditioning as a prototypical paradigm for studying the random recoding hypothesis. Biol Cybern 92: 409-426.
    • (2005) Biol Cybern , vol.92 , pp. 409-426
    • Levy, W.B.1    Sanyal, A.2    Rodriguez, P.3    Sullivan, D.W.4    Wu, X.B.5
  • 63
    • 51149102880
    • Internally generated cell assembly sequences in the rat hippocampus
    • Pastalkova E, Itskov V, Amarasingham A, Buzsaki G (2008) Internally generated cell assembly sequences in the rat hippocampus. Science 321: 1322-1327.
    • (2008) Science , vol.321 , pp. 1322-1327
    • Pastalkova, E.1    Itskov, V.2    Amarasingham, A.3    Buzsaki, G.4
  • 64
    • 0021137772
    • Bridging temporal gaps between CS and US in autoshaping: A test of a local context hypothesis
    • Kaplan PS (1984) Bridging temporal gaps between CS and US in autoshaping: A test of a local context hypothesis. Animal Learning and Behavior 12: 142-148.
    • (1984) Animal Learning and Behavior , vol.12 , pp. 142-148
    • Kaplan, P.S.1
  • 65
    • 0037431291
    • Dopamine as chicken and egg
    • Self D (2003) Dopamine as chicken and egg. Nature 422: 573-574.
    • (2003) Nature , vol.422 , pp. 573-574
    • Self, D.1
  • 66
    • 0030026069
    • Preferential activation of midbrain dopamine neurons by appetitive rather than aversive stimuli
    • Mirenowicz J, Schultz W (1996) Preferential activation of midbrain dopamine neurons by appetitive rather than aversive stimuli. Nature 379: 449-451.
    • (1996) Nature , vol.379 , pp. 449-451
    • Mirenowicz, J.1    Schultz, W.2
  • 67
    • 0035315989
    • Temporal difference model reproduces anticipatory neural activity
    • Suri RE, Schultz W (2001) Temporal difference model reproduces anticipatory neural activity. Neural Computation 13: 841-862.
    • (2001) Neural Computation , vol.13 , pp. 841-862
    • Suri, R.E.1    Schultz, W.2
  • 68
    • 0037459319
    • Discrete coding of reward probability and uncertainty by dopamine neurons
    • Fiorillo CD, Tobler PN, Schultz W (2003) Discrete coding of reward probability and uncertainty by dopamine neurons. Science 299: 1898-1902.
    • (2003) Science , vol.299 , pp. 1898-1902
    • Fiorillo, C.D.1    Tobler, P.N.2    Schultz, W.3
  • 69
    • 0027964829
    • Importance of unpredictability for reward responses in primate dopamine neurons
    • Mirenowicz J, Schultz W (1994) Importance of unpredictability for reward responses in primate dopamine neurons. Journal of Neurophysiology 72: 1024-1027.
    • (1994) Journal of Neurophysiology , vol.72 , pp. 1024-1027
    • Mirenowicz, J.1    Schultz, W.2
  • 70
    • 13244267004
    • Temporal sequence learning, prediction, and control - a review of different models and their relation to biological mechanisms
    • Wörgötter F, Porr B (2005) Temporal sequence learning, prediction, and control - a review of different models and their relation to biological mechanisms. Neural Computation 17: 245-319.
    • (2005) Neural Computation , vol.17 , pp. 245-319
    • Wörgötter, F.1    Porr, B.2
  • 71
    • 0036592008
    • Opponent interactions between serotonin and dopamine
    • Daw ND, Kakade S, Dayan P (2002) Opponent interactions between serotonin and dopamine. Neural Networks 15: 603-616.
    • (2002) Neural Networks , vol.15 , pp. 603-616
    • Daw, N.D.1    Kakade, S.2    Dayan, P.3
  • 72
    • 7044239264
    • Behavior: A marketplace in the brain?
    • Ainslie G, Monterosso J (2004) Behavior: A marketplace in the brain? Science 306: 421-423.
    • (2004) Science , vol.306 , pp. 421-423
    • Ainslie, G.1    Monterosso, J.2
  • 73
    • 0032558817
    • On hyperbolic discounting and uncertain hazard rates
    • Sozou PD (1998) On hyperbolic discounting and uncertain hazard rates. Proceedings of the Royal Society of London B 265: 2015-2020.
    • (1998) Proceedings of the Royal Society of London B , vol.265 , pp. 2015-2020
    • Sozou, P.D.1
  • 74
    • 0031309579
    • Normative and descriptive models of decision making: Time discounting and risk sensitivity
    • Kacelnik A (1997) Normative and descriptive models of decision making: time discounting and risk sensitivity. In: Bock GR, Cardew G, eds. Characterizing Human Psychological Adaptations. Chichester UK: Wiley, volume 208 of Ciba Foundation Symposia. pp 51-66. Discussion 67-70.
  • 75
    • 70449389676
    • An economic perspective on addiction and matching
    • Laibson DI (1996) An economic perspective on addiction and matching. Behavioral and Brain Sciences 19: 583-584.
    • (1996) Behavioral and Brain Sciences , vol.19 , pp. 583-584
    • Laibson, D.I.1
  • 76
    • 5144224271
    • Separate neural systems value immediate and delayed monetary rewards
    • McClure SM, Laibson DI, Loewenstein G, Cohen JD (2004) Separate neural systems value immediate and delayed monetary rewards. Science 306: 503-507.
    • (2004) Science , vol.306 , pp. 503-507
    • McClure, S.M.1    Laibson, D.I.2    Loewenstein, G.3    Cohen, J.D.4
  • 80
    • 39149087042
    • Is a bird in the hand worth two in the future? The neuroeconomics of intertemporal decision-making
    • Kalenscher T, Pennartz CMA (2008) Is a bird in the hand worth two in the future? The neuroeconomics of intertemporal decision-making. Progress in Neurobiology 84: 284-315.
    • (2008) Progress in Neurobiology , vol.84 , pp. 284-315
    • Kalenscher, T.1    Pennartz, C.M.A.2
  • 83
    • 0742324926
    • Inter-module credit assignment in modular reinforcement learning
    • Samejima K, Doya K, Kawato M (2003) Inter-module credit assignment in modular reinforcement learning. Neural Networks 16: 985-994.
    • (2003) Neural Networks , vol.16 , pp. 985-994
    • Samejima, K.1    Doya, K.2    Kawato, M.3
  • 85
    • 0030513846
    • A sequence predicting CA3 is a flexible associator that learns and uses context to solve hippocampal-like tasks
    • Levy WB (1996) A sequence predicting CA3 is a flexible associator that learns and uses context to solve hippocampal-like tasks. Hippocampus 6: 579-591.
    • (1996) Hippocampus , vol.6 , pp. 579-591
    • Levy, W.B.1
  • 86
    • 0032519055
    • Probabilistic interpretation of population codes
    • Zemel RS, Dayan P, Pouget A (1998) Probabilistic interpretation of population codes. Neural Computation 10: 403-430.
    • (1998) Neural Computation , vol.10 , pp. 403-430
    • Zemel, R.S.1    Dayan, P.2    Pouget, A.3
  • 88
    • 0347625931
    • Detecting dynamical changes within a simulated neural ensemble using a measure of representational quality
    • Jackson JC, Redish AD (2003) Detecting dynamical changes within a simulated neural ensemble using a measure of representational quality. Network: Computation in Neural Systems 14: 629-645.
    • (2003) Network: Computation in Neural Systems , vol.14 , pp. 629-645
    • Jackson, J.C.1    Redish, A.D.2
  • 89
    • 14844299356
    • Reconstruction of the postsubiculum head direction signal from neural ensembles
    • Johnson A, Seeland KD, Redish AD (2005) Reconstruction of the postsubiculum head direction signal from neural ensembles. Hippocampus 15: 86-96.
    • (2005) Hippocampus , vol.15 , pp. 86-96
    • Johnson, A.1    Seeland, K.D.2    Redish, A.D.3
  • 91
    • 84899017487
    • Motivated reinforcement learning
    • Dietterich TG, Becker S, Ghahramani Z, eds. Advances in Neural Information Processing Systems 14. Cambridge, MA: MIT Press
    • Dayan P (2002) Motivated reinforcement learning. In: Dietterich TG, Becker S, Ghahramani Z, eds. Advances in Neural Information Processing Systems 14. Cambridge, MA: MIT Press.
    • (2002) Advances in Neural Information Processing Systems , vol.14
    • Dayan, P.1
  • 92
    • 0032930935
    • A neural network model with dopamine-like reinforcement signal that learns a spatial delayed response task
    • Suri RE, Schultz W (1999) A neural network model with dopamine-like reinforcement signal that learns a spatial delayed response task. Neuroscience 91: 871-890.
    • (1999) Neuroscience , vol.91 , pp. 871-890
    • Suri, R.E.1    Schultz, W.2
  • 93
    • 0003781238
    • New York: Cambridge University Press
    • Norris JR (1997) Markov Chains. New York: Cambridge University Press.
    • (1997) Markov Chains
    • Norris, J.R.1
  • 98
    • 0032643313
    • Solving semi-Markov decision problems using average reward reinforcement learning
    • Das T, Gosavi A, Mahadevan S, Marchalleck N (1999) Solving semi-Markov decision problems using average reward reinforcement learning. Management Science 45: 575-596.
    • (1999) Management Science , vol.45 , pp. 575-596
    • Das, T.1    Gosavi, A.2    Mahadevan, S.3    Marchalleck, N.4
  • 100
    • 33646467129
    • Evidence that the delay-period activity of dopamine neurons corresponds to reward uncertainty rather than backpropagating TD errors
    • Fiorillo CD, Tobler PN, Schultz W (2005) Evidence that the delay-period activity of dopamine neurons corresponds to reward uncertainty rather than backpropagating TD errors. Behavioral and Brain Functions 1: 7.
    • (2005) Behavioral and Brain Functions , vol.1 , pp. 7
    • Fiorillo, C.D.1    Tobler, P.N.2    Schultz, W.3
  • 101
    • 34147168649
    • Coordinated accumbal dopamine release and neural activity drive goal-directed behavior
    • Cheer JF, Aragona BJ, Heien MLAV, Seipel AT, Carelli RM, et al. (2007) Coordinated accumbal dopamine release and neural activity drive goal-directed behavior. Neuron 54: 237-244.
    • (2007) Neuron , vol.54 , pp. 237-244
    • Cheer, J.F.1    Aragona, B.J.2    Heien, M.L.A.V.3    Seipel, A.T.4    Carelli, R.M.5
  • 103
    • 48149101941
    • The temporal precision of reward prediction in dopamine neurons
    • Fiorillo CD, Newsome WT, Schultz W (2008) The temporal precision of reward prediction in dopamine neurons. Nature Neuroscience 11: 966-973.
    • (2008) Nature Neuroscience , vol.11 , pp. 966-973
    • Fiorillo, C.D.1    Newsome, W.T.2    Schultz, W.3
  • 104
    • 0033213819
    • What are the computations of the cerebellum, the basal ganglia, and the cerebral cortex?
    • Doya K (1999) What are the computations of the cerebellum, the basal ganglia, and the cerebral cortex? Neural Networks 12: 961-974.
    • (1999) Neural Networks , vol.12 , pp. 961-974
    • Doya, K.1
  • 105
    • 0034524427
    • Complementary roles of basal ganglia and cerebellum in learning and motor control
    • Doya K (2000) Complementary roles of basal ganglia and cerebellum in learning and motor control. Current Opinion in Neurobiology 10: 732-739.
    • (2000) Current Opinion in Neurobiology , vol.10 , pp. 732-739
    • Doya, K.1
  • 106
    • 28144449057
    • Representation of action-specific reward values in the striatum
    • Samejima K, Ueda Y, Doya K, Kimura M (2005) Representation of action-specific reward values in the striatum. Science 310: 1337-1340.
    • (2005) Science , vol.310 , pp. 1337-1340
    • Samejima, K.1    Ueda, Y.2    Doya, K.3    Kimura, M.4
  • 107
    • 34147191094
    • Efficient reinforcement learning: Computational theories, neuroscience and robotics
    • Kawato M, Samejima K (2007) Efficient reinforcement learning: computational theories, neuroscience and robotics. Current Opinion in Neurobiology 17: 205-212.
    • (2007) Current Opinion in Neurobiology , vol.17 , pp. 205-212
    • Kawato, M.1    Samejima, K.2
  • 108
    • 0025321039
    • Functional architecture of basal ganglia circuits: Neural substrates of parallel processing
    • Alexander GE, Crutcher MD (1990) Functional architecture of basal ganglia circuits: Neural substrates of parallel processing. Trends in Neurosciences 13: 266-271.
    • (1990) Trends in Neurosciences , vol.13 , pp. 266-271
    • Alexander, G.E.1    Crutcher, M.D.2
  • 111
    • 34447630083
    • Serotonin and the evaluation of future rewards: Theory, experiments, and possible neural mechanisms
    • Schweighofer N, Tanaka SC, Doya K (2007) Serotonin and the evaluation of future rewards: theory, experiments, and possible neural mechanisms. Annals of the New York Academy of Sciences 1104: 289-300.
  • 112
    • 70449457651
    • An fMRI study of the delay discounting of reward after tryptophan depletion and loading. 2: Reward-expectation
    • Society for Neuroscience Abstracts
    • Tanaka SC, Schweighofer N, Asahi S, Okamoto Y, Doya K (2004) An fMRI study of the delay discounting of reward after tryptophan depletion and loading. 2: reward-expectation. Society for Neuroscience Abstracts.
    • (2004) Society for Neuroscience Abstracts
    • Tanaka, S.C.1    Schweighofer, N.2    Asahi, S.3    Okamoto, Y.4    Doya, K.5


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.