



Volume 68, 2017, Pages 101-128

Reinforcement Learning and Episodic Memory in Humans and Animals: An Integrative Framework

Author keywords

Decision making; Memory; Reinforcement learning

Indexed keywords

ANIMAL; BRAIN; DECISION MAKING; EPISODIC MEMORY; HUMAN; LEARNING; PHYSIOLOGY; REINFORCEMENT; REWARD

EID: 85009518318     PISSN: 00664308     EISSN: 15452085     Source Type: Book Series    
DOI: 10.1146/annurev-psych-122414-033625     Document Type: Article
Times cited: 340

References (135)
  • 1
    • 84946268134
    • Variations in the sensitivity of instrumental responding to reinforcer devaluation
    • Adams CD. 1982. Variations in the sensitivity of instrumental responding to reinforcer devaluation. Q. J. Exp. Psychol. 34:77-98
    • (1982) Q. J. Exp. Psychol. , vol.34 , pp. 77-98
    • Adams, C.D.1
  • 2
    • 0025321039
    • Functional architecture of basal ganglia circuits: Neural substrates of parallel processing
    • Alexander GE, Crutcher MD. 1990. Functional architecture of basal ganglia circuits: neural substrates of parallel processing. Trends Neurosci. 13:266-71
    • (1990) Trends Neurosci , vol.13 , pp. 266-271
    • Alexander, G.E.1    Crutcher, M.D.2
  • 3
    • 0038458652
    • Small feedback-based decisions and their limited correspondence to description-based decisions
    • Barron G, Erev I. 2003. Small feedback-based decisions and their limited correspondence to description-based decisions. J. Behav. Decis. Making 16:215-33
    • (2003) J. Behav. Decis. Making , vol.16 , pp. 215-233
    • Barron, G.1    Erev, I.2
  • 4
    • 21544435722
    • Midbrain dopamine neurons encode a quantitative reward prediction error signal
    • Bayer HM, Glimcher PW. 2005. Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron 47:129-41
    • (2005) Neuron , vol.47 , pp. 129-141
    • Bayer, H.M.1    Glimcher, P.W.2
  • 5
    • 85012688561
    • Princeton, NJ: Princeton Univ. Press
    • Bellman R. 1957. Dynamic Programming. Princeton, NJ: Princeton Univ. Press
    • (1957) Dynamic Programming
    • Bellman, R.1
  • 8
    • 67349181341
    • Learning, risk attitude and hot stoves in restless bandit problems
    • Biele G, Erev I, Ert E. 2009. Learning, risk attitude and hot stoves in restless bandit problems. J. Math. Psychol. 53(3):155-67
    • (2009) J. Math. Psychol. , vol.53 , Issue.3 , pp. 155-167
    • Biele, G.1    Erev, I.2    Ert, E.3
  • 9
    • 84996573175
    • What's past is present: Reminders of past choices bias decisions for reward in humans
    • Bornstein AM, Khaw MW, Shohamy D, Daw ND. 2015. What's past is present: Reminders of past choices bias decisions for reward in humans. bioRxiv 033910. doi: 10.1101/033910
    • (2015) bioRxiv 033910
    • Bornstein, A.M.1    Khaw, M.W.2    Shohamy, D.3    Daw, N.D.4
  • 10
    • 70350566799
    • Hierarchically organized behavior and its neural foundations: A reinforcement learning perspective
    • Botvinick MM, Niv Y, Barto AC. 2009. Hierarchically organized behavior and its neural foundations: a reinforcement learning perspective. Cognition 113:262-80
    • (2009) Cognition , vol.113 , pp. 262-280
    • Botvinick, M.M.1    Niv, Y.2    Barto, A.C.3
  • 11
    • 52849104617
    • On the control of control: The role of dopamine in regulating prefrontal function and working memory
    • ed. S Monsell, J Driver, Cambridge, MA: MIT Press
    • Braver TS, Cohen JD. 2000. On the control of control: the role of dopamine in regulating prefrontal function and working memory. In Control of Cognitive Processes: Attention and Performance XVIII, ed. S Monsell, J Driver, pp. 713-37. Cambridge, MA: MIT Press
    • (2000) Control of Cognitive Processes: Attention and Performance XVIII , pp. 713-737
    • Braver, T.S.1    Cohen, J.D.2
  • 12
    • 0038862801
    • Sensory pre-conditioning
    • Brogden W. 1939. Sensory pre-conditioning. J. Exp. Psychol. 25:323-32
    • (1939) J. Exp. Psychol. , vol.25 , pp. 323-332
    • Brogden, W.1
  • 13
    • 84856940819
    • Cooperative interactions between hippocampal and striatal systems support flexible navigation
    • Brown TI, Ross RS, Tobyne SM, Stern CE. 2012. Cooperative interactions between hippocampal and striatal systems support flexible navigation. NeuroImage 60:1316-30
    • (2012) NeuroImage , vol.60 , pp. 1316-1330
    • Brown, T.I.1    Ross, R.S.2    Tobyne, S.M.3    Stern, C.E.4
  • 14
    • 79251550466
    • Hippocampal replay in the awake state: A potential substrate for memory consolidation and retrieval
    • Carr MF, Jadhav SP, Frank LM. 2011. Hippocampal replay in the awake state: a potential substrate for memory consolidation and retrieval. Nat. Neurosci. 14:147-53
    • (2011) Nat. Neurosci , vol.14 , pp. 147-153
    • Carr, M.F.1    Jadhav, S.P.2    Frank, L.M.3
  • 15
    • 0036529871
    • Computational perspectives on dopamine function in prefrontal cortex
    • Cohen JD, Braver TS, Brown JW. 2002. Computational perspectives on dopamine function in prefrontal cortex. Curr. Opin. Neurobiol. 12:223-29
    • (2002) Curr. Opin. Neurobiol , vol.12 , pp. 223-229
    • Cohen, J.D.1    Braver, T.S.2    Brown, J.W.3
  • 16
    • 84856431209
    • Neuron-type-specific signals for reward and punishment in the ventral tegmental area
    • Cohen JY, Haesler S, Vong L, Lowell BB, Uchida N. 2012. Neuron-type-specific signals for reward and punishment in the ventral tegmental area. Nature 482:85-88
    • (2012) Nature , vol.482 , pp. 85-88
    • Cohen, J.Y.1    Haesler, S.2    Vong, L.3    Lowell, B.B.4    Uchida, N.5
  • 17
    • 84912082331
    • Working memory contributions to reinforcement learning impairments in schizophrenia
    • Collins AG, Brown JK, Gold JM, Waltz JA, Frank MJ. 2014. Working memory contributions to reinforcement learning impairments in schizophrenia. J. Neurosci. 34:13747-56
    • (2014) J. Neurosci , vol.34 , pp. 13747-13756
    • Collins, A.G.1    Brown, J.K.2    Gold, J.M.3    Waltz, J.A.4    Frank, M.J.5
  • 18
    • 84859317336
    • How much of reinforcement learning is working memory, not reinforcement learning? A behavioral, computational, and neurogenetic analysis
    • Collins AG, Frank MJ. 2012. How much of reinforcement learning is working memory, not reinforcement learning? A behavioral, computational, and neurogenetic analysis. Eur. J. Neurosci. 35:1024-35
    • (2012) Eur. J. Neurosci , vol.35 , pp. 1024-1035
    • Collins, A.G.1    Frank, M.J.2
  • 19
    • 84946811477
    • Habitual control of goal selection in humans
    • Cushman F, Morris A. 2015. Habitual control of goal selection in humans. PNAS 112:13817-22
    • (2015) PNAS , vol.112 , pp. 13817-13822
    • Cushman, F.1    Morris, A.2
  • 20
    • 84897397355
    • Advanced reinforcement learning
    • Daw ND. 2013. Advanced reinforcement learning. See Glimcher & Fehr 2013, pp. 299-320
    • (2013) See Glimcher & Fehr 2013 , pp. 299-320
    • Daw, N.D.1
  • 21
    • 33745787929
    • Representation and timing in theories of the dopamine system
    • Daw ND, Courville AC, Touretzky DS. 2006. Representation and timing in theories of the dopamine system. Neural Comput. 18:1637-77
    • (2006) Neural Comput , vol.18 , pp. 1637-1677
    • Daw, N.D.1    Courville, A.C.2    Touretzky, D.S.3
  • 23
    • 79952746011
    • Model-based influences on humans' choices and striatal prediction errors
    • Daw ND, Gershman SJ, Seymour B, Dayan P, Dolan RJ. 2011. Model-based influences on humans' choices and striatal prediction errors. Neuron 69:1204-15
    • (2011) Neuron , vol.69 , pp. 1204-1215
    • Daw, N.D.1    Gershman, S.J.2    Seymour, B.3    Dayan, P.4    Dolan, R.J.5
  • 24
    • 28044450875
    • Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control
    • Daw ND, Niv Y, Dayan P. 2005. Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nat. Neurosci. 8:1704-11
    • (2005) Nat. Neurosci , vol.8 , pp. 1704-1711
    • Daw, N.D.1    Niv, Y.2    Dayan, P.3
  • 26
    • 84897401223
    • Value learning through reinforcement: The basics of dopamine and reinforcement learning
    • Daw ND, Tobler PN. 2013. Value learning through reinforcement: the basics of dopamine and reinforcement learning. See Glimcher & Fehr 2013, pp. 283-98
    • (2013) See Glimcher & Fehr 2013 , pp. 283-298
    • Daw, N.D.1    Tobler, P.N.2
  • 27
    • 0001158047
    • Improving generalization for temporal difference learning: The successor representation
    • Dayan P. 1993. Improving generalization for temporal difference learning: the successor representation. Neural Comput. 5:613-24
    • (1993) Neural Comput , vol.5 , pp. 613-624
    • Dayan, P.1
  • 28
    • 84892682926
    • Actions, action sequences and habits: Evidence that goal-directed and habitual action control are hierarchically organized
    • Dezfouli A, Balleine BW. 2013. Actions, action sequences and habits: evidence that goal-directed and habitual action control are hierarchically organized. PLOS Comput. Biol. 9:e1003364
    • (2013) PLOS Comput. Biol , vol.9 , pp. e1003364
    • Dezfouli, A.1    Balleine, B.W.2
  • 29
    • 0043250430
    • The role of learning in the operation of motivational systems
    • Learning, Motivation and Emotion, ed. CR Gallistel, New York: John Wiley & Sons. 3rd ed
    • Dickinson A, Balleine BW. 2002. The role of learning in the operation of motivational systems. In Stevens' Handbook of Experimental Psychology, Volume 3: Learning, Motivation and Emotion, ed. CR Gallistel, pp. 497-534. New York: John Wiley & Sons. 3rd ed
    • (2002) Stevens' Handbook of Experimental Psychology , vol.3 , pp. 497-534
    • Dickinson, A.1    Balleine, B.W.2
  • 30
    • 84875468581
    • Hierarchical learning induces two simultaneous, but separable, prediction errors in human basal ganglia
    • Diuk C, Tsai K, Wallis J, Botvinick M, Niv Y. 2013. Hierarchical learning induces two simultaneous, but separable, prediction errors in human basal ganglia. J. Neurosci. 33:5797-805
    • (2013) J. Neurosci , vol.33 , pp. 5797-5805
    • Diuk, C.1    Tsai, K.2    Wallis, J.3    Botvinick, M.4    Niv, Y.5
  • 31
    • 84885802926
    • Goals and habits in the brain
    • Dolan RJ, Dayan P. 2013. Goals and habits in the brain. Neuron 80:312-25
    • (2013) Neuron , vol.80 , pp. 312-325
    • Dolan, R.J.1    Dayan, P.2
  • 34
    • 31844451013
    • Reinforcement learning with Gaussian processes
    • 22nd, Bonn, Ger., New York: Assoc. Comput. Mach
    • Engel Y, Mannor S, Meir R. 2005. Reinforcement learning with Gaussian processes. Proc. Int. Conf. Mach. Learn., 22nd, Bonn, Ger., pp. 201-8. New York: Assoc. Comput. Mach
    • (2005) Proc. Int. Conf. Mach. Learn , pp. 201-208
    • Engel, Y.1    Mannor, S.2    Meir, R.3
  • 35
    • 56849099116
    • Loss aversion, diminishing sensitivity, and the effect of experience on repeated decisions
    • Erev I, Ert E, Yechiam E. 2008. Loss aversion, diminishing sensitivity, and the effect of experience on repeated decisions. J. Behav. Decis. Making 21:575-97
    • (2008) J. Behav. Decis. Making , vol.21 , pp. 575-597
    • Erev, I.1    Ert, E.2    Yechiam, E.3
  • 36
    • 27644454882
    • Neural systems of reinforcement for drug addiction: From actions to habits to compulsion
    • Everitt BJ, Robbins TW. 2005. Neural systems of reinforcement for drug addiction: from actions to habits to compulsion. Nat. Neurosci. 8:1481-89
    • (2005) Nat. Neurosci , vol.8 , pp. 1481-1489
    • Everitt, B.J.1    Robbins, T.W.2
  • 37
    • 79952059195
    • What constitutes an episode in episodic memory?
    • Ezzyat Y, Davachi L. 2011. What constitutes an episode in episodic memory? Psychol. Sci. 22(2):243-52
    • (2011) Psychol. Sci. , vol.22 , Issue.2 , pp. 243-252
    • Ezzyat, Y.1    Davachi, L.2
  • 38
    • 0033968832
    • A model of hippocampally dependent navigation, using the temporal difference learning rule
    • Foster DJ, Morris RGM, Dayan P. 2000. A model of hippocampally dependent navigation, using the temporal difference learning rule. Hippocampus 10:1-16
    • (2000) Hippocampus , vol.10 , pp. 1-16
    • Foster, D.J.1    Morris, R.G.M.2    Dayan, P.3
  • 39
    • 10344250993
    • By carrot or by stick: Cognitive reinforcement learning in parkinsonism
    • Frank MJ, Seeberger LC, O'Reilly RC. 2004. By carrot or by stick: cognitive reinforcement learning in parkinsonism. Science 306:1940-43
    • (2004) Science , vol.306 , pp. 1940-1943
    • Frank, M.J.1    Seeberger, L.C.2    O'Reilly, R.C.3
  • 40
    • 40949160181
    • Solving the credit assignment problem: Explicit and implicit learning of action sequences with probabilistic outcomes
    • Fu W-T, Anderson JR. 2008. Solving the credit assignment problem: explicit and implicit learning of action sequences with probabilistic outcomes. Psychol. Res. 72:321-30
    • (2008) Psychol. Res , vol.72 , pp. 321-330
    • Fu, W.-T.1    Anderson, J.R.2
  • 41
    • 0031602724
    • Cognitive neuroscience of human memory
    • Gabrieli JD. 1998. Cognitive neuroscience of human memory. Annu. Rev. Psychol. 49:87-115
    • (1998) Annu. Rev. Psychol. , vol.49 , pp. 87-115
    • Gabrieli, J.D.1
  • 42
    • 4444288656
    • Kernels and distances for structured data
    • Gärtner T, Lloyd JW, Flach PA. 2004. Kernels and distances for structured data. Mach. Learn. 57:205-32
    • (2004) Mach. Learn , vol.57 , pp. 205-232
    • Gärtner, T.1    Lloyd, J.W.2    Flach, P.A.3
  • 43
  • 45
    • 84893351295
    • Retrospective revaluation in sequential decision making: A tale of two systems
    • Gershman SJ, Markman AB, Otto AR. 2014. Retrospective revaluation in sequential decision making: a tale of two systems. J. Exp. Psychol. Gen. 143:182-94
    • (2014) J. Exp. Psychol. Gen , vol.143 , pp. 182-194
    • Gershman, S.J.1    Markman, A.B.2    Otto, A.R.3
  • 48
    • 84961875552
    • Characterizing a psychiatric symptom dimension related to deficits in goal-directed control
    • Gillan CM, Kosinski M, Whelan R, Phelps EA, Daw ND. 2016. Characterizing a psychiatric symptom dimension related to deficits in goal-directed control. eLife 5:e11305
    • (2016) eLife , vol.5 , pp. e11305
    • Gillan, C.M.1    Kosinski, M.2    Whelan, R.3    Phelps, E.A.4    Daw, N.D.5
  • 49
    • 77953260848
    • States versus rewards: Dissociable neural prediction error signals underlying model-based and model-free reinforcement learning
    • Gläscher J, Daw N, Dayan P, O'Doherty JP. 2010. States versus rewards: dissociable neural prediction error signals underlying model-based and model-free reinforcement learning. Neuron 66:585-95
    • (2010) Neuron , vol.66 , pp. 585-595
    • Gläscher, J.1    Daw, N.2    Dayan, P.3    O'Doherty, J.P.4
  • 51
    • 80755180871
    • Instance-based learning: Integrating sampling and repeated decisions from experience
    • Gonzalez C, Dutt V. 2011. Instance-based learning: integrating sampling and repeated decisions from experience. Psychol. Rev. 118:523-51
    • (2011) Psychol. Rev. , vol.118 , pp. 523-551
    • Gonzalez, C.1    Dutt, V.2
  • 52
    • 0037971564
    • Instance-based learning in dynamic decision making
    • Gonzalez C, Lerch JF, Lebiere C. 2003. Instance-based learning in dynamic decision making. Cogn. Sci. 27:591-635
    • (2003) Cogn. Sci. , vol.27 , pp. 591-635
    • Gonzalez, C.1    Lerch, J.F.2    Lebiere, C.3
  • 54
    • 80055084825
    • Grid cells, place cells, and geodesic generalization for spatial reinforcement learning
    • Gustafson NJ, Daw ND. 2011. Grid cells, place cells, and geodesic generalization for spatial reinforcement learning. PLOS Comput. Biol. 7:e1002235
    • (2011) PLOS Comput. Biol , vol.7 , pp. e1002235
    • Gustafson, N.J.1    Daw, N.D.2
  • 55
    • 45949091429
    • Dissociating the role of the orbitofrontal cortex and the striatum in the computation of goal values and prediction errors
    • Hare TA, O'Doherty J, Camerer CF, Schultz W, Rangel A. 2008. Dissociating the role of the orbitofrontal cortex and the striatum in the computation of goal values and prediction errors. J. Neurosci. 28:5623-30
    • (2008) J. Neurosci , vol.28 , pp. 5623-5630
    • Hare, T.A.1    O'Doherty, J.2    Camerer, C.F.3    Schultz, W.4    Rangel, A.5
  • 56
    • 84892388605
    • Phasic dopamine release in the rat nucleus accumbens symmetrically encodes a reward prediction error term
    • Hart AS, Rutledge RB, Glimcher PW, Phillips PE. 2014. Phasic dopamine release in the rat nucleus accumbens symmetrically encodes a reward prediction error term. J. Neurosci. 34:698-704
    • (2014) J. Neurosci , vol.34 , pp. 698-704
    • Hart, A.S.1    Rutledge, R.B.2    Glimcher, P.W.3    Phillips, P.E.4
  • 58
    • 56849118500
    • The description-experience gap in risky choice: The role of sample size and experienced probabilities
    • Hau R, Pleskac TJ, Kiefer J, Hertwig R. 2008. The description-experience gap in risky choice: the role of sample size and experienced probabilities. J. Behav. Decis. Making 21:493-518
    • (2008) J. Behav. Decis. Making , vol.21 , pp. 493-518
    • Hau, R.1    Pleskac, T.J.2    Kiefer, J.3    Hertwig, R.4
  • 59
    • 70449671239
    • The description-experience gap in risky choice
    • Hertwig R, Erev I. 2009. The description-experience gap in risky choice. Trends Cogn. Sci. 13:517-23
    • (2009) Trends Cogn. Sci. , vol.13 , pp. 517-523
    • Hertwig, R.1    Erev, I.2
  • 60
    • 0002861883
    • A model of how the basal ganglia generate and use neural signals that predict reinforcement
    • ed. JC Houk, JL Davis, DG Beiser, Cambridge, MA: MIT Press
    • Houk JC, Adams JL, Barto AG. 1995. A model of how the basal ganglia generate and use neural signals that predict reinforcement. In Models of Information Processing in the Basal Ganglia, ed. JC Houk, JL Davis, DG Beiser, pp. 249-70. Cambridge, MA: MIT Press
    • (1995) Models of Information Processing in the Basal Ganglia , pp. 249-270
    • Houk, J.C.1    Adams, J.L.2    Barto, A.G.3
  • 61
    • 84924325916
    • Interplay of approximate planning strategies
    • Huys QJ, Lally N, Faulkner P, Eshel N, Seifritz E, et al. 2015. Interplay of approximate planning strategies. PNAS 112:3098-103
    • (2015) PNAS , vol.112 , pp. 3098-3103
    • Huys, Q.J.1    Lally, N.2    Faulkner, P.3    Eshel, N.4    Seifritz, E.5
  • 63
    • 36048937548
    • Neural ensembles in CA3 transiently encode paths forward of the animal at a decision point
    • Johnson A, Redish AD. 2007. Neural ensembles in CA3 transiently encode paths forward of the animal at a decision point. J. Neurosci. 27:12176-89
    • (2007) J. Neurosci , vol.27 , pp. 12176-12189
    • Johnson, A.1    Redish, A.D.2
  • 64
    • 0032073263
    • Planning and acting in partially observable stochastic domains
    • Kaelbling LP, Littman ML, Cassandra AR. 1998. Planning and acting in partially observable stochastic domains. Artif. Intell. 101:99-134
    • (1998) Artif. Intell. , vol.101 , pp. 99-134
    • Kaelbling, L.P.1    Littman, M.L.2    Cassandra, A.R.3
  • 65
    • 0000050520
    • Norm theory: Comparing reality to its alternatives
    • Kahneman D, Miller DT. 1986. Norm theory: comparing reality to its alternatives. Psychol. Rev. 93:136-53
    • (1986) Psychol. Rev. , vol.93 , pp. 136-153
    • Kahneman, D.1    Miller, D.T.2
  • 66
    • 85009479014
    • Prospect theory: An analysis of decision under risk
    • Kahneman D, Tversky A. 1979. Prospect theory: an analysis of decision under risk. Econometrica 47:263-91
    • (1979) Econometrica , vol.47 , pp. 263-291
    • Kahneman, D.1    Tversky, A.2
  • 67
    • 26944457467
    • "Bias-variance" error bounds for temporal difference updates
    • 13th, Stanford, CA, New York: Assoc. Comput. Mach
    • Kearns MJ, Singh SP. 2000. "Bias-variance" error bounds for temporal difference updates. Proc. Annu. Conf. Comput. Learn. Theory, 13th, Stanford, CA, pp. 142-47. New York: Assoc. Comput. Mach
    • (2000) Proc. Annu. Conf. Comput. Learn. Theory , pp. 142-147
    • Kearns, M.J.1    Singh, S.P.2
  • 68
    • 79958143780
    • Speed/accuracy trade-off between the habitual and the goal-directed processes
    • Keramati M, Dezfouli A, Piray P. 2011. Speed/accuracy trade-off between the habitual and the goal-directed processes. PLOS Comput. Biol. 7:e1002055
    • (2011) PLOS Comput. Biol , vol.7 , pp. e1002055
    • Keramati, M.1    Dezfouli, A.2    Piray, P.3
  • 69
    • 0029815726
    • A neostriatal habit learning system in humans
    • Knowlton BJ, Mangels JA, Squire LR. 1996. A neostriatal habit learning system in humans. Science 273:1399-402
    • (1996) Science , vol.273 , pp. 1399-1402
    • Knowlton, B.J.1    Mangels, J.A.2    Squire, L.R.3
  • 70
    • 0026477904
    • ALCOVE: An exemplar-based connectionist model of category learning
    • Kruschke JK. 1992. ALCOVE: an exemplar-based connectionist model of category learning. Psychol. Rev. 99:22-44
    • (1992) Psychol. Rev. , vol.99 , pp. 22-44
    • Kruschke, J.K.1
  • 73
    • 33644782012
    • Dynamic response-by-response models of matching behavior in rhesus monkeys
    • Lau B, Glimcher PW. 2005. Dynamic response-by-response models of matching behavior in rhesus monkeys. J. Exp. Anal. Behav. 84:555-79
    • (2005) J. Exp. Anal. Behav. , vol.84 , pp. 555-579
    • Lau, B.1    Glimcher, P.W.2
  • 74
    • 84893508813
    • Neural computations underlying arbitration between model-based and model-free learning
    • Lee SW, Shimojo S, O'Doherty JP. 2014. Neural computations underlying arbitration between model-based and model-free learning. Neuron 81:687-99
    • (2014) Neuron , vol.81 , pp. 687-699
    • Lee, S.W.1    Shimojo, S.2    O'Doherty, J.P.3
  • 75
    • 85162020429
    • Hippocampal contributions to control: The third way
    • Lengyel M, Dayan P. 2007. Hippocampal contributions to control: the third way. Adv. Neural Inf. Process. Syst. 20:889-96
    • (2007) Adv. Neural Inf. Process. Syst. , vol.20 , pp. 889-896
    • Lengyel, M.1    Dayan, P.2
  • 76
    • 85015536997
    • The high availability of extreme events serves resource-rational decision-making
    • 36th, Quebec City, Can., Wheat Ridge, CO: Cogn. Sci. Soc
    • Lieder F, Hsu M, Griffiths TL. 2014. The high availability of extreme events serves resource-rational decision-making. Proc. Ann. Conf. Cogn. Sci. Soc., 36th, Quebec City, Can., pp. 2567-72. Wheat Ridge, CO: Cogn. Sci. Soc
    • (2014) Proc. Ann. Conf. Cogn. Sci. Soc , pp. 2567-2572
    • Lieder, F.1    Hsu, M.2    Griffiths, T.L.3
  • 77
    • 1942539715
    • SUSTAIN: A network model of category learning
    • Love BC, Medin DL, Gureckis TM. 2004. SUSTAIN: a network model of category learning. Psychol. Rev. 111:309-32
    • (2004) Psychol. Rev. , vol.111 , pp. 309-332
    • Love, B.C.1    Medin, D.L.2    Gureckis, T.M.3
  • 79
    • 57349130536
    • Stimulus representation and the timing of reward-prediction errors in models of the dopamine system
    • Ludvig EA, Sutton RS, Kehoe EJ. 2008. Stimulus representation and the timing of reward-prediction errors in models of the dopamine system. Neural Comput. 20:3034-54
    • (2008) Neural Comput , vol.20 , pp. 3034-3054
    • Ludvig, E.A.1    Sutton, R.S.2    Kehoe, E.J.3
  • 80
    • 0002674217
    • Memory and attentional factors in consumer choice: Concepts and research methods
    • Lynch JG Jr., Srull TK. 1982. Memory and attentional factors in consumer choice: concepts and research methods. J. Consum. Res. 9:18-37
    • (1982) J. Consum. Res , vol.9 , pp. 18-37
    • Lynch, J.G.1    Srull, T.K.2
  • 81
    • 84901298185
    • Remembering the best and worst of times: Memories for extreme outcomes bias risky decisions
    • Madan CR, Ludvig EA, Spetch ML. 2014. Remembering the best and worst of times: memories for extreme outcomes bias risky decisions. Psychon. Bull. Rev. 21:629-36
    • (2014) Psychon. Bull. Rev. , vol.21 , pp. 629-636
    • Madan, C.R.1    Ludvig, E.A.2    Spetch, M.L.3
  • 82
    • 35748957806
    • Proto-value functions: A Laplacian framework for learning representation and control in Markov decision processes
    • Mahadevan S, Maggioni M. 2007. Proto-value functions: a Laplacian framework for learning representation and control in Markov decision processes. J. Mach. Learn. Res. 8:2169-231
    • (2007) J. Mach. Learn. Res , vol.8 , pp. 2169-2231
    • Mahadevan, S.1    Maggioni, M.2
  • 83
    • 84924051598
    • Human-level control through deep reinforcement learning
    • Mnih V, Kavukcuoglu K, Silver D, Rusu AA, Veness J, et al. 2015. Human-level control through deep reinforcement learning. Nature 518:529-33
    • (2015) Nature , vol.518 , pp. 529-533
    • Mnih, V.1    Kavukcuoglu, K.2    Silver, D.3    Rusu, A.A.4    Veness, J.5
  • 84
    • 0029981543
    • A framework for mesencephalic dopamine systems based on predictive Hebbian learning
    • Montague PR, Dayan P, Sejnowski TJ. 1996. A framework for mesencephalic dopamine systems based on predictive Hebbian learning. J. Neurosci. 16:1936-47
    • (1996) J. Neurosci , vol.16 , pp. 1936-1947
    • Montague, P.R.1    Dayan, P.2    Sejnowski, T.J.3
  • 86
    • 0001464753
    • Recall and consumer consideration sets: Influencing choice without altering brand evaluations
    • Nedungadi P. 1990. Recall and consumer consideration sets: influencing choice without altering brand evaluations. J. Consum. Res. 17:263-76
    • (1990) J. Consum. Res , vol.17 , pp. 263-276
    • Nedungadi, P.1
  • 87
    • 67349283062
    • Reinforcement learning in the brain
    • Niv Y. 2009. Reinforcement learning in the brain. J. Math. Psychol. 53:139-54
    • (2009) J. Math. Psychol. , vol.53 , pp. 139-154
    • Niv, Y.1
  • 88
    • 84930260511
    • Reinforcement learning in multidimensional environments relies on attention mechanisms
    • Niv Y, Daniel R, Geana A, Gershman SJ, Leong YC, et al. 2015. Reinforcement learning in multidimensional environments relies on attention mechanisms. J. Neurosci. 35:8145-57
    • (2015) J. Neurosci , vol.35 , pp. 8145-8157
    • Niv, Y.1    Daniel, R.2    Geana, A.3    Gershman, S.J.4    Leong, Y.C.5
  • 89
    • 0022686961
    • Attention, similarity, and the identification-categorization relationship
    • Nosofsky RM. 1986. Attention, similarity, and the identification-categorization relationship. J. Exp. Psychol. Gen. 115:39-57
    • (1986) J. Exp. Psychol. Gen , vol.115 , pp. 39-57
    • Nosofsky, R.M.1
  • 91
    • 33644927837
    • Making working memory work: A computational model of learning in the prefrontal cortex and basal ganglia
    • O'Reilly RC, Frank MJ. 2006. Making working memory work: a computational model of learning in the prefrontal cortex and basal ganglia. Neural Comput. 18:283-328
    • (2006) Neural Comput , vol.18 , pp. 283-328
    • O'Reilly, R.C.1    Frank, M.J.2
  • 92
    • 0036832956
    • Kernel-based reinforcement learning
    • Ormoneit D, Seń S. 2002. Kernel-based reinforcement learning. Mach. Learn. 49:161-78
    • (2002) Mach. Learn , vol.49 , pp. 161-178
    • Ormoneit, D.1    Seń, S.2
  • 93
    • 84877341847
    • The curse of planning: Dissecting multiple reinforcement-learning systems by taxing the central executive
    • Otto AR, Gershman SJ, Markman AB, Daw ND. 2013a. The curse of planning: dissecting multiple reinforcement-learning systems by taxing the central executive. Psychol. Sci. 24:751-61
    • (2013) Psychol. Sci. , vol.24 , pp. 751-761
    • Otto, A.R.1    Gershman, S.J.2    Markman, A.B.3    Daw, N.D.4
  • 94
    • 84891354506
    • Working-memory capacity protects model-based learning from stress
    • Otto AR, Raio CM, Chiang A, Phelps EA, Daw ND. 2013b. Working-memory capacity protects model-based learning from stress. PNAS 110:20941-46
    • (2013) PNAS , vol.110 , pp. 20941-20946
    • Otto, A.R.1    Raio, C.M.2    Chiang, A.3    Phelps, E.A.4    Daw, N.D.5
  • 95
    • 0029972847
    • Inactivation of hippocampus or caudate nucleus with lidocaine differentially affects expression of place and response learning
    • Packard MG, McGaugh JL. 1996. Inactivation of hippocampus or caudate nucleus with lidocaine differentially affects expression of place and response learning. Neurobiol. Learn. Mem. 65:65-72
    • (1996) Neurobiol. Learn. Mem , vol.65 , pp. 65-72
    • Packard, M.G.1    McGaugh, J.L.2
  • 96
    • 84964311329
    • Reward and choice encoding in terminals of midbrain dopamine neurons depends on striatal target
    • Parker NF, Cameron CM, Taliaferro JP, Lee J, Choi JY, et al. 2016. Reward and choice encoding in terminals of midbrain dopamine neurons depends on striatal target. Nat. Neurosci. 19:845-54
    • (2016) Nat. Neurosci , vol.19 , pp. 845-854
    • Parker, N.F.1    Cameron, C.M.2    Taliaferro, J.P.3    Lee, J.4    Choi, J.Y.5
  • 98
    • 33748302924
    • Dopamine-dependent prediction errors underpin reward-seeking behaviour in humans
    • Pessiglione M, Seymour B, Flandin G, Dolan RJ, Frith CD. 2006. Dopamine-dependent prediction errors underpin reward-seeking behaviour in humans. Nature 442:1042-45
    • (2006) Nature , vol.442 , pp. 1042-1045
    • Pessiglione, M.1    Seymour, B.2    Flandin, G.3    Dolan, R.J.4    Frith, C.D.5
  • 99
    • 84877578934
    • Hippocampal place-cell sequences depict future paths to remembered goals
    • Pfeiffer BE, Foster DJ. 2013. Hippocampal place-cell sequences depict future paths to remembered goals. Nature 497:74-79
    • (2013) Nature , vol.497 , pp. 74-79
    • Pfeiffer, B.E.1    Foster, D.J.2
  • 101
    • 79960241771
    • Decision making under uncertainty: A neural model based on partially observable Markov decision processes
    • Rao RP. 2010. Decision making under uncertainty: a neural model based on partially observable Markov decision processes. Front. Comput. Neurosci. 4:146
    • (2010) Front. Comput. Neurosci , vol.4 , pp. 146
    • Rao, R.P.1
  • 102
    • 10344225664
    • Addiction as a computational process gone awry
    • Redish AD. 2004. Addiction as a computational process gone awry. Science 306:1944-47
    • (2004) Science , vol.306 , pp. 1944-1947
    • Redish, A.D.1
  • 104
    • 80055099662
    • The hippocampus is functionally connected to the striatum and orbitofrontal cortex during context dependent decision making
    • Ross RS, Sherrill KR, Stern CE. 2011. The hippocampus is functionally connected to the striatum and orbitofrontal cortex during context dependent decision making. Brain Res. 1423:53-66
    • (2011) Brain Res , vol.1423 , pp. 53-66
    • Ross, R.S.1    Sherrill, K.R.2    Stern, C.E.3
  • 105
    • 84964211344
    • Midbrain dopamine neurons compute inferred and cached value prediction errors in a common framework
    • Sadacca BF, Jones JL, Schoenbaum G. 2016. Midbrain dopamine neurons compute inferred and cached value prediction errors in a common framework. eLife 5:e13665
    • (2016) eLife , vol.5 , pp. e13665
    • Sadacca, B.F.1    Jones, J.L.2    Schoenbaum, G.3
  • 108
    • 70349967987
    • Selective impairment of prediction error signaling in human dorsolateral but not ventral striatum in Parkinson's disease patients: Evidence from a model-based fMRI study
    • Schonberg T, O'Doherty JP, Joel D, Inzelberg R, Segev Y, Daw ND. 2010. Selective impairment of prediction error signaling in human dorsolateral but not ventral striatum in Parkinson's disease patients: evidence from a model-based fMRI study. NeuroImage 49:772-81
    • (2010) NeuroImage , vol.49 , pp. 772-781
    • Schonberg, T.1    O'Doherty, J.P.2    Joel, D.3    Inzelberg, R.4    Segev, Y.5    Daw, N.D.6
  • 109
    • 0030896968
    • A neural substrate of prediction and reward
    • Schultz W, Dayan P, Montague PR. 1997. A neural substrate of prediction and reward. Science 275:1593-99
    • (1997) Science , vol.275 , pp. 1593-1599
    • Schultz, W.1    Dayan, P.2    Montague, P.R.3
  • 110
    • 84860163389
    • Serotonin selectively modulates reward value in human decision-making
    • Seymour B, Daw ND, Roiser JP, Dayan P, Dolan R. 2012. Serotonin selectively modulates reward value in human decision-making. J. Neurosci. 32:5833-42
    • (2012) J. Neurosci , vol.32 , pp. 5833-5842
    • Seymour, B.1    Daw, N.D.2    Roiser, J.P.3    Dayan, P.4    Dolan, R.5
  • 111
    • 84941026115
    • Integrating memories to guide decisions
    • Shohamy D, Daw ND. 2015. Integrating memories to guide decisions. Curr. Opin. Behav. Sci. 5:85-90
    • (2015) Curr. Opin. Behav. Sci. , vol.5 , pp. 85-90
    • Shohamy, D.1    Daw, N.D.2
  • 112
    • 9644283111
    • The role of dopamine in cognitive sequence learning: Evidence from Parkinson's disease
    • Shohamy D, Myers CE, Grossman S, Sage J, Gluck MA. 2005. The role of dopamine in cognitive sequence learning: evidence from Parkinson's disease. Behav. Brain Res. 156:191-99
    • (2005) Behav. Brain Res , vol.156 , pp. 191-199
    • Shohamy, D.1    Myers, C.E.2    Grossman, S.3    Sage, J.4    Gluck, M.A.5
  • 113
    • 53849090288
    • Integrating memories in the human brain: Hippocampal-midbrain encoding of overlapping events
    • Shohamy D, Wagner AD. 2008. Integrating memories in the human brain: hippocampal-midbrain encoding of overlapping events. Neuron 60:378-89
    • (2008) Neuron , vol.60 , pp. 378-389
    • Shohamy, D.1    Wagner, A.D.2
  • 114
    • 0000275661
    • Choice in context: Tradeoff contrast and extremeness aversion
    • Simonson I, Tversky A. 1992. Choice in context: tradeoff contrast and extremeness aversion. J. Mark. Res. 29:281-95
    • (1992) J. Mark. Res , vol.29 , pp. 281-295
    • Simonson, I.1    Tversky, A.2
  • 115
    • 33645947016
    • Mistake #37: The effect of previously encountered prices on current housing demand
    • Simonsohn U, Loewenstein G. 2006. Mistake #37: the effect of previously encountered prices on current housing demand. Econ. J. 116:175-99
    • (2006) Econ. J , vol.116 , pp. 175-199
    • Simonsohn, U.1    Loewenstein, G.2
  • 116
    • 0030012117
    • Replay of neuronal firing sequences in rat hippocampus during sleep following spatial experience
    • Skaggs WE, McNaughton BL. 1996. Replay of neuronal firing sequences in rat hippocampus during sleep following spatial experience. Science 271:1870-73
    • (1996) Science , vol.271 , pp. 1870-1873
    • Skaggs, W.E.1    McNaughton, B.L.2
  • 117
    • 84941703217
    • Evidence integration in model-based tree search
    • Solway A, Botvinick MM. 2015. Evidence integration in model-based tree search. PNAS 112:11708-13
    • (2015) PNAS , vol.112 , pp. 11708-11713
    • Solway, A.1    Botvinick, M.M.2
  • 118
    • 0026847039
    • Memory and the hippocampus: A synthesis from findings with rats, monkeys, and humans
    • Squire LR. 1992. Memory and the hippocampus: a synthesis from findings with rats, monkeys, and humans. Psychol. Rev. 99:195-231
    • (1992) Psychol. Rev. , vol.99 , pp. 195-231
    • Squire, L.R.1
  • 121
    • 27944495741
    • Absolute identification by relative judgment
    • Stewart N, Brown GD, Chater N. 2005. Absolute identification by relative judgment. Psychol. Rev. 112:881-911
    • (2005) Psychol. Rev. , vol.112 , pp. 881-911
    • Stewart, N.1    Brown, G.D.2    Chater, N.3
  • 123
    • 33847202724
    • Learning to predict by the methods of temporal differences
    • Sutton RS. 1988. Learning to predict by the methods of temporal differences. Mach. Learn. 3:9-44
    • (1988) Mach. Learn , vol.3 , pp. 9-44
    • Sutton, R.S.1
  • 124
    • 0012929784
    • Dyna, an integrated architecture for learning, planning, and reacting
    • Sutton RS. 1991. Dyna, an integrated architecture for learning, planning, and reacting. ACM SIGART Bull. 2:160-63
    • (1991) ACM SIGART Bull , vol.2 , pp. 160-163
    • Sutton, R.S.1
  • 126
    • 0034704229
    • A global geometric framework for nonlinear dimensionality reduction
    • Tenenbaum JB, De Silva V, Langford JC. 2000. A global geometric framework for nonlinear dimensionality reduction. Science 290:2319-23
    • (2000) Science , vol.290 , pp. 2319-2323
    • Tenenbaum, J.B.1    De Silva, V.2    Langford, J.C.3
  • 127
    • 77549088095
    • Learning to use working memory in partially observable environments through dopaminergic reinforcement
    • Todd MT, Niv Y, Cohen JD. 2008. Learning to use working memory in partially observable environments through dopaminergic reinforcement. Adv. Neural Inf. Process. Sys. 21:1689-96
    • (2008) Adv. Neural Inf. Process. Sys , vol.21 , pp. 1689-1696
    • Todd, M.T.1    Niv, Y.2    Cohen, J.D.3
  • 128
    • 58149442669
    • Cognitive maps in rats and men
    • Tolman EC. 1948. Cognitive maps in rats and men. Psychol. Rev. 55:189-208
    • (1948) Psychol. Rev. , vol.55 , pp. 189-208
    • Tolman, E.C.1
  • 129
    • 0000838862
    • Episodic and semantic memory 1
    • ed. E Tulving, W Donaldson, New York: Academic
    • Tulving E. 1972. Episodic and semantic memory 1. In Organization and Memory, ed. E Tulving, W Donaldson, pp. 381-402. New York: Academic
    • (1972) Organization and Memory , pp. 381-402
    • Tulving, E.1
  • 130
    • 84941785218
    • Ventromedial frontal cortex is critical for guiding attention to reward-predictive visual features in humans
    • Vaidya AR, Fellows LK. 2015. Ventromedial frontal cortex is critical for guiding attention to reward-predictive visual features in humans. J. Neurosci. 35:12813-23
    • (2015) J. Neurosci , vol.35 , pp. 12813-12823
    • Vaidya, A.R.1    Fellows, L.K.2
  • 131
    • 79951967897
    • Theta phase precession in rat ventral striatum links place and reward information
    • van der Meer MA, Redish AD. 2011. Theta phase precession in rat ventral striatum links place and reward information. J. Neurosci. 31:2843-54
    • (2011) J. Neurosci , vol.31 , pp. 2843-2854
    • van der Meer, M.A.1    Redish, A.D.2
  • 133
    • 84908518146
    • Episodic memory encoding interferes with reward learning and decreases striatal prediction errors
    • Wimmer GE, Braun EK, Daw ND, Shohamy D. 2014. Episodic memory encoding interferes with reward learning and decreases striatal prediction errors. J. Neurosci. 34:14901-12
    • (2014) J. Neurosci , vol.34 , pp. 14901-14912
    • Wimmer, G.E.1    Braun, E.K.2    Daw, N.D.3    Shohamy, D.4
  • 134
    • 84867287309
    • Preference by association: How memory mechanisms in the hippocampus bias decisions
    • Wimmer GE, Shohamy D. 2012. Preference by association: how memory mechanisms in the hippocampus bias decisions. Science 338:270-73
    • (2012) Science , vol.338 , pp. 270-273
    • Wimmer, G.E.1    Shohamy, D.2
  • 135
    • 39849087310
    • Modeling the role of working memory and episodic memory in behavioral tasks
    • Zilli EA, Hasselmo ME. 2008. Modeling the role of working memory and episodic memory in behavioral tasks. Hippocampus 18:193-209
    • (2008) Hippocampus , vol.18 , pp. 193-209
    • Zilli, E.A.1    Hasselmo, M.E.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.