SCOPUS information search platform

Annual Review of Psychology

Volume 68, 2017, Pages 101-128

Reinforcement Learning and Episodic Memory in Humans and Animals: An Integrative Framework

(2) Gershman, Samuel J a Daw, Nathaniel D b

a HARVARD UNIVERSITY (United States)

b PRINCETON UNIVERSITY (United States)

Author keywords

Decision making; Memory; Reinforcement learning

Indexed keywords

ANIMAL; BRAIN; DECISION MAKING; EPISODIC MEMORY; HUMAN; LEARNING; PHYSIOLOGY; REINFORCEMENT; REWARD;

ANIMALS; BRAIN; DECISION MAKING; HUMANS; LEARNING; MEMORY, EPISODIC; REINFORCEMENT (PSYCHOLOGY); REWARD;

EID: 85009518318 PISSN: 00664308 EISSN: 15452085 Source Type: Book Series
DOI: 10.1146/annurev-psych-122414-033625 Document Type: Article

Times cited : (340)

References (135)

1
- 84946268134
- Variations in the sensitivity of instrumental responding to reinforcer devaluation
- Adams CD. 1982. Variations in the sensitivity of instrumental responding to reinforcer devaluation. Q. J. Exp. Psychol. 34:77-98
- (1982) Q. J. Exp. Psychol. , vol.34 , pp. 77-98
- Adams, C.D.¹

2
- 0025321039
- Functional architecture of basal ganglia circuits: Neural substrates of parallel processing
- Alexander GE, Crutcher MD. 1990. Functional architecture of basal ganglia circuits: neural substrates of parallel processing. Trends Neurosci. 13:266-71
- (1990) Trends Neurosci , vol.13 , pp. 266-271
- Alexander, G.E.¹ Crutcher, M.D.²

3
- 0038458652
- Small feedback-based decisions and their limited correspondence to description-based decisions
- Barron G, Erev I. 2003. Small feedback-based decisions and their limited correspondence to description-based decisions. J. Behav. Decis. Making 16:215-33
- (2003) J. Behav. Decis. Making , vol.16 , pp. 215-233
- Barron, G.¹ Erev, I.²

4
- 21544435722
- Midbrain dopamine neurons encode a quantitative reward prediction error signal
- Bayer HM, Glimcher PW. 2005. Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron 47:129-41
- (2005) Neuron , vol.47 , pp. 129-141
- Bayer, H.M.¹ Glimcher, P.W.²

5
- 85012688561
- Princeton, NJ: Princeton Univ. Press
- Bellman R. 1957. Dynamic Programming. Princeton, NJ: Princeton Univ. Press
- (1957) Dynamic Programming
- Bellman, R.¹

6
- 0003487482
- Nashua, NH: Athena Sci.
- Bertsekas DP, Tsitsiklis JN. 1996. Neuro-Dynamic Programming. Nashua, NH: Athena Sci.
- (1996) Neuro-Dynamic Programming
- Bertsekas, D.P.¹ Tsitsiklis, J.N.²

7
- 0004093003
- Boston: Addison-Wesley
- Bettman JR. 1979. Information Processing Theory of Consumer Choice. Boston: Addison-Wesley
- (1979) Information Processing Theory of Consumer Choice
- Bettman, J.R.¹

8
- 67349181341
- Learning, risk attitude and hot stoves in restless bandit problems
- Biele G, Erev I, Ert E. 2009. Learning, risk attitude and hot stoves in restless bandit problems. J. Math. Psychol. 53(3):155-67
- (2009) J. Math. Psychol. , vol.53 , Issue.3 , pp. 155-167
- Biele, G.¹ Erev, I.² Ert, E.³

9
- 84996573175
- What's past is present: Reminders of past choices bias decisions for reward in humans
- Bornstein AM, Khaw MW, Shohamy D, Daw ND. 2015. What's past is present: Reminders of past choices bias decisions for reward in humans. bioRxiv 033910. doi: 10.1101/033910
- (2015) BioRxiv 033910
- Bornstein, A.M.¹ Khaw, M.W.² Shohamy, D.³ Daw, N.D.⁴

10
- 70350566799
- Hierarchically organized behavior and its neural foundations: A reinforcement learning perspective
- Botvinick MM, Niv Y, Barto AC. 2009. Hierarchically organized behavior and its neural foundations: a reinforcement learning perspective. Cognition 113:262-80
- (2009) Cognition , vol.113 , pp. 262-280
- Botvinick, M.M.¹ Niv, Y.² Barto, A.C.³

11
- 52849104617
- On the control of control: The role of dopamine in regulating prefrontal function and working memory
- ed. S Monsell, J Driver, Cambridge, MA: MIT Press
- Braver TS, Cohen JD. 2000. On the control of control: the role of dopamine in regulating prefrontal function and working memory. In Control of Cognitive Processes: Attention and Performance XVIII, ed. S Monsell, J Driver, pp. 713-37. Cambridge, MA: MIT Press
- (2000) Control of Cognitive Processes: Attention and Performance XVIII , pp. 713-737
- Braver, T.S.¹ Cohen, J.D.²

12
- 0038862801
- Sensory pre-conditioning
- Brogden W. 1939. Sensory pre-conditioning. J. Exp. Psychol. 25:323-32
- (1939) J. Exp. Psychol. , vol.25 , pp. 323-332
- Brogden, W.¹

13
- 84856940819
- Cooperative interactions between hippocampal and striatal systems support flexible navigation
- Brown TI, Ross RS, Tobyne SM, Stern CE. 2012. Cooperative interactions between hippocampal and striatal systems support flexible navigation. NeuroImage 60:1316-30
- (2012) NeuroImage , vol.60 , pp. 1316-1330
- Brown, T.I.¹ Ross, R.S.² Tobyne, S.M.³ Stern, C.E.⁴

14
- 79251550466
- Hippocampal replay in the awake state: A potential substrate for memory consolidation and retrieval
- Carr MF, Jadhav SP, Frank LM. 2011. Hippocampal replay in the awake state: a potential substrate for memory consolidation and retrieval. Nat. Neurosci. 14:147-53
- (2011) Nat. Neurosci , vol.14 , pp. 147-153
- Carr, M.F.¹ Jadhav, S.P.² Frank, L.M.³

15
- 0036529871
- Computational perspectives on dopamine function in prefrontal cortex
- Cohen JD, Braver TS, Brown JW. 2002. Computational perspectives on dopamine function in prefrontal cortex. Curr. Opin. Neurobiol. 12:223-29
- (2002) Curr. Opin. Neurobiol , vol.12 , pp. 223-229
- Cohen, J.D.¹ Braver, T.S.² Brown, J.W.³

16
- 84856431209
- Neuron-type-specific signals for reward and punishment in the ventral tegmental area
- Cohen JY, Haesler S, Vong L, Lowell BB, Uchida N. 2012. Neuron-type-specific signals for reward and punishment in the ventral tegmental area. Nature 482:85-88
- (2012) Nature , vol.482 , pp. 85-88
- Cohen, J.Y.¹ Haesler, S.² Vong, L.³ Lowell, B.B.⁴ Uchida, N.⁵

17
- 84912082331
- Working memory contributions to reinforcement learning impairments in schizophrenia
- Collins AG, Brown JK, Gold JM, Waltz JA, Frank MJ. 2014. Working memory contributions to reinforcement learning impairments in schizophrenia. J. Neurosci. 34:13747-56
- (2014) J. Neurosci , vol.34 , pp. 13747-13756
- Collins, A.G.¹ Brown, J.K.² Gold, J.M.³ Waltz, J.A.⁴ Frank, M.J.⁵

18
- 84859317336
- How much of reinforcement learning is working memory, not reinforcement learning? A behavioral, computational, and neurogenetic analysis
- Collins AG, Frank MJ. 2012. How much of reinforcement learning is working memory, not reinforcement learning? A behavioral, computational, and neurogenetic analysis. Eur. J. Neurosci. 35:1024-35
- (2012) Eur. J. Neurosci , vol.35 , pp. 1024-1035
- Collins, A.G.¹ Frank, M.J.²

19
- 84946811477
- Habitual control of goal selection in humans
- Cushman F, Morris A. 2015. Habitual control of goal selection in humans. PNAS 112:13817-22
- (2015) PNAS , vol.112 , pp. 13817-13822
- Cushman, F.¹ Morris, A.²

20
- 84897397355
- Advanced reinforcement learning
- Daw ND. 2013. Advanced reinforcement learning. See Glimcher & Fehr 2013, pp. 299-320
- (2013) See Glimcher & Fehr 2013 , pp. 299-320
- Daw, N.D.¹

21
- 33745787929
- Representation and timing in theories of the dopamine system
- Daw ND, Courville AC, Touretzky DS. 2006. Representation and timing in theories of the dopamine system. Neural Comput. 18:1637-77
- (2006) Neural Comput , vol.18 , pp. 1637-1677
- Daw, N.D.¹ Courville, A.C.² Touretzky, D.S.³

22
- 84907545889
- The algorithmic anatomy of model-based evaluation
- Daw ND, Dayan P. 2014. The algorithmic anatomy of model-based evaluation. Philos. Trans. R. Soc. Lond. B Biol. Sci. 369:20130478
- (2014) Philos. Trans. R. Soc. Lond. B Biol. Sci. , vol.369 , pp. 20130478
- Daw, N.D.¹ Dayan, P.²

23
- 79952746011
- Model-based influences on humans' choices and striatal prediction errors
- Daw ND, Gershman SJ, Seymour B, Dayan P, Dolan RJ. 2011. Model-based influences on humans' choices and striatal prediction errors. Neuron 69:1204-15
- (2011) Neuron , vol.69 , pp. 1204-1215
- Daw, N.D.¹ Gershman, S.J.² Seymour, B.³ Dayan, P.⁴ Dolan, R.J.⁵

24
- 28044450875
- Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control
- Daw ND, Niv Y, Dayan P. 2005. Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nat. Neurosci. 8:1704-11
- (2005) Nat. Neurosci , vol.8 , pp. 1704-1711
- Daw, N.D.¹ Niv, Y.² Dayan, P.³

25
- 84885024210
- Multiple systems for value learning
- Daw ND, O'Doherty JP. 2013. Multiple systems for value learning. See Glimcher & Fehr 2013, pp. 393-410
- (2013) See Glimcher & Fehr 2013 , pp. 393-410
- Daw, N.D.¹ O'Doherty, J.P.²

26
- 84897401223
- Value learning through reinforcement: The basics of dopamine and reinforcement learning
- Daw ND, Tobler PN. 2013. Value learning through reinforcement: the basics of dopamine and reinforcement learning. See Glimcher & Fehr 2013, pp. 283-98
- (2013) See Glimcher & Fehr 2013 , pp. 283-298
- Daw, N.D.¹ Tobler, P.N.²

27
- 0001158047
- Improving generalization for temporal difference learning: The successor representation
- Dayan P. 1993. Improving generalization for temporal difference learning: the successor representation. Neural Comput. 5:613-24
- (1993) Neural Comput , vol.5 , pp. 613-624
- Dayan, P.¹

28
- 84892682926
- Actions, action sequences and habits: Evidence that goal-directed and habitual action control are hierarchically organized
- Dezfouli A, Balleine BW. 2013. Actions, action sequences and habits: evidence that goal-directed and habitual action control are hierarchically organized. PLOS Comput. Biol. 9:e1003364
- (2013) PLOS Comput. Biol , vol.9 , pp. e1003364
- Dezfouli, A.¹ Balleine, B.W.²

29
- 0043250430
- The role of learning in the operation of motivational systems
- Learning, Motivation and Emotion, ed. CR Gallistel, New York: John Wiley & Sons. 3rd ed
- Dickinson A, Balleine BW. 2002. The role of learning in the operation of motivational systems. In Stevens' Handbook of Experimental Psychology, Volume 3: Learning, Motivation and Emotion, ed. CR Gallistel, pp. 497-534. New York: John Wiley & Sons. 3rd ed
- (2002) Stevens' Handbook of Experimental Psychology , vol.3 , pp. 497-534
- Dickinson, A.¹ Balleine, B.W.²

30
- 84875468581
- Hierarchical learning induces two simultaneous, but separable, prediction errors in human basal ganglia
- Diuk C, Tsai K, Wallis J, Botvinick M, Niv Y. 2013. Hierarchical learning induces two simultaneous, but separable, prediction errors in human basal ganglia. J. Neurosci. 33:5797-805
- (2013) J. Neurosci , vol.33 , pp. 5797-5805
- Diuk, C.¹ Tsai, K.² Wallis, J.³ Botvinick, M.⁴ Niv, Y.⁵

31
- 84885802926
- Goals and habits in the brain
- Dolan RJ, Dayan P. 2013. Goals and habits in the brain. Neuron 80:312-25
- (2013) Neuron , vol.80 , pp. 312-325
- Dolan, R.J.¹ Dayan, P.²

32
- 84928701322
- Model-based choices involve prospective neural activity
- Doll BB, Duncan KD, Simon DA, Shohamy D, Daw ND. 2015. Model-based choices involve prospective neural activity. Nat. Neurosci. 18:767-72
- (2015) Nat. Neurosci , vol.18 , pp. 767-772
- Doll, B.B.¹ Duncan, K.D.² Simon, D.A.³ Shohamy, D.⁴ Daw, N.D.⁵

33
- 84921302012
- Oxford, UK: Oxford Univ. Press
- Eichenbaum H, Cohen NJ. 2004. From Conditioning to Conscious Recollection: Memory Systems of the Brain. Oxford, UK: Oxford Univ. Press
- (2004) From Conditioning to Conscious Recollection: Memory Systems of the Brain
- Eichenbaum, H.¹ Cohen, N.J.²

34
- 31844451013
- Reinforcement learning with Gaussian processes
- 22nd, Bonn, Ger., New York: Assoc. Comput. Mach
- Engel Y, Mannor S, Meir R. 2005. Reinforcement learning with Gaussian processes. Proc. Int. Conf. Mach. Learn., 22nd, Bonn, Ger., pp. 201-8. New York: Assoc. Comput. Mach
- (2005) Proc. Int. Conf. Mach. Learn , pp. 201-208
- Engel, Y.¹ Mannor, S.² Meir, R.³

35
- 56849099116
- Loss aversion, diminishing sensitivity, and the effect of experience on repeated decisions
- Erev I, Ert E, Yechiam E. 2008. Loss aversion, diminishing sensitivity, and the effect of experience on repeated decisions. J. Behav. Decis. Making 21:575-97
- (2008) J. Behav. Decis. Making , vol.21 , pp. 575-597
- Erev, I.¹ Ert, E.² Yechiam, E.³

36
- 27644454882
- Neural systems of reinforcement for drug addiction: From actions to habits to compulsion
- Everitt BJ, Robbins TW. 2005. Neural systems of reinforcement for drug addiction: from actions to habits to compulsion. Nat. Neurosci. 8:1481-89
- (2005) Nat. Neurosci , vol.8 , pp. 1481-1489
- Everitt, B.J.¹ Robbins, T.W.²

37
- 79952059195
- What constitutes an episode in episodic memory?
- Ezzyat Y, Davachi L. 2011. What constitutes an episode in episodic memory? Psychol. Sci. 22(2):243-52
- (2011) Psychol. Sci. , vol.22 , Issue.2 , pp. 243-252
- Ezzyat, Y.¹ Davachi, L.²

38
- 0033968832
- A model of hippocampally dependent navigation, using the temporal difference learning rule
- Foster DJ, Morris RGM, Dayan P. 2000. A model of hippocampally dependent navigation, using the temporal difference learning rule. Hippocampus 10:1-16
- (2000) Hippocampus , vol.10 , pp. 1-16
- Foster, D.J.¹ Morris, R.G.M.² Dayan, P.³

39
- 10344250993
- By carrot or by stick: Cognitive reinforcement learning in parkinsonism
- Frank MJ, Seeberger LC, O'Reilly RC. 2004. By carrot or by stick: cognitive reinforcement learning in parkinsonism. Science 306:1940-43
- (2004) Science , vol.306 , pp. 1940-1943
- Frank, M.J.¹ Seeberger, L.C.² O'Reilly, R.C.³

40
- 40949160181
- Solving the credit assignment problem: Explicit and implicit learning of action sequences with probabilistic outcomes
- Fu W-T, Anderson JR. 2008. Solving the credit assignment problem: explicit and implicit learning of action sequences with probabilistic outcomes. Psychol. Res. 72:321-30
- (2008) Psychol. Res , vol.72 , pp. 321-330
- Fu, W.-T.¹ Anderson, J.R.²

41
- 0031602724
- Cognitive neuroscience of human memory
- Gabrieli JD. 1998. Cognitive neuroscience of human memory. Annu. Rev. Psychol. 49:87-115
- (1998) Annu. Rev. Psychol. , vol.49 , pp. 87-115
- Gabrieli, J.D.¹

42
- 4444288656
- Kernels and distances for structured data
- Gärtner T, Lloyd JW, Flach PA. 2004. Kernels and distances for structured data. Mach. Learn. 57:205-32
- (2004) Mach. Learn , vol.57 , pp. 205-232
- Gärtner, T.¹ Lloyd, J.W.² Flach, P.A.³

43
- 0001942829
- Neural networks and the bias/variance dilemma
- Geman S, Bienenstock E, Doursat R. 1992. Neural networks and the bias/variance dilemma. Neural Comput. 4:1-58
- (1992) Neural Comput , vol.4 , pp. 1-58
- Geman, S.¹ Bienenstock, E.² Doursat, R.³

44
- 74049117596
- Context, learning, and extinction
- Gershman SJ, Blei DM, Niv Y. 2010. Context, learning, and extinction. Psychol. Rev. 117:197-209
- (2010) Psychol. Rev. , vol.117 , pp. 197-209
- Gershman, S.J.¹ Blei, D.M.² Niv, Y.³

45
- 84893351295
- Retrospective revaluation in sequential decision making: A tale of two systems
- Gershman SJ, Markman AB, Otto AR. 2014. Retrospective revaluation in sequential decision making: a tale of two systems. J. Exp. Psychol. Gen. 143:182-94
- (2014) J. Exp. Psychol. Gen , vol.143 , pp. 182-194
- Gershman, S.J.¹ Markman, A.B.² Otto, A.R.³

46
- 84938863250
- Discovering latent causes in reinforcement learning
- Gershman SJ, Norman KA, Niv Y. 2015. Discovering latent causes in reinforcement learning. Curr. Opin. Behav. Sci. 5:43-50
- (2015) Curr. Opin. Behav. Sci. , vol.5 , pp. 43-50
- Gershman, S.J.¹ Norman, K.A.² Niv, Y.³

47
- 0003737904
- Cambridge, UK: Cambridge Univ. Press
- Gilboa I, Schmeidler D. 2001. A Theory of Case-Based Decisions. Cambridge, UK: Cambridge Univ. Press
- (2001) A Theory of Case-Based Decisions
- Gilboa, I.¹ Schmeidler, D.²

48
- 84961875552
- Characterizing a psychiatric symptom dimension related to deficits in goal-directed control
- Gillan CM, Kosinski M, Whelan R, Phelps EA, Daw ND. 2016. Characterizing a psychiatric symptom dimension related to deficits in goal-directed control. eLife 5:e11305
- (2016) eLife , vol.5 , pp. e11305
- Gillan, C.M.¹ Kosinski, M.² Whelan, R.³ Phelps, E.A.⁴ Daw, N.D.⁵

49
- 77953260848
- States versus rewards: Dissociable neural prediction error signals underlying model-based and model-free reinforcement learning
- Gläscher J, Daw N, Dayan P, O'Doherty JP. 2010. States versus rewards: dissociable neural prediction error signals underlying model-based and model-free reinforcement learning. Neuron 66:585-95
- (2010) Neuron , vol.66 , pp. 585-595
- Gläscher, J.¹ Daw, N.² Dayan, P.³ O'Doherty, J.P.⁴

50
- 58249121629
- Cambridge, MA: Academic
- Glimcher PW, Fehr E. 2013. Neuroeconomics: Decision-Making and the Brain. Cambridge, MA: Academic
- (2013) Neuroeconomics: Decision-Making and the Brain
- Glimcher, P.W.¹ Fehr, E.²

51
- 80755180871
- Instance-based learning: Integrating sampling and repeated decisions from experience
- Gonzalez C, Dutt V. 2011. Instance-based learning: integrating sampling and repeated decisions from experience. Psychol. Rev. 118:523-51
- (2011) Psychol. Rev. , vol.118 , pp. 523-551
- Gonzalez, C.¹ Dutt, V.²

52
- 0037971564
- Instance-based learning in dynamic decision making
- Gonzalez C, Lerch JF, Lebiere C. 2003. Instance-based learning in dynamic decision making. Cogn. Sci. 27:591-635
- (2003) Cogn. Sci. , vol.27 , pp. 591-635
- Gonzalez, C.¹ Lerch, J.F.² Lebiere, C.³

53
- 77955281093
- Probabilistic models of cognition: Exploring representations and inductive biases
- Griffiths TL, Chater N, Kemp C, Perfors A, Tenenbaum JB. 2010. Probabilistic models of cognition: exploring representations and inductive biases. Trends Cogn. Sci. 14:357-64
- (2010) Trends Cogn. Sci. , vol.14 , pp. 357-364
- Griffiths, T.L.¹ Chater, N.² Kemp, C.³ Perfors, A.⁴ Tenenbaum, J.B.⁵

54
- 80055084825
- Grid cells, place cells, and geodesic generalization for spatial reinforcement learning
- Gustafson NJ, Daw ND. 2011. Grid cells, place cells, and geodesic generalization for spatial reinforcement learning. PLOS Comput. Biol. 7:e1002235
- (2011) PLOS Comput. Biol , vol.7 , pp. e1002235
- Gustafson, N.J.¹ Daw, N.D.²

55
- 45949091429
- Dissociating the role of the orbitofrontal cortex and the striatum in the computation of goal values and prediction errors
- Hare TA, O'Doherty J, Camerer CF, Schultz W, Rangel A. 2008. Dissociating the role of the orbitofrontal cortex and the striatum in the computation of goal values and prediction errors. J. Neurosci. 28:5623-30
- (2008) J. Neurosci , vol.28 , pp. 5623-5630
- Hare, T.A.¹ O'Doherty, J.² Camerer, C.F.³ Schultz, W.⁴ Rangel, A.⁵

56
- 84892388605
- Phasic dopamine release in the rat nucleus accumbens symmetrically encodes a reward prediction error term
- Hart AS, Rutledge RB, Glimcher PW, Phillips PE. 2014. Phasic dopamine release in the rat nucleus accumbens symmetrically encodes a reward prediction error term. J. Neurosci. 34:698-704
- (2014) J. Neurosci , vol.34 , pp. 698-704
- Hart, A.S.¹ Rutledge, R.B.² Glimcher, P.W.³ Phillips, P.E.⁴

57
- 66149096142
- The construction system of the brain
- Hassabis D, Maguire EA. 2009. The construction system of the brain. Philos. Trans. R. Soc. Lond. B Biol. Sci. 364(1521):1263-71
- (2009) Philos. Trans. R. Soc. Lond. B Biol. Sci. , vol.364 , Issue.1521 , pp. 1263-1271
- Hassabis, D.¹ Maguire, E.A.²

58
- 56849118500
- The description-experience gap in risky choice: The role of sample size and experienced probabilities
- Hau R, Pleskac TJ, Kiefer J, Hertwig R. 2008. The description-experience gap in risky choice: the role of sample size and experienced probabilities. J. Behav. Decis. Making 21:493-518
- (2008) J. Behav. Decis. Making , vol.21 , pp. 493-518
- Hau, R.¹ Pleskac, T.J.² Kiefer, J.³ Hertwig, R.⁴

59
- 70449671239
- The description-experience gap in risky choice
- Hertwig R, Erev I. 2009. The description-experience gap in risky choice. Trends Cogn. Sci. 13:517-23
- (2009) Trends Cogn. Sci. , vol.13 , pp. 517-523
- Hertwig, R.¹ Erev, I.²

60
- 0002861883
- A model of how the basal ganglia generate and use neural signals that predict reinforcement
- ed. JC Houk, JL Davis, DG Beiser, Cambridge, MA: MIT Press
- Houk JC, Adams JL, Barto AG. 1995. A model of how the basal ganglia generate and use neural signals that predict reinforcement. In Models of Information Processing in the Basal Ganglia, ed. JC Houk, JL Davis, DG Beiser, pp. 249-70. Cambridge, MA: MIT Press
- (1995) Models of Information Processing in the Basal Ganglia , pp. 249-270
- Houk, J.C.¹ Adams, J.L.² Barto, A.G.³

61
- 84924325916
- Interplay of approximate planning strategies
- Huys QJ, Lally N, Faulkner P, Eshel N, Seifritz E, et al. 2015. Interplay of approximate planning strategies. PNAS 112:3098-103
- (2015) PNAS , vol.112 , pp. 3098-3103
- Huys, Q.J.¹ Lally, N.² Faulkner, P.³ Eshel, N.⁴ Seifritz, E.⁵

62
- 69949111927
- Does cognitive science need kernels?
- Jäkel F, Schölkopf B, Wichmann FA. 2009. Does cognitive science need kernels? Trends Cogn. Sci. 13:381-88
- (2009) Trends Cogn. Sci. , vol.13 , pp. 381-388
- Jäkel, F.¹ Schölkopf, B.² Wichmann, F.A.³

63
- 36048937548
- Neural ensembles in CA3 transiently encode paths forward of the animal at a decision point
- Johnson A, Redish AD. 2007. Neural ensembles in CA3 transiently encode paths forward of the animal at a decision point. J. Neurosci. 27:12176-89
- (2007) J. Neurosci , vol.27 , pp. 12176-12189
- Johnson, A.¹ Redish, A.D.²

64
- 0032073263
- Planning and acting in partially observable stochastic domains
- Kaelbling LP, Littman ML, Cassandra AR. 1998. Planning and acting in partially observable stochastic domains. Artif. Intell. 101:99-134
- (1998) Artif. Intell. , vol.101 , pp. 99-134
- Kaelbling, L.P.¹ Littman, M.L.² Cassandra, A.R.³

65
- 0000050520
- Norm theory: Comparing reality to its alternatives
- Kahneman D, Miller DT. 1986. Norm theory: comparing reality to its alternatives. Psychol. Rev. 93:136-53
- (1986) Psychol. Rev. , vol.93 , pp. 136-153
- Kahneman, D.¹ Miller, D.T.²

66
- 85009479014
- Prospect theory: An analysis of decision under risk
- Kahneman D, Tversky A. 1979. Prospect theory: an analysis of decision under risk. Econometrica 47:263-91
- (1979) Econometrica , vol.47 , pp. 263-291
- Kahneman, D.¹ Tversky, A.²

67
- 26944457467
- "Bias-variance" error bounds for temporal difference updates
- 13th, Stanford, CA, New York: Assoc. Comput. Mach
- Kearns MJ, Singh SP. 2000. "Bias-variance" error bounds for temporal difference updates. Proc. Annu. Conf. Comput. Learn. Theory, 13th, Stanford, CA, pp. 142-47. New York: Assoc. Comput. Mach
- (2000) Proc. Annu. Conf. Comput. Learn. Theory , pp. 142-147
- Kearns, M.J.¹ Singh, S.P.²

68
- 79958143780
- Speed/accuracy trade-off between the habitual and the goal-directed processes
- Keramati M, Dezfouli A, Piray P. 2011. Speed/accuracy trade-off between the habitual and the goal-directed processes. PLOS Comput. Biol. 7:e1002055
- (2011) PLOS Comput. Biol , vol.7 , pp. e1002055
- Keramati, M.¹ Dezfouli, A.² Piray, P.³

69
- 0029815726
- A neostriatal habit learning system in humans
- Knowlton BJ, Mangels JA, Squire LR. 1996. A neostriatal habit learning system in humans. Science 273:1399-402
- (1996) Science , vol.273 , pp. 1399-1402
- Knowlton, B.J.¹ Mangels, J.A.² Squire, L.R.³

70
- 0026477904
- ALCOVE: An exemplar-based connectionist model of category learning
- Kruschke JK. 1992. ALCOVE: an exemplar-based connectionist model of category learning. Psychol. Rev. 99:22-44
- (1992) Psychol. Rev. , vol.99 , pp. 22-44
- Kruschke, J.K.¹

71
- 84964695586
- Temporal structure in associative retrieval
- Kurth-Nelson Z, Barnes G, Sejdinovic D, Dolan R, Dayan P. 2015. Temporal structure in associative retrieval. eLife 4:e04919
- (2015) eLife , vol.4 , pp. e04919
- Kurth-Nelson, Z.¹ Barnes, G.² Sejdinovic, D.³ Dolan, R.⁴ Dayan, P.⁵

72
- 69349092175
- Hippocampus leads ventral striatum in replay of place-reward information
- Lansink CS, Goltstein PM, Lankelma JV, McNaughton BL, Pennartz CM. 2009. Hippocampus leads ventral striatum in replay of place-reward information. PLOS Biol. 7:e1000173
- (2009) PLOS Biol , vol.7 , pp. e1000173
- Lansink, C.S.¹ Goltstein, P.M.² Lankelma, J.V.³ McNaughton, B.L.⁴ Pennartz, C.M.⁵

73
- 33644782012
- Dynamic response-by-response models of matching behavior in rhesus monkeys
- Lau B, Glimcher PW. 2005. Dynamic response-by-response models of matching behavior in rhesus monkeys. J. Exp. Anal. Behav. 84:555-79
- (2005) J. Exp. Anal. Behav. , vol.84 , pp. 555-579
- Lau, B.¹ Glimcher, P.W.²

74
- 84893508813
- Neural computations underlying arbitration between model-based and model-free learning
- Lee SW, Shimojo S, O'Doherty JP. 2014. Neural computations underlying arbitration between model-based and model-free learning. Neuron 81:687-99
- (2014) Neuron , vol.81 , pp. 687-699
- Lee, S.W.¹ Shimojo, S.² O'Doherty, J.P.³

75
- 85162020429
- Hippocampal contributions to control: The third way
- Lengyel M, Dayan P. 2007. Hippocampal contributions to control: the third way. Adv. Neural Inf. Process. Syst. 20:889-96
- (2007) Adv. Neural Inf. Process. Syst. , vol.20 , pp. 889-896
- Lengyel, M.¹ Dayan, P.²

76
- 85015536997
- The high availability of extreme events serves resource-rational decision-making
- 36th, Quebec City, Can., Wheat Ridge, CO: Cogn. Sci. Soc
- Lieder F, Hsu M, Griffiths TL. 2014. The high availability of extreme events serves resource-rational decision-making. Proc. Ann. Conf. Cogn. Sci. Soc., 36th, Quebec City, Can., pp. 2567-72. Wheat Ridge, CO: Cogn. Sci. Soc
- (2014) Proc. Ann. Conf. Cogn. Sci. Soc , pp. 2567-2572
- Lieder, F.¹ Hsu, M.² Griffiths, T.L.³

77
- 1942539715
- SUSTAIN: A network model of category learning
- Love BC, Medin DL, Gureckis TM. 2004. SUSTAIN: a network model of category learning. Psychol. Rev. 111:309-32
- (2004) Psychol. Rev. , vol.111 , pp. 309-332
- Love, B.C.¹ Medin, D.L.² Gureckis, T.M.³

78
- 84925731304
- Priming memories of past wins induces risk seeking
- Ludvig EA, Madan CR, Spetch ML. 2015. Priming memories of past wins induces risk seeking. J. Exp. Psychol. Gen. 144:24-29
- (2015) J. Exp. Psychol. Gen , vol.144 , pp. 24-29
- Ludvig, E.A.¹ Madan, C.R.² Spetch, M.L.³

79
- 57349130536
- Stimulus representation and the timing of reward-prediction errors in models of the dopamine system
- Ludvig EA, Sutton RS, Kehoe EJ. 2008. Stimulus representation and the timing of reward-prediction errors in models of the dopamine system. Neural Comput. 20:3034-54
- (2008) Neural Comput , vol.20 , pp. 3034-3054
- Ludvig, E.A.¹ Sutton, R.S.² Kehoe, E.J.³

80
- 0002674217
- Memory and attentional factors in consumer choice: Concepts and research methods
- Lynch JG Jr., Srull TK. 1982. Memory and attentional factors in consumer choice: concepts and research methods. J. Consum. Res. 9:18-37
- (1982) J. Consum. Res , vol.9 , pp. 18-37
- Lynch, J.G.¹ Srull, T.K.²

81
- 84901298185
- Remembering the best and worst of times: Memories for extreme outcomes bias risky decisions
- Madan CR, Ludvig EA, Spetch ML. 2014. Remembering the best and worst of times: memories for extreme outcomes bias risky decisions. Psychon. Bull. Rev. 21:629-36
- (2014) Psychon. Bull. Rev. , vol.21 , pp. 629-636
- Madan, C.R.¹ Ludvig, E.A.² Spetch, M.L.³

82
- 35748957806
- Proto-value functions: A Laplacian framework for learning representation and control in Markov decision processes
- Mahadevan S, Maggioni M. 2007. Proto-value functions: a Laplacian framework for learning representation and control in Markov decision processes. J. Mach. Learn. Res. 8:2169-231
- (2007) J. Mach. Learn. Res , vol.8 , pp. 2169-2231
- Mahadevan, S.¹ Maggioni, M.²

83
- 84924051598
- Human-level control through deep reinforcement learning
- Mnih V, Kavukcuoglu K, Silver D, Rusu AA, Veness J, et al. 2015. Human-level control through deep reinforcement learning. Nature 518:529-33
- (2015) Nature , vol.518 , pp. 529-533
- Mnih, V.¹ Kavukcuoglu, K.² Silver, D.³ Rusu, A.A.⁴ Veness, J.⁵

84
- 0029981543
- A framework for mesencephalic dopamine systems based on predictive Hebbian learning
- Montague PR, Dayan P, Sejnowski TJ. 1996. A framework for mesencephalic dopamine systems based on predictive Hebbian learning. J. Neurosci. 16:1936-47
- (1996) J. Neurosci , vol.16 , pp. 1936-1947
- Montague, P.R.¹ Dayan, P.² Sejnowski, T.J.³

85
- 84961275812
- Episodic memories predict adaptive value-based decision-making
- Murty VP, FeldmanHall O, Hunter LE, Phelps EA, Davachi L. 2016. Episodic memories predict adaptive value-based decision-making. J. Exp. Psychol. Gen. 145:548-58
- (2016) J. Exp. Psychol. Gen , vol.145 , pp. 548-558
- Murty, V.P.¹ FeldmanHall, O.² Hunter, L.E.³ Phelps, E.A.⁴ Davachi, L.⁵

86
- 0001464753
- Recall and consumer consideration sets: Influencing choice without altering brand evaluations
- Nedungadi P. 1990. Recall and consumer consideration sets: influencing choice without altering brand evaluations. J. Consum. Res. 17:263-76
- (1990) J. Consum. Res , vol.17 , pp. 263-276
- Nedungadi, P.¹

87
- 67349283062
- Reinforcement learning in the brain
- Niv Y. 2009. Reinforcement learning in the brain. J. Math. Psychol. 53:139-54
- (2009) J. Math. Psychol. , vol.53 , pp. 139-154
- Niv, Y.¹

88
- 84930260511
- Reinforcement learning in multidimensional environments relies on attention mechanisms
- Niv Y, Daniel R, Geana A, Gershman SJ, Leong YC, et al. 2015. Reinforcement learning in multidimensional environments relies on attention mechanisms. J. Neurosci. 35:8145-57
- (2015) J. Neurosci , vol.35 , pp. 8145-8157
- Niv, Y.¹ Daniel, R.² Geana, A.³ Gershman, S.J.⁴ Leong, Y.C.⁵

89
- 0022686961
- Attention, similarity, and the identification-categorization relationship
- Nosofsky RM. 1986. Attention, similarity, and the identification-categorization relationship. J. Exp. Psychol. Gen. 115:39-57
- (1986) J. Exp. Psychol. Gen , vol.115 , pp. 39-57
- Nosofsky, R.M.¹

90
- 0004098484
- Oxford, UK: Clarendon Press
- O'Keefe J, Nadel L. 1978. The Hippocampus as a Cognitive Map. Oxford, UK: Clarendon Press
- (1978) The Hippocampus As A Cognitive Map
- O'Keefe, J.¹ Nadel, L.²

91
- 33644927837
- Making working memory work: A computational model of learning in the prefrontal cortex and basal ganglia
- O'Reilly RC, Frank MJ. 2006. Making working memory work: a computational model of learning in the prefrontal cortex and basal ganglia. Neural Comput. 18:283-328
- (2006) Neural Comput , vol.18 , pp. 283-328
- O'Reilly, R.C.¹ Frank, M.J.²

92
- 0036832956
- Kernel-based reinforcement learning
- Ormoneit D, Sen Ś. 2002. Kernel-based reinforcement learning. Mach. Learn. 49:161-78
- (2002) Mach. Learn , vol.49 , pp. 161-178
- Ormoneit, D.¹ Sen, Ś.²

93
- 84877341847
- The curse of planning: Dissecting multiple reinforcement-learning systems by taxing the central executive
- Otto AR, Gershman SJ, Markman AB, Daw ND. 2013a. The curse of planning: dissecting multiple reinforcement-learning systems by taxing the central executive. Psychol. Sci. 24:751-61
- (2013) Psychol. Sci. , vol.24 , pp. 751-761
- Otto, A.R.¹ Gershman, S.J.² Markman, A.B.³ Daw, N.D.⁴

94
- 84891354506
- Working-memory capacity protects model-based learning from stress
- Otto AR, Raio CM, Chiang A, Phelps EA, Daw ND. 2013b. Working-memory capacity protects model-based learning from stress. PNAS 110:20941-46
- (2013) PNAS , vol.110 , pp. 20941-20946
- Otto, A.R.¹ Raio, C.M.² Chiang, A.³ Phelps, E.A.⁴ Daw, N.D.⁵

95
- 0029972847
- Inactivation of hippocampus or caudate nucleus with lidocaine differentially affects expression of place and response learning
- Packard MG, McGaugh JL. 1996. Inactivation of hippocampus or caudate nucleus with lidocaine differentially affects expression of place and response learning. Neurobiol. Learn. Mem. 65:65-72
- (1996) Neurobiol. Learn. Mem , vol.65 , pp. 65-72
- Packard, M.G.¹ McGaugh, J.L.²

96
- 84964311329
- Reward and choice encoding in terminals of midbrain dopamine neurons depends on striatal target
- Parker NF, Cameron CM, Taliaferro JP, Lee J, Choi JY, et al. 2016. Reward and choice encoding in terminals of midbrain dopamine neurons depends on striatal target. Nat. Neurosci. 19:845-54
- (2016) Nat. Neurosci , vol.19 , pp. 845-854
- Parker, N.F.¹ Cameron, C.M.² Taliaferro, J.P.³ Lee, J.⁴ Choi, J.Y.⁵

97
- 80053119064
- The hippocampal-striatal axis in learning, prediction and goal-directed behavior
- Pennartz CMA, Ito R, Verschure PFMJ, Battaglia FP, Robbins TW. 2011. The hippocampal-striatal axis in learning, prediction and goal-directed behavior. Trends Neurosci. 34:548-59
- (2011) Trends Neurosci , vol.34 , pp. 548-559
- Pennartz, C.M.A.¹ Ito, R.² Verschure, P.F.M.J.³ Battaglia, F.P.⁴ Robbins, T.W.⁵

98
- 33748302924
- Dopamine-dependent prediction errors underpin reward-seeking behaviour in humans
- Pessiglione M, Seymour B, Flandin G, Dolan RJ, Frith CD. 2006. Dopamine-dependent prediction errors underpin reward-seeking behaviour in humans. Nature 442:1042-45
- (2006) Nature , vol.442 , pp. 1042-1045
- Pessiglione, M.¹ Seymour, B.² Flandin, G.³ Dolan, R.J.⁴ Frith, C.D.⁵

99
- 84877578934
- Hippocampal place-cell sequences depict future paths to remembered goals
- Pfeiffer BE, Foster DJ. 2013. Hippocampal place-cell sequences depict future paths to remembered goals. Nature 497:74-79
- (2013) Nature , vol.497 , pp. 74-79
- Pfeiffer, B.E.¹ Foster, D.J.²

100
- 0035969560
- Interactive memory systems in the human brain
- Poldrack RA, Clark J, Pare-Blagoev E, Shohamy D, Moyano JC, et al. 2001. Interactive memory systems in the human brain. Nature 414:546-50
- (2001) Nature , vol.414 , pp. 546-550
- Poldrack, R.A.¹ Clark, J.² Pare-Blagoev, E.³ Shohamy, D.⁴ Moyano, J.C.⁵

101
- 79960241771
- Decision making under uncertainty: A neural model based on partially observable Markov decision processes
- Rao RP. 2010. Decision making under uncertainty: a neural model based on partially observable Markov decision processes. Front. Comput. Neurosci. 4:146
- (2010) Front. Comput. Neurosci , vol.4 , pp. 146
- Rao, R.P.¹

102
- 10344225664
- Addiction as a computational process gone awry
- Redish AD. 2004. Addiction as a computational process gone awry. Science 306:1944-47
- (2004) Science , vol.306 , pp. 1944-1947
- Redish, A.D.¹

103
- 0003602871
- Mahwah, NJ: Lawrence Erlbaum Assoc.
- Riesbeck CK, Schank RC. 1989. Inside Case-based Reasoning. Mahwah, NJ: Lawrence Erlbaum Assoc.
- (1989) Inside Case-based Reasoning
- Riesbeck, C.K.¹ Schank, R.C.²

104
- 80055099662
- The hippocampus is functionally connected to the striatum and orbitofrontal cortex during context dependent decision making
- Ross RS, Sherrill KR, Stern CE. 2011. The hippocampus is functionally connected to the striatum and orbitofrontal cortex during context dependent decision making. Brain Res. 1423:53-66
- (2011) Brain Res , vol.1423 , pp. 53-66
- Ross, R.S.¹ Sherrill, K.R.² Stern, C.E.³

105
- 84964211344
- Midbrain dopamine neurons compute inferred and cached value prediction errors in a common framework
- Sadacca BF, Jones JL, Schoenbaum G. 2016. Midbrain dopamine neurons compute inferred and cached value prediction errors in a common framework. eLife 5:e13665
- (2016) eLife , vol.5 , pp. e13665
- Sadacca, B.F.¹ Jones, J.L.² Schoenbaum, G.³

106
- 84869780901
- The future of memory: Remembering, imagining, and the brain
- Schacter DL, Addis DR, Hassabis D, Martin VC, Spreng RN, Szpunar KK. 2012. The future of memory: remembering, imagining, and the brain. Neuron 76:677-94
- (2012) Neuron , vol.76 , pp. 677-694
- Schacter, D.L.¹ Addis, D.R.² Hassabis, D.³ Martin, V.C.⁴ Spreng, R.N.⁵ Szpunar, K.K.⁶

107
- 0003408420
- Cambridge, MA: MIT Press
- Scholkopf B, Smola AJ. 2002. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. Cambridge, MA: MIT Press
- (2002) Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond
- Scholkopf, B.¹ Smola, A.J.²

108
- 70349967987
- Selective impairment of prediction error signaling in human dorsolateral but not ventral striatum in Parkinson's disease patients: Evidence from a model-based fMRI study
- Schonberg T, O'Doherty JP, Joel D, Inzelberg R, Segev Y, Daw ND. 2010. Selective impairment of prediction error signaling in human dorsolateral but not ventral striatum in Parkinson's disease patients: evidence from a model-based fMRI study. NeuroImage 49:772-81
- (2010) NeuroImage , vol.49 , pp. 772-781
- Schonberg, T.¹ O'Doherty, J.P.² Joel, D.³ Inzelberg, R.⁴ Segev, Y.⁵ Daw, N.D.⁶

109
- 0030896968
- A neural substrate of prediction and reward
- Schultz W, Dayan P, Montague PR. 1997. A neural substrate of prediction and reward. Science 275:1593-99
- (1997) Science , vol.275 , pp. 1593-1599
- Schultz, W.¹ Dayan, P.² Montague, P.R.³

110
- 84860163389
- Serotonin selectively modulates reward value in human decision-making
- Seymour B, Daw ND, Roiser JP, Dayan P, Dolan R. 2012. Serotonin selectively modulates reward value in human decision-making. J. Neurosci. 32:5833-42
- (2012) J. Neurosci , vol.32 , pp. 5833-5842
- Seymour, B.¹ Daw, N.D.² Roiser, J.P.³ Dayan, P.⁴ Dolan, R.⁵

111
- 84941026115
- Integrating memories to guide decisions
- Shohamy D, Daw ND. 2015. Integrating memories to guide decisions. Curr. Opin. Behav. Sci. 5:85-90
- (2015) Curr. Opin. Behav. Sci. , vol.5 , pp. 85-90
- Shohamy, D.¹ Daw, N.D.²

112
- 9644283111
- The role of dopamine in cognitive sequence learning: Evidence from Parkinson's disease
- Shohamy D, Myers CE, Grossman S, Sage J, Gluck MA. 2005. The role of dopamine in cognitive sequence learning: evidence from Parkinson's disease. Behav. Brain Res. 156:191-99
- (2005) Behav. Brain Res , vol.156 , pp. 191-199
- Shohamy, D.¹ Myers, C.E.² Grossman, S.³ Sage, J.⁴ Gluck, M.A.⁵

113
- 53849090288
- Integrating memories in the human brain: Hippocampal-midbrain encoding of overlapping events
- Shohamy D, Wagner AD. 2008. Integrating memories in the human brain: hippocampal-midbrain encoding of overlapping events. Neuron 60:378-89
- (2008) Neuron , vol.60 , pp. 378-389
- Shohamy, D.¹ Wagner, A.D.²

114
- 0000275661
- Choice in context: Tradeoff contrast and extremeness aversion
- Simonson I, Tversky A. 1992. Choice in context: tradeoff contrast and extremeness aversion. J. Mark. Res. 29:281-95
- (1992) J. Mark. Res , vol.29 , pp. 281-295
- Simonson, I.¹ Tversky, A.²

115
- 33645947016
- Mistake #37: The effect of previously encountered prices on current housing demand
- Simonsohn U, Loewenstein G. 2006. Mistake #37: the effect of previously encountered prices on current housing demand. Econ. J. 116:175-99
- (2006) Econ. J , vol.116 , pp. 175-199
- Simonsohn, U.¹ Loewenstein, G.²

116
- 0030012117
- Replay of neuronal firing sequences in rat hippocampus during sleep following spatial experience
- Skaggs WE, McNaughton BL. 1996. Replay of neuronal firing sequences in rat hippocampus during sleep following spatial experience. Science 271:1870-73
- (1996) Science , vol.271 , pp. 1870-1873
- Skaggs, W.E.¹ McNaughton, B.L.²

117
- 84941703217
- Evidence integration in model-based tree search
- Solway A, Botvinick MM. 2015. Evidence integration in model-based tree search. PNAS 112:11708-13
- (2015) PNAS , vol.112 , pp. 11708-11713
- Solway, A.¹ Botvinick, M.M.²

118
- 0026847039
- Memory and the hippocampus: A synthesis from findings with rats, monkeys, and humans
- Squire LR. 1992. Memory and the hippocampus: a synthesis from findings with rats, monkeys, and humans. Psychol. Rev. 99:195-231
- (1992) Psychol. Rev. , vol.99 , pp. 195-231
- Squire, L.R.¹

119
- 84937876815
- Design principles of the hippocampal cognitive map
- Stachenfeld KL, Botvinick M, Gershman SJ. 2014. Design principles of the hippocampal cognitive map. Adv. Neural Inf. Process. Sys. 27:2528-36
- (2014) Adv. Neural Inf. Process. Sys , vol.27 , pp. 2528-2536
- Stachenfeld, K.L.¹ Botvinick, M.² Gershman, S.J.³

120
- 84891681994
- A causal link between prediction errors, dopamine neurons and learning
- Steinberg EE, Keiflin R, Boivin JR, Witten IB, Deisseroth K, Janak PH. 2013. A causal link between prediction errors, dopamine neurons and learning. Nat. Neurosci. 16:966-73
- (2013) Nat. Neurosci , vol.16 , pp. 966-973
- Steinberg, E.E.¹ Keiflin, R.² Boivin, J.R.³ Witten, I.B.⁴ Deisseroth, K.⁵ Janak, P.H.⁶

121
- 27944495741
- Absolute identification by relative judgment
- Stewart N, Brown GD, Chater N. 2005. Absolute identification by relative judgment. Psychol. Rev. 112:881-911
- (2005) Psychol. Rev. , vol.112 , pp. 881-911
- Stewart, N.¹ Brown, G.D.² Chater, N.³

122
- 33646581388
- Decision by sampling
- Stewart N, Chater N, Brown GD. 2006. Decision by sampling. Cogn. Psychol. 53:1-26
- (2006) Cogn. Psychol. , vol.53 , pp. 1-26
- Stewart, N.¹ Chater, N.² Brown, G.D.³

123
- 33847202724
- Learning to predict by the methods of temporal differences
- Sutton RS. 1988. Learning to predict by the methods of temporal differences. Mach. Learn. 3:9-44
- (1988) Mach. Learn , vol.3 , pp. 9-44
- Sutton, R.S.¹

124
- 0012929784
- Dyna, an integrated architecture for learning, planning, and reacting
- Sutton RS. 1991. Dyna, an integrated architecture for learning, planning, and reacting. ACM SIGART Bull. 2:160-63
- (1991) ACM SIGART Bull , vol.2 , pp. 160-163
- Sutton, R.S.¹

125
- 0004102479
- Cambridge, MA: MIT Press
- Sutton RS, Barto AG. 1998. Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.S.¹ Barto, A.G.²

126
- 0034704229
- A global geometric framework for nonlinear dimensionality reduction
- Tenenbaum JB, De Silva V, Langford JC. 2000. A global geometric framework for nonlinear dimensionality reduction. Science 290:2319-23
- (2000) Science , vol.290 , pp. 2319-2323
- Tenenbaum, J.B.¹ De Silva, V.² Langford, J.C.³

127
- 77549088095
- Learning to use working memory in partially observable environments through dopaminergic reinforcement
- Todd MT, Niv Y, Cohen JD. 2008. Learning to use working memory in partially observable environments through dopaminergic reinforcement. Adv. Neural Inf. Process. Sys. 21:1689-96
- (2008) Adv. Neural Inf. Process. Sys , vol.21 , pp. 1689-1696
- Todd, M.T.¹ Niv, Y.² Cohen, J.D.³

128
- 58149442669
- Cognitive maps in rats and men
- Tolman EC. 1948. Cognitive maps in rats and men. Psychol. Rev. 55:189-208
- (1948) Psychol. Rev. , vol.55 , pp. 189-208
- Tolman, E.C.¹

129
- 0000838862
- Episodic and semantic memory
- ed. E Tulving, W Donaldson, New York: Academic
- Tulving E. 1972. Episodic and semantic memory. In Organization of Memory, ed. E Tulving, W Donaldson, pp. 381-402. New York: Academic
- (1972) Organization of Memory , pp. 381-402
- Tulving, E.¹

130
- 84941785218
- Ventromedial frontal cortex is critical for guiding attention to reward-predictive visual features in humans
- Vaidya AR, Fellows LK. 2015. Ventromedial frontal cortex is critical for guiding attention to reward-predictive visual features in humans. J. Neurosci. 35:12813-23
- (2015) J. Neurosci , vol.35 , pp. 12813-12823
- Vaidya, A.R.¹ Fellows, L.K.²

131
- 79951967897
- Theta phase precession in rat ventral striatum links place and reward information
- van der Meer MA, Redish AD. 2011. Theta phase precession in rat ventral striatum links place and reward information. J. Neurosci. 31:2843-54
- (2011) J. Neurosci , vol.31 , pp. 2843-2854
- van der Meer, M.A.¹ Redish, A.D.²

132
- 33749242004
- Berlin: Springer Science & Business Media
- Wasserman L. 2006. All of Nonparametric Statistics. Berlin: Springer Science & Business Media
- (2006) All of Nonparametric Statistics
- Wasserman, L.¹

133
- 84908518146
- Episodic memory encoding interferes with reward learning and decreases striatal prediction errors
- Wimmer GE, Braun EK, Daw ND, Shohamy D. 2014. Episodic memory encoding interferes with reward learning and decreases striatal prediction errors. J. Neurosci. 34:14901-12
- (2014) J. Neurosci , vol.34 , pp. 14901-14912
- Wimmer, G.E.¹ Braun, E.K.² Daw, N.D.³ Shohamy, D.⁴

134
- 84867287309
- Preference by association: How memory mechanisms in the hippocampus bias decisions
- Wimmer GE, Shohamy D. 2012. Preference by association: how memory mechanisms in the hippocampus bias decisions. Science 338:270-73
- (2012) Science , vol.338 , pp. 270-273
- Wimmer, G.E.¹ Shohamy, D.²

135
- 39849087310
- Modeling the role of working memory and episodic memory in behavioral tasks
- Zilli EA, Hasselmo ME. 2008. Modeling the role of working memory and episodic memory in behavioral tasks. Hippocampus 18:193-209
- (2008) Hippocampus , vol.18 , pp. 193-209
- Zilli, E.A.¹ Hasselmo, M.E.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.