SCOPUS information retrieval platform

Philosophical Transactions of the Royal Society B: Biological Sciences

Volume 369, Issue 1655, 2014, Pages

The algorithmic anatomy of model-based evaluation

b UNIVERSITY COLLEGE LONDON (United Kingdom)

Author keywords

Model based reasoning; Model free reasoning; Monte Carlo tree search; Orbitofrontal cortex; Reinforcement learning; Striatum

Indexed keywords

ALGORITHM; ANATOMY; ANIMAL; COMPUTER SIMULATION; LEARNING; MONTE CARLO ANALYSIS; NUMERICAL MODEL; PREDICTION; TWENTIETH CENTURY;

ANIMAL; BIOLOGICAL MODEL; BRAIN; HUMAN; LEARNING; MARKOV CHAIN; PHYSIOLOGY;

ANIMALS; BRAIN; HUMANS; LEARNING; MARKOV CHAINS; MODELS, NEUROLOGICAL;

EID: 84907545889 PISSN: 09628436 EISSN: 14712970 Source Type: Journal
DOI: 10.1098/rstb.2013.0478 Document Type: Article

Times cited : (138)

References (129)

1
- 0033213819
- What are the computations of the cerebellum, the basal ganglia and the cerebral cortex?
- Doya K. 1999 What are the computations of the cerebellum, the basal ganglia and the cerebral cortex? Neural Netw. 12, 961–974. (doi:10.1016/S0893-6080(99)00046-5)
- (1999) Neural Netw , vol.12 , pp. 961-974
- Doya, K.¹

2
- 0004102479
- Cambridge, MA: MIT Press
- Sutton R., Barto AG. 1998 Reinforcement learning: an introduction (adaptive computation and machine learning). Cambridge, MA: MIT Press.
- (1998) Reinforcement learning: An introduction (adaptive computation and machine learning)
- Sutton, R.¹ Barto, A.G.²

3
- 0002166755
- Actions and habits: Variations in associative representations during instrumental learning
- eds N Spear, R Miller, Hillsdale, NJ: Lawrence Erlbaum Associates
- Adams C, Dickinson A. 1981 Actions and habits: variations in associative representations during instrumental learning. In Information processing in animals: memory mechanisms (eds N Spear, R Miller), pp. 143–165. Hillsdale, NJ: Lawrence Erlbaum Associates.
- (1981) Information processing in animals: Memory mechanisms , pp. 143-165
- Adams, C.¹ Dickinson, A.²

4
- 28044450875
- Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control
- Daw ND, Niv Y, Dayan P. 2005 Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nat. Neurosci. 8, 1704–1711. (doi:10.1038/nn1560)
- (2005) Nat. Neurosci , vol.8 , pp. 1704-1711
- Daw, N.D.¹ Niv, Y.² Dayan, P.³

5
- 0043250430
- The role of learning in motivation
- (ed. C Gallistel), New York, NY: Wiley
- Dickinson A, Balleine B. 2002 The role of learning in motivation. In Stevens’ handbook of experimental psychology, vol. 3 (ed. C Gallistel), pp. 497–533. New York, NY: Wiley.
- (2002) Stevens’ handbook of experimental psychology , vol.3 , pp. 497-533
- Dickinson, A.¹ Balleine, B.²

6
- 84855454549
- New York, NY: Farrar, Straus and Giroux
- Kahneman D. 2010 Thinking, fast and slow. New York, NY: Farrar, Straus and Giroux.
- (2010) Thinking, fast and slow
- Kahneman, D.¹

7
- 0242305777
- Individual differences in reasoning: Implications for the rationality debate?
- (eds T Gilovich, D Griffin, D Kahneman), Cambridge, UK: Cambridge University Press
- Stanovich K, West R. 2002 Individual differences in reasoning: implications for the rationality debate? In Heuristics and biases: the psychology of intuitive judgment (eds T Gilovich, D Griffin, D Kahneman), pp. 421–440. Cambridge, UK: Cambridge University Press.
- (2002) Heuristics and biases: The psychology of intuitive judgment , pp. 421-440
- Stanovich, K.¹ West, R.²

8
- 0001201756
- Some studies in machine learning using the game of checkers
- Samuel A. 1959 Some studies in machine learning using the game of checkers. IBM J. Res. Dev. 3, 210–229. (doi:10.1147/rd.33.0210)
- (1959) IBM J. Res. Dev , vol.3 , pp. 210-229
- Samuel, A.¹

9
- 0020970738
- Neuronlike adaptive elements that can solve difficult learning control problems
- Barto A., Sutton R., Anderson CW. 1983 Neuronlike adaptive elements that can solve difficult learning control problems. IEEE Trans. Syst. Man Cybernet. 834–846. (doi:10.1109/TSMC.1983.6313077)
- (1983) IEEE Trans. Syst. Man Cybernet , pp. 834-846
- Barto, A.¹ Sutton, R.² Anderson, C.W.³

10
- 84859341150
- Habits, action sequences and reinforcement learning
- Dezfouli A, Balleine BW. 2012 Habits, action sequences and reinforcement learning. Eur. J. Neurosci. 35, 1036–1051. (doi:10.1111/j.1460-9568.2012.08050.x)
- (2012) Eur. J. Neurosci , vol.35 , pp. 1036-1051
- Dezfouli, A.¹ Balleine, B.W.²

11
- 84907480610
- Habits as action sequences: Hierarchical action control and changes in outcome value
- Dezfouli A, Lingawi NW, Balleine BW. 2014 Habits as action sequences: hierarchical action control and changes in outcome value. Phil. Trans. R. Soc. B 369, 20130482. (doi:10.1098/rstb.2013.0482)
- (2014) Phil. Trans. R. Soc. B , vol.369 , pp. 20130482
- Dezfouli, A.¹ Lingawi, N.W.² Balleine, B.W.³

12
- 84885802926
- Goals and habits in the brain
- Dolan RJ, Dayan P. 2013 Goals and habits in the brain. Neuron 80, 312–325. (doi:10.1016/j.neuron.2013.09.007)
- (2013) Neuron , vol.80 , pp. 312-325
- Dolan, R.J.¹ Dayan, P.²

13
- 1842853951
- Relations between Pavlovian–instrumental transfer and reinforcer devaluation
- Holland PC. 2004 Relations between Pavlovian–instrumental transfer and reinforcer devaluation. J. Exp. Psychol. Anim. Behav. Process. 30, 104–117. (doi:10.1037/0097-7403.30.2.104)
- (2004) J. Exp. Psychol. Anim. Behav. Process , vol.30 , pp. 104-117
- Holland, P.C.¹

14
- 79955709936
- Neural correlates of forward planning in a spatial decision task in humans
- Simon DA, Daw ND. 2011 Neural correlates of forward planning in a spatial decision task in humans. J. Neurosci. 31, 5526–5539. (doi:10.1523/JNEUROSCI.4647-10.2011)
- (2011) J. Neurosci , vol.31 , pp. 5526-5539
- Simon, D.A.¹ Daw, N.D.²

15
- 0003998491
- New York, NY: Macmillan
- Thorndike E. 1911 Animal intelligence. New York, NY: Macmillan.
- (1911) Animal intelligence
- Thorndike, E.¹

16
- 0001461525
- There is more than one kind of learning
- Tolman E. 1949 There is more than one kind of learning. Psychol. Rev. 56, 144–155. (doi:10.1037/h0055304)
- (1949) Psychol. Rev , vol.56 , pp. 144-155
- Tolman, E.¹

17
- 84877341847
- The curse of planning: Dissecting multiple reinforcement-learning systems by taxing the central executive
- Otto AR, Gershman SJ, Markman AB, Daw ND. 2013 The curse of planning: dissecting multiple reinforcement-learning systems by taxing the central executive. Psychol. Sci. 24, 751–761. (doi:10.1177/0956797612463080)
- (2013) Psychol. Sci , vol.24 , pp. 751-761
- Otto, A.R.¹ Gershman, S.J.² Markman, A.B.³ Daw, N.D.⁴

18
- 84891354506
- Working-memory capacity protects model-based learning from stress
- Otto AR, Raio CM, Chiang A, Phelps EA, Daw ND. 2013 Working-memory capacity protects model-based learning from stress. Proc. Natl Acad. Sci. USA 110, 20 941–20 946. (doi:10.1073/pnas.1312011110)
- (2013) Proc. Natl Acad. Sci , vol.110 , pp. 20941-20946
- Otto, A.R.¹ Raio, C.M.² Chiang, A.³ Phelps, E.A.⁴ Daw, N.D.⁵

19
- 0000541213
- Adaptive critics and the basal ganglia
- eds J Houk, J Davis, D Beiser, Cambridge, MA: MIT Press
- Barto A. 1995 Adaptive critics and the basal ganglia. In Models of information processing in the basal ganglia (eds J Houk, J Davis, D Beiser), pp. 215–232. Cambridge, MA: MIT Press.
- (1995) Models of information processing in the basal ganglia , pp. 215-232
- Barto, A.¹

20
- 0029981543
- A framework for mesencephalic dopamine systems based on predictive Hebbian learning
- Montague PR, Dayan P, Sejnowski TJ. 1996 A framework for mesencephalic dopamine systems based on predictive Hebbian learning. J. Neurosci. 16, 1936–1947.
- (1996) J. Neurosci , vol.16 , pp. 1936-1947
- Montague, P.R.¹ Dayan, P.² Sejnowski, T.J.³

21
- 0030896968
- A neural substrate of prediction and reward
- Schultz W, Dayan P, Montague PR. 1997 A neural substrate of prediction and reward. Science 275, 1593–1599. (doi:10.1126/science.275.5306.1593)
- (1997) Science , vol.275 , pp. 1593-1599
- Schultz, W.¹ Dayan, P.² Montague, P.R.³

22
- 28444472936
- Neural bases of food-seeking: Affect, arousal and reward in corticostriatolimbic circuits
- Balleine BW. 2005 Neural bases of food-seeking: affect, arousal and reward in corticostriatolimbic circuits. Physiol. Behav. 86, 717–730. (doi:10.1016/j.physbeh.2005.08.061)
- (2005) Physiol. Behav , vol.86 , pp. 717-730
- Balleine, B.W.¹

23
- 0037382264
- Coordination of actions and habits in the medial prefrontal cortex of rats
- Killcross S, Coutureau E. 2003 Coordination of actions and habits in the medial prefrontal cortex of rats. Cereb. Cortex 13, 400–408. (doi:10.1093/cercor/13.4.400)
- (2003) Cereb. Cortex , vol.13 , pp. 400-408
- Killcross, S.¹ Coutureau, E.²

24
- 84893752030
- Learning theory: A driving force in understanding orbitofrontal function
- McDannald MA, Jones JL, Takahashi YK, Schoenbaum G. 2013 Learning theory: a driving force in understanding orbitofrontal function. Neurobiol. Learn. Mem. 108C, 22–27.
- (2013) Neurobiol. Learn. Mem , vol.108C , pp. 22-27
- McDannald, M.A.¹ Jones, J.L.² Takahashi, Y.K.³ Schoenbaum, G.⁴

25
- 79951823576
- Ventral striatum and orbitofrontal cortex are both required for model-based, but not model-free, reinforcement learning
- McDannald MA, Lucantonio F, Burke KA, Niv Y, Schoenbaum G. 2011 Ventral striatum and orbitofrontal cortex are both required for model-based, but not model-free, reinforcement learning. J. Neurosci. 31, 2700–2705. (doi:10.1523/JNEUROSCI.5499-10.2011)
- (2011) J. Neurosci , vol.31 , pp. 2700-2705
- McDannald, M.A.¹ Lucantonio, F.² Burke, K.A.³ Niv, Y.⁴ Schoenbaum, G.⁵

26
- 84859323549
- Model-based learning and the contribution of the orbitofrontal cortex to the model-free world
- McDannald MA, Takahashi YK, Lopatina N, Pietras BW, Jones JL, Schoenbaum G. 2012 Model-based learning and the contribution of the orbitofrontal cortex to the model-free world. Eur. J. Neurosci. 35, 991–996. (doi:10.1111/j.1460-9568.2011.07982.x)
- (2012) Eur. J. Neurosci , vol.35 , pp. 991-996
- McDannald, M.A.¹ Takahashi, Y.K.² Lopatina, N.³ Pietras, B.W.⁴ Jones, J.L.⁵ Schoenbaum, G.⁶

27
- 53949118376
- Reward-guided learning beyond dopamine in the nucleus accumbens: The integrative functions of cortico-basal ganglia networks
- Yin HH, Ostlund SB, Balleine BW. 2008 Reward-guided learning beyond dopamine in the nucleus accumbens: the integrative functions of cortico-basal ganglia networks. Eur. J. Neurosci. 28, 1437–1448. (doi:10.1111/j.1460-9568.2008.06422.x)
- (2008) Eur. J. Neurosci , vol.28 , pp. 1437-1448
- Yin, H.H.¹ Ostlund, S.B.² Balleine, B.W.³

28
- 77953260848
- States versus rewards: Dissociable neural prediction error signals underlying model-based and model-free reinforcement learning
- Glascher J, Daw N, Dayan P, O’Doherty JP. 2010 States versus rewards: dissociable neural prediction error signals underlying model-based and model-free reinforcement learning. Neuron 66, 585–595. (doi:10.1016/j.neuron.2010.04.016)
- (2010) Neuron , vol.66 , pp. 585-595
- Glascher, J.¹ Daw, N.² Dayan, P.³ O’Doherty, J.P.⁴

29
- 66449119919
- A specific role for posterior dorsolateral striatum in human habit learning
- Tricomi E, Balleine B., O’Doherty JP. 2009 A specific role for posterior dorsolateral striatum in human habit learning. Eur. J. Neurosci. 29, 2225–2232. (doi:10.1111/j.1460-9568.2009.06796.x)
- (2009) Eur. J. Neurosci , vol.29 , pp. 2225-2232
- Tricomi, E.¹ Balleine, B.² O’Doherty, J.P.³

30
- 34247147767
- Determining the neural substrates of goal-directed learning in the human brain
- Valentin VV, Dickinson A, O’Doherty JP. 2007 Determining the neural substrates of goal-directed learning in the human brain. J. Neurosci. 27, 4019–4026. (doi:10.1523/JNEUROSCI.0564-07.2007)
- (2007) J. Neurosci , vol.27 , pp. 4019-4026
- Valentin, V.V.¹ Dickinson, A.² O’Doherty, J.P.³

31
- 84860307045
- Mapping value based planning and extensively trained choice in the human brain
- Wunderlich K, Dayan P, Dolan RJ. 2012 Mapping value based planning and extensively trained choice in the human brain. Nat. Neurosci. 15, 786–791. (doi:10.1038/nn.3068)
- (2012) Nat. Neurosci , vol.15 , pp. 786-791
- Wunderlich, K.¹ Dayan, P.² Dolan, R.J.³

32
- 79958143780
- Speed/accuracy trade-off between the habitual and the goal-directed processes
- Keramati M, Dezfouli A, Piray P. 2011 Speed/accuracy trade-off between the habitual and the goal-directed processes. PLoS Comput. Biol. 7, e1002055. (doi:10.1371/journal.pcbi.1002055)
- (2011) PLoS Comput. Biol , vol.7
- Keramati, M.¹ Dezfouli, A.² Piray, P.³

33
- 84878783112
- The mixed instrumental controller: Using value of information to combine habitual choice and mental simulation
- Pezzulo G, Rigoli F, Chersi F. 2013 The mixed instrumental controller: using value of information to combine habitual choice and mental simulation. Front. Psychol. 4, 92. (doi:10.3389/fpsyg.2013.00092)
- (2013) Front. Psychol , vol.4 , pp. 92
- Pezzulo, G.¹ Rigoli, F.² Chersi, F.³

34
- 84859737036
- Goal-directed decision making as probabilistic inference: A computational framework and potential neural correlates
- Solway A, Botvinick MM. 2012 Goal-directed decision making as probabilistic inference: a computational framework and potential neural correlates. Psychol. Rev. 119, 120–154. (doi:10.1037/a0026435)
- (2012) Psychol. Rev , vol.119 , pp. 120-154
- Solway, A.¹ Botvinick, M.M.²

35
- 16244367260
- Cambridge, MA: MIT Press
- Baum EB. 2004 What is thought? Cambridge, MA: MIT Press.
- (2004) What is thought?
- Baum, E.B.¹

36
- 84878179610
- How to set the switches on this thing
- Dayan P. 2012 How to set the switches on this thing. Curr. Opin. Neurobiol. 22, 1068–1074. (doi:10.1016/j.conb.2012.05.011)
- (2012) Curr. Opin. Neurobiol , vol.22 , pp. 1068-1074
- Dayan, P.¹

37
- 0003711660
- Cambridge, MA: MIT Press
- Russell S., Wefald EH. 1991 Do the right thing: studies in limited rationality. Cambridge, MA: MIT Press.
- (1991) Do the right thing: Studies in limited rationality
- Russell, S.¹ Wefald, E.H.²

38
- 0004077471
- Cambridge, MA: MIT Press
- Simon HA. 1982 Models of bounded rationality. Cambridge, MA: MIT Press.
- (1982) Models of bounded rationality
- Simon, H.A.¹

39
- 79952746011
- Model-based influences on humans’ choices and striatal prediction errors
- Daw ND, Gershman SJ, Seymour B, Dayan P, Dolan RJ. 2011 Model-based influences on humans’ choices and striatal prediction errors. Neuron 69, 1204–1215. (doi:10.1016/j.neuron.2011.02.027)
- (2011) Neuron , vol.69 , pp. 1204-1215
- Daw, N.D.¹ Gershman, S.J.² Seymour, B.³ Dayan, P.⁴ Dolan, R.J.⁵

40
- 70449715719
- Instructional control of reinforcement learning: A behavioral and neurocomputational investigation
- Doll BB, Jacobs WJ, Sanfey AG, Frank MJ. 2009 Instructional control of reinforcement learning: a behavioral and neurocomputational investigation. Brain Res. 1299, 74–94. (doi:10.1016/j.brainres.2009.07.007)
- (2009) Brain Res , vol.1299 , pp. 74-94
- Doll, B.B.¹ Jacobs, W.J.² Sanfey, A.G.³ Frank, M.J.⁴

41
- 84872761547
- The ubiquity of model-based reinforcement learning
- Doll BB, Simon DA, Daw ND. 2012 The ubiquity of model-based reinforcement learning. Curr. Opin. Neurobiol. 22, 1075–1081. (doi:10.1016/j.conb.2012.08.003)
- (2012) Curr. Opin. Neurobiol , vol.22 , pp. 1075-1081
- Doll, B.B.¹ Simon, D.A.² Daw, N.D.³

42
- 84878779561
- Retrospective revaluation in sequential decision making: A tale of two systems
- Gershman SJ, Markman AB, Otto AR. 2012 Retrospective revaluation in sequential decision making: a tale of two systems. J. Exp. Psychol. Gen. 24, 751–761.
- (2012) J. Exp. Psychol. Gen , vol.24 , pp. 751-761
- Gershman, S.J.¹ Markman, A.B.² Otto, A.R.³

43
- 82255179147
- Expectancy-related changes in firing of dopamine neurons depend on orbitofrontal cortex
- Takahashi YK, Roesch MR, Wilson RC, Toreson K, O’Donnell P, Niv Y, Schoenbaum G. 2011 Expectancy-related changes in firing of dopamine neurons depend on orbitofrontal cortex. Nat. Neurosci. 14, 1590–1597. (doi:10.1038/nn.2957)
- (2011) Nat. Neurosci , vol.14 , pp. 1590-1597
- Takahashi, Y.K.¹ Roesch, M.R.² Wilson, R.C.³ Toreson, K.⁴ O’Donnell, P.⁵ Niv, Y.⁶ Schoenbaum, G.⁷

44
- 0032073263
- Planning and acting in partially observable stochastic domains
- Kaelbling LP, Littman ML, Cassandra AR. 1998 Planning and acting in partially observable stochastic domains. Artif. Intell. 101, 99–134. (doi:10.1016/S0004-3702(98)00023-X)
- (1998) Artif. Intell , vol.101 , pp. 99-134
- Kaelbling, L.P.¹ Littman, M.L.² Cassandra, A.R.³

45
- 0036832951
- A sparse sampling algorithm for near-optimal planning in large Markov decision processes
- Kearns M, Mansour Y, Ng AY. 2002 A sparse sampling algorithm for near-optimal planning in large Markov decision processes. Mach. Learn. 49, 193–208. (doi:10.1023/A:1017932429737)
- (2002) Mach. Learn , vol.49 , pp. 193-208
- Kearns, M.¹ Mansour, Y.² Ng, A.Y.³

46
- 0003487482
- Belmont, MA: Athena Scientific
- Bertsekas DP, Tsitsiklis JN. 1996 Neuro-dynamic programming. Belmont, MA: Athena Scientific.
- (1996) Neuro-dynamic programming
- Bertsekas, D.P.¹ Tsitsiklis, J.N.²

47
- 33847202724
- Learning to predict by the methods of temporal differences
- Sutton R. 1988 Learning to predict by the methods of temporal differences. Mach. Learn. 3, 9–44.
- (1988) Mach. Learn , vol.3 , pp. 9-44
- Sutton, R.¹

48
- 33750293964
- Bandit based Monte Carlo planning
- (eds J Fürnkranz, T Scheffer, M Spiliopoulou), Berlin, Germany: Springer
- Kocsis L, Szepesvari C. 2006 Bandit based Monte Carlo planning. In Machine learning: ECML 2006 (eds J Fürnkranz, T Scheffer, M Spiliopoulou), pp. 282–293. Berlin, Germany: Springer.
- (2006) Machine learning: ECML 2006 , pp. 282-293
- Kocsis, L.¹ Szepesvari, C.²

49
- 56449110907
- Sample-based learning and search with permanent and transient memories
- New York, NY: Association for Computing Machinery
- Silver D, Sutton R., Muller M. 2008 Sample-based learning and search with permanent and transient memories. In Proc. 25th Int. Conf. on Machine Learning, pp. 968–975. New York, NY: Association for Computing Machinery.
- (2008) Proc. 25th Int. Conf. on Machine Learning , pp. 968-975
- Silver, D.¹ Sutton, R.² Muller, M.³

50
- 85132026293
- Integrated architectures for learning, planning, reacting based on approximating dynamic programming
- San Francisco, CA: Morgan Kaufmann
- Sutton R. 1990 Integrated architectures for learning, planning, reacting based on approximating dynamic programming. Proc. Seventh Int. Conf. on Machine Learning, pp. 216–224. San Francisco, CA: Morgan Kaufmann.
- (1990) Proc. Seventh Int. Conf. on Machine Learning , pp. 216-224
- Sutton, R.¹

51
- 0034275416
- Learning to play chess using temporal differences
- Baxter J, Tridgell A, Weaver L. 2000 Learning to play chess using temporal differences. Mach. Learn. 40, 243–263. (doi:10.1023/A:1007634325138)
- (2000) Mach. Learn , vol.40 , pp. 243-263
- Baxter, J.¹ Tridgell, A.² Weaver, L.³

52
- 84858720579
- Bootstrapping from game tree search
- (eds Y Bengio, D Schuurmans, J Lafferty, C Williams, A Culotta), Red Hook, NY: Curran Associates
- Veness J, Silver D, Uther WT, Blair A. 2009 Bootstrapping from game tree search. In NIPS, vol. 19 (eds Y Bengio, D Schuurmans, J Lafferty, C Williams, A Culotta), pp. 1937–1945. Red Hook, NY: Curran Associates.
- (2009) NIPS , vol.19 , pp. 1937-1945
- Veness, J.¹ Silver, D.² Uther, W.T.³ Blair, A.⁴

53
- 0001158047
- Improving generalization for temporal difference learning: The successor representation
- Dayan P. 1993 Improving generalization for temporal difference learning: the successor representation. Neural Comput. 5, 613–624. (doi:10.1162/neco.1993.5.4.613)
- (1993) Neural Comput , vol.5 , pp. 613-624
- Dayan, P.¹

54
- 84922015064
- TD models: Modeling the world at a mixture of time scales
- (eds A Prieditis, SJ Russell), San Mateo, CA: Morgan Kaufmann
- Sutton RS. 1995 TD models: modeling the world at a mixture of time scales. In ICML, vol. 12 (eds A Prieditis, SJ Russell), pp. 531–539. San Mateo, CA: Morgan Kaufmann.
- (1995) ICML , vol.12 , pp. 531-539
- Sutton, R.S.¹

55
- 84867135062
- Compositional planning using optimal option models
- (eds J Langford, J Pineau), New York, NY: Omni Press
- Silver D, Ciosek K. 2012 Compositional planning using optimal option models. In Proc. 29th Int. Conf. on Machine Learning, ICML ’12 (eds J Langford, J Pineau), pp. 1063–1070. New York, NY: Omni Press.
- (2012) Proc. 29th Int. Conf. on Machine Learning, ICML ’12 , pp. 1063-1070
- Silver, D.¹ Ciosek, K.²

56
- 84907487070
- Model-based hierarchical reinforcement learning and human action control
- Botvinick M, Weinstein A. 2014 Model-based hierarchical reinforcement learning and human action control. Phil. Trans. R. Soc. B 369, 20130480. (doi:10.1098/rstb.2013.0480)
- (2014) Phil. Trans. R. Soc. B , vol.369 , pp. 20130480
- Botvinick, M.¹ Weinstein, A.²

57
- 0033170372
- Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning
- Sutton R., Precup D, Singh S. 1999 Between MDPs and semi-MDPs: a framework for temporal abstraction in reinforcement learning. Artif. Intell. 112, 181–211. (doi:10.1016/S0004-3702(99)00052-1)
- (1999) Artif. Intell , vol.112 , pp. 181-211
- Sutton, R.¹ Precup, D.² Singh, S.³

58
- 84898967780
- Policy search via density estimation
- eds SA Solla, TK Leen, K Müller, Cambridge, MA: MIT Press
- Ng AY, Parr R, Koller D. 1999 Policy search via density estimation. In NIPS (eds SA Solla, TK Leen, K Müller), pp. 1022–1028. Cambridge, MA: MIT Press.
- (1999) NIPS , pp. 1022-1028
- Ng, A.Y.¹ Parr, R.² Koller, D.³

59
- 34548784027
- Dual representations for dynamic programming and reinforcement learning
- Honolulu, HI, 1–5 April 2007. ADPRL 2007, IEEE
- Wang T, Bowling M, Schuurmans D. 2007 Dual representations for dynamic programming and reinforcement learning. In IEEE Int. Symp. on Approximate Dynamic Programming and Reinforcement Learning, Honolulu, HI, 1–5 April 2007. ADPRL 2007, pp. 44–51. IEEE. (doi:10.1109/ADPRL.2007.368168)
- (2007) IEEE Int. Symp. on Approximate Dynamic Programming and Reinforcement Learning , pp. 44-51
- Wang, T.¹ Bowling, M.² Schuurmans, D.³

60
- 0002692217
- Actions and habits: The development of behavioural autonomy
- Dickinson A. 1985 Actions and habits: the development of behavioural autonomy. Phil. Trans. R. Soc. Lond. B 308, 67–78. (doi:10.1098/rstb.1985.0010)
- (1985) Phil. Trans. R. Soc. Lond. B , vol.308 , pp. 67-78
- Dickinson, A.¹

61
- 0031619316
- Bayesian Q-learning
- Menlo Park, CA: American Association for Artificial Intelligence
- Dearden R, Friedman N, Russell S. 1998 Bayesian Q-learning. In Proc. Fifteenth Nat. Conf. on Artificial Intelligence, pp. 761–768. Menlo Park, CA: American Association for Artificial Intelligence.
- (1998) Proc. Fifteenth Nat. Conf. on Artificial Intelligence , pp. 761-768
- Dearden, R.¹ Friedman, N.² Russell, S.³

62
- 1942421151
- Bayes meets Bellman: The Gaussian process approach to temporal difference learning
- eds T Fawcett, N Mishra, Washington, DC: AAAI Press
- Engel Y, Mannor S, Meir R. 2003 Bayes meets Bellman: the Gaussian process approach to temporal difference learning. In ICML (eds T Fawcett, N Mishra), pp. 154–161. Washington, DC: AAAI Press.
- (2003) ICML , pp. 154-161
- Engel, Y.¹ Mannor, S.² Meir, R.³

63
- 14344261137
- Bias and variance in value function estimation
- New York, NY: Association for Computing Machinery
- Mannor S, Simester D, Sun P, Tsitsiklis JN. 2004 Bias and variance in value function estimation. In Proc. 21st Int. Conf. on Machine Learning, p. 72. New York, NY: Association for Computing Machinery.
- (2004) Proc. 21st Int. Conf. on Machine Learning , pp. 72
- Mannor, S.¹ Simester, D.² Sun, P.³ Tsitsiklis, J.N.⁴

64
- 0027684215
- Prioritized sweeping: Reinforcement learning with less data and less time
- Moore AW, Atkeson CG. 1993 Prioritized sweeping: reinforcement learning with less data and less time. Mach. Learn. 13, 103–130. (doi:10.1007/BF00993104)
- (1993) Mach. Learn , vol.13 , pp. 103-130
- Moore, A.W.¹ Atkeson, C.G.²

65
- 84946268134
- Variations in the sensitivity of instrumental responding to reinforcer devaluation
- Adams CD. 1982 Variations in the sensitivity of instrumental responding to reinforcer devaluation. Q. J. Exp. Psychol. 34, 77–98.
- (1982) Q. J. Exp. Psychol , vol.34 , pp. 77-98
- Adams, C.D.¹

66
- 0004049893
- PhD thesis, Cambridge University, Cambridge, UK
- Watkins CJCH. 1989 Learning from delayed rewards. PhD thesis, Cambridge University, Cambridge, UK.
- (1989) Learning from delayed rewards
- Watkins, C.J.C.H.¹

67
- 84868291906
- Technical Report no. UCB/EECS-2011-119. EECS Department, University of California, Berkeley, CA
- Hay N, Russell SJ. 2011 Metareasoning for Monte Carlo tree search. Technical Report no. UCB/EECS-2011-119. EECS Department, University of California, Berkeley, CA.
- (2011) Metareasoning for Monte Carlo tree search
- Hay, N.¹ Russell, S.J.²

68
- 85012688561
- Princeton, NJ: Princeton University Press
- Bellman RE. 1957 Dynamic programming. Princeton, NJ: Princeton University Press.
- (1957) Dynamic programming
- Bellman, R.E.¹

69
- 72049125602
- Human and rodent homologies in action control: Corticostriatal determinants of goal-directed and habitual action
- Balleine B., O’Doherty JP. 2010 Human and rodent homologies in action control: corticostriatal determinants of goal-directed and habitual action. Neuropsychopharmacology 35, 48–69. (doi:10.1038/npp.2009.131)
- (2010) Neuropsychopharmacology , vol.35 , pp. 48-69
- Balleine, B.¹ O’Doherty, J.P.²

70
- 79960836887
- Parallel associative processing in the dorsal striatum: Segregation of stimulus–response and cognitive control subregions
- Devan BD, Hong NS, McDonald RJ. 2011 Parallel associative processing in the dorsal striatum: segregation of stimulus–response and cognitive control subregions. Neurobiol. Learn. Mem. 96, 95–120. (doi:10.1016/j.nlm.2011.06.002)
- (2011) Neurobiol. Learn. Mem , vol.96 , pp. 95-120
- Devan, B.D.¹ Hong, N.S.² McDonald, R.J.³

71
- 77953675717
- Differential dynamics of activity changes in dorsolateral and dorsomedial striatal loops during learning
- Thorn CA, Atallah H, Howe M, Graybiel AM. 2010 Differential dynamics of activity changes in dorsolateral and dorsomedial striatal loops during learning. Neuron 66, 781–795. (doi:10.1016/j.neuron.2010.04.036)
- (2010) Neuron , vol.66 , pp. 781-795
- Thorn, C.A.¹ Atallah, H.² Howe, M.³ Graybiel, A.M.⁴

72
- 0025321039
- Functional architecture of basal ganglia circuits: Neural substrates of parallel processing
- Alexander GE, Crutcher MD. 1990 Functional architecture of basal ganglia circuits: neural substrates of parallel processing. Trends Neurosci. 13, 266–271. (doi:10.1016/0166-2236(90)90107-L)
- (1990) Trends Neurosci , vol.13 , pp. 266-271
- Alexander, G.E.¹ Crutcher, M.D.²

73
- 0022930826
- Parallel organization of functionally segregated circuits linking basal ganglia and cortex
- Alexander GE, DeLong MR, Strick PL. 1986 Parallel organization of functionally segregated circuits linking basal ganglia and cortex. Annu. Rev. Neurosci. 9, 357–381. (doi:10.1146/annurev.ne.09.030186.002041)
- (1986) Annu. Rev. Neurosci , vol.9 , pp. 357-381
- Alexander, G.E.¹ DeLong, M.R.² Strick, P.L.³

74
- 33744550336
- Anatomy of a decision: Striato-orbitofrontal interactions in reinforcement learning, decision making, reversal
- Frank MJ, Claus ED. 2006 Anatomy of a decision: striato-orbitofrontal interactions in reinforcement learning, decision making, reversal. Psychol. Rev. 113, 300–326. (doi:10.1037/0033-295X.113.2.300)
- (2006) Psychol. Rev , vol.113 , pp. 300-326
- Frank, M.J.¹ Claus, E.D.²

75
- 0031801210
- Goal-directed instrumental action: Contingency and incentive learning and their cortical substrates
- Balleine B., Dickinson A. 1998 Goal-directed instrumental action: contingency and incentive learning and their cortical substrates. Neuropharmacology 37, 407–419. (doi:10.1016/S0028-3908(98)00033-1)
- (1998) Neuropharmacology , vol.37 , pp. 407-419
- Balleine, B.¹ Dickinson, A.²

76
- 0141502087
- Lesions of mediodorsal thalamus and anterior thalamic nuclei produce dissociable effects on instrumental conditioning in rats
- Corbit LH, Muir JL, Balleine BW. 2003 Lesions of mediodorsal thalamus and anterior thalamic nuclei produce dissociable effects on instrumental conditioning in rats. Eur. J. Neurosci. 18, 1286–1294. (doi:10.1046/j.1460-9568.2003.02833.x)
- (2003) Eur. J. Neurosci , vol.18 , pp. 1286-1294
- Corbit, L.H.¹ Muir, J.L.² Balleine, B.W.³

77
- 37549066620
- Lights, camembert, action! The role of human orbitofrontal cortex in encoding stimuli, rewards, choices
- O’Doherty JP. 2007 Lights, camembert, action! The role of human orbitofrontal cortex in encoding stimuli, rewards, choices. Ann. NY Acad. Sci. 1121, 254–272. (doi:10.1196/annals.1401.036)
- (2007) Ann. NY Acad. Sci , vol.1121 , pp. 254-272
- O’Doherty, J.P.¹

78
- 84865459435
- Corticostriatal connectivity underlies individual differences in the balance between habitual and goal-directed action control
- de Wit S, Watson P, Harsay HA, Cohen MX, van de Vijver I, Ridderinkhof KR. 2012 Corticostriatal connectivity underlies individual differences in the balance between habitual and goal-directed action control. J. Neurosci. 32, 12 066–12 075. (doi:10.1523/JNEUROSCI.1088-12.2012)
- (2012) J. Neurosci , vol.32 , pp. 12066-12075
- De Wit, S.¹ Watson, P.² Harsay, H.A.³ Cohen, M.X.⁴ Van De Vijver, I.⁵ Ridderinkhof, K.R.⁶

79
- 79951839136
- Neural correlates of instrumental contingency learning: Differential effects of action–reward conjunction and disjunction
- Liljeholm M, Tricomi E, O’Doherty JP, Balleine BW. 2011 Neural correlates of instrumental contingency learning: differential effects of action–reward conjunction and disjunction. J. Neurosci. 31, 2474–2480. (doi:10.1523/JNEUROSCI.3354-10.2011)
- (2011) J. Neurosci , vol.31 , pp. 2474-2480
- Liljeholm, M.¹ Tricomi, E.² O’Doherty, J.P.³ Balleine, B.W.⁴

80
- 84904384936
- Model-based and model-free Pavlovian reward learning: Revaluation, revision and revelation
- Dayan P, Berridge K. 2014 Model-based and model-free Pavlovian reward learning: revaluation, revision and revelation. Cogn. Affect. Behav. Neurosci. 14, 473–492. (doi:10.3758/s13415-014-0277-8)
- (2014) Cogn. Affect. Behav. Neurosci , vol.14 , pp. 473-492
- Dayan, P.¹ Berridge, K.²

81
- 84862244642
- Neural correlates of specific and general Pavlovian-to-instrumental transfer within human amygdalar subregions: A high-resolution fMRI study
- Prevost C, Liljeholm M, Tyszka JM, O’Doherty JP. 2012 Neural correlates of specific and general Pavlovian-to-instrumental transfer within human amygdalar subregions: a high-resolution fMRI study. J. Neurosci. 32, 8383–8390. (doi:10.1523/JNEUROSCI.6237-11.2012)
- (2012) J. Neurosci , vol.32 , pp. 8383-8390
- Prevost, C.¹ Liljeholm, M.² Tyszka, J.M.³ O’Doherty, J.P.⁴

82
- 0000121367
- Associations between the discriminative stimulus and the reinforcer in instrumental learning
- Colwill RM, Rescorla RA. 1988 Associations between the discriminative stimulus and the reinforcer in instrumental learning. J. Exp. Psychol. Anim. Behav. Process. 14, 155–164. (doi:10.1037/0097-7403.14.2.155)
- (1988) J. Exp. Psychol. Anim. Behav. Process , vol.14 , pp. 155-164
- Colwill, R.M.¹ Rescorla, R.A.²

83
- 0000772114
- Discriminative conditioning. I. A discriminative property of conditioned anticipation
- Estes W. 1943 Discriminative conditioning. I. A discriminative property of conditioned anticipation. J. Exp. Psychol. 32, 150–155. (doi:10.1037/h0058316)
- (1943) J. Exp. Psychol , vol.32 , pp. 150-155
- Estes, W.¹

84
- 84893764491
- Dorsal and ventral streams: The distinct role of striatal subregions in the acquisition and performance of goal-directed actions
- Hart G, Leung BK, Balleine BW. 2013 Dorsal and ventral streams: the distinct role of striatal subregions in the acquisition and performance of goal-directed actions. Neurobiol. Learn. Mem. 108, 104–118. (doi:10.1016/j.nlm.2013.11.003)
- (2013) Neurobiol. Learn. Mem , vol.108 , pp. 104-118
- Hart, G.¹ Leung, B.K.² Balleine, B.W.³

85
- 0014085947
- Two-process learning theory: Relationships between Pavlovian conditioning and instrumental learning
- Rescorla RA, Solomon RL. 1967 Two-process learning theory: relationships between Pavlovian conditioning and instrumental learning. Psychol. Rev. 74, 151–182. (doi:10.1037/h0024475)
- (1967) Psychol. Rev , vol.74 , pp. 151-182
- Rescorla, R.A.¹ Solomon, R.L.²

86
- 84887030057
- The role of the amygdala–striatal pathway in the acquisition and performance of goal-directed instrumental actions
- Corbit LH, Leung BK, Balleine BW. 2013 The role of the amygdala–striatal pathway in the acquisition and performance of goal-directed instrumental actions. J. Neurosci. 33, 17 682–17 690. (doi:10.1523/JNEUROSCI.3271-13.2013)
- (2013) J. Neurosci , vol.33 , pp. 17682-17690
- Corbit, L.H.¹ Leung, B.K.² Balleine, B.W.³

87
- 84856002548
- Amygdala central nucleus interacts with dorsolateral striatum to regulate the acquisition of habits
- Lingawi NW, Balleine BW. 2012 Amygdala central nucleus interacts with dorsolateral striatum to regulate the acquisition of habits. J. Neurosci. 32, 1073–1081. (doi:10.1523/JNEUROSCI.4806-11.2012)
- (2012) J. Neurosci , vol.32 , pp. 1073-1081
- Lingawi, N.W.¹ Balleine, B.W.²

88
- 77955362035
- Neurocomputational models of motor and cognitive deficits in Parkinson’s disease
- Wiecki TV, Frank MJ. 2010 Neurocomputational models of motor and cognitive deficits in Parkinson’s disease. Prog. Brain Res. 183, 275–297. (doi:10.1016/S0079-6123(10)83014-6)
- (2010) Prog. Brain Res , vol.183 , pp. 275-297
- Wiecki, T.V.¹ Frank, M.J.²

89
- 33645458694
- Reverse replay of behavioural sequences in hippocampal place cells during the awake state
- Foster DJ, Wilson MA. 2006 Reverse replay of behavioural sequences in hippocampal place cells during the awake state. Nature 440, 680–683. (doi:10.1038/nature04587)
- (2006) Nature , vol.440 , pp. 680-683
- Foster, D.J.¹ Wilson, M.A.²

90
- 36348999880
- Hippocampal theta sequences
- Foster DJ, Wilson MA. 2007 Hippocampal theta sequences. Hippocampus 17, 1093–1099. (doi:10.1002/hipo.20345)
- (2007) Hippocampus , vol.17 , pp. 1093-1099
- Foster, D.J.¹ Wilson, M.A.²

91
- 40849087850
- Integrating hippocampus and striatum in decision-making
- Johnson A, van der Meer MAA, Redish AD. 2007 Integrating hippocampus and striatum in decision-making. Curr. Opin. Neurobiol. 17, 692–697. (doi:10.1016/j.conb.2008.01.003)
- (2007) Curr. Opin. Neurobiol , vol.17 , pp. 692-697
- Johnson, A.¹ Van Der Meer, M.A.A.² Redish, A.D.³

92
- 84877578934
- Hippocampal place-cell sequences depict future paths to remembered goals
- Pfeiffer BE, Foster DJ. 2013 Hippocampal place-cell sequences depict future paths to remembered goals. Nature 497, 74–79. (doi:10.1038/nature12112)
- (2013) Nature , vol.497 , pp. 74-79
- Pfeiffer, B.E.¹ Foster, D.J.²

93
- 29144438020
- Theta rhythms coordinate hippocampal–prefrontal interactions in a spatial memory task
- Jones MW, Wilson MA. 2005 Theta rhythms coordinate hippocampal–prefrontal interactions in a spatial memory task. PLoS Biol. 3, e402. (doi:10.1371/journal.pbio.0030402)
- (2005) PLoS Biol , vol.3
- Jones, M.W.¹ Wilson, M.A.²

94
- 69349092175
- Hippocampus leads ventral striatum in replay of place-reward information
- Lansink CS, Goltstein PM, Lankelma JV, McNaughton BL, Pennartz CM. 2009 Hippocampus leads ventral striatum in replay of place-reward information. PLoS Biol. 7, e1000173. (doi:10.1371/journal.pbio.1000173)
- (2009) PLoS Biol , vol.7
- Lansink, C.S.¹ Goltstein, P.M.² Lankelma, J.V.³ McNaughton, B.L.⁴ Pennartz, C.M.⁵

95
- 34347235785
- Cambridge, MA: MIT Press
- Doya K, Ishii S, Pouget A, Rao RP. (eds) 2007 Bayesian brain: probabilistic approaches to neural coding. Cambridge, MA: MIT Press.
- (2007) Bayesian brain: Probabilistic approaches to neural coding
- Doya, K.¹ Ishii, S.² Pouget, A.³ Rao, R.P.⁴

96
- 0033168618
- Prospective coding for objects in primate prefrontal cortex
- Rainer G, Rao SC, Miller EK. 1999 Prospective coding for objects in primate prefrontal cortex. J. Neurosci. 19, 5493–5505.
- (1999) J. Neurosci , vol.19 , pp. 5493-5505
- Rainer, G.¹ Rao, S.C.² Miller, E.K.³

97
- 0025948773
- Neural organization for the long-term memory of paired associates
- Sakai K, Miyashita Y. 1991 Neural organization for the long-term memory of paired associates. Nature 354, 152–155. (doi:10.1038/354152a0)
- (1991) Nature , vol.354 , pp. 152-155
- Sakai, K.¹ Miyashita, Y.²

98
- 15244346900
- Lesion to the nigrostriatal dopamine system disrupts stimulus–response habit formation
- Faure A, Haberland U, Condé F, El Massioui N. 2005 Lesion to the nigrostriatal dopamine system disrupts stimulus–response habit formation. J. Neurosci. 25, 2771–2780. (doi:10.1523/JNEUROSCI.3894-04.2005)
- (2005) J. Neurosci , vol.25 , pp. 2771-2780
- Faure, A.¹ Haberland, U.² Condé, F.³ El Massioui, N.⁴

99
- 84891681994
- A causal link between prediction errors, dopamine neurons and learning
- Steinberg EE, Keiflin R, Boivin JR, Witten IB, Deisseroth K, Janak PH. 2013 A causal link between prediction errors, dopamine neurons and learning. Nat. Neurosci. 16, 966–973. (doi:10.1038/nn.3413)
- (2013) Nat. Neurosci , vol.16 , pp. 966-973
- Steinberg, E.E.¹ Keiflin, R.² Boivin, J.R.³ Witten, I.B.⁴ Deisseroth, K.⁵ Janak, P.H.⁶

100
- 84155183278
- NMDA receptors in dopaminergic neurons are crucial for habit learning
- Wang LP, Li F, Wang D, Xie K, Wang D, Shen X, Tsien JZ. 2011 NMDA receptors in dopaminergic neurons are crucial for habit learning. Neuron 72, 1055–1066. (doi:10.1016/j.neuron.2011.10.019)
- (2011) Neuron , vol.72 , pp. 1055-1066
- Wang, L.P.¹ Li, F.² Wang, D.³ Xie, K.⁴ Wang, D.⁵ Shen, X.⁶ Tsien, J.Z.⁷

101
- 0033913868
- Dissociation of Pavlovian and instrumental incentive learning under dopamine antagonists
- Dickinson A, Smith J, Mirenowicz J. 2000 Dissociation of Pavlovian and instrumental incentive learning under dopamine antagonists. Behav. Neurosci. 114, 468–483. (doi:10.1037/0735-7044.114.3.468)
- (2000) Behav. Neurosci , vol.114 , pp. 468-483
- Dickinson, A.¹ Smith, J.² Mirenowicz, J.³

102
- 84874655309
- Instant transformation of learned repulsion into motivational ‘wanting’
- Robinson MJF, Berridge KC. 2013 Instant transformation of learned repulsion into motivational ‘wanting’. Curr. Biol. 23, 282–289. (doi:10.1016/j.cub.2013.01.016)
- (2013) Curr. Biol , vol.23 , pp. 282-289
- Robinson, M.J.F.¹ Berridge, K.C.²

103
- 84906263462
- The computational and neural basis of cognitive control: Charted territory and new frontiers
- Botvinick M, Cohen J. 2014 The computational and neural basis of cognitive control: charted territory and new frontiers. Cogn. Sci. (doi:10.1111/cogs.12126)
- (2014) Cogn. Sci
- Botvinick, M.¹ Cohen, J.²

104
- 0035382933
- Interactions between frontal cortex and basal ganglia in working memory: A computational model
- Frank MJ, Loughry B, O’Reilly RC. 2001 Interactions between frontal cortex and basal ganglia in working memory: a computational model. Cogn. Affect. Behav. Neurosci. 1, 137–160. (doi:10.3758/CABN.1.2.137)
- (2001) Cogn. Affect. Behav. Neurosci , vol.1 , pp. 137-160
- Frank, M.J.¹ Loughry, B.² O’Reilly, R.C.³

105
- 33645889031
- Banishing the homunculus: Making working memory work
- Hazy TE, Frank MJ, O’Reilly RC. 2006 Banishing the homunculus: making working memory work. Neuroscience 139, 105–118. (doi:10.1016/j.neuroscience.2005.04.067)
- (2006) Neuroscience , vol.139 , pp. 105-118
- Hazy, T.E.¹ Frank, M.J.² O’Reilly, R.C.³

106
- 0034928713
- An integrative theory of prefrontal cortex function
- Miller EK, Cohen JD. 2001 An integrative theory of prefrontal cortex function. Annu. Rev. Neurosci. 24, 167–202. (doi:10.1146/annurev.neuro.24.1.167)
- (2001) Annu. Rev. Neurosci , vol.24 , pp. 167-202
- Miller, E.K.¹ Cohen, J.D.²

107
- 0002538557
- A biologically based computational model of working memory
- eds A Miyake, P Shah, New York, NY: Cambridge University Press
- O’Reilly R, Braver T, Cohen J. 1999 A biologically based computational model of working memory. In Models of working memory: mechanisms of active maintenance and executive control (eds A Miyake, P Shah), pp. 375–411. New York, NY: Cambridge University Press.
- (1999) Models of working memory: Mechanisms of active maintenance and executive control , pp. 375-411
- O’Reilly, R.¹ Braver, T.² Cohen, J.³

108
- 33644927837
- Making working memory work: A computational model of learning in the prefrontal cortex and basal ganglia
- O’Reilly RC, Frank MJ. 2006 Making working memory work: a computational model of learning in the prefrontal cortex and basal ganglia. Neural Comput. 18, 283–328. (doi:10.1162/089976606775093909)
- (2006) Neural Comput , vol.18 , pp. 283-328
- O’Reilly, R.C.¹ Frank, M.J.²

109
- 0036173816
- Prefrontal cortex and dynamic categorization tasks: Representational organization and neuromodulatory control
- O’Reilly RC, Noelle DC, Braver TS, Cohen J. 2002 Prefrontal cortex and dynamic categorization tasks: representational organization and neuromodulatory control. Cereb. Cortex 12, 246–257. (doi:10.1093/cercor/12.3.246)
- (2002) Cereb. Cortex , vol.12 , pp. 246-257
- O’Reilly, R.C.¹ Noelle, D.C.² Braver, T.S.³ Cohen, J.⁴

110
- 18844456810
- Prefrontal cortex and flexible cognitive control: Rules without symbols
- Rougier NP, Noelle DC, Braver TS, Cohen J, O’Reilly RC. 2005 Prefrontal cortex and flexible cognitive control: rules without symbols. Proc. Natl Acad. Sci. USA 102, 7338–7343. (doi:10.1073/pnas.0502455102)
- (2005) Proc. Natl Acad. Sci. USA , vol.102 , pp. 7338-7343
- Rougier, N.P.¹ Noelle, D.C.² Braver, T.S.³ Cohen, J.⁴ O’Reilly, R.C.⁵

111
- 70350440484
- Rational adaptation under task and processing constraints: Implications for testing theories of cognition and action
- Howes A, Lewis RL, Vera A. 2009 Rational adaptation under task and processing constraints: implications for testing theories of cognition and action. Psychol. Rev. 116, 717–751. (doi:10.1037/a0017187)
- (2009) Psychol. Rev , vol.116 , pp. 717-751
- Howes, A.¹ Lewis, R.L.² Vera, A.³

112
- 84899629222
- Computational rationality: Linking mechanism and behavior through utility maximization
- Lewis RL, Howes A, Singh S. 2014 Computational rationality: linking mechanism and behavior through utility maximization. Top. Cogn. Sci. 6, 279–311. (doi:10.1111/tops.12086)
- (2014) Top. Cogn. Sci , vol.6 , pp. 279-311
- Lewis, R.L.¹ Howes, A.² Singh, S.³

113
- 33847205014
- Planning for the future by western scrub-jays
- Raby CR, Alexis DM, Dickinson A, Clayton NS. 2007 Planning for the future by western scrub-jays. Nature 445, 919–921. (doi:10.1038/nature05575)
- (2007) Nature , vol.445 , pp. 919-921
- Raby, C.R.¹ Alexis, D.M.² Dickinson, A.³ Clayton, N.S.⁴

114
- 84960566726
- Anomalies in intertemporal choice: Evidence and an interpretation
- Loewenstein G, Prelec D. 1992 Anomalies in intertemporal choice: evidence and an interpretation. Q. J. Econ. 107, 573–597. (doi:10.2307/2118482)
- (1992) Q. J. Econ , vol.107 , pp. 573-597
- Loewenstein, G.¹ Prelec, D.²

115
- 0004145775
- New York, NY: Harper
- Guthrie E. 1935 The psychology of learning. New York, NY: Harper.
- (1935) The psychology of learning
- Guthrie, E.¹

116
- 70350566799
- Hierarchically organized behavior and its neural foundations: A reinforcement learning perspective
- Botvinick M, Niv Y, Barto AG. 2009 Hierarchically organized behavior and its neural foundations: a reinforcement learning perspective. Cognition 113, 262–280. (doi:10.1016/j.cognition.2008.08.011)
- (2009) Cognition , vol.113 , pp. 262-280
- Botvinick, M.¹ Niv, Y.² Barto, A.G.³

117
- 79960637995
- A neural signature of hierarchical reinforcement learning
- Ribas-Fernandes JJF, Solway A, Diuk C, McGuire JT, Barto A, Niv Y, Botvinick MM. 2011 A neural signature of hierarchical reinforcement learning. Neuron 71, 370–379. (doi:10.1016/j.neuron.2011.05.042)
- (2011) Neuron , vol.71 , pp. 370-379
- Ribas-Fernandes, J.J.F.¹ Solway, A.² Diuk, C.³ McGuire, J.T.⁴ Barto, A.⁵ Niv, Y.⁶ Botvinick, M.M.⁷

118
- 78649604962
- Evidence for model-based action planning in a sequential finger movement task
- Fermin A, Yoshida T, Ito M, Yoshimoto J, Doya K. 2010 Evidence for model-based action planning in a sequential finger movement task. J. Mot. Behav. 42, 371–379. (doi:10.1080/00222895.2010.526467)
- (2010) J. Mot. Behav , vol.42 , pp. 371-379
- Fermin, A.¹ Yoshida, T.² Ito, M.³ Yoshimoto, J.⁴ Doya, K.⁵

119
- 84889680697
- The intrinsic cost of cognitive control
- Kool W, Botvinick M. 2013 The intrinsic cost of cognitive control. Behav. Brain Sci. 36, 697–698. (doi:10.1017/S0140525X1300109X)
- (2013) Behav. Brain Sci , vol.36 , pp. 697-698
- Kool, W.¹ Botvinick, M.²

120
- 84880660982
- The expected value of control: An integrative theory of anterior cingulate cortex function
- Shenhav A, Botvinick M, Cohen JD. 2013 The expected value of control: an integrative theory of anterior cingulate cortex function. Neuron 79, 217–240. (doi:10.1016/j.neuron.2013.07.007)
- (2013) Neuron , vol.79 , pp. 217-240
- Shenhav, A.¹ Botvinick, M.² Cohen, J.D.³

121
- 0000827179
- Boxes: An experiment in adaptive control
- Michie D, Chambers R. 1968 Boxes: an experiment in adaptive control. Mach. Intell. 2, 137–152.
- (1968) Mach. Intell , vol.2 , pp. 137-152
- Michie, D.¹ Chambers, R.²

122
- 84875027622
- Cambridge, MA: Harvard University, Psychological Laboratories
- Minsky M. 1952 A neural-analogue calculator based upon a probability model of reinforcement. Cambridge, MA: Harvard University, Psychological Laboratories.
- (1952) A neural-analogue calculator based upon a probability model of reinforcement
- Minsky, M.¹

123
- 0001778486
- The impact of chess research on cognitive science
- Charness N. 1992 The impact of chess research on cognitive science. Psychol. Res. 54, 4–9. (doi:10.1007/BF01359217)
- (1992) Psychol. Res , vol.54 , pp. 4-9
- Charness, N.¹

124
- 0004217226
- The Hague, The Netherlands: Mouton
- de Groot AD. 1965 Thought and choice in chess. The Hague, The Netherlands: Mouton.
- (1965) Thought and choice in chess
- De Groot, A.D.¹

125
- 0001275820
- Chess-playing programs and the problem of complexity
- Newell A, Shaw JC, Simon HA. 1958 Chess-playing programs and the problem of complexity. IBM J. Res. Dev. 2, 320–335. (doi:10.1147/rd.24.0320)
- (1958) IBM J. Res. Dev , vol.2 , pp. 320-335
- Newell, A.¹ Shaw, J.C.² Simon, H.A.³

126
- 0003430412
- Englewood Cliffs, NJ: Prentice-Hall
- Newell A et al. 1972 Human problem solving. Englewood Cliffs, NJ: Prentice-Hall.
- (1972) Human problem solving
- Newell, A.¹

127
- 0002285834
- Problem solving and learning
- Anderson JR. 1993 Problem solving and learning. Am. Psychol. 48, 35–44. (doi:10.1037/0003-066X.48.1.35)
- (1993) Am. Psychol , vol.48 , pp. 35-44
- Anderson, J.R.¹

128
- 0000004705
- The functional equivalence of problem solving skills
- Simon HA. 1975 The functional equivalence of problem solving skills. Cogn. Psychol. 7, 268–288. (doi:10.1016/0010-0285(75)90012-2)
- (1975) Cogn. Psychol , vol.7 , pp. 268-288
- Simon, H.A.¹

129
- 84859371025
- Bonsai trees in your head: How the Pavlovian system sculpts goal-directed choices by pruning decision trees
- Huys QJ, Eshel N, O’Nions E, Sheridan L, Dayan P, Roiser JP. 2012 Bonsai trees in your head: how the Pavlovian system sculpts goal-directed choices by pruning decision trees. PLoS Comput. Biol. 8, e1002410. (doi:10.1371/journal.pcbi.1002410)
- (2012) PLoS Comput. Biol , vol.8
- Huys, Q.J.¹ Eshel, N.² O’Nions, E.³ Sheridan, L.⁴ Dayan, P.⁵ Roiser, J.P.⁶

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.