Volume 369, Issue 1655, 2014

The algorithmic anatomy of model-based evaluation

Author keywords

Model based reasoning; Model free reasoning; Monte Carlo tree search; Orbitofrontal cortex; Reinforcement learning; Striatum

Indexed keywords

ALGORITHM; ANATOMY; ANIMAL; COMPUTER SIMULATION; LEARNING; MONTE CARLO ANALYSIS; NUMERICAL MODEL; PREDICTION; TWENTIETH CENTURY

EID: 84907545889     PISSN: 09628436     EISSN: 14712970     Source Type: Journal    
DOI: 10.1098/rstb.2013.0478     Document Type: Article
Times cited: 138

References (129)
  • 1
    • 0033213819
    • What are the computations of the cerebellum, the basal ganglia and the cerebral cortex?
    • Doya K. 1999 What are the computations of the cerebellum, the basal ganglia and the cerebral cortex? Neural Netw. 12, 961–974. (doi:10.1016/S0893-6080(99)00046-5)
    • (1999) Neural Netw , vol.12 , pp. 961-974
    • Doya, K.1
  • 3
    • 0002166755
    • Actions and habits: Variations in associative representations during instrumental learning
    • (eds N Spear, R Miller), Hillsdale, NJ: Lawrence Erlbaum Associates
    • Adams C, Dickinson A. 1981 Actions and habits: variations in associative representations during instrumental learning. In Information processing in animals: memory mechanisms (eds N Spear, R Miller), pp. 143–165. Hillsdale, NJ: Lawrence Erlbaum Associates.
    • (1981) Information processing in animals: Memory mechanisms , pp. 143-165
    • Adams, C.1    Dickinson, A.2
  • 4
    • 28044450875
    • Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control
    • Daw ND, Niv Y, Dayan P. 2005 Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nat. Neurosci. 8, 1704–1711. (doi:10.1038/nn1560)
    • (2005) Nat. Neurosci , vol.8 , pp. 1704-1711
    • Daw, N.D.1    Niv, Y.2    Dayan, P.3
  • 5
    • 0043250430
    • The role of learning in motivation
    • (ed. C Gallistel), New York, NY: Wiley
    • Dickinson A, Balleine B. 2002 The role of learning in motivation. In Stevens’ handbook of experimental psychology, vol. 3 (ed. C Gallistel), pp. 497–533. New York, NY: Wiley.
    • (2002) Stevens’ handbook of experimental psychology , vol.3 , pp. 497-533
    • Dickinson, A.1    Balleine, B.2
  • 7
    • 0242305777
    • Individual differences in reasoning: Implications for the rationality debate?
    • (eds T Gilovich, D Griffin, D Kahneman), Cambridge, UK: Cambridge University Press
    • Stanovich K, West R. 2002 Individual differences in reasoning: implications for the rationality debate? In Heuristics and biases: the psychology of intuitive judgment (eds T Gilovich, D Griffin, D Kahneman), pp. 421–440. Cambridge, UK: Cambridge University Press.
    • (2002) Heuristics and biases: The psychology of intuitive judgment , pp. 421-440
    • Stanovich, K.1    West, R.2
  • 8
    • 0001201756
    • Some studies in machine learning using the game of checkers
    • Samuel A. 1959 Some studies in machine learning using the game of checkers. IBM J. Res. Dev. 3, 210–229. (doi:10.1147/rd.33.0210)
    • (1959) IBM J. Res. Dev , vol.3 , pp. 210-229
    • Samuel, A.1
  • 9
    • 0020970738
    • Neuronlike adaptive elements that can solve difficult learning control problems
    • Barto A, Sutton R, Anderson CW. 1983 Neuronlike adaptive elements that can solve difficult learning control problems. IEEE Trans. Syst. Man Cybernet. 834–846. (doi:10.1109/TSMC.1983.6313077)
    • (1983) IEEE Trans. Syst. Man Cybernet , pp. 834-846
    • Barto, A.1    Sutton, R.2    Anderson, C.W.3
  • 10
    • 84859341150
    • Habits, action sequences and reinforcement learning
    • Dezfouli A, Balleine BW. 2012 Habits, action sequences and reinforcement learning. Eur. J. Neurosci. 35, 1036–1051. (doi:10.1111/j.1460-9568.2012.08050.x)
    • (2012) Eur. J. Neurosci , vol.35 , pp. 1036-1051
    • Dezfouli, A.1    Balleine, B.W.2
  • 11
    • 84907480610
    • Habits as action sequences: Hierarchical action control and changes in outcome value
    • Dezfouli A, Lingawi NW, Balleine BW. 2014 Habits as action sequences: hierarchical action control and changes in outcome value. Phil. Trans. R. Soc. B 369, 20130482. (doi:10.1098/rstb.2013.0482)
    • (2014) Phil. Trans. R. Soc. B , vol.369 , pp. 20130482
    • Dezfouli, A.1    Lingawi, N.W.2    Balleine, B.W.3
  • 12
    • 84885802926
    • Goals and habits in the brain
    • Dolan RJ, Dayan P. 2013 Goals and habits in the brain. Neuron 80, 312–325. (doi:10.1016/j.neuron.2013.09.007)
    • (2013) Neuron , vol.80 , pp. 312-325
    • Dolan, R.J.1    Dayan, P.2
  • 13
    • 1842853951
    • Relations between Pavlovian–instrumental transfer and reinforcer devaluation
    • Holland PC. 2004 Relations between Pavlovian–instrumental transfer and reinforcer devaluation. J. Exp. Psychol. Anim. Behav. Process. 30, 104–117. (doi:10.1037/0097-7403.30.2.104)
    • (2004) J. Exp. Psychol. Anim. Behav. Process , vol.30 , pp. 104-117
    • Holland, P.C.1
  • 14
    • 79955709936
    • Neural correlates of forward planning in a spatial decision task in humans
    • Simon DA, Daw ND. 2011 Neural correlates of forward planning in a spatial decision task in humans. J. Neurosci. 31, 5526–5539. (doi:10.1523/JNEUROSCI.4647-10.2011)
    • (2011) J. Neurosci , vol.31 , pp. 5526-5539
    • Simon, D.A.1    Daw, N.D.2
  • 16
    • 0001461525
    • There is more than one kind of learning
    • Tolman E. 1949 There is more than one kind of learning. Psychol. Rev. 56, 144–155. (doi:10.1037/h0055304)
    • (1949) Psychol. Rev , vol.56 , pp. 144-155
    • Tolman, E.1
  • 17
    • 84877341847
    • The curse of planning: Dissecting multiple reinforcement-learning systems by taxing the central executive
    • Otto AR, Gershman SJ, Markman AB, Daw ND. 2013 The curse of planning: dissecting multiple reinforcement-learning systems by taxing the central executive. Psychol. Sci. 24, 751–761. (doi:10.1177/0956797612463080)
    • (2013) Psychol. Sci , vol.24 , pp. 751-761
    • Otto, A.R.1    Gershman, S.J.2    Markman, A.B.3    Daw, N.D.4
  • 18
    • 84891354506
    • Working-memory capacity protects model-based learning from stress
    • Otto AR, Raio CM, Chiang A, Phelps EA, Daw ND. 2013 Working-memory capacity protects model-based learning from stress. Proc. Natl Acad. Sci. USA 110, 20 941–20 946. (doi:10.1073/pnas.1312011110)
    • (2013) Proc. Natl Acad. Sci , vol.110 , pp. 20941-20946
    • Otto, A.R.1    Raio, C.M.2    Chiang, A.3    Phelps, E.A.4    Daw, N.D.5
  • 19
    • 0000541213
    • Adaptive critics and the basal ganglia
    • (eds J Houk, J Davis, D Beiser), Cambridge, MA: MIT Press
    • Barto A. 1995 Adaptive critics and the basal ganglia. In Models of information processing in the basal ganglia (eds J Houk, J Davis, D Beiser), pp. 215–232. Cambridge, MA: MIT Press.
    • (1995) Models of information processing in the basal ganglia , pp. 215-232
    • Barto, A.1
  • 20
    • 0029981543
    • A framework for mesencephalic dopamine systems based on predictive Hebbian learning
    • Montague PR, Dayan P, Sejnowski TJ. 1996 A framework for mesencephalic dopamine systems based on predictive Hebbian learning. J. Neurosci. 16, 1936–1947.
    • (1996) J. Neurosci , vol.16 , pp. 1936-1947
    • Montague, P.R.1    Dayan, P.2    Sejnowski, T.J.3
  • 21
    • 0030896968
    • A neural substrate of prediction and reward
    • Schultz W, Dayan P, Montague PR. 1997 A neural substrate of prediction and reward. Science 275, 1593–1599. (doi:10.1126/science.275.5306.1593)
    • (1997) Science , vol.275 , pp. 1593-1599
    • Schultz, W.1    Dayan, P.2    Montague, P.R.3
  • 22
    • 28444472936
    • Neural bases of food-seeking: Affect, arousal and reward in corticostriatolimbic circuits
    • Balleine BW. 2005 Neural bases of food-seeking: affect, arousal and reward in corticostriatolimbic circuits. Physiol. Behav. 86, 717–730. (doi:10.1016/j.physbeh.2005.08.061)
    • (2005) Physiol. Behav , vol.86 , pp. 717-730
    • Balleine, B.W.1
  • 23
    • 0037382264
    • Coordination of actions and habits in the medial prefrontal cortex of rats
    • Killcross S, Coutureau E. 2003 Coordination of actions and habits in the medial prefrontal cortex of rats. Cereb. Cortex 13, 400–408. (doi:10.1093/cercor/13.4.400)
    • (2003) Cereb. Cortex , vol.13 , pp. 400-408
    • Killcross, S.1    Coutureau, E.2
  • 25
    • 79951823576
    • Ventral striatum and orbitofrontal cortex are both required for model-based, but not model-free, reinforcement learning
    • McDannald MA, Lucantonio F, Burke KA, Niv Y, Schoenbaum G. 2011 Ventral striatum and orbitofrontal cortex are both required for model-based, but not model-free, reinforcement learning. J. Neurosci. 31, 2700–2705. (doi:10.1523/JNEUROSCI.5499-10.2011)
    • (2011) J. Neurosci , vol.31 , pp. 2700-2705
    • McDannald, M.A.1    Lucantonio, F.2    Burke, K.A.3    Niv, Y.4    Schoenbaum, G.5
  • 26
    • 84859323549
    • Model-based learning and the contribution of the orbitofrontal cortex to the model-free world
    • McDannald MA, Takahashi YK, Lopatina N, Pietras BW, Jones JL, Schoenbaum G. 2012 Model-based learning and the contribution of the orbitofrontal cortex to the model-free world. Eur. J. Neurosci. 35, 991–996. (doi:10.1111/j.1460-9568.2011.07982.x)
    • (2012) Eur. J. Neurosci , vol.35 , pp. 991-996
    • McDannald, M.A.1    Takahashi, Y.K.2    Lopatina, N.3    Pietras, B.W.4    Jones, J.L.5    Schoenbaum, G.6
  • 27
    • 53949118376
    • Reward-guided learning beyond dopamine in the nucleus accumbens: The integrative functions of cortico-basal ganglia networks
    • Yin HH, Ostlund SB, Balleine BW. 2008 Reward-guided learning beyond dopamine in the nucleus accumbens: the integrative functions of cortico-basal ganglia networks. Eur. J. Neurosci. 28, 1437–1448. (doi:10.1111/j.1460-9568.2008.06422.x)
    • (2008) Eur. J. Neurosci , vol.28 , pp. 1437-1448
    • Yin, H.H.1    Ostlund, S.B.2    Balleine, B.W.3
  • 28
    • 77953260848
    • States versus rewards: Dissociable neural prediction error signals underlying model-based and model-free reinforcement learning
    • Glascher J, Daw N, Dayan P, O’Doherty JP. 2010 States versus rewards: dissociable neural prediction error signals underlying model-based and model-free reinforcement learning. Neuron 66, 585–595. (doi:10.1016/j.neuron.2010.04.016)
    • (2010) Neuron , vol.66 , pp. 585-595
    • Glascher, J.1    Daw, N.2    Dayan, P.3    O’Doherty, J.P.4
  • 29
    • 66449119919
    • A specific role for posterior dorsolateral striatum in human habit learning
    • Tricomi E, Balleine B, O’Doherty JP. 2009 A specific role for posterior dorsolateral striatum in human habit learning. Eur. J. Neurosci. 29, 2225–2232. (doi:10.1111/j.1460-9568.2009.06796.x)
    • (2009) Eur. J. Neurosci , vol.29 , pp. 2225-2232
    • Tricomi, E.1    Balleine, B.2    O’Doherty, J.P.3
  • 30
    • 34247147767
    • Determining the neural substrates of goal-directed learning in the human brain
    • Valentin VV, Dickinson A, O’Doherty JP. 2007 Determining the neural substrates of goal-directed learning in the human brain. J. Neurosci. 27, 4019–4026. (doi:10.1523/JNEUROSCI.0564-07.2007)
    • (2007) J. Neurosci , vol.27 , pp. 4019-4026
    • Valentin, V.V.1    Dickinson, A.2    O’Doherty, J.P.3
  • 31
    • 84860307045
    • Mapping value based planning and extensively trained choice in the human brain
    • Wunderlich K, Dayan P, Dolan RJ. 2012 Mapping value based planning and extensively trained choice in the human brain. Nat. Neurosci. 15, 786–791. (doi:10.1038/nn.3068)
    • (2012) Nat. Neurosci , vol.15 , pp. 786-791
    • Wunderlich, K.1    Dayan, P.2    Dolan, R.J.3
  • 32
    • 79958143780
    • Speed/accuracy trade-off between the habitual and the goal-directed processes
    • Keramati M, Dezfouli A, Piray P. 2011 Speed/accuracy trade-off between the habitual and the goal-directed processes. PLoS Comput. Biol. 7, e1002055. (doi:10.1371/journal.pcbi.1002055)
    • (2011) PLoS Comput. Biol , vol.7
    • Keramati, M.1    Dezfouli, A.2    Piray, P.3
  • 33
    • 84878783112
    • The mixed instrumental controller: Using value of information to combine habitual choice and mental simulation
    • Pezzulo G, Rigoli F, Chersi F. 2013 The mixed instrumental controller: using value of information to combine habitual choice and mental simulation. Front. Psychol. 4, 92. (doi:10.3389/fpsyg.2013.00092)
    • (2013) Front. Psychol , vol.4 , pp. 92
    • Pezzulo, G.1    Rigoli, F.2    Chersi, F.3
  • 34
    • 84859737036
    • Goal-directed decision making as probabilistic inference: A computational framework and potential neural correlates
    • Solway A, Botvinick MM. 2012 Goal-directed decision making as probabilistic inference: a computational framework and potential neural correlates. Psychol. Rev. 119, 120–154. (doi:10.1037/a0026435)
    • (2012) Psychol. Rev , vol.119 , pp. 120-154
    • Solway, A.1    Botvinick, M.M.2
  • 36
    • 84878179610
    • How to set the switches on this thing
    • Dayan P. 2012 How to set the switches on this thing. Curr. Opin. Neurobiol. 22, 1068–1074. (doi:10.1016/j.conb.2012.05.011)
    • (2012) Curr. Opin. Neurobiol , vol.22 , pp. 1068-1074
    • Dayan, P.1
  • 39
    • 79952746011
    • Model-based influences on humans’ choices and striatal prediction errors
    • Daw ND, Gershman SJ, Seymour B, Dayan P, Dolan RJ. 2011 Model-based influences on humans’ choices and striatal prediction errors. Neuron 69, 1204–1215. (doi:10.1016/j.neuron.2011.02.027)
    • (2011) Neuron , vol.69 , pp. 1204-1215
    • Daw, N.D.1    Gershman, S.J.2    Seymour, B.3    Dayan, P.4    Dolan, R.J.5
  • 40
    • 70449715719
    • Instructional control of reinforcement learning: A behavioral and neurocomputational investigation
    • Doll BB, Jacobs WJ, Sanfey AG, Frank MJ. 2009 Instructional control of reinforcement learning: a behavioral and neurocomputational investigation. Brain Res. 1299, 74–94. (doi:10.1016/j.brainres.2009.07.007)
    • (2009) Brain Res , vol.1299 , pp. 74-94
    • Doll, B.B.1    Jacobs, W.J.2    Sanfey, A.G.3    Frank, M.J.4
  • 41
    • 84872761547
    • The ubiquity of model-based reinforcement learning
    • Doll BB, Simon DA, Daw ND. 2012 The ubiquity of model-based reinforcement learning. Curr. Opin. Neurobiol. 22, 1075–1081. (doi:10.1016/j.conb.2012.08.003)
    • (2012) Curr. Opin. Neurobiol , vol.22 , pp. 1075-1081
    • Doll, B.B.1    Simon, D.A.2    Daw, N.D.3
  • 42
    • 84878779561
    • Retrospective revaluation in sequential decision making: A tale of two systems
    • Gershman SJ, Markman AB, Otto AR. 2012 Retrospective revaluation in sequential decision making: a tale of two systems. J. Exp. Psychol. Gen. 24, 751–761.
    • (2012) J. Exp. Psychol. Gen , vol.24 , pp. 751-761
    • Gershman, S.J.1    Markman, A.B.2    Otto, A.R.3
  • 44
    • 0032073263
    • Planning and acting in partially observable stochastic domains
    • Kaelbling LP, Littman ML, Cassandra AR. 1998 Planning and acting in partially observable stochastic domains. Artif. Intell. 101, 99–134. (doi:10.1016/S0004-3702(98)00023-X)
    • (1998) Artif. Intell , vol.101 , pp. 99-134
    • Kaelbling, L.P.1    Littman, M.L.2    Cassandra, A.R.3
  • 45
    • 0036832951
    • A sparse sampling algorithm for near-optimal planning in large Markov decision processes
    • Kearns M, Mansour Y, Ng AY. 2002 A sparse sampling algorithm for near-optimal planning in large Markov decision processes. Mach. Learn. 49, 193–208. (doi:10.1023/A:1017932429737)
    • (2002) Mach. Learn , vol.49 , pp. 193-208
    • Kearns, M.1    Mansour, Y.2    Ng, A.Y.3
  • 47
    • 33847202724
    • Learning to predict by the methods of temporal differences
    • Sutton R. 1988 Learning to predict by the methods of temporal differences. Mach. Learn. 3, 9–44.
    • (1988) Mach. Learn , vol.3 , pp. 9-44
    • Sutton, R.1
  • 48
    • 33750293964
    • Bandit based Monte Carlo planning
    • (eds J Fürnkranz, T Scheffer, M Spiliopoulou), Berlin, Germany: Springer
    • Kocsis L, Szepesvari C. 2006 Bandit based Monte Carlo planning. In Machine learning: ECML 2006 (eds J Fürnkranz, T Scheffer, M Spiliopoulou), pp. 282–293. Berlin, Germany: Springer.
    • (2006) Machine learning: ECML 2006 , pp. 282-293
    • Kocsis, L.1    Szepesvari, C.2
  • 49
    • 56449110907
    • Sample-based learning and search with permanent and transient memories
    • New York, NY: Association for Computing Machinery
    • Silver D, Sutton R, Muller M. 2008 Sample-based learning and search with permanent and transient memories. In Proc. 25th Int. Conf. on Machine Learning, pp. 968–975. New York, NY: Association for Computing Machinery.
    • (2008) Proc. 25th Int. Conf. on Machine Learning , pp. 968-975
    • Silver, D.1    Sutton, R.2    Muller, M.3
  • 50
    • 85132026293
    • Integrated architectures for learning, planning, reacting based on approximating dynamic programming
    • San Francisco, CA: Morgan Kaufmann
    • Sutton R. 1990 Integrated architectures for learning, planning, reacting based on approximating dynamic programming. Proc. Seventh Int. Conf. on Machine Learning, pp. 216–224. San Francisco, CA: Morgan Kaufmann.
    • (1990) Proc. Seventh Int. Conf. on Machine Learning , pp. 216-224
    • Sutton, R.1
  • 51
    • 0034275416
    • Learning to play chess using temporal differences
    • Baxter J, Tridgell A, Weaver L. 2000 Learning to play chess using temporal differences. Mach. Learn. 40, 243–263. (doi:10.1023/A:1007634325138)
    • (2000) Mach. Learn , vol.40 , pp. 243-263
    • Baxter, J.1    Tridgell, A.2    Weaver, L.3
  • 52
    • 84858720579
    • Bootstrapping from game tree search
    • (eds Y Bengio, D Schuurmans, J Lafferty, C Williams, A Culotta), Red Hook, NY: Curran Associates
    • Veness J, Silver D, Uther WT, Blair A. 2009 Bootstrapping from game tree search. In NIPS, vol. 19 (eds Y Bengio, D Schuurmans, J Lafferty, C Williams, A Culotta), pp. 1937–1945. Red Hook, NY: Curran Associates.
    • (2009) NIPS , vol.19 , pp. 1937-1945
    • Veness, J.1    Silver, D.2    Uther, W.T.3    Blair, A.4
  • 53
    • 0001158047
    • Improving generalization for temporal difference learning: The successor representation
    • Dayan P. 1993 Improving generalization for temporal difference learning: the successor representation. Neural Comput. 5, 613–624. (doi:10.1162/neco.1993.5.4.613)
    • (1993) Neural Comput , vol.5 , pp. 613-624
    • Dayan, P.1
  • 54
    • 84922015064
    • TD models: Modeling the world at a mixture of time scales
    • (eds A Prieditis, SJ Russell), San Mateo, CA: Morgan Kaufmann
    • Sutton RS. 1995 TD models: modeling the world at a mixture of time scales. In ICML, vol. 12 (eds A Prieditis, SJ Russell), pp. 531–539. San Mateo, CA: Morgan Kaufmann.
    • (1995) ICML , vol.12 , pp. 531-539
    • Sutton, R.S.1
  • 55
    • 84867135062
    • Compositional planning using optimal option models
    • (eds J Langford, J Pineau), New York, NY: Omni Press
    • Silver D, Ciosek K. 2012 Compositional planning using optimal option models. In Proc. 29th Int. Conf. on Machine Learning, ICML ’12 (eds J Langford, J Pineau), pp. 1063–1070. New York, NY: Omni Press.
    • (2012) Proc. 29th Int. Conf. on Machine Learning, ICML ’12 , pp. 1063-1070
    • Silver, D.1    Ciosek, K.2
  • 56
    • 84907487070
    • Model-based hierarchical reinforcement learning and human action control
    • Botvinick M, Weinstein A. 2014 Model-based hierarchical reinforcement learning and human action control. Phil. Trans. R. Soc. B 369, 20130480. (doi:10.1098/rstb.2013.0480)
    • (2014) Phil. Trans. R. Soc. B , vol.369 , pp. 20130480
    • Botvinick, M.1    Weinstein, A.2
  • 57
    • 0033170372
    • Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning
    • Sutton R, Precup D, Singh S. 1999 Between MDPs and semi-MDPs: a framework for temporal abstraction in reinforcement learning. Artif. Intell. 112, 181–211. (doi:10.1016/S0004-3702(99)00052-1)
    • (1999) Artif. Intell , vol.112 , pp. 181-211
    • Sutton, R.1    Precup, D.2    Singh, S.3
  • 58
    • 84898967780
    • Policy search via density estimation
    • (eds SA Solla, TK Leen, K Müller), Cambridge, MA: MIT Press
    • Ng AY, Parr R, Koller D. 1999 Policy search via density estimation. In NIPS (eds SA Solla, TK Leen, K Müller), pp. 1022–1028. Cambridge, MA: MIT Press.
    • (1999) NIPS , pp. 1022-1028
    • Ng, A.Y.1    Parr, R.2    Koller, D.3
  • 59
  • 60
    • 0002692217
    • Actions and habits: The development of behavioural autonomy
    • Dickinson A. 1985 Actions and habits: the development of behavioural autonomy. Phil. Trans. R. Soc. Lond. B 308, 67–78. (doi:10.1098/rstb.1985.0010)
    • (1985) Phil. Trans. R. Soc. Lond. B , vol.308 , pp. 67-78
    • Dickinson, A.1
  • 62
    • 1942421151
    • Bayes meets Bellman: The Gaussian process approach to temporal difference learning
    • (eds T Fawcett, N Mishra), Washington, DC: AAAI Press
    • Engel Y, Mannor S, Meir R. 2003 Bayes meets Bellman: the Gaussian process approach to temporal difference learning. In ICML (eds T Fawcett, N Mishra), pp. 154–161. Washington, DC: AAAI Press.
    • (2003) ICML , pp. 154-161
    • Engel, Y.1    Mannor, S.2    Meir, R.3
  • 64
    • 0027684215
    • Prioritized sweeping: Reinforcement learning with less data and less time
    • Moore AW, Atkeson CG. 1993 Prioritized sweeping: reinforcement learning with less data and less time. Mach. Learn. 13, 103–130. (doi:10.1007/BF00993104)
    • (1993) Mach. Learn , vol.13 , pp. 103-130
    • Moore, A.W.1    Atkeson, C.G.2
  • 65
    • 84946268134
    • Variations in the sensitivity of instrumental responding to reinforcer devaluation
    • Adams CD. 1982 Variations in the sensitivity of instrumental responding to reinforcer devaluation. Q. J. Exp. Psychol. 34, 77–98.
    • (1982) Q. J. Exp. Psychol , vol.34 , pp. 77-98
    • Adams, C.D.1
  • 67
    • 84868291906
    • Technical Report no. UCB/EECS-2011-119. EECS Department, University of California, Berkeley, CA
    • Hay N, Russell SJ. 2011 Metareasoning for Monte Carlo tree search. Technical Report no. UCB/EECS-2011-119. EECS Department, University of California, Berkeley, CA.
    • (2011) Metareasoning for Monte Carlo tree search
    • Hay, N.1    Russell, S.J.2
  • 68
    • 85012688561
    • Princeton, NJ: Princeton University Press
    • Bellman RE. 1957 Dynamic programming. Princeton, NJ: Princeton University Press.
    • (1957) Dynamic programming
    • Bellman, R.E.1
  • 69
    • 72049125602
    • Human and rodent homologies in action control: Corticostriatal determinants of goal-directed and habitual action
    • Balleine B, O’Doherty JP. 2010 Human and rodent homologies in action control: corticostriatal determinants of goal-directed and habitual action. Neuropsychopharmacology 35, 48–69. (doi:10.1038/npp.2009.131)
    • (2010) Neuropsychopharmacology , vol.35 , pp. 48-69
    • Balleine, B.1    O’Doherty, J.P.2
  • 70
    • 79960836887
    • Parallel associative processing in the dorsal striatum: Segregation of stimulus–response and cognitive control subregions
    • Devan BD, Hong NS, McDonald RJ. 2011 Parallel associative processing in the dorsal striatum: segregation of stimulus–response and cognitive control subregions. Neurobiol. Learn. Mem. 96, 95–120. (doi:10.1016/j.nlm.2011.06.002)
    • (2011) Neurobiol. Learn. Mem , vol.96 , pp. 95-120
    • Devan, B.D.1    Hong, N.S.2    McDonald, R.J.3
  • 71
    • 77953675717
    • Differential dynamics of activity changes in dorsolateral and dorsomedial striatal loops during learning
    • Thorn CA, Atallah H, Howe M, Graybiel AM. 2010 Differential dynamics of activity changes in dorsolateral and dorsomedial striatal loops during learning. Neuron 66, 781–795. (doi:10.1016/j.neuron.2010.04.036)
    • (2010) Neuron , vol.66 , pp. 781-795
    • Thorn, C.A.1    Atallah, H.2    Howe, M.3    Graybiel, A.M.4
  • 72
    • 0025321039
    • Functional architecture of basal ganglia circuits: Neural substrates of parallel processing
    • Alexander GE, Crutcher MD. 1990 Functional architecture of basal ganglia circuits: neural substrates of parallel processing. Trends Neurosci. 13, 266–271. (doi:10.1016/0166-2236(90)90107-L)
    • (1990) Trends Neurosci , vol.13 , pp. 266-271
    • Alexander, G.E.1    Crutcher, M.D.2
  • 73
    • 0022930826
    • Parallel organization of functionally segregated circuits linking basal ganglia and cortex
    • Alexander GE, DeLong MR, Strick PL. 1986 Parallel organization of functionally segregated circuits linking basal ganglia and cortex. Annu. Rev. Neurosci. 9, 357–381. (doi:10.1146/annurev.ne.09.030186.002041)
    • (1986) Annu. Rev. Neurosci , vol.9 , pp. 357-381
    • Alexander, G.E.1    Delong, M.R.2    Strick, P.L.3
  • 74
    • 33744550336
    • Anatomy of a decision: Striato-orbitofrontal interactions in reinforcement learning, decision making, reversal
    • Frank MJ, Claus ED. 2006 Anatomy of a decision: striato-orbitofrontal interactions in reinforcement learning, decision making, reversal. Psychol. Rev. 113, 300–326. (doi:10.1037/0033-295X.113.2.300)
    • (2006) Psychol. Rev , vol.113 , pp. 300-326
    • Frank, M.J.1    Claus, E.D.2
  • 75
    • 0031801210
    • Goal-directed instrumental action: Contingency and incentive learning and their cortical substrates
    • Balleine B, Dickinson A. 1998 Goal-directed instrumental action: contingency and incentive learning and their cortical substrates. Neuropharmacology 37, 407–419. (doi:10.1016/S0028-3908(98)00033-1)
    • (1998) Neuropharmacology , vol.37 , pp. 407-419
    • Balleine, B.1    Dickinson, A.2
  • 76
    • 0141502087
    • Lesions of mediodorsal thalamus and anterior thalamic nuclei produce dissociable effects on instrumental conditioning in rats
    • Corbit LH, Muir JL, Balleine BW. 2003 Lesions of mediodorsal thalamus and anterior thalamic nuclei produce dissociable effects on instrumental conditioning in rats. Eur. J. Neurosci. 18, 1286–1294. (doi:10.1046/j.1460-9568.2003.02833.x)
    • (2003) Eur. J. Neurosci , vol.18 , pp. 1286-1294
    • Corbit, L.H.1    Muir, J.L.2    Balleine, B.W.3
  • 77
    • 37549066620
    • Lights, camembert, action! The role of human orbitofrontal cortex in encoding stimuli, rewards, choices
    • O’Doherty JP. 2007 Lights, camembert, action! The role of human orbitofrontal cortex in encoding stimuli, rewards, choices. Ann. NY Acad. Sci. 1121, 254–272. (doi:10.1196/annals.1401.036)
    • (2007) Ann. NY Acad. Sci , vol.1121 , pp. 254-272
    • O’Doherty, J.P.1
  • 78
    • 84865459435
    • Corticostriatal connectivity underlies individual differences in the balance between habitual and goal-directed action control
    • de Wit S, Watson P, Harsay HA, Cohen MX, van de Vijver I, Ridderinkhof KR. 2012 Corticostriatal connectivity underlies individual differences in the balance between habitual and goal-directed action control. J. Neurosci. 32, 12 066–12 075. (doi:10.1523/JNEUROSCI.1088-12.2012)
    • (2012) J. Neurosci , vol.32 , pp. 12066-12075
    • de Wit, S.1    Watson, P.2    Harsay, H.A.3    Cohen, M.X.4    van de Vijver, I.5    Ridderinkhof, K.R.6
  • 79
    • 79951839136
    • Neural correlates of instrumental contingency learning: Differential effects of action–reward conjunction and disjunction
    • Liljeholm M, Tricomi E, O’Doherty JP, Balleine BW. 2011 Neural correlates of instrumental contingency learning: differential effects of action–reward conjunction and disjunction. J. Neurosci. 31, 2474–2480. (doi:10.1523/JNEUROSCI.3354-10.2011)
    • (2011) J. Neurosci , vol.31 , pp. 2474-2480
    • Liljeholm, M.1    Tricomi, E.2    O’Doherty, J.P.3    Balleine, B.W.4
  • 80
    • 84904384936
    • Model-based and model-free Pavlovian reward learning: Revaluation, revision and revelation
    • Dayan P, Berridge K. 2014 Model-based and model-free Pavlovian reward learning: revaluation, revision and revelation. Cogn. Affect. Behav. Neurosci. 14, 473–492. (doi:10.3758/s13415-014-0277-8)
    • (2014) Cogn. Affect. Behav. Neurosci , vol.14 , pp. 473-492
    • Dayan, P.1    Berridge, K.2
  • 81
    • 84862244642
    • Neural correlates of specific and general Pavlovian-to-instrumental transfer within human amygdalar subregions: A high-resolution fMRI study
    • Prevost C, Liljeholm M, Tyszka JM, O’Doherty JP. 2012 Neural correlates of specific and general Pavlovian-to-instrumental transfer within human amygdalar subregions: a high-resolution fMRI study. J. Neurosci. 32, 8383–8390. (doi:10.1523/JNEUROSCI.6237-11.2012)
    • (2012) J. Neurosci , vol.32 , pp. 8383-8390
    • Prevost, C.1    Liljeholm, M.2    Tyszka, J.M.3    O’Doherty, J.P.4
  • 82
    • 0000121367
    • Associations between the discriminative stimulus and the reinforcer in instrumental learning
    • Colwill RM, Rescorla RA. 1988 Associations between the discriminative stimulus and the reinforcer in instrumental learning. J. Exp. Psychol. Anim. Behav. Process. 14, 155–164. (doi:10.1037/0097-7403.14.2.155)
    • (1988) J. Exp. Psychol. Anim. Behav. Process , vol.14 , pp. 155-164
    • Colwill, R.M.1    Rescorla, R.A.2
  • 83
    • 0000772114
    • Discriminative conditioning. I. A discriminative property of conditioned anticipation
    • Estes W. 1943 Discriminative conditioning. I. A discriminative property of conditioned anticipation. J. Exp. Psychol. 32, 150–155. (doi:10.1037/h0058316)
    • (1943) J. Exp. Psychol , vol.32 , pp. 150-155
    • Estes, W.1
  • 84
    • 84893764491
    • Dorsal and ventral streams: The distinct role of striatal subregions in the acquisition and performance of goal-directed actions
    • Hart G, Leung BK, Balleine BW. 2013 Dorsal and ventral streams: the distinct role of striatal subregions in the acquisition and performance of goal-directed actions. Neurobiol. Learn. Mem. 108, 104–118. (doi:10.1016/j.nlm.2013.11.003)
    • (2013) Neurobiol. Learn. Mem , vol.108 , pp. 104-118
    • Hart, G.1    Leung, B.K.2    Balleine, B.W.3
  • 85
    • 0014085947
    • Two-process learning theory: Relationships between Pavlovian conditioning and instrumental learning
    • Rescorla RA, Solomon RL. 1967 Two-process learning theory: relationships between Pavlovian conditioning and instrumental learning. Psychol. Rev. 74, 151–182. (doi:10.1037/h0024475)
    • (1967) Psychol. Rev , vol.74 , pp. 151-182
    • Rescorla, R.A.1    Solomon, R.L.2
  • 86
    • 84887030057
    • The role of the amygdala–striatal pathway in the acquisition and performance of goal-directed instrumental actions
    • Corbit LH, Leung BK, Balleine BW. 2013 The role of the amygdala–striatal pathway in the acquisition and performance of goal-directed instrumental actions. J. Neurosci. 33, 17682–17690. (doi:10.1523/JNEUROSCI.3271-13.2013)
    • (2013) J. Neurosci , vol.33 , pp. 17682-17690
    • Corbit, L.H.1    Leung, B.K.2    Balleine, B.W.3
  • 87
    • 84856002548
    • Amygdala central nucleus interacts with dorsolateral striatum to regulate the acquisition of habits
    • Lingawi NW, Balleine BW. 2012 Amygdala central nucleus interacts with dorsolateral striatum to regulate the acquisition of habits. J. Neurosci. 32, 1073–1081. (doi:10.1523/JNEUROSCI.4806-11.2012)
    • (2012) J. Neurosci , vol.32 , pp. 1073-1081
    • Lingawi, N.W.1    Balleine, B.W.2
  • 88
    • 77955362035
    • Neurocomputational models of motor and cognitive deficits in Parkinson’s disease
    • Wiecki TV, Frank MJ. 2010 Neurocomputational models of motor and cognitive deficits in Parkinson’s disease. Prog. Brain Res. 183, 275–297. (doi:10.1016/S0079-6123(10)83014-6)
    • (2010) Prog. Brain Res , vol.183 , pp. 275-297
    • Wiecki, T.V.1    Frank, M.J.2
  • 89
    • 33645458694
    • Reverse replay of behavioural sequences in hippocampal place cells during the awake state
    • Foster DJ, Wilson MA. 2006 Reverse replay of behavioural sequences in hippocampal place cells during the awake state. Nature 440, 680–683. (doi:10.1038/nature04587)
    • (2006) Nature , vol.440 , pp. 680-683
    • Foster, D.J.1    Wilson, M.A.2
  • 90
    • 36348999880
    • Hippocampal theta sequences
    • Foster DJ, Wilson MA. 2007 Hippocampal theta sequences. Hippocampus 17, 1093–1099. (doi:10.1002/hipo.20345)
    • (2007) Hippocampus , vol.17 , pp. 1093-1099
    • Foster, D.J.1    Wilson, M.A.2
  • 91
    • 40849087850
    • Integrating hippocampus and striatum in decision-making
    • Johnson A, van der Meer MAA, Redish AD. 2007 Integrating hippocampus and striatum in decision-making. Curr. Opin. Neurobiol. 17, 692–697. (doi:10.1016/j.conb.2008.01.003)
    • (2007) Curr. Opin. Neurobiol , vol.17 , pp. 692-697
    • Johnson, A.1    Van Der Meer, M.A.A.2    Redish, A.D.3
  • 92
    • 84877578934
    • Hippocampal place-cell sequences depict future paths to remembered goals
    • Pfeiffer BE, Foster DJ. 2013 Hippocampal place-cell sequences depict future paths to remembered goals. Nature 497, 74–79. (doi:10.1038/nature12112)
    • (2013) Nature , vol.497 , pp. 74-79
    • Pfeiffer, B.E.1    Foster, D.J.2
  • 93
    • 29144438020
    • Theta rhythms coordinate hippocampal–prefrontal interactions in a spatial memory task
    • Jones MW, Wilson MA. 2005 Theta rhythms coordinate hippocampal–prefrontal interactions in a spatial memory task. PLoS Biol. 3, e402. (doi:10.1371/journal.pbio.0030402)
    • (2005) PLoS Biol , vol.3
    • Jones, M.W.1    Wilson, M.A.2
  • 96
    • 0033168618
    • Prospective coding for objects in primate prefrontal cortex
    • Rainer G, Rao SC, Miller EK. 1999 Prospective coding for objects in primate prefrontal cortex. J. Neurosci. 19, 5493–5505.
    • (1999) J. Neurosci , vol.19 , pp. 5493-5505
    • Rainer, G.1    Rao, S.C.2    Miller, E.K.3
  • 97
    • 0025948773
    • Neural organization for the long-term memory of paired associates
    • Sakai K, Miyashita Y. 1991 Neural organization for the long-term memory of paired associates. Nature 354, 152–155. (doi:10.1038/354152a0)
    • (1991) Nature , vol.354 , pp. 152-155
    • Sakai, K.1    Miyashita, Y.2
  • 98
    • 15244346900
    • Lesion to the nigrostriatal dopamine system disrupts stimulus–response habit formation
    • Faure A, Haberland U, Condé F, El Massioui N. 2005 Lesion to the nigrostriatal dopamine system disrupts stimulus–response habit formation. J. Neurosci. 25, 2771–2780. (doi:10.1523/JNEUROSCI.3894-04.2005)
    • (2005) J. Neurosci , vol.25 , pp. 2771-2780
    • Faure, A.1    Haberland, U.2    Condé, F.3    El Massioui, N.4
  • 100
    • 84155183278
    • NMDA receptors in dopaminergic neurons are crucial for habit learning
    • Wang LP, Li F, Wang D, Xie K, Wang D, Shen X, Tsien JZ. 2011 NMDA receptors in dopaminergic neurons are crucial for habit learning. Neuron 72, 1055–1066. (doi:10.1016/j.neuron.2011.10.019)
    • (2011) Neuron , vol.72 , pp. 1055-1066
    • Wang, L.P.1    Li, F.2    Wang, D.3    Xie, K.4    Wang, D.5    Shen, X.6    Tsien, J.Z.7
  • 101
    • 0033913868
    • Dissociation of Pavlovian and instrumental incentive learning under dopamine antagonists
    • Dickinson A, Smith J, Mirenowicz J. 2000 Dissociation of Pavlovian and instrumental incentive learning under dopamine antagonists. Behav. Neurosci. 114, 468–483. (doi:10.1037/0735-7044.114.3.468)
    • (2000) Behav. Neurosci , vol.114 , pp. 468-483
    • Dickinson, A.1    Smith, J.2    Mirenowicz, J.3
  • 102
    • 84874655309
    • Instant transformation of learned repulsion into motivational ‘wanting’
    • Robinson MJF, Berridge KC. 2013 Instant transformation of learned repulsion into motivational ‘wanting’. Curr. Biol. 23, 282–289. (doi:10.1016/j.cub.2013.01.016)
    • (2013) Curr. Biol , vol.23 , pp. 282-289
    • Robinson, M.J.F.1    Berridge, K.C.2
  • 103
    • 84906263462
    • The computational and neural basis of cognitive control: Charted territory and new frontiers
    • Botvinick M, Cohen J. 2014 The computational and neural basis of cognitive control: charted territory and new frontiers. Cogn. Sci. (doi:10.1111/cogs.12126)
    • (2014) Cogn. Sci
    • Botvinick, M.1    Cohen, J.2
  • 104
    • 0035382933
    • Interactions between frontal cortex and basal ganglia in working memory: A computational model
    • Frank MJ, Loughry B, O’Reilly RC. 2001 Interactions between frontal cortex and basal ganglia in working memory: a computational model. Cogn. Affect. Behav. Neurosci. 1, 137–160. (doi:10.3758/CABN.1.2.137)
    • (2001) Cogn. Affect. Behav. Neurosci , vol.1 , pp. 137-160
    • Frank, M.J.1    Loughry, B.2    O’Reilly, R.C.3
  • 105
    • 33645889031
    • Banishing the homunculus: Making working memory work
    • Hazy TE, Frank MJ, O’Reilly RC. 2006 Banishing the homunculus: making working memory work. Neuroscience 139, 105–118. (doi:10.1016/j.neuroscience.2005.04.067)
    • (2006) Neuroscience , vol.139 , pp. 105-118
    • Hazy, T.E.1    Frank, M.J.2    O’Reilly, R.C.3
  • 106
    • 0034928713
    • An integrative theory of prefrontal cortex function
    • Miller EK, Cohen JD. 2001 An integrative theory of prefrontal cortex function. Annu. Rev. Neurosci. 24, 167–202. (doi:10.1146/annurev.neuro.24.1.167)
    • (2001) Annu. Rev. Neurosci , vol.24 , pp. 167-202
    • Miller, E.K.1    Cohen, J.D.2
  • 108
    • 33644927837
    • Making working memory work: A computational model of learning in the prefrontal cortex and basal ganglia
    • O’Reilly RC, Frank MJ. 2006 Making working memory work: a computational model of learning in the prefrontal cortex and basal ganglia. Neural Comput. 18, 283–328. (doi:10.1162/089976606775093909)
    • (2006) Neural Comput , vol.18 , pp. 283-328
    • O’Reilly, R.C.1    Frank, M.J.2
  • 109
    • 0036173816
    • Prefrontal cortex and dynamic categorization tasks: Representational organization and neuromodulatory control
    • O’Reilly RC, Noelle DC, Braver TS, Cohen J. 2002 Prefrontal cortex and dynamic categorization tasks: representational organization and neuromodulatory control. Cereb. Cortex 12, 246–257. (doi:10.1093/cercor/12.3.246)
    • (2002) Cereb. Cortex , vol.12 , pp. 246-257
    • O’Reilly, R.C.1    Noelle, D.C.2    Braver, T.S.3    Cohen, J.4
  • 111
    • 70350440484
    • Rational adaptation under task and processing constraints: Implications for testing theories of cognition and action
    • Howes A, Lewis RL, Vera A. 2009 Rational adaptation under task and processing constraints: implications for testing theories of cognition and action. Psychol. Rev. 116, 717–751. (doi:10.1037/a0017187)
    • (2009) Psychol. Rev , vol.116 , pp. 717-751
    • Howes, A.1    Lewis, R.L.2    Vera, A.3
  • 112
    • 84899629222
    • Computational rationality: Linking mechanism and behavior through utility maximization
    • Lewis RL, Howes A, Singh S. 2014 Computational rationality: linking mechanism and behavior through utility maximization. Top. Cogn. Sci. 6, 279–311. (doi:10.1111/tops.12086)
    • (2014) Top. Cogn. Sci , vol.6 , pp. 279-311
    • Lewis, R.L.1    Howes, A.2    Singh, S.3
  • 113
    • 33847205014
    • Planning for the future by western scrub-jays
    • Raby CR, Alexis DM, Dickinson A, Clayton NS. 2007 Planning for the future by western scrub-jays. Nature 445, 919–921. (doi:10.1038/nature05575)
    • (2007) Nature , vol.445 , pp. 919-921
    • Raby, C.R.1    Alexis, D.M.2    Dickinson, A.3    Clayton, N.S.4
  • 114
    • 84960566726
    • Anomalies in intertemporal choice: Evidence and an interpretation
    • Loewenstein G, Prelec D. 1992 Anomalies in intertemporal choice: evidence and an interpretation. Q. J. Econ. 107, 573–597. (doi:10.2307/2118482)
    • (1992) Q. J. Econ , vol.107 , pp. 573-597
    • Loewenstein, G.1    Prelec, D.2
  • 116
    • 70350566799
    • Hierarchically organized behavior and its neural foundations: A reinforcement learning perspective
    • Botvinick M, Niv Y, Barto AG. 2009 Hierarchically organized behavior and its neural foundations: a reinforcement learning perspective. Cognition 113, 262–280. (doi:10.1016/j.cognition.2008.08.011)
    • (2009) Cognition , vol.113 , pp. 262-280
    • Botvinick, M.1    Niv, Y.2    Barto, A.G.3
  • 118
    • 78649604962
    • Evidence for model-based action planning in a sequential finger movement task
    • Fermin A, Yoshida T, Ito M, Yoshimoto J, Doya K. 2010 Evidence for model-based action planning in a sequential finger movement task. J. Mot. Behav. 42, 371–379. (doi:10.1080/00222895.2010.526467)
    • (2010) J. Mot. Behav , vol.42 , pp. 371-379
    • Fermin, A.1    Yoshida, T.2    Ito, M.3    Yoshimoto, J.4    Doya, K.5
  • 119
    • 84889680697
    • The intrinsic cost of cognitive control
    • Kool W, Botvinick M. 2013 The intrinsic cost of cognitive control. Behav. Brain Sci. 36, 697–698. (doi:10.1017/S0140525X1300109X)
    • (2013) Behav. Brain Sci , vol.36 , pp. 697-698
    • Kool, W.1    Botvinick, M.2
  • 120
    • 84880660982
    • The expected value of control: An integrative theory of anterior cingulate cortex function
    • Shenhav A, Botvinick M, Cohen JD. 2013 The expected value of control: an integrative theory of anterior cingulate cortex function. Neuron 79, 217–240. (doi:10.1016/j.neuron.2013.07.007)
    • (2013) Neuron , vol.79 , pp. 217-240
    • Shenhav, A.1    Botvinick, M.2    Cohen, J.D.3
  • 121
    • 0000827179
    • Boxes: An experiment in adaptive control
    • Michie D, Chambers R. 1968 Boxes: an experiment in adaptive control. Mach. Intell. 2, 137–152.
    • (1968) Mach. Intell , vol.2 , pp. 137-152
    • Michie, D.1    Chambers, R.2
  • 123
    • 0001778486
    • The impact of chess research on cognitive science
    • Charness N. 1992 The impact of chess research on cognitive science. Psychol. Res. 54, 4–9. (doi:10.1007/BF01359217)
    • (1992) Psychol. Res , vol.54 , pp. 4-9
    • Charness, N.1
  • 125
    • 0001275820
    • Chess-playing programs and the problem of complexity
    • Newell A, Shaw JC, Simon HA. 1958 Chess-playing programs and the problem of complexity. IBM J. Res. Dev. 2, 320–335. (doi:10.1147/rd.24.0320)
    • (1958) IBM J. Res. Dev , vol.2 , pp. 320-335
    • Newell, A.1    Shaw, J.C.2    Simon, H.A.3
  • 126
    • 0003430412
    • Englewood Cliffs, NJ: Prentice-Hall
    • Newell A et al. 1972 Human problem solving. Englewood Cliffs, NJ: Prentice-Hall.
    • (1972) Human problem solving
    • Newell, A.1
  • 127
    • 0002285834
    • Problem solving and learning
    • Anderson JR. 1993 Problem solving and learning. Am. Psychol. 48, 35–44. (doi:10.1037/0003-066X.48.1.35)
    • (1993) Am. Psychol , vol.48 , pp. 35-44
    • Anderson, J.R.1
  • 128
    • 0000004705
    • The functional equivalence of problem solving skills
    • Simon HA. 1975 The functional equivalence of problem solving skills. Cogn. Psychol. 7, 268–288. (doi:10.1016/0010-0285(75)90012-2)
    • (1975) Cogn. Psychol , vol.7 , pp. 268-288
    • Simon, H.A.1
  • 129
    • 84859371025
    • Bonsai trees in your head: How the Pavlovian system sculpts goal-directed choices by pruning decision trees
    • Huys QJ, Eshel N, O’Nions E, Sheridan L, Dayan P, Roiser JP. 2012 Bonsai trees in your head: how the Pavlovian system sculpts goal-directed choices by pruning decision trees. PLoS Comput. Biol. 8, e1002410. (doi:10.1371/journal.pcbi.1002410)
    • (2012) PLoS Comput. Biol , vol.8
    • Huys, Q.J.1    Eshel, N.2    O’Nions, E.3    Sheridan, L.4    Dayan, P.5    Roiser, J.P.6


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.