Volume 112, Issue 37, 2015, Pages 11708-11713

Evidence integration in model-based tree search

Author keywords

Drift diffusion model; Reinforcement learning; Reward based decision making

Indexed keywords

ARTICLE; ARTIFICIAL INTELLIGENCE; CONTROLLED STUDY; DECISION MAKING; DECISION TREE; EVIDENCE INTEGRATION MODEL; HUMAN; PRIORITY JOURNAL; PROCESS MODEL; RESPONSE TIME; REWARD; SLOPE FACTOR; BAYES THEOREM; BIOLOGICAL MODEL; COMPUTER SIMULATION; LEARNING; REINFORCEMENT; REPRODUCIBILITY;

EID: 84941703217     PISSN: 00278424     EISSN: 10916490     Source Type: Journal    
DOI: 10.1073/pnas.1505483112     Document Type: Article
Times cited: 38

References (57)
  • 1
    • 0027637720
    • Decision field theory: A dynamic-cognitive approach to decision making in an uncertain environment
    • Busemeyer JR, Townsend JT (1993) Decision field theory: A dynamic-cognitive approach to decision making in an uncertain environment. Psychol Rev 100(3):432-459.
    • (1993) Psychol Rev , vol.100 , Issue.3 , pp. 432-459
    • Busemeyer, J.R.1    Townsend, J.T.2
  • 2
    • 78449253934
    • The drift diffusion model can account for the accuracy and reaction time of value-based choices under high and low time pressure
    • Milosavljevic M, Malmaud J, Huth A, Koch C, Rangel A (2010) The drift diffusion model can account for the accuracy and reaction time of value-based choices under high and low time pressure. Judgm Decis Mak 5(6):437-449.
    • (2010) Judgm Decis Mak , vol.5 , Issue.6 , pp. 437-449
    • Milosavljevic, M.1    Malmaud, J.2    Huth, A.3    Koch, C.4    Rangel, A.5
  • 3
    • 77957264374
    • Visual fixations and the computation and comparison of value in simple choice
    • Krajbich I, Armel C, Rangel A (2010) Visual fixations and the computation and comparison of value in simple choice. Nat Neurosci 13(10):1292-1298.
    • (2010) Nat Neurosci , vol.13 , Issue.10 , pp. 1292-1298
    • Krajbich, I.1    Armel, C.2    Rangel, A.3
  • 4
    • 80052015081
    • Multialternative drift-diffusion model predicts the relationship between visual fixations and choice in value-based decisions
    • Krajbich I, Rangel A (2011) Multialternative drift-diffusion model predicts the relationship between visual fixations and choice in value-based decisions. Proc Natl Acad Sci USA 108(33):13852-13857.
    • (2011) Proc Natl Acad Sci USA , vol.108 , Issue.33 , pp. 13852-13857
    • Krajbich, I.1    Rangel, A.2
  • 5
    • 84901460554
    • Stochastic choice: An optimizing neuroeconomic model
    • Woodford M (2014) Stochastic choice: An optimizing neuroeconomic model. Am Econ Rev 104(5):495-500.
    • (2014) Am Econ Rev , vol.104 , Issue.5 , pp. 495-500
    • Woodford, M.1
  • 6
    • 84877264809
    • Model-based reinforcement learning as cognitive search: Neuro-computational theories
    • eds Todd PM, Hills TT, Robbins TW (MIT Press, Cambridge, MA)
    • Daw ND (2012) Model-based reinforcement learning as cognitive search: neuro-computational theories. Cognitive Search: Evolution Algorithms and the Brain, eds Todd PM, Hills TT, Robbins TW (MIT Press, Cambridge, MA), pp 195-208.
    • (2012) Cognitive Search: Evolution Algorithms and the Brain , pp. 195-208
    • Daw, N.D.1
  • 7
    • 28044450875
    • Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control
    • Daw ND, Niv Y, Dayan P (2005) Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nat Neurosci 8(12): 1704-1711.
    • (2005) Nat Neurosci , vol.8 , Issue.12 , pp. 1704-1711
    • Daw, N.D.1    Niv, Y.2    Dayan, P.3
  • 8
    • 84885802926
    • Goals and habits in the brain
    • Dolan RJ, Dayan P (2013) Goals and habits in the brain. Neuron 80(2):312-325.
    • (2013) Neuron , vol.80 , Issue.2 , pp. 312-325
    • Dolan, R.J.1    Dayan, P.2
  • 9
    • 84859371025
    • Bonsai trees in your head: How the Pavlovian system sculpts goal-directed choices by pruning decision trees
    • Huys QJM, et al. (2012) Bonsai trees in your head: How the Pavlovian system sculpts goal-directed choices by pruning decision trees. PLOS Comput Biol 8(3):e1002410.
    • (2012) PLOS Comput Biol , vol.8 , Issue.3 , pp. e1002410
    • Huys, Q.J.M.1
  • 10
    • 84924325916
    • Interplay of approximate planning strategies
    • Huys QJM, et al. (2015) Interplay of approximate planning strategies. Proc Natl Acad Sci USA 112(10):3098-3103.
    • (2015) Proc Natl Acad Sci USA , vol.112 , Issue.10 , pp. 3098-3103
    • Huys, Q.J.M.1
  • 12
    • 84859297479
    • Dissociating hippocampal and striatal contributions to sequential prediction learning
    • Bornstein AM, Daw ND (2012) Dissociating hippocampal and striatal contributions to sequential prediction learning. Eur J Neurosci 35(7):1011-1023.
    • (2012) Eur J Neurosci , vol.35 , Issue.7 , pp. 1011-1023
    • Bornstein, A.M.1    Daw, N.D.2
  • 13
    • 84892777219
    • Cortical and hippocampal correlates of deliberation during model-based decisions for rewards in humans
    • Bornstein AM, Daw ND (2013) Cortical and hippocampal correlates of deliberation during model-based decisions for rewards in humans. PLOS Comput Biol 9(12): e1003387.
    • (2013) PLOS Comput Biol , vol.9 , Issue.12 , pp. e1003387
    • Bornstein, A.M.1    Daw, N.D.2
  • 14
    • 77953260848
    • States versus rewards: Dissociable neural prediction error signals underlying model-based and model-free reinforcement learning
    • Gläscher J, Daw N, Dayan P, O'Doherty JP (2010) States versus rewards: Dissociable neural prediction error signals underlying model-based and model-free reinforcement learning. Neuron 66(4):585-595.
    • (2010) Neuron , vol.66 , Issue.4 , pp. 585-595
    • Gläscher, J.1    Daw, N.2    Dayan, P.3    O'Doherty, J.P.4
  • 15
    • 79952746011
    • Model-based influences on humans' choices and striatal prediction errors
    • Daw ND, Gershman SJ, Seymour B, Dayan P, Dolan RJ (2011) Model-based influences on humans' choices and striatal prediction errors. Neuron 69(6):1204-1215.
    • (2011) Neuron , vol.69 , Issue.6 , pp. 1204-1215
    • Daw, N.D.1    Gershman, S.J.2    Seymour, B.3    Dayan, P.4    Dolan, R.J.5
  • 16
    • 79955709936
    • Neural correlates of forward planning in a spatial decision task in humans
    • Simon DA, Daw ND (2011) Neural correlates of forward planning in a spatial decision task in humans. J Neurosci 31(14):5526-5539.
    • (2011) J Neurosci , vol.31 , Issue.14 , pp. 5526-5539
    • Simon, D.A.1    Daw, N.D.2
  • 17
    • 84860307045
    • Mapping value based planning and extensively trained choice in the human brain
    • Wunderlich K, Dayan P, Dolan RJ (2012) Mapping value based planning and extensively trained choice in the human brain. Nat Neurosci 15(5):786-791.
    • (2012) Nat Neurosci , vol.15 , Issue.5 , pp. 786-791
    • Wunderlich, K.1    Dayan, P.2    Dolan, R.J.3
  • 18
    • 84877341847
    • The curse of planning: Dissecting multiple reinforcement-learning systems by taxing the central executive
    • Otto AR, Gershman SJ, Markman AB, Daw ND (2013) The curse of planning: Dissecting multiple reinforcement-learning systems by taxing the central executive. Psychol Sci 24(5):751-761.
    • (2013) Psychol Sci , vol.24 , Issue.5 , pp. 751-761
    • Otto, A.R.1    Gershman, S.J.2    Markman, A.B.3    Daw, N.D.4
  • 19
  • 20
    • 84888025934
    • Disruption of dorsolateral prefrontal cortex decreases model-based in favor of model-free control in humans
    • Smittenaar P, FitzGerald THB, Romei V, Wright ND, Dolan RJ (2013) Disruption of dorsolateral prefrontal cortex decreases model-based in favor of model-free control in humans. Neuron 80(4):914-919.
    • (2013) Neuron , vol.80 , Issue.4 , pp. 914-919
    • Smittenaar, P.1    FitzGerald, T.H.B.2    Romei, V.3    Wright, N.D.4    Dolan, R.J.5
  • 21
    • 84899826677
    • Transcranial direct current stimulation of right dorsolateral prefrontal cortex does not affect model-based or model-free reinforcement learning in humans
    • Smittenaar P, Prichard G, FitzGerald THB, Diedrichsen J, Dolan RJ (2014) Transcranial direct current stimulation of right dorsolateral prefrontal cortex does not affect model-based or model-free reinforcement learning in humans. PLoS One 9(1):e86850.
    • (2014) PLoS One , vol.9 , Issue.1 , pp. e86850
    • Smittenaar, P.1    Prichard, G.2    FitzGerald, T.H.B.3    Diedrichsen, J.4    Dolan, R.J.5
  • 22
    • 84864935116
    • Dopamine enhances model-based over model-free choice behavior
    • Wunderlich K, Smittenaar P, Dolan RJ (2012) Dopamine enhances model-based over model-free choice behavior. Neuron 75(3):418-424.
    • (2012) Neuron , vol.75 , Issue.3 , pp. 418-424
    • Wunderlich, K.1    Smittenaar, P.2    Dolan, R.J.3
  • 23
    • 84859737036
    • Goal-directed decision making as probabilistic inference: A computational framework and potential neural correlates
    • Solway A, Botvinick MM (2012) Goal-directed decision making as probabilistic inference: A computational framework and potential neural correlates. Psychol Rev 119(1):120-154.
    • (2012) Psychol Rev , vol.119 , Issue.1 , pp. 120-154
    • Solway, A.1    Botvinick, M.M.2
  • 24
    • 0001429177
    • Dynamic stochastic models for decision making under time constraints
    • Diederich A (1997) Dynamic stochastic models for decision making under time constraints. J Math Psychol 41(3):260-274.
    • (1997) J Math Psychol , vol.41 , Issue.3 , pp. 260-274
    • Diederich, A.1
  • 25
    • 85047684990
    • Multialternative decision field theory: A dynamic connectionist model of decision making
    • Roe RM, Busemeyer JR, Townsend JT (2001) Multialternative decision field theory: A dynamic connectionist model of decision making. Psychol Rev 108(2):370-392.
    • (2001) Psychol Rev , vol.108 , Issue.2 , pp. 370-392
    • Roe, R.M.1    Busemeyer, J.R.2    Townsend, J.T.3
  • 26
    • 85047685362
    • The time course of perceptual choice: The leaky, competing accumulator model
    • Usher M, McClelland JL (2001) The time course of perceptual choice: The leaky, competing accumulator model. Psychol Rev 108(3):550-592.
    • (2001) Psychol Rev , vol.108 , Issue.3 , pp. 550-592
    • Usher, M.1    McClelland, J.L.2
  • 27
    • 84881098150
    • Disentangling decision models: From independence to competition
    • Teodorescu AR, Usher M (2013) Disentangling decision models: From independence to competition. Psychol Rev 120(1):1-38.
    • (2013) Psychol Rev , vol.120 , Issue.1 , pp. 1-38
    • Teodorescu, A.R.1    Usher, M.2
  • 28
    • 70349772546
    • Confidence intervals from normalized data: A correction to Cousineau (2005)
    • Morey RD (2008) Confidence intervals from normalized data: A correction to Cousineau (2005). Tutor Quant Methods Psychol 4(2):61-64.
    • (2008) Tutor Quant Methods Psychol , vol.4 , Issue.2 , pp. 61-64
    • Morey, R.D.1
  • 29
    • 0000120766
    • Estimating the dimension of a model
    • Schwarz G (1978) Estimating the dimension of a model. Ann Stat 6(2):461-464.
    • (1978) Ann Stat , vol.6 , Issue.2 , pp. 461-464
    • Schwarz, G.1
  • 30
    • 0346942368
    • Decision-theoretic planning: Structural assumptions and computational leverage
    • Boutilier C, Dean T, Hanks S (1999) Decision-theoretic planning: Structural assumptions and computational leverage. J Artif Intell Res 11(1):1-94.
    • (1999) J Artif Intell Res , vol.11 , Issue.1 , pp. 1-94
    • Boutilier, C.1    Dean, T.2    Hanks, S.3
  • 31
    • 33745774340
    • How fast to work: Response vigor, motivation and tonic dopamine
    • eds Weiss Y, Schölkopf B, Platt J (MIT Press, Cambridge, MA)
    • Niv Y, Daw N, Dayan P (2005) How fast to work: Response vigor, motivation and tonic dopamine. Advances in Neural Information Processing Systems, eds Weiss Y, Schölkopf B, Platt J (MIT Press, Cambridge, MA), Vol 18, pp 1019-1026.
    • (2005) Advances in Neural Information Processing Systems , vol.18 , pp. 1019-1026
    • Niv, Y.1    Daw, N.2    Dayan, P.3
  • 32
    • 33847675011
    • Tonic dopamine: Opportunity costs and the control of response vigor
    • Niv Y, Daw ND, Joel D, Dayan P (2007) Tonic dopamine: Opportunity costs and the control of response vigor. Psychopharmacology (Berl) 191(3):507-520.
    • (2007) Psychopharmacology (Berl) , vol.191 , Issue.3 , pp. 507-520
    • Niv, Y.1    Daw, N.D.2    Joel, D.3    Dayan, P.4
  • 33
    • 0842349509
    • Reward-predicting activity of dopamine and caudate neurons-a possible mechanism of motivational control of saccadic eye movement
    • Kawagoe R, Takikawa Y, Hikosaka O (2004) Reward-predicting activity of dopamine and caudate neurons-a possible mechanism of motivational control of saccadic eye movement. J Neurophysiol 91(2):1013-1024.
    • (2004) J Neurophysiol , vol.91 , Issue.2 , pp. 1013-1024
    • Kawagoe, R.1    Takikawa, Y.2    Hikosaka, O.3
  • 34
    • 0036138643
    • Modulation of saccadic eye movements by predicted reward outcome
    • Takikawa Y, Kawagoe R, Itoh H, Nakahara H, Hikosaka O (2002) Modulation of saccadic eye movements by predicted reward outcome. Exp Brain Res 142(2):284-291.
    • (2002) Exp Brain Res , vol.142 , Issue.2 , pp. 284-291
    • Takikawa, Y.1    Kawagoe, R.2    Itoh, H.3    Nakahara, H.4    Hikosaka, O.5
  • 36
    • 58149404021
    • A theory of memory retrieval
    • Ratcliff R (1978) A theory of memory retrieval. Psychol Rev 85(2):59-108.
    • (1978) Psychol Rev , vol.85 , Issue.2 , pp. 59-108
    • Ratcliff, R.1
  • 37
    • 60749089421
    • A context maintenance and retrieval model of organizational processes in free recall
    • Polyn SM, Norman KA, Kahana MJ (2009) A context maintenance and retrieval model of organizational processes in free recall. Psychol Rev 116(1):129-156.
    • (2009) Psychol Rev , vol.116 , Issue.1 , pp. 129-156
    • Polyn, S.M.1    Norman, K.A.2    Kahana, M.J.3
  • 38
    • 56149116457
    • A context-based theory of recency and contiguity in free recall
    • Sederberg PB, Howard MW, Kahana MJ (2008) A context-based theory of recency and contiguity in free recall. Psychol Rev 115(4):893-912.
    • (2008) Psychol Rev , vol.115 , Issue.4 , pp. 893-912
    • Sederberg, P.B.1    Howard, M.W.2    Kahana, M.J.3
  • 39
    • 41849150773
    • The diffusion decision model: Theory and data for two-choice decision tasks
    • Ratcliff R, McKoon G (2008) The diffusion decision model: Theory and data for two-choice decision tasks. Neural Comput 20(4):873-922.
    • (2008) Neural Comput , vol.20 , Issue.4 , pp. 873-922
    • Ratcliff, R.1    McKoon, G.2
  • 40
    • 36549038073
    • A diffusion model account of criterion shifts in the lexical decision task
    • Wagenmakers E-J, Ratcliff R, Gomez P, McKoon G (2008) A diffusion model account of criterion shifts in the lexical decision task. J Mem Lang 58(1):140-159.
    • (2008) J Mem Lang , vol.58 , Issue.1 , pp. 140-159
    • Wagenmakers, E.-J.1    Ratcliff, R.2    Gomez, P.3    McKoon, G.4
  • 41
    • 32544439341
    • A recurrent network mechanism of time integration in perceptual decisions
    • Wong KF, Wang XJ (2006) A recurrent network mechanism of time integration in perceptual decisions. J Neurosci 26(4):1314-1328.
    • (2006) J Neurosci , vol.26 , Issue.4 , pp. 1314-1328
    • Wong, K.F.1    Wang, X.J.2
  • 42
    • 38049037928
    • Efficient selectivity and backup operators in Monte-Carlo tree search
    • eds van den Herik HJ, Ciancarini P, Donkers HHLM (Springer, New York)
    • Coulom R (2007) Efficient selectivity and backup operators in Monte-Carlo tree search. Computers and Games, eds van den Herik HJ, Ciancarini P, Donkers HHLM (Springer, New York), pp 72-83.
    • (2007) Computers and Games , pp. 72-83
    • Coulom, R.1
  • 43
    • 79956202655
    • Monte-Carlo tree search and rapid action value estimation in computer Go
    • Gelly S, Silver D (2011) Monte-Carlo tree search and rapid action value estimation in computer Go. Artif Intell 175(11):1856-1875.
    • (2011) Artif Intell , vol.175 , Issue.11 , pp. 1856-1875
    • Gelly, S.1    Silver, D.2
  • 45
    • 0036832951
    • A sparse sampling algorithm for near-optimal planning in large Markov decision processes
    • Kearns M, Mansour Y, Ng AY (2002) A sparse sampling algorithm for near-optimal planning in large Markov decision processes. Mach Learn 49(2-3):193-208.
    • (2002) Mach Learn , vol.49 , Issue.2-3 , pp. 193-208
    • Kearns, M.1    Mansour, Y.2    Ng, A.Y.3
  • 46
    • 0141988716
    • Recent advances in hierarchical reinforcement learning
    • Barto AG, Mahadevan S (2003) Recent advances in hierarchical reinforcement learning. Discrete Event Dyn Syst 13(4):341-379.
    • (2003) Discrete Event Dyn Syst , vol.13 , Issue.4 , pp. 341-379
    • Barto, A.G.1    Mahadevan, S.2
  • 47
    • 70350566799
    • Hierarchically organized behavior and its neural foundations: A reinforcement learning perspective
    • Botvinick MM, Niv Y, Barto AC (2009) Hierarchically organized behavior and its neural foundations: A reinforcement learning perspective. Cognition 113(3):262-280.
    • (2009) Cognition , vol.113 , Issue.3 , pp. 262-280
    • Botvinick, M.M.1    Niv, Y.2    Barto, A.C.3
  • 48
    • 84875468581
    • Hierarchical learning induces two simultaneous, but separable, prediction errors in human basal ganglia
    • Diuk C, Tsai K, Wallis J, Botvinick M, Niv Y (2013) Hierarchical learning induces two simultaneous, but separable, prediction errors in human basal ganglia. J Neurosci 33(13): 5797-5805.
    • (2013) J Neurosci , vol.33 , Issue.13 , pp. 5797-5805
    • Diuk, C.1    Tsai, K.2    Wallis, J.3    Botvinick, M.4    Niv, Y.5
  • 49
    • 79960637995
    • A neural signature of hierarchical reinforcement learning
    • Ribas-Fernandes JJF, et al. (2011) A neural signature of hierarchical reinforcement learning. Neuron 71(2):370-379.
    • (2011) Neuron , vol.71 , Issue.2 , pp. 370-379
    • Ribas-Fernandes, J.J.F.1
  • 50
    • 84923228672
    • Optimal behavioral hierarchy
    • Solway A, et al. (2014) Optimal behavioral hierarchy. PLOS Comput Biol 10(8): e1003779.
    • (2014) PLOS Comput Biol , vol.10 , Issue.8 , pp. e1003779
    • Solway, A.1
  • 52
    • 84885066831
    • Simultaneous modeling of visual saliency and value computation improves predictions of economic choice
    • Towal RB, Mormann M, Koch C (2013) Simultaneous modeling of visual saliency and value computation improves predictions of economic choice. Proc Natl Acad Sci USA 110(40):E3858-E3867.
    • (2013) Proc Natl Acad Sci USA , vol.110 , Issue.40 , pp. E3858-E3867
    • Towal, R.B.1    Mormann, M.2    Koch, C.3
  • 53
    • 84867287309
    • Preference by association: How memory mechanisms in the hippocampus bias decisions
    • Wimmer GE, Shohamy D (2012) Preference by association: How memory mechanisms in the hippocampus bias decisions. Science 338(6104):270-273.
    • (2012) Science , vol.338 , Issue.6104 , pp. 270-273
    • Wimmer, G.E.1    Shohamy, D.2
  • 54
    • 0030612822
    • The psychophysics toolbox
    • Brainard DH (1997) The psychophysics toolbox. Spat Vis 10(4):433-436.
    • (1997) Spat Vis , vol.10 , Issue.4 , pp. 433-436
    • Brainard, D.H.1
  • 56
    • 79961229328
    • DEoptim: An R package for global optimization by differential evolution
    • Mullen KM, Ardia D, Gil DL, Windover D, Cline J (2009) DEoptim: An R package for global optimization by differential evolution. J Stat Softw 40(6):1-26.
    • (2009) J Stat Softw , vol.40 , Issue.6 , pp. 1-26
    • Mullen, K.M.1    Ardia, D.2    Gil, D.L.3    Windover, D.4    Cline, J.5


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.