Volume 143, Issue 6, 2014, Pages 2074-2081

Humans use directed and random exploration to solve the explore-exploit dilemma

Author keywords

Decision making; Decision noise; Explore exploit; Information bonus; Reinforcement learning

Indexed keywords

ADOLESCENT; DECISION MAKING; EXPLORATORY BEHAVIOR; FEMALE; HUMAN; MALE; NEUROPSYCHOLOGICAL TEST; PHYSIOLOGY; REWARD; YOUNG ADULT

EID: 84925600345     PISSN: 00963445     EISSN: None     Source Type: Journal    
DOI: 10.1037/a0038199     Document Type: Article
Times cited : (355)

References (35)
  • 1
    • 23244432007
    • An integrative theory of locus coeruleus-norepinephrine function: Adaptive gain and optimal performance
    • Aston-Jones, G., & Cohen, J. D. (2005). An integrative theory of locus coeruleus-norepinephrine function: Adaptive gain and optimal performance. Annual Review of Neuroscience, 28, 403-450. http://dx.doi.org/10.1146/annurev.neuro.28.061604.135709
    • (2005) Annual Review of Neuroscience , vol.28 , pp. 403-450
    • Aston-Jones, G.1    Cohen, J.D.2
  • 2
    • 0036568025
    • Finite-time analysis of the multiarmed bandit problem
    • Auer, P., Cesa-Bianchi, N., & Fischer, P. (2002). Finite-time analysis of the multiarmed bandit problem. Machine Learning, 47, 235-256. http://dx.doi.org/10.1023/A:1013689704352
    • (2002) Machine Learning , vol.47 , pp. 235-256
    • Auer, P.1    Cesa-Bianchi, N.2    Fischer, P.3
  • 3
    • 0031287072
    • An experimental analysis of the bandit problem
    • Banks, J., Olson, M., & Porter, D. (1997). An experimental analysis of the bandit problem. Economic Theory, 10, 55-77. http://dx.doi.org/10.1007/s001990050146
    • (1997) Economic Theory , vol.10 , pp. 55-77
    • Banks, J.1    Olson, M.2    Porter, D.3
  • 4
    • 84859638124
    • Not noisy, just wrong: The role of suboptimal inference in behavioral variability
    • Beck, J. M., Ma, W. J., Pitkow, X., Latham, P. E., & Pouget, A. (2012). Not noisy, just wrong: The role of suboptimal inference in behavioral variability. Neuron, 74, 30-39. http://dx.doi.org/10.1016/j.neuron.2012.03.016
    • (2012) Neuron , vol.74 , pp. 30-39
    • Beck, J.M.1    Ma, W.J.2    Pitkow, X.3    Latham, P.E.4    Pouget, A.5
  • 5
    • 84980139560
    • Ambiguity seeking in multi-attribute decisions: Effects of optimism and message framing
    • Bier, V. M., & Connell, B. L. (1994). Ambiguity seeking in multi-attribute decisions: Effects of optimism and message framing. Journal of Behavioral Decision Making, 7, 169-182. http://dx.doi.org/10.1002/bdm.3960070303
    • (1994) Journal of Behavioral Decision Making , vol.7 , pp. 169-182
    • Bier, V.M.1    Connell, B.L.2
  • 6
    • 0001699291
    • Training stochastic model recognition algorithms as networks can lead to maximum mutual information estimates of parameters
    • D. S. Touretzky (Ed.) Cambridge, MA: MIT Press.
    • Bridle, J. S. (1990). Training stochastic model recognition algorithms as networks can lead to maximum mutual information estimates of parameters. In D. S. Touretzky (Ed.), Advances in neural information processing systems (Vol. 2, pp. 211-217). Cambridge, MA: MIT Press.
    • (1990) Advances in neural information processing systems , vol.2 , pp. 211-217
    • Bridle, J.S.1
  • 7
    • 84874045238
    • Regret analysis of stochastic and nonstochastic multi-armed bandit problems
    • Bubeck, S., & Cesa-Bianchi, N. (2012). Regret analysis of stochastic and nonstochastic multi-armed bandit problems. Foundations and Trends in Machine Learning, 5, 1-130.
    • (2012) Foundations and Trends in Machine Learning , vol.5 , pp. 1-130
    • Bubeck, S.1    Cesa-Bianchi, N.2
  • 8
    • 34249838202
    • Recent developments in modeling preferences: Uncertainty and ambiguity
    • Camerer, C., & Weber, M. (1992). Recent developments in modeling preferences: Uncertainty and ambiguity. Journal of Risk and Uncertainty, 5, 325-370. http://dx.doi.org/10.1007/BF00122575
    • (1992) Journal of Risk and Uncertainty , vol.5 , pp. 325-370
    • Camerer, C.1    Weber, M.2
  • 9
    • 33745223257
    • Cortical substrates for exploratory decisions in humans
    • Daw, N. D., O'Doherty, J. P., Dayan, P., Seymour, B., & Dolan, R. J. (2006). Cortical substrates for exploratory decisions in humans. Nature, 441, 876-879. http://dx.doi.org/10.1038/nature04766
    • (2006) Nature , vol.441 , pp. 876-879
    • Daw, N.D.1    O'Doherty, J.P.2    Dayan, P.3    Seymour, B.4    Dolan, R.J.5
  • 10
    • 85153964297
    • A novel reinforcement model of birdsong vocalization learning
    • G. Tesauro, D. S. Touretzky, & T. K. Leen (Eds.) Cambridge, MA: MIT Press.
    • Doya, K., & Sejnowski, T. J. (1995). A novel reinforcement model of birdsong vocalization learning. In G. Tesauro, D. S. Touretzky, & T. K. Leen (Eds.), Advances in neural information processing systems (Vol. 7, pp. 101-108). Cambridge, MA: MIT Press.
    • (1995) Advances in neural information processing systems , vol.7 , pp. 101-108
    • Doya, K.1    Sejnowski, T.J.2
  • 11
    • 84957363402
    • Risk, ambiguity and the Savage axioms
    • Ellsberg, D. (1961). Risk, ambiguity and the Savage axioms. The Quarterly Journal of Economics, 75, 643. http://dx.doi.org/10.2307/1884324
    • (1961) The Quarterly Journal of Economics , vol.75 , pp. 643
    • Ellsberg, D.1
  • 13
    • 68149138772
    • Prefrontal and striatal dopaminergic genes predict individual differences in exploration and exploitation
    • Frank, M. J., Doll, B. B., Oas-Terpstra, J., & Moreno, F. (2009). Prefrontal and striatal dopaminergic genes predict individual differences in exploration and exploitation. Nature Neuroscience, 12, 1062-1068. http://dx.doi.org/10.1038/nn.2342
    • (2009) Nature Neuroscience , vol.12 , pp. 1062-1068
    • Frank, M.J.1    Doll, B.B.2    Oas-Terpstra, J.3    Moreno, F.4
  • 15
    • 0002955623
    • A dynamic allocation index for the sequential design of experiments
    • J. Gans (Ed.) Amsterdam, the Netherlands: North-Holland.
    • Gittins, J., & Jones, D. (1974). A dynamic allocation index for the sequential design of experiments. In J. Gans (Ed.), Progress in statistics (pp. 241-266). Amsterdam, the Netherlands: North-Holland.
    • (1974) Progress in statistics , pp. 241-266
    • Gittins, J.1    Jones, D.2
  • 16
    • 4043119771
    • Decisions from experience and the effect of rare events in risky choice
    • Hertwig, R., Barron, G., Weber, E. U., & Erev, I. (2004). Decisions from experience and the effect of rare events in risky choice. Psychological Science, 15, 534-539. http://dx.doi.org/10.1111/j.0956-7976.2004.00715.x
    • (2004) Psychological Science , vol.15 , pp. 534-539
    • Hertwig, R.1    Barron, G.2    Weber, E.U.3    Erev, I.4
  • 17
    • 70449671239
    • The description-experience gap in risky choice
    • Hertwig, R., & Erev, I. (2009). The description-experience gap in risky choice. Trends in Cognitive Sciences, 13, 517-523. http://dx.doi.org/10.1016/j.tics.2009.09.004
    • (2009) Trends in Cognitive Sciences , vol.13 , pp. 517-523
    • Hertwig, R.1    Erev, I.2
  • 19
    • 0001367835
    • Modeling ambiguity in decisions under uncertainty
    • Kahn, B. E., & Sarin, R. K. (1988). Modeling ambiguity in decisions under uncertainty. The Journal of Consumer Research, 15, 265-272. http://dx.doi.org/10.1086/209163
    • (1988) The Journal of Consumer Research , vol.15 , pp. 265-272
    • Kahn, B.E.1    Sarin, R.K.2
  • 20
    • 14144252095
    • Contributions of an avian basal ganglia-forebrain circuit to real-time modulation of song
    • Kao, M. H., Doupe, A. J., & Brainard, M. S. (2005). Contributions of an avian basal ganglia-forebrain circuit to real-time modulation of song. Nature, 433, 638-643. http://dx.doi.org/10.1038/nature03127
    • (2005) Nature , vol.433 , pp. 638-643
    • Kao, M.H.1    Doupe, A.J.2    Brainard, M.S.3
  • 21
    • 79952189388
    • Psychological models of human and optimal performance on bandit problems
    • Lee, M. D., Zhang, S., Munro, M. N., & Steyvers, M. (2011). Psychological models of human and optimal performance on bandit problems. Cognitive Systems Research, 12, 164-174. http://dx.doi.org/10.1016/j.cogsys.2010.07.007
    • (2011) Cognitive Systems Research , vol.12 , pp. 164-174
    • Lee, M.D.1    Zhang, S.2    Munro, M.N.3    Steyvers, M.4
  • 23
    • 34250317199
    • An exploration-exploitation model based on norepinepherine and dopamine activity
    • Y. Weiss, B. Schölkopf, & J. Platt (Eds.) Cambridge, MA: MIT Press.
    • McClure, S. M., Gilzenrat, M. S., & Cohen, J. D. (2006). An exploration-exploitation model based on norepinepherine and dopamine activity. In Y. Weiss, B. Schölkopf, & J. Platt (Eds.), Advances in neural information processing systems (Vol. 18, pp. 867-874). Cambridge, MA: MIT Press.
    • (2006) Advances in neural information processing systems , vol.18 , pp. 867-874
    • McClure, S.M.1    Gilzenrat, M.S.2    Cohen, J.D.3
  • 24
    • 0008803714
    • Choice under ambiguity: Intuitive solutions to the armed-bandit problem
    • Meyer, R., & Shi, Y. (1995). Choice under ambiguity: Intuitive solutions to the armed-bandit problem. Management Science, 41, 817-834. http://dx.doi.org/10.1287/mnsc.41.5.817
    • (1995) Management Science , vol.41 , pp. 817-834
    • Meyer, R.1    Shi, Y.2
  • 25
    • 21844471233
    • Vocal experimentation in the juvenile songbird requires a basal ganglia circuit
    • Ölveczky, B. P., Andalman, A. S., & Fee, M. S. (2005). Vocal experimentation in the juvenile songbird requires a basal ganglia circuit. PLoS Biology, 3, e153. http://dx.doi.org/10.1371/journal.pbio.0030153
    • (2005) PLoS Biology , vol.3
    • Ölveczky, B.P.1    Andalman, A.S.2    Fee, M.S.3
  • 26
    • 25144524028
    • A sensory source for motor variation
    • Osborne, L. C., Lisberger, S. G., & Bialek, W. (2005). A sensory source for motor variation. Nature, 437, 412-416. http://dx.doi.org/10.1038/nature03961
    • (2005) Nature , vol.437 , pp. 412-416
    • Osborne, L.C.1    Lisberger, S.G.2    Bialek, W.3
  • 27
    • 79551573880
    • Risk, unexpected uncertainty, and estimation uncertainty: Bayesian learning in unstable settings
    • Payzan-LeNestour, E., & Bossaerts, P. (2011). Risk, unexpected uncertainty, and estimation uncertainty: Bayesian learning in unstable settings. PLoS Computational Biology, 7, e1001048. http://dx.doi.org/10.1371/journal.pcbi.1001048
    • (2011) PLoS Computational Biology , vol.7
    • Payzan-LeNestour, E.1    Bossaerts, P.2
  • 28
    • 84870898601
    • Do not bet on the unknown versus try to find out more: Estimation uncertainty and "unexpected uncertainty" both modulate exploration
    • Payzan-LeNestour, E., & Bossaerts, P. (2012). Do not bet on the unknown versus try to find out more: Estimation uncertainty and "unexpected uncertainty" both modulate exploration. Frontiers in Neuroscience, 6, 150. http://dx.doi.org/10.3389/fnins.2012.00150
    • (2012) Frontiers in Neuroscience , vol.6 , pp. 150
    • Payzan-LeNestour, E.1    Bossaerts, P.2
  • 29
    • 67349268975
    • A Bayesian analysis of human decision-making on bandit problems
    • Steyvers, M., Lee, M., & Wagenmakers, E. (2009). A Bayesian analysis of human decision-making on bandit problems. Journal of Mathematical Psychology, 53, 168-179. http://dx.doi.org/10.1016/j.jmp.2008.11.002
    • (2009) Journal of Mathematical Psychology , vol.53 , pp. 168-179
    • Steyvers, M.1    Lee, M.2    Wagenmakers, E.3
  • 31
    • 0001395850
    • On the likelihood that one unknown probability exceeds another in view of the evidence of two samples
    • Thompson, W. (1933). On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. Biometrika, 25, 285-294. http://dx.doi.org/10.1093/biomet/25.3-4.285
    • (1933) Biometrika , vol.25 , pp. 285-294
    • Thompson, W.1
  • 32
    • 37549058045
    • Performance variability enables adaptive plasticity of 'crystallized' adult birdsong
    • Tumer, E. C., & Brainard, M. S. (2007). Performance variability enables adaptive plasticity of 'crystallized' adult birdsong. Nature, 450, 1240-1244. http://dx.doi.org/10.1038/nature06390
    • (2007) Nature , vol.450 , pp. 1240-1244
    • Tumer, E.C.1    Brainard, M.S.2
  • 33
    • 0004049893
    • (Unpublished doctoral dissertation) Cambridge University, Cambridge, England
    • Watkins, C. J. C. H. (1989). Learning from delayed rewards (Unpublished doctoral dissertation). Cambridge University, Cambridge, England.
    • (1989) Learning from delayed rewards
    • Watkins, C.J.C.H.1
  • 34
    • 84893805643
    • Temporal structure of motor variability is dynamically regulated and predicts motor learning ability
    • Wu, H. G., Miyamoto, Y. R., Gonzalez Castro, L. N., Ölveczky, B. P., & Smith, M. A. (2014). Temporal structure of motor variability is dynamically regulated and predicts motor learning ability. Nature Neuroscience, 17, 312-321. http://dx.doi.org/10.1038/nn.3616
    • (2014) Nature Neuroscience , vol.17 , pp. 312-321
    • Wu, H.G.1    Miyamoto, Y.R.2    Gonzalez Castro, L.N.3    Ölveczky, B.P.4    Smith, M.A.5
  • 35
    • 84898954453
    • Forgetful Bayes and myopic planning: Human learning and decision making in a bandit setting
    • Zhang, S., & Yu, A. J. (2013). Forgetful Bayes and myopic planning: Human learning and decision making in a bandit setting. Advances in Neural Information Processing Systems, 26, 2607-2615.
    • (2013) Advances in Neural Information Processing Systems , vol.26 , pp. 2607-2615
    • Zhang, S.1    Yu, A.J.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.