Volume 6, Issue 12, 2010, Pages

Structure learning in human sequential decision-making

Author keywords

[No Author keywords available]

Indexed keywords

BEHAVIORAL RESEARCH; REINFORCEMENT LEARNING;

EID: 78651226963     PISSN: 1553734X     EISSN: 15537358     Source Type: Journal    
DOI: 10.1371/journal.pcbi.1001003     Document Type: Article
Times cited: 39

References (45)
  • 1
    • 0004870746
    • A problem in the sequential design of experiments
    • Bellman RE (1956) A problem in the sequential design of experiments. Sankhyā 16: 221-229.
    • (1956) Sankhyā , vol.16 , pp. 221-229
    • Bellman, R.E.1
  • 3
    • 0001043843
    • Restless bandits: Activity allocation in a changing world
    • Whittle P (1988) Restless bandits: activity allocation in a changing world. J Appl Probab 25: 287-298.
    • (1988) J Appl Probab , vol.25 , pp. 287-298
    • Whittle, P.1
  • 5
    • 84864921697
    • Modeling human performance in restless bandits with particle filters
    • Yi MS, Steyvers M, Lee M (2009) Modeling human performance in restless bandits with particle filters. The Journal of Problem Solving 2: Available: http://docs.lib.purdue.edu/jps/vol2/iss2/5/.
    • (2009) The Journal of Problem Solving 2
    • Yi, M.S.1    Steyvers, M.2    Lee, M.3
  • 6
    • 84858789760
    • Sequential effects: Superstition or rational behavior?
    • Cambridge, MA: MIT Press
    • Yu AJ, Cohen JD (2009) Sequential effects: Superstition or rational behavior? In: Advances in Neural Information Processing Systems, 21. Cambridge, MA: MIT Press. pp 1873-1880.
    • (2009) Advances in Neural Information Processing Systems, 21 , pp. 1873-1880
    • Yu, A.J.1    Cohen, J.D.2
  • 7
    • 57049112212
    • When does reward maximization lead to matching law?
    • Sakai Y, Fukai T (2008) When does reward maximization lead to matching law? PLoS One 3: e3795.
    • (2008) PLoS One , vol.3
    • Sakai, Y.1    Fukai, T.2
  • 8
    • 37749023538
    • The actor-critic learning is behind the matching law: Matching vs. optimal behaviors
    • Sakai Y, Fukai T (2008) The actor-critic learning is behind the matching law: Matching vs. optimal behaviors. Neural Comput 20: 227-251.
    • (2008) Neural Comput , vol.20 , pp. 227-251
    • Sakai, Y.1    Fukai, T.2
  • 9
    • 0032073263
    • Planning and acting in partially observable stochastic domains
    • Kaelbling L, Littman M, Cassandra A (1998) Planning and acting in partially observable stochastic domains. Artif Intell 101: 99-134.
    • (1998) Artif Intell , vol.101 , pp. 99-134
    • Kaelbling, L.1    Littman, M.2    Cassandra, A.3
  • 14
    • 34249761849
    • Learning Bayesian networks: The combination of knowledge and statistical data
    • Heckerman D, Geiger D, Chickering DM (1995) Learning Bayesian networks: The combination of knowledge and statistical data. Mach Learn 20: 197-243.
    • (1995) Mach Learn , vol.20 , pp. 197-243
    • Heckerman, D.1    Geiger, D.2    Chickering, D.M.3
  • 16
    • 33746260413
    • Theory-based Bayesian models of inductive learning and reasoning
    • Tenenbaum JB, Griffiths TL, Kemp C (2006) Theory-based Bayesian models of inductive learning and reasoning. Trends Cogn Sci 10: 309-318.
    • (2006) Trends Cogn Sci , vol.10 , pp. 309-318
    • Tenenbaum, J.B.1    Griffiths, T.L.2    Kemp, C.3
  • 17
    • 0003787146
    • Princeton: Princeton University Press
    • Bellman RE (1957) Dynamic programming. Princeton: Princeton University Press.
    • (1957) Dynamic Programming
    • Bellman, R.E.1
  • 18
    • 0002955623
    • A dynamic allocation index for the sequential design of experiments
    • In: Gani J, Sarkadi K, Vincze I, eds., Amsterdam: North-Holland Pub. Co
    • Gittins JC, Jones DM (1974) A dynamic allocation index for the sequential design of experiments. In: Gani J, Sarkadi K, Vincze I, eds. Progress in statistics. Amsterdam: North-Holland Pub. Co. pp 241-266.
    • (1974) Progress in Statistics , pp. 241-266
    • Gittins, J.C.1    Jones, D.M.2
  • 19
    • 34249833101
    • Technical note: Q-learning
    • Watkins C, Dayan P (1992) Technical note: Q-learning. Mach Learn 8: 279-292.
    • (1992) Mach Learn , vol.8 , pp. 279-292
    • Watkins, C.1    Dayan, P.2
  • 21
    • 0030896968
    • A neural substrate of prediction and reward
    • Schultz W, Dayan P, Montague P (1997) A neural substrate of prediction and reward. Science 275: 1593-1599.
    • (1997) Science , vol.275 , pp. 1593-1599
    • Schultz, W.1    Dayan, P.2    Montague, P.3
  • 22
    • 0031867046
    • Predictive reward signal of dopamine neurons
    • Schultz W (1998) Predictive reward signal of dopamine neurons. J Neurophysiol 80: 1-27.
    • (1998) J Neurophysiol , vol.80 , pp. 1-27
    • Schultz, W.1
  • 24
    • 0008803714
    • Sequential choice under ambiguity: Intuitive solutions to the armed-bandit problem
    • Meyer RJ, Shi Y (1995) Sequential choice under ambiguity: Intuitive solutions to the armed-bandit problem. Manage Sci 41: 817-834.
    • (1995) Manage Sci , vol.41 , pp. 817-834
    • Meyer, R.J.1    Shi, Y.2
  • 25
    • 0031287072
    • An experimental analysis of the bandit problem
    • Banks J, Olson M, Porter D (1997) An experimental analysis of the bandit problem. Econ Theory 10: 55-77.
    • (1997) Econ Theory , vol.10 , pp. 55-77
    • Banks, J.1    Olson, M.2    Porter, D.3
  • 27
    • 61549113484
    • Simple models of discrete choice and their performance in bandit experiments
    • Gans N, Knox G, Croson R (2007) Simple models of discrete choice and their performance in bandit experiments. Manuf Serv Oper Manag 9: 383-408.
    • (2007) Manuf Serv Oper Manag , vol.9 , pp. 383-408
    • Gans, N.1    Knox, G.2    Croson, R.3
  • 28
    • 0010186317
    • Reward probability, amount, and information as determiners of sequential two-alternative decisions
    • Edwards W (1956) Reward probability, amount, and information as determiners of sequential two-alternative decisions. J Exp Psychol 52: 177-188.
    • (1956) J Exp Psychol , vol.52 , pp. 177-188
    • Edwards, W.1
  • 29
    • 0001515225
    • Probability learning in 1000 trials
    • Edwards W (1961) Probability learning in 1000 trials. J Exp Psychol 62: 385-394.
    • (1961) J Exp Psychol , vol.62 , pp. 385-394
    • Edwards, W.1
  • 30
    • 0342748193
    • Supplementary report: The utility of correctly predicting infrequent events
    • Brackbill Y, Bravos A (1962) Supplementary report: The utility of correctly predicting infrequent events. J Exp Psychol 64: 648-649.
    • (1962) J Exp Psychol , vol.64 , pp. 648-649
    • Brackbill, Y.1    Bravos, A.2
  • 32
    • 77952541839
    • Learning latent structure: Carving nature at its joints
    • Gershman SJ, Niv Y (2010) Learning latent structure: carving nature at its joints. Curr Opin Neurobiol 20: 251-256.
    • (2010) Curr Opin Neurobiol , vol.20 , pp. 251-256
    • Gershman, S.J.1    Niv, Y.2
  • 35
    • 77951576301
    • Bayesian modeling of human sequential decision-making on the multi-armed bandit problem
    • In: Sloutsky V, Love B, McRae K, eds., Austin, TX: Cognitive Science Society
    • Acuna D, Schrater P (2008) Bayesian modeling of human sequential decision-making on the multi-armed bandit problem. In: Sloutsky V, Love B, McRae K, eds. 30th Annual Conference of the Cognitive Science Society. Austin, TX: Cognitive Science Society. pp 2065-2070.
    • (2008) 30th Annual Conference of the Cognitive Science Society , pp. 2065-2070
    • Acuna, D.1    Schrater, P.2
  • 36
    • 33745910265
    • A hierarchical Bayesian model of human decision-making on an optimal stopping problem
    • Lee MD (2006) A hierarchical Bayesian model of human decision-making on an optimal stopping problem. Cogn Sci 30: 1-26.
    • (2006) Cogn Sci , vol.30 , pp. 1-26
    • Lee, M.D.1
  • 38
    • 84864032307
    • Prediction and change detection
    • Steyvers M, Brown S (2006) Prediction and change detection. In: NIPS 2006. pp 1281-1288.
    • (2006) NIPS 2006 , pp. 1281-1288
    • Steyvers, M.1    Brown, S.2
  • 40
    • 0038829878
    • Predicting how people play games: Reinforcement learning in experimental games with unique, mixed strategy equilibria
    • Erev I, Roth AE (1998) Predicting how people play games: Reinforcement learning in experimental games with unique, mixed strategy equilibria. Am Econ Rev 88: 848-881.
    • (1998) Am Econ Rev , vol.88 , pp. 848-881
    • Erev, I.1    Roth, A.E.2
  • 41
    • 33646230819
    • Dopamine, prediction error and associative learning: A model-based account
    • Smith A, Li M, Becker S, Kapur S (2006) Dopamine, prediction error and associative learning: A model-based account. Network 17: 61-84.
    • (2006) Network , vol.17 , pp. 61-84
    • Smith, A.1    Li, M.2    Becker, S.3    Kapur, S.4
  • 42
    • 40849087850
    • Integrating hippocampus and striatum in decision-making
    • Johnson A, van der Meer M, Redish A (2007) Integrating hippocampus and striatum in decision-making. Curr Opin Neurobiol 17: 692-697.
    • (2007) Curr Opin Neurobiol , vol.17 , pp. 692-697
    • Johnson, A.1    van der Meer, M.2    Redish, A.3
  • 43
    • 67349268975
    • A Bayesian analysis of human decision-making on bandit problems
    • Steyvers M, Lee MD, Wagenmakers E (2009) A Bayesian analysis of human decision-making on bandit problems. J Math Psychol 53: 168-179.
    • (2009) J Math Psychol , vol.53 , pp. 168-179
    • Steyvers, M.1    Lee, M.D.2    Wagenmakers, E.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.