Volume 113, Issue 3, 2009, Pages 293-313

Short-term gains, long-term pains: How cues about state aid learning in dynamic environments

Author keywords

Decision making; Dynamic control task; Learning; Q learning; Reinforcement learning; Self control; State; Temporal difference; Temporal discounting

Indexed keywords

ARTICLE; ASSOCIATION; DECISION MAKING; ENVIRONMENT; HUMAN; LEARNING; MOTIVATION; NORMAL HUMAN; PRIORITY JOURNAL; REINFORCEMENT; REWARD; TASK PERFORMANCE

EID: 70350572378     PISSN: 00100277     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.cognition.2009.03.013     Document Type: Article
Times cited : (77)

References (50)
  • 1
    • 0016355478
    • A new look at the statistical model identification
    • Akaike H. A new look at the statistical model identification. IEEE Transactions on Automatic Control 19 6 (1974) 716-723
    • (1974) IEEE Transactions on Automatic Control , vol.19 , Issue.6 , pp. 716-723
    • Akaike, H.1
  • 3
    • 0034859944
    • Autonomous helicopter control using reinforcement learning policy search methods
    • IEEE
    • Bagnell J., and Schneider J. Autonomous helicopter control using reinforcement learning policy search methods. International conference on robotics and automation (2001), IEEE 1615-1620
    • (2001) International conference on robotics and automation , pp. 1615-1620
    • Bagnell, J.1    Schneider, J.2
  • 4
    • 0036241221
    • Decision-making and addiction (part I): Impaired activation of somatic states in substance dependent individuals when pondering decisions with negative future consequences
    • Bechara A., and Damasio H. Decision-making and addiction (part I): Impaired activation of somatic states in substance dependent individuals when pondering decisions with negative future consequences. Neuropsychologia 40 10 (2002) 1675-1689
    • (2002) Neuropsychologia , vol.40 , Issue.10 , pp. 1675-1689
    • Bechara, A.1    Damasio, H.2
  • 5
    • 0035148287
    • Decision-making deficits, linked to a dysfunctional ventromedial prefrontal cortex, revealed in alcohol and stimulant abusers
    • Bechara A., Dolan S., Denburg N., Hindes A., Anderson S., and Nathan P. Decision-making deficits, linked to a dysfunctional ventromedial prefrontal cortex, revealed in alcohol and stimulant abusers. Neuropsychologia 39 (2001) 376-389
    • (2001) Neuropsychologia , vol.39 , pp. 376-389
    • Bechara, A.1    Dolan, S.2    Denburg, N.3    Hindes, A.4    Anderson, S.5    Nathan, P.6
  • 6
    • 85004846313
    • Interactive tasks and the implicit-explicit distinction
    • Berry D.C., and Broadbent D.E. Interactive tasks and the implicit-explicit distinction. British Journal of Psychology 79 (1988) 251-272
    • (1988) British Journal of Psychology , vol.79 , pp. 251-272
    • Berry, D.C.1    Broadbent, D.E.2
  • 7
    • 34248999741
    • Short-term memory traces for action bias in human reinforcement learning
    • Bogacz R., McClure S., Li J., Cohen J., and Montague P. Short-term memory traces for action bias in human reinforcement learning. Brain Research 1153 (2007) 111-121
    • (2007) Brain Research , vol.1153 , pp. 111-121
    • Bogacz, R.1    McClure, S.2    Li, J.3    Cohen, J.4    Montague, P.5
  • 8
    • 0036720714
    • A contribution of cognitive decision models to clinical assessment: Decomposing performance on the Bechara gambling task
    • Busemeyer J., and Stout J. A contribution of cognitive decision models to clinical assessment: Decomposing performance on the Bechara gambling task. Psychological Assessment 14 3 (2002) 253-262
    • (2002) Psychological Assessment , vol.14 , Issue.3 , pp. 253-262
    • Busemeyer, J.1    Stout, J.2
  • 9
    • 0023981451
    • The art of adaptive pattern recognition by a self-organizing neural network
    • Carpenter G.A., and Grossberg S. The art of adaptive pattern recognition by a self-organizing neural network. Computer 21 3 (1988) 77-88
    • (1988) Computer , vol.21 , Issue.3 , pp. 77-88
    • Carpenter, G.A.1    Grossberg, S.2
  • 10
    • 0002192119
    • Input generalization in delayed reinforcement learning: An algorithm and performance comparisons
    • Chapman, D., & Kaelbling, L. (1991). Input generalization in delayed reinforcement learning: An algorithm and performance comparisons. In Proceedings of IJCAI.
    • (1991) Proceedings of IJCAI
    • Chapman, D.1    Kaelbling, L.2
  • 11
    • 33750942379
    • Near-optimal human adaptive control across different noise environments
    • Chhabra M., and Jacobs R. Near-optimal human adaptive control across different noise environments. The Journal of Neuroscience 26 42 (2006) 10883-10887
    • (2006) The Journal of Neuroscience , vol.26 , Issue.42 , pp. 10883-10887
    • Chhabra, M.1    Jacobs, R.2
  • 12
    • 28044450875
    • Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control
    • Daw N., Niv Y., and Dayan P. Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nature Neuroscience 8 12 (2005) 1704-1711
    • (2005) Nature Neuroscience , vol.8 , Issue.12 , pp. 1704-1711
    • Daw, N.1    Niv, Y.2    Dayan, P.3
  • 13
    • 33745223257
    • Cortical substrates for exploratory decisions in humans
    • Daw N., O'Doherty J., Seymour B., Dayan P., and Dolan R. Cortical substrates for exploratory decisions in humans. Nature 441 (2006) 876-879
    • (2006) Nature , vol.441 , pp. 876-879
    • Daw, N.1    O'Doherty, J.2    Seymour, B.3    Dayan, P.4    Dolan, R.5
  • 14
    • 0036835734
    • Long-term reward prediction in TD models of the dopamine system
    • Daw N., and Touretzky D. Long-term reward prediction in TD models of the dopamine system. Neural Computation 14 (2002) 603-616
    • (2002) Neural Computation , vol.14 , pp. 603-616
    • Daw, N.1    Touretzky, D.2
  • 16
    • 0031788392
    • A computational role for dopamine delivery in human decision making
    • Egelman D., Person C., and Montague P. A computational role for dopamine delivery in human decision making. Journal of Cognitive Neuroscience 10 (1998) 623-630
    • (1998) Journal of Cognitive Neuroscience , vol.10 , pp. 623-630
    • Egelman, D.1    Person, C.2    Montague, P.3
  • 17
    • 33745108748
    • From recurrent choice to skill learning: A reinforcement-learning model
    • Fu W., and Anderson J. From recurrent choice to skill learning: A reinforcement-learning model. Journal of Experimental Psychology: General 135 2 (2006) 184-206
    • (2006) Journal of Experimental Psychology: General , vol.135 , Issue.2 , pp. 184-206
    • Fu, W.1    Anderson, J.2
  • 18
    • 0034083642
    • Drug abusers show impaired performance in a laboratory test of decision making
    • Grant S., Contoreggi C., and London E. Drug abusers show impaired performance in a laboratory test of decision making. Neuropsychologia 38 (2000) 1180-1187
    • (2000) Neuropsychologia , vol.38 , pp. 1180-1187
    • Grant, S.1    Contoreggi, C.2    London, E.3
  • 19
    • 0000209274
    • Experiments on stable suboptimality in individual behavior
    • Herrnstein R. Experiments on stable suboptimality in individual behavior. The American Economic Review 81 2 (1991) 360-364
    • (1991) The American Economic Review , vol.81 , Issue.2 , pp. 360-364
    • Herrnstein, R.1
  • 23
    • 1942539715
    • SUSTAIN: A network model of category learning
    • Love B., Medin D., and Gureckis T. SUSTAIN: A network model of category learning. Psychological Review 111 2 (2004) 309-332
    • (2004) Psychological Review , vol.111 , Issue.2 , pp. 309-332
    • Love, B.1    Medin, D.2    Gureckis, T.3
  • 27
    • 0037057753
    • Neural economics and the biological substrates of valuation
    • Montague P., and Berns G. Neural economics and the biological substrates of valuation. Neuron 36 (2002) 265-284
    • (2002) Neuron , vol.36 , pp. 265-284
    • Montague, P.1    Berns, G.2
  • 28
    • 0028972278
    • Bee foraging in uncertain environments using predictive Hebbian learning
    • Montague P., Dayan P., Person C., and Sejnowski T. Bee foraging in uncertain environments using predictive Hebbian learning. Nature 377 6551 (1995) 725-728
    • (1995) Nature , vol.377 , Issue.6551 , pp. 725-728
    • Montague, P.1    Dayan, P.2    Person, C.3    Sejnowski, T.4
  • 29
    • 0029981543
    • A framework for mesencephalic dopamine systems based on predictive Hebbian learning
    • Montague P., Dayan P., and Sejnowski T. A framework for mesencephalic dopamine systems based on predictive Hebbian learning. Journal of Neuroscience 16 5 (1996) 1936-1947
    • (1996) Journal of Neuroscience , vol.16 , Issue.5 , pp. 1936-1947
    • Montague, P.1    Dayan, P.2    Sejnowski, T.3
  • 30
    • 0000238336
    • A simplex method for function minimization
    • Nelder J., and Mead R. A simplex method for function minimization. Computer Journal 7 (1965) 308-313
    • (1965) Computer Journal , vol.7 , pp. 308-313
    • Nelder, J.1    Mead, R.2
  • 31
    • 67349169259
    • Melioration dominates maximization: Stable suboptimal performance despite global feedback
    • Sun R., and Miyake N. (Eds), Lawrence Erlbaum Associates, Hillsdale, NJ
    • Neth H., Sims C., and Gray W. Melioration dominates maximization: Stable suboptimal performance despite global feedback. In: Sun R., and Miyake N. (Eds). Proceedings of the 28th annual meeting of the cognitive science society (2006), Lawrence Erlbaum Associates, Hillsdale, NJ
    • (2006) Proceedings of the 28th annual meeting of the cognitive science society
    • Neth, H.1    Sims, C.2    Gray, W.3
  • 32
    • 34548837994
    • Reconciling reinforcement learning models with behavioral extinction and renewal: Implications for addiction, relapse, and problem gambling
    • Redish A., Jensen S., Johnson A., and Kurth-Nelson Z. Reconciling reinforcement learning models with behavioral extinction and renewal: Implications for addiction, relapse, and problem gambling. Psychological Review 114 3 (2007) 784-805
    • (2007) Psychological Review , vol.114 , Issue.3 , pp. 784-805
    • Redish, A.1    Jensen, S.2    Johnson, A.3    Kurth-Nelson, Z.4
  • 33
    • 0002109138
    • A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and non-reinforcement
    • Black A., and Prokasy W. (Eds), Appleton-Century-Crofts, New York
    • Rescorla R., and Wagner A. A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and non-reinforcement. In: Black A., and Prokasy W. (Eds). Classical conditioning II: Current research and theory (1972), Appleton-Century-Crofts, New York 64-99
    • (1972) Classical conditioning II: Current research and theory , pp. 64-99
    • Rescorla, R.1    Wagner, A.2
  • 34
    • 1442280641
    • Learning by pigeons playing against tit-for-tat in an operant prisoner's dilemma
    • Sanabria F., Baker F., and Rachlin H. Learning by pigeons playing against tit-for-tat in an operant prisoner's dilemma. Learning and Behavior 31 4 (2003) 318-331
    • (2003) Learning and Behavior , vol.31 , Issue.4 , pp. 318-331
    • Sanabria, F.1    Baker, F.2    Rachlin, H.3
  • 35
    • 0030896968
    • A neural substrate of prediction and reward
    • Schultz W., Dayan P., and Montague P.R. A neural substrate of prediction and reward. Science 275 (1997) 1593-1598
    • (1997) Science , vol.275 , pp. 1593-1598
    • Schultz, W.1    Dayan, P.2    Montague, P.R.3
  • 36
    • 0000120766
    • Estimating the dimension of a model
    • Schwarz G. Estimating the dimension of a model. The Annals of Statistics 6 (1978) 461-464
    • (1978) The Annals of Statistics , vol.6 , pp. 461-464
    • Schwarz, G.1
  • 37
    • 84948875029
    • Insight without awareness: On the interaction of verbalization, instruction, and practice in a simulated process control task
    • Stanley W., Mathews R., Buss R., and Kotler-Cope S. Insight without awareness: On the interaction of verbalization, instruction, and practice in a simulated process control task. Quarterly Journal of Experimental Psychology 41A 3 (1989) 553-577
    • (1989) Quarterly Journal of Experimental Psychology , vol.41 A , Issue.3 , pp. 553-577
    • Stanley, W.1    Mathews, R.2    Buss, R.3    Kotler-Cope, S.4
  • 38
    • 12344306223
    • The interaction of the explicit and the implicit in skill learning: A dual-process approach
    • Sun R., Slusarz P., and Terry C. The interaction of the explicit and the implicit in skill learning: A dual-process approach. Psychological Review 112 1 (2005) 159-192
    • (2005) Psychological Review , vol.112 , Issue.1 , pp. 159-192
    • Sun, R.1    Slusarz, P.2    Terry, C.3
  • 39
    • 0035961179
    • Modeling functions of striatal dopamine modulation in learning and planning
    • Suri R., Bargas J., and Arbib M. Modeling functions of striatal dopamine modulation in learning and planning. Neuroscience 103 (2001) 65-85
    • (2001) Neuroscience , vol.103 , pp. 65-85
    • Suri, R.1    Bargas, J.2    Arbib, M.3
  • 40
    • 85156221438
    • Generalization in reinforcement learning: Successful examples using sparse coarse coding
    • Touretzky D., Mozer M., and Hasselmo M. (Eds), MIT Press, Cambridge, MA
    • Sutton R. Generalization in reinforcement learning: Successful examples using sparse coarse coding. In: Touretzky D., Mozer M., and Hasselmo M. (Eds). Advances in neural information processing systems: Proceedings of the 1995 conference (1996), MIT Press, Cambridge, MA 1038-1044
    • (1996) Advances in neural information processing systems: Proceedings of the 1995 conference , pp. 1038-1044
    • Sutton, R.1
  • 42
    • 0000985504
    • TD-Gammon, a self-teaching backgammon program, achieves master-level play
    • Tesauro G. TD-Gammon, a self-teaching backgammon program, achieves master-level play. Neural Computation 6 2 (1994) 215-219
    • (1994) Neural Computation , vol.6 , Issue.2 , pp. 215-219
    • Tesauro, G.1
  • 45
    • 0002330234
    • Inhibition in Pavlovian conditioning: Application of a theory
    • Boakes R., and Halliday M. (Eds), Academic Press, London
    • Wagner A., and Rescorla R. Inhibition in Pavlovian conditioning: Application of a theory. In: Boakes R., and Halliday M. (Eds). Inhibition and learning (1972), Academic Press, London 301-336
    • (1972) Inhibition and learning , pp. 301-336
    • Wagner, A.1    Rescorla, R.2
  • 46
    • 0004049893
    • Unpublished doctoral dissertation. Cambridge, England: Cambridge University
    • Watkins, C. (1989). Learning from delayed rewards. Unpublished doctoral dissertation. Cambridge, England: Cambridge University.
    • (1989) Learning from delayed rewards
    • Watkins, C.1
  • 47
    • 0002557085
    • Learning to perceive and act by trial and error
    • Whitehead S., and Ballard D. Learning to perceive and act by trial and error. Machine Learning 7 1 (1991) 45-83
    • (1991) Machine Learning , vol.7 , Issue.1 , pp. 45-83
    • Whitehead, S.1    Ballard, D.2
  • 50
    • 25644450322
    • Comparison of basic assumptions embedded in learning models for experience based decision-making
    • Yechiam E., and Busemeyer J. Comparison of basic assumptions embedded in learning models for experience based decision-making. Psychonomic Bulletin and Review 12 3 (2005) 387-402
    • (2005) Psychonomic Bulletin and Review , vol.12 , Issue.3 , pp. 387-402
    • Yechiam, E.1    Busemeyer, J.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.