



Volume 61, Issue 2, 2007, Pages 259-276

Transient and asymptotic dynamics of reinforcement learning in games

Author keywords

Bush and Mosteller; Distance diminishing; Learning in games; Reinforcement learning; Slow learning; Stochastic approximation



EID: 35048853187     PISSN: 08998256     EISSN: 10902473     Source Type: Journal    
DOI: 10.1016/j.geb.2007.01.005     Document Type: Article
Times cited: (44)

References (46)
  • 1
    • 0001254030
    • Designing economic agents that act like human agents: A behavioral approach to bounded rationality
    • Arthur W.B. Designing economic agents that act like human agents: A behavioral approach to bounded rationality. Amer. Econ. Rev. 81 2 (1991) 353-359
    • (1991) Amer. Econ. Rev. , vol.81 , Issue.2 , pp. 353-359
    • Arthur, W.B.1
  • 2
    • 0036003050
    • Stochastic evolution with slow learning
    • Beggs A. Stochastic evolution with slow learning. Econ. Theory 19 (2002) 379-405
    • (2002) Econ. Theory , vol.19 , pp. 379-405
    • Beggs, A.1
  • 3
    • 16244410118
    • On the convergence of reinforcement learning
    • Beggs A.W. On the convergence of reinforcement learning. J. Econ. Theory 122 (2005) 1-36
    • (2005) J. Econ. Theory , vol.122 , pp. 1-36
    • Beggs, A.W.1
  • 4
    • 0038364234
    • Aspiration-based reinforcement learning in repeated interaction games: An overview
    • Bendor J., Mookherjee D., and Ray D. Aspiration-based reinforcement learning in repeated interaction games: An overview. Int. Game Theory Rev. 3 2-3 (2001) 159-174
    • (2001) Int. Game Theory Rev. , vol.3 , Issue.2-3 , pp. 159-174
    • Bendor, J.1    Mookherjee, D.2    Ray, D.3
  • 5
    • 0038660317
    • Reinforcement learning in repeated interaction games
    • Article 3
    • Bendor J., Mookherjee D., and Ray D. Reinforcement learning in repeated interaction games. Adv. Theor. Econ. 1 1 (2001) Article 3
    • (2001) Adv. Theor. Econ. , vol.1 , Issue.1
    • Bendor, J.1    Mookherjee, D.2    Ray, D.3
  • 7
    • 21344490097
    • An economist's perspective on the evolution of norms
    • Binmore K., and Samuelson L. An economist's perspective on the evolution of norms. J. Institutional Theoret. Econ. 150 (1993) 45-63
    • (1993) J. Institutional Theoret. Econ. , vol.150 , pp. 45-63
    • Binmore, K.1    Samuelson, L.2
  • 9
    • 0031281590
    • Learning through reinforcement and replicator dynamics
    • Börgers T., and Sarin R. Learning through reinforcement and replicator dynamics. J. Econ. Theory 77 (1997) 1-14
    • (1997) J. Econ. Theory , vol.77 , pp. 1-14
    • Börgers, T.1    Sarin, R.2
  • 10
    • 0042571479
    • Naive reinforcement learning with endogenous aspirations
    • Börgers T., and Sarin R. Naive reinforcement learning with endogenous aspirations. Int. Econ. Rev. 41 (2000) 921-950
    • (2000) Int. Econ. Rev. , vol.41 , pp. 921-950
    • Börgers, T.1    Sarin, R.2
  • 11
    • 38249007930
    • Laws of large numbers for dynamical systems with randomly matched individuals
    • Boylan R.T. Laws of large numbers for dynamical systems with randomly matched individuals. J. Econ. Theory 57 (1992) 473-504
    • (1992) J. Econ. Theory , vol.57 , pp. 473-504
    • Boylan, R.T.1
  • 12
    • 0010970766
    • Continuous approximation of dynamical systems with randomly matched individuals
    • Boylan R.T. Continuous approximation of dynamical systems with randomly matched individuals. J. Econ. Theory 66 (1995) 615-625
    • (1995) J. Econ. Theory , vol.66 , pp. 615-625
    • Boylan, R.T.1
  • 15
    • 0000017298
    • Learning and incentive-compatible mechanisms for public goods provision: An experimental study
    • Chen Y., and Tang F. Learning and incentive-compatible mechanisms for public goods provision: An experimental study. J. Polit. Economy 106 (1998) 633-662
    • (1998) J. Polit. Economy , vol.106 , pp. 633-662
    • Chen, Y.1    Tang, F.2
  • 16
    • 0000742255
    • A stochastic learning model of economic behavior
    • Cross J.G. A stochastic learning model of economic behavior. Quart. J. Econ. 87 (1973) 239-266
    • (1973) Quart. J. Econ. , vol.87 , pp. 239-266
    • Cross, J.G.1
  • 17
    • 66049149458
    • Agent-based models and human subject experiments
    • Tesfatsion L., and Judd K.L. (Eds), Elsevier, North-Holland (Chapter 19)
    • Duffy J. Agent-based models and human subject experiments. In: Tesfatsion L., and Judd K.L. (Eds). Handbook of Computational Economics II: Agent-Based Computational Economics (2006), Elsevier, North-Holland 949-1011 (Chapter 19)
    • (2006) Handbook of Computational Economics II: Agent-Based Computational Economics , pp. 949-1011
    • Duffy, J.1
  • 18
    • 0038829878
    • Predicting how people play games: Reinforcement learning in experimental games with unique, mixed strategy equilibria
    • Erev I., and Roth A.E. Predicting how people play games: Reinforcement learning in experimental games with unique, mixed strategy equilibria. Amer. Econ. Rev. 88 4 (1998) 848-881
    • (1998) Amer. Econ. Rev. , vol.88 , Issue.4 , pp. 848-881
    • Erev, I.1    Roth, A.E.2
  • 19
    • 0038630244
    • Simple reinforcement learning models and reciprocation in the Prisoner's Dilemma game
    • Gigerenzer G., and Selten R. (Eds), MIT Press, Cambridge, MA (Chapter 12)
    • Erev I., and Roth A.E. Simple reinforcement learning models and reciprocation in the Prisoner's Dilemma game. In: Gigerenzer G., and Selten R. (Eds). Bounded Rationality: The Adaptive Toolbox (2001), MIT Press, Cambridge, MA 216-231 (Chapter 12)
    • (2001) Bounded Rationality: The Adaptive Toolbox , pp. 216-231
    • Erev, I.1    Roth, A.E.2
  • 20
    • 0033130546
    • The effect of adding a constant to all payoffs: Experimental investigation, and implications for reinforcement learning models
    • Erev I., Bereby-Meyer Y., and Roth A.E. The effect of adding a constant to all payoffs: Experimental investigation, and implications for reinforcement learning models. J. Econ. Behav. Organ. 39 1 (1999) 111-128
    • (1999) J. Econ. Behav. Organ. , vol.39 , Issue.1 , pp. 111-128
    • Erev, I.1    Bereby-Meyer, Y.2    Roth, A.E.3
  • 21
    • 0036390926
    • Stochastic collusion and the power law of learning
    • Flache A., and Macy M.W. Stochastic collusion and the power law of learning. J. Conflict Resolution 46 5 (2002) 629-653
    • (2002) J. Conflict Resolution , vol.46 , Issue.5 , pp. 629-653
    • Flache, A.1    Macy, M.W.2
  • 22
    • 0036434064
    • Two competing models of how people learn in games
    • Hopkins E. Two competing models of how people learn in games. Econometrica 70 (2002) 2141-2166
    • (2002) Econometrica , vol.70 , pp. 2141-2166
    • Hopkins, E.1
  • 23
    • 26844467703
    • Attainability of boundary points under reinforcement learning
    • Hopkins E., and Posch M. Attainability of boundary points under reinforcement learning. Games Econ. Behav. 53 1 (2005) 110-125
    • (2005) Games Econ. Behav. , vol.53 , Issue.1 , pp. 110-125
    • Hopkins, E.1    Posch, M.2
  • 25
    • 35048902091
    • Ianni, A., 2001. Reinforcement learning and the power law of practice. Mimeo. University of Southampton
  • 28
    • 33644506135
    • A reinforcement learning process in extensive form games
    • Laslier J., and Walliser B. A reinforcement learning process in extensive form games. Int. J. Game Theory 33 (2005) 219-227
    • (2005) Int. J. Game Theory , vol.33 , pp. 219-227
    • Laslier, J.1    Walliser, B.2
  • 30
    • 0037076357
    • Learning dynamics in social dilemmas
    • Macy M.W., and Flache A. Learning dynamics in social dilemmas. Proc. Natl. Acad. Sci. USA 99 3 (2002) 7229-7236
    • (2002) Proc. Natl. Acad. Sci. USA , vol.99 , Issue.3 , pp. 7229-7236
    • Macy, M.W.1    Flache, A.2
  • 32
    • 0009391820
    • Adaptive approaches to stochastic programming
    • McAllister P.H. Adaptive approaches to stochastic programming. Ann. Oper. Res. 30 (1991) 45-62
    • (1991) Ann. Oper. Res. , vol.30 , pp. 45-62
    • McAllister, P.H.1
  • 33
    • 0002053554
    • Learning behavior in an experimental matching pennies game
    • Mookherjee D., and Sopher B. Learning behavior in an experimental matching pennies game. Games Econ. Behav. 7 (1994) 62-91
    • (1994) Games Econ. Behav. , vol.7 , pp. 62-91
    • Mookherjee, D.1    Sopher, B.2
  • 34
    • 0002159270
    • Learning and decision costs in experimental constant sum games
    • Mookherjee D., and Sopher B. Learning and decision costs in experimental constant sum games. Games Econ. Behav. 19 (1997) 97-132
    • (1997) Games Econ. Behav. , vol.19 , pp. 97-132
    • Mookherjee, D.1    Sopher, B.2
  • 35
    • 0002710392
    • Some convergence theorems for stochastic learning models with distance diminishing operators
    • Norman M.F. Some convergence theorems for stochastic learning models with distance diminishing operators. J. Math. Psychol. 5 (1968) 61-101
    • (1968) J. Math. Psychol. , vol.5 , pp. 61-101
    • Norman, M.F.1
  • 37
    • 0033481949
    • Convergence of aspirations and (partial) cooperation in the Prisoner's Dilemma
    • Palomino F., and Vega-Redondo F. Convergence of aspirations and (partial) cooperation in the Prisoner's Dilemma. Int. J. Game Theory 28 4 (1999) 465-488
    • (1999) Int. J. Game Theory , vol.28 , Issue.4 , pp. 465-488
    • Palomino, F.1    Vega-Redondo, F.2
  • 39
    • 0031287487
    • Cycling in a stochastic learning algorithm for normal form games
    • Posch M. Cycling in a stochastic learning algorithm for normal form games. J. Evolutionary Econ. 7 (1997) 193-207
    • (1997) J. Evolutionary Econ. , vol.7 , pp. 193-207
    • Posch, M.1
  • 40
    • 58149324992
    • Learning in extensive-form games: Experimental data and simple dynamic models in the intermediate term
    • Roth A.E., and Erev I. Learning in extensive-form games: Experimental data and simple dynamic models in the intermediate term. Games Econ. Behav. 8 (1995) 164-212
    • (1995) Games Econ. Behav. , vol.8 , pp. 164-212
    • Roth, A.E.1    Erev, I.2
  • 41
    • 0001703679
    • Optimal properties of stimulus-response learning models
    • Rustichini A. Optimal properties of stimulus-response learning models. Games Econ. Behav. 29 (1999) 244-273
    • (1999) Games Econ. Behav. , vol.29 , pp. 244-273
    • Rustichini, A.1
  • 42
    • 0028423534
    • Decentralized learning of Nash equilibria in multi-person stochastic games with incomplete information
    • Sastry P.S., Phansalkar V.V., and Thathachar M.A.L. Decentralized learning of Nash equilibria in multi-person stochastic games with incomplete information. IEEE Trans. Syst. Man Cybernet. 24 5 (1994) 769-777
    • (1994) IEEE Trans. Syst. Man Cybernet. , vol.24 , Issue.5 , pp. 769-777
    • Sastry, P.S.1    Phansalkar, V.V.2    Thathachar, M.A.L.3
  • 43
    • 33747856809
    • Re-examination of the perfectness concept for equilibrium points in extensive games
    • Selten R. Re-examination of the perfectness concept for equilibrium points in extensive games. Int. J. Game Theory 4 (1975) 25-55
    • (1975) Int. J. Game Theory , vol.4 , pp. 25-55
    • Selten, R.1
  • 44
    • 0002621983
    • Animal Intelligence: An Experimental Study of the Associative Processes in Animals
    • MacMillan, New York
    • Thorndike E.L. Animal Intelligence: An Experimental Study of the Associative Processes in Animals. Psychological Review, Monograph Supplements vol. 8 (1898), MacMillan, New York
    • (1898) Psychological Review, Monograph Supplements , vol.8
    • Thorndike, E.L.1
  • 46
    • 0030249446
    • A new paradigm for operant conditioning of Drosophila melanogaster
    • Wustmann G., Rein K., Wolf R., and Heisenberg M. A new paradigm for operant conditioning of Drosophila melanogaster. J. Comp. Physiol. (A) 179 (1996) 429-436
    • (1996) J. Comp. Physiol. (A) , vol.179 , pp. 429-436
    • Wustmann, G.1    Rein, K.2    Wolf, R.3    Heisenberg, M.4


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.