Volume 13, Issue 4, 2003, Pages 1231-1251

Convergent multiple-timescales reinforcement learning algorithms in normal form games

Author keywords

Best response dynamics; Reinforcement learning; Repeated normal form games; Stochastic approximation


EID: 0346913265     PISSN: 10505164     EISSN: None     Source Type: Journal    
DOI: 10.1214/aoap/1069786497     Document Type: Article
Times cited: 86

References (20)
  • 2
    • 0002277539
    • Mixed equilibria and dynamical systems arising from fictitious play in perturbed games
    • BENAÏM, M. and HIRSCH, M. W. (1999). Mixed equilibria and dynamical systems arising from fictitious play in perturbed games, Games Econom. Behav. 29 36-72.
    • (1999) Games Econom. Behav. , vol.29 , pp. 36-72
    • Benaïm, M.1    Hirsch, M.W.2
  • 4
    • 0031076413
    • Stochastic approximation with two timescales
    • BORKAR, V. S. (1997). Stochastic approximation with two timescales. Systems Control Lett. 29 291-294.
    • (1997) Systems Control Lett. , vol.29 , pp. 291-294
    • Borkar, V.S.1
  • 9
    • 0003161771
    • Games with randomly disturbed payoffs: A new rationale for mixed-strategy equilibrium points
    • HARSANYI, J. (1973). Games with randomly disturbed payoffs: A new rationale for mixed-strategy equilibrium points. Internat. J. Game Theory 2 1-23.
    • (1973) Internat. J. Game Theory , vol.2 , pp. 1-23
    • Harsanyi, J.1
  • 11
    • 0000978264
    • A note on best response dynamics
    • HOPKINS, E. (1999). A note on best response dynamics. Games Econom. Behav. 29 138-150.
    • (1999) Games Econom. Behav. , vol.29 , pp. 138-150
    • Hopkins, E.1
  • 12
    • 0002316532
    • Geometric singular perturbation theory
    • Springer, Berlin
    • JONES, C. K. R. T. (1995). Geometric singular perturbation theory. Dynamical Systems. Lecture Notes in Math. 1609 44-118. Springer, Berlin.
    • (1995) Dynamical Systems. Lecture Notes in Math. , vol.1609 , pp. 44-118
    • Jones, C.K.R.T.1
  • 13
    • 0000415605
    • Three problems in learning mixed strategy equilibria
    • JORDAN, J. S. (1993). Three problems in learning mixed strategy equilibria. Games Econom. Behav. 5 368-386.
    • (1993) Games Econom. Behav. , vol.5 , pp. 368-386
    • Jordan, J.S.1
  • 14
    • 0343893613
    • Actor-critic-type learning algorithms for Markov decision process
    • KONDA, V. R. and BORKAR, V. S. (2000). Actor-critic-type learning algorithms for Markov decision process. SIAM J. Control Opt. 38 94-123.
    • (2000) SIAM J. Control Opt. , vol.38 , pp. 94-123
    • Konda, V.R.1    Borkar, V.S.2
  • 17
    • 0001730497
    • Non-cooperative games
    • NASH, J. (1951). Non-cooperative games. Ann. Math. 54 286-295.
    • (1951) Ann. Math. , vol.54 , pp. 286-295
    • Nash, J.1
  • 18
    • 0001000786
    • Nonconvergence to unstable points in urn models and stochastic approximations
    • PEMANTLE, R. (1990). Nonconvergence to unstable points in urn models and stochastic approximations. Ann. Probab. 18 698-712.
    • (1990) Ann. Probab. , vol.18 , pp. 698-712
    • Pemantle, R.1
  • 19
    • 0002623794
    • Some topics in two person games
    • (M. Dresher, L. S. Shapley and A. W. Tucker, eds.). Princeton Univ. Press
    • SHAPLEY, L. S. (1964). Some topics in two person games. In Advances in Game Theory (M. Dresher, L. S. Shapley and A. W. Tucker, eds.) 1-28. Princeton Univ. Press.
    • (1964) Advances in Game Theory , pp. 1-28
    • Shapley, L.S.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.