SCOPUS Information Search Platform

SIAM Journal on Control and Optimization

Volume 44, Issue 2, 2006, Pages 495-514

Individual Q-learning in normal form games

a UNIVERSITY OF NEW SOUTH WALES (Australia)

b UNIVERSITY OF BRISTOL (United Kingdom)

Author keywords

Multi agent learning; Normal form games; Player dependent learning rates; Reinforcement learning; Stochastic approximation

Indexed keywords

ALGORITHMS; GAME THEORY; ITERATIVE METHODS; PROBLEM SOLVING;

MULTI-AGENT LEARNING; NORMAL FORM GAMES; PLAYER-DEPENDENT LEARNING RATES; REINFORCEMENT LEARNING; STOCHASTIC APPROXIMATION;

LEARNING SYSTEMS;

EID: 33645029191 PISSN: 03630129 EISSN: None Source Type: Journal
DOI: 10.1137/S0363012903437976 Document Type: Article

Times cited : (137)

References (38)

1
- 0029513526
- Gambling in a rigged casino: The adversarial multi-armed bandit problem
- IEEE Computer Society Press, Los Alamitos, CA
- P. AUER, N. CESA-BIANCHI, Y. FREUND, AND R. E. SCHAPIRE (1995), Gambling in a rigged casino: The adversarial multi-armed bandit problem, in 36th Annual Symposium on Foundations of Computer Science, IEEE Computer Society Press, Los Alamitos, CA, pp. 322-331.
- (1995) 36th Annual Symposium on Foundations of Computer Science , pp. 322-331
- Auer, P.¹ Cesa-Bianchi, N.² Freund, Y.³ Schapire, R.E.⁴

2
- 0002430114
- Subjectivity and correlation in randomized strategies
- R. J. AUMANN (1974), Subjectivity and correlation in randomized strategies, J. Math. Econom., 1, pp. 67-96.
- (1974) J. Math. Econom. , vol.1 , pp. 67-96
- Aumann, R.J.¹

3
- 0038623721
- On pseudo-games
- A. BAÑOS (1968), On pseudo-games, Ann. Math. Statist., 39, pp. 1932-1945.
- (1968) Ann. Math. Statist. , vol.39 , pp. 1932-1945
- Baños, A.¹

4
- 0001793657
- Dynamics of stochastic approximation algorithms
- Le Séminaire de Probabilités, Springer-Verlag, Berlin
- M. BENAÏM (1999), Dynamics of stochastic approximation algorithms, in Le Séminaire de Probabilités, Lecture Notes in Math. 1709, Springer-Verlag, Berlin, pp. 1-68.
- (1999) Lecture Notes in Math. , vol.1709 , pp. 1-68
- Benaïm, M.¹

5
- 0002277539
- Mixed equilibria and dynamical systems arising from fictitious play in perturbed games
- M. BENAÏM AND M. W. HIRSCH (1999), Mixed equilibria and dynamical systems arising from fictitious play in perturbed games, Games Econom. Behav., 29, pp. 36-72.
- (1999) Games Econom. Behav. , vol.29 , pp. 36-72
- Benaïm, M.¹ Hirsch, M.W.²

6
- 12444269117
- Fictitious play in 2×n games
- U. BERGER (2005), Fictitious play in 2×n games, J. Econom. Theory, 120, pp. 139-154.
- (2005) J. Econom. Theory , vol.120 , pp. 139-154
- Berger, U.¹

7
- 0031281590
- Learning through reinforcement and replicator dynamics
- T. BÖRGERS AND R. SARIN (1997), Learning through reinforcement and replicator dynamics, J. Econom. Theory, 77, pp. 1-14.
- (1997) J. Econom. Theory , vol.77 , pp. 1-14
- Börgers, T.¹ Sarin, R.²

8
- 33645016930
- V. S. BORKAR (2001), Reinforcement learning in Markovian evolutionary games, available at http://www.tcs.tifr.res.in/~borkar/game.ps.
- (2001) Reinforcement Learning in Markovian Evolutionary Games
- Borkar, V.S.¹

9
- 0036531878
- Multiagent learning using a variable learning rate
- M. BOWLING AND M. VELOSO (2002), Multiagent learning using a variable learning rate, Artificial Intelligence, 136, pp. 215-250.
- (2002) Artificial Intelligence , vol.136 , pp. 215-250
- Bowling, M.¹ Veloso, M.²

10
- 0000719863
- Packet routing in dynamically changing networks: A reinforcement learning approach
- J. D. Cowan, G. Tesauro, and J. Alspector, eds., Morgan Kaufmann, San Francisco
- J. A. BOYAN AND M. L. LITTMAN (1994), Packet routing in dynamically changing networks: A reinforcement learning approach, in Advances in Neural Information Processing Systems, Vol. 6, J. D. Cowan, G. Tesauro, and J. Alspector, eds., Morgan Kaufmann, San Francisco, pp. 671-678.
- (1994) Advances in Neural Information Processing Systems , vol.6 , pp. 671-678
- Boyan, J.A.¹ Littman, M.L.²

11
- 0002672918
- Iterative solution of games by fictitious play
- T. C. Koopmans, ed., John Wiley & Sons, New York
- G. W. BROWN (1951), Iterative solution of games by fictitious play, in Activity Analysis of Production and Allocation, T. C. Koopmans, ed., John Wiley & Sons, New York, pp. 374-376.
- (1951) Activity Analysis of Production and Allocation , pp. 374-376
- Brown, G.W.¹

12
- 0011595015
- Stochastic approximation procedures with randomly varying truncations
- H.-F. CHEN AND Y.-M. ZHU (1986), Stochastic approximation procedures with randomly varying truncations, Sci. China Ser. A, 29, pp. 914-926.
- (1986) Sci. China Ser. A , vol.29 , pp. 914-926
- Chen, H.-F.¹ Zhu, Y.-M.²

13
- 0007336816
- Ph.D. thesis, University of California, Berkeley
- S. COWAN (1992), Dynamical Systems Arising from Game Theory, Ph.D. thesis, University of California, Berkeley.
- (1992) Dynamical Systems Arising from Game Theory
- Cowan, S.¹

14
- 0032208335
- Elevator group control using multiple reinforcement learning agents
- R. H. CRITES AND A. G. BARTO (1998), Elevator group control using multiple reinforcement learning agents, Machine Learning, 33, pp. 235-262.
- (1998) Machine Learning , vol.33 , pp. 235-262
- Crites, R.H.¹ Barto, A.G.²

15
- 0141838158
- Learning, hypothesis testing, and Nash equilibrium
- D. P. FOSTER AND H. P. YOUNG (2003), Learning, hypothesis testing, and Nash equilibrium, Games Econom. Behav., 45, pp. 73-96.
- (2003) Games Econom. Behav. , vol.45 , pp. 73-96
- Foster, D.P.¹ Young, H.P.²

16
- 0004247096
- MIT Press, Cambridge, MA
- D. FUDENBERG AND D. K. LEVINE (1998), The Theory of Learning in Games, MIT Press, Cambridge, MA.
- (1998) The Theory of Learning in Games
- Fudenberg, D.¹ Levine, D.K.²

17
- 0004260007
- MIT Press, Cambridge, MA
- D. FUDENBERG AND J. TIROLE (1991), Game Theory, MIT Press, Cambridge, MA.
- (1991) Game Theory
- Fudenberg, D.¹ Tirole, J.²

18
- 22944471739
- Physiological utility theory and the neuroeconomics of choice
- to appear
- P. W. GLIMCHER, M. C. DORRIS, AND H. M. BAYER (2005), Physiological utility theory and the neuroeconomics of choice, Games Econom. Behav., to appear.
- (2005) Games Econom. Behav.
- Glimcher, P.W.¹ Dorris, M.C.² Bayer, H.M.³

19
- 0242570473
- A short proof of Harsanyi's purification theorem
- S. GOVINDAN, P. J. RENY, AND A. J. ROBSON (2003), A short proof of Harsanyi's purification theorem, Games Econom. Behav., 45, pp. 369-374.
- (2003) Games Econom. Behav. , vol.45 , pp. 369-374
- Govindan, S.¹ Reny, P.J.² Robson, A.J.³

20
- 0001976283
- Approximation to Bayes risk in repeated play
- Contributions to the Theory of Games, M. Dresher, A. W. Tucker, and P. Wolfe, eds., Princeton University Press, Princeton, NJ
- J. HANNAN (1957), Approximation to Bayes risk in repeated play, in Contributions to the Theory of Games, Vol. 3, Ann. of Math. Stud. 39, M. Dresher, A. W. Tucker, and P. Wolfe, eds., Princeton University Press, Princeton, NJ, pp. 97-139.
- (1957) Ann. of Math. Stud. , vol.39 , pp. 97-139
- Hannan, J.¹

21
- 0003161771
- Games with randomly disturbed payoffs: A new rationale for mixed-strategy equilibrium points
- J. C. HARSANYI (1973), Games with randomly disturbed payoffs: A new rationale for mixed-strategy equilibrium points, Internat. J. Game Theory, 2, pp. 1-23.
- (1973) Internat. J. Game Theory , vol.2 , pp. 1-23
- Harsanyi, J.C.¹

22
- 0000908510
- A simple adaptive procedure leading to correlated equilibrium
- S. HART AND A. MAS-COLELL (2000), A simple adaptive procedure leading to correlated equilibrium, Econometrica, 68, pp. 1127-1150.
- (2000) Econometrica , vol.68 , pp. 1127-1150
- Hart, S.¹ Mas-Colell, A.²

23
- 0242684983
- A reinforcement procedure leading to correlated equilibrium
- G. Debreu, W. Neuefeind, and W. Trockel, eds., Springer-Verlag, Berlin
- S. HART AND A. MAS-COLELL (2001), A reinforcement procedure leading to correlated equilibrium, in Economic Essays: A Festschrift for Werner Hildenbrand, G. Debreu, W. Neuefeind, and W. Trockel, eds., Springer-Verlag, Berlin, pp. 181-200.
- (2001) Economic Essays: A Festschrift for Werner Hildenbrand , pp. 181-200
- Hart, S.¹ Mas-Colell, A.²

24
- 2942744741
- Uncoupled dynamics cannot lead to Nash equilibrium
- S. HART AND A. MAS-COLELL (2003), Uncoupled dynamics cannot lead to Nash equilibrium, Amer. Econom. Rev., 93, pp. 1830-1836.
- (2003) Amer. Econom. Rev. , vol.93 , pp. 1830-1836
- Hart, S.¹ Mas-Colell, A.²

25
- 20344390000
- Learning in perturbed asymmetric games
- J. HOFBAUER AND E. HOPKINS (2005), Learning in perturbed asymmetric games, Games Econom. Behav., 52, pp. 133-152.
- (2005) Games Econom. Behav. , vol.52 , pp. 133-152
- Hofbauer, J.¹ Hopkins, E.²

26
- 0000415605
- Three problems in learning mixed strategy equilibria
- J. S. JORDAN (1993), Three problems in learning mixed strategy equilibria, Games Econom. Behav., 5, pp. 368-386.
- (1993) Games Econom. Behav. , vol.5 , pp. 368-386
- Jordan, J.S.¹

27
- 0141503453
- Multi-agent influence diagrams for representing and solving games
- D. KOLLER AND B. MILCH (2003), Multi-agent influence diagrams for representing and solving games, Games Econom. Behav., 45, pp. 181-221.
- (2003) Games Econom. Behav. , vol.45 , pp. 181-221
- Koller, D.¹ Milch, B.²

28
- 0345532155
- Stochastic approximation algorithms and applications
- Springer-Verlag, New York
- H. J. KUSHNER AND G. G. YIN (1997), Stochastic Approximation Algorithms and Applications, Appl. Math. 35, Springer-Verlag, New York.
- (1997) Appl. Math. , vol.35
- Kushner, H.J.¹ Yin, G.G.²

29
- 0346913265
- Convergent multiple-timescales reinforcement learning algorithms in normal form games
- D. S. LESLIE AND E. J. COLLINS (2003), Convergent multiple-timescales reinforcement learning algorithms in normal form games, Ann. Appl. Probab., 13, pp. 1231-1251.
- (2003) Ann. Appl. Probab. , vol.13 , pp. 1231-1251
- Leslie, D.S.¹ Collins, E.J.²

30
- 0242635258
- An efficient, exact algorithm for solving tree-structured graphical games
- T. G. Dietterich, S. Becker, and Z. Ghahramani, eds., MIT Press, Cambridge, MA
- M. L. LITTMAN, M. KEARNS, AND S. SINGH (2001), An efficient, exact algorithm for solving tree-structured graphical games, in Advances in Neural Information Processing Systems, Vol. 14, T. G. Dietterich, S. Becker, and Z. Ghahramani, eds., MIT Press, Cambridge, MA.
- (2001) Advances in Neural Information Processing Systems , vol.14
- Littman, M.L.¹ Kearns, M.² Singh, S.³

31
- 0004018184
- Cambridge University Press, Cambridge, UK
- J. MAYNARD SMITH (1982), Evolution and the Theory of Games, Cambridge University Press, Cambridge, UK.
- (1982) Evolution and the Theory of Games
- Maynard Smith, J.¹

32
- 0038675791
- On repeated games with incomplete information played by non-Bayesian players
- N. MEGIDDO (1980), On repeated games with incomplete information played by non-Bayesian players, Internat. J. Game Theory, 9, pp. 157-167.
- (1980) Internat. J. Game Theory , vol.9 , pp. 157-167
- Megiddo, N.¹

33
- 0002021736
- Equilibrium points in n-person games
- J. NASH (1950), Equilibrium points in n-person games, Proc. Natl. Acad. Sci. USA, 36, pp. 48-49.
- (1950) Proc. Natl. Acad. Sci. USA , vol.36 , pp. 48-49
- Nash, J.¹

34
- 0001000786
- Nonconvergence to unstable points in urn models and stochastic approximations
- R. PEMANTLE (1990), Nonconvergence to unstable points in urn models and stochastic approximations, Ann. Probab., 18, pp. 698-712.
- (1990) Ann. Probab. , vol.18 , pp. 698-712
- Pemantle, R.¹

35
- 58149324992
- Learning in extensive-form games: Experimental data and simple dynamic models in the intermediate term
- A. E. ROTH AND I. EREV (1995), Learning in extensive-form games: Experimental data and simple dynamic models in the intermediate term, Games Econom. Behav., 8, pp. 164-212.
- (1995) Games Econom. Behav. , vol.8 , pp. 164-212
- Roth, A.E.¹ Erev, I.²

36
- 0002623794
- Some topics in two person games
- M. Dresher, L. S. Shapley, and A. W. Tucker, eds., Princeton University Press, Princeton, NJ
- L. S. SHAPLEY (1964), Some topics in two person games, in Advances in Game Theory, M. Dresher, L. S. Shapley, and A. W. Tucker, eds., Princeton University Press, Princeton, NJ, pp. 1-28.
- (1964) Advances in Game Theory , pp. 1-28
- Shapley, L.S.¹

37
- 84898972974
- Reinforcement learning for dynamic channel allocation in cellular telephone systems
- M. C. Mozer, M. I. Jordan, and T. Petsche, eds., MIT Press, Cambridge, MA
- S. SINGH AND D. BERTSEKAS (1997), Reinforcement learning for dynamic channel allocation in cellular telephone systems, in Advances in Neural Information Processing Systems, Vol. 9, M. C. Mozer, M. I. Jordan, and T. Petsche, eds., MIT Press, Cambridge, MA, p. 974.
- (1997) Advances in Neural Information Processing Systems , vol.9 , pp. 974
- Singh, S.¹ Bertsekas, D.²

38
- 0004102479
- MIT Press, Cambridge, MA
- R. S. SUTTON AND A. G. BARTO (1998), Reinforcement Learning: An Introduction, MIT Press, Cambridge, MA.
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.S.¹ Barto, A.G.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.