Volume 171, Issue 7, 2007, Pages 365-377

If multi-agent learning is the answer, what is the question?

Indexed keywords

ARTIFICIAL INTELLIGENCE; GAME THEORY; LEARNING SYSTEMS;

ISSN: 0004-3702     DOI: 10.1016/j.artint.2006.02.006     Source Type: Journal     Document Type: Article
Times cited: 333

References (55)
  • 1
    • Arrow K. Rationality of self and others in an economic system. Journal of Business 59 4 (1986)
  • 2
    • B. Banerjee, J. Peng, Efficient no-regret multiagent learning, in: AAAI, 2005
  • 4
    • D. Billings, N. Burch, A. Davidson, R. Holte, J. Schaeffer, T. Schauenberg, D. Szafron, Approximating game-theoretic optimal strategies for full-scale poker, in: The Eighteenth International Joint Conference on Artificial Intelligence, 2003
  • 6
    • Bowling M. Convergence and no-regret in multiagent learning. Advances in Neural Information Processing Systems vol. 17 (2005), MIT Press, Cambridge, MA
  • 7
    • M. Bowling, M. Veloso, Rational and convergent learning in stochastic games, in: Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence, 2001
  • 8
    • Brafman R., and Tennenholtz M. R-max, a general polynomial time algorithm for near-optimal reinforcement learning. Journal of Machine Learning Research 3 (2002) 213-231
  • 10
    • Brown G. Iterative solution of games by fictitious play. Activity Analysis of Production and Allocation (1951), John Wiley and Sons, New York
  • 11
    • Camerer C., Ho T., and Chong J. Sophisticated EWA learning and strategic teaching in repeated games. Journal of Economic Theory 104 (2002) 137-188
  • 12
    • Y.-H. Chang, T. Ho, L.P. Kaelbling, Mobilized ad-hoc networks: A reinforcement learning approach, in: 1st International Conference on Autonomic Computing (ICAC 2004), 2004, pp. 240-247
  • 14
    • C. Claus, C. Boutilier, The dynamics of reinforcement learning in cooperative multiagent systems, in: Proceedings of the Fifteenth National Conference on Artificial Intelligence, 1998, pp. 746-752
  • 15
    • Erev I., and Roth A.E. Predicting how people play games: reinforcement learning in experimental games with unique, mixed strategy equilibria. The American Economic Review 88 4 (1998) 848-881
  • 21
    • A. Greenwald, K. Hall, Correlated Q-learning, in: Proceedings of the Twentieth International Conference on Machine Learning, 2003, pp. 242-249
  • 22
    • C. Guestrin, D. Koller, R. Parr, Multiagent planning with factored MDPs, in: Advances in Neural Information Processing Systems (NIPS-14), 2001
  • 24
    • Hart S., and Mas-Colell A. A simple adaptive procedure leading to correlated equilibrium. Econometrica 68 (2000) 1127-1150
  • 25
    • Hu J., and Wellman M. Nash Q-learning for general-sum stochastic games. Journal of Machine Learning Research 4 (2003) 1039-1069
  • 26
    • J. Hu, M.P. Wellman, Multiagent reinforcement learning: Theoretical framework and an algorithm, in: Proceedings of the Fifteenth International Conference on Machine Learning, 1998, pp. 242-250
  • 27
    • A. Jafari, A. Greenwald, D. Gondek, G. Ercal, On no-regret learning, fictitious play, and Nash equilibrium, in: Proceedings of the Eighteenth International Conference on Machine Learning, 2001
  • 28
    • Jehiel P., and Samet D. Learning to play games in extensive form by valuation. NAJ Economics 3 (2001)
  • 30
    • Kalai E., and Lehrer E. Rational learning leads to Nash equilibrium. Econometrica 61 5 (1993) 1019-1045
  • 31
    • S. Kapetanakis, D. Kudenko, Reinforcement learning of coordination in heterogeneous cooperative multi-agent systems, in: Proceedings of the Third Autonomous Agents and Multi-Agent Systems Conference, 2004
  • 32
    • M. Kearns, S. Singh, Near-optimal reinforcement learning in polynomial time, in: Proceedings of the Fifteenth International Conference on Machine Learning, 1998, pp. 260-268
  • 33
    • Koller D., and Pfeffer A. Representations and solutions for game-theoretic problems. Artificial Intelligence 94 1 (1997) 167-215
  • 35
    • K. Leyton-Brown, M. Tennenholtz, Local-effect games, in: Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, 2003, pp. 772-780
  • 36
    • M.L. Littman, Markov games as a framework for multi-agent reinforcement learning, in: Proceedings of the 11th International Conference on Machine Learning, 1994, pp. 157-163
  • 37
    • M.L. Littman, Friend-or-foe Q-learning in general-sum games, in: Proceedings of the Eighteenth International Conference on Machine Learning, 2001
  • 38
    • M.L. Littman, C. Szepesvari, A generalized reinforcement-learning model: Convergence and applications, in: Proceedings of the 13th International Conference on Machine Learning, 1996, pp. 310-318
  • 39
    • Mannor S., and Shimkin N. The empirical Bayes envelope and regret minimization in competitive Markov decision processes. Mathematics of Operations Research 28 2 (2003) 327-345
  • 41
    • Miyasawa K. On the convergence of learning processes in a 2 × 2 non-zero-sum two-person game. Research Memo 33 (1961)
  • 42
    • Nachbar J. Evolutionary selection dynamics in games: Convergence and limit properties. International Journal of Game Theory 19 (1990) 59-89
  • 43
    • E. Nudelman, J. Wortman, K. Leyton-Brown, Y. Shoham, Run the GAMUT: A comprehensive approach to evaluating game-theoretic algorithms, in: AAMAS, 2004
  • 44
    • R. Powers, Y. Shoham, Learning against opponents with bounded memory, in: Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence, 2005
  • 45
    • Powers R., and Shoham Y. New criteria and a new algorithm for learning in multi-agent systems. Advances in Neural Information Processing Systems vol. 17 (2005), MIT Press, Cambridge, MA
  • 46
    • Robinson J. An iterative method of solving a game. Annals of Mathematics 54 (1951) 298-301
  • 48
    • S. Sen, M. Sekaran, J. Hale, Learning to coordinate without sharing information, in: Proceedings of the Twelfth National Conference on Artificial Intelligence, Seattle, WA, 1994, pp. 426-431
  • 51
    • T. Vu, R. Powers, Y. Shoham, Learning against multiple opponents, in: Proceedings of the Fifth International Joint Conference on Autonomous Agents and Multi Agent Systems, 2006
  • 52
    • X. Wang, T. Sandholm, Reinforcement learning to play an optimal Nash equilibrium in team Markov games, in: Advances in Neural Information Processing Systems, vol. 15, 2002
  • 53
    • Watkins C., and Dayan P. Technical note: Q-learning. Machine Learning 8 3-4 (1992) 279-292
  • 55
    • M. Zinkevich, Online convex programming and generalized infinitesimal gradient ascent, in: ICML, 2003


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.