1. Auer, P., Cesa-Bianchi, N., Freund, Y., & Schapire, R. E. (1995). Gambling in a rigged casino: The adversarial multi-armed bandit problem. In Proceedings of the thirty-sixth annual symposium on foundations of computer science (pp. 322-331). Milwaukee, WI: IEEE Computer Society Press.
3. Bowling, M. (2005). Convergence and no-regret in multiagent learning. In Proceedings of NIPS 2004/5.
5. Bowling, M., & Veloso, M. (2002). Multiagent learning using a variable learning rate. Artificial Intelligence, 136, 215-250.
6. Brafman, R. I., & Tennenholtz, M. (2002). R-max - A general polynomial time algorithm for near-optimal reinforcement learning. Journal of Machine Learning Research, 3, 213-231.
8. Conitzer, V., & Sandholm, T. (2003). AWESOME: A general multiagent learning algorithm that converges in self-play and learns a best response against stationary opponents. In Proceedings of the twentieth international conference on machine learning.
11. Freund, Y., & Schapire, R. E. (1999). Adaptive game playing using multiplicative weights. Games and Economic Behavior, 29, 79-103.
15. Hart, S., & Mas-Colell, A. (2003). Uncoupled dynamics do not lead to Nash equilibrium. American Economic Review, 93(5), 1830-1836.
17. Jafari, A., Greenwald, A., Gondek, D., & Ercal, G. (2001). On no-regret learning, fictitious play, and Nash equilibrium. In Proceedings of the eighteenth international conference on machine learning (pp. 226-233).
19. Littman, M. L. (1994). Markov games as a framework for multi-agent reinforcement learning. In Proceedings of the eleventh international conference on machine learning (pp. 157-163). San Mateo, CA: Morgan Kaufmann.
22. Nash, J. F. (1951). Non-cooperative games. Annals of Mathematics, 54, 286-295.
23. Nowak, M., & Sigmund, K. (1993). A strategy of win-stay, lose-shift that outperforms tit-for-tat in the prisoner's dilemma game. Nature, 364, 56-58.
24. Owen, G. (1995). Game theory. UK: Academic Press.
25. Posch, M., & Brannath, W. (1997). Win-stay, lose-shift: A general learning rule for repeated normal form games. In Proceedings of the third international conference on computing in economics and finance, Stanford, CA, June 30-July 2, 1997.
26. Powers, R., & Shoham, Y. (2005). New criteria and a new algorithm for learning in multi-agent systems. In Proceedings of NIPS 2004/5.
27. Sandholm, T., & Crites, R. (1996). On multiagent Q-learning in a semi-competitive domain. In G. Weiß & S. Sen (Eds.), Adaptation and learning in multi-agent systems (pp. 191-205). Springer-Verlag.
28. Sen, S., Sekaran, M., & Hale, J. (1994). Learning to coordinate without sharing information. In Proceedings of the national conference on artificial intelligence (pp. 426-431). Menlo Park, CA: AAAI Press/MIT Press. (Also published in Readings in Agents, Michael N. Huhns & Munindar P. Singh (Eds.), pp. 509-514, Morgan Kaufmann Publishers Inc., San Francisco, CA, 1998.)
30. Singh, S., Kearns, M., & Mansour, Y. (2000). Nash convergence of gradient dynamics in general-sum games. In Proceedings of the sixteenth conference on uncertainty in artificial intelligence (pp. 541-548).
33. Tesauro, G. (2004). Extending Q-learning to general adaptive multi-agent systems. In S. Thrun, L. Saul, & B. Schölkopf (Eds.), Advances in neural information processing systems (Vol. 16). Cambridge, MA: MIT Press.
34. Wang, X., & Sandholm, T. (2002). Reinforcement learning to play an optimal Nash equilibrium in team Markov games. In Advances in neural information processing systems (Vol. 15, NIPS).
35. Weinberg, M., & Rosenschein, J. S. (2004). Best-response multiagent learning in non-stationary environments. In Proceedings of the third international joint conference on autonomous agents and multiagent systems (AAMAS) (Vol. 2, pp. 506-513). New York, NY: ACM.