-
1
-
-
0029513526
-
Gambling in a rigged casino: The adversarial multi-arm bandit problem
-
Milwaukee, WI: IEEE Computer Society Press
-
Auer, P.; Cesa-Bianchi, N.; Freund, Y.; and Schapire, R. E. 1995. Gambling in a rigged casino: The adversarial multi-arm bandit problem. In Proceedings of the 36th Annual Symposium on Foundations of Computer Science, 322-331. Milwaukee, WI: IEEE Computer Society Press.
-
(1995)
Proceedings of the 36th Annual Symposium on Foundations of Computer Science
, pp. 322-331
-
-
Auer, P.1
Cesa-Bianchi, N.2
Freund, Y.3
Schapire, R.E.4
-
5
-
-
0041965975
-
R-max - A general polynomial time algorithm for near-optimal reinforcement learning
-
Brafman, R. I., and Tennenholtz, M. 2002b. R-max - A general polynomial time algorithm for near-optimal reinforcement learning. Journal of Machine Learning Research 3:213-231.
-
(2002)
Journal of Machine Learning Research
, vol.3
, pp. 213-231
-
-
Brafman, R.I.1
Tennenholtz, M.2
-
7
-
-
1942421183
-
AWESOME: A general multiagent learning algorithm that converges in self-play and learns a best response against stationary opponents
-
Conitzer, V., and Sandholm, T. 2003a. AWESOME: A general multiagent learning algorithm that converges in self-play and learns a best response against stationary opponents. In Proceedings of the 20th International Conference on Machine Learning.
-
(2003)
Proceedings of the 20th International Conference on Machine Learning
-
-
Conitzer, V.1
Sandholm, T.2
-
9
-
-
0002267135
-
Adaptive game playing using multiplicative weights
-
Freund, Y., and Schapire, R. E. 1999. Adaptive game playing using multiplicative weights. Games and Economic Behavior 29:79-103.
-
(1999)
Games and Economic Behavior
, vol.29
, pp. 79-103
-
-
Freund, Y.1
Schapire, R.E.2
-
14
-
-
0000929496
-
Multiagent reinforcement learning: Theoretical framework and an algorithm
-
San Francisco, CA: Morgan Kaufmann
-
Hu, J., and Wellman, M. P. 1998. Multiagent reinforcement learning: Theoretical framework and an algorithm. In Proc. of the 15th Int. Conf. on Machine Learning (ML'98), 242-250. San Francisco, CA: Morgan Kaufmann.
-
(1998)
Proc. of the 15th Int. Conf. on Machine Learning (ML'98)
, pp. 242-250
-
-
Hu, J.1
Wellman, M.P.2
-
16
-
-
9444236608
-
On no-regret learning, fictitious play, and Nash equilibrium
-
Jafari, A.; Greenwald, A.; Gondek, D.; and Ercal, G. 2001. On no-regret learning, fictitious play, and Nash equilibrium. In Proceedings of the Eighteenth International Conference on Machine Learning, 226-233.
-
(2001)
Proceedings of the Eighteenth International Conference on Machine Learning
, pp. 226-233
-
-
Jafari, A.1
Greenwald, A.2
Gondek, D.3
Ercal, G.4
-
19
-
-
85149834820
-
Markov games as a framework for multi-agent reinforcement learning
-
San Mateo, CA: Morgan Kaufmann
-
Littman, M. L. 1994. Markov games as a framework for multi-agent reinforcement learning. In Proc. of the 11th Int. Conf. on Machine Learning, 157-163. San Mateo, CA: Morgan Kaufmann.
-
(1994)
Proc. of the 11th Int. Conf. on Machine Learning
, pp. 157-163
-
-
Littman, M.L.1
-
21
-
-
0001730497
-
Non-cooperative games
-
Nash, J.F. 1951. Non-cooperative games. Annals of Mathematics 54:286-295.
-
(1951)
Annals of Mathematics
, vol.54
, pp. 286-295
-
-
Nash, J.F.1
-
22
-
-
0027336968
-
A strategy of win-stay, lose-shift that outperforms tit-for-tat in the prisoner's dilemma game
-
Nowak, M., and Sigmund, K. 1993. A strategy of win-stay, lose-shift that outperforms tit-for-tat in the prisoner's dilemma game. Nature 364:56-58.
-
(1993)
Nature
, vol.364
, pp. 56-58
-
-
Nowak, M.1
Sigmund, K.2
-
24
-
-
84949966897
-
On multiagent Q-learning in a semi-competitive domain
-
Weiß, G., and Sen, S., eds. Springer-Verlag
-
Sandholm, T., and Crites, R. 1996. On multiagent Q-learning in a semi-competitive domain. In Weiß, G., and Sen, S., eds. Adaptation and Learning in Multi-Agent Systems. Springer-Verlag. 191-205.
-
(1996)
Adaptation and Learning in Multi-Agent Systems
, pp. 191-205
-
-
Sandholm, T.1
Crites, R.2
-
25
-
-
0028555752
-
Learning to coordinate without sharing information
-
Menlo Park, CA: AAAI Press/MIT Press
-
Sen, S.; Sekaran, M.; and Hale, J. 1994. Learning to coordinate without sharing information. In National Conference on Artificial Intelligence, 426-431. Menlo Park, CA: AAAI Press/MIT Press.
-
(1994)
National Conference on Artificial Intelligence
, pp. 426-431
-
-
Sen, S.1
Sekaran, M.2
Hale, J.3