-
1
-
-
33748599852
-
-
T. Abbott, D. Kane, P. Valiant, On the complexity of two-player win-lose games, in: Symposium on Foundations of Computer Science, 2005
-
-
-
-
3
-
-
9444299000
-
-
B. Banerjee, J. Peng, Performance bounded reinforcement learning in strategic interactions, in: National Conf. on Artificial Intelligence, 2004
-
-
-
-
4
-
-
84880876200
-
-
B. Banerjee, S. Sen, J. Peng, Fast concurrent reinforcement learners, in: Internat. Joint Conf. on Artificial Intelligence, 2001
-
-
-
-
5
-
-
33750693387
-
-
M. Benisch, G. Davis, T. Sandholm, Algorithms for rationalizability and CURB sets, in: National Conf. on Artificial Intelligence, 2006
-
-
-
-
6
-
-
33748692398
-
-
A. Blum, E. Even-Dar, K. Ligett, Routing without regret: On convergence to Nash equilibria of regret-minimizing algorithms in routing games, in: ACM Symposium on Principles of Distributed Computing, 2006
-
-
-
-
7
-
-
84899027977
-
-
M. Bowling, Convergence and no-regret in multiagent learning, in: Conf. on Neural Information Processing Systems, 2005
-
-
-
-
8
-
-
0036531878
-
Multiagent learning using a variable learning rate
-
Bowling M., and Veloso M. Multiagent learning using a variable learning rate. Artificial Intelligence 136 (2002) 215-250
-
(2002)
Artificial Intelligence
, vol.136
, pp. 215-250
-
-
Bowling, M.1
Veloso, M.2
-
9
-
-
0034247018
-
A near-optimal polynomial time algorithm for learning in certain classes of stochastic games
-
Brafman R., and Tennenholtz M. A near-optimal polynomial time algorithm for learning in certain classes of stochastic games. Artificial Intelligence 121 (2000) 31-47
-
(2000)
Artificial Intelligence
, vol.121
, pp. 31-47
-
-
Brafman, R.1
Tennenholtz, M.2
-
11
-
-
0041965975
-
R-max - a general polynomial time algorithm for near-optimal reinforcement learning
-
Brafman R., and Tennenholtz M. R-max - a general polynomial time algorithm for near-optimal reinforcement learning. Journal of Machine Learning Research 3 (2003) 213-231
-
(2003)
Journal of Machine Learning Research
, vol.3
, pp. 213-231
-
-
Brafman, R.1
Tennenholtz, M.2
-
12
-
-
4544271516
-
Efficient learning equilibrium
-
Earlier version in NIPS-02
-
Brafman R., and Tennenholtz M. Efficient learning equilibrium. Artificial Intelligence 159 (2004) 27-47 Earlier version in NIPS-02
-
(2004)
Artificial Intelligence
, vol.159
, pp. 27-47
-
-
Brafman, R.1
Tennenholtz, M.2
-
13
-
-
29344456477
-
-
R. Brafman, M. Tennenholtz, Optimal efficient learning equilibrium: Imperfect monitoring in symmetric games, in: National Conf. on Artificial Intelligence, 2005
-
-
-
-
14
-
-
4544279432
-
-
Y.-H. Chang, T. Ho, L. Kaelbling, Mobilized ad-hoc networks: A reinforcement learning approach, in: Internat. Conf. on Autonomic Computing, 2004
-
-
-
-
15
-
-
34248999852
-
-
X. Chen, X. Deng, Settling the complexity of 2-player Nash equilibrium, in: Electronic Colloquium on Computational Complexity, Report No. 150, 2005
-
-
-
-
16
-
-
0031630561
-
-
C. Claus, C. Boutilier, The dynamics of reinforcement learning in cooperative multiagent systems, in: National Conf. on Artificial Intelligence, 1998
-
-
-
-
17
-
-
1942452777
-
-
V. Conitzer, T. Sandholm, BL-WoLF: A framework for loss-bounded learnability in zero-sum games, in: Internat. Conf. on Machine Learning, 2003
-
-
-
-
18
-
-
84880852207
-
-
V. Conitzer, T. Sandholm, Complexity results about Nash equilibria, in: Internat. Joint Conf. on Artificial Intelligence, 2003
-
-
-
-
19
-
-
14344252185
-
-
V. Conitzer, T. Sandholm, Communication complexity as a lower bound for learning in games, in: Internat. Conf. on Machine Learning, 2004
-
-
-
-
20
-
-
30044441719
-
-
V. Conitzer, T. Sandholm, Complexity of (iterated) dominance, in: ACM Conf. on Electronic Commerce, 2005
-
-
-
-
21
-
-
29344453190
-
-
V. Conitzer, T. Sandholm, A generalized strategy eliminability criterion and computational methods for applying it, in: National Conf. on Artificial Intelligence, 2005
-
-
-
-
22
-
-
34147159616
-
AWESOME: A general multiagent learning algorithm that converges in self-play and learns a best response against stationary opponents
-
Special issue on Learning and Computational Game Theory Short version in ICML-03
-
Conitzer V., and Sandholm T. AWESOME: A general multiagent learning algorithm that converges in self-play and learns a best response against stationary opponents. Special issue on Learning and Computational Game Theory. Machine Learning 67 (2007) 23-43 Short version in ICML-03
-
(2007)
Machine Learning
, vol.67
, pp. 23-43
-
-
Conitzer, V.1
Sandholm, T.2
-
23
-
-
33748712836
-
-
V. Conitzer, T. Sandholm, Computing the optimal strategy to commit to, in: ACM Conf. on Electronic Commerce, 2006
-
-
-
-
24
-
-
20744454447
-
-
A. Flaxman, A. Kalai, B. McMahan, Online convex optimization in the bandit setting: Gradient descent without a gradient, in: ACM-SIAM Symposium on Discrete Algorithms, 2005
-
-
-
-
26
-
-
0002267135
-
Adaptive game playing using multiplicative weights
-
Freund Y., and Schapire R. Adaptive game playing using multiplicative weights. Games and Economic Behavior 29 (1999) 79-103
-
(1999)
Games and Economic Behavior
, vol.29
, pp. 79-103
-
-
Freund, Y.1
Schapire, R.2
-
30
-
-
45249127547
-
Nash and correlated equilibria: Some complexity considerations
-
Gilboa I., and Zemel E. Nash and correlated equilibria: Some complexity considerations. Games and Economic Behavior 1 (1989) 80-93
-
(1989)
Games and Economic Behavior
, vol.1
, pp. 80-93
-
-
Gilboa, I.1
Zemel, E.2
-
31
-
-
34249084095
-
-
A. Gilpin, S. Hoda, J. Peña, T. Sandholm, Gradient-based algorithms for finding Nash equilibria in extensive form games, Mimeo, 2007
-
-
-
-
32
-
-
33748800293
-
-
A. Gilpin, T. Sandholm, Finding equilibria in large sequential games of imperfect information, in: ACM Conf. on Electronic Commerce, 2006
-
-
-
-
33
-
-
1942517280
-
-
A. Greenwald, K. Hall, Correlated Q-learning, in: Internat. Conf. on Machine Learning, 2003
-
-
-
-
34
-
-
34249025537
-
-
A. Greenwald, A. Jafari, A general class of no-regret learning algorithms and game-theoretic equilibria, in: Conf. on Learning Theory, 2003
-
-
-
-
35
-
-
34248994001
-
-
S. Hart, Y. Mansour, The communication complexity of uncoupled Nash equilibrium procedures, 2006, Draft
-
-
-
-
36
-
-
0000908510
-
A simple adaptive procedure leading to correlated equilibrium
-
Hart S., and Mas-Colell A. A simple adaptive procedure leading to correlated equilibrium. Econometrica 68 (2000) 1127-1150
-
(2000)
Econometrica
, vol.68
, pp. 1127-1150
-
-
Hart, S.1
Mas-Colell, A.2
-
37
-
-
2942744741
-
Uncoupled dynamics do not lead to Nash equilibrium
-
Hart S., and Mas-Colell A. Uncoupled dynamics do not lead to Nash equilibrium. American Economic Review 93 (2003) 1830-1836
-
(2003)
American Economic Review
, vol.93
, pp. 1830-1836
-
-
Hart, S.1
Mas-Colell, A.2
-
39
-
-
34249001758
-
-
A. Jafari, A. Greenwald, D. Gondek, G. Ercal, On no-regret learning, fictitious play, and Nash equilibrium, in: Internat. Conf. on Machine Learning, 2001
-
-
-
-
41
-
-
0000221289
-
Rational learning leads to Nash equilibrium
-
Kalai E., and Lehrer E. Rational learning leads to Nash equilibrium. Econometrica 61 5 (1993) 1019-1045
-
(1993)
Econometrica
, vol.61
, Issue.5
, pp. 1019-1045
-
-
Kalai, E.1
Lehrer, E.2
-
42
-
-
34249099432
-
-
M. Kearns, M. Littman, S. Singh, Graphical models for game theory, in: Conf. on Uncertainty in Artificial Intelligence, 2001
-
-
-
-
46
-
-
34249059413
-
-
M. Littman, Markov games as a framework for multi-agent reinforcement learning, in: Internat. Conf. on Machine Learning, 1994
-
-
-
-
47
-
-
34248995976
-
-
M. Littman, Friend or foe Q-learning in general-sum Markov games, in: Internat. Conf. on Machine Learning, 2001
-
-
-
-
48
-
-
0001547175
-
Value-function reinforcement learning in Markov games
-
Littman M. Value-function reinforcement learning in Markov games. Journal of Cognitive Systems Research 2 (2001) 55-66
-
(2001)
Journal of Cognitive Systems Research
, vol.2
, pp. 55-66
-
-
Littman, M.1
-
49
-
-
34249080445
-
-
M. Littman, C. Szepesvári, A generalized reinforcement-learning model: Convergence and applications, in: Internat. Conf. on Machine Learning, 1996
-
-
-
-
50
-
-
0038386340
-
The empirical Bayes envelope and regret minimization in competitive Markov decision processes
-
Mannor S., and Shimkin N. The empirical Bayes envelope and regret minimization in competitive Markov decision processes. Mathematics of Operations Research 28 2 (2003) 327-345
-
(2003)
Mathematics of Operations Research
, vol.28
, Issue.2
, pp. 327-345
-
-
Mannor, S.1
Shimkin, N.2
-
51
-
-
32844468744
-
-
P. McCracken, M. Bowling, Safe strategies for agent modelling in games, in: AAAI Fall Symposium on Artificial Multi-agent Learning, 2004
-
-
-
-
52
-
-
34249106519
-
-
B. McMahan, A. Blum, Online geometric optimization in the bandit setting against an adaptive adversary, in: Conf. on Learning Theory, 2004
-
-
-
-
53
-
-
0000927072
-
Prediction, optimization, and learning in games
-
Nachbar J. Prediction, optimization, and learning in games. Econometrica 65 (1997) 275-309
-
(1997)
Econometrica
, vol.65
, pp. 275-309
-
-
Nachbar, J.1
-
54
-
-
23044525979
-
Bayesian learning in repeated games of incomplete information
-
Nachbar J. Bayesian learning in repeated games of incomplete information. Social Choice and Welfare 18 (2001) 303-326
-
(2001)
Social Choice and Welfare
, vol.18
, pp. 303-326
-
-
Nachbar, J.1
-
56
-
-
20744453823
-
-
C. Papadimitriou, T. Roughgarden, Computing equilibria in multi-player games, in: Symposium on Discrete Algorithms, 2005
-
-
-
-
57
-
-
0036923099
-
-
K. Pivazyan, Y. Shoham, Polynomial-time reinforcement learning of near-optimal policies, in: National Conf. on Artificial Intelligence, 2002
-
-
-
-
58
-
-
9444249830
-
-
R. Porter, E. Nudelman, Y. Shoham, Simple search methods for finding a Nash equilibrium, in: National Conf. on Artificial Intelligence, 2004
-
-
-
-
59
-
-
33745609272
-
-
R. Powers, Y. Shoham, Learning against opponents with bounded memory, in: Internat. Joint Conf. on Artificial Intelligence, 2005
-
-
-
-
60
-
-
84898936075
-
-
R. Powers, Y. Shoham, New criteria and a new algorithm for learning in multi-agent systems, in: Conf. on Neural Information Processing Systems, 2005
-
-
-
-
61
-
-
0030050933
-
Multiagent reinforcement learning in the iterated prisoner's dilemma
-
special issue on the Prisoner's Dilemma. Early version in IJCAI-95 Workshop on Adaptation and Learning in Multiagent Systems
-
Sandholm T., and Crites R. Multiagent reinforcement learning in the iterated prisoner's dilemma. Biosystems 37 (1996) 147-166 special issue on the Prisoner's Dilemma. Early version in IJCAI-95 Workshop on Adaptation and Learning in Multiagent Systems
-
(1996)
Biosystems
, vol.37
, pp. 147-166
-
-
Sandholm, T.1
Crites, R.2
-
62
-
-
29344453416
-
-
T. Sandholm, A. Gilpin, V. Conitzer, Mixed-integer programming methods for finding Nash equilibria, in: National Conf. on Artificial Intelligence, 2005
-
-
-
-
63
-
-
34249018059
-
-
T. Sandholm, M.V. Nagendra Prasad, Learning pursuit strategies, Project for CmpSci 698 Machine Learning, Computer Science Department, University of Massachusetts at Amherst, Spring, 1993
-
-
-
-
64
-
-
84880710441
-
-
J. Schaeffer, Y. Björnsson, N. Burch, A. Kishimoto, M. Müller, R. Lake, P. Lu, S. Sutphen, Solving checkers, in: Internat. Joint Conf. on Artificial Intelligence, 2005
-
-
-
-
65
-
-
34249095225
-
-
S. Singh, M. Kearns, Y. Mansour, Nash convergence of gradient dynamics in general-sum games, in: Conf. on Uncertainty in Artificial Intelligence, 2000
-
-
-
-
66
-
-
85131710903
-
-
F. Southey, M. Bowling, B. Larson, C. Piccione, N. Burch, D. Billings, C. Rayner, Bayes' bluff: Opponent modelling in poker, in: Conf. on Uncertainty in Artificial Intelligence, 2005
-
-
-
-
68
-
-
34249080936
-
-
M. Tan, Multi-agent reinforcement learning: Independent vs. cooperative agents, in: Internat. Conf. on Machine Learning, 1993
-
-
-
-
69
-
-
0029276036
-
Temporal difference learning and TD-gammon
-
Tesauro G. Temporal difference learning and TD-gammon. Communications of the ACM 38 3 (1995)
-
(1995)
Communications of the ACM
, vol.38
, Issue.3
-
-
Tesauro, G.1
-
71
-
-
0036930301
-
-
D. Vickrey, D. Koller, Multi-agent algorithms for solving graphical games, in: National Conf. on Artificial Intelligence, 2002
-
-
-
-
72
-
-
34249049304
-
-
X. Wang, T. Sandholm, Reinforcement learning to play an optimal Nash equilibrium in team Markov games, in: Conf. on Neural Information Processing Systems, 2002
-
-
-
-
73
-
-
34249014146
-
-
X. Wang, T. Sandholm, Learning near-Pareto-optimal conventions in polynomial time, in: Conf. on Neural Information Processing Systems, 2003
-
-
-
-
74
-
-
1942484421
-
-
M. Zinkevich, Online convex programming and generalized infinitesimal gradient ascent, in: Internat. Conf. on Machine Learning, 2003
-
-
-