SCOPUS Information Retrieval Platform

Journal of Artificial Intelligence Research

Volume 22, 2004, Pages 353-384

Existence of multiagent equilibria with limited agents

Authors (2): Bowling, Michael (a); Veloso, Manuela (b)

a UNIVERSITY OF ALBERTA (Canada)

b Carnegie Mellon University (United States)

Author keywords

[No Author keywords available]

Indexed keywords

ARTIFICIAL INTELLIGENCE; GAME THEORY; INTELLIGENT AGENTS; LEARNING ALGORITHMS; LEARNING SYSTEMS; SOFTWARE AGENTS;

AGENT BEHAVIOR; MULTIAGENT EQUILIBRIA; MULTIAGENT LEARNING ALGORITHM;

MULTI AGENT SYSTEMS;

EID: 27344450680 PISSN: 10769757 EISSN: 10769757 Source Type: Journal
DOI: 10.1613/jair.1332 Document Type: Article

Times cited : (28)

References (52)

1
- 0003272616
- Reinforcement learning in POMDP's via direct gradient ascent
- Stanford University. Morgan Kaufmann
- Baxter, J., & Bartlett, P. L. (2000). Reinforcement learning in POMDP's via direct gradient ascent. In Proceedings of the Seventeenth International Conference on Machine Learning, pp. 41-48, Stanford University. Morgan Kaufmann.
- (2000) Proceedings of the Seventeenth International Conference on Machine Learning , pp. 41-48
- Baxter, J.¹ Bartlett, P.L.²

2
- 84880690842
- Bounding the suboptimality of reusing subproblems
- Stockholm, Sweden. Morgan Kaufmann
- Bowling, M., & Veloso, M. (1999). Bounding the suboptimality of reusing subproblems. In Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence, pp. 1340-1345, Stockholm, Sweden. Morgan Kaufmann.
- (1999) Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence , pp. 1340-1345
- Bowling, M.¹ Veloso, M.²

3
- 27344440129
- An earlier version appeared in the
- An earlier version appeared in the Proceedings of the NIPS Workshop on Abstraction in Reinforcement Learning, 1998.
- (1998) Proceedings of the NIPS Workshop on Abstraction in Reinforcement Learning

4
- 0036531878
- Multiagent learning using a variable learning rate
- Bowling, M., & Veloso, M. (2002). Multiagent learning using a variable learning rate. Artificial Intelligence, 136, 215-250.
- (2002) Artificial Intelligence , vol.136 , pp. 215-250
- Bowling, M.¹ Veloso, M.²

5
- 26444574161
- Ph.D. thesis, Computer Science Department, Technion
- Carmel, D. (1997). Model-based Learning of Interaction Strategies in Multi-agent Systems. Ph.D. thesis, Computer Science Department, Technion.
- (1997) Model-based Learning of Interaction Strategies in Multi-agent Systems
- Carmel, D.¹

6
- 0030365402
- Learning models of intelligent agents
- Menlo Park, CA. AAAI Press
- Carmel, D., & Markovitch, S. (1996). Learning models of intelligent agents. In Proceedings of the Thirteenth National Conference on Artificial Intelligence, Menlo Park, CA. AAAI Press.
- (1996) Proceedings of the Thirteenth National Conference on Artificial Intelligence
- Carmel, D.¹ Markovitch, S.²

7
- 0031630561
- The dynamics of reinforcement learning in cooperative multiagent systems
- Menlo Park, CA. AAAI Press
- Claus, C., & Boutilier, C. (1998). The dynamics of reinforcement learning in cooperative multiagent systems. In Proceedings of the Fifteenth National Conference on Artificial Intelligence, pp. 746-752, Menlo Park, CA. AAAI Press.
- (1998) Proceedings of the Fifteenth National Conference on Artificial Intelligence , pp. 746-752
- Claus, C.¹ Boutilier, C.²

8
- 0003989209
- Springer Verlag, New York
- Filar, J., & Vrieze, K. (1997). Competitive Markov Decision Processes. Springer Verlag, New York.
- (1997) Competitive Markov Decision Processes
- Filar, J.¹ Vrieze, K.²

9
- 84972535636
- Equilibrium in a stochastic n-person game
- Fink, A. M. (1964). Equilibrium in a stochastic n-person game. Journal of Science in Hiroshima University, Series A-I, 28, 89-93.
- (1964) Journal of Science in Hiroshima University, Series A-I , vol.28 , pp. 89-93
- Fink, A.M.¹

10
- 18944395036
- Brooks/Cole Publishing Company, Pacific Grove, CA
- Gaughan, E. D. (1993). Introduction to Analysis, 4th Edition. Brooks/Cole Publishing Company, Pacific Grove, CA.
- (1993) Introduction to Analysis, 4th Edition
- Gaughan, E.D.¹

11
- 38249006045
- Bounded versus unbounded rationality: The tyranny of the weak
- Gilboa, I., & Samet, D. (1989). Bounded versus unbounded rationality: The tyranny of the weak. Games and Economic Behavior, 213-221.
- (1989) Games and Economic Behavior , pp. 213-221
- Gilboa, I.¹ Samet, D.²

12
- 0003860985
- Princeton University Press
- Gintis, H. (2000). Game Theory Evolving. Princeton University Press.
- (2000) Game Theory Evolving
- Gintis, H.¹

13
- 4143084696
- Correlated Q-learning
- Greenwald, A., & Hall, K. (2002). Correlated Q-learning. In Proceedings of the AAAI Spring Symposium Workshop on Collaborative Learning Agents.
- (2002) Proceedings of the AAAI Spring Symposium Workshop on Collaborative Learning Agents
- Greenwald, A.¹ Hall, K.²

14
- 0006419533
- Hierarchical solution of Markov decision processes using macro-actions
- Hauskrecht, M., Meuleau, N., Kaelbling, L. P., Dean, T., & Boutilier, C. (1998). Hierarchical solution of Markov decision processes using macro-actions. In Proceedings of the Fourteenth Annual Conference on Uncertainty in Artificial Intelligence (UAI-98).
- (1998) Proceedings of the Fourteenth Annual Conference on Uncertainty in Artificial Intelligence (UAI-98)
- Hauskrecht, M.¹ Meuleau, N.² Kaelbling, L.P.³ Dean, T.⁴ Boutilier, C.⁵

15
- 0000929496
- Multiagent reinforcement learning: Theoretical framework and an algorithm
- San Francisco. Morgan Kaufmann
- Hu, J., & Wellman, M. P. (1998). Multiagent reinforcement learning: Theoretical framework and an algorithm. In Proceedings of the Fifteenth International Conference on Machine Learning, pp. 242-250, San Francisco. Morgan Kaufmann.
- (1998) Proceedings of the Fifteenth International Conference on Machine Learning , pp. 242-250
- Hu, J.¹ Wellman, M.P.²

16
- 85153938292
- Reinforcement learning algorithm for partially observable Markov decision problems
- MIT Press
- Jaakkola, T., Singh, S. P., & Jordan, M. I. (1994). Reinforcement learning algorithm for partially observable Markov decision problems. In Advances in Neural Information Processing Systems 6. MIT Press.
- (1994) Advances in Neural Information Processing Systems 6
- Jaakkola, T.¹ Singh, S.P.² Jordan, M.I.³

17
- 0004089723
- Ph.D. thesis, Computer Science Department, Carnegie Mellon University
- Jansen, P. J. (1992). Using Knowledge About the Opponent in Game-tree Search. Ph.D. thesis, Computer Science Department, Carnegie Mellon University.
- (1992) Using Knowledge about the Opponent in Game-tree Search
- Jansen, P.J.¹

18
- 0000619048
- Extensive games and the problem of information
- Kuhn, H. W., & Tucker, A. W. (Eds.), Princeton University Press. Reprinted in (Kuhn, 1997)
- Kuhn, H. W. (1953). Extensive games and the problem of information. In Kuhn, H. W., & Tucker, A. W. (Eds.), Contributions to the Theory of Games II, pp. 193-216. Princeton University Press. Reprinted in (Kuhn, 1997).
- (1953) Contributions to the Theory of Games II , pp. 193-216
- Kuhn, H.W.¹

19
- 0004132273
- Princeton University Press
- Kuhn, H. W. (Ed.). (1997). Classics in Game Theory. Princeton University Press.
- (1997) Classics in Game Theory
- Kuhn, H.W.¹

20
- 0035501436
- Bargaining with limited computation: Deliberation equilibrium
- Larson, K., & Sandholm, T. (2001). Bargaining with limited computation: Deliberation equilibrium. Artificial Intelligence, 132(2), 183-217.
- (2001) Artificial Intelligence , vol.132 , Issue.2 , pp. 183-217
- Larson, K.¹ Sandholm, T.²

21
- 0242466944
- Friend-or-foe Q-learning in general-sum games
- Morgan Kaufmann
- Littman, M. (2001). Friend-or-foe Q-learning in general-sum games. In Proceedings of the Eighteenth International Conference on Machine Learning, pp. 322-328. Morgan Kaufmann.
- (2001) Proceedings of the Eighteenth International Conference on Machine Learning , pp. 322-328
- Littman, M.¹

22
- 22344437403
- Leading best-response strategies in repeated games
- Littman, M., & Stone, P. (2001). Leading best-response strategies in repeated games. In Seventeenth Annual International Joint Conference on Artificial Intelligence Workshop on Economic Agents, Models, and Mechanisms.
- (2001) Seventeenth Annual International Joint Conference on Artificial Intelligence Workshop on Economic Agents, Models, and Mechanisms
- Littman, M.¹ Stone, P.²

23
- 85149834820
- Markov games as a framework for multi-agent reinforcement learning
- Morgan Kaufmann
- Littman, M. L. (1994). Markov games as a framework for multi-agent reinforcement learning. In Proceedings of the Eleventh International Conference on Machine Learning, pp. 157-163. Morgan Kaufmann.
- (1994) Proceedings of the Eleventh International Conference on Machine Learning , pp. 157-163
- Littman, M.L.¹

24
- 0001961616
- A generalized reinforcement-learning model: Convergence and applications
- Bari, Italy. Morgan Kaufmann
- Littman, M. L., & Szepesvári, C. (1996). A generalized reinforcement-learning model: Convergence and applications. In Proceedings of the 13th International Conference on Machine Learning, pp. 310-318, Bari, Italy. Morgan Kaufmann.
- (1996) Proceedings of the 13th International Conference on Machine Learning , pp. 310-318
- Littman, M.L.¹ Szepesvári, C.²

25
- 0029752592
- Average reward reinforcement learning: Foundations, algorithms, and empirical results
- Mahadevan, S. (1996). Average reward reinforcement learning: Foundations, algorithms, and empirical results. Machine Learning, 22, 159-196.
- (1996) Machine Learning , vol.22 , pp. 159-196
- Mahadevan, S.¹

26
- 84957895797
- Reward functions for accelerated learning
- San Francisco. Morgan Kaufmann
- Mataric, M. J. (1994). Reward functions for accelerated learning. In Proceedings of the Eleventh International Conference on Machine Learning, San Francisco. Morgan Kaufmann.
- (1994) Proceedings of the Eleventh International Conference on Machine Learning
- Mataric, M.J.¹

27
- 0013465187
- Automatic discovery of subgoals in reinforcement learning using diverse density
- Morgan Kaufmann
- McGovern, A., & Barto, A. G. (2001). Automatic discovery of subgoals in reinforcement learning using diverse density. In Proceedings of the Eighteenth International Conference on Machine Learning, pp. 361-368. Morgan Kaufmann.
- (2001) Proceedings of the Eighteenth International Conference on Machine Learning , pp. 361-368
- McGovern, A.¹ Barto, A.G.²

28
- 0002310119
- Stochastic games
- Mertens, J. F., & Neyman, A. (1981). Stochastic games. International Journal of Game Theory, 10, 53-56.
- (1981) International Journal of Game Theory , vol.10 , pp. 53-56
- Mertens, J.F.¹ Neyman, A.²

29
- 0027684215
- Prioritized sweeping: Reinforcement learning with less data and less time
- Moore, A. W., & Atkeson, C. G. (1993). Prioritized sweeping: Reinforcement learning with less data and less time. Machine Learning, 13, 103-130.
- (1993) Machine Learning , vol.13 , pp. 103-130
- Moore, A.W.¹ Atkeson, C.G.²

30
- 0002021736
- Equilibrium points in n-person games
- Reprinted in (Kuhn, 1997)
- Nash, Jr., J. F. (1950). Equilibrium points in n-person games. PNAS, 36, 48-49. Reprinted in (Kuhn, 1997).
- (1950) PNAS , vol.36 , pp. 48-49
- Nash Jr., J.F.¹

31
- 84898967780
- Policy search via density estimation
- MIT Press
- Ng, A. Y., Parr, R., & Koller, D. (1999). Policy search via density estimation. In Advances in Neural Information Processing Systems 12, pp. 1022-1028. MIT Press.
- (1999) Advances in Neural Information Processing Systems , vol.12 , pp. 1022-1028
- Ng, A.Y.¹ Parr, R.² Koller, D.³

32
- 0032021222
- Soccer server: A tool for research on multi-agent systems
- Noda, I., Matsubara, H., Hiraki, K., & Frank, I. (1998). Soccer server: a tool for research on multi-agent systems. Applied Artificial Intelligence, 12, 233-250.
- (1998) Applied Artificial Intelligence , vol.12 , pp. 233-250
- Noda, I.¹ Matsubara, H.² Hiraki, K.³ Frank, I.⁴

33
- 0003427725
- The MIT Press
- Osborne, M. J., & Rubinstein, A. (1994). A Course in Game Theory. The MIT Press.
- (1994) A Course in Game Theory
- Osborne, M.J.¹ Rubinstein, A.²

34
- 0345108843
- Games with procedurally rational players
- McMaster University, Department of Economics
- Osborne, M. J., & Rubinstein, A. (1997). Games with procedurally rational players. Working papers 9702, McMaster University, Department of Economics.
- (1997) Working Papers , vol.9702
- Osborne, M.J.¹ Rubinstein, A.²

35
- 16244421869
- Springer-Verlag
- Riley, P., & Veloso, M. (2000). On Behavior Classification in Adversarial Environments, pp. 371-380. Springer-Verlag.
- (2000) On Behavior Classification in Adversarial Environments , pp. 371-380
- Riley, P.¹ Veloso, M.²

36
- 84943319533
- Planning for distributed execution through use of probabilistic opponent models
- Toulouse, France
- Riley, P., & Veloso, M. (2002). Planning for distributed execution through use of probabilistic opponent models. In Proceedings of the Sixth International Conference on AI Planning and Scheduling, Toulouse, France.
- (2002) Proceedings of the Sixth International Conference on AI Planning and Scheduling
- Riley, P.¹ Veloso, M.²

37
- 0001402950
- An iterative method of solving a game
- Reprinted in (Kuhn, 1997)
- Robinson, J. (1951). An iterative method of solving a game. Annals of Mathematics, 54, 296-301. Reprinted in (Kuhn, 1997).
- (1951) Annals of Mathematics , vol.54 , pp. 296-301
- Robinson, J.¹

38
- 0018922522
- Existence and uniqueness of equilibrium points for concave n-person games
- Rosen, J. B. (1965). Existence and uniqueness of equilibrium points for concave n-person games. Econometrica, 33, 520-534.
- (1965) Econometrica , vol.33 , pp. 520-534
- Rosen, J.B.¹

39
- 0003687484
- The MIT Press
- Rubinstein, A. (1998). Modeling Bounded Rationality. The MIT Press.
- (1998) Modeling Bounded Rationality
- Rubinstein, A.¹

40
- 0028555752
- Learning to coordinate without sharing information
- Sen, S., Sekaran, M., & Hale, J. (1994). Learning to coordinate without sharing information. In Proceedings of the 13th National Conference on Artificial Intelligence.
- (1994) Proceedings of the 13th National Conference on Artificial Intelligence
- Sen, S.¹ Sekaran, M.² Hale, J.³

41
- 0000392613
- Stochastic games
- Reprinted in (Kuhn, 1997)
- Shapley, L. S. (1953). Stochastic games. PNAS, 39, 1095-1100. Reprinted in (Kuhn, 1997).
- (1953) PNAS , vol.39 , pp. 1095-1100
- Shapley, L.S.¹

42
- 0002298346
- From substantive to procedural rationality
- Latsis, S. J. (Ed.), Cambridge University Press, New York
- Simon, H. A. (1976). From substantive to procedural rationality. In Latsis, S. J. (Ed.), Method and Appraisal in Economics, pp. 129-148. Cambridge University Press, New York.
- (1976) Method and Appraisal in Economics , pp. 129-148
- Simon, H.A.¹

43
- 0004077471
- MIT Press, Cambridge, MA
- Simon, H. A. (1982). Models of Bounded Rationality. MIT Press, Cambridge, MA.
- (1982) Models of Bounded Rationality
- Simon, H.A.¹

44
- 0033901602
- Convergence results for single-step on-policy reinforcement-learning algorithms
- Singh, S., Jaakkola, T., Littman, M. L., & Szepesvári, C. (2000). Convergence results for single-step on-policy reinforcement-learning algorithms. Machine Learning.
- (2000) Machine Learning
- Singh, S.¹ Jaakkola, T.² Littman, M.L.³ Szepesvári, C.⁴

45
- 84898939480
- Policy gradient methods for reinforcement learning with function approximation
- MIT Press
- Sutton, R. S., McAllester, D., Singh, S., & Mansour, Y. (2000). Policy gradient methods for reinforcement learning with function approximation. In Advances in Neural Information Processing Systems 12, pp. 1057-1063. MIT Press.
- (2000) Advances in Neural Information Processing Systems , vol.12 , pp. 1057-1063
- Sutton, R.S.¹ McAllester, D.² Singh, S.³ Mansour, Y.⁴

46
- 0002260073
- Intra-option learning about temporally abstract actions
- San Francisco. Morgan Kaufmann
- Sutton, R. S., Precup, D., & Singh, S. (1998). Intra-option learning about temporally abstract actions. In Proceedings of the Fifteenth International Conference on Machine Learning, pp. 556-564, San Francisco. Morgan Kaufmann.
- (1998) Proceedings of the Fifteenth International Conference on Machine Learning , pp. 556-564
- Sutton, R.S.¹ Precup, D.² Singh, S.³

47
- 85152198941
- Multi-agent reinforcement learning: Independent vs. cooperative agents
- Amherst, MA
- Tan, M. (1993). Multi-agent reinforcement learning: Independent vs. cooperative agents. In Proceedings of the Tenth International Conference on Machine Learning, pp. 330-337, Amherst, MA.
- (1993) Proceedings of the Tenth International Conference on Machine Learning , pp. 330-337
- Tan, M.¹

48
- 0004196515
- Adversarial reinforcement learning
- Carnegie Mellon University. Unpublished
- Uther, W., & Veloso, M. (1997). Adversarial reinforcement learning. Tech. rep., Carnegie Mellon University. Unpublished.
- (1997) Tech. Rep.
- Uther, W.¹ Veloso, M.²

49
- 23144434851
- Ph.D. thesis, Computer Science Department, Carnegie Mellon University, Pittsburgh, PA. Available as technical report CMU-CS-02-169
- Uther, W. T. B. (2002). Tree Based Hierarchical Reinforcement Learning. Ph.D. thesis, Computer Science Department, Carnegie Mellon University, Pittsburgh, PA. Available as technical report CMU-CS-02-169.
- (2002) Tree Based Hierarchical Reinforcement Learning
- Uther, W.T.B.¹

50
- 0003892527
- No. 33 in CWI Tracts. Centrum voor Wiskunde en Informatica
- Vrieze, O. J. (1987). Stochastic Games with Finite State and Action Spaces. No. 33 in CWI Tracts. Centrum voor Wiskunde en Informatica.
- (1987) Stochastic Games with Finite State and Action Spaces
- Vrieze, O.J.¹

51
- 0004049893
- Ph.D. thesis, King's College, Cambridge, UK
- Watkins, C. J. C. H. (1989). Learning from Delayed Rewards. Ph.D. thesis, King's College, Cambridge, UK.
- (1989) Learning from Delayed Rewards
- Watkins, C.J.C.H.¹

52
- 0012252296
- Tight performance bounds on greedy policies based on imperfect value functions
- College of Computer Science, Northeastern University
- Williams, R. J., & Baird, L. C. (1993). Tight performance bounds on greedy policies based on imperfect value functions. Technical report, College of Computer Science, Northeastern University.
- (1993) Technical Report
- Williams, R.J.¹ Baird, L.C.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.