SCOPUS Information Search Platform - View Paper

Advances in Neural Information Processing Systems

Volume , Issue , 2003, Pages

Reinforcement learning to play an optimal Nash equilibrium in team Markov games

(2) Wang, Xiaofeng a Sandholm, Tuomas b

a CARNEGIE MELLON UNIVERSITY (United States)

b CARNEGIE MELLON UNIVERSITY (United States)

Author keywords

[No Author keywords available]

Indexed keywords

ALGORITHMS; GAME THEORY; MULTI AGENT SYSTEMS; REINFORCEMENT LEARNING; TELECOMMUNICATION NETWORKS;

ADAPTIVE LEARNING; CONVERGENCE CONDITIONS; LEARNING TO PLAY; MARKOV GAMES; MULTI-AGENT LEARNING; NASH EQUILIBRIA; OPTIMAL COORDINATION;

OPTIMIZATION;

EID: 27744448185 PISSN: 10495258 EISSN: None Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (99)

References (18)

1
- 0002500351
- Planning, learning and coordination in multi-agent decision processes
- C. Boutilier. Planning, learning and coordination in multi-agent decision processes. In TARK, 1996.
- (1996) TARK
- Boutilier, C.¹

2
- 0010247544
- The dynamics of reinforcement learning in cooperative multi-agent systems
- C. Claus and C. Boutilier. The dynamics of reinforcement learning in cooperative multi-agent systems. In AAAI, 1998.
- (1998) AAAI
- Claus, C.¹ Boutilier, C.²

3
- 0004247096
- MIT Press
- D. Fudenberg and D.K. Levine. The theory of learning in games. MIT Press, 1998.
- (1998) The Theory of Learning in Games
- Fudenberg, D.¹ Levine, D.K.²

4
- 0003499462
- John Wiley and Sons, Inc
- D.L. Isaacson and R.W. Madsen. Markov chains: theory and applications. John Wiley and Sons, Inc, 1976.
- (1976) Markov Chains: Theory and Applications
- Isaacson, D.L.¹ Madsen, R.W.²

5
- 0001473356
- Learning to coordinate actions in multi-agent systems
- G. Weiß. Learning to coordinate actions in multi-agent systems. In IJCAI, 1993.
- (1993) IJCAI
- Weiß, G.¹

6
- 0000929496
- Multiagent reinforcement learning: Theoretical framework and an algorithm
- J. Hu and M.P. Wellman. Multiagent reinforcement learning: theoretical framework and an algorithm. In ICML, 1998.
- (1998) ICML
- Hu, J.¹ Wellman, M.P.²

7
- 0002730095
- Learning, mutation, and long run equilibria in games
- M. Kandori, G.J. Mailath, and R. Rob. Learning, mutation, and long run equilibria in games. Econometrica, 61(1):29-56, 1993.
- (1993) Econometrica , vol.61 , Issue.1 , pp. 29-56
- Kandori, M.¹ Mailath, G.J.² Rob, R.³

8
- 0242466944
- Friend-or-foe Q-learning in general-sum games
- M. Littman. Friend-or-foe Q-learning in general-sum games. In ICML, 2001.
- (2001) ICML
- Littman, M.¹

9
- 0001547175
- Value-function reinforcement learning in Markov games
- M.L. Littman. Value-function reinforcement learning in Markov games. J. of Cognitive System Research, 2:55-66, 2000.
- (2000) J. of Cognitive System Research , vol.2 , pp. 55-66
- Littman, M.L.¹

10
- 85102627959
- John Wiley
- M.L. Puterman. Markov decision processes: discrete stochastic dynamic programming. John Wiley, 1994.
- (1994) Markov Decision Processes: Discrete Stochastic Dynamic Programming
- Puterman, M.L.¹

11
- 85152198941
- Multiagent reinforcement learning: Independent vs. Cooperative agents
- M. Tan. Multiagent reinforcement learning: independent vs. cooperative agents. In ICML, 1993.
- (1993) ICML
- Tan, M.¹

12
- 0003644124
- MIT Press
- R.A. Howard. Dynamic programming and Markov processes. MIT Press, 1960.
- (1960) Dynamic Programming and Markov Processes
- Howard, R.A.¹

13
- 0001181267
- Spieltheoretische behandlung eines oligopolmodells mit nachfrageträgheit
- R. Selten. Spieltheoretische behandlung eines oligopolmodells mit nachfrageträgheit. Zeitschrift für die gesamte Staatswissenschaft, 12:301-324, 1965.
- (1965) Zeitschrift für die Gesamte Staatswissenschaft , vol.12 , pp. 301-324
- Selten, R.¹

14
- 0033901602
- Convergence results for single-step on-policy reinforcement learning algorithms
- S. Singh, T. Jaakkola, M.L. Littman, and C. Szepesvari. Convergence results for single-step on-policy reinforcement learning algorithms. Machine Learning, 2000.
- (2000) Machine Learning
- Singh, S.¹ Jaakkola, T.² Littman, M.L.³ Szepesvari, C.⁴

15
- 0002626229
- Learning to coordinate without sharing information
- S. Sen, M. Sekaran, and J. Hale. Learning to coordinate without sharing information. In AAAI, 1994.
- (1994) AAAI
- Sen, S.¹ Sekaran, M.² Hale, J.³

16
- 84876153304
- Optimality and equilibrium in stochastic games
- F. Thuijsman. Optimality and equilibrium in stochastic games. Centrum voor Wiskunde en Informatica, 1992.
- (1992) Centrum voor Wiskunde en Informatica
- Thuijsman, F.¹

17
- 0030050933
- Learning in the iterated prisoner's dilemma
- T. Sandholm and R. Crites. Learning in the iterated prisoner's dilemma. Biosystems, 37:147-166, 1995.
- (1995) Biosystems , vol.37 , pp. 147-166
- Sandholm, T.¹ Crites, R.²

18
- 0001944917
- The evolution of conventions
- H. Young. The evolution of conventions. Econometrica, 61(1):57-84, 1993.
- (1993) Econometrica , vol.61 , Issue.1 , pp. 57-84
- Young, H.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.