Volume , Issue , 2002, Pages 1571-1578

Reinforcement Learning to Play an Optimal Nash Equilibrium in Team Markov Games

Author keywords

[No Author keywords available]

Indexed keywords

COMPUTATION THEORY; GAME THEORY; MULTI AGENT SYSTEMS;

EID: 67649405225     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited : (149)

References (18)
  • 1
    • 0002500351
    • Planning, learning and coordination in multi-agent decision processes
    • C. Boutilier. Planning, learning and coordination in multi-agent decision processes. In TARK, 1996.
    • (1996) TARK
    • Boutilier, C.1
  • 2
    • 0010247544
    • The dynamics of reinforcement learning in cooperative multi-agent systems
    • C. Claus and C. Boutilier. The dynamics of reinforcement learning in cooperative multi-agent systems. In AAAI, 1998.
    • (1998) AAAI
    • Claus, C.1    Boutilier, C.2
  • 5
    • 0001473356
    • Learning to coordinate actions in multi-agent systems
    • G. Weiß. Learning to coordinate actions in multi-agent systems. In IJCAI, 1993.
    • (1993) IJCAI
    • Weiß, G.1
  • 6
    • 0000929496
    • Multiagent reinforcement learning: theoretical framework and an algorithm
    • J. Hu and M.P. Wellman. Multiagent reinforcement learning: theoretical framework and an algorithm. In ICML, 1998.
    • (1998) ICML
    • Hu, J.1    Wellman, M.P.2
  • 7
    • 0002730095
    • Learning, mutation, and long run equilibria in games
    • M. Kandori, G.J. Mailath, and R. Rob. Learning, mutation, and long run equilibria in games. Econometrica, 61(1):29-56, 1993.
    • (1993) Econometrica , vol.61 , Issue.1 , pp. 29-56
    • Kandori, M.1    Mailath, G.J.2    Rob, R.3
  • 8
    • 0242466944
    • Friend-or-Foe Q-learning in general-sum games
    • M. Littman. Friend-or-Foe Q-learning in general-sum games. In ICML, 2001.
    • (2001) ICML
    • Littman, M.1
  • 9
    • 0001547175
    • Value-function reinforcement learning in Markov games
    • M.L. Littman. Value-function reinforcement learning in Markov games. J. of Cognitive System Research, 2:55-66, 2000.
    • (2000) J. of Cognitive System Research , vol.2 , pp. 55-66
    • Littman, M.L.1
  • 11
    • 85152198941
    • Multi-agent reinforcement learning: independent vs. cooperative agents
    • M. Tan. Multi-agent reinforcement learning: independent vs. cooperative agents. In ICML, 1993.
    • (1993) ICML
    • Tan, M.1
  • 13
    • 0001181267
    • Spieltheoretische Behandlung eines Oligopolmodells mit Nachfrageträgheit
    • R. Selten. Spieltheoretische Behandlung eines Oligopolmodells mit Nachfrageträgheit. Zeitschrift für die gesamte Staatswissenschaft, 12:301-324, 1965.
    • (1965) Zeitschrift für die gesamte Staatswissenschaft , vol.12 , pp. 301-324
    • Selten, R.1
  • 14
    • 0033901602
    • Convergence results for single-step on-policy reinforcement learning algorithms
    • S. Singh, T. Jaakkola, M.L. Littman, and C. Szepesvari. Convergence results for single-step on-policy reinforcement learning algorithms. Machine Learning, 2000.
    • (2000) Machine Learning
    • Singh, S.1    Jaakkola, T.2    Littman, M.L.3    Szepesvari, C.4
  • 15
    • 0002626229
    • Learning to coordinate without sharing information
    • S. Sen, M. Sekaran, and J. Hale. Learning to coordinate without sharing information. In AAAI, 1994.
    • (1994) AAAI
    • Sen, S.1    Sekaran, M.2    Hale, J.3
  • 17
    • 0030050933
    • Learning in the iterated prisoner's dilemma
    • T. Sandholm and R. Crites. Learning in the iterated prisoner's dilemma. Biosystems, 37:147-166, 1995.
    • (1995) Biosystems , vol.37 , pp. 147-166
    • Sandholm, T.1    Crites, R.2
  • 18
    • 0001944917
    • The evolution of conventions
    • H. Young. The evolution of conventions. Econometrica, 61(1):57-84, 1993.
    • (1993) Econometrica , vol.61 , Issue.1 , pp. 57-84
    • Young, H.1


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.