Volume , Issue , 2003, Pages

Reinforcement learning to play an optimal nash equilibrium in team Markov games

Author keywords

[No Author keywords available]

Indexed keywords

ALGORITHMS; GAME THEORY; MULTI AGENT SYSTEMS; REINFORCEMENT LEARNING; TELECOMMUNICATION NETWORKS;

EID: 27744448185     PISSN: 10495258     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited : (99)

References (18)
  • 1
    • 0002500351
    • Planning, learning and coordination in multi-agent decision processes
    • C. Boutilier. Planning, learning and coordination in multi-agent decision processes. In TARK, 1996.
    • (1996) TARK
    • Boutilier, C.1
  • 2
    • 0010247544
    • The dynamics of reinforcement learning in cooperative multi-agent systems
    • C. Claus and C. Boutilier. The dynamics of reinforcement learning in cooperative multi-agent systems. In AAAI, 1998.
    • (1998) AAAI
    • Claus, C.1    Boutilier, C.2
  • 5
    • 0001473356
    • Learning to coordinate actions in multi-agent systems
    • G. Weiß. Learning to coordinate actions in multi-agent systems. In IJCAI, 1993.
    • (1993) IJCAI
    • Weiß, G.1
  • 6
    • 0000929496
    • Multiagent reinforcement learning: Theoretical framework and an algorithm
    • J. Hu and M.P. Wellman. Multiagent reinforcement learning: theoretical framework and an algorithm. In ICML, 1998.
    • (1998) ICML
    • Hu, J.1    Wellman, M.P.2
  • 7
    • 0002730095
    • Learning, mutation, and long run equilibria in games
    • M. Kandori, G.J. Mailath, and R. Rob. Learning, mutation, and long run equilibria in games. Econometrica, 61(1):29-56, 1993.
    • (1993) Econometrica , vol.61 , Issue.1 , pp. 29-56
    • Kandori, M.1    Mailath, G.J.2    Rob, R.3
  • 8
    • 0242466944
    • Friend-or-foe Q-learning in general sum game
    • M. Littman. Friend-or-Foe Q-learning in general sum game. In ICML, 2001.
    • (2001) ICML
    • Littman, M.1
  • 9
    • 0001547175
    • Value-function reinforcement learning in Markov games
    • M.L. Littman. Value-function reinforcement learning in Markov games. J. of Cognitive System Research, 2:55-66, 2000.
    • (2000) J. of Cognitive System Research , vol.2 , pp. 55-66
    • Littman, M.L.1
  • 11
    • 85152198941
    • Multiagent reinforcement learning: Independent vs. Cooperative agents
    • M. Tan. Multiagent reinforcement learning: independent vs. cooperative agents. In ICML, 1993.
    • (1993) ICML
    • Tan, M.1
  • 13
    • 0001181267
    • Spieltheoretische Behandlung eines Oligopolmodells mit Nachfrageträgheit
    • R. Selten. Spieltheoretische Behandlung eines Oligopolmodells mit Nachfrageträgheit. Zeitschrift für die gesamte Staatswissenschaft, 12:301-324, 1965.
    • (1965) Zeitschrift für die Gesamte Staatswissenschaft , vol.12 , pp. 301-324
    • Selten, R.1
  • 14
    • 0033901602
    • Convergence results for single-step on-policy reinforcement learning algorithms
    • S. Singh, T. Jaakkola, M.L. Littman, and C. Szepesvari. Convergence results for single-step on-policy reinforcement learning algorithms. Machine Learning, 2000.
    • (2000) Machine Learning
    • Singh, S.1    Jaakkola, T.2    Littman, M.L.3    Szepesvari, C.4
  • 15
    • 0002626229
    • Learning to coordinate without sharing information
    • S. Sen, M. Sekaran, and J. Hale. Learning to coordinate without sharing information. In AAAI, 1994.
    • (1994) AAAI
    • Sen, S.1    Sekaran, M.2    Hale, J.3
  • 17
    • 0030050933
    • Learning in the iterated prisoner's dilemma
    • T. Sandholm and R. Crites. Learning in the iterated prisoner's dilemma. Biosystems, 37:147-166, 1995.
    • (1995) Biosystems , vol.37 , pp. 147-166
    • Sandholm, T.1    Crites, R.2
  • 18
    • 0001944917
    • The evolution of conventions
    • H. Young. The evolution of conventions. Econometrica, 61(1):57-84, 1993.
    • (1993) Econometrica , vol.61 , Issue.1 , pp. 57-84
    • Young, H.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.