



Volume 5244 LNAI, 2008, Pages 13-24

Optimistic-pessimistic Q-learning algorithm for multi-agent systems

Author keywords

Algorithmic game theory; Multi agent reinforcement learning

Indexed keywords

EDUCATION; ELECTRIC SHIP EQUIPMENT; GAME THEORY; LEARNING SYSTEMS; MULTI AGENT SYSTEMS; REINFORCEMENT; REINFORCEMENT LEARNING; WELL TESTING

EID: 56549086866     PISSN: 03029743     EISSN: 16113349     Source Type: Book Series    
DOI: 10.1007/978-3-540-87805-6_3     Document Type: Conference Paper
Times cited: 3

References (10)
  • 1
    • 85149834820
    • Markov games as a framework for multi-agent reinforcement learning
    • Littman, M.L.: Markov games as a framework for multi-agent reinforcement learning. In: ICML, pp. 157-163 (1994)
    • (1994) ICML, pp. 157-163
    • Littman, M.L.
  • 2
    • 56549129765
    • Friend-or-foe Q-learning in general-sum games
    • Littman, M.L.: Friend-or-foe Q-learning in general-sum games. In: Brodley, C.E., Danyluk, A.P. (eds.) ICML, pp. 322-328. Morgan Kaufmann, San Francisco (2001)
  • 3
    • 0000929496
    • Multiagent reinforcement learning: Theoretical framework and an algorithm
    • Morgan Kaufmann, San Francisco
    • Hu, J., Wellman, M.P.: Multiagent reinforcement learning: Theoretical framework and an algorithm. In: Proc. 15th International Conf. on Machine Learning, pp. 242-250. Morgan Kaufmann, San Francisco (1998)
    • (1998) Proc. 15th International Conf. on Machine Learning, pp. 242-250
    • Hu, J.; Wellman, M.P.
  • 5
    • 0036531878
    • Multiagent learning using a variable learning rate
    • Bowling, M.H., Veloso, M.M.: Multiagent learning using a variable learning rate. Artificial Intelligence 136(2), 215-250 (2002)
    • (2002) Artificial Intelligence, vol. 136, issue 2, pp. 215-250
    • Bowling, M.H.; Veloso, M.M.
  • 6
    • 0003629453
    • Generalized Markov decision processes: Dynamic-programming and reinforcement-learning algorithms
    • Technical report, Providence, RI, USA
    • Szepesvári, C., Littman, M.L.: Generalized Markov decision processes: Dynamic-programming and reinforcement-learning algorithms. Technical report, Providence, RI, USA (1996)
    • (1996)
    • Szepesvári, C.; Littman, M.L.
  • 8
    • 84871550805
    • Hurwicz's optimality criterion for decision making under ignorance
    • Technical Report 6, Stanford University
    • Arrow, K.: Hurwicz's optimality criterion for decision making under ignorance. Technical Report 6, Stanford University (1953)
    • (1953)
    • Arrow, K.
  • 9
    • 4544335718
    • Run the gamut: A comprehensive approach to evaluating game-theoretic algorithms
    • IEEE Computer Society, Los Alamitos
    • Nudelman, E., Wortman, J., Shoham, Y., Leyton-Brown, K.: Run the gamut: A comprehensive approach to evaluating game-theoretic algorithms. In: AAMAS 2004, pp. 880-887. IEEE Computer Society, Los Alamitos (2004)
    • (2004) AAMAS 2004, pp. 880-887
    • Nudelman, E.; Wortman, J.; Shoham, Y.; Leyton-Brown, K.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.