Volume 15, Issue 1, 2007, Pages 91-108

Reaching Pareto-optimality in prisoner's dilemma using conditional joint action learning

Author keywords

Game theory; Multiagent learning; Prisoner's dilemma

EID: 34249676853     PISSN: 1387-2532     EISSN: 1573-7454     Source Type: Journal
DOI: 10.1007/s10458-007-0020-8     Document Type: Article
Times cited: 55

References (31)
  • 1
    • Bowling, M. H., & Veloso, M. M. (2002). Multiagent learning using a variable learning rate. Artificial Intelligence, 136(2), 215-250.
  • 3
    • Brams, S. J. (1994). Theory of moves. Cambridge, UK: Cambridge University Press.
  • 6
    • Conitzer, V., & Sandholm, T. (2003). AWESOME: A general multiagent learning algorithm that converges in self-play and learns a best response against stationary opponents. In ICML (pp. 83-90).
  • 8
    • de Farias, D. P., & Megiddo, N. (2003). How to combine expert (and novice) advice when actions impact the environment? In NIPS.
  • 10
    • Greenwald, A. R., & Hall, K. (2003). Correlated Q-learning. In ICML (pp. 242-249).
  • 11
    • Greenwald, A. R., & Jafari, A. (2003). A general class of no-regret learning algorithms and game-theoretic equilibria. In COLT (pp. 2-12).
  • 12
    • Hu, J., & Wellman, M. P. (2003). Nash Q-learning for general-sum stochastic games. Journal of Machine Learning Research, 4, 1039-1069.
  • 13
    • Kalai, A., & Vempala, S. (2002). Geometric algorithms for online optimization. Technical Report MIT-LCS-TR-861, MIT Laboratory for Computer Science.
  • 16
    • Littman, M. L. (1994). Markov games as a framework for multi-agent reinforcement learning. In Proceedings of the eleventh international conference on machine learning (pp. 157-163). San Mateo, CA: Morgan Kaufmann.
  • 19
    • Littman, M. L., & Stone, P. (2005). A polynomial-time Nash equilibrium algorithm for repeated games. Decision Support Systems, 39, 55-66.
  • 21
    • Panait, L., & Luke, S. (2005). Cooperative multi-agent learning: The state of the art. Autonomous Agents and Multi-Agent Systems, 11(3), 387-434.
  • 22
    • Sandholm, T. W., & Crites, R. H. (1995). Multiagent reinforcement learning and iterated prisoner's dilemma. BioSystems, 37, 147-166.
  • 23
    • Sekaran, M., & Sen, S. (1994). Learning with friends and foes. In Sixteenth annual conference of the cognitive science society (pp. 800-805). Hillsdale, NJ: Lawrence Erlbaum Associates.
  • 25
    • Mas-Colell, A., & Hart, S. (2001). A general class of adaptive strategies. Journal of Economic Theory, 98(1), 26-54.
  • 26
    • Singh, S. P., Kearns, M. J., & Mansour, Y. (2000). Nash convergence of gradient dynamics in general-sum games. In UAI (pp. 541-548).
  • 28
    • Tuyls, K., & Nowé, A. (2006). Evolutionary game theory and multi-agent reinforcement learning. The Knowledge Engineering Review, 20(1), 63-90.
  • 29
    • Verbeeck, K., Nowé, A., Lenaerts, T., & Parent, J. (2002). Learning to reach the Pareto optimal Nash equilibrium as a team. In LNAI 2557: Proceedings of the fifteenth Australian joint conference on artificial intelligence (pp. 407-418). Springer-Verlag.
  • 30
    • Vidal, J. M., & Durfee, E. H. (2003). Predicting the expected behavior of agents that learn about agents: The CLRI framework. Autonomous Agents and Multi-Agent Systems, 6(1), 77-107.


* This information was extracted by KISTI from Elsevier's SCOPUS database.