



Volume 27, Issue 1, 2012, Pages 1-31

Independent reinforcement learners in cooperative Markov games: A survey regarding coordination problems

Author keywords

[No Author keywords available]

Indexed keywords

COORDINATION PROBLEMS; DISTRIBUTED Q-LEARNING; FREQUENCY MAXIMA; HILL CLIMBING; INDEPENDENT AGENTS; MARKOV GAMES; MATRIX GAME; MULTI AGENT SYSTEM (MAS); MULTI STATE; MULTI-AGENT APPLICATIONS; NON-STATIONARITIES; PURSUIT DOMAIN; Q-LEARNING; Q-VALUES; STOCHASTICITY;

EID: 84857861863     PISSN: 02698889     EISSN: 14698005     Source Type: Journal    
DOI: 10.1017/S0269888912000057     Document Type: Review
Times cited : (500)

References (62)
  • 1
    • 70350699723
    • A multiagent reinforcement learning algorithm with non-linear dynamics
    • Abdallah, S. & Lesser, V. 2008. A multiagent reinforcement learning algorithm with non-linear dynamics. Journal of Artificial Intelligence Research 33, 521-549.
    • (2008) Journal of Artificial Intelligence Research , vol.33 , pp. 521-549
    • Abdallah, S.1    Lesser, V.2
  • 3
    • 58149280068
    • Multi-agent reinforcement learning in common interest and fixed sum stochastic games: An experimental study
    • Bab, A. & Brafman, R. I. 2008. Multi-agent reinforcement learning in common interest and fixed sum stochastic games: An experimental study. Journal of Machine Learning Research 9, 2635-2675.
    • (2008) Journal of Machine Learning Research , vol.9 , pp. 2635-2675
    • Bab, A.1    Brafman, R.I.2
  • 4
    • 0028745178
    • Communication in reactive multiagent robotic systems
    • Balch, T. & Arkin, R. C. 1994. Communication in reactive multiagent robotic systems. Autonomous Robots 1(1), 27-52.
    • (1994) Autonomous Robots , vol.1 , Issue.1 , pp. 27-52
    • Balch, T.1    Arkin, R.C.2
  • 8
    • 0002500351
    • Planning, learning and coordination in multiagent decision processes
    • Morgan Kaufmann Publishers Inc.
    • Boutilier, C. 1996. Planning, learning and coordination in multiagent decision processes. In Theoretical Aspects of Rationality and Knowledge, Morgan Kaufmann Publishers Inc., 195-201.
    • (1996) Theoretical Aspects of Rationality and Knowledge , pp. 195-201
    • Boutilier, C.1
  • 9
    • 84880690163
    • Sequential optimality and coordination in multiagent systems
    • Morgan Kaufmann Publishers Inc.
    • Boutilier, C. 1999. Sequential optimality and coordination in multiagent systems. In IJCAI, Morgan Kaufmann Publishers Inc., 478-485.
    • (1999) IJCAI , pp. 478-485
    • Boutilier, C.1
  • 10
    • 84899027977
    • Convergence and no-regret in multiagent learning
    • Saul, L. K., Weiss, Y. & Bottou, L. (eds). MIT Press
    • Bowling, M. 2005. Convergence and no-regret in multiagent learning. In Advances in Neural Information Processing Systems, Saul, L. K., Weiss, Y. & Bottou, L. (eds). MIT Press, 209-216.
    • (2005) Advances in Neural Information Processing Systems , pp. 209-216
    • Bowling, M.1
  • 12
    • 0036531878
    • Multiagent learning using a variable learning rate
    • Bowling, M. & Veloso, M. 2002. Multiagent learning using a variable learning rate. Artificial Intelligence 136, 215-250.
    • (2002) Artificial Intelligence , vol.136 , pp. 215-250
    • Bowling, M.1    Veloso, M.2
  • 16
    • 77649261098
    • Baselines for joint-action reinforcement learning of coordination in cooperative multi-agent systems
    • Springer, Lecture Notes in Computer Science
    • Carpenter, M. & Kudenko, D. 2005. Baselines for joint-action reinforcement learning of coordination in cooperative multi-agent systems. In Adaptive Agents and Multi-Agent Systems II: Adaptation and Multi-Agent Learning, Lecture Notes in Computer Science, 3394, 55-72. Springer.
    • (2005) Adaptive Agents and Multi-Agent Systems II: Adaptation and Multi-Agent Learning , vol.3394 , pp. 55-72
    • Carpenter, M.1    Kudenko, D.2
  • 17
    • 0031630561
    • The dynamics of reinforcement learning in cooperative multiagent systems
    • American Association for Artificial Intelligence
    • Claus, C. & Boutilier, C. 1998. The dynamics of reinforcement learning in cooperative multiagent systems. In Proceedings of the 15th National Conference on Artificial Intelligence, 746-752, American Association for Artificial Intelligence.
    • (1998) Proceedings of the 15th National Conference on Artificial Intelligence , pp. 746-752
    • Claus, C.1    Boutilier, C.2
  • 18
    • 33750270145
    • Building autonomic systems using collaborative reinforcement learning
    • Dowling, J., Cunningham, R., Curran, E. & Cahill, V. 2006. Building autonomic systems using collaborative reinforcement learning. Knowledge Engineering Review 21(3), 231-238.
    • (2006) Knowledge Engineering Review , vol.21 , Issue.3 , pp. 231-238
    • Dowling, J.1    Cunningham, R.2    Curran, E.3    Cahill, V.4
  • 20
    • 33751020264
    • Multi-agent case-based reasoning for cooperative reinforcement learners
    • Springer
    • Gabel, T. & Riedmiller, M. 2006. Multi-agent case-based reasoning for cooperative reinforcement learners. In Proceedings of the ECCBR, 32-46. Springer.
    • (2006) Proceedings of the ECCBR , pp. 32-46
    • Gabel, T.1    Riedmiller, M.2
  • 22
  • 24
    • 0036932299
    • Reinforcement learning of coordination in cooperative multi-agent systems
    • Dechter, R., Kearns, M. & Sutton, R. (eds.). Edmonton, Alberta, Canada
    • Kapetanakis, S. & Kudenko, D. 2002. Reinforcement learning of coordination in cooperative multi-agent systems. In Proceedings of the 9th NCAI, Dechter, R., Kearns, M. & Sutton, R. (eds.). Edmonton, Alberta, Canada.
    • (2002) Proceedings of the 9th NCAI
    • Kapetanakis, S.1    Kudenko, D.2
  • 29
    • 4544226982
    • Reinforcement learning for stochastic cooperative multi-agent systems
    • Lauer, M. & Riedmiller, M. 2004. Reinforcement learning for stochastic cooperative multi-agent systems. Autonomous Agents and Multi-Agent Systems 03, 1516-1517.
    • (2004) Autonomous Agents and Multi-Agent Systems , vol.3 , pp. 1516-1517
    • Lauer, M.1    Riedmiller, M.2
  • 31
    • 0001547175
    • Value-function reinforcement learning in Markov games
    • Littman, M. 2001. Value-function reinforcement learning in Markov games. Journal of Cognitive Systems Research 2, 55-66.
    • (2001) Journal of Cognitive Systems Research , vol.2 , pp. 55-66
    • Littman, M.1
  • 37
    • 77955654600
    • Designing decentralized controllers for distributed-air-jet MEMS-based micromanipulators by reinforcement learning
    • Matignon, L., Laurent, G. J., Le Fort-Piat, N. & Chapuis, Y. A. 2010. Designing decentralized controllers for distributed-air-jet MEMS-based micromanipulators by reinforcement learning. Journal of Intelligent and Robotic Systems 59(2), 145-166.
    • (2010) Journal of Intelligent and Robotic Systems , vol.59 , Issue.2 , pp. 145-166
    • Matignon, L.1    Laurent, G.J.2    Le Fort-Piat, N.3    Chapuis, Y.A.4
  • 38
    • 70349595296
    • Learning to cooperate in multi-agent systems by combining q-learning and evolutionary strategy
    • McGlohon, M. & Sen, S. 2005. Learning to cooperate in multi-agent systems by combining q-learning and evolutionary strategy. International Journal on Lateral Computing 1(2), 58-64.
    • (2005) International Journal on Lateral Computing , vol.1 , Issue.2 , pp. 58-64
    • McGlohon, M.1    Sen, S.2
  • 43
    • 41549123971
    • Theoretical advantages of lenient learners: An evolutionary game theoretic perspective
    • Panait, L., Tuyls, K. & Luke, S. 2008. Theoretical advantages of lenient learners: An evolutionary game theoretic perspective. Journal of Machine Learning Research 9, 423-457.
    • (2008) Journal of Machine Learning Research , vol.9 , pp. 423-457
    • Panait, L.1    Tuyls, K.2    Luke, S.3
  • 48
    • 0033901602
    • Convergence results for single-step on-policy reinforcement-learning algorithms
    • Singh, S. P., Jaakkola, T., Littman, M. L. & Szepesvari, C. 2000. Convergence results for single-step on-policy reinforcement-learning algorithms. Machine Learning 38(3), 287-308.
    • (2000) Machine Learning , vol.38 , Issue.3 , pp. 287-308
    • Singh, S.P.1    Jaakkola, T.2    Littman, M.L.3    Szepesvari, C.4
  • 49
    • 0034205975
    • Multiagent systems: A survey from a machine learning perspective
    • Stone, P. & Veloso, M. M. 2000. Multiagent systems: A survey from a machine learning perspective. Autonomous Robots 8(3), 345-383.
    • (2000) Autonomous Robots , vol.8 , Issue.3 , pp. 345-383
    • Stone, P.1    Veloso, M.M.2
  • 51
  • 54
    • 28544446213
    • Evolutionary game theory and multi-agent reinforcement learning
    • Tuyls, K. & Nowé, A. 2005. Evolutionary game theory and multi-agent reinforcement learning. Knowledge Engineering Review 20(1), 63-90.
    • (2005) Knowledge Engineering Review , vol.20 , Issue.1 , pp. 63-90
    • Tuyls, K.1    Nowé, A.2
  • 55
    • 34247642270
    • Exploring selfish reinforcement learning in repeated games with stochastic rewards
    • Verbeeck, K., Nowé, A., Parent, J. & Tuyls, K. 2007. Exploring selfish reinforcement learning in repeated games with stochastic rewards. Autonomous Agents and Multi-Agent Systems 14(3), 239-269.
    • (2007) Autonomous Agents and Multi-Agent Systems , vol.14 , Issue.3 , pp. 239-269
    • Verbeeck, K.1    Nowé, A.2    Parent, J.3    Tuyls, K.4
  • 56
    • 34250651573
    • Multi-robot box-pushing: Single-agent q-learning vs. team q-learning
    • Wang, Y. & de Silva, C. W. 2006. Multi-robot box-pushing: single-agent q-learning vs. team q-learning. In Proceedings of the IROS, 3694-3699.
    • (2006) Proceedings of the IROS , pp. 3694-3699
    • Wang, Y.1    De Silva, C.W.2
  • 58
    • 34249833101
    • Technical note: Q-learning
    • Watkins, C. & Dayan, P. 1992. Technical note: Q-learning. Machine Learning 8, 279-292.
    • (1992) Machine Learning , vol.8 , pp. 279-292
    • Watkins, C.1    Dayan, P.2
  • 60
    • 0001309161
    • Optimal payoff functions for members of collectives
    • Wolpert, D. H. & Tumer, K. 2001. Optimal payoff functions for members of collectives. Advances in Complex Systems 04(02), 265-279.
    • (2001) Advances in Complex Systems , vol.4 , Issue.2 , pp. 265-279
    • Wolpert, D.H.1    Tumer, K.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.