Volume 7, 2006, Pages 1789-1828

Collaborative multiagent reinforcement learning by payoff propagation

Author keywords

Collaborative multiagent system; Coordination graph; Q-learning; Belief propagation; Reinforcement learning

Indexed keywords

ALGORITHMS; APPROXIMATION THEORY; DECISION MAKING; GRAPH THEORY; MULTI AGENT SYSTEMS;

EID: 33748543203     PISSN: 15337928     EISSN: 15337928     Source Type: Journal    
DOI: None     Document Type: Article
Times cited: 316

References (57)
  • 8
    • 0000719863
    • Packet routing in dynamically changing networks: A reinforcement learning approach
    • Jack D. Cowan, Gerald Tesauro, and Joshua Alspector, editors. Morgan Kaufmann Publishers, Inc.
    • J. A. Boyan and M. L. Littman. Packet routing in dynamically changing networks: A reinforcement learning approach. In Jack D. Cowan, Gerald Tesauro, and Joshua Alspector, editors, Advances in Neural Information Processing Systems (NIPS) 6, pages 671-678. Morgan Kaufmann Publishers, Inc., 1994.
    • (1994) Advances in Neural Information Processing Systems (NIPS) , vol.6 , pp. 671-678
    • Boyan, J.A.1    Littman, M.L.2
  • 16
    • 0035395660
    • Scaling up agent coordination strategies
    • July
    • E. H. Durfee. Scaling up agent coordination strategies. IEEE Computer, 34(7):39-46, July 2001.
    • (2001) IEEE Computer , vol.34 , Issue.7 , pp. 39-46
    • Durfee, E.H.1
  • 17
    • 31144432283
    • Cooperative information sharing to improve distributed learning in multi-agent systems
    • P. S. Dutta, N. R. Jennings, and L. Moreau. Cooperative information sharing to improve distributed learning in multi-agent systems. Journal of Artificial Intelligence Research, 24:407-463, 2005.
    • (2005) Journal of Artificial Intelligence Research , vol.24 , pp. 407-463
    • Dutta, P.S.1    Jennings, N.R.2    Moreau, L.3
  • 18
    • 27344449757
    • Decentralized control of cooperative systems: Categorization and complexity analysis
    • November
    • C. Goldman and S. Zilberstein. Decentralized control of cooperative systems: Categorization and complexity analysis. Journal of Artificial Intelligence Research, 22:143-174, November 2004.
    • (2004) Journal of Artificial Intelligence Research , vol.22 , pp. 143-174
    • Goldman, C.1    Zilberstein, S.2
  • 26
    • 14344250637
    • Sparse cooperative Q-learning
    • Russ Greiner and Dale Schuurmans, editors, Banff, Canada, July. ACM
    • J. R. Kok and N. Vlassis. Sparse cooperative Q-learning. In Russ Greiner and Dale Schuurmans, editors, Proceedings of the International Conference on Machine Learning, pages 481-488, Banff, Canada, July 2004. ACM.
    • (2004) Proceedings of the International Conference on Machine Learning , pp. 481-488
    • Kok, J.R.1    Vlassis, N.2
  • 27
    • 33748562008
    • Using the max-plus algorithm for multiagent decision making in coordination graphs
    • Osaka, Japan, July
    • J. R. Kok and N. Vlassis. Using the max-plus algorithm for multiagent decision making in coordination graphs. In RoboCup-2005: Robot Soccer World Cup IX, Osaka, Japan, July 2005.
    • (2005) RoboCup-2005: Robot Soccer World Cup IX
    • Kok, J.R.1    Vlassis, N.2
  • 28
    • 12244304892
    • Non-communicative multi-robot coordination in dynamic environments
    • February
    • J. R. Kok, M. T. J. Spaan, and N. Vlassis. Non-communicative multi-robot coordination in dynamic environments. Robotics and Autonomous Systems, 50(2-3):99-114, February 2005.
    • (2005) Robotics and Autonomous Systems , vol.50 , Issue.2-3 , pp. 99-114
    • Kok, J.R.1    Spaan, M.T.J.2    Vlassis, N.3
  • 33
    • 33746360402
    • Distributed optimization in adaptive networks
    • S. Thrun, L. Saul, and B. Schölkopf, editors. MIT Press, Cambridge, MA
    • C. C. Moallemi and B. Van Roy. Distributed optimization in adaptive networks. In S. Thrun, L. Saul, and B. Schölkopf, editors, Advances in Neural Information Processing Systems (NIPS) 16. MIT Press, Cambridge, MA, 2004.
    • (2004) Advances in Neural Information Processing Systems (NIPS) , vol.16
    • Moallemi, C.C.1    Van Roy, B.2
  • 34
    • 10044277219
    • ADOPT: Asynchronous distributed constraint optimization with quality guarantees
    • P. Jay Modi, W-M. Shen, M. Tambe, and M. Yokoo. ADOPT: Asynchronous distributed constraint optimization with quality guarantees. Artificial Intelligence, 161(1-2):149-180, 2005.
    • (2005) Artificial Intelligence , vol.161 , Issue.1-2 , pp. 149-180
    • Modi, P.J.1    Shen, W.-M.2    Tambe, M.3    Yokoo, M.4
  • 37
    • 0036573011
    • Distributed algorithms for multi-robot observation of multiple moving targets
    • L. E. Parker. Distributed algorithms for multi-robot observation of multiple moving targets. Autonomous Robots, 12(3):231-255, 2002.
    • (2002) Autonomous Robots , vol.12 , Issue.3 , pp. 231-255
    • Parker, L.E.1
  • 41
    • 1142292938
    • The communicative multiagent team decision problem: Analyzing teamwork theories and models
    • D. V. Pynadath and M. Tambe. The communicative multiagent team decision problem: Analyzing teamwork theories and models. Journal of Artificial Intelligence Research, 16:389-423, 2002.
    • (2002) Journal of Artificial Intelligence Research , vol.16 , pp. 389-423
    • Pynadath, D.V.1    Tambe, M.2
  • 45
    • 27544506565
    • Reinforcement learning for RoboCup-soccer keepaway
    • P. Stone, R. S. Sutton, and G. Kuhlmann. Reinforcement learning for RoboCup-soccer keepaway. Adaptive Behavior, 13(3):165-188, 2005.
    • (2005) Adaptive Behavior , vol.13 , Issue.3 , pp. 165-188
    • Stone, P.1    Sutton, R.S.2    Kuhlmann, G.3
  • 47
    • 0032096675
    • Multiagent systems
    • K. Sycara. Multiagent systems. AI Magazine, 19(2):79-92, 1998.
    • (1998) AI Magazine , vol.19 , Issue.2 , pp. 79-92
    • Sycara, K.1
  • 48
    • 85152198941
    • Multi-agent reinforcement learning: Independent vs. cooperative agents
    • Amherst, MA
    • M. Tan. Multi-agent reinforcement learning: Independent vs. cooperative agents. In International Conference on Machine Learning (ICML), Amherst, MA, 1993.
    • (1993) International Conference on Machine Learning (ICML)
    • Tan, M.1
  • 49
    • 0029276036
    • Temporal difference learning and TD-Gammon
    • March
    • G. Tesauro. Temporal difference learning and TD-Gammon. Communications of the ACM, 38(3), March 1995.
    • (1995) Communications of the ACM , vol.38 , Issue.3
    • Tesauro, G.1
  • 52
    • 3943084089
    • Tree consistency and bounds on the performance of the max-product algorithm and its generalizations
    • April
    • M. J. Wainwright, T. S. Jaakkola, and A. S. Willsky. Tree consistency and bounds on the performance of the max-product algorithm and its generalizations. Statistics and Computing, 14:143-166, April 2004.
    • (2004) Statistics and Computing , vol.14 , pp. 143-166
    • Wainwright, M.J.1    Jaakkola, T.S.2    Willsky, A.S.3
  • 53
    • 34249833101
    • Technical note: Q-learning
    • C. Watkins and P. Dayan. Technical note: Q-learning. Machine Learning, 8(3-4):279-292, 1992.
    • (1992) Machine Learning , vol.8 , Issue.3-4 , pp. 279-292
    • Watkins, C.1    Dayan, P.2
  • 55
    • 0141695638
    • Understanding belief propagation and its generalizations
    • chapter 8. Morgan Kaufmann Publishers Inc., January
    • J. S. Yedidia, W. T. Freeman, and Y. Weiss. Understanding belief propagation and its generalizations. In Exploring Artificial Intelligence in the New Millennium, chapter 8, pages 239-269. Morgan Kaufmann Publishers Inc., January 2003.
    • (2003) Exploring Artificial Intelligence in the New Millennium , pp. 239-269
    • Yedidia, J.S.1    Freeman, W.T.2    Weiss, Y.3
  • 56
    • 1142305807
    • Distributed constraint optimization as a formal model of partially adversarial cooperation
    • University of Michigan, Ann Arbor, MI 48109
    • M. Yokoo and E. H. Durfee. Distributed constraint optimization as a formal model of partially adversarial cooperation. Technical Report CSE-TR-101-91, University of Michigan, Ann Arbor, MI 48109, 1991.
    • (1991) Technical Report , vol.CSE-TR-101-91
    • Yokoo, M.1    Durfee, E.H.2
  • 57
    • 0000049635
    • Exploiting causal independence in Bayesian network inference
    • N. Lianwen Zhang and D. Poole. Exploiting causal independence in Bayesian network inference. Journal of Artificial Intelligence Research, 5:301-328, 1996.
    • (1996) Journal of Artificial Intelligence Research , vol.5 , pp. 301-328
    • Zhang, N.L.1    Poole, D.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.