Journal of Machine Learning Research, Volume 4, Issue 6, 2004, Pages 1039-1069

Nash Q-learning for general-sum stochastic games

Author keywords

Multiagent Learning; Q learning; Reinforcement Learning

Indexed keywords

FUNCTIONS; GAME THEORY; MARKOV PROCESSES; MULTI AGENT SYSTEMS; PROBABILITY; ROBOTICS

EID: 4644369748     PISSN: 15324435     EISSN: None     Source Type: Journal    
DOI: 10.1162/1532443041827880     Document Type: Article
Times cited: 1072
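
For context, the core update this paper analyzes extends single-agent Q-learning by replacing the max operator with a Nash equilibrium of the stage game defined by the agents' current Q-values. A minimal sketch of agent i's update follows, with notation assumed here ($\alpha_t$ learning rate, $\gamma$ discount factor, $r^i_t$ agent i's reward, $s'$ the successor state):

$$
Q^i_{t+1}(s, a^1, \dots, a^n) = (1 - \alpha_t)\, Q^i_t(s, a^1, \dots, a^n) + \alpha_t \left[ r^i_t + \gamma\, \mathrm{NashQ}^i_t(s') \right]
$$

where $\mathrm{NashQ}^i_t(s') = \pi^1(s') \cdots \pi^n(s') \cdot Q^i_t(s')$ is agent i's expected payoff under a Nash equilibrium $(\pi^1(s'), \dots, \pi^n(s'))$ of the stage game given by the current Q-values $(Q^1_t(s'), \dots, Q^n_t(s'))$.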

References (54)
• 5. Michael Bowling. Convergence problems of general-sum multiagent reinforcement learning. In Seventeenth International Conference on Machine Learning, pages 89-94, Stanford, 2000.
• 6. Michael Bowling and Manuela Veloso. Multiagent learning using a variable learning rate. Artificial Intelligence, 136:215-250, 2002.
• 7. Ronen I. Brafman and Moshe Tennenholtz. A near-optimal polynomial time algorithm for learning in certain classes of stochastic games. Artificial Intelligence, 121(1-2):31-47, 2000.
• 10. Caroline Claus and Craig Boutilier. The dynamics of reinforcement learning in cooperative multiagent systems. In Fifteenth National Conference on Artificial Intelligence, pages 746-752, Madison, WI, 1998.
• 20. Junling Hu and Michael P. Wellman. Multiagent reinforcement learning: Theoretical framework and an algorithm. In Fifteenth International Conference on Machine Learning, pages 242-250, Madison, WI, 1998.
• 21. Junling Hu and Michael P. Wellman. Experimental results on Q-learning for general-sum stochastic games. In Seventeenth International Conference on Machine Learning, pages 407-414, Stanford, 2000.
• 27. Michael L. Littman. Markov games as a framework for multi-agent reinforcement learning. In Eleventh International Conference on Machine Learning, pages 157-163, New Brunswick, 1994.
• 29. Michael L. Littman. Value-function reinforcement learning in Markov games. Cognitive Systems Research, 2:55-66, 2001.
• 32. Nicolas Meuleau and Paul Bourgine. Exploration of multi-state environments: Local measures and back-propagation of uncertainty. Machine Learning, 35(2):117-154, 1999.
• 34. John F. Nash. Non-cooperative games. Annals of Mathematics, 54:286-295, 1951.
• 37. Jing Peng and Ronald Williams. Incremental multi-step Q-learning. Machine Learning, 22:283-290, 1996.
• 40. Yoav Shoham, Rob Powers, and Trond Grenager. Multi-agent reinforcement learning: A critical survey. Technical report, Stanford University, 2003.
• 41. Satinder Singh and Dimitri Bertsekas. Reinforcement learning for dynamic channel allocation in cellular telephone systems. In Advances in Neural Information Processing Systems, volume 9, pages 974-980. MIT Press, 1996.
• 42. Satinder Singh, Tommi Jaakkola, Michael L. Littman, and Csaba Szepesvári. Convergence results for single-step on-policy reinforcement learning algorithms. Machine Learning, 38(3):287-308, 2000.
• 44. Peter Stone and Richard S. Sutton. Scaling reinforcement learning toward RoboCup soccer. In Proc. 18th International Conf. on Machine Learning, pages 537-544. Morgan Kaufmann, San Francisco, CA, 2001.
• 47. Csaba Szepesvári and Michael L. Littman. A unified analysis of value-function-based reinforcement-learning algorithms. Neural Computation, 11(8):2017-2059, 1999.
• 48. Ming Tan. Multi-agent reinforcement learning: Independent vs. cooperative agents. In Tenth International Conference on Machine Learning, pages 330-337, Amherst, MA, 1993.
• 50. Marilyn A. Walker. An application of reinforcement learning to dialogue strategy selection in a spoken dialogue system for email. Journal of Artificial Intelligence Research, 12:387-416, 2000.
• 53. Michael P. Wellman and Junling Hu. Conjectural equilibrium in multiagent learning. Machine Learning, 33:179-200, 1998.


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.