Volume , Issue , 1994, Pages 157-163

Markov games as a framework for multi-agent reinforcement learning

Author keywords

[No Author keywords available]

Indexed keywords

GAME THEORY; LEARNING ALGORITHMS; MARKOV PROCESSES; MULTI AGENT SYSTEMS

EID: 85149834820     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1016/B978-1-55860-335-6.50027-1     Document Type: Conference Paper
Times cited: 2248

References (18)
  • 1
    • 0002201501
    • [Barto et al, 1989] Technical Report 89-95, Department of Computer and Information Science, University of Massachusetts, Amherst, Massachusetts. Also published in Learning and Computational Neuroscience: Foundations of Adaptive Networks, Michael Gabriel and John Moore, editors. The MIT Press, Cambridge, Massachusetts, 1991
    • [Barto et al, 1989] Barto, A. G.; Sutton, R. S.; and Watkins, C. J. C. H. 1989. Learning and sequential decision making. Technical Report 89-95, Department of Computer and Information Science, University of Massachusetts, Amherst, Massachusetts. Also published in Learning and Computational Neuroscience: Foundations of Adaptive Networks, Michael Gabriel and John Moore, editors. The MIT Press, Cambridge, Massachusetts, 1991.
    • (1989) Learning and sequential decision making
    • Barto, A. G.1    Sutton, R. S.2    Watkins, C. J. C. H.3
  • 3
    • 0011530731
    • [Boyan, 1992] Master's thesis, Department of Engineering and Computer Laboratory, University of Cambridge, Cambridge, England
    • [Boyan, 1992] Boyan, Justin A. 1992. Modular neural networks for learning context-dependent game strategies. Master's thesis, Department of Engineering and Computer Laboratory, University of Cambridge, Cambridge, England.
    • (1992) Modular neural networks for learning context-dependent game strategies
    • Boyan, Justin A.1
  • 7
    • 0001201756
    • Some studies in machine learning using the game of checkers
    • [Samuel, 1959] Reprinted in E. A. Feigenbaum and J. Feldman, editors, Computers and Thought, McGraw-Hill, New York 1963
    • [Samuel, 1959] Samuel, A. L. 1959. Some studies in machine learning using the game of checkers. IBM Journal of Research and Development 3:211-229. Reprinted in E. A. Feigenbaum and J. Feldman, editors, Computers and Thought, McGraw-Hill, New York 1963.
    • (1959) IBM Journal of Research and Development , vol.3 , pp. 211-229
    • Samuel, A. L.1
  • 8
    • 0000433333
    • Using the td(lambda) algorithm to learn an evaluation function for the game of go
    • [Schraudolph et al, 1994] San Mateo, CA. Morgan Kaufmann. To appear
    • [Schraudolph et al, 1994] Schraudolph, Nicol N.; Dayan, Peter; and Sejnowski, Terrence J. 1994. Using the td(lambda) algorithm to learn an evaluation function for the game of go. In Advances in Neural Information Processing Systems 6, San Mateo, CA. Morgan Kaufmann. To appear.
    • (1994) Advances in Neural Information Processing Systems , vol.6
    • Schraudolph, Nicol N.1    Dayan, Peter2    Sejnowski, Terrence J.3
  • 9
    • 85152626183
    • A reinforcement learning method for maximizing undiscounted rewards
    • [Schwartz, 1993] Amherst, Massachusetts. Morgan Kaufmann
    • [Schwartz, 1993] Schwartz, Anton 1993. A reinforcement learning method for maximizing undiscounted rewards. In Proceedings of the Tenth International Conference on Machine Learning, Amherst, Massachusetts. Morgan Kaufmann. 298-305.
    • (1993) Proceedings of the Tenth International Conference on Machine Learning , pp. 298-305
    • Schwartz, Anton1
  • 12
    • 85152198941
    • Multi-agent reinforcement learning: independent vs. cooperative agents
    • [Tan, 1993] Amherst, Massachusetts. Morgan Kaufmann
    • [Tan, 1993] Tan, M. 1993. Multi-agent reinforcement learning: independent vs. cooperative agents. In Proceedings of the Tenth International Conference on Machine Learning, Amherst, Massachusetts. Morgan Kaufmann.
    • (1993) Proceedings of the Tenth International Conference on Machine Learning
    • Tan, M.1
  • 13
    • 2542485629
    • Practical issues in temporal difference
    • [Tesauro, 1992] Moody, J. E.; Lippman, D. S.; and Hanson, S. J., editors 1992, San Mateo, CA. Morgan Kaufmann
    • [Tesauro, 1992] Tesauro, G. J. 1992. Practical issues in temporal difference. In Moody, J. E.; Lippman, D. S.; and Hanson, S. J., editors 1992, Advances in Neural Information Processing Systems 4, San Mateo, CA. Morgan Kaufmann. 259-266.
    • (1992) Advances in Neural Information Processing Systems , vol.4 , pp. 259-266
    • Tesauro, G. J.1
  • 14
    • 0141824325
    • Stochastic dynamic programming
    • [Van Der Wal, 1981] Morgan Kaufmann, Amsterdam
    • [Van Der Wal, 1981] Van Der Wal, J. 1981. Stochastic dynamic programming. In Mathematical Centre Tracts 139. Morgan Kaufmann, Amsterdam.
    • (1981) Mathematical Centre Tracts , vol.139
    • Van Der Wal, J.1
  • 15
    • 84884079276
    • [von Neumann and Morgenstern, 1947] Princeton University Press, Princeton, New Jersey
    • [von Neumann and Morgenstern, 1947] von Neumann, J. and Morgenstern, O. 1947. Theory of Games and Economic Behavior. Princeton University Press, Princeton, New Jersey.
    • (1947) Theory of Games and Economic Behavior
    • von Neumann, J.1    Morgenstern, O.2
  • 16
    • 34249833101
    • Q-learning
    • [Watkins and Dayan, 1992]
    • [Watkins and Dayan, 1992] Watkins, C. J. C. H. and Dayan, P. 1992. Q-learning. Machine Learning 8(3):279-292.
    • (1992) Machine Learning , vol.8 , Issue.3 , pp. 279-292
    • Watkins, C. J. C. H.1    Dayan, P.2
  • 18
    • 0001875923
    • An adaptive communication protocol for cooperating mobile robots
    • [Yanco and Stein, 1993] Meyer, Jean-Arcady; Roitblat, H. L.; and Wilson, Stewart W., editors 1993, MIT Press/Bradford Books. 478-485
    • [Yanco and Stein, 1993] Yanco, Holly and Stein, Lynn Andrea 1993. An adaptive communication protocol for cooperating mobile robots. In Meyer, Jean-Arcady; Roitblat, H. L.; and Wilson, Stewart W., editors 1993, From Animals to Animats: Proceedings of the Second International Conference on the Simulation of Adaptive Behavior. MIT Press/Bradford Books. 478-485.
    • (1993) From Animals to Animats: Proceedings of the Second International Conference on the Simulation of Adaptive Behavior
    • Yanco, Holly1    Stein, Lynn Andrea2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.