SCOPUS Information Retrieval Platform

Proceedings of the 11th International Conference on Machine Learning, ICML 1994

Volume , Issue , 1994, Pages 157-163

Markov games as a framework for multi-agent reinforcement learning

(1) Littman, Michael L a

a BROWN UNIVERSITY (United States)

Author keywords

[No Author keywords available]

Indexed keywords

GAME THEORY; LEARNING ALGORITHMS; MARKOV PROCESSES; MULTI AGENT SYSTEMS;

ADAPTIVE AGENTS; MARKOV DECISION PROCESSES; MARKOV GAMES; MULTI-AGENT REINFORCEMENT LEARNING; OPTIMAL POLICIES; PROBABILISTICS; PROCESS FORMALIZATIONS; REINFORCEMENT LEARNINGS; TRANSITION FUNCTIONS; TWO AGENTS;

REINFORCEMENT LEARNING;

EID: 85149834820 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1016/B978-1-55860-335-6.50027-1 Document Type: Conference Paper

Times cited : (2248)

References (18)

1
- 0002201501
- [Barto et al, 1989] Technical Report 89-95, Department of Computer and Information Science, University of Massachusetts, Amherst, Massachusetts. Also published in Learning and Computational Neuroscience: Foundations of Adaptive Networks, Michael Gabriel and John Moore, editors. The MIT Press, Cambridge, Massachusetts, 1991
- [Barto et al, 1989] Barto, A. G.; Sutton, R. S.; and Watkins, C. J. C. H. 1989. Learning and sequential decision making. Technical Report 89-95, Department of Computer and Information Science, University of Massachusetts, Amherst, Massachusetts. Also published in Learning and Computational Neuroscience: Foundations of Adaptive Networks, Michael Gabriel and John Moore, editors. The MIT Press, Cambridge, Massachusetts, 1991.
- (1989) Learning and sequential decision making
- Barto, A. G.¹ Sutton, R. S.² Watkins, C. J. C. H.³

2
- 0003565779
- [Bertsekas, 1987] Prentice-Hall
- [Bertsekas, 1987] Bertsekas, D. P. 1987. Dynamic Programming: Deterministic and Stochastic Models. Prentice-Hall.
- (1987) Dynamic Programming: Deterministic and Stochastic Models
- Bertsekas, D. P.¹

3
- 0011530731
- [Boyan, 1992] Master's thesis, Department of Engineering and Computer Laboratory, University of Cambridge, Cambridge, England
- [Boyan, 1992] Boyan, Justin A. 1992. Modular neural networks for learning context-dependent game strategies. Master's thesis, Department of Engineering and Computer Laboratory, University of Cambridge, Cambridge, England.
- (1992) Modular neural networks for learning context-dependent game strategies
- Boyan, Justin A.¹

4
- 85152548661
- Consideration of risk in reinforcement learning
- [Heger, 1994] To appear
- [Heger, 1994] Heger, Matthias 1994. Consideration of risk in reinforcement learning. In Proceedings of the Machine Learning Conference. To appear.
- (1994) Proceedings of the Machine Learning Conference
- Heger, Matthias¹

5
- 0003644124
- [Howard, 1960] The MIT Press, Cambridge, Massachusetts
- [Howard, 1960] Howard, Ronald A. 1960. Dynamic Programming and Markov Processes. The MIT Press, Cambridge, Massachusetts.
- (1960) Dynamic Programming and Markov Processes
- Howard, Ronald A.¹

6
- 0004260006
- [Owen, 1982] Academic Press, Orlando, Florida
- [Owen, 1982] Owen, Guillermo 1982. Game Theory: Second edition. Academic Press, Orlando, Florida.
- (1982) Game Theory: Second edition
- Owen, Guillermo¹

7
- 0001201756
- Some studies in machine learning using the game of checkers
- [Samuel, 1959] Reprinted in E. A. Feigenbaum and J. Feldman, editors, Computers and Thought, McGraw-Hill, New York 1963
- [Samuel, 1959] Samuel, A. L. 1959. Some studies in machine learning using the game of checkers. IBM Journal of Research and Development 3:211-229. Reprinted in E. A. Feigenbaum and J. Feldman, editors, Computers and Thought, McGraw-Hill, New York 1963.
- (1959) IBM Journal of Research and Development , vol.3 , pp. 211-229
- Samuel, A. L.¹

8
- 0000433333
- Using the td(lambda) algorithm to learn an evaluation function for the game of go
- [Schraudolph et al, 1994] San Mateo, CA. Morgan Kaufmann. To appear
- [Schraudolph et al, 1994] Schraudolph, Nicol N.; Dayan, Peter; and Sejnowski, Terrence J. 1994. Using the td(lambda) algorithm to learn an evaluation function for the game of go. In Advances in Neural Information Processing Systems 6, San Mateo, CA. Morgan Kaufmann. To appear.
- (1994) Advances in Neural Information Processing Systems , vol.6
- Schraudolph, Nicol N.¹ Dayan, Peter² Sejnowski, Terrence J.³

9
- 85152626183
- A reinforcement learning method for maximizing undiscounted rewards
- [Schwartz, 1993] Amherst, Massachusetts. Morgan Kaufmann
- [Schwartz, 1993] Schwartz, Anton 1993. A reinforcement learning method for maximizing undiscounted rewards. In Proceedings of the Tenth International Conference on Machine Learning, Amherst, Massachusetts. Morgan Kaufmann. 298-305.
- (1993) Proceedings of the Tenth International Conference on Machine Learning , pp. 298-305
- Schwartz, Anton¹

10
- 85152558379
- Model-free reinforcement learning for non-markovian decision problems
- [Singh et al, 1994] To appear
- [Singh et al, 1994] Singh, Satinder Pal; Jaakkola, Tommi; and Jordan, Michael I. 1994. Model-free reinforcement learning for non-markovian decision problems. In Proceedings of the Machine Learning Conference. To appear.
- (1994) Proceedings of the Machine Learning Conference
- Singh, Satinder Pal¹ Jaakkola, Tommi² Jordan, Michael³

11
- 0003449348
- [Strang, 1980] Academic Press, Orlando, Florida
- [Strang, 1980] Strang, Gilbert 1980. Linear Algebra and its applications: second edition. Academic Press, Orlando, Florida.
- (1980) Linear Algebra and its applications: second edition
- Strang, Gilbert¹

12
- 85152198941
- Multi-agent reinforcement learning: independent vs. cooperative agents
- [Tan, 1993] Amherst, Massachusetts. Morgan Kaufmann
- [Tan, 1993] Tan, M. 1993. Multi-agent reinforcement learning: independent vs. cooperative agents. In Proceedings of the Tenth International Conference on Machine Learning, Amherst, Massachusetts. Morgan Kaufmann.
- (1993) Proceedings of the Tenth International Conference on Machine Learning
- Tan, M.¹

13
- 2542485629
- Practical issues in temporal difference
- [Tesauro, 1992] Moody, J. E.; Lippman, D. S.; and Hanson, S. J., editors 1992, San Mateo, CA. Morgan Kaufmann
- [Tesauro, 1992] Tesauro, G. J. 1992. Practical issues in temporal difference. In Moody, J. E.; Lippman, D. S.; and Hanson, S. J., editors 1992, Advances in Neural Information Processing Systems 4, San Mateo, CA. Morgan Kaufmann. 259-266.
- (1992) Advances in Neural Information Processing Systems , vol.4 , pp. 259-266
- Tesauro, G. J.¹

14
- 0141824325
- Stochastic dynamic programming
- [Van Der Wal, 1981] Morgan Kaufmann, Amsterdam
- [Van Der Wal, 1981] Van Der Wal, J. 1981. Stochastic dynamic programming. In Mathematical Centre Tracts 139. Morgan Kaufmann, Amsterdam.
- (1981) Mathematical Centre Tracts , vol.139
- Van Der Wal, J.¹

15
- 84884079276
- [von Neumann and Morgenstern, 1947] Princeton University Press, Princeton, New Jersey
- [von Neumann and Morgenstern, 1947] von Neumann, J. and Morgenstern, O. 1947. Theory of Games and Economic Behavior. Princeton University Press, Princeton, New Jersey.
- (1947) Theory of Games and Economic Behavior
- von Neumann, J.¹ Morgenstern, O.²

16
- 34249833101
- Q-learning
- [Watkins and Dayan, 1992]
- [Watkins and Dayan, 1992] Watkins, C. J. C. H. and Dayan, P. 1992. Q-learning. Machine Learning 8(3):279-292.
- (1992) Machine Learning , vol.8 , Issue.3 , pp. 279-292
- Watkins, C. J. C. H.¹ Dayan, P.²

17
- 0004049893
- [Watkins, 1989] Ph.D. Dissertation, Cambridge University
- [Watkins, 1989] Watkins, C. J.C.H. 1989. Learning with Delayed Rewards. Ph.D. Dissertation, Cambridge University.
- (1989) Learning with Delayed Rewards
- Watkins, C. J.C.H.¹

18
- 0001875923
- An adaptive communication protocol for cooperating mobile robots
- [Yanco and Stein, 1993] Meyer, Jean-Arcady; Roitblat, H. L.; and Wilson, Stewart W., editors 1993, MIT Press/Bradford Books. 478-485
- [Yanco and Stein, 1993] Yanco, Holly and Stein, Lynn Andrea 1993. An adaptive communication protocol for cooperating mobile robots. In Meyer, Jean-Arcady; Roitblat, H. L.; and Wilson, Stewart W., editors 1993, From Animals to Animats: Proceedings of the Second International Conference on the Simulation of Adaptive Behavior. MIT Press/Bradford Books. 478-485.
- (1993) From Animals to Animats: Proceedings of the Second International Conference on the Simulation of Adaptive Behavior
- Yanco, Holly¹ Stein, Lynn Andrea²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.