SCOPUS Information Retrieval Platform

Machine Learning

Volume 12, Issue 4, 1998, Pages 201-233

Colearning in Differential Games

(1) Sheppard, John W a

a ARINC (United States)

Author keywords

Differential games; Markov games; Multiagent learning; Pursuit games; Q learning; Reinforcement learning

Indexed keywords

ALGORITHMS; ARTIFICIAL INTELLIGENCE; GAME THEORY; PROBLEM SOLVING; SET THEORY; TREES (MATHEMATICS);

DIFFERENTIAL GAMES; MULTIAGENT LEARNING; REINFORCEMENT LEARNING;

LEARNING SYSTEMS;

EID: 0032207552 PISSN: 08856125 EISSN: None Source Type: Journal
DOI: 10.1023/a:1007566607659 Document Type: Article

Times cited : (14)

References (52)

1
- 0000217085
- Tolerating noise, irrelevant, and novel attributes in instance-based learning algorithms
- Aha, D. (1992). Tolerating noise, irrelevant, and novel attributes in instance-based learning algorithms. International Journal of Man-Machine Studies, 16, 267-287.
- (1992) International Journal of Man-Machine Studies , vol.16 , pp. 267-287
- Aha, D.¹

2
- 0042353224
- (Technical Report CS-94-121), Department of Computer Science, Colorado State University
- Anderson, C. & Crawford-Hines, S. (1994). Multigrid Q-learning. (Technical Report CS-94-121), Department of Computer Science, Colorado State University.
- (1994) Multigrid Q-learning
- Anderson, C.¹ Crawford-Hines, S.²

3
- 0002700781
- Learning to act using real-time dynamic programming
- Barto, A., Bradtke, S., & Singh, S. (1993). Learning to act using real-time dynamic programming. Artificial Intelligence.
- (1993) Artificial Intelligence
- Barto, A.¹ Bradtke, S.² Singh, S.³

4
- 0002201501
- Learning and sequential decision making
- Gabriel & Moore (Eds.), Cambridge, MA: MIT Press
- Barto, A., Sutton, R., & Watkins, C. (1990). Learning and sequential decision making. In Gabriel & Moore (Eds.), Learning and computational neuroscience (pp. 539-602). Cambridge, MA: MIT Press.
- (1990) Learning and Computational Neuroscience , pp. 539-602
- Barto, A.¹ Sutton, R.² Watkins, C.³

5
- 0004071782
- London: Academic Press
- Basar, T. & Olsder, G. (1982). Dynamic noncooperative game theory. London: Academic Press.
- (1982) Dynamic Noncooperative Game Theory
- Basar, T.¹ Olsder, G.²

6
- 0018999905
- Multidimensional divide and conquer
- Bentley, J. (1980). Multidimensional divide and conquer. Communications of the ACM, 23(4), 214-229.
- (1980) Communications of the ACM , vol.23 , Issue.4 , pp. 214-229
- Bentley, J.¹

7
- 0003565779
- Prentice-Hall, Inc.
- Bertsekas, D. (1987). Dynamic programming: Deterministic and stochastic models. Prentice-Hall, Inc.
- (1987) Dynamic Programming: Deterministic and Stochastic Models
- Bertsekas, D.¹

8
- 0004254013
- Ph.D. thesis, Department of Computer Science, University of California at Los Angeles, Los Angeles, CA
- Collins, R. (1992). Studies in artificial evolution. Ph.D. thesis, Department of Computer Science, University of California at Los Angeles, Los Angeles, CA.
- (1992) Studies in Artificial Evolution
- Collins, R.¹

9
- 0000430514
- The convergence of TD(λ) for general λ
- Dayan, P. (1992). The convergence of TD(λ) for general λ. Machine Learning, 8, 341-362.
- (1992) Machine Learning , vol.8 , pp. 341-362
- Dayan, P.¹

10
- 0002646549
- Multiresolution instance-based learning
- Deng, K. & Moore, A. (1995). Multiresolution instance-based learning. Proceedings of the 1995 International Joint Conference on Artificial Intelligence.
- (1995) Proceedings of the 1995 International Joint Conference on Artificial Intelligence
- Deng, K.¹ Moore, A.²

11
- 0030082551
- The ant system: Optimization by a colony of cooperating agents
- Dorigo, M., Maniezzo, V., & Colorni, A. (1996). The ant system: Optimization by a colony of cooperating agents. IEEE Transactions on Systems, Man, and Cybernetics, 26(1), 1-13.
- (1996) IEEE Transactions on Systems, Man, and Cybernetics , vol.26 , Issue.1 , pp. 1-13
- Dorigo, M.¹ Maniezzo, V.² Colorni, A.³

12
- 0003801740
- Unpublished manuscript, August
- Erev, I. & Roth, A. (1995). On the need for low rationality, cognitive game theory: Reinforcement learning in experimental games with unique mixed strategy equilibria. Unpublished manuscript, August.
- (1995) On the Need for Low Rationality, Cognitive Game Theory: Reinforcement Learning in Experimental Games with Unique Mixed Strategy Equilibria
- Erev, I.¹ Roth, A.²

13
- 0000146518
- Credit assignment in rule discovery systems based on genetic algorithms
- Grefenstette, J. (1988). Credit assignment in rule discovery systems based on genetic algorithms. Machine Learning, 3, 225-245.
- (1988) Machine Learning , vol.3 , pp. 225-245
- Grefenstette, J.¹

14
- 0000440954
- Lamarckian learning in multi-agent environments
- Morgan Kaufmann
- Grefenstette, J. (1991). Lamarckian learning in multi-agent environments. Proceedings of the Fourth International Conference on Genetic Algorithms (pp. 303-310). Morgan Kaufmann.
- (1991) Proceedings of the Fourth International Conference on Genetic Algorithms , pp. 303-310
- Grefenstette, J.¹

15
- 2542500846
- Methods for competitive and cooperative coevolution
- AAAI Press
- Grefenstette, J. & Daley, R. (1995). Methods for competitive and cooperative coevolution. Adaptation, Coevolution, and Learning in Multiagent Systems (ICMAS '95), (pp. 276-282), AAAI Press.
- (1995) Adaptation, Coevolution, and Learning in Multiagent Systems (ICMAS '95) , pp. 276-282
- Grefenstette, J.¹ Daley, R.²

16
- 0000488536
- Learning sequential decision rules using simulation models and competition
- Grefenstette, J., Ramsey, C., & Schultz, A. (1990). Learning sequential decision rules using simulation models and competition. Machine Learning, 5, 355-381.
- (1990) Machine Learning , vol.5 , pp. 355-381
- Grefenstette, J.¹ Ramsey, C.² Schultz, A.³

17
- 0008815437
- Residual advantage learning applied to a differential game
- Harmon, M. & Baird, L. (1995). Residual advantage learning applied to a differential game. Neural Information Processing Systems, 7.
- (1995) Neural Information Processing Systems , vol.7
- Harmon, M.¹ Baird, L.²

18
- 21844483885
- Reinforcement learning applied to a differential game
- Harmon, M., Baird, L., & Klopf, A. (1995). Reinforcement learning applied to a differential game. Adaptive Behavior.
- (1995) Adaptive Behavior
- Harmon, M.¹ Baird, L.² Klopf, A.³

19
- 0012193615
- Ph.D. thesis, Department of Computer Science, The Johns Hopkins University, Baltimore, MD
- Heath, D. (1992). A geometric framework for machine learning. Ph.D. thesis, Department of Computer Science, The Johns Hopkins University, Baltimore, MD.
- (1992) A Geometric Framework for Machine Learning
- Heath, D.¹

20
- 27144523979
- Evolutionary games and computer simulations
- Huberman, B. & Glance, N. (1995). Evolutionary games and computer simulations. Proceedings of the National Academy of Sciences.
- (1995) Proceedings of the National Academy of Sciences
- Huberman, B.¹ Glance, N.²

21
- 0004251759
- New York, NY: Robert E. Krieger
- Isaacs, R. (1975). Differential Games. New York, NY: Robert E. Krieger.
- (1975) Differential Games
- Isaacs, R.¹

22
- 0009262279
- New York, NY: Springer-Verlag
- Lewin, J. (1994). Differential Games. New York, NY: Springer-Verlag.
- (1994) Differential Games
- Lewin, J.¹

23
- 85149834820
- Markov games as a framework for multi-agent reinforcement learning
- New Brunswick, NJ: Morgan Kaufmann
- Littman, M. (1994). Markov games as a framework for multi-agent reinforcement learning. Proceedings of the Eleventh International Machine Learning Conference (pp. 157-163). New Brunswick, NJ: Morgan Kaufmann.
- (1994) Proceedings of the Eleventh International Machine Learning Conference , pp. 157-163
- Littman, M.¹

24
- 0003861655
- Ph.D. thesis, Department of Computer Science, Brown University
- Littman, M. (1996). Algorithms for sequential decision making. Ph.D. thesis, Department of Computer Science, Brown University.
- (1996) Algorithms for Sequential Decision Making
- Littman, M.¹

25
- 0003442587
- Ph.D. thesis, Computer Laboratory, Cambridge University
- Moore, A. (1990). Efficient memory-based learning for robot control. Ph.D. thesis, Computer Laboratory, Cambridge University.
- (1990) Efficient Memory-based Learning for Robot Control
- Moore, A.¹

26
- 0029514510
- The parti-game algorithm for variable resolution reinforcement learning in multidimensional state-spaces
- Moore, A. & Atkeson, C. (1995). The parti-game algorithm for variable resolution reinforcement learning in multidimensional state-spaces. Machine Learning.
- (1995) Machine Learning
- Moore, A.¹ Atkeson, C.²

27
- 0004194203
- Ph.D. thesis, Department of Computer Science, The Johns Hopkins University, Baltimore, MD
- Murthy, S. (1995). On growing better decision trees from data. Ph.D. thesis, Department of Computer Science, The Johns Hopkins University, Baltimore, MD.
- (1995) On Growing Better Decision Trees from Data
- Murthy, S.¹

28
- 0002699888
- A coevolutionary approach to learning sequential decision rules
- Potter, M., De Jong, K., & Grefenstette, J. (1995). A coevolutionary approach to learning sequential decision rules. Proceedings of the International Conference on Genetic Algorithms (pp. 366-372).
- (1995) Proceedings of the International Conference on Genetic Algorithms , pp. 366-372
- Potter, M.¹ De Jong, K.² Grefenstette, J.³

29
- 85003405838
- Pursuit-evasion of two aircraft in a horizontal plane
- Rajan, N., Prasad, U., & Rao, N. (1980). Pursuit-evasion of two aircraft in a horizontal plane. Journal of Guidance and Control, 3(3), 261-267.
- (1980) Journal of Guidance and Control , vol.3 , Issue.3 , pp. 261-267
- Rajan, N.¹ Prasad, U.² Rao, N.³

30
- 0002025849
- Deals among rational agents
- Rosenschein, J. & Genesereth, M. (1985). Deals among rational agents. Proceedings of the 1985 International Joint Conference on Artificial Intelligence (pp. 91-99).
- (1985) Proceedings of the 1985 International Joint Conference on Artificial Intelligence , pp. 91-99
- Rosenschein, J.¹ Genesereth, M.²

31
- 58149324992
- Learning in extensive-form games: Experimental data and simple dynamic models in the intermediate term
- Roth, A. & Erev, I. (1995). Learning in extensive-form games: Experimental data and simple dynamic models in the intermediate term. Games and Economic Behavior, 8, 164-212.
- (1995) Games and Economic Behavior , vol.8 , pp. 164-212
- Roth, A.¹ Erev, I.²

32
- 84994684889
- Distance metrics for instance-based learning
- Salzberg, S. (1991). Distance metrics for instance-based learning. Methodologies for Intelligent Systems: 6th International Symposium (pp. 399-408).
- (1991) Methodologies for Intelligent Systems: 6th International Symposium , pp. 399-408
- Salzberg, S.¹

33
- 0001201756
- Some studies in machine learning using the game of checkers
- Samuel, A. (1959). Some studies in machine learning using the game of checkers. IBM Journal of Research and Development, 3(3), 211-229.
- (1959) IBM Journal of Research and Development , vol.3 , Issue.3 , pp. 211-229
- Samuel, A.¹

34
- 0030050933
- Multiagent reinforcement learning in the iterated prisoner's dilemma
- Sandholm, T. & Crites, R. (1995). Multiagent reinforcement learning in the iterated prisoner's dilemma. Biosystems, 37, 147-166.
- (1995) Biosystems , vol.37 , pp. 147-166
- Sandholm, T.¹ Crites, R.²

35
- 0007918330
- A general method for incremental self-improvement and multi-agent learning in unrestricted environments
- Schmidhuber, J. (1996). A general method for incremental self-improvement and multi-agent learning in unrestricted environments. Evolutionary Computation: Theory and Applications.
- (1996) Evolutionary Computation: Theory and Applications
- Schmidhuber, J.¹

36
- 2542496166
- Ph.D. thesis, Department of Computer Science, The Johns Hopkins University, Baltimore, MD
- Sheppard, J. (1996). Multi-agent reinforcement learning in Markov games. Ph.D. thesis, Department of Computer Science, The Johns Hopkins University, Baltimore, MD.
- (1996) Multi-agent Reinforcement Learning in Markov Games
- Sheppard, J.¹

37
- 0031071704
- A teaching method for memory-based control
- Sheppard, J. & Salzberg, S. (1997). A teaching method for memory-based control. Artificial Intelligence Review, 11, 343-370.
- (1997) Artificial Intelligence Review , vol.11 , pp. 343-370
- Sheppard, J.¹ Salzberg, S.²

38
- 0020199330
- A self-learning automaton with variable resolution for high precision assembly by industrial robots
- Simons, J., van Brussel, H., DeSchutter, J., & Verhaert, J. (1982). A self-learning automaton with variable resolution for high precision assembly by industrial robots. IEEE Transactions on Automatic Control, 27(5), 1109-1113.
- (1982) IEEE Transactions on Automatic Control , vol.27 , Issue.5 , pp. 1109-1113
- Simons, J.¹ Van Brussel, H.² Deschutter, J.³ Verhaert, J.⁴

39
- 2342589900
- (Technical Report TCGA Report No. 94002), University of Alabama, Tuscaloosa, Alabama
- Smith, R. & Gray, B. (1993). Co-adaptive genetic algorithms: An example in Othello strategy. (Technical Report TCGA Report No. 94002), University of Alabama, Tuscaloosa, Alabama.
- (1993) Co-adaptive Genetic Algorithms: An Example in Othello Strategy
- Smith, R.¹ Gray, B.²

40
- 2542440859
- Iterated prisoner's dilemma with choice and refusal of partners
- Santa Fe Institute
- Stanley, E., Ashlock, D., & Tesfatsion, L. (1993). Iterated prisoner's dilemma with choice and refusal of partners. Proceedings of Alife III. Santa Fe Institute.
- (1993) Proceedings of Alife III
- Stanley, E.¹ Ashlock, D.² Tesfatsion, L.³

41
- 0042049192
- (Technical Report COINS TR 93-27), Amherst, MA: University of Massachusetts
- Sugawara, T. & Lesser, V. (1993). On-line learning of coordination plans. (Technical Report COINS TR 93-27), Amherst, MA: University of Massachusetts.
- (1993) On-line Learning of Coordination Plans
- Sugawara, T.¹ Lesser, V.²

42
- 33847202724
- Learning to predict by methods of temporal differences
- Sutton, R. (1988). Learning to predict by methods of temporal differences. Machine Learning, 3, 9-44.
- (1988) Machine Learning , vol.3 , pp. 9-44
- Sutton, R.¹

43
- 85156255116
- Beating a defender in robotic soccer: Memory-based learning of a continuous function
- Stone, P. & Veloso, M. (1995). Beating a defender in robotic soccer: Memory-based learning of a continuous function. Proceedings of Neural Information Processing Systems.
- (1995) Proceedings of Neural Information Processing Systems
- Stone, P.¹ Veloso, M.²

44
- 0006191664
- Multiagent systems: A survey from a machine learning perspective
- submitted
- Stone, P. & Veloso, M. (1996a). Multiagent systems: A survey from a machine learning perspective. IEEE Transactions on Knowledge and Data Engineering, submitted.
- (1996) IEEE Transactions on Knowledge and Data Engineering
- Stone, P.¹ Veloso, M.²

45
- 0009401114
- Towards collaborative and adversarial learning: A case study in robotic soccer
- Stone, P. & Veloso, M. (1996b). Towards collaborative and adversarial learning: A case study in robotic soccer. AAAI Spring Symposium on Adaptation, Co-Evolution, and Learning in Multiagent Systems.
- (1996) AAAI Spring Symposium on Adaptation, Co-Evolution, and Learning in Multiagent Systems
- Stone, P.¹ Veloso, M.²

46
- 0003248149
- Teamwork in real-world, dynamic environments
- AAAI Press
- Tambe, M. (1996a). Teamwork in real-world, dynamic environments. International Conference on Multiagent Systems, AAAI Press.
- (1996) International Conference on Multiagent Systems
- Tambe, M.¹

47
- 0030358984
- Tracking dynamic team activity
- AAAI Press
- Tambe, M. (1996b). Tracking dynamic team activity. Proceedings of the 13th National Conference on Artificial Intelligence. AAAI Press.
- (1996) Proceedings of the 13th National Conference on Artificial Intelligence
- Tambe, M.¹

48
- 85152198941
- Multi-agent reinforcement learning: Independent vs. cooperative agents
- San Mateo, CA: Morgan Kaufmann
- Tan, M. (1993). Multi-agent reinforcement learning: Independent vs. cooperative agents. Machine Learning: Proceedings of the Tenth International Conference, San Mateo, CA: Morgan Kaufmann.
- (1993) Machine Learning: Proceedings of the Tenth International Conference
- Tan, M.¹

49
- 0001046225
- Practical issues in temporal difference learning
- Tesauro, G. (1992). Practical issues in temporal difference learning. Machine Learning, 8, 257-277.
- (1992) Machine Learning , vol.8 , pp. 257-277
- Tesauro, G.¹

50
- 0029276036
- Temporal difference learning and TD-gammon
- Tesauro, G. (1995). Temporal difference learning and TD-gammon. Communications of the ACM (pp. 58-67).
- (1995) Communications of the ACM , pp. 58-67
- Tesauro, G.¹

51
- 0004049895
- Ph.D. thesis, Department of Computer Science, Cambridge University, Cambridge, England
- Watkins, C. (1989). Learning from delayed rewards. Ph.D. thesis, Department of Computer Science, Cambridge University, Cambridge, England.
- (1989) Learning from Delayed Rewards
- Watkins, C.¹

52
- 34249833101
- Q-learning
- Watkins, C. & Dayan, P. (1992). Q-learning. Machine Learning, 8, 279-292.
- (1992) Machine Learning , vol.8 , pp. 279-292
- Watkins, C.¹ Dayan, P.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.