Volume 12, Issue 4, 1998, Pages 201-233

Colearning in Differential Games

Author keywords

Differential games; Markov games; Multiagent learning; Pursuit games; Q learning; Reinforcement learning

Indexed keywords

ALGORITHMS; ARTIFICIAL INTELLIGENCE; GAME THEORY; PROBLEM SOLVING; SET THEORY; TREES (MATHEMATICS);

EID: 0032207552     PISSN: 08856125     EISSN: None     Source Type: Journal    
DOI: 10.1023/a:1007566607659     Document Type: Article
Times cited : (14)

References (52)
  • 1
    • 0000217085
    • Tolerating noise, irrelevant, and novel attributes in instance-based learning algorithms
    • Aha, D. (1992). Tolerating noise, irrelevant, and novel attributes in instance-based learning algorithms. International Journal of Man-Machine Studies, 36, 267-287.
    • (1992) International Journal of Man-Machine Studies , vol.36 , pp. 267-287
    • Aha, D.1
  • 2
    • 0042353224
    • (Technical Report CS-94-121), Department of Computer Science, Colorado State University
    • Anderson, C. & Crawford-Hines, S. (1994). Multigrid Q-learning. (Technical Report CS-94-121), Department of Computer Science, Colorado State University.
    • (1994) Multigrid Q-learning
    • Anderson, C.1    Crawford-Hines, S.2
  • 4
    • 0002201501
    • Learning and sequential decision making
    • Gabriel & Moore (Eds.), Cambridge, MA: MIT Press
    • Barto, A., Sutton, R., & Watkins, C. (1990). Learning and sequential decision making. In Gabriel & Moore (Eds.), Learning and computational neuroscience (pp. 539-602). Cambridge, MA: MIT Press.
    • (1990) Learning and Computational Neuroscience , pp. 539-602
    • Barto, A.1    Sutton, R.2    Watkins, C.3
  • 6
    • 0018999905
    • Multidimensional divide and conquer
    • Bentley, J. (1980). Multidimensional divide and conquer. Communications of the ACM, 23(4), 214-229.
    • (1980) Communications of the ACM , vol.23 , Issue.4 , pp. 214-229
    • Bentley, J.1
  • 8
    • 0004254013
    • Ph.D. thesis, Department of Computer Science, University of California at Los Angeles, Los Angeles, CA
    • Collins, R. (1992). Studies in artificial evolution. Ph.D. thesis, Department of Computer Science, University of California at Los Angeles, Los Angeles, CA.
    • (1992) Studies in Artificial Evolution
    • Collins, R.1
  • 9
    • 0000430514
    • The convergence of TD(λ) for general λ
    • Dayan, P. (1992). The convergence of TD(λ) for general λ. Machine Learning, 8, 341-362.
    • (1992) Machine Learning , vol.8 , pp. 341-362
    • Dayan, P.1
  • 13
    • 0000146518
    • Credit assignment in rule discovery systems based on genetic algorithms
    • Grefenstette, J. (1988). Credit assignment in rule discovery systems based on genetic algorithms. Machine Learning, 3, 225-245.
    • (1988) Machine Learning , vol.3 , pp. 225-245
    • Grefenstette, J.1
  • 16
    • 0000488536
    • Learning sequential decision rules using simulation models and competition
    • Grefenstette, J., Ramsey, C., & Schultz, A. (1990). Learning sequential decision rules using simulation models and competition. Machine Learning, 5, 355-381.
    • (1990) Machine Learning , vol.5 , pp. 355-381
    • Grefenstette, J.1    Ramsey, C.2    Schultz, A.3
  • 18
    • 21844483885
    • Reinforcement learning applied to a differential game
    • Harmon, M., Baird, L., & Klopf, A. (1995). Reinforcement learning applied to a differential game. Adaptive Behavior.
    • (1995) Adaptive Behavior
    • Harmon, M.1    Baird, L.2    Klopf, A.3
  • 19
    • 0012193615
    • Ph.D. thesis, Department of Computer Science, The Johns Hopkins University, Baltimore, MD
    • Heath, D. (1992). A geometric framework for machine learning. Ph.D. thesis, Department of Computer Science, The Johns Hopkins University, Baltimore, MD.
    • (1992) A Geometric Framework for Machine Learning
    • Heath, D.1
  • 21
    • 0004251759
    • New York, NY: Robert E. Krieger
    • Isaacs, R. (1975). Differential Games. New York, NY: Robert E. Krieger.
    • (1975) Differential Games
    • Isaacs, R.1
  • 22
    • 0009262279
    • New York, NY: Springer-Verlag
    • Lewin, J. (1994). Differential Games. New York, NY: Springer-Verlag.
    • (1994) Differential Games
    • Lewin, J.1
  • 23
    • 85149834820
    • Markov games as a framework for multi-agent reinforcement learning
    • New Brunswick, NJ: Morgan Kaufmann
    • Littman, M. (1994). Markov games as a framework for multi-agent reinforcement learning. Proceedings of the Eleventh International Machine Learning Conference (pp. 157-163). New Brunswick, NJ: Morgan Kaufmann.
    • (1994) Proceedings of the Eleventh International Machine Learning Conference , pp. 157-163
    • Littman, M.1
  • 26
    • 0029514510
    • The parti-game algorithm for variable resolution reinforcement learning in multidimensional state-spaces
    • Moore, A. & Atkeson, C. (1995). The parti-game algorithm for variable resolution reinforcement learning in multidimensional state-spaces. Machine Learning.
    • (1995) Machine Learning
    • Moore, A.1    Atkeson, C.2
  • 27
    • 0004194203
    • Ph.D. thesis, Department of Computer Science, The Johns Hopkins University, Baltimore, MD
    • Murthy, S. (1995). On growing better decision trees from data. Ph.D. thesis, Department of Computer Science, The Johns Hopkins University, Baltimore, MD.
    • (1995) On Growing Better Decision Trees from Data
    • Murthy, S.1
  • 29
    • 85003405838
    • Pursuit-evasion of two aircraft in a horizontal plane
    • Rajan, N., Prasad, U., & Rao, N. (1980). Pursuit-evasion of two aircraft in a horizontal plane. Journal of Guidance and Control, 3(3), 261-267.
    • (1980) Journal of Guidance and Control , vol.3 , Issue.3 , pp. 261-267
    • Rajan, N.1    Prasad, U.2    Rao, N.3
  • 31
    • 58149324992
    • Learning in extensive-form games: Experimental data and simple dynamic models in the intermediate term
    • Roth, A. & Erev, I. (1995). Learning in extensive-form games: Experimental data and simple dynamic models in the intermediate term. Games and Economic Behavior, 8, 164-212.
    • (1995) Games and Economic Behavior , vol.8 , pp. 164-212
    • Roth, A.1    Erev, I.2
  • 33
    • 0001201756
    • Some studies in machine learning using the game of checkers
    • Samuel, A. (1959). Some studies in machine learning using the game of checkers. IBM Journal of Research and Development, 3(3), 211-229.
    • (1959) IBM Journal of Research and Development , vol.3 , Issue.3 , pp. 211-229
    • Samuel, A.1
  • 34
    • 0030050933
    • Multiagent reinforcement learning in the iterated prisoner's dilemma
    • Sandholm, T. & Crites, R. (1995). Multiagent reinforcement learning in the iterated prisoner's dilemma. Biosystems, 37, 147-166.
    • (1995) Biosystems , vol.37 , pp. 147-166
    • Sandholm, T.1    Crites, R.2
  • 35
    • 0007918330
    • A general method for incremental self-improvement and multi-agent learning in unrestricted environments
    • Schmidhuber, J. (1996). A general method for incremental self-improvement and multi-agent learning in unrestricted environments. Evolutionary Computation: Theory and Applications.
    • (1996) Evolutionary Computation: Theory and Applications
    • Schmidhuber, J.1
  • 36
    • 2542496166
    • Ph.D. thesis, Department of Computer Science, The Johns Hopkins University, Baltimore, MD
    • Sheppard, J. (1996). Multi-agent reinforcement learning in Markov games. Ph.D. thesis, Department of Computer Science, The Johns Hopkins University, Baltimore, MD.
    • (1996) Multi-agent Reinforcement Learning in Markov Games
    • Sheppard, J.1
  • 38
    • 0020199330
    • A self-learning automaton with variable resolution for high precision assembly by industrial robots
    • Simons, J., van Brussel, H., DeSchutter, J., & Verhaert, J. (1982). A self-learning automaton with variable resolution for high precision assembly by industrial robots. IEEE Transactions on Automatic Control, 27(5), 1109-1113.
    • (1982) IEEE Transactions on Automatic Control , vol.27 , Issue.5 , pp. 1109-1113
    • Simons, J.1    Van Brussel, H.2    Deschutter, J.3    Verhaert, J.4
  • 40
    • 2542440859
    • Iterated prisoner's dilemma with choice and refusal of partners
    • Santa Fe Institute
    • Stanley, E., Ashlock, D., & Tesfatsion, L. (1993). Iterated prisoner's dilemma with choice and refusal of partners. Proceedings of Alife III. Santa Fe Institute.
    • (1993) Proceedings of Alife III
    • Stanley, E.1    Ashlock, D.2    Tesfatsion, L.3
  • 41
    • 0042049192
    • (Technical Report COINS TR 93-27), Amherst, MA: University of Massachusetts
    • Sugawara, T. & Lesser, V. (1993). On-line learning of coordination plans. (Technical Report COINS TR 93-27), Amherst, MA: University of Massachusetts.
    • (1993) On-line Learning of Coordination Plans
    • Sugawara, T.1    Lesser, V.2
  • 42
    • 33847202724
    • Learning to predict by methods of temporal differences
    • Sutton, R. (1988). Learning to predict by methods of temporal differences. Machine Learning, 3, 9-44.
    • (1988) Machine Learning , vol.3 , pp. 9-44
    • Sutton, R.1
  • 48
    • 85152198941
    • Multi-agent reinforcement learning: Independent vs. cooperative agents
    • San Mateo, CA: Morgan Kaufmann
    • Tan, M. (1993). Multi-agent reinforcement learning: Independent vs. cooperative agents. Machine Learning: Proceedings of the Tenth International Conference, San Mateo, CA: Morgan Kaufmann.
    • (1993) Machine Learning: Proceedings of the Tenth International Conference
    • Tan, M.1
  • 49
    • 0001046225
    • Practical issues in temporal difference learning
    • Tesauro, G. (1992). Practical issues in temporal difference learning. Machine Learning, 8, 257-277.
    • (1992) Machine Learning , vol.8 , pp. 257-277
    • Tesauro, G.1
  • 50
    • 0029276036
    • Temporal difference learning and TD-gammon
    • Tesauro, G. (1995). Temporal difference learning and TD-gammon. Communications of the ACM (pp. 58-67).
    • (1995) Communications of the ACM , pp. 58-67
    • Tesauro, G.1
  • 51
    • 0004049895
    • Ph.D. thesis, Department of Computer Science, Cambridge University, Cambridge, England
    • Watkins, C. (1989). Learning with delayed rewards. Ph.D. thesis, Department of Computer Science, Cambridge University, Cambridge, England.
    • (1989) Learning with Delayed Rewards
    • Watkins, C.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.