



Volume 12, Issue 10, 1997, Pages 695-724

Training and delayed reinforcements in Q-learning agents

Author keywords

[No Author keywords available]

Indexed keywords

ARTIFICIAL INTELLIGENCE; COMPUTER SIMULATION; LEARNING SYSTEMS; ROBOTS

EID: 0031257934     PISSN: 08848173     EISSN: None     Source Type: Journal    
DOI: 10.1002/(SICI)1098-111X(199710)12:10<695::AID-INT1>3.0.CO;2-T     Document Type: Article
Times cited : (16)

References (23)
  • 2
    • 33847202724
    • Learning to predict by the methods of temporal differences
    • R.S. Sutton, "Learning to predict by the methods of temporal differences," Mach. Learn., 3, 9-44 (1988).
    • (1988) Mach. Learn. , vol.3 , pp. 9-44
    • Sutton, R.S.1
  • 3
    • 0003617454
    • Ph.D. Thesis, Department of Computer and Information Science, University of Massachusetts, Amherst, MA
    • R.S. Sutton, "Temporal credit assignment in reinforcement learning," Ph.D. Thesis, Department of Computer and Information Science, University of Massachusetts, Amherst, MA, 1984.
    • (1984) Temporal Credit Assignment in Reinforcement Learning
    • Sutton, R.S.1
  • 5
    • 0024735689
    • Classifier systems and genetic algorithms
    • L. Booker, D.E. Goldberg, and J.H. Holland, "Classifier systems and genetic algorithms," Artif. Intell., 40, 235-282 (1989).
    • (1989) Artif. Intell. , vol.40 , pp. 235-282
    • Booker, L.1    Goldberg, D.E.2    Holland, J.H.3
  • 6
    • 0004049895
    • Ph.D. Dissertation, Psychology Department, University of Cambridge, England
    • C.J.C.H. Watkins, "Learning with delayed rewards," Ph.D. Dissertation, Psychology Department, University of Cambridge, England, 1989.
    • (1989) Learning with Delayed Rewards
    • Watkins, C.J.C.H.1
  • 8
    • 0003411271
    • Efficient Exploration in Reinforcement Learning
    • Carnegie Mellon University, Pittsburgh, PA
    • S.B. Thrun, Efficient Exploration in Reinforcement Learning, Technical Report CMU-CS-92-102, Carnegie Mellon University, Pittsburgh, PA, 1992.
    • (1992) Technical Report CMU-CS-92-102
    • Thrun, S.B.1
  • 10
    • 0029326107
    • ALECSYS and the autonoMouse: Learning to control a real robot by distributed classifier systems
    • M. Dorigo, "ALECSYS and the autonoMouse: Learning to control a real robot by distributed classifier systems," Mach. Learn., 19, 209-240 (1995).
    • (1995) Mach. Learn. , vol.19 , pp. 209-240
    • Dorigo, M.1
  • 11
    • 0028739953
    • Robot shaping: Developing autonomous agents through learning
    • M. Dorigo and M. Colombetti, "Robot shaping: Developing autonomous agents through learning," Artif. Intell., 71, 321-370 (1994).
    • (1994) Artif. Intell. , vol.71 , pp. 321-370
    • Dorigo, M.1    Colombetti, M.2
  • 13
    • 85132026293
    • Integrated architectures for learning, planning, and reacting based on approximating dynamic programming
    • Morgan Kaufmann, San Mateo, CA
    • R.S. Sutton, "Integrated architectures for learning, planning, and reacting based on approximating dynamic programming," Proceedings of the Seventh International Conference on Machine Learning, Morgan Kaufmann, San Mateo, CA, 1990, pp. 216-224.
    • (1990) Proceedings of the Seventh International Conference on Machine Learning , pp. 216-224
    • Sutton, R.S.1
  • 15
    • 0000123778
    • Self-improving reactive agents based on reinforcement learning, planning and teaching
    • L-J. Lin, "Self-improving reactive agents based on reinforcement learning, planning and teaching," Mach. Learn., 8, 293-322 (1992).
    • (1992) Mach. Learn. , vol.8 , pp. 293-322
    • Lin, L.-J.1
  • 17
    • 0030149709
    • Purposive behavior acquisition for a real robot by vision-based reinforcement learning
    • M. Asada, S. Noda, S. Tawaratsumida, and K. Hosoda, "Purposive behavior acquisition for a real robot by vision-based reinforcement learning," Mach. Learn., 23, 279-303 (1996).
    • (1996) Mach. Learn. , vol.23 , pp. 279-303
    • Asada, M.1    Noda, S.2    Tawaratsumida, S.3    Hosoda, K.4
  • 18
    • 0026880130
    • Automatic programming of behavior-based robots using reinforcement learning
    • S. Mahadevan and J. Connell, "Automatic programming of behavior-based robots using reinforcement learning," Artif. Intell., 55, 311-365 (1992).
    • (1992) Artif. Intell. , vol.55 , pp. 311-365
    • Mahadevan, S.1    Connell, J.2
  • 19
    • 0007908166
    • Experiments with reinforcement learning in problems with continuous state and action spaces
    • Department of Computer Science, University of Massachusetts, Amherst, MA
    • J.C. Santamaria, R.S. Sutton, and A. Ram, "Experiments with reinforcement learning in problems with continuous state and action spaces," Technical Report UM-CS-1996-088, Department of Computer Science, University of Massachusetts, Amherst, MA, 1996.
    • (1996) Technical Report UM-CS-1996-088
    • Santamaria, J.C.1    Sutton, R.S.2    Ram, A.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.