Volume , Issue , 1994, Pages 181-189

Reward Functions for Accelerated Learning

Author keywords

[No Author keywords available]

Indexed keywords

DOMAIN KNOWLEDGE; MULTI AGENT SYSTEMS;

EID: 84957895797     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1016/B978-1-55860-335-6.50030-1     Document Type: Conference Paper
Times cited : (302)

References (25)
  • 4
  • 5
    • 0001924166
    • Real Robots, Real Learning Problems
    • Kluwer Academic Press
    • Brooks, R. A. & Matarić, M. J. (1992), Real Robots, Real Learning Problems, in 'Robot Learning', Kluwer Academic Press, pp. 193-213.
    • (1992) Robot Learning , pp. 193-213
    • Brooks, R. A.1    Matarić, M. J.2
  • 6
    • 0002192119
    • Input Generalization in Delayed Reinforcement Learning: An Algorithm and Performance Comparisons
    • Sydney, Australia
    • Chapman, D. & Kaelbling, L. P. (1991), Input Generalization in Delayed Reinforcement Learning: An Algorithm and Performance Comparisons, in 'Proceedings, IJCAI-91', Sydney, Australia.
    • (1991) Proceedings, IJCAI-91
    • Chapman, D.1    Kaelbling, L. P.2
  • 7
    • 0000439891
    • On the Convergence of Stochastic Iterative Dynamic Programming Algorithms
    • Jaakkola, T. & Jordan, M. I. (1993), 'On the Convergence of Stochastic Iterative Dynamic Programming Algorithms', Submitted to Neural Computation.
    • (1993) Submitted to Neural Computation
    • Jaakkola, T.1    Jordan, M. I.2
  • 8
    • 44049116478
    • Forward Models: Supervised Learning with a Distal Teacher
    • Jordan, M. I. & Rumelhart, D. E. (1992), 'Forward Models: Supervised Learning with a Distal Teacher', Cognitive Science 16, 307-354.
    • (1992) Cognitive Science , vol.16 , pp. 307-354
    • Jordan, M. I.1    Rumelhart, D. E.2
  • 10
    • 84976813028
    • Learning to Coordinate Behaviors
    • Boston, MA
    • Maes, P. & Brooks, R. A. (1990), Learning to Coordinate Behaviors, in 'Proceedings, AAAI-90', Boston, MA, pp. 796-802.
    • (1990) Proceedings, AAAI-90 , pp. 796-802
    • Maes, P.1    Brooks, R. A.2
  • 11
    • 0002386181
    • Automatic Programming of Behavior-based Robots using Reinforcement Learning
    • Pittsburgh, PA
    • Mahadevan, S. & Connell, J. (1991), Automatic Programming of Behavior-based Robots using Reinforcement Learning, in 'Proceedings, AAAI-91', Pittsburgh, PA, pp. 8-14.
    • (1991) Proceedings, AAAI-91 , pp. 8-14
    • Mahadevan, S.1    Connell, J.2
  • 15
    • 0028374275
    • Robot Juggling: An Implementation of Memory-Based Learning
    • Schaal, S. & Atkeson, C. G. (1994), 'Robot Juggling: An Implementation of Memory-Based Learning', Control Systems Magazine.
    • (1994) Control Systems Magazine
    • Schaal, S.1    Atkeson, C. G.2
  • 16
    • 24044497495
    • Transfer of Learning Across Compositions of Sequential Tasks
    • Morgan Kaufmann, Evanston, Illinois
    • Singh, S. P. (1991), Transfer of Learning Across Compositions of Sequential Tasks, in 'Proceedings, Eighth International Conference on Machine Learning', Morgan Kaufmann, Evanston, Illinois, pp. 348-352.
    • (1991) Proceedings, Eighth International Conference on Machine Learning , pp. 348-352
    • Singh, S. P.1
  • 17
    • 33847202724
    • Learning to Predict by the Methods of Temporal Differences
    • Sutton, R. (1988), 'Learning to Predict by the Methods of Temporal Differences', The Journal of Machine Learning 3(1), 9-44.
    • (1988) The Journal of Machine Learning , vol.3 , Issue.1 , pp. 9-44
    • Sutton, R.1
  • 18
    • 85132026293
    • Integrated Architectures for Learning, Planning and Reacting Based on Approximating Dynamic Programming
    • Austin, Texas
    • Sutton, R. S. (1990), Integrated Architectures for Learning, Planning and Reacting Based on Approximating Dynamic Programming, in 'Proceedings, Seventh International Conference on Machine Learning', Austin, Texas.
    • (1990) Proceedings, Seventh International Conference on Machine Learning
    • Sutton, R. S.1
  • 19
    • 85152198941
    • Multi-Agent Reinforcement Learning: Independent vs. Cooperative Agents
    • Amherst, MA
    • Tan, M. (1993), Multi-Agent Reinforcement Learning: Independent vs. Cooperative Agents, in 'Proceedings, Tenth International Conference on Machine Learning', Amherst, MA, pp. 330-337.
    • (1993) Proceedings, Tenth International Conference on Machine Learning , pp. 330-337
    • Tan, M.1
  • 20
    • 2542485629
    • Practical Issues in Temporal Difference Learning
    • J. E. Moody, S. J. Hanson & R. P. Lippmann, eds, Morgan Kaufmann
    • Tesauro, G. (1992), Practical Issues in Temporal Difference Learning, in J. E. Moody, S. J. Hanson & R. P. Lippmann, eds, 'Advances in Neural Information Processing Systems 4', Morgan Kaufmann, pp. 259-267.
    • (1992) Advances in Neural Information Processing Systems 4 , pp. 259-267
    • Tesauro, G.1
  • 25
    • 0003326518
    • Learning Multiple Goal Behavior via Task Decomposition and Dynamic Policy Merging
    • J. H. Connell & S. Mahadevan, eds, Kluwer Academic Publishers
    • Whitehead, S. D., Karlsson, J. & Tenenberg, J. (1993), Learning Multiple Goal Behavior via Task Decomposition and Dynamic Policy Merging, in J. H. Connell & S. Mahadevan, eds, 'Robot Learning', Kluwer Academic Publishers, pp. 45-78.
    • (1993) Robot Learning , pp. 45-78
    • Whitehead, S. D.1    Karlsson, J.2    Tenenberg, J.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.