SCOPUS Information Retrieval Platform

Proceedings of the 11th International Conference on Machine Learning, ICML 1994

Volume , Issue , 1994, Pages 181-189

Reward Functions for Accelerated Learning

(1) Mataric, Maja J a

a MASSACHUSETTS INSTITUTE OF TECHNOLOGY (United States)

Author keywords

[No Author keywords available]

Indexed keywords

DOMAIN KNOWLEDGE; MULTI AGENT SYSTEMS;

DOMAIN KNOWLEDGE; FORAGING TASK; MODELING RESULTS; MULTIPLE AGENTS; POOR PERFORMANCE; REINFORCEMENT LEARNING ALGORITHMS; REINFORCEMENT LEARNING METHOD; REWARD FUNCTION; SINGLE-AGENT; TRADITIONAL REINFORCEMENTS;

REINFORCEMENT LEARNING;

EID: 84957895797 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1016/B978-1-55860-335-6.50030-1 Document Type: Conference Paper

Times cited : (302)

References (25)

1
- 0025449341
- What Are Plans for?
- P. Maes, ed., The MIT Press
- Agre, P. E. & Chapman, D. (1990), What Are Plans for?, in P. Maes, ed., 'Designing Autonomous Agents: Theory and Practice from Biology to Engineering and Back', The MIT Press, pp. 17-34.
- (1990) Designing Autonomous Agents: Theory and Practice from Biology to Engineering and Back , pp. 17-34
- Agre, P. E.¹ Chapman, D.²

2
- 84941507859
- Memory-Based Approaches to Approximating Continuous Functions
- Atkeson, C. G. (1990), Memory-Based Approaches to Approximating Continuous Functions, in 'Proceedings, Sixth Yale Workshop on Adaptive and Learning Systems'.
- (1990) Proceedings, Sixth Yale Workshop on Adaptive and Learning Systems
- Atkeson, C. G.¹

3
- 2342593717
- Learning to Act using Real-Time Dynamic Programming
- Barto, A. G., Bradtke, S. J. & Singh, S. P. (1993), 'Learning to Act using Real-Time Dynamic Programming', AI Journal.
- (1993) AI Journal
- Barto, A. G.¹ Bradtke, S. J.² Singh, S. P.³

4
- 0003645589
- Technical Report AIM-1127, MIT Artificial Intelligence Lab
- Brooks, R. A. (1990), The Behavior Language; User's Guide, Technical Report AIM-1127, MIT Artificial Intelligence Lab.
- (1990) The Behavior Language; User's Guide
- Brooks, R. A.¹

5
- 0001924166
- Real Robots, Real Learning Problems
- Kluwer Academic Press
- Brooks, R. A. & Matarić, M. J. (1992), Real Robots, Real Learning Problems, in 'Robot Learning', Kluwer Academic Press, pp. 193-213.
- (1992) Robot Learning , pp. 193-213
- Brooks, R. A.¹ Matarić, M. J.²

6
- 0002192119
- Input Generalization in Delayed Reinforcement Learning: An Algorithm and Performance Comparisons
- Sydney, Australia
- Chapman, D. & Kaelbling, L. P. (1991), Input Generalization in Delayed Reinforcement Learning: An Algorithm and Performance Comparisons, in 'Proceedings, IJCAI-91', Sydney, Australia.
- (1991) Proceedings, IJCAI-91
- Chapman, D.¹ Kaelbling, L. P.²

7
- 0000439891
- On the Convergence of Stochastic Iterative Dynamic Programming Algorithms
- Jaakkola, T. & Jordan, M. I. (1993), 'On the Convergence of Stochastic Iterative Dynamic Programming Algorithms', Submitted to Neural Computation.
- (1993) Submitted to Neural Computation
- Jaakkola, T.¹ Jordan, M. I.²

8
- 44049116478
- Forward Models: Supervised Learning with a Distal Teacher
- Jordan, M. I. & Rumelhart, D. E. (1992), 'Forward Models: Supervised Learning with a Distal Teacher', Cognitive Science 16, 307-354.
- (1992) Cognitive Science , vol.16 , pp. 307-354
- Jordan, M. I.¹ Rumelhart, D. E.²

9
- 0004280606
- PhD thesis, Stanford University
- Kaelbling, L. P. (1990), Learning in Embedded Systems, PhD thesis, Stanford University.
- (1990) Learning in Embedded Systems
- Kaelbling, L. P.¹

10
- 84976813028
- Learning to Coordinate Behaviors
- Boston, MA
- Maes, P. & Brooks, R. A. (1990), Learning to Coordinate Behaviors, in 'Proceedings, AAAI-90', Boston, MA, pp. 796-802.
- (1990) Proceedings, AAAI-90 , pp. 796-802
- Maes, P.¹ Brooks, R. A.²

11
- 0002386181
- Automatic Programming of Behavior-based Robots using Reinforcement Learning
- Pittsburgh, PA
- Mahadevan, S. & Connell, J. (1991), Automatic Programming of Behavior-based Robots using Reinforcement Learning, in 'Proceedings, AAAI-91', Pittsburgh, PA, pp. 8-14.
- (1991) Proceedings, AAAI-91 , pp. 8-14
- Mahadevan, S.¹ Connell, J.²

12
- 0001415895
- Designing Emergent Behaviors: From Local Interactions to Collective Intelligence
- Matarić, M. J. (1992), Designing Emergent Behaviors: From Local Interactions to Collective Intelligence, in 'From Animals to Animats: International Conference on Simulation of Adaptive Behavior'.
- (1992) From Animals to Animats: International Conference on Simulation of Adaptive Behavior
- Matarić, M. J.¹

13
- 0000824463
- Kin Recognition, Similarity, and Group Behavior
- Boulder, Colorado
- Matarić, M. J. (1993), Kin Recognition, Similarity, and Group Behavior, in 'Proceedings of the Fifteenth Annual Conference of the Cognitive Science Society', Boulder, Colorado, pp. 705-710.
- (1993) Proceedings of the Fifteenth Annual Conference of the Cognitive Science Society , pp. 705-710
- Matarić, M. J.¹

14
- 0003849946
- PhD thesis, MIT
- Matarić, M. J. (1994), Interaction and Intelligent Behavior, PhD thesis, MIT.
- (1994) Interaction and Intelligent Behavior
- Matarić, M. J.¹

15
- 0028374275
- Robot Juggling: An Implementation of Memory-Based Learning
- Schaal, S. & Atkeson, C. G. (1994), 'Robot Juggling: An Implementation of Memory-Based Learning', Control Systems Magazine.
- (1994) Control Systems Magazine
- Schaal, S.¹ Atkeson, C. G.²

16
- 24044497495
- Transfer of Learning Across Compositions of Sequential Tasks
- Morgan Kaufmann, Evanston, Illinois
- Singh, S. P. (1991), Transfer of Learning Across Compositions of Sequential Tasks, in 'Proceedings, Eighth International Conference on Machine Learning', Morgan Kaufmann, Evanston, Illinois, pp. 348-352.
- (1991) Proceedings, Eighth International Conference on Machine Learning , pp. 348-352
- Singh, S. P.¹

17
- 33847202724
- Learning to Predict by Method of Temporal Differences
- Sutton, R. (1988), 'Learning to Predict by Method of Temporal Differences', The Journal of Machine Learning 3(1), 9-44.
- (1988) The Journal of Machine Learning , vol.3 , Issue.1 , pp. 9-44
- Sutton, R.¹

18
- 85132026293
- Integrated Architectures for Learning, Planning and Reacting Based on Approximating Dynamic Programming
- Austin, Texas
- Sutton, R. S. (1990), Integrated Architectures for Learning, Planning and Reacting Based on Approximating Dynamic Programming, in 'Proceedings, Seventh International Conference on Machine Learning', Austin, Texas.
- (1990) Proceedings, Seventh International Conference on Machine Learning
- Sutton, R. S.¹

19
- 85152198941
- Multi-Agent Reinforcement Learning: Independent vs. Cooperative Agents
- Amherst, MA
- Tan, M. (1993), Multi-Agent Reinforcement Learning: Independent vs. Cooperative Agents, in 'Proceedings, Tenth International Conference on Machine Learning', Amherst, MA, pp. 330-337.
- (1993) Proceedings, Tenth International Conference on Machine Learning , pp. 330-337
- Tan, M.¹

20
- 2542485629
- Practical Issues in Temporal Difference Learning
- J. E. Moody, S. J. Hanson & R. P. Lippmann, eds, Morgan Kaufmann
- Tesauro, G. (1992), Practical Issues in Temporal Difference Learning, in J. E. Moody, S. J. Hanson & R. P. Lippmann, eds, 'Advances in Neural Information Processing Systems 4', Morgan Kaufmann, pp. 259-267.
- (1992) Advances in Neural Information Processing Systems 4 , pp. 259-267
- Tesauro, G.¹

21
- 0004049893
- PhD thesis, King's College, Cambridge
- Watkins, C. J. C. H. (1989), Learning from Delayed Rewards, PhD thesis, King's College, Cambridge.
- (1989) Learning from Delayed Rewards
- Watkins, C. J. C. H.¹

22
- 34249833101
- Q-Learning
- Watkins, C. J. C. H. & Dayan, P. (1992), 'Q-Learning', Machine Learning 8, 279-292.
- (1992) Machine Learning , vol.8 , pp. 279-292
- Watkins, C. J. C. H.¹ Dayan, P.²

23
- 0003619736
- PhD thesis, University of Rochester
- Whitehead, S. D. (1992), Reinforcement Learning for the Adaptive Control of Perception and Action, PhD thesis, University of Rochester.
- (1992) Reinforcement Learning for the Adaptive Control of Perception and Action
- Whitehead, S. D.¹

24
- 84898804288
- Active Perception and Reinforcement Learning
- Austin, Texas
- Whitehead, S. D. & Ballard, D. H. (1990), Active Perception and Reinforcement Learning, in 'Proceedings, Seventh International Conference on Machine Learning', Austin, Texas.
- (1990) Proceedings, Seventh International Conference on Machine Learning
- Whitehead, S. D.¹ Ballard, D. H.²

25
- 0003326518
- Learning Multiple Goal Behavior via Task Decomposition and Dynamic Policy Merging
- J. H. Connell & S. Mahadevan, eds, Kluwer Academic Publishers
- Whitehead, S. D., Karlsson, J. & Tenenberg, J. (1993), Learning Multiple Goal Behavior via Task Decomposition and Dynamic Policy Merging, in J. H. Connell & S. Mahadevan, eds, 'Robot Learning', Kluwer Academic Publishers, pp. 45-78.
- (1993) Robot Learning , pp. 45-78
- Whitehead, S. D.¹ Karlsson, J.² Tenenberg, J.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.