SCOPUS Information Search Platform - View Paper

Machine Learning

Volume 8, Issue 3, 1992, Pages 293-321

Self-Improving Reactive Agents Based on Reinforcement Learning, Planning and Teaching

(1) Lin, Long Ji a

a CARNEGIE MELLON UNIVERSITY (United States)

Author keywords

connectionist networks; planning; Reinforcement learning; teaching

EID: 0000123778 PISSN: 08856125 EISSN: 15730565 Source Type: Journal
DOI: 10.1023/A:1022628806385 Document Type: Article

Times cited: 1533

References (30)

1
- 85025865990
- Anderson, C.W. (1987). Strategy learning with multilayer connectionist representations. Proceedings of the Fourth International Workshop on Machine Learning (pp. 103–114).

2
- 84951530330
- Barto, A.G., Sutton, R.S., & Watkins, C.J.C.H. (1990). Learning and sequential decision making. In: M. Gabriel & J.W. Moore (Eds.), Learning and computational neuroscience. MIT Press.

3
- 84951530331
- Barto, A.G., Bradtke, S.J., & Singh, S.P. (1991). Real-time learning and control using asynchronous dynamic programming. (Technical Report 91-57). University of Massachusetts, Computer Science Department.

4
- 84951530332
- Chapman, D. & Kaelbling, L.P. (1991). Input generalization in delayed reinforcement learning: An algorithm and performance comparisons. Proceedings of IJCAI-91.

5
- 0000430514
- Dayan, P. (1992). The convergence of TD(λ) for general λ. Machine Learning, 8, 341–362.

6
- 0000488536
- Grefenstette, J.J., Ramsey, C.L., & Schultz, A.C. (1990). Learning sequential decision rules using simulation models and competition. Machine Learning, 5, 355–382.

7
- 84951530333
- Hinton, G.E., McClelland, J.L., & Rumelhart, D.E. (1986). Distributed representations. Parallel distributed processing: Explorations in the microstructure of cognition, Vol. 1, Bradford Books/MIT Press.

8
- 0003644124
- Howard, R.A. (1960). Dynamic programming and Markov processes. Wiley, New York.

9
- 84951530334
- Kaelbling, L.P. (1990). Learning in embedded systems. Ph.D. Thesis, Department of Computer Science, Stanford University.

10
- 84951530335
- Lang, K.J. (1989). A time-delay neural network architecture for speech recognition. Ph.D. Thesis, School of Computer Science, Carnegie Mellon University.

11
- 84951530336
- Lin, Long-Ji. (1991a). Self-improving reactive agents: Case studies of reinforcement learning frameworks. Proceedings of the First International Conference on Simulation of Adaptive Behavior: From Animals to Animats (pp. 297–305). Also Technical Report CMU-CS-90-109, Carnegie Mellon University.

12
- 85025855831
- Lin, Long-Ji. (1991b). Self-improvement based on reinforcement learning, planning and teaching. Proceedings of the Eighth International Workshop on Machine Learning (pp. 323–327).

13
- 84951530338
- Lin, Long-Ji. (1991c). Programming robots using reinforcement learning and teaching. Proceedings of AAAI-91 (pp. 781–786).

14
- 84951530339
- Mahadevan, S. & Connell, J. (1991). Scaling reinforcement learning to robotics by exploiting the subsumption architecture. Proceedings of the Eighth International Workshop on Machine Learning (pp. 328–332).

15
- 0000531852
- Mitchell, T.M. (1982). Generalization as search. Artificial Intelligence, 18, 203–226.

16
- 84951530340
- Moore, A.W. (1991). Variable resolution dynamic programming: Efficiently learning action maps in multivariate real-valued state-spaces. Proceedings of the Eighth International Workshop on Machine Learning (pp. 333–337).

17
- 84948285614
- Mozer, M.C. (1986). RAMBOT: A connectionist expert system that learns by example (Institute for Cognitive Science Report 8610). University of California, San Diego.

18
- 84951530341
- Pomerleau, D.A. (1989). ALVINN: An autonomous land vehicle in a neural network (Technical Report CMU-CS-89-107). Carnegie Mellon University.

19
- 0000646059
- Rumelhart, D.E., Hinton, G.E., & Williams, R.J. (1986). Learning internal representations by error propagation. Parallel distributed processing: Explorations in the microstructure of cognition, Bradford Books/MIT Press.

20
- 84951530342
- Sutton, R.S. (1984). Temporal credit assignment in reinforcement learning. Ph.D. Thesis, Dept. of Computer and Information Science, University of Massachusetts.

21
- 33847202724
- Sutton, R.S. (1988). Learning to predict by the methods of temporal differences. Machine Learning, 3, 9–44.

22
- 85025864537
- Sutton, R.S. (1990). Integrated architectures for learning, planning, and reacting based on approximating dynamic programming. Proceedings of the Seventh International Workshop on Machine Learning (pp. 216–224).

23
- 84951530344
- Tan, Ming. (1991). Learning a cost-sensitive internal representation for reinforcement learning. Proceedings of the Eighth International Workshop on Machine Learning (pp. 358–362).

24
- 85025874247
- Thrun, S.B., Möller, K., & Linden, A. (1991). Planning with an adaptive world model. In D.S. Touretzky (Ed.), Advances in neural information processing systems 3, Morgan Kaufmann.

25
- 84951530346
- Thrun, S.B. & Möller, K. (1992). Active exploration in dynamic environments. To appear in D.S. Touretzky (Ed.), Advances in neural information processing systems 4, Morgan Kaufmann.

26
- 84951530347
- Watkins, C.J.C.H. (1989). Learning from delayed rewards. Ph.D. Thesis, King's College, Cambridge.

27
- 84934186091
- Williams, R.J. & Zipser, D. (1988). A learning algorithm for continually running fully recurrent neural networks (Institute for Cognitive Science Report 8805). University of California, San Diego.

28
- 85025876171
- Whitehead, S.D. & Ballard, D.H. (1989). A role for anticipation in reactive systems that learn. Proceedings of the Sixth International Workshop on Machine Learning (pp. 354–357).

29
- 0002557085
- Whitehead, S.D. & Ballard, D.H. (1991). Learning to perceive and act by trial and error. Machine Learning, 7, 45–83.

30
- 85025856603
- Whitehead, S.D. (1991b). Complexity and cooperation in Q-learning. Proceedings of the Eighth International Workshop on Machine Learning (pp. 363–367).

* This information was extracted and analyzed by KISTI from Elsevier's SCOPUS DB.