SCOPUS Information Retrieval Platform

Volume 22, Issue 1-3, 1996, Pages 227-250

The effect of representation and knowledge on goal-directed exploration with reinforcement-learning algorithms

a Carnegie Mellon University (United States)

Author keywords

Action models; Action penalty representation; Admissible and consistent heuristics; Complexity; Goal reward representation; Goal directed exploration; On line reinforcement learning; Prior knowledge; Q hat learning; Q learning; Reward structure

Indexed keywords

COMPUTATIONAL COMPLEXITY; HEURISTIC METHODS; KNOWLEDGE REPRESENTATION; LEARNING ALGORITHMS; MATHEMATICAL MODELS; ONLINE SYSTEMS; POLYNOMIALS; STATE SPACE METHODS; TOPOLOGY;

ACTION MODELS; GOAL DIRECTED EXPLORATION; GOAL REWARD REPRESENTATION; REINFORCEMENT LEARNING;

LEARNING SYSTEMS;

EID: 0029751419 PISSN: 08856125 EISSN: None Source Type: Journal
DOI: 10.1007/BF00114729 Document Type: Article

Times cited : (76)

References (22)

1
- 0029210635
- Learning to act using real-time dynamic programming
- Barto, A. G., S.J. Bradtke, and S. P. Singh (1995) Learning to act using real-time dynamic programming. Artificial Intelligence, 73(1):81-138.
- (1995) Artificial Intelligence , vol.73 , Issue.1 , pp. 81-138
- Barto, A.G.¹ Bradtke, S.J.² Singh, S.P.³

2
- 0003602259
- Learning and sequential decision making
- Department of Computer Science, University of Massachusetts at Amherst
- Barto, A. G., R.S. Sutton, and C.J. Watkins. (1989) Learning and sequential decision making. Technical Report 89-95. Department of Computer Science, University of Massachusetts at Amherst.
- (1989) Technical Report , vol.89-95
- Barto, A.G.¹ Sutton, R.S.² Watkins, C.J.³

3
- 0003787146
- Princeton University Press, Princeton (New Jersey)
- Bellman, R. (1957) Dynamic Programming. Princeton University Press, Princeton (New Jersey)
- (1957) Dynamic Programming
- Bellman, R.¹

4
- 0001854509
- Solving time-dependent planning problems
- Boddy, M. and T. Dean. (1989). Solving time-dependent planning problems. In Proceedings of the IJCAI, pages 970-984.
- (1989) Proceedings of the IJCAI , pp. 970-984
- Boddy, M.¹ Dean, T.²

5
- 2342445305
- Reasoning about when to start acting
- Goodwin, R. (1994) Reasoning about when to start acting. In Proceedings of the International Conference on Artificial Intelligence Planning Systems, pages 86-91.
- (1994) Proceedings of the International Conference on Artificial Intelligence Planning Systems , pp. 86-91
- Goodwin, R.¹

6
- 0029751418
- The loss from imperfect value functions in expectation-based and minimax-based tasks
- Heger, M. (1996) The loss from imperfect value functions in expectation-based and minimax-based tasks. Machine Learning, pages 197-225.
- (1996) Machine Learning , pp. 197-225
- Heger, M.¹

7
- 85120861483
- Consideration of risk in reinforcement learning
- Heger, M. (1994) Consideration of risk in reinforcement learning. In Proceedings of the International Conference on Machine Learning, pages 105-111.
- (1994) Proceedings of the International Conference on Machine Learning , pp. 105-111
- Heger, M.¹

8
- 0026989688
- Moving target search with intelligence
- Ishida, T. (1992). Moving target search with intelligence. In Proceedings of the AAAI, pages 525-532.
- (1992) Proceedings of the AAAI , pp. 525-532
- Ishida, T.¹

9
- 0002374379
- Moving target search
- Ishida, T. and R.E. Korf. (1991). Moving target search. In Proceedings of the IJCAI, pages 204-210.
- (1991) Proceedings of the IJCAI , pp. 204-210
- Ishida, T.¹ Korf, R.E.²

10
- 27144512733
- MIT Press, Cambridge (Massachusetts)
- Kaelbling, L. P. (1990). Learning in Embedded Systems. MIT Press, Cambridge (Massachusetts).
- (1990) Learning in Embedded Systems
- Kaelbling, L.P.¹

11
- 0347369286
- Master's thesis, Computer Science Department, University of California at Berkeley
- Koenig, S. (1991) Optimal probabilistic and decision-theoretic planning using Markovian decision theory. Master's thesis, Computer Science Department, University of California at Berkeley
- (1991) Optimal Probabilistic and Decision-theoretic Planning Using Markovian Decision Theory
- Koenig, S.¹

12
- 27144521638
- (Available as Technical Report UCB/CSD 92/685).
- Technical Report UCB/CSD 92/685

13
- 0008806879
- Complexity analysis of real-time reinforcement learning applied to finding shortest paths in deterministic domains
- School of Computer Science, Carnegie Mellon University
- Koenig, S. and R.G. Simmons (1992) Complexity analysis of real-time reinforcement learning applied to finding shortest paths in deterministic domains. Technical Report CMU-CS-93-106, School of Computer Science, Carnegie Mellon University.
- (1992) Technical Report CMU-CS-93-106
- Koenig, S.¹ Simmons, R.G.²

14
- 0027704767
- Complexity analysis of real-time reinforcement learning
- Koenig, S. and R.G. Simmons. (1993). Complexity analysis of real-time reinforcement learning. In Proceedings of the AAAI, pages 99-105.
- (1993) Proceedings of the AAAI , pp. 99-105
- Koenig, S.¹ Simmons, R.G.²

15
- 0012153916
- How to make reactive planners risk-sensitive
- Koenig, S. and R.G. Simmons. (1994). How to make reactive planners risk-sensitive. In Proceedings of the International Conference on Artificial Intelligence Planning Systems, pages 293-298.
- (1994) Proceedings of the International Conference on Artificial Intelligence Planning Systems , pp. 293-298
- Koenig, S.¹ Simmons, R.G.²

16
- 27144460841
- The effect of representation and knowledge on goal-directed exploration with reinforcement learning algorithms: The proofs
- School of Computer Science, Carnegie Mellon University
- Koenig, S. and R.G. Simmons. (1995a) The effect of representation and knowledge on goal-directed exploration with reinforcement learning algorithms: The proofs. Technical Report CMU-CS-95-177, School of Computer Science, Carnegie Mellon University.
- (1995) Technical Report CMU-CS-95-177
- Koenig, S.¹ Simmons, R.G.²

17
- 0002038863
- Real-time search in non-deterministic domains
- Koenig, S. and R. G. Simmons (1995b). Real-time search in non-deterministic domains. In Proceedings of the IJCAI, pages 1660-1667.
- (1995) Proceedings of the IJCAI , pp. 1660-1667
- Koenig, S.¹ Simmons, R.G.²

18
- 0025400088
- Real-time heuristic search
- Korf, R. E. (1990). Real-time heuristic search. Artificial Intelligence, 42(2-3):189-211.
- (1990) Artificial Intelligence , vol.42 , Issue.2-3 , pp. 189-211
- Korf, R.E.¹

19
- 0000123778
- Self-improving reactive agents based on reinforcement learning, planning and teaching
- Lin, L. J. (1992) Self-improving reactive agents based on reinforcement learning, planning and teaching. Machine Learning, 8:293-321.
- (1992) Machine Learning , vol.8 , pp. 293-321
- Lin, L.J.¹

20
- 0003849946
- PhD thesis, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology
- Matarić, M. (1994). Interaction and Intelligent Behavior. PhD thesis, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology
- (1994) Interaction and Intelligent Behavior
- Matarić, M.¹

21
- 0006488247
- The parti-game algorithm for variable resolution reinforcement learning in multidimensional state-spaces
- Moore, A.W. and C.G. Atkeson (1993a). The parti-game algorithm for variable resolution reinforcement learning in multidimensional state-spaces. In Proceedings of the NIPS.
- (1993) Proceedings of the NIPS
- Moore, A.W.¹ Atkeson, C.G.²

22
- 0027684215
- Prioritized sweeping: Reinforcement learning with less data and less time
- Moore, A.W. and C.G. Atkeson. (1993b). Prioritized sweeping: Reinforcement learning with less data and less time. Machine Learning, 13:103-130.
- (1993) Machine Learning , vol.13 , pp. 103-130
- Moore, A.W.¹ Atkeson, C.G.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.