Volume 21, Issue 1, 1998, Pages 51-71

Embedding a Priori Knowledge in Reinforcement Learning

Author keywords

Experience generalisation; Q learning algorithm; Reinforcement learning

Indexed keywords

KNOWLEDGE BASED SYSTEMS; LEARNING ALGORITHMS; STATE SPACE METHODS;

EID: 0031607078     PISSN: 09210296     EISSN: None     Source Type: Journal    
DOI: 10.1023/A:1007968115863     Document Type: Article
Times cited : (20)

References (21)
  • 3
    • 0001133021
    • Generalization in reinforcement learning: Safely approximating the value function
    • G. Tesauro, D. S. Touretzky, and T. K. Leen (eds), MIT Press
    • Boyan J. A. and Moore A. W.: 1995, Generalization in reinforcement learning: Safely approximating the value function, in: G. Tesauro, D. S. Touretzky, and T. K. Leen (eds), Advances in Neural Information Processing Systems Vol. 7, MIT Press.
    • (1995) Advances in Neural Information Processing Systems , vol.7
    • Boyan, J.A.1    Moore, A.W.2
  • 5
    • 84968515237
    • Scattered data interpolation: Tests of some methods
    • Franke R.: 1982, Scattered data interpolation: Tests of some methods. Mathematics of Computation 38(157), 181-200.
    • (1982) Mathematics of Computation , vol.38 , Issue.157 , pp. 181-200
    • Franke, R.1
  • 6
    • 0000439891
    • On the convergence of stochastic iterative dynamic programming algorithms
    • Jaakkola T., Jordan M. I., and Singh S. P.: 1994, On the convergence of stochastic iterative dynamic programming algorithms, Neural Computation 6(6), 1185-1201.
    • (1994) Neural Computation , vol.6 , Issue.6 , pp. 1185-1201
    • Jaakkola, T.1    Jordan, M.I.2    Singh, S.P.3
  • 8
    • 0000123778
    • Self-improving reactive agents based on reinforcement learning, planning and teaching
    • Lin L.-J.: 1992, Self-improving reactive agents based on reinforcement learning, planning and teaching, Machine Learning 8, 293-321.
    • (1992) Machine Learning , vol.8 , pp. 293-321
    • Lin, L.-J.1
  • 10
    • 0026880130
    • Automatic programming of behavior-based robots using reinforcement learning
    • Mahadevan S. and Connell J.: 1992, Automatic programming of behavior-based robots using reinforcement learning, Artificial Intelligence 55, 311-365.
    • (1992) Artificial Intelligence , vol.55 , pp. 311-365
    • Mahadevan, S.1    Connell, J.2
  • 11
    • 17144419347
    • The NSF workshop on reinforcement learning: Summary and observations
    • in press
    • Mahadevan S. and Kaelbling L. P.: 1996, The NSF workshop on reinforcement learning: Summary and observations, AI Magazine, in press.
    • (1996) AI Magazine
    • Mahadevan, S.1    Kaelbling, L.P.2
  • 13
    • 0039753967
    • Attentional mechanisms as a strategy for generalisation in the Q-learning algorithm
    • F. Fogelman-Soulié and P. Gallinari (eds), EC2 et Cie
    • Ribeiro C. H. C.: 1995, Attentional mechanisms as a strategy for generalisation in the Q-learning algorithm, in: F. Fogelman-Soulié and P. Gallinari (eds), Procs. of the International Conf. on Artificial Neural Networks (ICANN'95), Vol. 1, EC2 et Cie, pp. 455-460.
    • (1995) Procs. of the International Conf. on Artificial Neural Networks (ICANN'95) , vol.1 , pp. 455-460
    • Ribeiro, C.H.C.1
  • 14
    • 0014432211
    • A two-dimensional interpolation function for irregularly spaced data
    • Shepard D.: 1968, A two-dimensional interpolation function for irregularly spaced data, in: Procs. of the 23rd National Conf. ACM, pp. 517-523.
    • (1968) Procs. of the 23rd National Conf. ACM , pp. 517-523
    • Shepard, D.1
  • 15
    • 85156221438
    • Generalization in reinforcement learning: Successful examples using sparse coarse coding
    • D. S. Touretzky, M. C. Mozer, and M. E. Hasselmo (eds), MIT Press
    • Sutton R. S.: 1996, Generalization in reinforcement learning: Successful examples using sparse coarse coding, in: D. S. Touretzky, M. C. Mozer, and M. E. Hasselmo (eds), Advances in Neural Information Processing Systems Vol. 8, MIT Press, pp. 1038-1044.
    • (1996) Advances in Neural Information Processing Systems , vol.8 , pp. 1038-1044
    • Sutton, R.S.1
  • 17
    • 0001046225
    • Practical issues in temporal difference learning
    • Tesauro G.: 1992, Practical issues in temporal difference learning, Machine Learning 8, 257-277.
    • (1992) Machine Learning , vol.8 , pp. 257-277
    • Tesauro, G.1
  • 19
    • 0029752470
    • Feature-based methods for large scale dynamic programming
    • Tsitsiklis J. N. and Van Roy B.: 1996, Feature-based methods for large scale dynamic programming, Machine Learning 22, 59-94.
    • (1996) Machine Learning , vol.22 , pp. 59-94
    • Tsitsiklis, J.N.1    Van Roy, B.2
  • 20


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.