Volume 49, Issue 2-3, 2002, Pages 325-346

Structure in the space of value functions

Author keywords

Density estimation; Dynamic programming; Mixture models; Reinforcement learning; Unsupervised learning; Value functions

Indexed keywords

DENSITY ESTIMATION; MARKOV DECISION PROBLEMS; MIXTURE MODELS; REINFORCEMENT LEARNING; UNSUPERVISED LEARNING; VALUE FUNCTIONS

EID: 0036832959     PISSN: 08856125     EISSN: None     Source Type: Journal    
DOI: 10.1023/A:1017944732463     Document Type: Article
Times cited: 65

References (41)
  • 19
    • 24844453140
    • Planning with temporally abstract actions
    • Report CS-98-01, Department of Computer Science, Brown University, Providence, RI
    • (1998)
    • Hauskrecht, M.
  • 28
    • 0003989214
    • Hierarchical control and learning for Markov decision processes
    • Ph.D. Thesis, Computer Science Division, UC Berkeley
    • (1998)
    • Parr, R.
  • 33
    • 0003636089
    • On-line Q-learning using connectionist systems
    • Technical Report CUED/F-INFENG/TR 166, Engineering Department, Cambridge University
    • (1994)
    • Rummery, G.A.; Niranjan, M.
  • 38
    • 0003899594
    • Between MDPs and semi-MDPs: Learning, planning and representing knowledge at multiple temporal scales
    • Report 98-74, Department of Computer Science, University of Massachusetts, Amherst, MA
    • (1998)
    • Sutton, R.S.; Precup, D.; Singh, S.P.
  • 41
    • 0004049893
    • Learning from delayed rewards
    • Ph.D. thesis, University of Cambridge, Cambridge, UK
    • (1989)
    • Watkins, C.J.C.H.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.