



Volume 2006, 2006, Pages 1625-1630

Importance sampling actor-critic algorithms

Author keywords

[No Author keywords available]

Indexed keywords

GRADIENT SEARCH; SYSTEM INTERACTIONS; TEMPORAL DIFFERENCE METHODS

EID: 34047226109     PISSN: 07431619     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 7

References (7)
  • 2
    • 0031143730
    • J. N. Tsitsiklis and B. Van Roy, "An analysis of temporal-difference learning with function approximation," IEEE Transactions on Automatic Control, vol. 42, no. 5, pp. 674-690, May 1997.
  • 3
    • 0009011171
    • P. Marbach, "Simulation-based methods for Markov decision processes," PhD Thesis, Massachusetts Institute of Technology, Cambridge, MA, 1998.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.