



Volume 7207 LNAI, 2012, Pages 2-6

Beyond reward: The problem of knowledge and data

Author keywords

[No Author keywords available]


EID: 84864841464     PISSN: 03029743     EISSN: 16113349     Source Type: Book Series    
DOI: 10.1007/978-3-642-31951-8_2     Document Type: Conference Paper
Times cited: 5

References (24)
  • 16
    • 84912073624
    • Learning options in reinforcement learning
    • In: Koenig, S., Holte, R.C. (eds.) Springer, Heidelberg
    • Stolle, M., Precup, D.: Learning Options in Reinforcement Learning. In: Koenig, S., Holte, R.C. (eds.) SARA 2002. LNCS (LNAI), vol. 2371, pp. 212-223. Springer, Heidelberg (2002)
    • (2002) SARA 2002. LNCS (LNAI), vol. 2371, pp. 212-223
    • Stolle, M.; Precup, D.
  • 17
    • 84864837762
    • http://richsutton.com/IncIdeas/KeytoAI.html
    • Sutton, R.S.: "Verification" and "Verification, the key to AI" (2001), http://richsutton.com/IncIdeas/Verification.html, http://richsutton.com/IncIdeas/KeytoAI.html
    • (2001) Verification and Verification, the key to AI
    • Sutton, R.S.
  • 22
    • 0033170372
    • Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning
    • Sutton, R.S., Precup, D., Singh, S.: Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence 112, 181-211 (1999)
    • (1999) Artificial Intelligence, vol. 112, pp. 181-211
    • Sutton, R.S.; Precup, D.; Singh, S.
  • 23
    • 77956513316
    • A convergent O(n) algorithm for off-policy temporal-difference learning with linear function approximation
    • MIT Press
    • Sutton, R.S., Szepesvári, C., Maei, H.R.: A convergent O(n) algorithm for off-policy temporal-difference learning with linear function approximation. In: Advances in Neural Information Processing Systems, vol. 21. MIT Press (2009)
    • (2009) Advances in Neural Information Processing Systems, vol. 21
    • Sutton, R.S.; Szepesvári, C.; Maei, H.R.
  • 24
    • 0031143730
    • An analysis of temporal-difference learning with function approximation
    • Tsitsiklis, J.N., Van Roy, B.: An analysis of temporal-difference learning with function approximation. IEEE Transactions on Automatic Control 42, 674-690 (1997)
    • (1997) IEEE Transactions on Automatic Control, vol. 42, pp. 674-690
    • Tsitsiklis, J.N.; Van Roy, B.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.