Volume 5212 LNAI, Issue PART 2, 2008, Pages 234-249

State-dependent exploration for policy gradient methods

Author keywords

[No Author keywords available]

Indexed keywords

DATABASE SYSTEMS; GRADIENT METHODS; LEARNING SYSTEMS; MAXIMUM LIKELIHOOD ESTIMATION; PROBLEM SOLVING; REINFORCEMENT; REINFORCEMENT LEARNING; ROBOT LEARNING; SOLUTIONS;

EID: 56049089041     PISSN: 0302-9743     EISSN: 1611-3349     Source Type: Book Series
DOI: 10.1007/978-3-540-87481-2_16     Document Type: Conference Paper
Times cited: 61

References (15)
  • 5. Williams, R.J.: Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning 8, 229-256 (1992)
  • 12. Spall, J.C.: Implementation of the simultaneous perturbation algorithm for stochastic optimization. IEEE Transactions on Aerospace and Electronic Systems 34(3), 817-823 (1998)
  • 13. Wierstra, D., Foerster, A., Peters, J., Schmidhuber, J.: Solving deep memory POMDPs with recurrent policy gradients. In: de Sá, J.M., Alexandre, L.A., Duch, W., Mandic, D.P. (eds.) ICANN 2007. LNCS, vol. 4668, pp. 697-706. Springer, Heidelberg (2007)


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.