



Volume 5163 LNCS, Issue PART 1, 2008, Pages 407-416

Episodic reinforcement learning by logistic reward-weighted regression

Author keywords

[No Author keywords available]

Indexed keywords

ADAPTIVE CONTROL; EXPECTATION-MAXIMIZATION ALGORITHMS; LOGISTIC REGRESSIONS; NONSTATIONARY; PARTIALLY OBSERVABLE MARKOV DECISION PROBLEMS; POLICY SEARCH; TRAINING ALGORITHMS; VALUE FUNCTIONS; WEIGHTED REGRESSION;

EID: 58849088597     PISSN: 03029743     EISSN: 16113349     Source Type: Book Series    
DOI: 10.1007/978-3-540-87536-9_42     Document Type: Conference Paper
Times cited: 10

References (15)
  • 4
    • 38149018611
    • Solving deep memory POMDPs with recurrent policy gradients
    • de Sá, J.M., Alexandre, L.A., Duch, W., Mandic, D.P. (eds.) ICANN 2007. Springer, Heidelberg
    • Wierstra, D., Foerster, A., Peters, J., Schmidhuber, J.: Solving deep memory POMDPs with recurrent policy gradients. In: de Sá, J.M., Alexandre, L.A., Duch, W., Mandic, D.P. (eds.) ICANN 2007. LNCS, vol. 4668, pp. 697-706. Springer, Heidelberg (2007)
    • (2007) LNCS , vol.4668 , pp. 697-706
    • Wierstra, D.; Foerster, A.; Peters, J.; Schmidhuber, J.
  • 6
    • 0346982426
    • Using expectation-maximization for reinforcement learning
    • Dayan, P., Hinton, G.E.: Using expectation-maximization for reinforcement learning. Neural Computation 9(2), 271-278 (1997)
    • (1997) Neural Computation , vol.9 , Issue.2 , pp. 271-278
    • Dayan, P.; Hinton, G.E.
  • 8
    • 0041914606
    • Gradient flow in recurrent nets: The difficulty of learning long-term dependencies
    • Kremer, S.C., Kolen, J.F. (eds.) IEEE Press, Los Alamitos
    • Hochreiter, S., Bengio, Y., Frasconi, P., Schmidhuber, J.: Gradient flow in recurrent nets: the difficulty of learning long-term dependencies. In: Kremer, S.C., Kolen, J.F. (eds.) A Field Guide to Dynamical Recurrent Neural Networks. IEEE Press, Los Alamitos (2001)
    • (2001) A Field Guide to Dynamical Recurrent Neural Networks
    • Hochreiter, S.; Bengio, Y.; Frasconi, P.; Schmidhuber, J.
  • 10
    • 0025503558
    • Back propagation through time: What it does and how to do it
    • Werbos, P.: Back propagation through time: What it does and how to do it. Proceedings of the IEEE 78, 1550-1560 (1990)
    • (1990) Proceedings of the IEEE , vol.78 , pp. 1550-1560
    • Werbos, P.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.