



Volume 2017-December, 2017, Pages 5280-5289

Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation

Author keywords

[No Author keywords available]

Indexed keywords

CURVE FITTING; GRADIENT METHODS; REINFORCEMENT LEARNING;

EID: 85046992971     PISSN: 10495258     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 501

References (28)
  • 1
    • 0000396062
    • Natural gradient works efficiently in learning
    • S. I. Amari. Natural gradient works efficiently in learning. Neural Computation, 10(2): 251-276, 1998.
    • (1998) Neural Computation , vol.10 , Issue.2 , pp. 251-276
    • Amari, S.I.1
  • 2
    • 85057316160
    • Distributed second-order optimization using Kronecker-factored approximations
    • J. Ba, R. Grosse, and J. Martens. Distributed second-order optimization using Kronecker-factored approximations. In ICLR, 2017.
    • (2017) ICLR
    • Ba, J.1    Grosse, R.2    Martens, J.3
  • 6
    • 84998893215
    • A Kronecker-factored approximate Fisher matrix for convolutional layers
    • R. Grosse and J. Martens. A Kronecker-factored approximate Fisher matrix for convolutional layers. In ICML, 2016.
    • (2016) ICML
    • Grosse, R.1    Martens, J.2
  • 7
    • 85041942380
    • Q-prop: Sample-efficient policy gradient with an off-policy critic
    • S. Gu, T. Lillicrap, Z. Ghahramani, R. E. Turner, and S. Levine. Q-prop: Sample-efficient policy gradient with an off-policy critic. In ICLR, 2017.
    • (2017) ICLR
    • Gu, S.1    Lillicrap, T.2    Ghahramani, Z.3    Turner, R.E.4    Levine, S.5
  • 11
    • 85083951076
    • Adam: A method for stochastic optimization
    • D. Kingma and J. Ba. Adam: A method for stochastic optimization. ICLR, 2015.
    • (2015) ICLR
    • Kingma, D.1    Ba, J.2
  • 13
    • 77956541496
    • Deep learning via Hessian-free optimization
    • J. Martens. Deep learning via Hessian-free optimization. In ICML-10, 2010.
    • (2010) ICML-10
    • Martens, J.1
  • 15
    • 84969988426
    • Optimizing neural networks with Kronecker-factored approximate curvature
    • J. Martens and R. Grosse. Optimizing neural networks with Kronecker-factored approximate curvature. In ICML, 2015.
    • (2015) ICML
    • Martens, J.1    Grosse, R.2
  • 19
    • 40649106649
    • Natural actor-critic
    • J. Peters and S. Schaal. Natural actor-critic. Neurocomputing, 71(7-9): 1180-1190, 2008.
    • (2008) Neurocomputing , vol.71 , Issue.7-9 , pp. 1180-1190
    • Peters, J.1    Schaal, S.2
  • 20
    • 0036631778
    • Fast curvature matrix-vector products for second-order gradient descent
    • N. N. Schraudolph. Fast curvature matrix-vector products for second-order gradient descent. Neural Computation, 2002.
    • (2002) Neural Computation
    • Schraudolph, N.N.1
  • 28
    • 0000337576
    • Simple statistical gradient-following algorithms for connectionist reinforcement learning
    • R. J. Williams. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning, 8(3): 229-256, 1992.
    • (1992) Machine Learning , vol.8 , Issue.3 , pp. 229-256
    • Williams, R.J.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.