



Volume , Issue , 2018, Pages

Action-dependent control variates for policy optimization via Stein’s identity

Author keywords

[No Author keywords available]

Indexed keywords

EFFICIENCY; GRADIENT METHODS; MONTE CARLO METHODS;

EID: 85083952784     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited : (80)

References (41)
  • 8
    • 84897694817
    • Variance reduction techniques for gradient estimates in reinforcement learning
    • Evan Greensmith, Peter L Bartlett, and Jonathan Baxter. Variance reduction techniques for gradient estimates in reinforcement learning. Journal of Machine Learning Research, 5(Nov):1471–1530, 2004.
    • (2004) Journal of Machine Learning Research , vol.5 , pp. 1471-1530
    • Greensmith, E.1    Bartlett, P.L.2    Baxter, J.3
  • 18
    • 85018878907
    • Stein variational gradient descent: A general purpose Bayesian inference algorithm
    • Qiang Liu and Dilin Wang. Stein variational gradient descent: A general purpose Bayesian inference algorithm. In Advances in Neural Information Processing Systems, 2016.
    • (2016) Advances in Neural Information Processing Systems
    • Liu, Q.1    Wang, D.2
  • 26
    • 85046997203
    • Sticking the landing: An asymptotically zero-variance gradient estimator for variational inference
    • Geoffrey Roeder, Yuhuai Wu, and David K. Duvenaud. Sticking the landing: An asymptotically zero-variance gradient estimator for variational inference. Advances in Neural Information Processing Systems, 2017.
    • (2017) Advances in Neural Information Processing Systems
    • Roeder, G.1    Wu, Y.2    Duvenaud, D.K.3
  • 34
    • 0003722779
    • Approximate computation of expectations
    • Charles Stein. Approximate computation of expectations. Lecture Notes-Monograph Series, 7: i–164, 1986.
    • (1986) Lecture Notes-Monograph Series , vol.7 , pp. 164
    • Stein, C.1
  • 40
    • 0000337576
    • Simple statistical gradient-following algorithms for connectionist reinforcement learning
    • Ronald J Williams. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine learning, 8(3-4):229–256, 1992.
    • (1992) Machine Learning , vol.8 , Issue.3-4 , pp. 229-256
    • Williams, R.J.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.