[5] P. W. Glynn. Likelihood ratio gradient estimation for stochastic systems. Communications of the ACM, 33(10):75-84, 1990.
[7] K. Gregor, I. Danihelka, A. Mnih, C. Blundell, and D. Wierstra. Deep autoregressive networks. arXiv preprint arXiv:1310.8499, 2013.
[12] Y. Le Cun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278-2324, 1998.
[15] V. Mnih, N. Heess, A. Graves, and K. Kavukcuoglu. Recurrent models of visual attention. In Advances in Neural Information Processing Systems, pages 2204-2212, 2014.
[18] R. M. Neal and G. E. Hinton. A view of the EM algorithm that justifies incremental, sparse, and other variants. In Learning in Graphical Models, pages 355-368. Springer, 1998.
[22] D. Silver, G. Lever, N. Heess, T. Degris, D. Wierstra, and M. Riedmiller. Deterministic policy gradient algorithms. In ICML, 2014.
[23] R. S. Sutton, D. A. McAllester, S. P. Singh, Y. Mansour, et al. Policy gradient methods for reinforcement learning with function approximation. In NIPS, volume 99, pages 1057-1063. Citeseer, 1999.
[24] N. Vlassis, M. Toussaint, G. Kontes, and S. Piperidis. Learning model-free robot control by a Monte Carlo EM algorithm. Autonomous Robots, 27(2):123-130, 2009.
[25] D. Wierstra, A. Förster, J. Peters, and J. Schmidhuber. Recurrent policy gradients. Logic Journal of IGPL, 18(5):620-634, 2010.
[26] R. J. Williams. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning, 8(3-4):229-256, 1992.