[3] P. W. Glynn. Likelihood ratio gradient estimation: an overview. In A. Thesen, H. Grant, and W. D. Kelton, editors, Proceedings of the 1987 Winter Simulation Conference, pages 366-375, 1987.
[4] E. Gobet and S. Maire. Sequential control variates for functionals of Markov processes. SIAM Journal on Numerical Analysis, 43(3):1256-1275, 2005.
[5] E. Greensmith, P. L. Bartlett, and J. Baxter. Variance reduction techniques for gradient estimates in reinforcement learning. Journal of Machine Learning Research, 5:1471-1530, 2005.
[6] J. H. Halton. A retrospective and prospective survey of the Monte-Carlo method. SIAM Review, 12(1):1-63, 1970.
[7] J. H. Halton. Sequential Monte-Carlo techniques for the solution of linear systems. Journal of Scientific Computing, 9:213-257, 1994.
[10] C. Kollman, K. Baggerly, D. Cox, and R. Picard. Adaptive importance sampling on discrete Markov chains. The Annals of Applied Probability, 9(2):391-412, 1999.
[11] V. R. Konda and V. S. Borkar. Actor-critic-type learning algorithms for Markov decision processes. SIAM Journal on Control and Optimization, 38(1):94-123, 1999.
[12] S. Maire. An iterative computation of approximations on Korobov-like spaces. J. Comput. Appl. Math., 54(6):261-281, 2003.
[13] P. Marbach and J. N. Tsitsiklis. Approximate gradient methods in policy-space optimization of Markov reward processes. Journal of Discrete Event Dynamical Systems, 13:111-148, 2003.
[14] M. I. Reiman and A. Weiss. Sensitivity analysis via likelihood ratios. In J. Wilson, J. Henriksen, and S. Roberts, editors, Proceedings of the 1986 Winter Simulation Conference, pages 285-289, 1986.
[15] R. S. Sutton, D. McAllester, S. Singh, and Y. Mansour. Policy gradient methods for reinforcement learning with function approximation. Neural Information Processing Systems. MIT Press, pages 1057-1063, 2000.
[17] V. Vapnik, S. E. Golowich, and A. Smola. Support vector method for function approximation, regression estimation and signal processing. In Advances in Neural Information Processing Systems, pages 281-281, 1997.
[18] R. J. Williams. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning, 8:229-256, 1992.