-
1
-
-
33744819512
-
Adaptive importance sampling technique for Markov chains using stochastic approximation
-
Ahamed, T. P., Borkar, V. S., and Juneja, S. Adaptive importance sampling technique for Markov chains using stochastic approximation. Operations Research, 54:489-504, 2006.
-
(2006)
Operations Research
, vol.54
, pp. 489-504
-
-
Ahamed, T.P.1
Borkar, V.S.2
Juneja, S.3
-
2
-
-
0003565783
-
-
Athena Scientific, Belmont, MA, third edition
-
Bertsekas, D. P. Dynamic Programming and Optimal Control, volume II. Athena Scientific, Belmont, MA, third edition, 2007.
-
(2007)
Dynamic programming and optimal control
, vol.2
-
-
Bertsekas, D.P.1
-
3
-
-
77956540624
-
Projected equations, variational inequalities, and temporal difference methods
-
to appear
-
Bertsekas, D. P. Projected equations, variational inequalities, and temporal difference methods. IEEE Trans. Automat. Contr., 2009. to appear.
-
(2009)
IEEE Trans. Automat. Contr.
-
-
Bertsekas, D.P.1
-
5
-
-
61849106433
-
Projected equation methods for approximate solution of large linear systems
-
Bertsekas, D. P. and Yu, H. Projected equation methods for approximate solution of large linear systems. J. Computational and Applied Mathematics, 227(1):27-50, 2009.
-
(2009)
J. Computational and Applied Mathematics
, vol.227
, Issue.1
, pp. 27-50
-
-
Bertsekas, D.P.1
Yu, H.2
-
7
-
-
0038595396
-
Least-squares temporal difference learning
-
Boyan, J. A. Least-squares temporal difference learning. In Proc. the 16th ICML, pp. 49-56, 1999.
-
(1999)
Proc. the 16th ICML
, pp. 49-56
-
-
Boyan, J.A.1
-
8
-
-
0001771345
-
Linear least-squares algorithms for temporal difference learning
-
Bradtke, S. J. and Barto, A. G. Linear least-squares algorithms for temporal difference learning. Machine Learning, 22(2):33-57, 1996.
-
(1996)
Machine Learning
, vol.22
, Issue.2
, pp. 33-57
-
-
Bradtke, S.J.1
Barto, A.G.2
-
9
-
-
32944469001
-
Probability
-
Philadelphia, PA
-
Breiman, L. Probability. SIAM, Philadelphia, PA, 1992.
-
(1992)
SIAM
-
-
Breiman, L.1
-
11
-
-
0001240715
-
Importance sampling for stochastic simulations
-
Glynn, P. W. and Iglehart, D. L. Importance sampling for stochastic simulations. Management Science, 35:1367-1392, 1989.
-
(1989)
Management Science
, vol.35
, pp. 1367-1392
-
-
Glynn, P.W.1
Iglehart, D.L.2
-
14
-
-
70350302258
-
-
Cambridge University Press, Cambridge, UK, 2nd edition
-
Meyn, S. and Tweedie, R. L. Markov Chains and Stochastic Stability. Cambridge University Press, Cambridge, UK, 2nd edition, 2009.
-
(2009)
Markov Chains and Stochastic Stability
-
-
Meyn, S.1
Tweedie, R.L.2
-
15
-
-
0037288398
-
Least squares policy evaluation algorithms with linear function approximation
-
Nedic, A. and Bertsekas, D. P. Least squares policy evaluation algorithms with linear function approximation. Discrete Event Dyn. Syst., 13:79-110, 2003.
-
(2003)
Discrete Event Dyn. Syst.
, vol.13
, pp. 79-110
-
-
Nedic, A.1
Bertsekas, D.P.2
-
16
-
-
4644328593
-
Off-policy temporal-difference learning with function approximation
-
Precup, D., Sutton, R. S., and Dasgupta, S. Off-policy temporal-difference learning with function approximation. In Proc. the 18th ICML, pp. 417-424, 2001.
-
(2001)
Proc. the 18th ICML
, pp. 417-424
-
-
Precup, D.1
Sutton, R.S.2
Dasgupta, S.3
-
17
-
-
33847202724
-
Learning to predict by the methods of temporal differences
-
Sutton, R. S. Learning to predict by the methods of temporal differences. Machine Learning, 3:9-44, 1988.
-
(1988)
Machine Learning
, vol.3
, pp. 9-44
-
-
Sutton, R.S.1
-
19
-
-
0031143730
-
An analysis of temporal-difference learning with function approximation
-
Tsitsiklis, J. N. and Van Roy, B. An analysis of temporal-difference learning with function approximation. IEEE Trans. Automat. Contr., 42(5):674-690, 1997.
-
(1997)
IEEE Trans. Automat. Contr.
, vol.42
, Issue.5
, pp. 674-690
-
-
Tsitsiklis, J.N.1
Van Roy, B.2
-
20
-
-
56449123618
-
Preconditioned temporal difference learning
-
Yao, H. S. and Liu, Z. Q. Preconditioned temporal difference learning. In Proc. the 25th ICML, pp. 1208-1215, 2008.
-
(2008)
Proc. the 25th ICML
, pp. 1208-1215
-
-
Yao, H.S.1
Liu, Z.Q.2
-
21
-
-
77956506470
-
Convergence of least squares temporal difference methods under general conditions
-
Yu, H. Convergence of least squares temporal difference methods under general conditions. Tech. Report C-2010-1, Dept. CS, Univ. of Helsinki, 2010.
-
(2010)
Tech. Report C-2010-1, Dept. CS, Univ. of Helsinki
-
-
Yu, H.1