[1] J. N. Tsitsiklis and B. Van Roy, "Optimal stopping of Markov processes: Hilbert space theory, approximation algorithms, and an application to pricing financial derivatives," IEEE Trans. Automat. Contr., vol. 44, pp. 1840-1851, 1999.
[2] J. Barraquand and D. Martineau, "Numerical valuation of high dimensional multivariate American securities," Journal of Financial and Quantitative Analysis, vol. 30, pp. 383-405, 1995.
[3] F. A. Longstaff and E. S. Schwartz, "Valuing American options by simulation: A simple least-squares approach," Review of Financial Studies, vol. 14, pp. 113-147, 2001.
[6] R. S. Sutton, "Learning to predict by the methods of temporal differences," Machine Learning, vol. 3, pp. 9-44, 1988.
[7] D. P. Bertsekas and S. Ioffe, "Temporal differences-based policy iteration and applications in neuro-dynamic programming," MIT, LIDS Tech. Report LIDS-P-2349, 1996.
[8] A. Nedić and D. P. Bertsekas, "Least squares policy evaluation algorithms with linear function approximation," Discrete Event Dyn. Syst., vol. 13, pp. 79-110, 2003.
[11] S. J. Bradtke and A. G. Barto, "Linear least-squares algorithms for temporal difference learning," Machine Learning, vol. 22, no. 2, pp. 33-57, 1996.
[13] D. S. Choi and B. Van Roy, "A generalized Kalman filter for fixed point approximation and efficient temporal-difference learning," Discrete Event Dyn. Syst., vol. 16, no. 2, pp. 207-239, 2006.
[14] D. P. Bertsekas, V. S. Borkar, and A. Nedić, "Improved temporal difference methods with linear function approximation," MIT, LIDS Tech. Report 2573, 2003; also appears in Learning and Approximate Dynamic Programming, A. Barto, W. Powell, and J. Si, Eds., IEEE Press, 2004.
[15] C. J. C. H. Watkins, "Learning from delayed rewards," Doctoral dissertation, University of Cambridge, Cambridge, United Kingdom, 1989.
[17] J. N. Tsitsiklis, "Asynchronous stochastic approximation and Q-learning," Machine Learning, vol. 16, pp. 185-202, 1994.
[18] J. N. Tsitsiklis and B. Van Roy, "An analysis of temporal-difference learning with function approximation," IEEE Trans. Automat. Contr., vol. 42, no. 5, pp. 674-690, 1997.