-
1
-
-
0000025104
-
On the choice of alternative measures in importance sampling with Markov chains
-
Andradóttir, S., Heyman, D. P., Ott, T. J. (1995). On the choice of alternative measures in importance sampling with Markov chains. Operations Research, 43(3): 509-519.
-
(1995)
Operations Research
, vol.43
, Issue.3
, pp. 509-519
-
-
Andradóttir, S.1
Heyman, D.P.2
Ott, T.J.3
-
2
-
-
61849106433
-
Projected equation methods for approximate solution of large linear systems
-
Bertsekas, D. P., Yu, H. (2009). Projected equation methods for approximate solution of large linear systems. Journal of Computational and Applied Mathematics, 227(1): 27-50.
-
(2009)
Journal of Computational and Applied Mathematics
, vol.227
, Issue.1
, pp. 27-50
-
-
Bertsekas, D.P.1
Yu, H.2
-
5
-
-
84899800132
-
Policy evaluation with temporal differences: A survey and comparison
-
Dann, C., Neumann, G., Peters, J. (2014). Policy evaluation with temporal differences: A survey and comparison. Journal of Machine Learning Research, 15: 809-883.
-
(2014)
Journal of Machine Learning Research
, vol.15
, pp. 809-883
-
-
Dann, C.1
Neumann, G.2
Peters, J.3
-
6
-
-
84897081792
-
Off-policy learning with eligibility traces: A survey
-
Geist, M., Scherrer, B. (2014). Off-policy learning with eligibility traces: A survey. Journal of Machine Learning Research, 15: 289-333.
-
(2014)
Journal of Machine Learning Research
, vol.15
, pp. 289-333
-
-
Geist, M.1
Scherrer, B.2
-
7
-
-
70549113878
-
Adaptive importance sampling for value function approximation in off-policy reinforcement learning
-
Hachiya, H., Akiyama, T., Sugiyama, M., Peters, J. (2009). Adaptive importance sampling for value function approximation in off-policy reinforcement learning. Neural Networks, 22(10): 1399-1410.
-
(2009)
Neural Networks
, vol.22
, Issue.10
, pp. 1399-1410
-
-
Hachiya, H.1
Akiyama, T.2
Sugiyama, M.3
Peters, J.4
-
8
-
-
84855251060
-
Importance-weighted least-squares probabilistic classifier for covariate shift adaptation with application to human activity recognition
-
Hachiya, H., Sugiyama, M., Ueda, N. (2012). Importance-weighted least-squares probabilistic classifier for covariate shift adaptation with application to human activity recognition. Neurocomputing, 80: 93-101.
-
(2012)
Neurocomputing
, vol.80
, pp. 93-101
-
-
Hachiya, H.1
Sugiyama, M.2
Ueda, N.3
-
9
-
-
0141812007
-
-
Ph.D. Dissertation, Statistics Department, Stanford University
-
Hesterberg, T. C. (1988). Advances in importance sampling. Ph.D. Dissertation, Statistics Department, Stanford University.
-
(1988)
Advances in Importance Sampling
-
-
Hesterberg, T.C.1
-
13
-
-
77954101982
-
GQ(λ): A general gradient algorithm for temporal-difference prediction learning with eligibility traces
-
Atlantis Press
-
Maei, H. R., Sutton, R. S. (2010). GQ(λ): A general gradient algorithm for temporal-difference prediction learning with eligibility traces. In Proceedings of the Third Conference on Artificial General Intelligence, pp. 91-96. Atlantis Press.
-
(2010)
Proceedings of the Third Conference on Artificial General Intelligence
, pp. 91-96
-
-
Maei, H.R.1
Sutton, R.S.2
-
15
-
-
0242393653
-
Eligibility traces for off-policy policy evaluation
-
Morgan Kaufmann
-
Precup, D., Sutton, R. S., Singh, S. (2000). Eligibility traces for off-policy policy evaluation. In Proceedings of the 17th International Conference on Machine Learning, pp. 759-766. Morgan Kaufmann.
-
(2000)
Proceedings of the 17th International Conference on Machine Learning
, pp. 759-766
-
-
Precup, D.1
Sutton, R.S.2
Singh, S.3
-
20
-
-
0037527188
-
Improving predictive inference under covariate shift by weighting the log-likelihood function
-
Shimodaira, H. (2000). Improving predictive inference under covariate shift by weighting the log-likelihood function. Journal of Statistical Planning and Inference, 90(2): 227-244.
-
(2000)
Journal of Statistical Planning and Inference
, vol.90
, Issue.2
, pp. 227-244
-
-
Shimodaira, H.1
-
22
-
-
84919913727
-
A new Q(λ) with interim forward view and Monte Carlo equivalence
-
Beijing, China
-
Sutton, R. S., Mahmood, A. R., Precup, D., van Hasselt, H. (2014). A new Q(λ) with interim forward view and Monte Carlo equivalence. In Proceedings of the 31st International Conference on Machine Learning, Beijing, China.
-
(2014)
Proceedings of the 31st International Conference on Machine Learning
-
-
Sutton, R.S.1
Mahmood, A.R.2
Precup, D.3
Van Hasselt, H.4
-
23
-
-
77956517288
-
Convergence of least squares temporal difference methods under general conditions
-
Yu, H. (2010). Convergence of least squares temporal difference methods under general conditions. In Proceedings of the 27th International Conference on Machine Learning, pp. 1207-1214.
-
(2010)
Proceedings of the 27th International Conference on Machine Learning
, pp. 1207-1214
-
-
Yu, H.1