-
2
-
-
84941157238
-
Learning near-optimal policies with fitted policy iteration and a single sample path: Approximate iterative policy evaluation
-
A. Antos, Cs. Szepesvári, and R. Munos. Learning near-optimal policies with fitted policy iteration and a single sample path: approximate iterative policy evaluation. Submitted to ICML'2006, 2006.
-
(2006)
ICML'2006
-
-
Antos, A.1
Szepesvári, Cs.2
Munos, R.3
-
8
-
-
0003161174
-
Rates of convergence for empirical processes of stationary mixing sequences
-
January
-
B. Yu. Rates of convergence for empirical processes of stationary mixing sequences. The Annals of Probability, 22(1):94-116, January 1994.
-
(1994)
The Annals of Probability
, vol.22
, Issue.1
, pp. 94-116
-
-
Yu, B.1
-
9
-
-
0030489341
-
Histogram regression estimation using data-dependent partitions
-
A. Nobel. Histogram regression estimation using data-dependent partitions. Annals of Statistics, 24(3):1084-1105, 1996.
-
(1996)
Annals of Statistics
, vol.24
, Issue.3
, pp. 1084-1105
-
-
Nobel, A.1
-
10
-
-
0000996139
-
Sphere packing numbers for subsets of the boolean n-cube with bounded Vapnik-Chervonenkis dimension
-
D. Haussler. Sphere packing numbers for subsets of the boolean n-cube with bounded Vapnik-Chervonenkis dimension. Journal of Combinatorial Theory Series A, 69:217-232, 1995.
-
(1995)
Journal of Combinatorial Theory Series A
, vol.69
, pp. 217-232
-
-
Haussler, D.1
-
11
-
-
0001201756
-
Some studies in machine learning using the game of checkers
-
A.L. Samuel. Some studies in machine learning using the game of checkers. IBM Journal of Research and Development, pages 210-229, 1959.
-
(1959)
IBM Journal of Research and Development
, pp. 210-229
-
-
Samuel, A.L.1
-
12
-
-
0004242550
-
-
Reprinted, E.A. Feigenbaum and J. Feldman, editors, McGraw-Hill, New York
-
Reprinted in Computers and Thought, E.A. Feigenbaum and J. Feldman, editors, McGraw-Hill, New York, 1963.
-
(1963)
Computers and Thought
-
-
-
15
-
-
0008321896
-
Reinforcement learning: An introduction
-
Richard S. Sutton and Andrew G. Barto. Reinforcement learning: An introduction. Bradford Book, 1998.
-
(1998)
Bradford Book
-
-
Sutton, R.S.1
Barto, A.G.2
-
16
-
-
84880694195
-
Stable function approximation in dynamic programming
-
Armand Prieditis and Stuart Russell, editors, San Francisco, CA. Morgan Kaufmann
-
Geoffrey J. Gordon. Stable function approximation in dynamic programming. In Armand Prieditis and Stuart Russell, editors, Proceedings of the Twelfth International Conference on Machine Learning, pages 261-268, San Francisco, CA, 1995. Morgan Kaufmann.
-
(1995)
Proceedings of the Twelfth International Conference on Machine Learning
, pp. 261-268
-
-
Gordon, G.J.1
-
17
-
-
0029752470
-
Feature-based methods for large scale dynamic programming
-
J. N. Tsitsiklis and B. Van Roy. Feature-based methods for large scale dynamic programming. Machine Learning, 22:59-94, 1996.
-
(1996)
Machine Learning
, vol.22
, pp. 59-94
-
-
Tsitsiklis, J.N.1
Van Roy, B.2
-
21
-
-
84899029004
-
Batch value function approximation via support vectors
-
T. G. Dietterich, S. Becker, and Z. Ghahramani, editors, Cambridge, MA. MIT Press
-
T. G. Dietterich and X. Wang. Batch value function approximation via support vectors. In T. G. Dietterich, S. Becker, and Z. Ghahramani, editors, Advances in Neural Information Processing Systems 14, Cambridge, MA, 2002. MIT Press.
-
(2002)
Advances in Neural Information Processing Systems
, vol.14
-
-
Dietterich, T.G.1
Wang, X.2
-
22
-
-
31844456754
-
Finite time bounds for sampling based fitted value iteration
-
Cs. Szepesvári and R. Munos. Finite time bounds for sampling based fitted value iteration. In ICML'2005, 2005.
-
(2005)
ICML'2005
-
-
Szepesvári, Cs.1
Munos, R.2
-
23
-
-
0033904367
-
Nonparametric time series prediction through adaptive model selection
-
April
-
R. Meir. Nonparametric time series prediction through adaptive model selection. Machine Learning, 39(1):5-34, April 2000.
-
(2000)
Machine Learning
, vol.39
, Issue.1
, pp. 5-34
-
-
Meir, R.1