-
3
-
-
78649714480
-
-
Master's thesis, Technion - Israel Institute of Technology
-
A. Bernstein. Adaptive state aggregation for reinforcement learning. Master's thesis, Technion - Israel Institute of Technology, 2007. URL: http://tx.technion.ac.il/~andreyb/MSc-Thesis-final.pdf.
-
(2007)
Adaptive State Aggregation for Reinforcement Learning
-
-
Bernstein, A.1
-
4
-
-
0003565783
-
-
Athena Scientific, Belmont, MA, third edition
-
D. P. Bertsekas. Dynamic Programming and Optimal Control, vol. 2. Athena Scientific, Belmont, MA, third edition, 2007.
-
(2007)
Dynamic Programming and Optimal Control
, vol.2
-
-
Bertsekas, D.P.1
-
7
-
-
0041965975
-
R-MAX - A general polynomial time algorithm for near-optimal reinforcement learning
-
R. I. Brafman and M. Tennenholtz. R-MAX - a general polynomial time algorithm for near-optimal reinforcement learning. Journal of Machine Learning Research, 3: 213-231, 2002.
-
(2002)
Journal of Machine Learning Research
, vol.3
, pp. 213-231
-
-
Brafman, R.I.1
Tennenholtz, M.2
-
9
-
-
0026206780
-
An optimal one-way multigrid algorithm for discrete-time stochastic control
-
C.-S. Chow and J. N. Tsitsiklis. An optimal one-way multigrid algorithm for discrete-time stochastic control. IEEE Transactions on Automatic Control, 36(8): 898-914, 1991.
-
(1991)
IEEE Transactions on Automatic Control
, vol.36
, Issue.8
, pp. 898-914
-
-
Chow, C.-S.1
Tsitsiklis, J.N.2
-
11
-
-
0742284358
-
Reinforcement learning with function approximation converges to a region
-
G. J. Gordon. Reinforcement learning with function approximation converges to a region. In Advances in Neural Information Processing Systems (NIPS) 12, pages 1040-1046, 2000.
-
(2000)
Advances in Neural Information Processing Systems (NIPS)
, vol.12
, pp. 1040-1046
-
-
Gordon, G.J.1
-
12
-
-
23244466805
-
-
PhD thesis, Gatsby Computational Neuroscience Unit, University College London, UK
-
S. M. Kakade. On the Sample Complexity of Reinforcement Learning. PhD thesis, Gatsby Computational Neuroscience Unit, University College London, UK, 2003.
-
(2003)
On the Sample Complexity of Reinforcement Learning
-
-
Kakade, S.M.1
-
13
-
-
0036832954
-
Near-optimal reinforcement learning in polynomial time
-
M. Kearns and S. P. Singh. Near-optimal reinforcement learning in polynomial time. Machine Learning, 49: 209-232, 2002.
-
(2002)
Machine Learning
, vol.49
, pp. 209-232
-
-
Kearns, M.1
Singh, S.P.2
-
15
-
-
0029514510
-
The parti-game algorithm for variable resolution reinforcement learning in multidimensional state-spaces
-
A. W. Moore and C. G. Atkeson. The parti-game algorithm for variable resolution reinforcement learning in multidimensional state-spaces. Machine Learning, 21: 199-233, 1995.
-
(1995)
Machine Learning
, vol.21
, pp. 199-233
-
-
Moore, A.W.1
Atkeson, C.G.2
-
16
-
-
0036832953
-
Variable resolution discretization in optimal control
-
R. Munos and A. W. Moore. Variable resolution discretization in optimal control. Machine Learning, 49: 291-323, 2002.
-
(2002)
Machine Learning
, vol.49
, pp. 291-323
-
-
Munos, R.1
Moore, A.W.2
-
17
-
-
0003998452
-
-
John Wiley & Sons, Inc., New York, NY, USA
-
M. L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley & Sons, Inc., New York, NY, USA, 1994.
-
(1994)
Markov Decision Processes: Discrete Stochastic Dynamic Programming
-
-
Puterman, M.L.1
-
19
-
-
33749255382
-
PAC model-free reinforcement learning
-
A. L. Strehl, E. Wiewiora, J. Langford, and M. L. Littman. PAC model-free reinforcement learning. In Proceedings of the 23rd International Conference on Machine Learning, pages 881-888, 2006.
-
(2006)
Proceedings of the 23rd International Conference on Machine Learning
, pp. 881-888
-
-
Strehl, A.L.1
Wiewiora, E.2
Langford, J.3
Littman, M.L.4
-
21
-
-
0017997986
-
Approximations of dynamic programs, I
-
W. Whitt. Approximations of dynamic programs, I. Mathematics of Operations Research, 3(3): 231-243, 1978.
-
(1978)
Mathematics of Operations Research
, vol.3
, Issue.3
, pp. 231-243
-
-
Whitt, W.1