-
1
-
-
0016556021
-
A new approach to manipulator control: The cerebellar model articulation controller (CMAC)
-
0314.92007
-
J. S. Albus 1975 A new approach to manipulator control: the cerebellar model articulation controller (CMAC) Journal of Dynamic Systems, Measurement and Control 97 220 227 0314.92007
-
(1975)
Journal of Dynamic Systems, Measurement and Control
, vol.97
, pp. 220-227
-
-
Albus, J.S.1
-
2
-
-
40849145988
-
Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
-
10.1007/s10994-007-5038-2
-
A. Antos C. Szepesvári R. Munos 2008 Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path Machine Learning 71 1 89 129 10.1007/s10994-007-5038-2
-
(2008)
Machine Learning
, vol.71
, Issue.1
, pp. 89-129
-
-
Antos, A.1
Szepesvári, C.2
Munos, R.3
-
4
-
-
78649714480
-
-
Master's thesis, Technion-Israel Institute of Technology. URL:
-
Bernstein, A. (2007). Adaptive state aggregation for reinforcement learning. Master's thesis, Technion-Israel Institute of Technology. URL: http://tx.technion.ac.il/~andreyb/MSc-Thesis-final.pdf.
-
(2007)
Adaptive State Aggregation for Reinforcement Learning
-
-
Bernstein, A.1
-
7
-
-
78649707587
-
LEAP: Learning entities adaptive partitioning
-
Whistler, Canada
-
Bonarini, A., Lazaric, A., & Restelli, M. (2005). LEAP: learning entities adaptive partitioning. In Proceedings of neural information processing systems conference (NIPS 2005), workshop on reinforcement learning benchmarks and bake-offs II, Whistler, Canada (pp. 41-47).
-
(2005)
Proceedings of Neural Information Processing Systems Conference (NIPS 2005), Workshop on Reinforcement Learning Benchmarks and Bake-offs II
, pp. 41-47
-
-
Bonarini, A.1
Lazaric, A.2
Restelli, M.3
-
8
-
-
0346942368
-
Decision-theoretic planning: Structural assumptions and computational leverage
-
0918.68110 1718251
-
C. Boutilier T. Dean S. Hanks 1999 Decision-theoretic planning: structural assumptions and computational leverage Journal of Artificial Intelligence Research 11 1 94 0918.68110 1718251
-
(1999)
Journal of Artificial Intelligence Research
, vol.11
, pp. 1-94
-
-
Boutilier, C.1
Dean, T.2
Hanks, S.3
-
9
-
-
0041965975
-
R-MAX-a general polynomial time algorithm for near-optimal reinforcement learning
-
10.1162/153244303765208377 1971337
-
R. I. Brafman M. Tennenholtz 2002 R-MAX-a general polynomial time algorithm for near-optimal reinforcement learning Journal of Machine Learning Research 3 213 231 10.1162/153244303765208377 1971337
-
(2002)
Journal of Machine Learning Research
, vol.3
, pp. 213-231
-
-
Brafman, R.I.1
Tennenholtz, M.2
-
11
-
-
0026206780
-
An optimal one-way multigrid algorithm for discrete-time stochastic control
-
DOI 10.1109/9.133184
-
C.-S. Chow J. N. Tsitsiklis 1991 An optimal one-way multigrid algorithm for discrete-time stochastic control IEEE Transactions on Automatic Control 36 8 898 914 0752.93078 10.1109/9.133184 1116447 (Pubitemid 21674882)
-
(1991)
IEEE Transactions on Automatic Control
, vol.36
, Issue.8
, pp. 898-914
-
-
Chow, C.-S.1
Tsitsiklis, J.N.2
-
12
-
-
0033629916
-
Reinforcement learning in continuous time and space
-
10.1162/089976600300015961
-
K. Doya 2000 Reinforcement learning in continuous time and space Neural Computation 12 219 245 10.1162/089976600300015961
-
(2000)
Neural Computation
, vol.12
, pp. 219-245
-
-
Doya, K.1
-
14
-
-
23244466805
-
-
PhD thesis, Gatsby Computational Neuroscience Unit, University College London, UK
-
Kakade, S. M. (2003). On the sample complexity of reinforcement learning. PhD thesis, Gatsby Computational Neuroscience Unit, University College London, UK.
-
(2003)
On the Sample Complexity of Reinforcement Learning
-
-
Kakade, S.M.1
-
15
-
-
0036832954
-
Near-optimal reinforcement learning in polynomial time
-
DOI 10.1023/A:1017984413808
-
M. Kearns S. P. Singh 2002 Near-optimal reinforcement learning in polynomial time Machine Learning 49 209 232 1014.68071 10.1023/A:1017984413808 (Pubitemid 34325687)
-
(2002)
Machine Learning
, vol.49
, Issue.2-3
, pp. 209-232
-
-
Kearns, M.1
Singh, S.P.2
-
16
-
-
4043069840
-
On actor-critic algorithms
-
1049.93095 10.1137/S0363012901385691 2044789
-
V. R. Konda J. N. Tsitsiklis 2003 On actor-critic algorithms SIAM Journal on Control and Optimization 42 4 1143 1166 1049.93095 10.1137/S0363012901385691 2044789
-
(2003)
SIAM Journal on Control and Optimization
, vol.42
, Issue.4
, pp. 1143-1166
-
-
Konda, V.R.1
Tsitsiklis, J.N.2
-
17
-
-
78649707742
-
Equi-gradient temporal difference learning
-
Loth, M., Davy, M., Coulom, R., & Preux, P. (2006). Equi-gradient temporal difference learning. In 23rd international conference on machine learning (ICML 2006), workshop on kernel machines and reinforcement learning.
-
(2006)
23rd International Conference on Machine Learning (ICML 2006), Workshop on Kernel Machines and Reinforcement Learning
-
-
Loth, M.1
Davy, M.2
Coulom, R.3
Preux, P.4
-
18
-
-
0029514510
-
The parti-game algorithm for variable resolution reinforcement learning in multidimensional state-spaces
-
A. W. Moore C. G. Atkeson 1995 The parti-game algorithm for variable resolution reinforcement learning in multidimensional state-spaces Machine Learning 21 199 233
-
(1995)
Machine Learning
, vol.21
, pp. 199-233
-
-
Moore, A.W.1
Atkeson, C.G.2
-
19
-
-
0036832953
-
Variable resolution discretization in optimal control
-
DOI 10.1023/A:1017992615625
-
R. Munos A. W. Moore 2002 Variable resolution discretization in optimal control Machine Learning 49 291 323 1005.68086 10.1023/A:1017992615625 (Pubitemid 34325691)
-
(2002)
Machine Learning
, vol.49
, Issue.2-3
, pp. 291-323
-
-
Munos, R.1
Moore, A.W.2
-
22
-
-
0036832956
-
Kernel-based reinforcement learning
-
DOI 10.1023/A:1017928328829
-
D. Ormoneit S. Sen 2002 Kernel-based reinforcement learning Machine Learning 49 161 178 1014.68069 10.1023/A:1017928328829 (Pubitemid 34325684)
-
(2002)
Machine Learning
, vol.49
, Issue.2-3
, pp. 161-178
-
-
Ormoneit, D.1
Sen, S.2
-
25
-
-
85153965130
-
Reinforcement learning with soft state aggregation
-
Singh, S. P., Jaakkola, T., & Jordan, M. I. (1995). Reinforcement learning with soft state aggregation. In Advances in neural information processing systems (NIPS) 7 (pp. 361-368).
-
(1995)
Advances in Neural Information Processing Systems (NIPS)
, vol.7
, pp. 361-368
-
-
Singh, S.P.1
Jaakkola, T.2
Jordan, M.I.3
-
28
-
-
33749255382
-
PAC model-free reinforcement learning
-
Strehl, A. L., Wiewiora, E., Langford, J., & Littman, M. L. (2006b). PAC model-free reinforcement learning. In Proceedings of the 23rd international conference on machine learning (pp. 881-888).
-
(2006)
Proceedings of the 23rd International Conference on Machine Learning
, pp. 881-888
-
-
Strehl, A.L.1
Wiewiora, E.2
Langford, J.3
Littman, M.L.4
-
29
-
-
85156221438
-
Generalization in reinforcement learning: Successful examples using sparse coarse coding
-
Sutton, R. S. (1996). Generalization in reinforcement learning: successful examples using sparse coarse coding. In Advances in neural information processing systems 8 (NIPS) (pp. 1038-1044).
-
(1996)
Advances in Neural Information Processing Systems 8 (NIPS)
, pp. 1038-1044
-
-
Sutton, R.S.1
-
31
-
-
0002999362
-
Splines: A perfect fit for signal and image processing
-
10.1109/79.799930
-
M. Unser 1999 Splines: A perfect fit for signal and image processing IEEE Signal Processing Magazine 16 22 38 10.1109/79.799930
-
(1999)
IEEE Signal Processing Magazine
, vol.16
, pp. 22-38
-
-
Unser, M.1
-
32
-
-
0017997986
-
Approximations of dynamic programs, I
-
0393.90094 10.1287/moor.3.3.231 506661
-
W. Whitt 1978 Approximations of dynamic programs, I Mathematics of Operations Research 3 3 231 243 0393.90094 10.1287/moor.3.3.231 506661
-
(1978)
Mathematics of Operations Research
, vol.3
, Issue.3
, pp. 231-243
-
-
Whitt, W.1