-
6
-
-
0041965975
-
R-max - A general polynomial time algorithm for near-optimal reinforcement learning
-
R.I. Brafman M. Tennenholtz 2002 R-max - A general polynomial time algorithm for near-optimal reinforcement learning Journal of Machine Learning Research 3 213 231
-
(2002)
Journal of Machine Learning Research
, vol.3
, pp. 213-231
-
-
Brafman, R.I.1
Tennenholtz, M.2
-
7
-
-
0242628951
-
Markov decision processes with state-information lag
-
4
-
D.M. Brooks C.T. Leondes 1972 Markov decision processes with state-information lag Operations Research 20 4 904 907
-
(1972)
Operations Research
, vol.20
, pp. 904-907
-
-
Brooks, D.M.1
Leondes, C.T.2
-
8
-
-
36349002318
-
A reinforcement learning algorithm with polynomial interaction complexity for only-costly-observable MDPs
-
Fox, R., & Tennenholtz, M. (2007). A reinforcement learning algorithm with polynomial interaction complexity for only-costly-observable MDPs. In Proceedings of the 22nd Conference on Artificial Intelligence, pp. 553-558.
-
(2007)
Proceedings of the 22nd Conference on Artificial Intelligence
, pp. 553-558
-
-
Fox, R.1
Tennenholtz, M.2
-
9
-
-
84947403595
-
Probability inequalities for sums of bounded random variables
-
301
-
W. Hoeffding 1963 Probability inequalities for sums of bounded random variables Journal of the American Statistical Association 58 301 13 30
-
(1963)
Journal of the American Statistical Association
, vol.58
, pp. 13-30
-
-
Hoeffding, W.1
-
15
-
-
0003861655
-
-
PhD thesis, Brown University, Providence, RI, 1996
-
Littman, M. L. (1996). Algorithms for sequential decision making. PhD thesis, Brown University, Providence, RI.
-
(1996)
Algorithms for Sequential Decision Making
-
-
Littman, M.L.1
-
16
-
-
0012327484
-
Using eligibility traces to find the best memoryless policy in partially observable Markov decision processes
-
Loch, J., & Singh, S. (1998). Using eligibility traces to find the best memoryless policy in partially observable Markov decision processes. In Proceedings of the 15th International Conference on Machine Learning, pp. 323-331.
-
(1998)
Proceedings of the 15th International Conference on Machine Learning
, pp. 323-331
-
-
Loch, J.1
Singh, S.2
-
18
-
-
0036832956
-
Kernel-based reinforcement learning
-
D. Ormoneit Ś. Sen 2002 Kernel-based reinforcement learning Machine Learning 49 161 178
-
(2002)
Machine Learning
, vol.49
, pp. 161-178
-
-
Ormoneit, D.1
Sen, Ś.2
-
21
-
-
0029753630
-
Reinforcement learning with replacing eligibility traces
-
1-3
-
S.P. Singh R.S. Sutton 1996 Reinforcement learning with replacing eligibility traces Machine Learning 22 1-3 123 158
-
(1996)
Machine Learning
, vol.22
, pp. 123-158
-
-
Singh, S.P.1
Sutton, R.S.2
-
22
-
-
0028497385
-
An upper bound on the loss from approximate optimal-value functions
-
3
-
S.P. Singh R.C. Yee 1994 An upper bound on the loss from approximate optimal-value functions Machine Learning 16 3 227 233
-
(1994)
Machine Learning
, vol.16
, pp. 227-233
-
-
Singh, S.P.1
Yee, R.C.2
-
23
-
-
33749255382
-
PAC model-free reinforcement learning
-
Strehl, A. L., Li, L., Wiewiora, E., Langford, J., & Littman, M. L. (2006). PAC model-free reinforcement learning. In Proceedings of the 23rd International Conference on Machine Learning, pp. 881-888.
-
(2006)
Proceedings of the 23rd International Conference on Machine Learning
, pp. 881-888
-
-
Strehl, A.L.1
Li, L.2
Wiewiora, E.3
Langford, J.4
Littman, M.L.5
-
24
-
-
85156221438
-
Generalization in reinforcement learning: Successful examples using sparse coarse coding
-
MIT Press Cambridge, MA
-
Sutton R.S. (1996) Generalization in reinforcement learning: Successful examples using sparse coarse coding. In: Touretzky D.S., Mozer M.C., Hasselmo M.E. (Eds) Advances in neural information processing systems 8. MIT Press, Cambridge, MA, pp 1038-1045
-
(1996)
Advances in Neural Information Processing Systems 8
, pp. 1038-1045
-
-
Sutton, R.S.1
Touretzky, D.S.2
Mozer, M.C.3
Hasselmo, M.E.4
-
26
-
-
0002891388
-
Locally weighted projection regression: An O(n) algorithm for incremental real time learning in high dimensional space
-
Vijayakumar, S., & Schaal, S. (2000). Locally weighted projection regression: An O(n) algorithm for incremental real time learning in high dimensional space. In Proceedings of the 17th International Conference on Machine Learning, pp. 1079-1086.
-
(2000)
Proceedings of the 17th International Conference on Machine Learning
, pp. 1079-1086
-
-
Vijayakumar, S.1
Schaal, S.2