-
2
-
-
84880854156
-
A general polynomial time algorithm for near-optimal reinforcement learning
-
R.I. Brafman, M. Tennenholtz, A general polynomial time algorithm for near-optimal reinforcement learning, in: Proc. 17th International Joint Conference on Artificial Intelligence, IJCAI-01, 1999, pp. 734-739.
-
(1999)
Proc. 17th International Joint Conference on Artificial Intelligence, IJCAI-01
, pp. 734-739
-
-
Brafman, R.I.1
Tennenholtz, M.2
-
3
-
-
84926078662
-
-
Cambridge University Press, New York
-
N. Cesa-Bianchi, G. Lugosi, Prediction, Learning, and Games, Cambridge University Press, New York, 2006.
-
(2006)
Prediction, Learning, and Games
-
-
Cesa-Bianchi, N.1
Lugosi, G.2
-
5
-
-
33845304828
-
How to combine expert (and novice) advice when actions impact the environment?
-
Sebastian Thrun, Lawrence Saul, Bernhard Schölkopf (Eds.), MIT Press, Cambridge, MA
-
D. Pucci de Farias, N. Megiddo, How to combine expert (and novice) advice when actions impact the environment? in: Sebastian Thrun, Lawrence Saul, Bernhard Schölkopf (Eds.), Advances in Neural Information Processing Systems, vol. 16, MIT Press, Cambridge, MA, 2004.
-
(2004)
Advances in Neural Information Processing Systems
, vol.16
-
-
De Farias, D.P.1
Megiddo, N.2
-
7
-
-
84880715629
-
Reinforcement learning in POMDPs without resets
-
E. Even-Dar, S.M. Kakade, Y. Mansour, Reinforcement learning in POMDPs without resets, in: IJCAI, 2005, pp. 690-695.
-
(2005)
IJCAI
, pp. 690-695
-
-
Even-Dar, E.1
Kakade, S.M.2
Mansour, Y.3
-
8
-
-
21844436185
-
Prediction with expert advice by following the perturbed leader for general weights
-
Algorithmic Learning Theory - 15th International Conference, ALT 2004
-
M. Hutter, J. Poland, Prediction with expert advice by following the perturbed leader for general weights, in: Proc. 15th International Conf. on Algorithmic Learning Theory, ALT'04, in: LNAI, vol. 3244, Springer, Padova, Berlin, 2004, pp. 279-293. (Pubitemid 41050298)
-
(2004)
Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science)
, vol.3244
, pp. 279-293
-
-
Hutter, M.1
Poland, J.2
-
9
-
-
84937417436
-
Self-optimizing and Pareto-optimal policies in general environments based on Bayes-mixtures
-
Lecture Notes in Artificial Intelligence, Springer, Sydney, Australia, July
-
M. Hutter, Self-optimizing and Pareto-optimal policies in general environments based on Bayes-mixtures, in: Proc. 15th Annual Conference on Computational Learning Theory, COLT 2002, in: Lecture Notes in Artificial Intelligence, Springer, Sydney, Australia, July 2002, pp. 364-379.
-
(2002)
Proc. 15th Annual Conference on Computational Learning Theory, COLT 2002
, pp. 364-379
-
-
Hutter, M.1
-
10
-
-
4644374039
-
Optimality of universal Bayesian prediction for general loss and alphabet
-
M. Hutter, Optimality of universal Bayesian prediction for general loss and alphabet, Journal of Machine Learning Research 4 (2003) 971-1000.
-
(2003)
Journal of Machine Learning Research
, vol.4
, pp. 971-1000
-
-
Hutter, M.1
-
12
-
-
33750686975
-
General discounting versus average reward
-
Algorithmic Learning Theory - 17th International Conference, ALT 2006, Proceedings
-
M. Hutter, General discounting versus average reward, in: Proc. 17th International Conf. on Algorithmic Learning Theory, ALT'06, in: LNAI, vol. 4264, Springer, Barcelona, Berlin, 2006, pp. 244-258. (Pubitemid 44705632)
-
(2006)
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
, vol.4264
, pp. 244-258
-
-
Hutter, M.1
-
13
-
-
0003691637
-
-
Prentice Hall, Englewood Cliffs, NJ
-
P.R. Kumar, P.P. Varaiya, Stochastic Systems: Estimation, Identification, and Adaptive Control, Prentice Hall, Englewood Cliffs, NJ, 1986.
-
(1986)
Stochastic Systems: Estimation, Identification, and Adaptive Control
-
-
Kumar, P.R.1
Varaiya, P.P.2
-
14
-
-
33646515747
-
-
LNAI, vol. 3734, Springer, Singapore, Berlin
-
J. Poland, M. Hutter, Defensive universal learning with experts, in: Proc. 16th International Conf. on Algorithmic Learning Theory, ALT'05, in: LNAI, vol. 3734, Springer, Singapore, Berlin, 2005, pp. 356-370.
-
(2005)
Defensive Universal Learning with Experts, In: Proc. 16th International Conf. on Algorithmic Learning Theory, ALT'05
, pp. 356-370
-
-
Poland, J.1
Hutter, M.2
-
16
-
-
41149139797
-
Predicting non-stationary processes
-
DOI 10.1016/j.aml.2007.04.004, PII S0893965907001899
-
D. Ryabko, M. Hutter, Predicting Non-Stationary Processes, Applied Mathematics Letters 21 (5) (2008) 477-482. (Pubitemid 351424908)
-
(2008)
Applied Mathematics Letters
, vol.21
, Issue.5
, pp. 477-482
-
-
Ryabko, D.1
Hutter, M.2
-
18
-
-
0004102479
-
-
MIT Press, Cambridge, MA
-
R. Sutton, A. Barto, Reinforcement Learning: An Introduction, MIT Press, Cambridge, MA, 1998.
-
(1998)
Reinforcement Learning: An Introduction
-
-
Sutton, R.1
Barto, A.2