-
1
-
-
0037709910
-
The non-stochastic multi-armed bandit problem
-
Peter Auer, Nicolò Cesa-Bianchi, Yoav Freund, and Robert E. Schapire. The non-stochastic multi-armed bandit problem. SIAM Journal on Computing, 32(1):48-77, 2002.
-
(2002)
SIAM Journal on Computing
, vol.32
, Issue.1
, pp. 48-77
-
-
Auer, P.1
Cesa-Bianchi, N.2
Freund, Y.3
Schapire, R.E.4
-
2
-
-
0031140246
-
How to use expert advice
-
Nicolò Cesa-Bianchi, Yoav Freund, David Haussler, David P. Helmbold, and Robert E. Schapire. How to use expert advice. Journal of the Association for Computing Machinery, 44(3):427-485, 1997.
-
(1997)
Journal of the Association for Computing Machinery
, vol.44
, Issue.3
, pp. 427-485
-
-
Cesa-Bianchi, N.1
Freund, Y.2
Haussler, D.3
Helmbold, D.P.4
Schapire, R.E.5
-
3
-
-
33646519163
-
Exploration-exploitation tradeoffs for experts algorithms in reactive environments
-
Daniela Pucci de Farias and Nimrod Megiddo. Exploration-exploitation tradeoffs for experts algorithms in reactive environments. In Advances in Neural Information Processing Systems 17, pages 409-416, 2004.
-
(2004)
Advances in Neural Information Processing Systems
, vol.17
, pp. 409-416
-
-
Pucci de Farias, D.1
Megiddo, N.2
-
4
-
-
0032137328
-
Tracking the best expert
-
Mark Herbster and Manfred Warmuth. Tracking the best expert. Machine Learning, 32(2):151-178, 1998.
-
(1998)
Machine Learning
, vol.32
, Issue.2
, pp. 151-178
-
-
Herbster, M.1
Warmuth, M.2
-
5
-
-
0029617280
-
Convergence results for the EM approach to mixtures of experts architectures
-
Michael I. Jordan and Lei Xu. Convergence results for the EM approach to mixtures of experts architectures. Neural Networks, 8:1409-1431, 1995.
-
(1995)
Neural Networks
, vol.8
, pp. 1409-1431
-
-
Jordan, M.I.1
Xu, L.2
-
6
-
-
0036832954
-
Near-optimal reinforcement learning in polynomial time
-
Michael Kearns and Satinder Singh. Near-optimal reinforcement learning in polynomial time. Machine Learning, 49:209-232, 2002.
-
(2002)
Machine Learning
, vol.49
, pp. 209-232
-
-
Kearns, M.1
Singh, S.2
-
9
-
-
84966203785
-
Some aspects of the sequential design of experiments
-
Herbert Robbins. Some aspects of the sequential design of experiments. Bulletin of the American Mathematical Society, 58:527-535, 1952.
-
(1952)
Bulletin of the American Mathematical Society
, vol.58
, pp. 527-535
-
-
Robbins, H.1
-
10
-
-
33746857797
-
Keepaway soccer: From machine learning testbed to benchmark
-
Peter Stone, Gregory Kuhlmann, Matthew E. Taylor, and Yaxin Liu. Keepaway soccer: From machine learning testbed to benchmark. In Itsuki Noda, Adam Jacoff, Ansgar Bredenfeld, and Yasutake Takahashi, editors, RoboCup-2005: Robot Soccer World Cup IX. Springer Verlag, Berlin, 2006. To appear.
-
(2006)
RoboCup-2005: Robot Soccer World Cup IX
-
-
Stone, P.1
Kuhlmann, G.2
Taylor, M.E.3
Liu, Y.4
-
11
-
-
0032050241
-
Model-based average reward reinforcement learning
-
Prasad Tadepalli and DoKyeong Ok. Model-based average reward reinforcement learning. Artificial Intelligence, 100:177-224, 1998.
-
(1998)
Artificial Intelligence
, vol.100
, pp. 177-224
-
-
Tadepalli, P.1
Ok, D.2