-
1
-
-
0003503387
-
-
John Wiley & Sons Republished by Dover in 2004
-
Wald, A.: Sequential Analysis. John Wiley & Sons (1947) Republished by Dover in 2004.
-
(1947)
Sequential Analysis
-
-
Wald, A.1
-
3
-
-
0004870746
-
A problem in the sequential design of experiments
-
Bellman, R.E.: A problem in the sequential design of experiments. Sankhya 16 (1957) 221-229
-
(1957)
Sankhya
, vol.16
, pp. 221-229
-
-
Bellman, R.E.1
-
4
-
-
30044441333
-
The sample complexity of exploration in the multiarmed bandit problem
-
Mannor, S., Tsitsiklis, J.N.: The sample complexity of exploration in the multiarmed bandit problem. Journal of Machine Learning Research 5 (2004) 623-648
-
(2004)
Journal of Machine Learning Research
, vol.5
, pp. 623-648
-
-
Mannor, S.1
Tsitsiklis, J.N.2
-
6
-
-
33745295134
-
Action elimination and stopping conditions for the multi-armed and reinforcement learning problems
-
to appear
-
Even-Dar, E., Mannor, S., Mansour, Y.: Action elimination and stopping conditions for the multi-armed and reinforcement learning problems. Journal of Machine Learning Research (2006) to appear.
-
(2006)
Journal of Machine Learning Research
-
-
Even-Dar, E.1
Mannor, S.2
Mansour, Y.3
-
8
-
-
21444436092
-
On the Lambert W function
-
Corless, R.M., Gonnet, G.H., Hare, D.E.G., Jeffrey, D.J., Knuth, D.E.: On the Lambert W function. Advances in Computational Mathematics 5 (1996) 329-359
-
(1996)
Advances in Computational Mathematics
, vol.5
, pp. 329-359
-
-
Corless, R.M.1
Gonnet, G.H.2
Hare, D.E.G.3
Jeffrey, D.J.4
Knuth, D.E.5
-
9
-
-
0033170372
-
Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning
-
Sutton, R.S., Precup, D., Singh, S.P.: Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence 112(1-2) (1999) 181-211
-
(1999)
Artificial Intelligence
, vol.112
, Issue.1-2
, pp. 181-211
-
-
Sutton, R.S.1
Precup, D.2
Singh, S.P.3
-
10
-
-
9444252980
-
The budgeted multi-armed bandit problem
-
Learning Theory: 17th Annual Conference on Learning Theory, COLT 2004, Springer-Verlag
-
Madani, O., Lizotte, D.J., Greiner, R.: The budgeted multi-armed bandit problem. In: Learning Theory: 17th Annual Conference on Learning Theory, COLT 2004. Volume 3120 of Lecture Notes in Computer Science, Springer-Verlag (2004) 643-645
-
(2004)
Lecture Notes in Computer Science
, vol.3120
, pp. 643-645
-
-
Madani, O.1
Lizotte, D.J.2
Greiner, R.3
-
11
-
-
33749817930
-
Active model selection
-
Banff, Canada, AUAI Press, Arlington, Virginia
-
Madani, O., Lizotte, D.J., Greiner, R.: Active model selection. In: Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence, Banff, Canada, AUAI Press, Arlington, Virginia (2004) 357-365
-
(2004)
Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence
, pp. 357-365
-
-
Madani, O.1
Lizotte, D.J.2
Greiner, R.3
-
12
-
-
33749851788
-
Models for trading exploration and exploitation using upper confidence bounds
-
PASCAL workshop on principled methods of trading exploration and exploitation
-
Auer, P.: Models for trading exploration and exploitation using upper confidence bounds. In: PASCAL workshop on principled methods of trading exploration and exploitation, PASCAL Network (2005)
-
(2005)
PASCAL Network
-
-
Auer, P.1
-
13
-
-
0000626524
-
Expected information as expected utility
-
Institute of Mathematical Statistics
-
Bernardo, J.M.: Expected information as expected utility. In: The Annals of Statistics. Volume 7, Institute of Mathematical Statistics (1979) 686-690
-
(1979)
The Annals of Statistics
, vol.7
, pp. 686-690
-
-
Bernardo, J.M.1