1. Agrawal, R. (1995). The continuum-armed bandit problem. SIAM J. Control Optim., 33, 1926-1951.
2. Auer, P., Cesa-Bianchi, N., & Fischer, P. (2002). Finite-time analysis of the multiarmed bandit problem. Machine Learning, 47, 235-256.
3. Auer, P., Cesa-Bianchi, N., Freund, Y., & Schapire, R. E. (2003). The nonstochastic multiarmed bandit problem. SIAM J. Comput., 32, 48-77.
5. Burnetas, A. N., & Katehakis, M. N. (1996). Optimal adaptive policies for sequential allocation problems. Adv. Appl. Math., 17, 122-142.
7. Even-Dar, E., Mannor, S., & Mansour, Y. (2002). PAC bounds for multi-armed bandit and Markov decision processes. Proceedings of COLT 2002 (pp. 255-270). London, UK: Springer-Verlag.
9. Gittins, J. C. (1989). Multi-armed bandit allocation indices. Wiley-Interscience Series in Systems and Optimization. Chichester: John Wiley & Sons Ltd. With a foreword by Peter Whittle.
10. Honda, J., & Takemura, A. (2010). An asymptotically optimal policy for finite support models in the multi-armed bandit problem. Submitted to Machine Learning, arXiv:0905.2776v3.
11. Ishikida, T., & Varaiya, P. (1994). Multi-armed bandit problem revisited. J. Optim. Theory Appl., 83, 113-154.
12. Kleinberg, R. (2005). Nearly tight bounds for the continuum-armed bandit problem. Proceedings of NIPS 2005 (pp. 697-704). MIT Press.
14. Lai, T. L., & Robbins, H. (1985). Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics, 6, 4-22.
15. Lamperti, J. (1996). Probability; a survey of the mathematical theory. Wiley Series in Probability and Statistics. New York: John Wiley & Sons Ltd. Second edition.
16. Meuleau, N., & Bourgine, P. (1999). Exploration of multi-state environments: Local measures and back-propagation of uncertainty. Machine Learning, 35, 117-154.
19. Strens, M. (2000). A Bayesian framework for reinforcement learning. Proceedings of ICML 2000 (pp. 943-950). Morgan Kaufmann, San Francisco, CA.
20. Vermorel, J., & Mohri, M. (2005). Multi-armed bandit algorithms and empirical evaluation. Proceedings of ECML 2005 (pp. 437-448). Porto, Portugal: Springer.
21. Yakowitz, S., & Lowe, W. (1991). Nonparametric bandit methods. Ann. Oper. Res., 28, 297-312.