-
1
-
-
0036568025
-
Finite-time analysis of the multiarmed bandit problem
-
DOI 10.1023/A:1013689704352, Computational Learning Theory
-
Auer, P., Cesa-Bianchi, N., & Fischer, P. (2002). Finite-time analysis of the multiarmed bandit problem. Machine Learning, 47, 235-256. (Pubitemid 34126111)
-
(2002)
Machine Learning
, vol.47
, Issue.2-3
, pp. 235-256
-
-
Auer, P.1
Cesa-Bianchi, N.2
Fischer, P.3
-
2
-
-
0029513526
-
Gambling in a rigged casino: The adversarial multi-armed bandit problem
-
Auer, P., Cesa-Bianchi, N., Freund, Y., & Schapire, R. E. (1995). Gambling in a rigged casino: The adversarial multi-armed bandit problem. FOCS.
-
(1995)
FOCS
-
-
Auer, P.1
Cesa-Bianchi, N.2
Freund, Y.3
Schapire, R.E.4
-
3
-
-
33745295134
-
Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems
-
Even-Dar, E., Mannor, S., & Mansour, Y. (2006). Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems. JMLR, 7, 1079-1105. (Pubitemid 43938989)
-
(2006)
Journal of Machine Learning Research
, vol.7
, pp. 1079-1105
-
-
Even-Dar, E.1
Mannor, S.2
Mansour, Y.3
-
4
-
-
0000125534
-
Sample selection bias as a specification error
-
Heckman, J. (1979). Sample selection bias as a specification error. Econometrica, 47, 153-161.
-
(1979)
Econometrica
, vol.47
, pp. 153-161
-
-
Heckman, J.1
-
5
-
-
84898967749
-
Approximate planning in large POMDPs via reusable trajectories
-
Kearns, M., Mansour, Y., & Ng, A. Y. (2000). Approximate planning in large POMDPs via reusable trajectories. NIPS.
-
(2000)
NIPS
-
-
Kearns, M.1
Mansour, Y.2
Ng, A.Y.3
-
6
-
-
0002899547
-
Asymptotically efficient adaptive allocation rules
-
Lai, T., & Robbins, H. (1985). Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics, 6, 4-22.
-
(1985)
Advances in Applied Mathematics
, vol.6
, pp. 4-22
-
-
Lai, T.1
Robbins, H.2
-
7
-
-
0029344133
-
Machine learning and nonparametric bandit theory
-
Lai, T., & Yakowitz, S. (1995). Machine learning and nonparametric bandit theory. IEEE TAC, 40, 1199-1209.
-
(1995)
IEEE TAC
, vol.40
, pp. 1199-1209
-
-
Lai, T.1
Yakowitz, S.2
-
8
-
-
70049106076
-
Bandits for taxonomies: A model-based approach
-
Pandey, S., Agarwal, D., Chakrabarti, D., & Josifovski, V. (2007). Bandits for taxonomies: A model-based approach. SIAM Data Mining Conference.
-
(2007)
SIAM Data Mining Conference
-
-
Pandey, S.1
Agarwal, D.2
Chakrabarti, D.3
Josifovski, V.4
-
9
-
-
33749242078
-
Experience-efficient learning in associative bandit problems
-
Strehl, A. L., Mesterharm, C., Littman, M. L., & Hirsh, H. (2006). Experience-efficient learning in associative bandit problems. ICML.
-
(2006)
ICML
-
-
Strehl, A.L.1
Mesterharm, C.2
Littman, M.L.3
Hirsh, H.4
-
10
-
-
15844389867
-
Bandit problems with side observations
-
DOI 10.1109/TAC.2005.844079
-
Wang, C.-C., Kulkarni, S. R., & Poor, H. V. (2005). Bandit problems with side observations. IEEE Transactions on Automatic Control, 50, 338-355. (Pubitemid 40448585)
-
(2005)
IEEE Transactions on Automatic Control
, vol.50
, Issue.3
, pp. 338-355
-
-
Wang, C.-C.1
Kulkarni, S.R.2
Poor, H.V.3