SCOPUS Information Retrieval Platform - View Paper

Journal of Machine Learning Research

Volume 30, 2013, Pages 228-251

Information complexity in bandit subset selection

Authors (2): Kaufmann, Emilie a,b; Kalyanakrishnan, Shivaram c

a INSTITUT MINES TELECOM (France)

b TELECOM PARISTECH (France)

c YAHOO RESEARCH (United States)

Author keywords

KL divergence; Stochastic multi armed bandits; Subset selection

Indexed keywords

ARTIFICIAL INTELLIGENCE; SOFTWARE ENGINEERING;

ADAPTIVE SAMPLING; CHERNOFF INFORMATION; CONFIDENCE INTERVAL; INFORMATION COMPLEXITY; KL-DIVERGENCE; MULTI ARMED BANDIT; SUBSET SELECTION;

STOCHASTIC SYSTEMS;

EID: 84898028877 PISSN: 15324435 EISSN: 15337928 Source Type: Journal
DOI: None Document Type: Conference Paper

Times cited : (155)

References (20)

1
- 84864970677
- Best arm identification in multi-armed bandits
- J-Y. Audibert, S. Bubeck, and R. Munos. Best arm identification in multi-armed bandits. In Conference on Learning Theory (COLT), 2010.
- (2010) Conference on Learning Theory (COLT)
- Audibert, J.-Y.¹ Bubeck, S.² Munos, R.³

2
- 0036568025
- Finite-time analysis of the multiarmed bandit problem
- P. Auer, N. Cesa-Bianchi, and P. Fischer. Finite-time analysis of the multiarmed bandit problem. Machine Learning, 47(2): 235-256, 2002.
- (2002) Machine Learning , vol.47 , Issue.2 , pp. 235-256
- Auer, P.¹ Cesa-Bianchi, N.² Fischer, P.³

3
- 79952624396
- Pure exploration in finitely armed and continuous armed bandits
- S. Bubeck, R. Munos, and G. Stoltz. Pure exploration in finitely armed and continuous armed bandits. Theoretical Computer Science, 412: 1832-1852, 2011.
- (2011) Theoretical Computer Science , vol.412 , pp. 1832-1852
- Bubeck, S.¹ Munos, R.² Stoltz, G.³

4
- 84897498871
- Multiple identifications in multi-armed bandits
- To appear
- S. Bubeck, T. Wang, and N. Viswanathan. Multiple identifications in multi-armed bandits. In International Conference on Machine Learning (ICML). To appear, 2013.
- (2013) International Conference on Machine Learning (ICML)
- Bubeck, S.¹ Wang, T.² Viswanathan, N.³

5
- 84898949562
- Kullback-Leibler upper confidence bounds for optimal sequential allocation
- to appear
- O. Cappé, A. Garivier, O-A. Maillard, R. Munos, and G. Stoltz. Kullback-Leibler upper confidence bounds for optimal sequential allocation. to appear in Annals of Statistics, 2013.
- (2013) Annals of Statistics
- Cappé, O.¹ Garivier, A.² Maillard, O.-A.³ Munos, R.⁴ Stoltz, G.⁵

6
- 84889281816
- Elements of Information Theory (2nd Edition). Wiley
- T. Cover and J. Thomas. Elements of Information Theory (2nd Edition). Wiley, 2006.
- (2006) Elements of Information Theory
- Cover, T.¹ Thomas, J.²

7
- 33745295134
- Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems
- E. Even-Dar, S. Mannor, and Y. Mansour. Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems. Journal of Machine Learning Research, 7: 1079-1105, 2006.
- (2006) Journal of Machine Learning Research , vol.7 , pp. 1079-1105
- Even-Dar, E.¹ Mannor, S.² Mansour, Y.³

8
- 84877730309
- Best arm identification: A unified approach to fixed budget and fixed confidence
- V. Gabillon, M. Ghavamzadeh, and A. Lazaric. Best arm identification: A unified approach to fixed budget and fixed confidence. In Neural Information Processing Systems (NIPS), 2012.
- (2012) Neural Information Processing Systems (NIPS)
- Gabillon, V.¹ Ghavamzadeh, M.² Lazaric, A.³

9
- 84863920694
- The KL-UCB algorithm for bounded stochastic bandits and beyond
- A. Garivier and O. Cappé. The KL-UCB algorithm for bounded stochastic bandits and beyond. In Conference on Learning Theory (COLT), 2011.
- (2011) Conference on Learning Theory (COLT)
- Garivier, A.¹ Cappé, O.²

10
- 71149099672
- Hoeffding and Bernstein races for selecting policies in evolutionary direct policy search
- V. Heidrich-Meisner and C. Igel. Hoeffding and Bernstein races for selecting policies in evolutionary direct policy search. In International Conference on Machine Learning (ICML), 2009.
- (2009) International Conference on Machine Learning (ICML)
- Heidrich-Meisner, V.¹ Igel, C.²

11
- 84867121052
- PhD thesis, Department of Computer Science, The University of Texas at Austin
- S. Kalyanakrishnan. Learning Methods for Sequential Decision Making with Imperfect Representations. PhD thesis, Department of Computer Science, The University of Texas at Austin, 2011.
- (2011) Learning Methods for Sequential Decision Making with Imperfect Representations
- Kalyanakrishnan, S.¹

12
- 77956526578
- Efficient selection in multiple bandit arms: Theory and practice
- S. Kalyanakrishnan and P. Stone. Efficient selection in multiple bandit arms: Theory and practice. In International Conference on Machine Learning (ICML), 2010.
- (2010) International Conference on Machine Learning (ICML)
- Kalyanakrishnan, S.¹ Stone, P.²

13
- 84867131498
- PAC subset selection in stochastic multi-armed bandits
- S. Kalyanakrishnan, A. Tewari, P. Auer, and P. Stone. PAC subset selection in stochastic multi-armed bandits. In International Conference on Machine Learning (ICML), 2012.
- (2012) International Conference on Machine Learning (ICML)
- Kalyanakrishnan, S.¹ Tewari, A.² Auer, P.³ Stone, P.⁴

14
- 84887459202
- Thompson sampling: An asymptotically optimal finite-time analysis
- E. Kaufmann, N. Korda, and R. Munos. Thompson sampling: an asymptotically optimal finite-time analysis. In Algorithmic Learning Theory (ALT), 2012.
- (2012) Algorithmic Learning Theory (ALT)
- Kaufmann, E.¹ Korda, N.² Munos, R.³

15
- 0002899547
- Asymptotically efficient adaptive allocation rules
- T.L. Lai and H. Robbins. Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics, 6(1): 4-22, 1985.
- (1985) Advances in Applied Mathematics , vol.6 , Issue.1 , pp. 4-22
- Lai, T.L.¹ Robbins, H.²

16
- 84874038864
- A finite-time analysis of multi-armed bandits problems with Kullback-Leibler divergences
- O-A. Maillard, R. Munos, and G. Stoltz. A finite-time analysis of multi-armed bandits problems with Kullback-Leibler divergences. In Conference On Learning Theory (COLT), 2011.
- (2011) Conference on Learning Theory (COLT)
- Maillard, O.-A.¹ Munos, R.² Stoltz, G.³

17
- 30044441333
- The sample complexity of exploration in the multi-armed bandit problem
- S. Mannor and J. Tsitsiklis. The sample complexity of exploration in the multi-armed bandit problem. Journal of Machine Learning Research, pages 623-648, 2004.
- (2004) Journal of Machine Learning Research , pp. 623-648
- Mannor, S.¹ Tsitsiklis, J.²

18
- 0031069121
- The racing algorithm: Model selection for lazy learners
- O. Maron and A. Moore. The racing algorithm: Model selection for lazy learners. Artificial Intelligence Review, 11(1-5): 113-131, 1997.
- (1997) Artificial Intelligence Review , vol.11 , Issue.1-5 , pp. 113-131
- Maron, O.¹ Moore, A.²

19
- 56449108844
- Empirical Bernstein stopping
- V. Mnih, C. Szepesvári, and J.-Y. Audibert. Empirical Bernstein stopping. In International Conference on Machine Learning (ICML), 2008.
- (2008) International Conference on Machine Learning (ICML)
- Mnih, V.¹ Szepesvári, C.² Audibert, J.-Y.³

20
- 0001395850
- On the likelihood that one unknown probability exceeds another in view of the evidence of two samples
- W.R. Thompson. On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. Biometrika, 25: 285-294, 1933.
- (1933) Biometrika , vol.25 , pp. 285-294
- Thompson, W.R.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.