SCOPUS Information Search Platform

Journal of Machine Learning Research

Volume 12, 2011, Pages 1655-1695

X-armed bandits

Authors (4): Bubeck, Sébastien (a); Munos, Rémi (b); Stoltz, Gilles (c, e); Szepesvári, Csaba (d)

a CENTRE DE RECERCA MATEMÀTICA (Spain)

b INRIA (France)

c ECOLE NORMALE SUPÉRIEURE (France)

d UNIVERSITY OF ALBERTA (Canada)

e HEC PARIS (France)

Author keywords

Bandits with infinitely many arms; Minimax rates; Optimistic online optimization; Regret bounds

Indexed keywords

BANDITS WITH INFINITELY MANY ARMS; DECISION MAKERS; DISSIMILARITY FUNCTION; EUCLIDEAN SPACES; FINITE NUMBER; GLOBAL MAXIMUM; HYPERCUBE; LARGE CLASS; LIPSCHITZ; MEASURABLE SPACE; MINIMAX; ONLINE OPTIMIZATION; OPTIMALITY; REGRET BOUNDS; TIME STEP

OPTIMIZATION

COMPUTATIONAL COMPLEXITY

EID: 79960128338; PISSN: 15324435; EISSN: 15337928; Source Type: Journal
DOI: None; Document Type: Article

Times cited: 414

References (28)

1
- 84898063697
- Competing in the dark: An efficient algorithm for bandit linear optimization
- Omnipress
- J. Abernethy, E. Hazan, and A. Rakhlin. Competing in the dark: an efficient algorithm for bandit linear optimization. In Proceedings of the 21st International Conference on Learning Theory. Omnipress, 2008.
- (2008) Proceedings of the 21st International Conference on Learning Theory
- Abernethy, J.¹ Hazan, E.² Rakhlin, A.³

2
- 0000616723
- Sample mean based index policies with O(log n) regret for the multi-armed bandit problem
- R. Agrawal. Sample mean based index policies with O(log n) regret for the multi-armed bandit problem. Advances in Applied Probability, 27:1054-1078, 1995a.
- (1995) Advances in Applied Probability , vol.27 , pp. 1054-1078
- Agrawal, R.¹

3
- 0345224411
- The continuum-armed bandit problem
- R. Agrawal. The continuum-armed bandit problem. SIAM Journal on Control and Optimization, 33:1926-1951, 1995b.
- (1995) SIAM Journal on Control and Optimization , vol.33 , pp. 1926-1951
- Agrawal, R.¹

4
- 0036568025
- Finite-time analysis of the multiarmed bandit problem
- DOI 10.1023/A:1013689704352, Computational Learning Theory
- P. Auer, N. Cesa-Bianchi, and P. Fischer. Finite-time analysis of the multiarmed bandit problem. Machine Learning Journal, 47(2-3):235-256, 2002a. (Pubitemid 34126111)
- (2002) Machine Learning , vol.47 , Issue.2-3 , pp. 235-256
- Auer, P.¹ Cesa-Bianchi, N.² Fischer, P.³

5
- 0037709910
- The non-stochastic multi-armed bandit problem
- P. Auer, N. Cesa-Bianchi, Y. Freund, and R. Schapire. The non-stochastic multi-armed bandit problem. SIAM Journal on Computing, 32(1):48-77, 2002b.
- (2002) SIAM Journal on Computing , vol.32 , Issue.1 , pp. 48-77
- Auer, P.¹ Cesa-Bianchi, N.² Freund, Y.³ Schapire, R.⁴

6
- 38049040954
- Improved rates for the stochastic continuum-armed bandit problem
- P. Auer, R. Ortner, and C. Szepesvári. Improved rates for the stochastic continuum-armed bandit problem. In Proceedings of the 20th Conference on Learning Theory, pages 454-468, 2007.
- (2007) Proceedings of the 20th Conference on Learning Theory , pp. 454-468
- Auer, P.¹ Ortner, R.² Szepesvári, C.³

7
- 84888141227
- Open loop optimistic planning
- Omnipress
- S. Bubeck and R. Munos. Open loop optimistic planning. In Proceedings of the 23rd International Conference on Learning Theory. Omnipress, 2010.
- (2010) Proceedings of the 23rd International Conference on Learning Theory
- Bubeck, S.¹ Munos, R.²

8
- 77952027689
- Online optimization in X-armed bandits
- D. Koller, D. Schuurmans, Y. Bengio, and L. Bottou, editors
- S. Bubeck, R. Munos, G. Stoltz, and Cs. Szepesvári. Online optimization in X-armed bandits. In D. Koller, D. Schuurmans, Y. Bengio, and L. Bottou, editors, Advances in Neural Information Processing Systems 21, pages 201-208, 2009.
- (2009) Advances in Neural Information Processing Systems , vol.21 , pp. 201-208
- Bubeck, S.¹ Munos, R.² Stoltz, G.³ Szepesvári, Cs.⁴

9
- 79952624396
- Pure exploration in multi-armed bandits problems
- S. Bubeck, R. Munos, and G. Stoltz. Pure exploration in multi-armed bandits problems. Theoretical Computer Science, 412:1832-1852, 2011.
- (2011) Theoretical Computer Science , vol.412 , pp. 1832-1852
- Bubeck, S.¹ Munos, R.² Stoltz, G.³

10
- 84926078662
- Cambridge University Press
- N. Cesa-Bianchi and G. Lugosi. Prediction, Learning, and Games. Cambridge University Press, 2006.
- (2006) Prediction, Learning, and Games
- Cesa-Bianchi, N.¹ Lugosi, G.²

11
- 55249127519
- Progressive strategies for Monte-Carlo tree search
- G.M.J. Chaslot, M.H.M. Winands, H. Herik, J. Uiterwijk, and B. Bouzy. Progressive strategies for Monte-Carlo tree search. New Mathematics and Natural Computation, 4(3):343-357, 2008.
- (2008) New Mathematics and Natural Computation , vol.4 , Issue.3 , pp. 343-357
- Chaslot, G.M.J.¹ Winands, M.H.M.² Herik, H.³ Uiterwijk, J.⁴ Bouzy, B.⁵

12
- 67649577204
- Regret and convergence bounds for immediate-reward reinforcement learning with continuous action spaces
- E. Cope. Regret and convergence bounds for immediate-reward reinforcement learning with continuous action spaces. IEEE Transactions on Automatic Control, 54(6):1243-1253, 2009.
- (2009) IEEE Transactions on Automatic Control , vol.54 , Issue.6 , pp. 1243-1253
- Cope, E.¹

13
- 70349275222
- Bandit algorithms for tree search
- P.-A. Coquelin and R. Munos. Bandit algorithms for tree search. In Proceedings of the 23rd Conference on Uncertainty in Artificial Intelligence, pages 67-74, 2007.
- (2007) Proceedings of the 23rd Conference on Uncertainty in Artificial Intelligence , pp. 67-74
- Coquelin, P.-A.¹ Munos, R.²

14
- 0003954462
- John Wiley & Sons
- J. L. Doob. Stochastic Processes. John Wiley & Sons, 1953.
- (1953) Stochastic Processes
- Doob, J.L.¹

15
- 57749181518
- Simulation-based approach to general game playing
- H. Finnsson and Y. Björnsson. Simulation-based approach to general game playing. In Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence, pages 259-264, 2008.
- (2008) Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence , pp. 259-264
- Finnsson, H.¹ Björnsson, Y.²

16
- 34547990649
- Combining online and offline knowledge in UCT
- ACM New York, NY, USA
- S. Gelly and D. Silver. Combining online and offline knowledge in UCT. In Proceedings of the 24th international conference on Machine learning, pages 273-280. ACM New York, NY, USA, 2007.
- (2007) Proceedings of the 24th International Conference on Machine Learning , pp. 273-280
- Gelly, S.¹ Silver, D.²

17
- 57749091602
- Achieving master level play in 9×9 computer Go
- S. Gelly and D. Silver. Achieving master level play in 9×9 computer Go. In Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence, pages 1537-1540, 2008.
- (2008) Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence , pp. 1537-1540
- Gelly, S.¹ Silver, D.²

18
- 34250659969
- Technical Report RR-6062 INRIA
- S. Gelly, Y. Wang, R. Munos, and O. Teytaud. Modification of UCT with patterns in Monte-Carlo Go. Technical Report RR-6062, INRIA, 2006.
- (2006) Modification of UCT with Patterns in Monte-Carlo Go
- Gelly, S.¹ Wang, Y.² Munos, R.³ Teytaud, O.⁴

19
- 84891584370
- Wiley-Interscience Series in Systems and Optimization. Wiley, Chichester, NY
- J. C. Gittins. Multi-armed Bandit Allocation Indices. Wiley-Interscience Series in Systems and Optimization. Wiley, Chichester, NY, 1989.
- (1989) Multi-armed Bandit Allocation Indices
- Gittins, J.C.¹

20
- 84947403595
- Probability inequalities for sums of bounded random variables
- W. Hoeffding. Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 58:13-30, 1963.
- (1963) Journal of the American Statistical Association , vol.58 , pp. 13-30
- Hoeffding, W.¹

21
- 38049011420
- Nearly tight bounds for the continuum-armed bandit problem
- R. Kleinberg. Nearly tight bounds for the continuum-armed bandit problem. In Advances in Neural Information Processing Systems 18, 2004.
- (2004) Advances in Neural Information Processing Systems , vol.18
- Kleinberg, R.¹

22
- 57049185311
- Multi-armed bandits in metric spaces
- R. Kleinberg, A. Slivkins, and E. Upfal. Multi-armed bandits in metric spaces. In Proceedings of the 40th ACM Symposium on Theory of Computing, 2008a.
- (2008) Proceedings of the 40th ACM Symposium on Theory of Computing
- Kleinberg, R.¹ Slivkins, A.² Upfal, E.³

23
- 79960116894
- September
- R. Kleinberg, A. Slivkins, and E. Upfal. Multi-armed bandits in metric spaces, September 2008b. URL http://arxiv.org/abs/0809.4882.
- (2008) Multi-armed Bandits in Metric Spaces
- Kleinberg, R.¹ Slivkins, A.² Upfal, E.³

24
- 33750293964
- Bandit based Monte-Carlo planning
- Machine Learning: ECML 2006 - 17th European Conference on Machine Learning, Proceedings
- L. Kocsis and Cs. Szepesvári. Bandit based Monte-Carlo planning. In Proceedings of the 17th European Conference on Machine Learning, pages 282-293, 2006. (Pubitemid 44618839)
- (2006) Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) , vol.4212 , pp. 282-293
- Kocsis, L.¹ Szepesvári, C.²

25
- 0002899547
- Asymptotically efficient adaptive allocation rules
- T. L. Lai and H. Robbins. Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics, 6:4-22, 1985.
- (1985) Advances in Applied Mathematics , vol.6 , pp. 4-22
- Lai, T.L.¹ Robbins, H.²

26
- 84966203785
- Some aspects of the sequential design of experiments
- H. Robbins. Some aspects of the sequential design of experiments. Bulletin of the American Mathematical Society, 58:527-535, 1952.
- (1952) Bulletin of the American Mathematical Society , vol.58 , pp. 527-535
- Robbins, H.¹

27
- 78649814982
- Addressing NP-complete puzzles with Monte-Carlo methods
- The Society for the study of Artificial Intelligence and Simulation of Behaviour
- M.P.D. Schadd, M.H.M. Winands, H.J. van den Herik, and H. Aldewereld. Addressing NP-complete puzzles with Monte-Carlo methods. In Proceedings of the AISB 2008 Symposium on Logic and the Simulation of Interaction and Reasoning, volume 9, pages 55-61. The Society for the study of Artificial Intelligence and Simulation of Behaviour, 2008.
- (2008) Proceedings of the AISB 2008 Symposium on Logic and the Simulation of Interaction and Reasoning , vol.9 , pp. 55-61
- Schadd, M.P.D.¹ Winands, M.H.M.² Van Den Herik, H.J.³ Aldewereld, H.⁴

28
- 84862270167
- How powerful can any regression learning procedure be?
- Y. Yang. How powerful can any regression learning procedure be? In Proceedings of the 11th International Conference on Artificial Intelligence and Statistics, volume 2, pages 636-643, 2007.
- (2007) Proceedings of the 11th International Conference on Artificial Intelligence and Statistics , vol.2 , pp. 636-643
- Yang, Y.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.