SCOPUS Information Search Platform

Journal of Machine Learning Research

Volume 9, 2010, Pages 485-492

Contextual multi-armed bandits

(3) Lu, Tyler a; Pál, Dávid b; Pál, Martin c

a UNIVERSITY OF TORONTO (Canada)

b UNIVERSITY OF ALBERTA (Canada)

c GOOGLE INC (United States)

Author keywords

[No Author keywords available]

Indexed keywords

BOUNDED SUBSET; CLICK-THROUGH RATE; CONTEXT-FREE; EUCLIDEAN SPACES; INTERNET SEARCH ENGINE; LIPSCHITZ; LIPSCHITZ CONDITIONS; LOWER BOUNDS; METRIC SPACES; MULTI ARMED BANDIT; MULTI-ARMED BANDIT PROBLEM; ON-LINE ALGORITHMS; PACKING DIMENSION; PAYOFF FUNCTION; SEARCH QUERIES; SIDE INFORMATION; UPPER AND LOWER BOUNDS; WEB SEARCHES;

ALGORITHMS; ARTIFICIAL INTELLIGENCE; INFORMATION RETRIEVAL; SEARCH ENGINES; SET THEORY; TOPOLOGY; WORLD WIDE WEB;

STATISTICS;

EID: 84862301554 PISSN: 15324435 EISSN: 15337928 Source Type: Journal
DOI: None Document Type: Conference Paper

Times cited: (192)

References (20)

1
- 0345224411
- The continuum-armed bandit problem
- R. Agrawal. The continuum-armed bandit problem. SIAM J. Control and Optimization, 33:1926-1951, 1995.
- (1995) SIAM J. Control and Optimization , vol.33 , pp. 1926-1951
- Agrawal, R.¹

2
- 56449090814
- Logarithmic online regret bounds for undiscounted reinforcement learning
- MIT Press
- Peter Auer and Ronald Ortner. Logarithmic online regret bounds for undiscounted reinforcement learning. In Advances in Neural Information Processing Systems 19, (NIPS 2007), pages 49-56. MIT Press, 2007.
- (2007) Advances in Neural Information Processing Systems 19, (NIPS 2007) , pp. 49-56
- Auer, P.¹ Ortner, R.²

3
- 0036568025
- Finite-time analysis of the multiarmed bandit problem
- Peter Auer, Nicolò Cesa-Bianchi, and Paul Fischer. Finite-time analysis of the multiarmed bandit problem. Machine Learning, 47(2-3):235-256, 2002.
- (2002) Machine Learning , vol.47 , Issue.2-3 , pp. 235-256
- Auer, P.¹ Cesa-Bianchi, N.² Fischer, P.³

4
- 0037709910
- The nonstochastic multiarmed bandit problem
- Peter Auer, Nicolò Cesa-Bianchi, Yoav Freund, and Robert E. Schapire. The nonstochastic multiarmed bandit problem. SIAM Journal on Computing, 32(1):48-77, 2003.
- (2003) SIAM Journal on Computing , vol.32 , Issue.1 , pp. 48-77
- Auer, P.¹ Cesa-Bianchi, N.² Freund, Y.³ Schapire, R.E.⁴

5
- 38049040954
- Improved rates for the stochastic continuum-armed bandit problem
- Springer
- Peter Auer, Ronald Ortner, and Csaba Szepesvári. Improved rates for the stochastic continuum-armed bandit problem. In Proceedings of the 20th Annual Conference on Learning Theory, (COLT 2007), pages 454-468. Springer, 2007.
- (2007) Proceedings of the 20th Annual Conference on Learning Theory, (COLT 2007) , pp. 454-468
- Auer, P.¹ Ortner, R.² Szepesvári, C.³

6
- 77952027689
- Online optimization in X-armed bandits
- Sébastien Bubeck, Rémi Munos, Gilles Stoltz, and Csaba Szepesvári. Online optimization in X-armed bandits. In NIPS, pages 201-208, 2008.
- (2008) NIPS , pp. 201-208
- Bubeck, S.¹ Munos, R.² Stoltz, G.³ Szepesvári, C.⁴

7
- 84926078662
- Cambridge University Press
- Nicolò Cesa-Bianchi and Gábor Lugosi. Prediction, Learning, and Games. Cambridge University Press, 2006.
- (2006) Prediction, Learning, and Games
- Cesa-Bianchi, N.¹ Lugosi, G.²

8
- 0003516492
- Springer
- Luc Devroye and Gábor Lugosi. Combinatorial Methods in Density Estimation. Springer, 2001.
- (2001) Combinatorial Methods in Density Estimation
- Devroye, L.¹ Lugosi, G.²

9
- 33745295134
- Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems
- Eyal Even-Dar, Shie Mannor, and Yishay Mansour. Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems. Journal of Machine Learning Research, 7:1079-1105, 2006.
- (2006) Journal of Machine Learning Research , vol.7 , pp. 1079-1105
- Even-Dar, E.¹ Mannor, S.² Mansour, Y.³

10
- 20744454447
- Online convex optimization in the bandit setting: Gradient descent without a gradient
- Society for Industrial and Applied Mathematics Philadelphia, PA, USA
- Abraham D. Flaxman, Adam T. Kalai, and H. Brendan McMahan. Online convex optimization in the bandit setting: gradient descent without a gradient. In Proceedings of the sixteenth annual ACM-SIAM symposium on Discrete algorithms (SODA 2005), pages 385-394. Society for Industrial and Applied Mathematics Philadelphia, PA, USA, 2005.
- (2005) Proceedings of the Sixteenth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2005) , pp. 385-394
- Flaxman, A.D.¹ Kalai, A.T.² McMahan, H.B.³

11
- 77953091640
- manuscript
- Alexander Goldenshluger and Assaf Zeevi. Performance limitations in bandit problems with side observations. manuscript, 2007.
- (2007) Performance Limitations in Bandit Problems with Side Observations
- Goldenshluger, A.¹ Zeevi, A.²

12
- 84947403595
- Probability inequalities for sums of bounded random variables
- Wassily Hoeffding. Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 58(301):13-30, 1963.
- (1963) Journal of the American Statistical Association , vol.58 , Issue.301 , pp. 13-30
- Hoeffding, W.¹

13
- 84898981061
- Nearly tight bounds for the continuum-armed bandit problem
- Lawrence K. Saul, Yair Weiss, and Léon Bottou, editors. MIT Press
- Robert D. Kleinberg. Nearly tight bounds for the continuum-armed bandit problem. In Lawrence K. Saul, Yair Weiss, and Léon Bottou, editors, Advances in Neural Information Processing Systems 17, (NIPS 2005), pages 697-704. MIT Press, 2005a.
- (2005) Advances in Neural Information Processing Systems 17, (NIPS 2005) , pp. 697-704
- Kleinberg, R.D.¹

14
- 33748679987
- PhD thesis, Massachusetts Institute of Technology, June
- Robert D. Kleinberg. Online Decision Problems with Large Strategy Sets. PhD thesis, Massachusetts Institute of Technology, June 2005b.
- (2005) Online Decision Problems with Large Strategy Sets
- Kleinberg, R.D.¹

15
- 57049185311
- Multi-armed bandits in metric spaces
- Association for Computing Machinery
- Robert D. Kleinberg, Aleksandrs Slivkins, and Eli Upfal. Multi-armed bandits in metric spaces. In Proceedings of the 40th Annual ACM Symposium, STOC 2008, pages 681-690. Association for Computing Machinery, 2008.
- (2008) Proceedings of the 40th Annual ACM Symposium, STOC 2008 , pp. 681-690
- Kleinberg, R.D.¹ Slivkins, A.² Upfal, E.³

16
- 0002899547
- Asymptotically efficient adaptive allocation rules
- T. L. Lai and Herbert Robbins. Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics, 6(1):4-22, 1985.
- (1985) Advances in Applied Mathematics , vol.6 , Issue.1 , pp. 4-22
- Lai, T.L.¹ Robbins, H.²

17
- 84862283462
- Blog post, September
- John Langford. How do we get weak action dependence for learning with partial observations? Blog post: http://hunch.net/?p=421, September 2008.
- (2008) How Do We Get Weak Action Dependence for Learning with Partial Observations?
- Langford, J.¹

18
- 83055177001
- The epoch-greedy algorithm for multi-armed bandits with side information
- John Langford and Tong Zhang. The epoch-greedy algorithm for multi-armed bandits with side information. In NIPS, 2007.
- (2007) NIPS
- Langford, J.¹ Zhang, T.²

19
- 0004007508
- MIT Press
- Richard S. Sutton and Andrew G. Barto. Reinforcement Learning. MIT Press, 1998.
- (1998) Reinforcement Learning
- Sutton, R.S.¹ Barto, A.G.²

20
- 15844389867
- Bandit problems with side observations
- May
- Chih-Chun Wang, Sanjeev R. Kulkarni, and H. Vincent Poor. Bandit problems with side observations. IEEE Transactions on Automatic Control, 50(3):338-355, May 2005.
- (2005) IEEE Transactions on Automatic Control , vol.50 , Issue.3 , pp. 338-355
- Wang, C.-C.¹ Kulkarni, S.R.² Poor, H.V.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.