SCOPUS Information Search Platform

Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

Volume , Issue , 2011, Pages 204-212

Learning to trade off between exploration and exploitation in multiclass bandit prediction

(3) Valizadegan, Hamed a; Jin, Rong b; Wang, Shijun c

a Learning Research and Development Ctr (United States)

b Michigan State University (United States)

c National Institutes of Health (United States)

Author keywords

Bandit feedback; Exploration vs. exploitation; Multi class classification; Online learning

Indexed keywords

DATA MINING; E-LEARNING; LEARNING ALGORITHMS; LEARNING SYSTEMS; PARAMETER ESTIMATION;

BANDIT FEEDBACKS; EXPLOITATION; EXPLORATION AND EXPLOITATION; EXPLORATION VS.; LEARNING PROBLEM; MULTI-CLASS CLASSIFICATION; ONLINE LEARNING; PARTIAL FEEDBACK; TRADE OFF; TRADEOFF PARAMETERS;

ECONOMIC AND SOCIAL EFFECTS;

EID: 80052674910
PISSN: None
EISSN: None
Source Type: Conference Proceeding
DOI: 10.1145/2020408.2020445
Document Type: Conference Paper

Times cited : (12)

References (19)

1
- 0036568025
- Finite-time analysis of the multiarmed bandit problem
- DOI 10.1023/A:1013689704352, Computational Learning Theory
- Peter Auer, Nicolò Cesa-Bianchi, and Paul Fischer. Finite-time analysis of the multiarmed bandit problem. Machine Learning, 47(2-3):235-256, 2002. (Pubitemid 34126111)
- (2002) Machine Learning , vol.47 , Issue.2-3 , pp. 235-256
- Auer, P.¹ Cesa-Bianchi, N.² Fischer, P.³

2
- 80052659095
- An optimal high probability algorithm for the contextual bandit problem
- abs/1002.4058
- Alina Beygelzimer, John Langford, Lihong Li, Lev Reyzin, and Robert E. Schapire. An optimal high probability algorithm for the contextual bandit problem. Computational Research Repository, abs/1002.4058, 2010.
- (2010) Computational Research Repository
- Beygelzimer, A.¹ Langford, J.² Li, L.³ Reyzin, L.⁴ Schapire, R.E.⁵

3
- 84926078662
- Cambridge Univ Pr
- N. Cesa-Bianchi and G. Lugosi. Prediction, learning, and games. Cambridge Univ Pr, 2006.
- (2006) Prediction, Learning, and Games
- Cesa-Bianchi, N.¹ Lugosi, G.²

4
- 0003710380
- Chih-Chung Chang and Chih-Jen Lin. LIBSVM: a library for support vector machines, 2001.
- (2001) LIBSVM: A Library for Support Vector Machines
- Chang, C.-C.¹ Lin, C.-J.²

5
- 84937398609
- PAC bounds for multi-armed bandit and Markov decision processes
- Eyal Even-Dar, Shie Mannor, and Yishay Mansour. PAC bounds for multi-armed bandit and Markov decision processes. In COLT'02: Proceedings of the 15th Annual Conference on Computational Learning Theory, pages 255-270, 2002.
- (2002) COLT'02: Proceedings of the 15th Annual Conference on Computational Learning Theory , pp. 255-270
- Even-Dar, E.¹ Mannor, S.² Mansour, Y.³

6
- 33745295134
- Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems
- Eyal Even-Dar, Shie Mannor, and Yishay Mansour. Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems. Journal of Machine Learning Research, 7:1079-1105, 2006. (Pubitemid 43938989)
- (2006) Journal of Machine Learning Research , vol.7 , pp. 1079-1105
- Even-Dar, E.¹ Mannor, S.² Mansour, Y.³

7
- 78649934709
- A. Frank and A. Asuncion. UCI machine learning repository, 2010.
- (2010) UCI Machine Learning Repository
- Frank, A.¹ Asuncion, A.²

8
- 56449104477
- Efficient bandit algorithms for online multiclass prediction
- Sham M. Kakade, Shai Shalev-Shwartz, and Ambuj Tewari. Efficient bandit algorithms for online multiclass prediction. In ICML 2008: Proceedings of the 25th international conference on Machine learning, pages 440-447, 2008.
- (2008) ICML 2008: Proceedings of the 25th International Conference on Machine Learning , pp. 440-447
- Kakade, S.M.¹ Shalev-Shwartz, S.² Tewari, A.³

9
- 83055177001
- The epoch-greedy algorithm for contextual multi-armed bandits
- John Langford and Tong Zhang. The epoch-greedy algorithm for contextual multi-armed bandits. In NIPS 2007: Proceedings of the 20th Annual Conference on Neural Information Processing Systems, 2007.
- (2007) NIPS 2007: Proceedings of the 20th Annual Conference on Neural Information Processing Systems
- Langford, J.¹ Zhang, T.²

10
- 84876811202
- RCV1: A new benchmark collection for text categorization research
- D. D. Lewis, Y. Yang, T. Rose, and F. Li. RCV1: A new benchmark collection for text categorization research. Journal of Machine Learning Research, 5:361-397, 2004.
- (2004) Journal of Machine Learning Research , vol.5 , pp. 361-397
- Lewis, D.D.¹ Yang, Y.² Rose, T.³ Li, F.⁴

11
- 77954641643
- A contextual-bandit approach to personalized news article recommendation
- New York, NY, USA, ACM
- Lihong Li, Wei Chu, John Langford, and Robert E. Schapire. A contextual-bandit approach to personalized news article recommendation. In WWW'10: Proceedings of the 19th international conference on World wide web, pages 661-670, New York, NY, USA, 2010. ACM.
- (2010) WWW'10: Proceedings of the 19th International Conference on World Wide Web , pp. 661-670
- Li, L.¹ Chu, W.² Langford, J.³ Schapire, R.E.⁴

12
- 77956210502
- Exploitation and exploration in a performance based contextual advertising system
- Wei Li, Xuerui Wang, Ruofei Zhang, Ying Cui, Jianchang Mao, and Rong Jin. Exploitation and exploration in a performance based contextual advertising system. In KDD 2010: Knowledge Discovery and Data Mining, pages 27-36, 2010.
- (2010) KDD 2010: Knowledge Discovery and Data Mining , pp. 27-36
- Li, W.¹ Wang, X.² Zhang, R.³ Cui, Y.⁴ Mao, J.⁵ Jin, R.⁶

13
- 30044441333
- The sample complexity of exploration in the multi-armed bandit problem
- Shie Mannor and John N. Tsitsiklis. The sample complexity of exploration in the multi-armed bandit problem. Journal of Machine Learning Research, 5:623-648, 2004.
- (2004) Journal of Machine Learning Research , vol.5 , pp. 623-648
- Mannor, S.¹ Tsitsiklis, J.N.²

14
- 84966203785
- Some aspects of the sequential design of experiments
- Herbert Robbins. Some aspects of the sequential design of experiments. Bulletin of the American Mathematical Society, 58:527-535, 1952.
- (1952) Bulletin of the American Mathematical Society , vol.58 , pp. 527-535
- Robbins, H.¹

15
- 84966203785
- Some aspects of the sequential design of experiments
- Herbert Robbins. Some aspects of the sequential design of experiments. Bull. Amer. Math. Soc., 58(5):527-535, 1952.
- (1952) Bull. Amer. Math. Soc. , vol.58 , Issue.5 , pp. 527-535
- Robbins, H.¹

16
- 11144273669
- The perceptron: A probabilistic model for information storage and organization in the brain
- F. Rosenblatt. The perceptron: a probabilistic model for information storage and organization in the brain. Psychological review, 65:386-408, 1958.
- (1958) Psychological Review , vol.65 , pp. 386-408
- Rosenblatt, F.¹

17
- 33646406807
- Multi-armed bandit algorithms and empirical evaluation
- Springer
- Joannès Vermorel and Mehryar Mohri. Multi-armed bandit algorithms and empirical evaluation. In European Conference on Machine Learning, pages 437-448. Springer, 2005.
- (2005) European Conference on Machine Learning , pp. 437-448
- Vermorel, J.¹ Mohri, M.²

18
- 80052680363
- A potential-based framework for online multi-class learning with partial feedback
- Shijun Wang, Rong Jin, and Hamed Valizadegan. A potential-based framework for online multi-class learning with partial feedback. In AISTATS 2010: Artificial Intelligence and Statistics, 2010.
- (2010) AISTATS 2010: Artificial Intelligence and Statistics
- Wang, S.¹ Jin, R.² Valizadegan, H.³

19
- 0004049893
- PhD thesis, Cambridge
- C. Watkins. Learning from Delayed Rewards. PhD thesis, Cambridge, 1989.
- (1989) Learning from Delayed Rewards
- Watkins, C.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.