SCOPUS Information Search Platform

Volume , Issue , 2011, Pages 1-10

Graphical models for bandit problems

Author keywords

[No Author keywords available]

Indexed keywords

ARTIFICIAL INTELLIGENCE; GRAPH ALGORITHMS; PROBABILITY;

ACTION SPACES; BANDIT PROBLEMS; CONTEXT SPACES; GRAPHICAL MODEL; MULTI-ARMED BANDIT PROBLEM; RUNNING TIME; TREE-WIDTH;

GRAPHIC METHODS;

EID: 80053151768 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (12)

References (13)

1
- 0037709910
- The nonstochastic multiarmed bandit problem
- Peter Auer, Nicolò Cesa-Bianchi, Yoav Freund, and Robert E. Schapire. The nonstochastic multiarmed bandit problem. SIAM Journal of Computing, 32 (1):48-77, 2002.
- (2002) SIAM Journal of Computing , vol.32 , Issue.1 , pp. 48-77
- Auer, P.¹ Cesa-Bianchi, N.² Freund, Y.³ Schapire, R.E.⁴

2
- 77956144722
- The epoch-greedy algorithm for contextual multi-armed bandits
- John Langford and Tong Zhang. The epoch-greedy algorithm for contextual multi-armed bandits. In Advances in Neural Information Processing Systems 20, 2007.
- (2007) Advances in Neural Information Processing Systems , vol.20
- Langford, J.¹ Zhang, T.²

3
- 80053144086
- Contextual bandit algorithms with supervised learning guarantees
- Alina Beygelzimer, John Langford, Lihong Li, Lev Reyzin, and Robert E. Schapire. Contextual bandit algorithms with supervised learning guarantees. In Proceedings of the 14th International Conference on Artificial Intelligence and Statistics, 2011.
- (2011) Proceedings of the 14th International Conference on Artificial Intelligence and Statistics
- Beygelzimer, A.¹ Langford, J.² Li, L.³ Reyzin, L.⁴ Schapire, R.E.⁵

4
- 0344118814
- Reinforcement learning with immediate rewards and linear hypotheses
- Naoki Abe, Alan W. Biermann, and Philip M. Long. Reinforcement learning with immediate rewards and linear hypotheses. Algorithmica, 37:263-293, 2003.
- (2003) Algorithmica , vol.37 , pp. 263-293
- Abe, N.¹ Biermann, A.W.² Long, P.M.³

5
- 0041966002
- Using confidence bounds for exploitation-exploration trade-offs
- Peter Auer. Using confidence bounds for exploitation-exploration trade-offs. Journal of Machine Learning Research, 3:397-422, 2002.
- (2002) Journal of Machine Learning Research , vol.3 , pp. 397-422
- Auer, P.¹

6
- 77954641643
- A contextual-bandit approach to personalized news article recommendation
- Lihong Li, Wei Chu, John Langford, and Robert E. Schapire. A contextual-bandit approach to personalized news article recommendation. In Proceedings of the 19th International World Wide Web Conference, 2010.
- (2010) Proceedings of the 19th International World Wide Web Conference
- Li, L.¹ Chu, W.² Langford, J.³ Schapire, R.E.⁴

7
- 84892931731
- Contextual bandits with similarity information
- Aleksandrs Slivkins. Contextual bandits with similarity information. In Proceedings of the 24th Annual Conference on Computational Learning Theory, 2011.
- (2011) Proceedings of the 24th Annual Conference on Computational Learning Theory
- Slivkins, A.¹

8
- 84898452145
- Contextual multi-armed bandits
- Tyler Lu, David Pal, and Martin Pal. Contextual multi-armed bandits. In Proceedings of the 13th International Conference on Artificial Intelligence and Statistics, 2010.
- (2010) Proceedings of the 13th International Conference on Artificial Intelligence and Statistics
- Lu, T.¹ Pal, D.² Pal, M.³

9
- 56449122733
- Knows what it knows: A framework for self-aware learning
- Lihong Li, Michael L Littman, and Thomas J Walsh. Knows what it knows: A framework for self-aware learning. In Proceedings of the 27th International Conference on Machine Learning, 2008.
- (2008) Proceedings of the 27th International Conference on Machine Learning
- Li, L.¹ Littman, M.L.² Walsh, T.J.³

10
- 0001051761
- On the computational complexity of Ising spin glass models
- Francisco Barahona. On the computational complexity of Ising spin glass models. Journal of Physics A: Mathematical, Nuclear and General, 15:3241-3253, 1982.
- (1982) Journal of Physics A: Mathematical, Nuclear and General , vol.15 , pp. 3241-3253
- Barahona, F.¹

11
- 85162058047
- Online linear regression and its application to model-based reinforcement learning
- Alexander Strehl and Michael L Littman. Online linear regression and its application to model-based reinforcement learning. In Advances in Neural Information Processing Systems 20, 2007.
- (2007) Advances in Neural Information Processing Systems , vol.20
- Strehl, A.¹ Littman, M.L.²

12
- 71149102767
- Robust bounds for classification via selective sampling
- Nicolò Cesa-Bianchi, Claudio Gentile, and Francesco Orabona. Robust bounds for classification via selective sampling. In Proceedings of the 28th International Conference on Machine Learning, 2009.
- (2009) Proceedings of the 28th International Conference on Machine Learning
- Cesa-Bianchi, N.¹ Gentile, C.² Orabona, F.³

13
- 84898768231
- An efficient bandit algorithm for √T-regret in online multiclass prediction?
- Jacob Abernethy and Alexander Rakhlin. An efficient bandit algorithm for √T-regret in online multiclass prediction? In Proceedings of the 22nd Annual Conference on Computational Learning Theory, 2009.
- (2009) Proceedings of the 22nd Annual Conference on Computational Learning Theory
- Abernethy, J.¹ Rakhlin, A.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.