SCOPUS Information Retrieval Platform

Advances in Neural Information Processing Systems

Volume , Issue , 2013, Pages

Online learning with switching costs and other adaptive adversaries

(3) Cesa-Bianchi, Nicolò a; Dekel, Ofer b; Shamir, Ohad b

a UNIVERSITY OF MILAN (Italy)

b MICROSOFT RESEARCH (United States)

Author keywords

[No Author keywords available]

Indexed keywords

COSTS;

ADAPTIVE ADVERSARY; BANDIT FEEDBACKS; BOUNDED MEMORY; FULL INFORMATIONS; ONLINE LEARNING; PREDICTION WITH EXPERT ADVICE; STOCHASTIC ADVERSARY; SWITCHING COSTS;

SWITCHING;

EID: 84894413813
PISSN: 10495258
EISSN: None
Source Type: Conference Proceeding
DOI: None
Document Type: Conference Paper

Times cited: (137)

References (21)

1
- 84898073179
- Beating the adaptive bandit with high probability
- J. Abernethy and A. Rakhlin. Beating the adaptive bandit with high probability. In COLT, 2009.
- (2009) COLT
- Abernethy, J.¹ Rakhlin, A.²

2
- 0024089489
- Asymptotically efficient adaptive allocation rules for the multiarmed bandit problem with switching cost
- R. Agrawal, M.V. Hegde, and D. Teneketzis. Asymptotically efficient adaptive allocation rules for the multiarmed bandit problem with switching cost. IEEE Transactions on Automatic Control, 33(10):899-906, 1988.
- (1988) IEEE Transactions on Automatic Control , vol.33 , Issue.10 , pp. 899-906
- Agrawal, R.¹ Hegde, M.V.² Teneketzis, D.³

3
- 84867129684
- Online bandit learning against an adaptive adversary: From regret to policy regret
- R. Arora, O. Dekel, and A. Tewari. Online bandit learning against an adaptive adversary: from regret to policy regret. In Proceedings of the Twenty-Ninth International Conference on Machine Learning, 2012.
- (2012) Proceedings of the Twenty-Ninth International Conference on Machine Learning
- Arora, R.¹ Dekel, O.² Tewari, A.³

4
- 0037709910
- The nonstochastic multiarmed bandit problem
- P. Auer, N. Cesa-Bianchi, Y. Freund, and R. Schapire. The nonstochastic multiarmed bandit problem. SIAM Journal on Computing, 32(1):48-77, 2002.
- (2002) SIAM Journal on Computing , vol.32 , Issue.1 , pp. 48-77
- Auer, P.¹ Cesa-Bianchi, N.² Freund, Y.³ Schapire, R.⁴

5
- 0004134209
- Online computation and competitive analysis
- A. Borodin and R. El-Yaniv. Online computation and competitive analysis. Cambridge University Press, 1998.
- (1998) Online Computation and Competitive Analysis
- Borodin, A.¹ El-Yaniv, R.²

6
- 79960128338
- X-armed bandits
- S. Bubeck, R. Munos, G. Stoltz, and C. Szepesvári. X-armed bandits. Journal of Machine Learning Research, 12:1655-1695, 2011.
- (2011) Journal of Machine Learning Research , vol.12 , pp. 1655-1695
- Bubeck, S.¹ Munos, R.² Stoltz, G.³ Szepesvári, C.⁴

7
- 84876049382
- Regret minimization for reserve prices in second-price auctions
- N. Cesa-Bianchi, C. Gentile, and Y. Mansour. Regret minimization for reserve prices in second-price auctions. In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms (SODA13), 2013.
- (2013) Proceedings of the ACM-SIAM Symposium on Discrete Algorithms (SODA13)
- Cesa-Bianchi, N.¹ Gentile, C.² Mansour, Y.³

8
- 84926078662
- Prediction, learning, and games
- N. Cesa-Bianchi and G. Lugosi. Prediction, learning, and games. Cambridge University Press, 2006.
- (2006) Prediction, Learning, and Games
- Cesa-Bianchi, N.¹ Lugosi, G.²

9
- 33847624608
- Improved second-order bounds for prediction with expert advice
- N. Cesa-Bianchi, Y. Mansour, and G. Stoltz. Improved second-order bounds for prediction with expert advice. Machine Learning, 66(2/3):321-352, 2007.
- (2007) Machine Learning , vol.66 , Issue.2-3 , pp. 321-352
- Cesa-Bianchi, N.¹ Mansour, Y.² Stoltz, G.³

10
- 33244456637
- Robbing the bandit: Less regret in online geometric optimization against an adaptive adversary
- V. Dani and T. P. Hayes. Robbing the bandit: Less regret in online geometric optimization against an adaptive adversary. In Proceedings of the Seventeenth Annual ACM-SIAM Symposium on Discrete Algorithms, 2006.
- (2006) Proceedings of the Seventeenth Annual ACM-SIAM Symposium on Discrete Algorithms
- Dani, V.¹ Hayes, T.P.²

11
- 0031211090
- A decision-theoretic generalization of on-line learning and an application to boosting
- Y. Freund and R.E. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55(1):119-139, 1997.
- (1997) Journal of Computer and System Sciences , vol.55 , Issue.1 , pp. 119-139
- Freund, Y.¹ Schapire, R.E.²

12
- 80054798353
- Near-optimal rates for limited-delay universal lossy source coding
- A. Gyorgy and G. Neu. Near-optimal rates for limited-delay universal lossy source coding. In IEEE International Symposium on Information Theory, pages 2218-2222, 2011.
- (2011) IEEE International Symposium on Information Theory , pp. 2218-2222
- Gyorgy, A.¹ Neu, G.²

13
- 10944236938
- A survey on the bandit problem with switching costs
- T. Jun. A survey on the bandit problem with switching costs. De Economist, 152:513-541, 2004.
- (2004) De Economist , vol.152 , pp. 513-541
- Jun, T.¹

14
- 24644463787
- Efficient algorithms for online decision problems
- A. Kalai and S. Vempala. Efficient algorithms for online decision problems. Journal of Computer and System Sciences, 71:291-307, 2005.
- (2005) Journal of Computer and System Sciences , vol.71 , pp. 291-307
- Kalai, A.¹ Vempala, S.²

15
- 35148838877
- The weighted majority algorithm
- N. Littlestone and M.K. Warmuth. The weighted majority algorithm. Information and Computation, 108:212-261, 1994.
- (1994) Information and Computation , vol.108 , pp. 212-261
- Littlestone, N.¹ Warmuth, M.K.²

16
- 84995420448
- Adaptive bandits: Towards the best history-dependent strategy
- O. Maillard and R. Munos. Adaptive bandits: Towards the best history-dependent strategy. In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 2010.
- (2010) Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics
- Maillard, O.¹ Munos, R.²

17
- 24644470905
- Online geometric optimization in the bandit setting against an adaptive adversary
- H. B. McMahan and A. Blum. Online geometric optimization in the bandit setting against an adaptive adversary. In Proceedings of the Seventeenth Annual Conference on Learning Theory, 2004.
- (2004) Proceedings of the Seventeenth Annual Conference on Learning Theory
- McMahan, H.B.¹ Blum, A.²

18
- 0036649565
- Sequential strategies for loss functions with memory
- N. Merhav, E. Ordentlich, G. Seroussi, and M.J. Weinberger. Sequential strategies for loss functions with memory. IEEE Transactions on Information Theory, 48(7):1947-1958, 2002.
- (2002) IEEE Transactions on Information Theory , vol.48 , Issue.7 , pp. 1947-1958
- Merhav, N.¹ Ordentlich, E.² Seroussi, G.³ Weinberger, M.J.⁴

19
- 84899011267
- Online learning with delayed label feedback
- C. Mesterharm. Online learning with delayed label feedback. In Proceedings of the Sixteenth International Conference on Algorithmic Learning Theory, 2005.
- (2005) Proceedings of the Sixteenth International Conference on Algorithmic Learning Theory
- Mesterharm, C.¹

20
- 77953539718
- Online regret bounds for Markov decision processes with deterministic transitions
- R. Ortner. Online regret bounds for Markov decision processes with deterministic transitions. Theoretical Computer Science, 411(29-30):2684-2695, 2010.
- (2010) Theoretical Computer Science , vol.411 , Issue.29-30 , pp. 2684-2695
- Ortner, R.¹

21
- 84898943410
- On the complexity of bandit and derivative-free stochastic convex optimization
- O. Shamir. On the complexity of bandit and derivative-free stochastic convex optimization. CoRR, abs/1209.2388, 2012.
- (2012) On the Complexity of Bandit and Derivative-free Stochastic Convex Optimization
- Shamir, O.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.