1. Agarwal, Alekh and Duchi, John. Distributed delayed stochastic optimization. In Shawe-Taylor, J., Zemel, R.S., Bartlett, P., Pereira, F., and Weinberger, K.Q. (eds.), Advances in Neural Information Processing Systems 24 (NIPS), pp. 873-881, 2011.
2. Auer, Peter, Cesa-Bianchi, Nicolò, and Fischer, Paul. Finite-time analysis of the multiarmed bandit problem. Machine Learning, 47(2-3):235-256, May 2002.
3. Cesa-Bianchi, Nicolò and Lugosi, Gábor. Prediction, Learning, and Games. Cambridge University Press, New York, NY, USA, 2006. ISBN 0521841089.
4. Desautels, Thomas, Krause, Andreas, and Burdick, Joel. Parallelizing exploration-exploitation trade-offs with Gaussian process bandit optimization. In Proceedings of the 29th International Conference on Machine Learning (ICML), Edinburgh, Scotland, UK, 2012. Omnipress.
5. Dudik, Miroslav, Hsu, Daniel, Kale, Satyen, Karampatziakis, Nikos, Langford, John, Reyzin, Lev, and Zhang, Tong. Efficient optimal learning for contextual bandits. In Proceedings of the 27th Conference on Uncertainty in Artificial Intelligence (UAI), pp. 169-178, Corvallis, Oregon, 2011. AUAI Press.
6. Garivier, Aurélien and Cappé, Olivier. The KL-UCB algorithm for bounded stochastic bandits and beyond. In Proceedings of the 24th Annual Conference on Learning Theory (COLT), volume 19, pp. 359-376, Budapest, Hungary, July 2011.
7. Joulani, Pooria, György, András, and Szepesvári, Csaba. Online learning under delayed feedback. Extended version of a paper submitted to ICML-2013, 2013. URL http://webdocs.cs.ualberta.ca/~pooria/publications/DelayedFeedback-ICML2013-Extended.pdf.
8. Langford, John, Smola, Alexander, and Zinkevich, Martin. Slow learners are fast. In Bengio, Y., Schuurmans, D., Lafferty, J., Williams, C. K. I., and Culotta, A. (eds.), Advances in Neural Information Processing Systems 22, pp. 2331-2339, 2009.
9. Li, Lihong, Chu, Wei, Langford, John, and Schapire, Robert E. A contextual-bandit approach to personalized news article recommendation. In Proceedings of the 19th International Conference on World Wide Web (WWW), pp. 661-670, New York, NY, USA, 2010. ACM.
10. Mesterharm, Chris J. On-line learning with delayed label feedback. In Jain, Sanjay, Simon, Hans Ulrich, and Tomita, Etsuji (eds.), Algorithmic Learning Theory, volume 3734 of Lecture Notes in Computer Science, pp. 399-413. Springer Berlin Heidelberg, 2005.
11. Mesterharm, Chris J. Improving on-line learning. PhD thesis, Department of Computer Science, Rutgers University, New Brunswick, NJ, 2007.
12. Neu, Gergely, György, András, Szepesvári, Csaba, and Antos, Andras. Online Markov decision processes under bandit feedback. In Lafferty, J., Williams, C. K. I., Shawe-Taylor, J., Zemel, R.S., and Culotta, A. (eds.), Advances in Neural Information Processing Systems 23 (NIPS), pp. 1804-1812, 2010.