-
1
-
-
68949096711
-
SGD-QN: Careful quasi-Newton stochastic gradient descent
-
With Erratum (to appear)
-
BORDES, A., BOTTOU, L., and GALLINARI, P. (2009): SGD-QN: Careful Quasi-Newton Stochastic Gradient Descent. Journal of Machine Learning Research, 10:1737-1754. With Erratum (to appear).
-
(2009)
Journal of Machine Learning Research
, vol.10
, pp. 1737-1754
-
-
Bordes, A.1
Bottou, L.2
Gallinari, P.3
-
5
-
-
34249753618
-
Support vector networks
-
CORTES, C. and VAPNIK, V. N. (1995): Support Vector Networks, Machine Learning, 20:273-297.
-
(1995)
Machine Learning
, vol.20
, pp. 273-297
-
-
Cortes, C.1
Vapnik, V.N.2
-
8
-
-
0142192295
-
Conditional random fields: Probabilistic models for segmenting and labeling sequence data
-
Morgan Kaufmann
-
LAFFERTY, J. D., MCCALLUM, A., and PEREIRA, F. (2001): Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. In Proceedings of ICML 2001, 282-289, Morgan Kaufmann.
-
(2001)
Proceedings of ICML 2001
, pp. 282-289
-
-
Lafferty, J.D.1
McCallum, A.2
Pereira, F.3
-
9
-
-
0032166052
-
The importance of convexity in learning with squared loss
-
LEE, W. S., BARTLETT, P. L., and WILLIAMSON, R. C. (1998): The Importance of Convexity in Learning with Squared Loss. IEEE Transactions on Information Theory, 44(5):1974-1980.
-
(1998)
IEEE Transactions on Information Theory
, vol.44
, Issue.5
, pp. 1974-1980
-
-
Lee, W.S.1
Bartlett, P.L.2
Williamson, R.C.3
-
10
-
-
84876811202
-
RCV1: A new benchmark collection for text categorization research
-
LEWIS, D. D., YANG, Y., ROSE, T. G., and LI, F. (2004): RCV1: A New Benchmark Collection for Text Categorization Research. Journal of Machine Learning Research, 5:361-397.
-
(2004)
Journal of Machine Learning Research
, vol.5
, pp. 361-397
-
-
Lewis, D.D.1
Yang, Y.2
Rose, T.G.3
Li, F.4
-
11
-
-
34547982357
-
Trust region Newton methods for large-scale logistic regression
-
ACM Press
-
LIN, C. J., WENG, R. C., and KEERTHI, S. S. (2007): Trust region Newton methods for large-scale logistic regression. In Proceedings of ICML 2007, 561-568, ACM Press.
-
(2007)
Proceedings of ICML 2007
, pp. 561-568
-
-
Lin, C.J.1
Weng, R.C.2
Keerthi, S.S.3
-
12
-
-
0001457509
-
Some methods for classification and analysis of multivariate observations
-
University of California Press
-
MACQUEEN, J. (1967): Some Methods for Classification and Analysis of Multivariate Observations. In Fifth Berkeley Symposium on Mathematics, Statistics, and Probabilities, vol.1, 281-297, University of California Press.
-
(1967)
Fifth Berkeley Symposium on Mathematics, Statistics, and Probabilities
, vol.1
, pp. 281-297
-
-
MacQueen, J.1
-
13
-
-
0000595627
-
Some applications of concentration inequalities to Statistics
-
series 6
-
MASSART, P. (2000): Some applications of concentration inequalities to Statistics, Annales de la Faculté des Sciences de Toulouse, series 6, 9(2):245-303.
-
(2000)
Annales de la Faculté des Sciences de Toulouse
, vol.9
, Issue.2
, pp. 245-303
-
-
Massart, P.1
-
14
-
-
0001955526
-
A statistical study of on-line learning
-
Cambridge University Press
-
MURATA, N. (1998): A Statistical Study of On-line Learning. In Online Learning and Neural Networks, Cambridge University Press.
-
(1998)
Online Learning and Neural Networks
-
-
Murata, N.1
-
15
-
-
0026899240
-
Acceleration of stochastic approximation by averaging
-
POLYAK, B. T. and JUDITSKY, A. B. (1992): Acceleration of stochastic approximation by averaging. SIAM J. Control and Optimization, 30(4):838-855.
-
(1992)
SIAM J. Control and Optimization
, vol.30
, Issue.4
, pp. 838-855
-
-
Polyak, B.T.1
Juditsky, A.B.2
-
17
-
-
0000646059
-
Learning internal representations by error propagation
-
Bradford Books
-
RUMELHART, D. E., HINTON, G. E., and WILLIAMS, R. J. (1986): Learning internal representations by error propagation. In Parallel distributed processing: Explorations in the microstructure of cognition, vol.I, 318-362, Bradford Books.
-
(1986)
Parallel Distributed Processing: Explorations in the Microstructure of Cognition
, vol.1
, pp. 318-362
-
-
Rumelhart, D.E.1
Hinton, G.E.2
Williams, R.J.3
-
18
-
-
56449110590
-
SVM optimization: Inverse dependence on training set size
-
ACM
-
SHALEV-SHWARTZ, S. and SREBRO, N. (2008): SVM optimization: inverse dependence on training set size. In Proceedings of the ICML 2008, 928-935, ACM.
-
(2008)
Proceedings of the ICML 2008
, pp. 928-935
-
-
Shalev-Shwartz, S.1
Srebro, N.2
-
19
-
-
0001287271
-
Regression shrinkage and selection via the lasso
-
Series B
-
TIBSHIRANI, R. (1996): Regression shrinkage and selection via the Lasso. Journal of the Royal Statistical Society, Series B, 58(1):267-288.
-
(1996)
Journal of the Royal Statistical Society
, vol.58
, Issue.1
, pp. 267-288
-
-
Tibshirani, R.1
-
21
-
-
3142725508
-
Optimal aggregation of classifiers in statistical learning
-
TSYBAKOV, A. B. (2004): Optimal aggregation of classifiers in statistical learning, Annals of Statistics, 32(1).
-
(2004)
Annals of Statistics
, vol.32
, Issue.1
-
-
Tsybakov, A.B.1
-
22
-
-
0001024505
-
On the Uniform Convergence of Relative Frequencies of Events to Their Probabilities
-
VAPNIK, V. N. and CHERVONENKIS, A. YA. (1971): On the Uniform Convergence of Relative Frequencies of Events to Their Probabilities. Theory of Probability and its Applications, 16(2):264-280.
-
(1971)
Theory of Probability and its Applications
, vol.16
, Issue.2
, pp. 264-280
-
-
Vapnik, V.N.1
Chervonenkis, A.Ya.2
-
23
-
-
0002278965
-
Adaptive switching circuits
-
WIDROW, B. and HOFF, M. E. (1960): Adaptive switching circuits. IRE WESCON Conv. Record, Part 4, 96-104.
-
(1960)
IRE WESCON Conv. Record
, Issue.PART 4
, pp. 96-104
-
-
Widrow, B.1
Hoff, M.E.2
-
24
-
-
77956944936
-
Towards optimal one pass large scale learning with averaged stochastic gradient descent
-
to appear
-
XU, W. (2010): Towards Optimal One Pass Large Scale Learning with Averaged Stochastic Gradient Descent. Journal of Machine Learning Research (to appear).
-
(2010)
Journal of Machine Learning Research
-
-
Xu, W.1