-
1
-
-
0000396062
-
Natural gradient works efficiently in learning
-
Amari, S.-I.: Natural gradient works efficiently in learning. Neural Comput. 10(2), 251–276 (1998)
-
(1998)
Neural Comput.
, vol.10
, Issue.2
, pp. 251-276
-
-
Amari, S.-I.1
-
2
-
-
0034201611
-
Adaptive method of realizing natural gradient learning for multilayer perceptrons
-
Amari, S.-I., Park, H., Fukumizu, K.: Adaptive method of realizing natural gradient learning for multilayer perceptrons. Neural Comput. 12(6), 1399–1409 (2000)
-
(2000)
Neural Comput.
, vol.12
, Issue.6
, pp. 1399-1409
-
-
Amari, S.-I.1
Park, H.2
Fukumizu, K.3
-
4
-
-
0037403111
-
Mirror descent and nonlinear projected subgradient methods for convex optimization
-
Beck, A., Teboulle, M.: Mirror descent and nonlinear projected subgradient methods for convex optimization. Oper. Res. Lett. 31(3), 167–175 (2003)
-
(2003)
Oper. Res. Lett.
, vol.31
, Issue.3
, pp. 167-175
-
-
Beck, A.1
Teboulle, M.2
-
5
-
-
69349090197
-
Learning deep architectures for AI
-
Bengio, Y.: Learning deep architectures for AI. Found. Trends Mach. Learn. 2, 1–127 (2009)
-
(2009)
Found. Trends Mach. Learn.
, vol.2
-
-
Bengio, Y.1
-
6
-
-
67651049775
-
Justifying and generalizing contrastive divergence
-
Bengio, Y., Delalleau, O.: Justifying and generalizing contrastive divergence. Neural Comput. 21(6), 1601–1621 (2009)
-
(2009)
Neural Comput.
, vol.21
, Issue.6
, pp. 1601-1621
-
-
Bengio, Y.1
Delalleau, O.2
-
7
-
-
0003778897
-
-
Springer Publishing Company, Incorporated, New York
-
Benveniste, A., Métivier, M., Priouret, P.: Adaptive Algorithms and Stochastic Approximations. Springer Publishing Company, Incorporated, New York (2012)
-
(2012)
Adaptive Algorithms and Stochastic Approximations
-
-
Benveniste, A.1
Métivier, M.2
Priouret, P.3
-
9
-
-
68949096711
-
SGD-QN: careful quasi-Newton stochastic gradient descent
-
Bordes, A., Bottou, L., Gallinari, P.: SGD-QN: careful quasi-Newton stochastic gradient descent. J. Mach. Learn. Res. 10, 1737–1754 (2009)
-
(2009)
J. Mach. Learn. Res.
, vol.10
, pp. 1737-1754
-
-
Bordes, A.1
Bottou, L.2
Gallinari, P.3
-
10
-
-
84904136037
-
Large-scale machine learning with stochastic gradient descent
-
Springer, New York
-
Bottou, L.: Large-scale machine learning with stochastic gradient descent. In: Proceedings of COMPSTAT’2010, pp. 177–186. Springer, New York (2010)
-
(2010)
Proceedings of COMPSTAT’2010
, pp. 177-186
-
-
Bottou, L.1
-
11
-
-
17444425307
-
On-line learning for very large data sets
-
Bottou, L., Le Cun, Y.: On-line learning for very large data sets. Appl. Stoch. Models Bus. Ind. 21(2), 137–151 (2005)
-
(2005)
Appl. Stoch. Models Bus. Ind.
, vol.21
, Issue.2
, pp. 137-151
-
-
Bottou, L.1
Le Cun, Y.2
-
13
-
-
84968510937
-
A class of methods for solving nonlinear simultaneous equations
-
Broyden, C.G.: A class of methods for solving nonlinear simultaneous equations. Math. Comput. 19, 577–593 (1965)
-
(1965)
Math. Comput.
, vol.19
, pp. 577-593
-
-
Broyden, C.G.1
-
14
-
-
80052231929
-
Online EM algorithm for hidden Markov models
-
Cappé, O.: Online EM algorithm for hidden Markov models. J. Comput. Graph. Stat. 20(3), 728–749 (2011)
-
(2011)
J. Comput. Graph. Stat.
, vol.20
, Issue.3
, pp. 728-749
-
-
Cappé, O.1
-
15
-
-
66849104300
-
On-line expectation-maximization algorithm for latent data models
-
Cappé, O., Moulines, E.: On-line expectation-maximization algorithm for latent data models. J. R. Stat. Soc. Ser. B 71(3), 593–613 (2009)
-
(2009)
J. R. Stat. Soc. Ser. B
, vol.71
, Issue.3
, pp. 593-613
-
-
Cappé, O.1
Moulines, E.2
-
17
-
-
84864074626
-
Implicit online learning with kernels
-
MIT Press, Cambridge
-
Cheng, L., Vishwanathan, S.V.N., Schuurmans, D., Wang, S., Caelli, T.: Implicit online learning with kernels. In: Proceedings of the 2006 Conference Advances in Neural Information Processing Systems 19, vol. 19, p. 249. MIT Press, Cambridge (2007)
-
(2007)
Proceedings of the 2006 Conference Advances in Neural Information Processing Systems 19, vol. 19
, pp. 249
-
-
Cheng, L.1
Vishwanathan, S.V.N.2
Schuurmans, D.3
Wang, S.4
Caelli, T.5
-
18
-
-
0000792517
-
On a stochastic approximation method
-
Chung, K.L.: On a stochastic approximation method. Ann. Math. Stat. 25, 463–483 (1954)
-
(1954)
Ann. Math. Stat.
, vol.25
, pp. 463-483
-
-
Chung, K.L.1
-
19
-
-
0002629270
-
Maximum likelihood from incomplete data via the EM algorithm
-
Dempster, A., Laird, N., Rubin, D.: Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. Ser. B 39, 1–38 (1977)
-
(1977)
J. R. Stat. Soc. Ser. B
, vol.39
-
-
Dempster, A.1
Laird, N.2
Rubin, D.3
-
20
-
-
80052250414
-
Adaptive subgradient methods for online learning and stochastic optimization
-
Duchi, J., Hazan, E., Singer, Y.: Adaptive subgradient methods for online learning and stochastic optimization. J. Mach. Learn. Res. 12, 2121–2159 (2011)
-
(2011)
J. Mach. Learn. Res.
, pp. 2121-2159
-
-
Duchi, J.1
Hazan, E.2
Singer, Y.3
-
21
-
-
0026205085
-
On sampling controlled stochastic approximation
-
Dupuis, P., Simha, R.: On sampling controlled stochastic approximation. IEEE Trans. Autom. Control 36(8), 915–924 (1991)
-
(1991)
IEEE Trans. Autom. Control
, vol.36
, Issue.8
, pp. 915-924
-
-
Dupuis, P.1
Simha, R.2
-
22
-
-
62349116164
-
Spectrum estimation for large dimensional covariance matrices using random matrix theory
-
El Karoui, N.: Spectrum estimation for large dimensional covariance matrices using random matrix theory. Ann. Stat. 36, 2757–2790 (2008)
-
(2008)
Ann. Stat.
, vol.36
, pp. 2757-2790
-
-
El Karoui, N.1
-
23
-
-
0001214703
-
On asymptotic normality in stochastic approximation
-
Fabian, V.: On asymptotic normality in stochastic approximation. Ann. Math. Stat. 39, 1327–1332 (1968)
-
(1968)
Ann. Math. Stat.
, vol.39
, pp. 1327-1332
-
-
Fabian, V.1
-
24
-
-
68949134067
-
Asymptotically efficient stochastic approximation; the RM case
-
Fabian, V.: Asymptotically efficient stochastic approximation; the RM case. Ann. Stat. 1, 486–495 (1973)
-
(1973)
Ann. Stat.
, vol.1
, pp. 486-495
-
-
Fabian, V.1
-
25
-
-
0001735517
-
On the mathematical foundations of theoretical statistics
-
Fisher, R.A.: On the mathematical foundations of theoretical statistics. Philos. Trans. R. Soc. Lond. Ser. A 222, 309–368 (1922)
-
(1922)
Philos. Trans. R. Soc. Lond. Ser. A
, vol.222
, pp. 309-368
-
-
Fisher, R.A.1
-
28
-
-
0021518209
-
Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images
-
Geman, S., Geman, D.: Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans. Pattern Anal. Mach. Intell. 6, 721–741 (1984)
-
(1984)
IEEE Trans. Pattern Anal. Mach. Intell.
, vol.6
, pp. 721-741
-
-
Geman, S.1
Geman, D.2
-
29
-
-
33748998787
-
Adaptive stepsizes for recursive estimation with applications in approximate dynamic programming
-
George, A.P., Powell, W.B.: Adaptive stepsizes for recursive estimation with applications in approximate dynamic programming. Mach. Learn. 65(1), 167–198 (2006)
-
(2006)
Mach. Learn.
, vol.65
, Issue.1
, pp. 167-198
-
-
George, A.P.1
Powell, W.B.2
-
30
-
-
79952295497
-
Riemann manifold Langevin and Hamiltonian Monte Carlo methods
-
Girolami, M.: Riemann manifold Langevin and Hamiltonian Monte Carlo methods. J. R. Stat. Soc. Ser. B 73(2), 123–214 (2011)
-
(2011)
J. R. Stat. Soc. Ser. B
, vol.73
, Issue.2
, pp. 123-214
-
-
Girolami, M.1
-
31
-
-
67649964731
-
Reinforcement learning: a tutorial survey and recent advances
-
Gosavi, A.: Reinforcement learning: a tutorial survey and recent advances. INFORMS J. Comput. 21(2), 178–192 (2009)
-
(2009)
INFORMS J. Comput.
, vol.21
, Issue.2
, pp. 178-192
-
-
Gosavi, A.1
-
32
-
-
0001648516
-
Iteratively reweighted least squares for maximum likelihood estimation, and some robust and resistant alternatives
-
Green, P.J.: Iteratively reweighted least squares for maximum likelihood estimation, and some robust and resistant alternatives. J. R. Stat. Soc. Ser. B 46, 149–192 (1984)
-
(1984)
J. R. Stat. Soc. Ser. B
, vol.46
, pp. 149-192
-
-
Green, P.J.1
-
33
-
-
0003684449
-
-
Springer, New York
-
Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd edn. Springer, New York (2011)
-
(2011)
The Elements of Statistical Learning: Data Mining, Inference, and Prediction
-
-
Hastie, T.1
Tibshirani, R.2
Friedman, J.3
-
34
-
-
84876217045
-
Quasi-Newton methods: a new direction
-
Hennig, P., Kiefel, M.: Quasi-Newton methods: a new direction. J. Mach. Learn. Res. 14(1), 843–865 (2013)
-
(2013)
J. Mach. Learn. Res.
, vol.14
, Issue.1
, pp. 843-865
-
-
Hennig, P.1
Kiefel, M.2
-
35
-
-
0013344078
-
Training products of experts by minimizing contrastive divergence
-
Hinton, G.E.: Training products of experts by minimizing contrastive divergence. Neural Comput. 14(8), 1771–1800 (2002)
-
(2002)
Neural Comput.
, vol.14
, Issue.8
, pp. 1771-1800
-
-
Hinton, G.E.1
-
36
-
-
84878919168
-
Stochastic variational inference
-
Hoffman, M.D., Blei, D.M., Wang, C., Paisley, J.: Stochastic variational inference. J. Mach. Learn. Res. 14(1), 1303–1347 (2013)
-
(2013)
J. Mach. Learn. Res.
, vol.14
, Issue.1
, pp. 1303-1347
-
-
Hoffman, M.D.1
Blei, D.M.2
Wang, C.3
Paisley, J.4
-
37
-
-
0003157339
-
Robust estimation of a location parameter
-
Huber, P.J.: Robust estimation of a location parameter. Ann. Math. Stat. 35(1), 73–101 (1964)
-
(1964)
Ann. Math. Stat.
, vol.35
, Issue.1
, pp. 73-101
-
-
Huber, P.J.1
-
39
-
-
84898963415
-
Accelerating stochastic gradient descent using predictive variance reduction
-
Johnson, R., Zhang, T.: Accelerating stochastic gradient descent using predictive variance reduction. Adv. Neural Inf. Process. Syst. 26, 315–323 (2013)
-
(2013)
Adv. Neural Inf. Process. Syst.
, vol.26
, pp. 315-323
-
-
Johnson, R.1
Zhang, T.2
-
40
-
-
84932194480
-
-
Kivinen, J., Warmuth, M.K.: Additive versus exponentiated gradient updates for linear prediction. In: Proceedings of the Twenty-Seventh Annual ACM Symposium on Theory of Computing, pp. 209–218
-
-
-
-
-
41
-
-
33646032356
-
The p-norm generalization of the LMS algorithm for adaptive filtering
-
Kivinen, J., Warmuth, M.K., Hassibi, B.: The p-norm generalization of the LMS algorithm for adaptive filtering. IEEE Trans. Signal Process. 54(5), 1782–1793 (2006)
-
(2006)
IEEE Trans. Signal Process.
, vol.54
, Issue.5
, pp. 1782-1793
-
-
Kivinen, J.1
Warmuth, M.K.2
Hassibi, B.3
-
42
-
-
84919819551
-
Austerity in MCMC land: cutting the Metropolis-Hastings budget
-
Korattikara, A., Chen, Y., Welling, M.: Austerity in MCMC land: cutting the Metropolis-Hastings budget. In: Proceedings of the 31st International Conference on Machine Learning, pp. 181–189 (2014)
-
(2014)
Proceedings of the 31st International Conference on Machine Learning
, pp. 181-189
-
-
Korattikara, A.1
Chen, Y.2
Welling, M.3
-
44
-
-
0000878355
-
Adaptive design and stochastic approximation
-
Lai, T.L., Robbins, H.: Adaptive design and stochastic approximation. Ann. Stat. 7, 1196–1221 (1979)
-
(1979)
Ann. Stat.
, vol.7
, pp. 1196-1221
-
-
Lai, T.L.1
Robbins, H.2
-
45
-
-
0000808747
-
A gradient algorithm locally equivalent to the EM algorithm
-
Lange, K.: A gradient algorithm locally equivalent to the EM algorithm. J. R. Stat. Soc. Ser. B 57, 425–437 (1995)
-
(1995)
J. R. Stat. Soc. Ser. B
, vol.57
, pp. 425-437
-
-
Lange, K.1
-
47
-
-
84899022736
-
Large scale online learning
-
Bottou, L., Le Cun, Y.: Large scale online learning. Adv. Neural Inf. Process. Syst. 16, 217 (2004)
-
(2004)
Adv. Neural Inf. Process. Syst.
, vol.16
, pp. 217
-
-
Bottou, L.1
Le Cun, Y.2
-
49
-
-
56449125197
-
-
Li, L.: A worst-case comparison between temporal difference and residual gradient with linear function approximation. In: Proceedings of the 25th International Conference on Machine Learning, ACM, pp. 560–567
-
-
-
-
-
50
-
-
26444444069
-
Online EM algorithm for mixture with application to internet traffic modeling
-
Liu, Z., Almhana, J., Choulakian, V., McGorman, R.: Online EM algorithm for mixture with application to internet traffic modeling. Comput. Stat. Data Anal. 50(4), 1052–1071 (2006)
-
(2006)
Comput. Stat. Data Anal.
, vol.50
, Issue.4
, pp. 1052-1071
-
-
Liu, Z.1
Almhana, J.2
Choulakian, V.3
McGorman, R.4
-
51
-
-
0003746249
-
-
Springer, New York
-
Ljung, L., Pflug, G., Walk, H.: Stochastic Approximation and Optimization of Random Systems, vol. 17. Springer, New York (1992)
-
(1992)
Stochastic Approximation and Optimization of Random Systems
, vol.17
-
-
Ljung, L.1
Pflug, G.2
Walk, H.3
-
52
-
-
0016508280
-
Robust estimation via stochastic approximation
-
Martin, R.D., Masreliez, C.: Robust estimation via stochastic approximation. IEEE Trans. Inf. Theory 21(3), 263–271 (1975)
-
(1975)
IEEE Trans. Inf. Theory
, vol.21
, Issue.3
, pp. 263-271
-
-
Martin, R.D.1
Masreliez, C.2
-
53
-
-
0001955526
-
-
Online Learning and Neural Networks, Cambridge University Press, Cambridge
-
Murata, N.: A Statistical Study of On-line Learning. Online Learning and Neural Networks. Cambridge University Press, Cambridge (1998)
-
(1998)
A Statistical Study of On-line Learning
-
-
Murata, N.1
-
54
-
-
33846451627
-
A learning method for system identification
-
Nagumo, J.-I., Noda, A.: A learning method for system identification. IEEE Trans. Autom. Control 12(3), 282–287 (1967)
-
(1967)
IEEE Trans. Autom. Control
, vol.12
, Issue.3
, pp. 282-287
-
-
Nagumo, J.-I.1
Noda, A.2
-
55
-
-
85052723106
-
-
The National Academies Press, Washington, DC
-
National Research Council: Frontiers in Massive Data Analysis. The National Academies Press, Washington, DC (2013)
-
(2013)
Frontiers in Massive Data Analysis
-
-
-
56
-
-
0002788893
-
A view of the EM algorithm that justifies incremental, sparse, and other variants
-
Springer, New York
-
Neal, R.M., Hinton, G.E.: A view of the EM algorithm that justifies incremental, sparse, and other variants. In: Learning in Graphical Models, pp. 355–368. Springer, New York (1998)
-
(1998)
Learning in Graphical Models
, pp. 355-368
-
-
Neal, R.M.1
Hinton, G.E.2
-
59
-
-
70450197241
-
Robust stochastic approximation approach to stochastic programming
-
Nemirovski, A., Juditsky, A., Lan, G., Shapiro, A.: Robust stochastic approximation approach to stochastic programming. SIAM J. Optim. 19(4), 1574–1609 (2009)
-
(2009)
SIAM J. Optim.
, vol.19
, Issue.4
, pp. 1574-1609
-
-
Nemirovski, A.1
Juditsky, A.2
Lan, G.3
Shapiro, A.4
-
63
-
-
84932194804
-
-
Pillai, N.S., Smith, A.: Ergodicity of approximate MCMC chains with applications to large data sets. arXiv preprint (2014)
-
Pillai, N.S., Smith, A.: Ergodicity of approximate MCMC chains with applications to large data sets. arXiv preprint http://arxiv.org/abs/1405.0182 (2014)
-
-
-
-
64
-
-
0002410521
-
Adaptive algorithms of estimation (convergence, optimality, stability)
-
Polyak, B.T., Tsypkin, Y.Z.: Adaptive algorithms of estimation (convergence, optimality, stability). Autom. Remote Control 3, 74–84 (1979)
-
(1979)
Autom. Remote Control
, vol.3
, pp. 74-84
-
-
Polyak, B.T.1
Tsypkin, Y.Z.2
-
65
-
-
0026899240
-
Acceleration of stochastic approximation by averaging
-
Polyak, B.T., Juditsky, A.B.: Acceleration of stochastic approximation by averaging. SIAM J. Control Optim. 30(4), 838–855 (1992)
-
(1992)
SIAM J. Control Optim.
, vol.30
, Issue.4
, pp. 838-855
-
-
Polyak, B.T.1
Juditsky, A.B.2
-
66
-
-
0000016172
-
A stochastic approximation method
-
Robbins, H., Monro, S.: A stochastic approximation method. Ann. Math. Stat. 22, 400–407 (1951)
-
(1951)
Ann. Math. Stat.
, vol.22
, pp. 400-407
-
-
Robbins, H.1
Monro, S.2
-
67
-
-
0016985417
-
Monotone operators and the proximal point algorithm
-
Rockafellar, R.T.: Monotone operators and the proximal point algorithm. SIAM J. Control Optim. 14(5), 877–898 (1976)
-
(1976)
SIAM J. Control Optim.
, vol.14
, Issue.5
, pp. 877-898
-
-
Rockafellar, R.T.1
-
68
-
-
84932194717
-
-
Rosasco, L., Villa, S., Công Vũ, B.: Convergence of stochastic proximal gradient algorithm. arXiv preprint, 2014
-
Rosasco, L., Villa, S., Công Vũ, B.: Convergence of stochastic proximal gradient algorithm. arXiv preprint http://arxiv.org/abs/1403.5074, 2014
-
-
-
-
70
-
-
84932197319
-
-
Ryu, E.K., Boyd, S.: Stochastic proximal iteration: a non-asymptotic improvement upon stochastic gradient descent. Working paper. (2014)
-
Ryu, E.K., Boyd, S.: Stochastic proximal iteration: a non-asymptotic improvement upon stochastic gradient descent. Working paper. http://web.stanford.edu/~eryu/papers/spi.pdf (2014)
-
-
-
-
71
-
-
0000431134
-
Asymptotic distribution of stochastic approximation procedures
-
Sacks, J.: Asymptotic distribution of stochastic approximation procedures. Ann. Math. Stat. 29(2), 373–405 (1958)
-
(1958)
Ann. Math. Stat.
, vol.29
, Issue.2
, pp. 373-405
-
-
Sacks, J.1
-
72
-
-
0007229977
-
Efficient recursive estimation; application to estimating the parameters of a covariance function
-
Sakrison, D.J.: Efficient recursive estimation; application to estimating the parameters of a covariance function. Int. J. Eng. Sci. 3(4), 461–483 (1965)
-
(1965)
Int. J. Eng. Sci.
, vol.3
, Issue.4
, pp. 461-483
-
-
Sakrison, D.J.1
-
73
-
-
34547983260
-
Restricted Boltzmann machines for collaborative filtering
-
Salakhutdinov, R., Mnih, A., Hinton, G.: Restricted Boltzmann machines for collaborative filtering. In: Proceedings of the 24th International Conference on Machine Learning, ACM, pp. 791–798 (2007)
-
(2007)
Proceedings of the 24th International Conference on Machine Learning, ACM
, pp. 791-798
-
-
Salakhutdinov, R.1
Mnih, A.2
Hinton, G.3
-
74
-
-
0034131785
-
On-line EM algorithm for the normalized Gaussian network
-
Sato, M.-A., Ishii, S.: On-line EM algorithm for the normalized Gaussian network. Neural Comput. 12(2), 407–432 (2000)
-
(2000)
Neural Comput.
, vol.12
, Issue.2
, pp. 407-432
-
-
Sato, M.-A.1
Ishii, S.2
-
75
-
-
84932199211
-
Approximation analysis of stochastic gradient Langevin dynamics by using Fokker-Planck equation and Itô process
-
Sato, I., Nakagawa, H.: Approximation analysis of stochastic gradient Langevin dynamics by using Fokker-Planck equation and Itô process. JMLR W&CP 32(1), 982–990 (2014)
-
(2014)
JMLR W&CP
, vol.32
, Issue.1
, pp. 982-990
-
-
Sato, I.1
Nakagawa, H.2
-
76
-
-
0013419177
-
On the worst-case analysis of temporal-difference learning algorithms
-
Schapire, R.E., Warmuth, M.K.: On the worst-case analysis of temporal-difference learning algorithms. Mach. Learn. 22(1–3), 95–121 (1996)
-
(1996)
Mach. Learn.
, vol.22
, Issue.1-3
, pp. 95-121
-
-
Schapire, R.E.1
Warmuth, M.K.2
-
77
-
-
84932193261
-
-
Schaul, T., Zhang, S., LeCun, Y.: No more pesky learning rates. arXiv preprint, 2012
-
Schaul, T., Zhang, S., LeCun, Y.: No more pesky learning rates. arXiv preprint. http://arxiv.org/abs/1206.1106, 2012
-
-
-
-
78
-
-
84862300219
-
A stochastic quasi-Newton method for online convex optimization
-
San Juan, Puerto Rico
-
Schraudolph, N.N., Yu, J., Günter, S.: A stochastic quasi-Newton method for online convex optimization. In: Meila M., Shen X. (eds.) Proceedings of the 11th International Conference on Artificial Intelligence and Statistics (AISTATS), vol. 2, pp. 436–443. San Juan, Puerto Rico (2007)
-
(2007)
Proceedings of the 11th International Conference on Artificial Intelligence and Statistics (AISTATS), vol. 2
, pp. 436-443
-
-
Schraudolph, N.N.1
Yu, J.2
Günter, S.3
Meila, M.4
Shen, X.5
-
79
-
-
0027667902
-
On the convergence behavior of the LMS and the normalized LMS algorithms
-
Slock, D.T.M.: On the convergence behavior of the LMS and the normalized LMS algorithms. IEEE Trans. Signal Process. 41(9), 2811–2825 (1993)
-
(1993)
IEEE Trans. Signal Process.
, vol.41
, Issue.9
, pp. 2811-2825
-
-
Slock, D.T.M.1
-
80
-
-
33847202724
-
Learning to predict by the methods of temporal differences
-
Sutton, R.S.: Learning to predict by the methods of temporal differences. Mach. Learn. 3(1), 9–44 (1988)
-
(1988)
Mach. Learn.
, vol.3
, Issue.1
, pp. 9-44
-
-
Sutton, R.S.1
-
81
-
-
84983150860
-
Implicit temporal differences
-
Tamar, A., Toulis, P., Mannor, S., Airoldi, E.: Implicit temporal differences. In: Neural Information Processing Systems, Workshop on Large-Scale Reinforcement Learning (2014)
-
(2014)
Neural Information Processing Systems, Workshop on Large-Scale Reinforcement Learning
-
-
Tamar, A.1
Toulis, P.2
Mannor, S.3
Airoldi, E.4
-
82
-
-
84864026688
-
Modeling human motion using binary latent variables
-
Taylor, G.W., Hinton, G.E., Roweis, S.T.: Modeling human motion using binary latent variables. Adv. Neural Inf. Process. Syst. 19, 1345–1352 (2006)
-
(2006)
Adv. Neural Inf. Process. Syst.
, vol.19
, pp. 1345-1352
-
-
Taylor, G.W.1
Hinton, G.E.2
Roweis, S.T.3
-
83
-
-
0001593436
-
Recursive parameter estimation using incomplete data
-
Titterington, M.D.: Recursive parameter estimation using incomplete data. J. R. Stat. Soc. Ser. B 46, 257–267 (1984)
-
(1984)
J. R. Stat. Soc. Ser. B
, vol.46
, pp. 257-267
-
-
Titterington, M.D.1
-
84
-
-
84932198982
-
-
Toulis, P., Airoldi, E.M.: Implicit stochastic gradient descent for principled estimation with large datasets. arXiv preprint, 2014
-
Toulis, P., Airoldi, E.M.: Implicit stochastic gradient descent for principled estimation with large datasets. arXiv preprint http://arxiv.org/abs/1408.2923, 2014
-
-
-
-
85
-
-
85028606750
-
Statistical analysis of stochastic gradient methods for generalized linear models
-
Toulis, P., Airoldi, E., Rennie, J.: Statistical analysis of stochastic gradient methods for generalized linear models. JMLR W&CP 32(1), 667–675 (2014)
-
(2014)
JMLR W&CP
, vol.32
, Issue.1
, pp. 667-675
-
-
Toulis, P.1
Airoldi, E.2
Rennie, J.3
-
86
-
-
0002010858
-
An extension of the Robbins-Monro procedure
-
Venter, J.H.: An extension of the Robbins-Monro procedure. Ann. Math. Stat. 38, 181–190 (1967)
-
(1967)
Ann. Math. Stat.
, vol.38
, pp. 181-190
-
-
Venter, J.H.1
-
87
-
-
84899020608
-
Variance reduction for stochastic gradient optimization
-
Wang, C., Chen, X., Smola, A., Xing, E.: Variance reduction for stochastic gradient optimization. Adv. Neural Inf. Process. Syst. 26, 181–189 (2013)
-
(2013)
Adv. Neural Inf. Process. Syst.
, vol.26
, pp. 181-189
-
-
Wang, C.1
Chen, X.2
Smola, A.3
Xing, E.4
-
88
-
-
84899690779
-
Stabilization of stochastic iterative methods for singular and nearly singular linear systems
-
Wang, M., Bertsekas, D.P.: Stabilization of stochastic iterative methods for singular and nearly singular linear systems. Math. Oper. Res. 39(1), 1–30 (2013)
-
(2013)
Math. Oper. Res.
, vol.39
, Issue.1
-
-
Wang, M.1
Bertsekas, D.P.2
-
89
-
-
0000221062
-
Multivariate adaptive stochastic approximation
-
Wei, C.Z.: Multivariate adaptive stochastic approximation. Ann. Stat. 15(3), 1115–1130 (1987)
-
(1987)
Ann. Stat.
, vol.15
, Issue.3
, pp. 1115-1130
-
-
Wei, C.Z.1
-
91
-
-
84932196044
-
-
Xu, W.: Towards optimal one pass large scale learning with averaged stochastic gradient descent. arXiv preprint, 2011
-
Xu, W.: Towards optimal one pass large scale learning with averaged stochastic gradient descent. arXiv preprint http://arxiv.org/abs/1107.2490, 2011
-
-
-
-
92
-
-
33644756784
-
On the convergence of Markovian stochastic algorithms with rapidly decreasing ergodicity rates
-
Younes, L.: On the convergence of Markovian stochastic algorithms with rapidly decreasing ergodicity rates. Stochastics 65(3–4), 177–228 (1999)
-
(1999)
Stochastics
, vol.65
, Issue.3-4
, pp. 177-228
-
-
Younes, L.1