-
2
-
-
0028392483
-
Learning long-term dependencies with gradient descent is difficult
-
Y. Bengio, P. Simard, and P. Frasconi, "Learning long-term dependencies with gradient descent is difficult," IEEE T. Neural Nets, 1994.
-
(1994)
IEEE T. Neural Nets
-
-
Bengio, Y.1
Simard, P.2
Frasconi, P.3
-
3
-
-
0022471098
-
Learning representations by back-propagating errors
-
D.E. Rumelhart, G.E. Hinton, and R.J. Williams, "Learning representations by back-propagating errors," Nature, vol. 323, pp. 533-536, 1986.
-
(1986)
Nature
, vol.323
, pp. 533-536
-
-
Rumelhart, D.E.1
Hinton, G.E.2
Williams, R.J.3
-
4
-
-
80053451847
-
Learning recurrent neural networks with Hessian-free optimization
-
J. Martens and I. Sutskever, "Learning recurrent neural networks with Hessian-free optimization," in ICML2011, 2011.
-
(2011)
ICML2011
-
-
Martens, J.1
Sutskever, I.2
-
5
-
-
80051643236
-
Extensions of recurrent neural network language model
-
T. Mikolov, S. Kombrink, L. Burget, J. Cernocky, and S. Khudanpur, "Extensions of recurrent neural network language model," in ICASSP 2011, 2011.
-
(2011)
ICASSP 2011
-
-
Mikolov, T.1
Kombrink, S.2
Burget, L.3
Cernocky, J.4
Khudanpur, S.5
-
8
-
-
84876220822
-
-
Tech. Rep., arXiv:1206.5538
-
Y. Bengio, A. Courville, and P. Vincent, "Unsupervised feature learning and deep learning: A review and new perspectives," Tech. Rep., arXiv:1206.5538, 2012.
-
(2012)
Unsupervised Feature Learning and Deep Learning: A Review and New Perspectives
-
-
Bengio, Y.1
Courville, A.2
Vincent, P.3
-
9
-
-
33745805403
-
A fast learning algorithm for deep belief nets
-
G. E. Hinton, S. Osindero, and Y.-W. Teh, "A fast learning algorithm for deep belief nets," Neural Computation, vol. 18, pp. 1527-1554, 2006.
-
(2006)
Neural Computation
, vol.18
, pp. 1527-1554
-
-
Hinton, G.E.1
Osindero, S.2
Teh, Y.-W.3
-
10
-
-
84864073449
-
Greedy layer-wise training of deep networks
-
Y. Bengio, P. Lamblin, D. Popovici, and H. Larochelle, "Greedy layer-wise training of deep networks," in NIPS2006, 2007.
-
(2007)
NIPS2006
-
-
Bengio, Y.1
Lamblin, P.2
Popovici, D.3
Larochelle, H.4
-
11
-
-
84864069017
-
Efficient learning of sparse representations with an energy-based model
-
M. Ranzato, C. Poultney, S. Chopra, and Y. LeCun, "Efficient learning of sparse representations with an energy-based model," in NIPS2006, 2007.
-
(2007)
NIPS2006
-
-
Ranzato, M.1
Poultney, C.2
Chopra, S.3
LeCun, Y.4
-
12
-
-
80055055551
-
Why does unsupervised pre-training help deep learning
-
D. Erhan, Y. Bengio, A. Courville, P. Manzagol, P. Vincent, and S. Bengio, "Why does unsupervised pre-training help deep learning?," J. Machine Learning Res., vol. 11, 2010.
-
(2010)
J. Machine Learning Res.
, vol.11
-
-
Erhan, D.1
Bengio, Y.2
Courville, A.3
Manzagol, P.4
Vincent, P.5
Bengio, S.6
-
13
-
-
77956541496
-
Deep learning via Hessian-free optimization
-
J. Martens, "Deep learning via Hessian-free optimization," in ICML2010, 2010, pp. 735-742.
-
(2010)
ICML2010
, pp. 735-742
-
-
Martens, J.1
-
14
-
-
84867135575
-
Building high-level features using large scale unsupervised learning
-
Q. Le, M. Ranzato, R. Monga, M. Devin, G. Corrado, K. Chen, J. Dean, and A. Ng, "Building high-level features using large scale unsupervised learning," in ICML2012, 2012.
-
(2012)
ICML2012
-
-
Le, Q.1
Ranzato, M.2
Monga, R.3
Devin, M.4
Corrado, G.5
Chen, K.6
Dean, J.7
Ng, A.8
-
15
-
-
84890510534
-
-
Tech. Rep., Université de Montréal, arXiv:1211.5063
-
Razvan Pascanu, Tomas Mikolov, and Yoshua Bengio, "Understanding the exploding gradient problem," Tech. Rep., Université de Montréal, 2012, arXiv:1211.5063.
-
(2012)
Understanding the Exploding Gradient Problem
-
-
Pascanu, R.1
Mikolov, T.2
Bengio, Y.3
-
17
-
-
84890453272
-
-
Tech. Rep. UMIACS-TR-95-78, U. Maryland
-
T. Lin, B. G. Horne, P. Tino, and C. L. Giles, "Learning long-term dependencies is not as difficult with NARX recurrent neural networks," Tech. Rep. UMIACS-TR-95-78, U. Maryland, 1995.
-
(1995)
Learning Long-Term Dependencies Is Not As Difficult with NARX Recurrent Neural Networks
-
-
Lin, T.1
Horne, B.G.2
Tino, P.3
Giles, C.L.4
-
18
-
-
0003331189
-
Hierarchical recurrent neural networks for long-term dependencies
-
S. El Hihi and Y. Bengio, "Hierarchical recurrent neural networks for long-term dependencies," in NIPS1995, 1996.
-
(1996)
NIPS1995
-
-
El Hihi, S.1
Bengio, Y.2
-
19
-
-
34249938474
-
Optimization and applications of echo state networks with leaky-integrator neurons
-
Herbert Jaeger, Mantas Lukosevicius, Dan Popovici, and Udo Siewert, "Optimization and applications of echo state networks with leaky-integrator neurons," Neural Networks, vol. 20, no. 3, pp. 335-352, 2007.
-
(2007)
Neural Networks
, vol.20
, Issue.3
, pp. 335-352
-
-
Jaeger, H.1
Lukosevicius, M.2
Popovici, D.3
Siewert, U.4
-
20
-
-
73949127981
-
Temporal kernel recurrent neural networks
-
I. Sutskever and G. Hinton, "Temporal kernel recurrent neural networks," Neural Networks, vol. 23, no. 2, 2010.
-
(2010)
Neural Networks
, vol.23
, Issue.2
-
-
Sutskever, I.1
Hinton, G.2
-
21
-
-
0031573117
-
Long short-term memory
-
S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997.
-
(1997)
Neural Computation
, vol.9
, Issue.8
, pp. 1735-1780
-
-
Hochreiter, S.1
Schmidhuber, J.2
-
22
-
-
70350583515
-
Echo-state networks with band-pass neurons: Towards generic time-scale-independent reservoir structures
-
October
-
Udo Siewert and Welf Wustlich, "Echo-state networks with band-pass neurons: Towards generic time-scale-independent reservoir structures," Preliminary Report, October 2007.
-
(2007)
Preliminary Report
-
-
Siewert, U.1
Wustlich, W.2
-
24
-
-
84858768256
-
The recurrent temporal restricted Boltzmann machine
-
I. Sutskever, G. Hinton, and G. Taylor, "The recurrent temporal restricted Boltzmann machine," in NIPS2008, 2009.
-
(2009)
NIPS2008
-
-
Sutskever, I.1
Hinton, G.2
Taylor, G.3
-
25
-
-
84867129058
-
Modeling temporal dependencies in high-dimensional sequences: Application to polyphonic music generation and transcription
-
N. Boulanger-Lewandowski, Y. Bengio, and P. Vincent, "Modeling temporal dependencies in high-dimensional sequences: Application to polyphonic music generation and transcription," in ICML2012, 2012.
-
(2012)
ICML2012
-
-
Boulanger-Lewandowski, N.1
Bengio, Y.2
Vincent, P.3
-
26
-
-
84862524901
-
The neural autoregressive distribution estimator
-
H. Larochelle and I. Murray, "The Neural Autoregressive Distribution Estimator," in AISTATS2011, 2011.
-
(2011)
AISTATS2011
-
-
Larochelle, H.1
Murray, I.2
-
28
-
-
77956509090
-
Rectified linear units improve restricted Boltzmann machines
-
V. Nair and G.E. Hinton, "Rectified linear units improve restricted Boltzmann machines," in ICML2010, 2010.
-
(2010)
ICML2010
-
-
Nair, V.1
Hinton, G.E.2
-
29
-
-
84876231242
-
ImageNet classification with deep convolutional neural networks
-
A. Krizhevsky, I. Sutskever, and G. Hinton, "ImageNet classification with deep convolutional neural networks," in NIPS2012, 2012.
-
(2012)
NIPS2012
-
-
Krizhevsky, A.1
Sutskever, I.2
Hinton, G.3
-
30
-
-
34548480020
-
A method for unconstrained convex minimization problem with the rate of convergence O(1/k^2)
-
Yu. Nesterov, "A method for unconstrained convex minimization problem with the rate of convergence O(1/k^2)," Doklady AN SSSR (translated as Soviet Math. Dokl.), vol. 269, pp. 543-547, 1983.
-
(1983)
Doklady AN SSSR (Translated As Soviet Math. Dokl.)
, vol.269
, pp. 543-547
-
-
Nesterov, Yu.1
-
31
-
-
84857855190
-
Random search for hyper-parameter optimization
-
James Bergstra and Yoshua Bengio, "Random search for hyper-parameter optimization," J. Machine Learning Res., vol. 13, pp. 281-305, 2012.
-
(2012)
J. Machine Learning Res.
, vol.13
, pp. 281-305
-
-
Bergstra, J.1
Bengio, Y.2
-
32
-
-
80051643236
-
Extensions of recurrent neural network language model
-
Tomas Mikolov, Stefan Kombrink, Lukas Burget, Jan Cernocky, and Sanjeev Khudanpur, "Extensions of recurrent neural network language model," in Proc. 2011 IEEE international conference on acoustics, speech and signal processing (ICASSP 2011), 2011.
-
(2011)
Proc. 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2011)
-
-
Mikolov, T.1
Kombrink, S.2
Burget, L.3
Cernocky, J.4
Khudanpur, S.5