-
1
-
-
0142166851
-
A neural probabilistic language model
-
Feb
-
Bengio, Yoshua, Ducharme, Réjean, Vincent, Pascal, and Jauvin, Christian. A neural probabilistic language model. Journal of Machine Learning Research, 3(Feb):1137-1155, 2003.
-
(2003)
Journal of Machine Learning Research
, vol.3
, pp. 1137-1155
-
-
Bengio, Y.1
Ducharme, R.2
Vincent, P.3
Jauvin, C.4
-
2
-
-
84943795466
-
-
arXiv preprint
-
Chelba, Ciprian, Mikolov, Tomas, Schuster, Mike, Ge, Qi, Brants, Thorsten, Koehn, Philipp, and Robinson, Tony. One billion word benchmark for measuring progress in statistical language modeling. arXiv preprint arXiv:1312.3005, 2013.
-
(2013)
One Billion Word Benchmark for Measuring Progress in Statistical Language Modeling
-
-
Chelba, C.1
Mikolov, T.2
Schuster, M.3
Ge, Q.4
Brants, T.5
Koehn, P.6
Robinson, T.7
-
4
-
-
85018888481
-
Strategies for training large vocabulary neural language models
-
abs/1512.04906
-
Chen, Wenlin, Grangier, David, and Auli, Michael. Strategies for training large vocabulary neural language models. CoRR, abs/1512.04906, 2016.
-
(2016)
CoRR
-
-
Chen, W.1
Grangier, D.2
Auli, M.3
-
5
-
-
84888340666
-
Torch7: A matlab-like environment for machine learning
-
Collobert, Ronan, Kavukcuoglu, Koray, and Farabet, Clement. Torch7: A Matlab-like Environment for Machine Learning. In BigLearn, NIPS Workshop, 2011. URL http://torch.ch.
-
(2011)
BigLearn, NIPS Workshop
-
-
Collobert, R.1
Kavukcuoglu, K.2
Farabet, C.3
-
8
-
-
85030484971
-
-
ArXiv e-prints, September
-
Grave, E., Joulin, A., Cissé, M., Grangier, D., and Jégou, H. Efficient softmax approximation for GPUs. ArXiv e-prints, September 2016a.
-
(2016)
Efficient Softmax Approximation for GPUs
-
-
Grave, E.1
Joulin, A.2
Cissé, M.3
Grangier, D.4
Jégou, H.5
-
9
-
-
85037367719
-
-
ArXiv e-prints, December
-
Grave, E., Joulin, A., and Usunier, N. Improving Neural Language Models with a Continuous Cache. ArXiv e-prints, December 2016b.
-
(2016)
Improving Neural Language Models with a Continuous Cache
-
-
Grave, E.1
Joulin, A.2
Usunier, N.3
-
11
-
-
84958589374
-
-
arXiv preprint
-
He, Kaiming, Zhang, Xiangyu, Ren, Shaoqing, and Sun, Jian. Deep residual learning for image recognition. arXiv preprint arXiv:1512.03385, 2015a.
-
(2015)
Deep Residual Learning for Image Recognition
-
-
He, K.1
Zhang, X.2
Ren, S.3
Sun, J.4
-
12
-
-
84973911419
-
Delving deep into rectifiers: Surpassing human-level performance on imagenet classification
-
He, Kaiming, Zhang, Xiangyu, Ren, Shaoqing, and Sun, Jian. Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. In Proceedings of the IEEE International Conference on Computer Vision, pp. 1026-1034, 2015b.
-
(2015)
Proceedings of the IEEE International Conference on Computer Vision
, pp. 1026-1034
-
-
He, K.1
Zhang, X.2
Ren, S.3
Sun, J.4
-
14
-
-
84994350060
-
-
arXiv preprint
-
Ji, Shihao, Vishwanathan, SVN, Satish, Nadathur, Anderson, Michael J, and Dubey, Pradeep. Blackout: Speeding up recurrent neural network language models with very large vocabularies. arXiv preprint arXiv:1511.06909, 2015.
-
(2015)
Blackout: Speeding Up Recurrent Neural Network Language Models with Very Large Vocabularies
-
-
Ji, S.1
Vishwanathan, S.V.N.2
Satish, N.3
Anderson, M.J.4
Dubey, P.5
-
15
-
-
84978840213
-
-
arXiv preprint
-
Jozefowicz, Rafal, Vinyals, Oriol, Schuster, Mike, Shazeer, Noam, and Wu, Yonghui. Exploring the limits of language modeling. arXiv preprint arXiv:1602.02410, 2016.
-
(2016)
Exploring the Limits of Language Modeling
-
-
Jozefowicz, R.1
Vinyals, O.2
Schuster, M.3
Shazeer, N.4
Wu, Y.5
-
16
-
-
85021676739
-
-
Kalchbrenner, Nal, Espeholt, Lasse, Simonyan, Karen, van den Oord, Aaron, Graves, Alex, and Kavukcuoglu, Koray. Neural Machine Translation in Linear Time. arXiv, 2016.
-
(2016)
Neural Machine Translation in Linear Time
-
-
Kalchbrenner, N.1
Espeholt, L.2
Simonyan, K.3
Van Den Oord, A.4
Graves, A.5
Kavukcuoglu, K.6
-
17
-
-
0028996876
-
Improved backing-off for m-gram language modeling
-
IEEE
-
Kneser, Reinhard and Ney, Hermann. Improved backing-off for m-gram language modeling. In Acoustics, Speech, and Signal Processing, 1995. ICASSP-95., 1995 International Conference on, volume 1, pp. 181-184. IEEE, 1995.
-
(1995)
Acoustics, Speech, and Signal Processing, 1995. ICASSP-95., 1995 International Conference on
, vol.1
, pp. 181-184
-
-
Kneser, R.1
Ney, H.2
-
18
-
-
84928706421
-
-
Cambridge University Press, New York, NY, USA, 1st edition, 9780521874151
-
Koehn, Philipp. Statistical Machine Translation. Cambridge University Press, New York, NY, USA, 1st edition, 2010. ISBN 0521874157, 9780521874151.
-
(2010)
Statistical Machine Translation
-
-
Koehn, P.1
-
19
-
-
85144240571
-
Factorization tricks for LSTM networks
-
abs/1703.10722
-
Kuchaiev, Oleksii and Ginsburg, Boris. Factorization tricks for LSTM networks. CoRR, abs/1703.10722, 2017. URL http://arxiv.org/abs/1703.10722.
-
(2017)
CoRR
-
-
Kuchaiev, O.1
Ginsburg, B.2
-
20
-
-
0002263996
-
Convolutional networks for images, speech, and time series
-
LeCun, Yann and Bengio, Yoshua. Convolutional networks for images, speech, and time series. The handbook of brain theory and neural networks, 3361(10): 1995, 1995.
-
(1995)
The Handbook of Brain Theory and Neural Networks
, vol.3361
, Issue.10
, pp. 1995
-
-
LeCun, Y.1
Bengio, Y.2
-
22
-
-
85037338922
-
-
ArXiv e-prints, September
-
Merity, S., Xiong, C., Bradbury, J., and Socher, R. Pointer Sentinel Mixture Models. ArXiv e-prints, September 2016.
-
(2016)
Pointer Sentinel Mixture Models
-
-
Merity, S.1
Xiong, C.2
Bradbury, J.3
Socher, R.4
-
23
-
-
79959829092
-
Recurrent Neural Network based Language Model
-
Mikolov, Tomáš, Karafiát, Martin, Burget, Lukáš, Černocký, Jan, and Khudanpur, Sanjeev. Recurrent Neural Network based Language Model. In Proc. of INTERSPEECH, pp. 1045-1048, 2010.
-
(2010)
Proc. of INTERSPEECH
, pp. 1045-1048
-
-
Mikolov, T.1
Karafiát, M.2
Burget, L.3
Černocký, J.4
Khudanpur, S.5
-
25
-
-
34547997987
-
Hierarchical probabilistic neural network language model
-
Citeseer
-
Morin, Frederic and Bengio, Yoshua. Hierarchical probabilistic neural network language model. In Aistats, volume 5, pp. 246-252. Citeseer, 2005.
-
(2005)
Aistats
, vol.5
, pp. 246-252
-
-
Morin, F.1
Bengio, Y.2
-
27
-
-
85018927054
-
-
arXiv preprint
-
Oord, Aaron van den, Kalchbrenner, Nal, Vinyals, Oriol, Espeholt, Lasse, Graves, Alex, and Kavukcuoglu, Koray. Conditional image generation with pixelcnn decoders. arXiv preprint arXiv:1606.05328, 2016b.
-
(2016)
Conditional Image Generation with PixelCNN Decoders
-
-
Van Den Oord, A.1
Kalchbrenner, N.2
Vinyals, O.3
Espeholt, L.4
Graves, A.5
Kavukcuoglu, K.6
-
28
-
-
84892982833
-
On the difficulty of training recurrent neural networks
-
Pascanu, Razvan, Mikolov, Tomas, and Bengio, Yoshua. On the difficulty of training recurrent neural networks. In Proceedings of The 30th International Conference on Machine Learning, pp. 1310-1318, 2013.
-
(2013)
Proceedings of the 30th International Conference on Machine Learning
, pp. 1310-1318
-
-
Pascanu, R.1
Mikolov, T.2
Bengio, Y.3
-
31
-
-
85088226307
-
Outrageously large neural networks: The sparsely-gated mixture-of-experts layer
-
abs/1701.06538
-
Shazeer, Noam, Mirhoseini, Azalia, Maziarz, Krzysztof, Davis, Andy, Le, Quoc V., Hinton, Geoffrey E., and Dean, Jeff. Outrageously large neural networks: The sparsely-gated mixture-of-experts layer. CoRR, abs/1701.06538, 2017. URL http://arxiv.org/abs/1701.06538.
-
(2017)
CoRR
-
-
Shazeer, N.1
Mirhoseini, A.2
Maziarz, K.3
Davis, A.4
Le, Q.V.5
Hinton, G.E.6
Dean, J.7
-
33
-
-
84897510162
-
-
Sutskever, Ilya, Martens, James, Dahl, George E, and Hinton, Geoffrey E. On the importance of initialization and momentum in deep learning. 2013.
-
(2013)
On the Importance of Initialization and Momentum in Deep Learning
-
-
Sutskever, I.1
Martens, J.2
Dahl, G.E.3
Hinton, G.E.4