-
1
-
-
0028392483
-
Learning long-term dependencies with gradient descent is difficult
-
Y. Bengio, P. Simard, and P. Frasconi. Learning long-term dependencies with gradient descent is difficult. IEEE Transactions on Neural Networks, 5(2):157-166, 1994.
-
(1994)
IEEE Transactions on Neural Networks
, vol.5
, Issue.2
, pp. 157-166
-
-
Bengio, Y.1
Simard, P.2
Frasconi, P.3
-
2
-
-
0142166851
-
A neural probabilistic language model
-
Y. Bengio, R. Ducharme, P. Vincent, and C. Jauvin. A neural probabilistic language model. Journal of Machine Learning Research, 3(Feb):1137-1155, 2003.
-
(2003)
Journal of Machine Learning Research
, vol.3
, pp. 1137-1155
-
-
Bengio, Y.1
Ducharme, R.2
Vincent, P.3
Jauvin, C.4
-
4
-
-
76249118968
-
Topic models
-
D. M. Blei and J. D. Lafferty. Topic models. Text Mining: Classification, Clustering, and Applications, 10(71):34, 2009.
-
(2009)
Text Mining: Classification, Clustering, and Applications
, vol.10
, Issue.71
, pp. 34
-
-
Blei, D.M.1
Lafferty, J.D.2
-
7
-
-
84943795466
-
-
arXiv preprint
-
C. Chelba, T. Mikolov, M. Schuster, Q. Ge, T. Brants, P. Koehn, and T. Robinson. One billion word benchmark for measuring progress in statistical language modeling. arXiv preprint arXiv:1312.3005, 2013.
-
(2013)
One Billion Word Benchmark for Measuring Progress in Statistical Language Modeling
-
-
Chelba, C.1
Mikolov, T.2
Schuster, M.3
Ge, Q.4
Brants, T.5
Koehn, P.6
Robinson, T.7
-
8
-
-
84961291190
-
-
arXiv preprint
-
K. Cho, B. Van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078, 2014.
-
(2014)
Learning Phrase Representations Using RNN Encoder-Decoder for Statistical Machine Translation
-
-
Cho, K.1
Van Merriënboer, B.2
Gulcehre, C.3
Bahdanau, D.4
Bougares, F.5
Schwenk, H.6
Bengio, Y.7
-
10
-
-
8644247561
-
Dependence language model for information retrieval
-
J. Gao, J.-Y. Nie, G. Wu, and G. Cao. Dependence language model for information retrieval. In Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval, pages 170-177. ACM, 2004.
-
(2004)
Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
, pp. 170-177
-
-
Gao, J.1
Nie, J.-Y.2
Wu, G.3
Cao, G.4
-
11
-
-
84988877050
-
-
arXiv preprint
-
S. Ghosh, O. Vinyals, B. Strope, S. Roy, T. Dean, and L. Heck. Contextual LSTM (CLSTM) models for large scale NLP tasks. arXiv preprint arXiv:1602.06291, 2016.
-
(2016)
Contextual LSTM (CLSTM) Models for Large Scale NLP Tasks
-
-
Ghosh, S.1
Vinyals, O.2
Strope, B.3
Roy, S.4
Dean, T.5
Heck, L.6
-
13
-
-
84994157341
-
-
arXiv preprint
-
Y. Ji, T. Cohn, L. Kong, C. Dyer, and J. Eisenstein. Document context language models. arXiv preprint arXiv:1511.03962, 2015.
-
(2015)
Document Context Language Models
-
-
Ji, Y.1
Cohn, T.2
Kong, L.3
Dyer, C.4
Eisenstein, J.5
-
15
-
-
0033225865
-
An introduction to variational methods for graphical models
-
M. I. Jordan, Z. Ghahramani, T. S. Jaakkola, and L. K. Saul. An introduction to variational methods for graphical models. Machine Learning, 37(2):183-233, 1999.
-
(1999)
Machine Learning
, vol.37
, Issue.2
, pp. 183-233
-
-
Jordan, M.I.1
Ghahramani, Z.2
Jaakkola, T.S.3
Saul, L.K.4
-
16
-
-
84978840213
-
-
arXiv preprint
-
R. Jozefowicz, O. Vinyals, M. Schuster, N. Shazeer, and Y. Wu. Exploring the limits of language modeling. arXiv preprint arXiv:1602.02410, 2016.
-
(2016)
Exploring the Limits of Language Modeling
-
-
Jozefowicz, R.1
Vinyals, O.2
Schuster, M.3
Shazeer, N.4
Wu, Y.5
-
19
-
-
84926067654
-
Distributed representations of sentences and documents
-
Q. V. Le and T. Mikolov. Distributed representations of sentences and documents. In ICML, volume 14, pages 1188-1196, 2014.
-
(2014)
ICML
, vol.14
, pp. 1188-1196
-
-
Le, Q.V.1
Mikolov, T.2
-
20
-
-
84959935599
-
Hierarchical recurrent neural network for document modeling
-
R. Lin, S. Liu, M. Yang, M. Li, M. Zhou, and S. Li. Hierarchical recurrent neural network for document modeling. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pages 899-907, 2015.
-
(2015)
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing
, pp. 899-907
-
-
Lin, R.1
Liu, S.2
Yang, M.3
Li, M.4
Zhou, M.5
Li, S.6
-
21
-
-
84859023447
-
Learning word vectors for sentiment analysis
-
Association for Computational Linguistics
-
A. L. Maas, R. E. Daly, P. T. Pham, D. Huang, A. Y. Ng, and C. Potts. Learning word vectors for sentiment analysis. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies-Volume 1, pages 142-150. Association for Computational Linguistics, 2011.
-
(2011)
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies
, vol.1
, pp. 142-150
-
-
Maas, A.L.1
Daly, R.E.2
Pham, P.T.3
Huang, D.4
Ng, A.Y.5
Potts, C.6
-
22
-
-
34249852033
-
Building a large annotated corpus of English: The Penn Treebank
-
M. P. Marcus, M. A. Marcinkiewicz, and B. Santorini. Building a large annotated corpus of English: The Penn Treebank. Computational Linguistics, 19(2):313-330, 1993.
-
(1993)
Computational Linguistics
, vol.19
, Issue.2
, pp. 313-330
-
-
Marcus, M.P.1
Marcinkiewicz, M.A.2
Santorini, B.3
-
24
-
-
84874235486
-
Context dependent recurrent neural network language model
-
T. Mikolov and G. Zweig. Context dependent recurrent neural network language model. In SLT, pages 234-239, 2012.
-
(2012)
SLT
, pp. 234-239
-
-
Mikolov, T.1
Zweig, G.2
-
25
-
-
80051627816
-
Recurrent neural network based language model
-
T. Mikolov, M. Karafiát, L. Burget, J. Černocký, and S. Khudanpur. Recurrent neural network based language model. In Interspeech, volume 2, page 3, 2010.
-
(2010)
Interspeech
, vol.2
, pp. 3
-
-
Mikolov, T.1
Karafiát, M.2
Burget, L.3
Černocký, J.4
Khudanpur, S.5
-
26
-
-
80051643236
-
Extensions of recurrent neural network language model
-
IEEE
-
T. Mikolov, S. Kombrink, L. Burget, J. Černocký, and S. Khudanpur. Extensions of recurrent neural network language model. In 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 5528-5531. IEEE, 2011.
-
(2011)
2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
, pp. 5528-5531
-
-
Mikolov, T.1
Kombrink, S.2
Burget, L.3
Černocký, J.4
Khudanpur, S.5
-
27
-
-
84939804661
-
-
arXiv preprint
-
T. Mikolov, A. Joulin, S. Chopra, M. Mathieu, and M. Ranzato. Learning longer memory in recurrent neural networks. arXiv preprint arXiv:1412.7753, 2014.
-
(2014)
Learning Longer Memory in Recurrent Neural Networks
-
-
Mikolov, T.1
Joulin, A.2
Chopra, S.3
Mathieu, M.4
Ranzato, M.5
-
28
-
-
80053260943
-
Optimizing semantic coherence in topic models
-
Association for Computational Linguistics
-
D. Mimno, H. M. Wallach, E. Talley, M. Leenders, and A. McCallum. Optimizing semantic coherence in topic models. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 262-272. Association for Computational Linguistics, 2011.
-
(2011)
Proceedings of the Conference on Empirical Methods in Natural Language Processing
, pp. 262-272
-
-
Mimno, D.1
Wallach, H.M.2
Talley, E.3
Leenders, M.4
McCallum, A.5
-
29
-
-
85049078406
-
Adversarial training methods for semi-supervised text classification
-
T. Miyato, A. M. Dai, and I. Goodfellow. Adversarial training methods for semi-supervised text classification. stat, 1050:7, 2016.
-
(2016)
Stat
, vol.1050
, pp. 7
-
-
Miyato, T.1
Dai, A.M.2
Goodfellow, I.3
-
30
-
-
84897497795
-
On the difficulty of training recurrent neural networks
-
R. Pascanu, T. Mikolov, and Y. Bengio. On the difficulty of training recurrent neural networks. ICML (3), 28:1310-1318, 2013.
-
(2013)
ICML
, vol.28
, Issue.3
, pp. 1310-1318
-
-
Pascanu, R.1
Mikolov, T.2
Bengio, Y.3
-
32
-
-
84904163933
-
Dropout: A simple way to prevent neural networks from overfitting
-
N. Srivastava, G. E. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov. Dropout: a simple way to prevent neural networks from overfitting. Journal of Machine Learning Research, 15(1):1929-1958, 2014.
-
(2014)
Journal of Machine Learning Research
, vol.15
, Issue.1
, pp. 1929-1958
-
-
Srivastava, N.1
Hinton, G.E.2
Krizhevsky, A.3
Sutskever, I.4
Salakhutdinov, R.5