SCOPUS Information Search Platform

34th International Conference on Machine Learning, ICML 2017

Volume 2, 2017, Pages 1551-1559

Language modeling with gated convolutional networks

(4) Dauphin, Yann N.ᵃ; Fan, Angelaᵃ; Auli, Michaelᵃ; Grangier, Davidᵃ

ᵃ FACEBOOK AI RESEARCH (United States)

Author keywords

[No Author keywords available]

Indexed keywords

COMPUTATIONAL LINGUISTICS; CONVOLUTION; LEARNING SYSTEMS; RECURRENT NEURAL NETWORKS;

ARCHITECTURAL DECISION; CONVOLUTIONAL NETWORKS; GATING MECHANISMS; LANGUAGE MODEL; LONG-TERM DEPENDENCIES; RECURRENT MODELS; STATE OF THE ART;

MODELING LANGUAGES;

EID: 85048443641 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (1101)

References (34)

1
- 0142166851
- A neural probabilistic language model
- Feb
- Bengio, Yoshua, Ducharme, Réjean, Vincent, Pascal, and Jauvin, Christian. A neural probabilistic language model. Journal of Machine Learning Research, 3(Feb):1137-1155, 2003.
- (2003) Journal of Machine Learning Research , vol.3 , pp. 1137-1155
- Bengio, Y.¹ Ducharme, R.² Vincent, P.³ Jauvin, C.⁴

2
- 84943795466
- arXiv preprint
- Chelba, Ciprian, Mikolov, Tomas, Schuster, Mike, Ge, Qi, Brants, Thorsten, Koehn, Philipp, and Robinson, Tony. One billion word benchmark for measuring progress in statistical language modeling. arXiv preprint arXiv:1312.3005, 2013.
- (2013) One Billion Word Benchmark for Measuring Progress in Statistical Language Modeling
- Chelba, C.¹ Mikolov, T.² Schuster, M.³ Ge, Q.⁴ Brants, T.⁵ Koehn, P.⁶ Robinson, T.⁷

3
- 85024115120
- An empirical study of smoothing techniques for language modeling
- Association for Computational Linguistics
- Chen, Stanley F and Goodman, Joshua. An empirical study of smoothing techniques for language modeling. In Proceedings of the 34th annual meeting on Association for Computational Linguistics, pp. 310-318. Association for Computational Linguistics, 1996.
- (1996) Proceedings of the 34th Annual Meeting on Association for Computational Linguistics , pp. 310-318
- Chen, S.F.¹ Goodman, J.²

4
- 85018888481
- Strategies for training large vocabulary neural language models
- abs/1512.04906
- Chen, Wenlin, Grangier, David, and Auli, Michael. Strategies for training large vocabulary neural language models. CoRR, abs/1512.04906, 2016.
- (2016) CoRR
- Chen, W.¹ Grangier, D.² Auli, M.³

5
- 84888340666
- Torch7: A matlab-like environment for machine learning
- Collobert, Ronan, Kavukcuoglu, Koray, and Farabet, Clement. Torch7: A Matlab-like Environment for Machine Learning. In BigLearn, NIPS Workshop, 2011. URL http://torch.ch.
- (2011) BigLearn, NIPS Workshop
- Collobert, R.¹ Kavukcuoglu, K.² Farabet, C.³

6
- 85048423619
- arXiv preprint
- Dauphin, Yann N and Grangier, David. Predicting distributions with linearizing belief networks. arXiv preprint arXiv:1511.05622, 2015.
- (2015) Predicting Distributions with Linearizing Belief Networks
- Dauphin, Y.N.¹ Grangier, D.²

7
- 79951563340
- Understanding the difficulty of training deep feedforward neural networks
- Glorot, Xavier and Bengio, Yoshua. Understanding the difficulty of training deep feedforward neural networks. The handbook of brain theory and neural networks, 2010.
- (2010) The Handbook of Brain Theory and Neural Networks
- Glorot, X.¹ Bengio, Y.²

8
- 85030484971
- ArXiv e-prints, September
- Grave, E., Joulin, A., Cissé, M., Grangier, D., and Jégou, H. Efficient softmax approximation for GPUs. ArXiv e-prints, September 2016a.
- (2016) Efficient Softmax Approximation for GPUs
- Grave, E.¹ Joulin, A.² Cissé, M.³ Grangier, D.⁴ Jégou, H.⁵

9
- 85037367719
- ArXiv e-prints, December
- Grave, E., Joulin, A., and Usunier, N. Improving Neural Language Models with a Continuous Cache. ArXiv e-prints, December 2016b.
- (2016) Improving Neural Language Models with A Continuous Cache
- Grave, E.¹ Joulin, A.² Usunier, N.³

10
- 84857889007
- Gutmann, Michael and Hyvärinen, Aapo. Noise-contrastive estimation: A new estimation principle for unnormalized statistical models.
- Noise-contrastive Estimation: A New Estimation Principle for Unnormalized Statistical Models
- Gutmann, M.¹ Hyvärinen, A.²

11
- 84958589374
- arXiv preprint
- He, Kaiming, Zhang, Xiangyu, Ren, Shaoqing, and Sun, Jian. Deep residual learning for image recognition. arXiv preprint arXiv:1512.03385, 2015a.
- (2015) Deep Residual Learning for Image Recognition
- He, K.¹ Zhang, X.² Ren, S.³ Sun, J.⁴

12
- 84973911419
- Delving deep into rectifiers: Surpassing human-level performance on imagenet classification
- He, Kaiming, Zhang, Xiangyu, Ren, Shaoqing, and Sun, Jian. Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. In Proceedings of the IEEE International Conference on Computer Vision, pp. 1026-1034, 2015b.
- (2015) Proceedings of the IEEE International Conference on Computer Vision , pp. 1026-1034
- He, K.¹ Zhang, X.² Ren, S.³ Sun, J.⁴

13
- 0031573117
- Long short-term memory
- Hochreiter, Sepp and Schmidhuber, Jürgen. Long short-term memory. Neural computation, 9(8): 1735-1780, 1997.
- (1997) Neural Computation , vol.9 , Issue.8 , pp. 1735-1780
- Hochreiter, S.¹ Schmidhuber, J.²

14
- 84994350060
- arXiv preprint
- Ji, Shihao, Vishwanathan, SVN, Satish, Nadathur, Anderson, Michael J, and Dubey, Pradeep. Blackout: Speeding up recurrent neural network language models with very large vocabularies. arXiv preprint arXiv:1511.06909, 2015.
- (2015) Blackout: Speeding Up Recurrent Neural Network Language Models with Very Large Vocabularies
- Ji, S.¹ Vishwanathan, S.V.N.² Satish, N.³ Anderson, M.J.⁴ Dubey, P.⁵

15
- 84978840213
- arXiv preprint
- Jozefowicz, Rafal, Vinyals, Oriol, Schuster, Mike, Shazeer, Noam, and Wu, Yonghui. Exploring the limits of language modeling. arXiv preprint arXiv:1602.02410, 2016.
- (2016) Exploring the Limits of Language Modeling
- Jozefowicz, R.¹ Vinyals, O.² Schuster, M.³ Shazeer, N.⁴ Wu, Y.⁵

16
- 85021676739
- Kalchbrenner, Nal, Espeholt, Lasse, Simonyan, Karen, van den Oord, Aaron, Graves, Alex, and Kavukcuoglu, Koray. Neural Machine Translation in Linear Time. arXiv, 2016.
- (2016) Neural Machine Translation in Linear Time
- Kalchbrenner, N.¹ Espeholt, L.² Simonyan, K.³ Van Den Oord, A.⁴ Graves, A.⁵ Kavukcuoglu, K.⁶

17
- 0028996876
- Improved backing-off for m-gram language modeling
- IEEE
- Kneser, Reinhard and Ney, Hermann. Improved backing-off for m-gram language modeling. In Acoustics, Speech, and Signal Processing, 1995. ICASSP-95., 1995 International Conference on, volume 1, pp. 181-184. IEEE, 1995.
- (1995) Acoustics, Speech, and Signal Processing, 1995. ICASSP-95., 1995 International Conference on , vol.1 , pp. 181-184
- Kneser, R.¹ Ney, H.²

18
- 84928706421
- Cambridge University Press, New York, NY, USA, 1st edition, 9780521874151
- Koehn, Philipp. Statistical Machine Translation. Cambridge University Press, New York, NY, USA, 1st edition, 2010. ISBN 0521874157, 9780521874151.
- (2010) Statistical Machine Translation
- Koehn, P.¹

19
- 85144240571
- Factorization tricks for LSTM networks
- abs/1703.10722
- Kuchaiev, Oleksii and Ginsburg, Boris. Factorization tricks for LSTM networks. CoRR, abs/1703.10722, 2017. URL http://arxiv.org/abs/1703.10722.
- (2017) CoRR
- Kuchaiev, O.¹ Ginsburg, B.²

20
- 0002263996
- Convolutional networks for images, speech, and time series
- LeCun, Yann and Bengio, Yoshua. Convolutional networks for images, speech, and time series. The handbook of brain theory and neural networks, 3361(10): 1995, 1995.
- (1995) The Handbook of Brain Theory and Neural Networks , vol.3361 , Issue.10 , pp. 1995
- LeCun, Y.¹ Bengio, Y.²

21
- 0003612818
- Manning, Christopher D and Schütze, Hinrich. Foundations of statistical natural language processing, 1999.
- (1999) Foundations of Statistical Natural Language Processing
- Manning, C.D.¹ Schütze, H.²

22
- 85037338922
- ArXiv e-prints, September
- Merity, S., Xiong, C., Bradbury, J., and Socher, R. Pointer Sentinel Mixture Models. ArXiv e-prints, September 2016.
- (2016) Pointer Sentinel Mixture Models
- Merity, S.¹ Xiong, C.² Bradbury, J.³ Socher, R.⁴

23
- 79959829092
- Recurrent Neural Network based Language Model
- Mikolov, Tomáš, Karafiát, Martin, Burget, Lukáš, Černocký, Jan, and Khudanpur, Sanjeev. Recurrent Neural Network based Language Model. In Proc. of INTERSPEECH, pp. 1045-1048, 2010.
- (2010) Proc. of INTERSPEECH , pp. 1045-1048
- Mikolov, T.¹ Karafiát, M.² Burget, L.³ Černocký, J.⁴ Khudanpur, S.⁵

24
- 34547970628
- Three new graphical models for statistical language modelling
- ACM
- Mnih, Andriy and Hinton, Geoffrey. Three new graphical models for statistical language modelling. In Proceedings of the 24th international conference on Machine learning, pp. 641-648. ACM, 2007.
- (2007) Proceedings of the 24th International Conference on Machine Learning , pp. 641-648
- Mnih, A.¹ Hinton, G.²

25
- 34547997987
- Hierarchical probabilistic neural network language model
- Citeseer
- Morin, Frederic and Bengio, Yoshua. Hierarchical probabilistic neural network language model. In Aistats, volume 5, pp. 246-252. Citeseer, 2005.
- (2005) Aistats , vol.5 , pp. 246-252
- Morin, F.¹ Bengio, Y.²

26
- 84989286061
- arXiv preprint
- Oord, Aaron van den, Kalchbrenner, Nal, and Kavukcuoglu, Koray. Pixel recurrent neural networks. arXiv preprint arXiv:1601.06759, 2016.
- (2016) Pixel Recurrent Neural Networks
- Van Den Oord, A.¹ Kalchbrenner, N.² Kavukcuoglu, K.³

27
- 85018927054
- arXiv preprint
- Oord, Aaron van den, Kalchbrenner, Nal, Vinyals, Oriol, Espeholt, Lasse, Graves, Alex, and Kavukcuoglu, Koray. Conditional image generation with pixelcnn decoders. arXiv preprint arXiv:1606.05328, 2016b.
- (2016) Conditional Image Generation with Pixelcnn Decoders
- Van Den Oord, A.¹ Kalchbrenner, N.² Vinyals, O.³ Espeholt, L.⁴ Graves, A.⁵ Kavukcuoglu, K.⁶

28
- 84892982833
- On the difficulty of training recurrent neural networks
- Pascanu, Razvan, Mikolov, Tomas, and Bengio, Yoshua. On the difficulty of training recurrent neural networks. In Proceedings of The 30th International Conference on Machine Learning, pp. 1310-1318, 2013.
- (2013) Proceedings of the 30th International Conference on Machine Learning , pp. 1310-1318
- Pascanu, R.¹ Mikolov, T.² Bengio, Y.³

29
- 85018885758
- arXiv preprint
- Salimans, Tim and Kingma, Diederik P. Weight normalization: A simple reparameterization to accelerate training of deep neural networks. arXiv preprint arXiv:1602.07868, 2016.
- (2016) Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks
- Salimans, T.¹ Kingma, D.P.²

30
- 85018905949
- arXiv preprint
- Shazeer, Noam, Pelemans, Joris, and Chelba, Ciprian. Skip-gram language modeling using sparse non-negative matrix probability estimation. arXiv preprint arXiv:1412.1454, 2014.
- (2014) Skip-gram Language Modeling Using Sparse Non-negative Matrix Probability Estimation
- Shazeer, N.¹ Pelemans, J.² Chelba, C.³

31
- 85088226307
- Outrageously large neural networks: The sparsely-gated mixture-of-experts layer
- abs/1701.06538
- Shazeer, Noam, Mirhoseini, Azalia, Maziarz, Krzysztof, Davis, Andy, Le, Quoc V., Hinton, Geoffrey E., and Dean, Jeff. Outrageously large neural networks: The sparsely-gated mixture-of-experts layer. CoRR, abs/1701.06538, 2017. URL http://arxiv.org/abs/1701.06538.
- (2017) CoRR
- Shazeer, N.¹ Mirhoseini, A.² Maziarz, K.³ Davis, A.⁴ Le, Q.V.⁵ Hinton, G.E.⁶ Dean, J.⁷

32
- 0004223670
- Steedman, Mark. The syntactic process. 2002.
- (2002) The Syntactic Process
- Steedman, M.¹

33
- 84897510162
- Sutskever, Ilya, Martens, James, Dahl, George E, and Hinton, Geoffrey E. On the importance of initialization and momentum in deep learning. 2013.
- (2013) On the Importance of Initialization and Momentum in Deep Learning
- Sutskever, I.¹ Martens, J.² Dahl, G.E.³ Hinton, G.E.⁴

34
- 84923929378
- Springer Publishing Company, Incorporated, 9781447157786
- Yu, Dong and Deng, Li. Automatic Speech Recognition: A Deep Learning Approach. Springer Publishing Company, Incorporated, 2014. ISBN 1447157788, 9781447157786.
- (2014) Automatic Speech Recognition: A Deep Learning Approach
- Yu, D.¹ Deng, L.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.