[2] Lalit R. Bahl, Frederick Jelinek, and Robert L. Mercer. A maximum likelihood approach to continuous speech recognition. PAMI, 1983.
[3] Jerome R. Bellegarda. Exploiting latent semantic information in statistical language modeling. Proceedings of the IEEE, 2000.
[8] Noah Coccaro and Daniel Jurafsky. Towards better integration of semantic predictors in statistical language modeling. In ICSLP, 1998.
[10] Jesse Dodge, Andreea Gane, Xiang Zhang, Antoine Bordes, Sumit Chopra, Alexander Miller, Arthur Szlam, and Jason Weston. Evaluating prerequisite qualities for learning end-to-end dialog systems. arXiv preprint arXiv:1511.06931, 2015.
[11] John Duchi, Elad Hazan, and Yoram Singer. Adaptive subgradient methods for online learning and stochastic optimization. JMLR, 2011.
[15] Edouard Grave, Armand Joulin, Moustapha Cissé, David Grangier, and Hervé Jégou. Efficient softmax approximation for GPUs. arXiv preprint arXiv:1609.04309, 2016.
[16] Alex Graves, Abdel-rahman Mohamed, and Geoffrey Hinton. Speech recognition with deep recurrent neural networks. In ICASSP, 2013.
[18] Edward Grefenstette, Karl Moritz Hermann, Mustafa Suleyman, and Phil Blunsom. Learning to transduce with unbounded memory. In Advances in Neural Information Processing Systems, pp. 1828-1836, 2015.
[19] Caglar Gulcehre, Sungjin Ahn, Ramesh Nallapati, Bowen Zhou, and Yoshua Bengio. Pointing the unknown words. arXiv preprint arXiv:1603.08148, 2016.
[20] Karl Moritz Hermann, Tomas Kocisky, Edward Grefenstette, Lasse Espeholt, Will Kay, Mustafa Suleyman, and Phil Blunsom. Teaching machines to read and comprehend. In NIPS, 2015.
[22] Rukmini M. Iyer and Mari Ostendorf. Modeling long distance dependence in language: Topic mixtures versus dynamic cache models. IEEE Transactions on Speech and Audio Processing, 1999.
[23] Frederick Jelinek, Bernard Merialdo, Salim Roukos, and Martin Strauss. A dynamic language model for speech recognition. In HLT, 1991.
[25] Rafal Jozefowicz, Oriol Vinyals, Mike Schuster, Noam Shazeer, and Yonghui Wu. Exploring the limits of language modeling. arXiv preprint arXiv:1602.02410, 2016.
[27] Slava M. Katz. Estimation of probabilities from sparse data for the language model component of a speech recognizer. ICASSP, 1987.
[28] Sanjeev Khudanpur and Jun Wu. Maximum entropy techniques for exploiting syntactic, semantic and collocational dependencies in language modeling. Computer Speech & Language, 2000.
[29] Reinhard Kneser and Hermann Ney. Improved backing-off for m-gram language modeling. In ICASSP, 1995.
[30] Reinhard Kneser and Volker Steinbiss. On the dynamic adaptation of stochastic language models. In ICASSP, 1993.
[31] Roland Kuhn. Speech recognition and the frequency of recently used words: A modified Markov model for natural language. In Proceedings of the 12th Conference on Computational Linguistics, Volume 1, 1988.
[32] Roland Kuhn and Renato De Mori. A cache-based natural language model for speech recognition. PAMI, 1990.
[34] Raymond Lau, Ronald Rosenfeld, and Salim Roukos. Trigger-based language models: A maximum entropy approach. In ICASSP, 1993.
[37] Tomas Mikolov and Geoffrey Zweig. Context dependent recurrent neural network language model. In SLT, 2012.
[38] Tomas Mikolov, Martin Karafiát, Lukas Burget, Jan Černocký, and Sanjeev Khudanpur. Recurrent neural network based language model. In INTERSPEECH, 2010.
[39] Tomas Mikolov, Anoop Deoras, Stefan Kombrink, Lukas Burget, and Jan Černocký. Empirical evaluation and combination of advanced language modeling techniques. In INTERSPEECH, 2011.
[40] Tomas Mikolov, Armand Joulin, Sumit Chopra, Michael Mathieu, and Marc'Aurelio Ranzato. Learning longer memory in recurrent neural networks. arXiv preprint arXiv:1412.7753, 2014.
[41] Denis Paperno, Germán Kruszewski, Angeliki Lazaridou, Quan Ngoc Pham, Raffaella Bernardi, Sandro Pezzelle, Marco Baroni, Gemma Boleda, and Raquel Fernández. The LAMBADA dataset: Word prediction requiring a broad discourse context. arXiv preprint arXiv:1606.06031, 2016.
[42] Ronald Rosenfeld. A maximum entropy approach to adaptive statistical language modeling. Computer Speech & Language, 1996.
[43] Andreas Stolcke, Noah Coccaro, Rebecca Bates, Paul Taylor, Carol Van Ess-Dykema, Klaus Ries, Elizabeth Shriberg, Daniel Jurafsky, Rachel Martin, and Marie Meteer. Dialogue act modeling for automatic tagging and recognition of conversational speech. Computational Linguistics, 2000.
[48] Ronald J. Williams and Jing Peng. An efficient gradient-based algorithm for on-line training of recurrent network trajectories. Neural Computation, 1990.