-
2
-
-
85032751458
-
Deep neural networks for acoustic modeling in speech recognition
-
IEEE, Nov
-
G. Hinton, L. Deng, et al., "Deep Neural Networks for Acoustic Modeling in Speech Recognition," Signal Processing Magazine, IEEE, vol. 29, no. 6, pp. 82-97, Nov 2012.
-
(2012)
Signal Processing Magazine
, vol.29
, Issue.6
, pp. 82-97
-
-
Hinton, G.1
Deng, L.2
-
3
-
-
84858976070
-
Feature engineering in context-dependent deep neural networks for conversational speech transcription
-
Dec
-
F. Seide, G. Li, X. Chen, and D. Yu, "Feature engineering in context-dependent deep neural networks for conversational speech transcription," in Proc. of ASRU, Dec 2011.
-
(2011)
Proc. of ASRU
-
-
Seide, F.1
Li, G.2
Chen, X.3
Yu, D.4
-
4
-
-
0033709098
-
Tandem connectionist feature extraction for conventional HMM systems
-
H. Hermansky, D. Ellis, and S. Sharma, "Tandem Connectionist Feature Extraction for Conventional HMM Systems," in Proc. of ICASSP, 2000.
-
(2000)
Proc. of ICASSP
-
-
Hermansky, H.1
Ellis, D.2
Sharma, S.3
-
5
-
-
84858955616
-
Study of probabilistic and bottle-neck features in multilingual environment
-
Frantisek Grezl, Martin Karafiat, and Milos Janda, "Study of probabilistic and bottle-neck features in multilingual environment," in Proc. of ASRU, 2011.
-
(2011)
Proc. of ASRU
-
-
Grezl, F.1
Karafiat, M.2
Janda, M.3
-
6
-
-
0003573244
-
-
Kluwer Academic Publishers, Norwell, MA, USA
-
H. A. Bourlard and N. Morgan, Connectionist Speech Recognition: A Hybrid Approach, Kluwer Academic Publishers, Norwell, MA, USA, 1993.
-
(1993)
Connectionist Speech Recognition: A Hybrid Approach
-
-
Bourlard, H.A.1
Morgan, N.2
-
7
-
-
0001860529
-
A post-processing system to yield reduced word error rates: Recogniser output voting error reduction (ROVER)
-
J. G. Fiscus, "A post-processing system to yield reduced word error rates: Recogniser Output Voting Error Reduction (ROVER)," in Proc. of ASRU, 1997.
-
(1997)
Proc. of ASRU
-
-
Fiscus, J.G.1
-
8
-
-
0033676943
-
Large vocabulary decoding and confidence estimation using word posterior probabilities
-
G. Evermann and P. C. Woodland, "Large vocabulary decoding and confidence estimation using word posterior probabilities," in Proc. of ICASSP, 2000.
-
(2000)
Proc. of ICASSP
-
-
Evermann, G.1
Woodland, P.C.2
-
10
-
-
43849107771
-
The SRI/OGI 2006 spoken term detection system
-
D. Vergyri, I. Shafran, et al., "The SRI/OGI 2006 spoken term detection system," in Proc. of Interspeech, 2007.
-
(2007)
Proc. of Interspeech
-
-
Vergyri, D.1
Shafran, I.2
-
11
-
-
67649518727
-
Subword modeling of out of vocabulary words in spoken term detection
-
I. Szoke, L. Burget, J. Cernocky, and M. Fapso, "Subword modeling of out of vocabulary words in spoken term detection," in Proc. of SLT, 2008.
-
(2008)
Proc. of SLT
-
-
Szoke, I.1
Burget, L.2
Cernocky, J.3
Fapso, M.4
-
12
-
-
84890489531
-
System combination and score normalization for spoken term detection
-
J. Mamou et al., "System combination and score normalization for spoken term detection," in Proc. of ICASSP, 2013.
-
(2013)
Proc. of ICASSP
-
-
Mamou, J.1
-
13
-
-
84890542302
-
Exploiting diversity for spoken term detection
-
L. Mangu, H. Soltau, H.-K. Kuo, B. Kingsbury, and G. Saon, "Exploiting diversity for spoken term detection," in Proc. of ICASSP, 2013.
-
(2013)
Proc. of ICASSP
-
-
Mangu, L.1
Soltau, H.2
Kuo, H.-K.3
Kingsbury, B.4
Saon, G.5
-
14
-
-
84890537373
-
A high-performance Cantonese keyword search system
-
B. Kingsbury et al., "A high-performance Cantonese keyword search system," in Proc. of ICASSP, 2013.
-
(2013)
Proc. of ICASSP
-
-
Kingsbury, B.1
-
15
-
-
84893692703
-
Score normalization and system combination for improved keyword spotting
-
D. Karakos, R. Schwartz, S. Tsakalidis, L. Zhang, et al., "Score normalization and system combination for improved keyword spotting," in Proc. of ASRU, 2013.
-
(2013)
Proc. of ASRU
-
-
Karakos, D.1
Schwartz, R.2
Tsakalidis, S.3
Zhang, L.4
-
17
-
-
0003571976
-
-
Cambridge University
-
S. J. Young, G. Evermann, M. J. F. Gales, T. Hain, D. Kershaw, X. Liu, G. Moore, J. Odell, D. Ollason, D. Povey, V. Valtchev, and P. C. Woodland, The HTK Book (for HTK version 3.4.1), Cambridge University, http://htk.eng.cam.ac.uk, 2009.
-
(2009)
The HTK Book (For HTK Version 3.4.1)
-
-
Young, S.J.1
Evermann, G.2
Gales, M.J.F.3
Hain, T.4
Kershaw, D.5
Liu, X.6
Moore, G.7
Odell, J.8
Ollason, D.9
Povey, D.10
Valtchev, V.11
Woodland, P.C.12
-
18
-
-
84893712779
-
-
David Johnson et al., "QuickNet," http://www1.icsi.berkeley.edu/Speech/qn.html.
-
QuickNet
-
-
Johnson, D.1
-
19
-
-
85009139544
-
Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis
-
T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis," in Proc. of Eurospeech, 1999.
-
(1999)
Proc. of Eurospeech
-
-
Yoshimura, T.1
Tokuda, K.2
Masuko, T.3
Kobayashi, T.4
Kitamura, T.5
-
20
-
-
0002144369
-
Tree-based state tying for high accuracy acoustic modelling
-
S. J. Young, J. J. Odell, and P. C. Woodland, "Tree-based state tying for high accuracy acoustic modelling," in Proceedings ARPA Workshop on Human Language Technology, 1994, pp. 307-312.
-
(1994)
Proceedings ARPA Workshop on Human Language Technology
, pp. 307-312
-
-
Young, S.J.1
Odell, J.J.2
Woodland, P.C.3
-
21
-
-
85023776577
-
Flexible decision trees for grapheme based speech recognition
-
Cottbus, Germany
-
Borislava Mimer, Sebastian Stüker, and Tanja Schultz, "Flexible decision trees for grapheme based speech recognition," in Proc. 15th Conference Elektronische Sprachsignalverarbeitung (ESSV), Cottbus, Germany, 2004.
-
(2004)
Proc. 15th Conference Elektronische Sprachsignalverarbeitung (ESSV)
-
-
Mimer, B.1
Stüker, S.2
Schultz, T.3
-
22
-
-
79251574977
-
The efficient incorporation of MLP features into automatic speech recognition systems
-
J. Park et al., "The Efficient Incorporation of MLP Features into Automatic Speech Recognition Systems," Computer Speech and Language, vol. 25, pp. 519-534, 2010.
-
(2010)
Computer Speech and Language
, vol.25
, pp. 519-534
-
-
Park, J.1
-
23
-
-
0032638856
-
Semi-tied covariance matrices for hidden Markov models
-
May
-
M. J. F. Gales, "Semi-tied covariance matrices for hidden Markov models," IEEE Transactions on Speech and Audio Processing, vol. 7, no. 3, pp. 272-281, May 1999.
-
(1999)
IEEE Transactions on Speech and Audio Processing
, vol.7
, Issue.3
, pp. 272-281
-
-
Gales, M.J.F.1
-
24
-
-
0032050110
-
Maximum likelihood linear transformations for HMM-Based speech recognition
-
M. J. F. Gales, "Maximum Likelihood Linear Transformations for HMM-Based Speech Recognition," Computer Speech and Language, vol. 12, no. 2, pp. 75-98, 1998.
-
(1998)
Computer Speech and Language
, vol.12
, Issue.2
, pp. 75-98
-
-
Gales, M.J.F.1
-
25
-
-
0036296863
-
Minimum Phone Error and I-smoothing for improved discriminative training
-
D. Povey and P. C. Woodland, "Minimum Phone Error and I-smoothing for improved discriminative training," in Proc. of ICASSP, 2002.
-
(2002)
Proc. of ICASSP
-
-
Povey, D.1
Woodland, P.C.2
-
26
-
-
33646788786
-
fMPE: Discriminatively trained features for speech recognition
-
D. Povey et al., "fMPE: Discriminatively trained features for speech recognition," in Proc. of ICASSP, 2005.
-
(2005)
Proc. of ICASSP
-
-
Povey, D.1
-
27
-
-
84893681011
-
Vocal tract length perturbation (VTLP) improves speech recognition
-
N. Jaitly and G. E. Hinton, "Vocal tract length perturbation (VTLP) improves speech recognition," in Proc. of ICML, 2013.
-
(2013)
Proc. of ICML
-
-
Jaitly, N.1
Hinton, G.E.2
-
28
-
-
84905247925
-
Data augmentation for deep neural network acoustic modeling
-
X. Cui, V. Goel, and B. Kingsbury, "Data augmentation for deep neural network acoustic modeling," in Proc. of ICASSP, 2014.
-
(2014)
Proc. of ICASSP
-
-
Cui, X.1
Goel, V.2
Kingsbury, B.3
-
30
-
-
0036460908
-
Lightly supervised and unsupervised acoustic model training
-
L. Lamel and J.-L. Gauvain, "Lightly supervised and unsupervised acoustic model training," Computer Speech and Language, vol. 16, pp. 115-129, 2002.
-
(2002)
Computer Speech and Language
, vol.16
, pp. 115-129
-
-
Lamel, L.1
Gauvain, J.-L.2
-
31
-
-
34047266379
-
Progress in the CUHTK broadcast news transcription system
-
M. J. F. Gales, D. Y. Kim, P. C. Woodland, H. Y. Chan, D. Mrva, R. Sinha, and S. E. Tranter, "Progress in the CUHTK broadcast news transcription system," IEEE Trans. ASLP, vol. 14, no. 5, pp. 1513-1525, 2006.
-
(2006)
IEEE Trans. ASLP
, vol.14
, Issue.5
, pp. 1513-1525
-
-
Gales, M.J.F.1
Kim, D.Y.2
Woodland, P.C.3
Chan, H.Y.4
Mrva, D.5
Sinha, R.6
Tranter, S.E.7
-
32
-
-
84906932692
-
Unsupervised morphology-based vocabulary expansion
-
M. S. Rasooli, N. Habash, O. Rambow, and T. Lippincott, "Unsupervised morphology-based vocabulary expansion," in The 52nd Annual Meeting of the Association for Computational Linguistics, 2014.
-
(2014)
The 52nd Annual Meeting of the Association for Computational Linguistics
-
-
Rasooli, M.S.1
Habash, N.2
Rambow, O.3
Lippincott, T.4