-
1
-
-
84055211743
-
Acoustic modeling using deep belief networks
-
A. Mohamed, G.E. Dahl and G. Hinton, "Acoustic modeling using deep belief networks," IEEE Trans. on ASLP, Vol. 20, no. 1, pp. 14-22, 2012.
-
(2012)
IEEE Trans. on ASLP
, vol.20
, Issue.1
, pp. 14-22
-
-
Mohamed, A.1
Dahl, G.E.2
Hinton, G.3
-
2
-
-
84865801985
-
Conversational speech transcription using context-dependent deep neural networks
-
F. Seide, G. Li and D. Yu, "Conversational speech transcription using context-dependent deep neural networks," Proc. of Interspeech, 2011.
-
(2011)
Proc. of Interspeech
-
-
Seide, F.1
Li, G.2
Yu, D.3
-
3
-
-
84878379108
-
Scalable minimum Bayes risk training of deep neural network acoustic models using distributed Hessian-free optimization
-
B. Kingsbury, T. N. Sainath, and H. Soltau, "Scalable minimum Bayes risk training of deep neural network acoustic models using distributed Hessian-free optimization," Proc. of Interspeech, 2012.
-
(2012)
Proc. of Interspeech
-
-
Kingsbury, B.1
Sainath, T.N.2
Soltau, H.3
-
4
-
-
0033097443
-
Single channel speech enhancement based on masking properties of the human auditory system
-
N. Virag, "Single channel speech enhancement based on masking properties of the human auditory system", IEEE Trans. Speech Audio Process., 7(2), pp. 126-137, 1999.
-
(1999)
IEEE Trans. Speech Audio Process
, vol.7
, Issue.2
, pp. 126-137
-
-
Virag, N.1
-
5
-
-
56249136428
-
Transforming binary uncertainties for robust speech recognition
-
S. Srinivasan and D. L. Wang, "Transforming binary uncertainties for robust speech recognition", IEEE Trans Audio, Speech, Lang. Process., 15(7), pp. 2130-2140, 2007.
-
(2007)
IEEE Trans Audio, Speech, Lang. Process
, vol.15
, Issue.7
, pp. 2130-2140
-
-
Srinivasan, S.1
Wang, D.L.2
-
7
-
-
78049398950
-
Feature extraction for robust speech recognition based on maximizing the sharpness of the power distribution and on power flooring
-
C. Kim and R. M. Stern, "Feature extraction for robust speech recognition based on maximizing the sharpness of the power distribution and on power flooring", in Proc. ICASSP, pp. 4574-4577, 2010.
-
(2010)
Proc. ICASSP
, pp. 4574-4577
-
-
Kim, C.1
Stern, R.M.2
-
8
-
-
84867613224
-
Fepstrum features: Design and application to conversational speech recognition
-
11009
-
V. Tyagi, "Fepstrum features: Design and application to conversational speech recognition", IBM Research Report, 11009, 2011.
-
(2011)
IBM Research Report
-
-
Tyagi, V.1
-
9
-
-
84867589420
-
Normalized amplitude modulation features for large vocabulary noise-robust speech recognition
-
Japan
-
V. Mitra, H. Franco, M. Graciarena and A. Mandal, "Normalized amplitude modulation features for large vocabulary noise-robust speech recognition", in Proc. of ICASSP, pp. 4117-4120, Japan, 2012.
-
(2012)
Proc. of ICASSP
, pp. 4117-4120
-
-
Mitra, V.1
Franco, H.2
Graciarena, M.3
Mandal, A.4
-
10
-
-
0030638031
-
A post-processing system to yield reduced word error rates: Recognizer output voting error reduction (ROVER)
-
J. G. Fiscus, "A Post-Processing System to Yield Reduced Word Error Rates: Recognizer Output Voting Error Reduction (ROVER)," Proc. of ASRU, pp. 347-354, 1997.
-
(1997)
Proc. of ASRU
, pp. 347-354
-
-
Fiscus, J.G.1
-
11
-
-
17344389852
-
Robust speech recognition in noisy environments: The 2001 IBM SPINE evaluation system
-
FL
-
B. Kingsbury, G. Saon, L. Mangu, M. Padmanabhan and R. Sarikaya, "Robust speech recognition in noisy environments: The 2001 IBM SPINE evaluation system", in Proc. of ICASSP, Vol. 1, pp. I53-I56, FL, 2002.
-
(2002)
Proc. of ICASSP
, vol.1
, pp. I53-I56
-
-
Kingsbury, B.1
Saon, G.2
Mangu, L.3
Padmanabhan, M.4
Sarikaya, R.5
-
12
-
-
0036291381
-
Digit recognition in noisy environments via a sequential GMM/SVM system
-
FL
-
S. Fine, G. Saon, and R.A. Gopinath, "Digit recognition in noisy environments via a sequential GMM/SVM system", in Proc. of ICASSP, Vol. 1, pp. I49-I52, FL, 2002.
-
(2002)
Proc. of ICASSP
, vol.1
, pp. I49-I52
-
-
Fine, S.1
Saon, G.2
Gopinath, R.A.3
-
13
-
-
0035342414
-
Robust automatic speech recognition with missing and unreliable acoustic data
-
M. Cooke, P. Green, L. Josifovski and A. Vizinho, "Robust automatic speech recognition with missing and unreliable acoustic data", Speech Comm., 34(3), pp. 267-285, 2001.
-
(2001)
Speech Comm
, vol.34
, Issue.3
, pp. 267-285
-
-
Cooke, M.1
Green, P.2
Josifovski, L.3
Vizinho, A.4
-
14
-
-
85083953021
-
Feature learning in deep neural networks - Studies on speech recognition tasks
-
D. Yu, M. Seltzer, J. Li, J.-T. Huang and F. Seide, "Feature Learning in Deep Neural Networks - Studies on Speech Recognition Tasks", ICLR 2013.
-
(2013)
ICLR
-
-
Yu, D.1
Seltzer, M.2
Li, J.3
Huang, J.-T.4
Seide, F.5
-
15
-
-
84858953286
-
Vocal tract length normalization for LVCSR
-
Carnegie Mellon University
-
P. Zhan and A. Waibel, "Vocal tract length normalization for LVCSR," Tech. Rep. CMU-LTI-97-150, Carnegie Mellon University, 1997.
-
(1997)
Tech. Rep. CMU-LTI-97-150
-
-
Zhan, P.1
Waibel, A.2
-
16
-
-
78649390043
-
Retrieving tract variables from acoustics: A comparison of different machine learning strategies
-
V. Mitra, H. Nam, C. Espy-Wilson, E. Saltzman and L. Goldstein, "Retrieving Tract Variables from Acoustics: A Comparison of Different Machine Learning Strategies," IEEE Journal of Selected Topics on Signal Processing, Sp. Iss. on Statistical Learning Methods for Speech and Language Processing, Vol. 4, Iss. 6, pp. 1027-1045, 2010.
-
(2010)
IEEE Journal of Selected Topics on Signal Processing, Sp. Iss. on Statistical Learning Methods for Speech and Language Processing
, vol.4
, Issue.6
, pp. 1027-1045
-
-
Mitra, V.1
Nam, H.2
Espy-Wilson, C.3
Saltzman, E.4
Goldstein, L.5
-
17
-
-
84890492030
-
An investigation of deep neural networks for noise robust speech recognition
-
M. Seltzer, D. Yu, and Y. Wang, "An Investigation of Deep Neural Networks for Noise Robust Speech Recognition", Proc. of ICASSP, 2013.
-
(2013)
Proc of ICASSP
-
-
Seltzer, M.1
Yu, D.2
Wang, Y.3
-
18
-
-
84867605836
-
Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition
-
O. Abdel-Hamid, A. Mohamed, H. Jiang, and G. Penn, "Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition," Proc. of ICASSP, pp. 4277-4280, 2012.
-
(2012)
Proc. of ICASSP
, pp. 4277-4280
-
-
Abdel-Hamid, O.1
Mohamed, A.2
Jiang, H.3
Penn, G.4
-
19
-
-
85032751458
-
Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups
-
G. Hinton, L. Deng, D. Yu, G.E. Dahl, A. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. Sainath and B. Kingsbury, "Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups," IEEE Signal Proc. Mag., 29(6), pp. 82-97, 2012.
-
(2012)
IEEE Signal Proc. Mag.
, vol.29
, Issue.6
, pp. 82-97
-
-
Hinton, G.1
Deng, L.2
Yu, D.3
Dahl, G.E.4
Mohamed, A.5
Jaitly, N.6
Senior, A.7
Vanhoucke, V.8
Nguyen, P.9
Sainath, T.10
Kingsbury, B.11
-
20
-
-
84906214784
-
Exploring convolutional neural network structures and optimization techniques for speech recognition
-
O. Abdel-Hamid, L. Deng and D. Yu, "Exploring Convolutional Neural Network Structures and Optimization Techniques for Speech Recognition," Proc. of Interspeech, pp. 3366-3370, 2013.
-
(2013)
Proc. of Interspeech
, pp. 3366-3370
-
-
Abdel-Hamid, O.1
Deng, L.2
Yu, D.3
-
22
-
-
84906260861
-
Damped oscillator cepstral coefficients for robust speech recognition
-
V. Mitra, H. Franco and M. Graciarena, "Damped Oscillator Cepstral Coefficients for Robust Speech Recognition," Proc. of Interspeech, pp. 886-890, 2013.
-
(2013)
Proc. of Interspeech
, pp. 886-890
-
-
Mitra, V.1
Franco, H.2
Graciarena, M.3
-
23
-
-
84867589420
-
Normalized amplitude modulation features for large vocabulary noise-robust speech recognition
-
V. Mitra, H. Franco, M. Graciarena, and A. Mandal, "Normalized Amplitude Modulation Features for Large Vocabulary Noise-Robust Speech Recognition," Proc. of ICASSP, pp. 4117-4120, 2012.
-
(2012)
Proc. of ICASSP
, pp. 4117-4120
-
-
Mitra, V.1
Franco, H.2
Graciarena, M.3
Mandal, A.4
-
24
-
-
0028287770
-
Effect of reducing slow temporal modulations on speech reception
-
R. Drullman, J. M. Festen and R. Plomp, "Effect of Reducing Slow Temporal Modulations on Speech Reception," J. Acoust. Soc. of Am., Vol. 95, No. 5, pp. 2670-2680, 1994.
-
(1994)
J. Acoust. Soc. of Am
, vol.95
, Issue.5
, pp. 2670-2680
-
-
Drullman, R.1
Festen, J.M.2
Plomp, R.3
-
25
-
-
0034844903
-
On the upper cutoff frequency of auditory critical-band envelope detectors in the context of speech perception
-
O. Ghitza, "On the Upper Cutoff Frequency of Auditory Critical-Band Envelope Detectors in the Context of Speech Perception," J. Acoust. Soc. of America, vol. 110, no. 3, pp. 1628-1640, 2001.
-
(2001)
J. Acoust. Soc. of America
, vol.110
, Issue.3
, pp. 1628-1640
-
-
Ghitza, O.1
-
26
-
-
0027676955
-
Energy separation in signal modulations with application to speech analysis
-
P. Maragos, J. Kaiser and T. Quatieri, "Energy Separation in Signal Modulations with Application to Speech Analysis," IEEE Trans. Signal Processing, Vol. 41, pp. 3024-3051, 1993.
-
(1993)
IEEE Trans. Signal Processing
, vol.41
, pp. 3024-3051
-
-
Maragos, P.1
Kaiser, J.2
Quatieri, T.3
-
27
-
-
84906246749
-
Modulation features for noise robust speaker identification
-
V. Mitra, M. McLaren, H. Franco, M. Graciarena and N. Scheffer, "Modulation Features for Noise Robust Speaker Identification," Proc. of Interspeech, pp. 3703-3707, 2013.
-
(2013)
Proc. of Interspeech
, pp. 3703-3707
-
-
Mitra, V.1
McLaren, M.2
Franco, H.3
Graciarena, M.4
Scheffer, N.5
-
28
-
-
0019075685
-
Some observations on oral air flow during phonation
-
H. Teager, "Some Observations on Oral Air Flow During Phonation," IEEE Trans. ASSP, pp. 599-601, 1980.
-
(1980)
IEEE Trans. ASSP
, pp. 599-601
-
-
Teager, H.1
-
29
-
-
84905269267
-
Medium duration modulation cepstral feature for robust speech recognition
-
Florence
-
V. Mitra, H. Franco, M. Graciarena and D. Vergyri, "Medium duration modulation cepstral feature for robust speech recognition," Proc. of ICASSP, Florence, 2014.
-
(2014)
Proc. of ICASSP
-
-
Mitra, V.1
Franco, H.2
Graciarena, M.3
Vergyri, D.4
-
30
-
-
84858953642
-
The Kaldi speech recognition toolkit
-
D. Povey, A. Ghoshal, G. Boulianne, L. Burget, O. Glembek, N. Goel, M. Hannemann, P. Motlicek, Y. Qian, P. Schwarz et al., "The Kaldi speech recognition toolkit," in Proc. ASRU, 2011.
-
(2011)
Proc. ASRU
-
-
Povey, D.1
Ghoshal, A.2
Boulianne, G.3
Burget, L.4
Glembek, O.5
Goel, N.6
Hannemann, M.7
Motlicek, P.8
Qian, Y.9
Schwarz, P.10
-
32
-
-
84890526837
-
New types of deep neural network learning for speech recognition and related applications: An overview
-
L. Deng, G. Hinton, and B. Kingsbury, "New types of deep neural network learning for speech recognition and related applications: An overview," Proc. of ICASSP, 2013.
-
(2013)
Proc. of ICASSP
-
-
Deng, L.1
Hinton, G.2
Kingsbury, B.3
-
33
-
-
0021892216
-
Speech enhancement using a minimum mean square error log-spectral amplitude estimator
-
Apr
-
Y. Ephraim and D. Malah, "Speech enhancement using a minimum mean square error log-spectral amplitude estimator," IEEE Trans. on Acoust., Speech, Signal Processing, vol. ASSP-33, no. 2, pp. 443-445, Apr. 1985.
-
(1985)
IEEE Trans. on Acoust., Speech, Signal Processing
, vol.ASSP-33
, Issue.2
, pp. 443-445
-
-
Ephraim, Y.1
Malah, D.2
-
34
-
-
51449089990
-
A minimum-mean-square-error noise reduction algorithm on mel-frequency cepstra for robust speech recognition
-
Las Vegas, NV
-
D. Yu, L. Deng, J. Droppo, J. Wu, Y. Gong, and A. Acero, "A minimum-mean-square-error noise reduction algorithm on mel-frequency cepstra for robust speech recognition," in Proc. of ICASSP, Las Vegas, NV, 2008.
-
(2008)
Proc. of ICASSP
-
-
Yu, D.1
Deng, L.2
Droppo, J.3
Wu, J.4
Gong, Y.5
Acero, A.6