-
2
-
-
85032751613
-
Making Machines Understand Us in Reverberant Rooms: Robustness against Reverberation for Automatic Speech Recognition
-
T Yoshioka, A Sehr, M Delcroix, K Kinoshita, R Maas, T Nakatani, W Kellermann, Making Machines Understand Us in Reverberant Rooms: Robustness against Reverberation for Automatic Speech Recognition. IEEE Signal Process. Mag.29(6), 114–126 (2012).
-
(2012)
IEEE Signal Process. Mag.
, vol.29
, Issue.6
, pp. 114-126
-
-
Yoshioka, T.1
Sehr, A.2
Delcroix, M.3
Kinoshita, K.4
Maas, R.5
Nakatani, T.6
Kellermann, W.7
-
4
-
-
77955698459
-
Speech dereverberation based on variance-normalized delayed linear prediction
-
T Nakatani, T Yoshioka, K Kinoshita, M Miyoshi, B-H Juang, Speech dereverberation based on variance-normalized delayed linear prediction. IEEE Trans. Audio, Speech, Lang. Process.18(7), 1717–1731 (2010).
-
(2010)
IEEE Trans. Audio, Speech, Lang. Process.
, vol.18
, Issue.7
, pp. 1717-1731
-
-
Nakatani, T.1
Yoshioka, T.2
Kinoshita, K.3
Miyoshi, M.4
Juang, B.-H.5
-
5
-
-
84880538217
-
Regularization for partial multichannel equalization for speech dereverberation
-
I Kodrasi, S Goetze, S Doclo, Regularization for partial multichannel equalization for speech dereverberation. IEEE Trans. Audio, Speech Lang. Process.21(9), 1879–1890 (2013).
-
(2013)
IEEE Trans. Audio, Speech Lang. Process.
, vol.21
, Issue.9
, pp. 1879-1890
-
-
Kodrasi, I.1
Goetze, S.2
Doclo, S.3
-
6
-
-
80051627812
-
in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP). Amplitude modulation spectrogram based features for robust speech recognition in noisy and reverberant environments (Prague
-
N Moritz, J Anemüller, B Kollmeier, in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP). Amplitude modulation spectrogram based features for robust speech recognition in noisy and reverberant environments (Prague, Czech Republic, 2011), pp. 5492–5495.
-
(2011)
Czech Republic
, pp. 5492-5495
-
-
Moritz, N.1
Anemüller, J.2
Kollmeier, B.3
-
7
-
-
77955683144
-
Reverberation model-based decoding in the Logmelspec domain for robust distant-talking speech recognition
-
A Sehr, R Maas, W Kellermann, Reverberation model-based decoding in the Logmelspec domain for robust distant-talking speech recognition. IEEE Trans. Audio, Speech Lang. Process.18(7), 1676–1691 (2010).
-
(2010)
IEEE Trans. Audio, Speech Lang. Process.
, vol.18
, Issue.7
, pp. 1676-1691
-
-
Sehr, A.1
Maas, R.2
Kellermann, W.3
-
8
-
-
84910043152
-
in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA). The REVERB Challenge: a common evaluation framework for dereverberation and recognition of reverberant speech (New Paltz, NY
-
K Kinoshita, M Delcroix, T Yoshioka, T Nakatani, E Habets, R Haeb-Umbach, V Leutnant, A Sehr, W Kellermann, R Maas, S Gannot, B Raj, in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA). The REVERB Challenge: a common evaluation framework for dereverberation and recognition of reverberant speech (New Paltz, NY, USA, 2013).
-
(2013)
USA
-
-
Kinoshita, K.1
Delcroix, M.2
Yoshioka, T.3
Nakatani, T.4
Habets, E.5
Haeb-Umbach, R.6
Leutnant, V.7
Sehr, A.8
Kellermann, W.9
Maas, R.10
Gannot, S.11
Raj, B.12
-
9
-
-
84933559258
-
-
Florence, Italy
-
B Cauchi, I Kodrasi, R Rehr, S Gerlach, A Jukić, T Gerkmann, S Doclo, S Goetze, in Proc. of the REVERB Challenge. Joint dereverberation and noise reduction using beamforming and a single-channel speech enhancement scheme (Florence, Italy, 2014).
-
(2014)
in Proc. of the REVERB Challenge. Joint dereverberation and noise reduction using beamforming and a single-channel speech enhancement scheme
-
-
Cauchi, B.1
Kodrasi, I.2
Rehr, R.3
Gerlach, S.4
Jukić, A.5
Gerkmann, T.6
Doclo, S.7
Goetze, S.8
-
10
-
-
84937882289
-
-
Florence, Italy
-
F Weninger, S Watanabe, J Le Roux, JR Hershey, Y Tachioka, J Geiger, B Schuller, G Rigoll, in Proc. of the REVERB Challenge. The MERL/MELCO/TUM System for the REVERB Challenge using Deep Recurrent Neural Network Feature Enhancement (Florence, Italy, 2014).
-
(2014)
in Proc. of the REVERB Challenge. The MERL/MELCO/TUM System for the REVERB Challenge using Deep Recurrent Neural Network Feature Enhancement
-
-
Weninger, F.1
Watanabe, S.2
Le Roux, J.3
Hershey, J.R.4
Tachioka, Y.5
Geiger, J.6
Schuller, B.7
Rigoll, G.8
-
11
-
-
84933559263
-
-
Florence, Italy
-
M Delcroix, T Yoshioka, A Ogawa, Y Kubo, M Fujimoto, N Ito, K Kinoshita, M Espi, T Hori, T Nakatani, A Nakamura, in Proc. of the REVERB Challenge. Linear prediction-based dereverberation with advanced speech enhancement and recognition technologies for the REVERB Challenge (Florence, Italy, 2014).
-
(2014)
in Proc. of the REVERB Challenge. Linear prediction-based dereverberation with advanced speech enhancement and recognition technologies for the REVERB Challenge
-
-
Delcroix, M.1
Yoshioka, T.2
Ogawa, A.3
Kubo, Y.4
Fujimoto, M.5
Ito, N.6
Kinoshita, K.7
Espi, M.8
Hori, T.9
Nakatani, T.10
Nakamura, A.11
-
12
-
-
84955462883
-
-
Florence, Italy
-
F Xiong, N Moritz, R Rehr, J Anemüller, BT Meyer, T Gerkmann, S Doclo, S Goetze, in Proc. of the REVERB Challenge. Robust ASR in reverberant environments using temporal cepstrum smoothing for speech enhancement and an amplitude modulation filterbank for feature extraction (Florence, Italy, 2014).
-
(2014)
in Proc. of the REVERB Challenge. Robust ASR in reverberant environments using temporal cepstrum smoothing for speech enhancement and an amplitude modulation filterbank for feature extraction
-
-
Xiong, F.1
Moritz, N.2
Rehr, R.3
Anemüller, J.4
Meyer, B.T.5
Gerkmann, T.6
Doclo, S.7
Goetze, S.8
-
13
-
-
0003822743
-
-
Cambridge University Engineering Department, Cambridge
-
S Young, G Evermann, M Gales, T Hain, D Kershaw, XA Liu, G Moore, J Odell, D Ollason, D Povey, V Valtchev, P Woodland, The HTK Book (for HTK Version 3.4) (Cambridge University Engineering Department, Cambridge, 2009).
-
(2009)
The HTK Book (for HTK Version 3.4)
-
-
Young, S.1
Evermann, G.2
Gales, M.3
Hain, T.4
Kershaw, D.5
Liu, X.A.6
Moore, G.7
Odell, J.8
Ollason, D.9
Povey, D.10
Valtchev, V.11
Woodland, P.12
-
14
-
-
84938640123
-
-
Big Island, HI, USA
-
D Povey, A Ghoshal, G Boulianne, L Burget, O Glembek, N Goel, M Hannemann, P Motlíček, Y Qian, P Schwarz, J Silovský, G Stemmer, K Veselý, in IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU). The Kaldi speech recognition toolkit (Big Island, HI, USA, 2011).
-
(2011)
in IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU). The Kaldi speech recognition toolkit
-
-
Povey, D.1
Ghoshal, A.2
Boulianne, G.3
Burget, L.4
Glembek, O.5
Goel, N.6
Hannemann, M.7
Motlíček, P.8
Qian, Y.9
Schwarz, P.10
Silovský, J.11
Stemmer, G.12
Veselý, K.13
-
15
-
-
84938640124
-
-
F Grézl, M Karafiát, S Kontár, J Černocký, in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), 4. Probabilistic and bottle-neck features for LVCSR of meetings (Honolulu, HI, USA, 2007), pp. 757–760.
-
-
-
-
16
-
-
78049502526
-
The subspace Gaussian mixture model - a structured model for speech recognition
-
D Povey, L Burget, M Agarwal, P Akyazi, F Kai, A Ghoshal, O Glembek, N Goel, M Karafiát, A Rastrow, RC Rose, P Schwarz, S Thomas, The subspace Gaussian mixture model - a structured model for speech recognition. Comput. Speech Lang.25(2), 404–439 (2011).
-
(2011)
Comput. Speech Lang.
, vol.25
, Issue.2
, pp. 404-439
-
-
Povey, D.1
Burget, L.2
Agarwal, M.3
Akyazi, P.4
Kai, F.5
Ghoshal, A.6
Glembek, O.7
Goel, N.8
Karafiát, M.9
Rastrow, A.10
Rose, R.C.11
Schwarz, P.12
Thomas, S.13
-
17
-
-
85032751458
-
Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups
-
G Hinton, L Deng, D Yu, GE Dahl, A Mohamed, N Jaitly, A Senior, V Vanhoucke, P Nguyen, TN Sainath, B Kingsbury, Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups. IEEE Signal Process. Mag.29(6), 82–97 (2012).
-
(2012)
IEEE Signal Process. Mag.
, vol.29
, Issue.6
, pp. 82-97
-
-
Hinton, G.1
Deng, L.2
Yu, D.3
Dahl, G.E.4
Mohamed, A.5
Jaitly, N.6
Senior, A.7
Vanhoucke, V.8
Nguyen, P.9
Sainath, T.N.10
Kingsbury, B.11
-
20
-
-
84890492030
-
in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP). An Investigation of deep neural networks for noise robust speech recognition (Vancouver
-
M Seltzer, D Yu, Y Wang, in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP). An Investigation of deep neural networks for noise robust speech recognition (Vancouver, Canada, 2013), pp. 7398–7402.
-
(2013)
Canada
, pp. 7398-7402
-
-
Seltzer, M.1
Yu, D.2
Wang, Y.3
-
21
-
-
51449096949
-
-
C Breithaupt, M Krawczyk, R Martin, in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP). Parameterized MMSE Spectral magnitude estimation for the enhancement of noisy speech (Las Vegas, NV, USA, 2008), pp. 4037–4040.
-
-
-
-
22
-
-
0035396555
-
Noise power spectral density estimation based on optimal smoothing and minimum statistics
-
R Martin, Noise power spectral density estimation based on optimal smoothing and minimum statistics. IEEE Trans. Speech Audio Process.9(5), 504–512 (2001).
-
(2001)
IEEE Trans. Speech Audio Process.
, vol.9
, Issue.5
, pp. 504-512
-
-
Martin, R.1
-
23
-
-
70350488536
-
On the statistics of spectral amplitudes after variance reduction by temporal cepstrum smoothing and cepstral nulling
-
T Gerkmann, R Martin, On the statistics of spectral amplitudes after variance reduction by temporal cepstrum smoothing and cepstral nulling. IEEE Trans. Signal Process.57(11), 4165–4174 (2009).
-
(2009)
IEEE Trans. Signal Process.
, vol.57
, Issue.11
, pp. 4165-4174
-
-
Gerkmann, T.1
Martin, R.2
-
24
-
-
14344274593
-
A new method based on spectral subtraction for speech dereverberation
-
K Lebart, JM Boucher, PN Denbigh, A new method based on spectral subtraction for speech dereverberation. Acta Acustica united with Acustica. 87(3), 359–366 (2001).
-
(2001)
Acta Acustica united with Acustica
, vol.87
, Issue.3
, pp. 359-366
-
-
Lebart, K.1
Boucher, J.M.2
Denbigh, P.N.3
-
26
-
-
77955697587
-
Late reverberant spectral variance estimation based on a statistical model
-
EAP Habets, S Gannot, I Cohen, Late reverberant spectral variance estimation based on a statistical model. IEEE Signal Process. Lett.16(9), 770–773 (2009).
-
(2009)
IEEE Signal Process. Lett.
, vol.16
, Issue.9
, pp. 770-773
-
-
Habets, E.A.P.1
Gannot, S.2
Cohen, I.3
-
27
-
-
84890487970
-
in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP). Blind Estimation of Reverberation Time based on Spectro-Temporal Modulation Filtering (Vancouver
-
F Xiong, S Goetze, BT Meyer, in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP). Blind Estimation of Reverberation Time based on Spectro-Temporal Modulation Filtering (Vancouver, Canada, 2013), pp. 443–447.
-
(2013)
Canada
, pp. 443-447
-
-
Xiong, F.1
Goetze, S.2
Meyer, B.T.3
-
28
-
-
84938640127
-
-
Florence, Italy
-
F Xiong, S Goetze, BT Meyer, in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP). Estimating room acoustic parameters for speech recognizer adaptation and combination in reverberant environments (Florence, Italy, 2014).
-
(2014)
in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP). Estimating room acoustic parameters for speech recognizer adaptation and combination in reverberant environments
-
-
Xiong, F.1
Goetze, S.2
Meyer, B.T.3
-
29
-
-
84938640128
-
An auditory inspired amplitude modulation filter bank for robust feature extraction in automatic speech recognition
-
N Moritz, J Anemüller, B Kollmeier, An auditory inspired amplitude modulation filter bank for robust feature extraction in automatic speech recognition. IEEE Trans. Audio, Speech and Language Processing. 23(11), 1926–1937 (2015).
-
(2015)
IEEE Trans. Audio, Speech and Language Processing
, vol.23
, Issue.11
, pp. 1926-1937
-
-
Moritz, N.1
Anemüller, J.2
Kollmeier, B.3
-
30
-
-
0024241221
-
Periodicity coding in the inferior colliculus of the Cat. I. Neuronal Mechanisms
-
G Langner, CE Schreiner, Periodicity coding in the inferior colliculus of the Cat. I. Neuronal Mechanisms. J. Neurophysiol.60(6), 1799–1822 (1988).
-
(1988)
J. Neurophysiol.
, vol.60
, Issue.6
, pp. 1799-1822
-
-
Langner, G.1
Schreiner, C.E.2
-
31
-
-
34547509128
-
in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), 4. Representation of phonemes in primary auditory cortex: how the brain analyzes speech (Honolulu, HI
-
N Mesgarani, S David, S Shamma, in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), 4. Representation of phonemes in primary auditory cortex: how the brain analyzes speech (Honolulu, HI, USA, 2007), pp. 765–768.
-
(2007)
USA
, pp. 765-768
-
-
Mesgarani, N.1
David, S.2
Shamma, S.3
-
32
-
-
0030691985
-
Modeling auditory processing of amplitude modulation. I, Detection and masking with narrow-band carriers
-
T Dau, B Kollmeier, A Kohlrausch, Modeling auditory processing of amplitude modulation. I, Detection and masking with narrow-band carriers. J. Acoust. Soc. Am.102(5), 2892–2905 (1997).
-
(1997)
J. Acoust. Soc. Am.
, vol.102
, Issue.5
, pp. 2892-2905
-
-
Dau, T.1
Kollmeier, B.2
Kohlrausch, A.3
-
33
-
-
79953659090
-
Robustness of spectro-temporal features against intrinsic and extrinsic variations in automatic speech recognition
-
BT Meyer, B Kollmeier, Robustness of spectro-temporal features against intrinsic and extrinsic variations in automatic speech recognition. Speech Commun.53(5), 753–767 (2011).
-
(2011)
Speech Commun.
, vol.53
, Issue.5
, pp. 753-767
-
-
Meyer, B.T.1
Kollmeier, B.2
-
34
-
-
0019053271
-
Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
-
SB Davis, P Mermelstein, Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Trans. Acoustics, Speech Signal Process.28(4), 357–366 (1980).
-
(1980)
IEEE Trans. Acoustics, Speech Signal Process.
, vol.28
, Issue.4
, pp. 357-366
-
-
Davis, S.B.1
Mermelstein, P.2
-
35
-
-
0016067897
-
Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification
-
B Atal, Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification. J. Acoust. Soc. Am.55(6), 1304–1322 (1974).
-
(1974)
J. Acoust. Soc. Am.
, vol.55
, Issue.6
, pp. 1304-1322
-
-
Atal, B.1
-
36
-
-
84938077304
-
Combination of MVDR beamforming and single-channel spectral processing for enhancing noisy and reverberant speech
-
B Cauchi, I Kodrasi, R Rehr, S Gerlach, A Jukić, T Gerkmann, S Doclo, S Goetze, Combination of MVDR beamforming and single-channel spectral processing for enhancing noisy and reverberant speech. EURASIP Journal on Advances in Signal Processing. 2015, 61 (2015).
-
(2015)
EURASIP Journal on Advances in Signal Processing
, vol.2015
, pp. 61
-
-
Cauchi, B.1
Kodrasi, I.2
Rehr, R.3
Gerlach, S.4
Jukić, A.5
Gerkmann, T.6
Doclo, S.7
Goetze, S.8
-
37
-
-
51449107956
-
-
C Breithaupt, T Gerkmann, R Martin, in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP). A Novel A Priori SNR Estimation Approach based on Selective Cepstro-Temporal Smoothing (Las Vegas, NV, USA, 2008), pp. 4897–4900.
-
-
-
-
38
-
-
0021645331
-
Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator
-
Y Ephraim, D Malah, Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator. IEEE Trans. Acoustics, Speech Signal Process.32(6), 1109–1121 (1984).
-
(1984)
IEEE Trans. Acoustics, Speech Signal Process.
, vol.32
, Issue.6
, pp. 1109-1121
-
-
Ephraim, Y.1
Malah, D.2
-
39
-
-
84893705681
-
in Proc. 2nd CHiME Workshop on Machine Listening in Multisource Environments. Binaural signal processing for enhanced speech recognition robustness in complex listening environments (Vancouver
-
H Meutzner, A Schlesinger, S Zeiler, D Kolossa, in Proc. 2nd CHiME Workshop on Machine Listening in Multisource Environments. Binaural signal processing for enhanced speech recognition robustness in complex listening environments (Vancouver, Canada, 2013), pp. 7–12.
-
(2013)
Canada
, pp. 7-12
-
-
Meutzner, H.1
Schlesinger, A.2
Zeiler, S.3
Kolossa, D.4
-
40
-
-
84890476022
-
in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP). Noise-robust reverberation time estimation using spectral decay distributions with reduced computational cost (Vancouver
-
J Eaton, ND Gaubitch, PA Naylor, in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP). Noise-robust reverberation time estimation using spectral decay distributions with reduced computational cost (Vancouver, Canada, 2013), pp. 161–165.
-
(2013)
Canada
, pp. 161-165
-
-
Eaton, J.1
Gaubitch, N.D.2
Naylor, P.A.3
-
41
-
-
84938640130
-
in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP). A study on joint beamforming and spectral enhancement for robust speech recognition in reverberant environments (Brisbane
-
F Xiong, BT Meyer, S Goetze, in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP). A study on joint beamforming and spectral enhancement for robust speech recognition in reverberant environments (Brisbane, Australia, 2015), pp. 5043–5047.
-
(2015)
Australia
, pp. 5043-5047
-
-
Xiong, F.1
Meyer, B.T.2
Goetze, S.3
-
43
-
-
0035540087
-
Computing the Confluent Hypergeometric Function, M(a,b,x)
-
KE Muller, Computing the Confluent Hypergeometric Function, M(a,b,x). Numerische Mathematik. 90(1), 179–196 (2001).
-
(2001)
Numerische Mathematik
, vol.90
, Issue.1
, pp. 179-196
-
-
Muller, K.E.1
-
44
-
-
84867584057
-
in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP). On the application of reverberation suppression to robust speech recognition (Kyoto
-
R Maas, EAP Habets, A Sehr, W Kellermann, in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP). On the application of reverberation suppression to robust speech recognition (Kyoto, Japan, 2012), pp. 297–300.
-
(2012)
Japan
, pp. 297-300
-
-
Maas, R.1
Habets, E.A.P.2
Sehr, A.3
Kellermann, W.4
-
45
-
-
84865769808
-
in Interspeech. Comparing Different Flavors of Spectro-Temporal Features for ASR (Florence
-
BT Meyer, SV Ravuri, MR Schädler, N Morgan, in Interspeech. Comparing Different Flavors of Spectro-Temporal Features for ASR (Florence, Italy, 2011), pp. 1269–1272.
-
(2011)
Italy
, pp. 1269-1272
-
-
Meyer, B.T.1
Ravuri, S.V.2
Schädler, M.R.3
Morgan, N.4
-
46
-
-
84863799482
-
Spectro-temporal modulation subspace-spanning filter bank features for robust automatic speech recognition
-
MR Schädler, BT Meyer, B Kollmeier, Spectro-temporal modulation subspace-spanning filter bank features for robust automatic speech recognition. J. Acoust. Soc. Am.131(5), 4134–4151 (2012).
-
(2012)
J. Acoust. Soc. Am.
, vol.131
, Issue.5
, pp. 4134-4151
-
-
Schädler, M.R.1
Meyer, B.T.2
Kollmeier, B.3
-
48
-
-
84938640133
-
-
QuickNet package
-
QuickNet package. http://www1.icsi.berkeley.edu/Speech/qn.html.
-
-
-
-
49
-
-
84953653955
-
New method of measuring reverberation time
-
MR Schroeder, New method of measuring reverberation time. J. Acoust. Soc. Amer.37(3), 409–412 (1965).
-
(1965)
J. Acoust. Soc. Amer.
, vol.37
, Issue.3
, pp. 409-412
-
-
Schroeder, M.R.1
-
50
-
-
84865785753
-
in Proc. Interspeech. Improved Bottleneck Features using Pretrained Deep Neural Networks (Florence
-
D Yu, ML Seltzer, in Proc. Interspeech. Improved Bottleneck Features using Pretrained Deep Neural Networks (Florence, Italy, 2011), pp. 237–240.
-
(2011)
Italy
, pp. 237-240
-
-
Yu, D.1
Seltzer, M.L.2
-
52
-
-
0022691022
-
Maximum likelihood estimation for multivariate mixture observations of Markov chains
-
BH Juang, S Levinson, M Sondhi, Maximum likelihood estimation for multivariate mixture observations of Markov chains. IEEE Trans. Inform. Theory. 32(2), 307–309 (1986).
-
(1986)
IEEE Trans. Inform. Theory
, vol.32
, Issue.2
, pp. 307-309
-
-
Juang, B.H.1
Levinson, S.2
Sondhi, M.3
-
53
-
-
80051649263
-
in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP). A basis method for robust estimation of constrained MLLR (Prague
-
D Povey, K Yao, in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP). A basis method for robust estimation of constrained MLLR (Prague, Czech Republic, 2011), pp. 4460–4463.
-
(2011)
Czech Republic
, pp. 4460-4463
-
-
Povey, D.1
Yao, K.2
-
54
-
-
51449120120
-
-
D Povey, D Kanevsky, B Kingsbury, B Ramabhadran, G Saon, K Visweswariah, in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP). Boosted MMI for Model and Feature-Space Discriminative Training (Las Vegas, NV, USA, 2008), pp. 4057–4060.
-
-
-
-
55
-
-
44949182698
-
in Proc. Interspeech. Hypothesis spaces for minimum Bayes risk training in large vocabulary speech recognition (Pittsburgh, Pennsylvania
-
M Gibson, T Hain, in Proc. Interspeech. Hypothesis spaces for minimum Bayes risk training in large vocabulary speech recognition (Pittsburgh, Pennsylvania, USA, 2006), pp. 2406–2409.
-
(2006)
USA
, pp. 2406-2409
-
-
Gibson, M.1
Hain, T.2
-
56
-
-
84938640135
-
-
DE Rumelhart, GE Hinton, RJ Williams, Learning Internal Representations by Error Propagation. Parallel distributed processing: Explorations in the microstructure of cognition. 1: Foundations. MIT Press (1986). ISBN:0-262-68053-X.
-
-
-
-
57
-
-
84055211743
-
Acoustic modeling using deep belief networks
-
A Mohamed, GE Dahl, G Hinton, Acoustic modeling using deep belief networks. IEEE Trans. Audio Speech Lang. Process.20(1), 14–22 (2012).
-
(2012)
IEEE Trans. Audio Speech Lang. Process.
, vol.20
, Issue.1
, pp. 14-22
-
-
Mohamed, A.1
Dahl, G.E.2
Hinton, G.3
-
58
-
-
85083953021
-
-
D Yu, ML Seltzer, J Li, J-T Huang, F Seide, in Proc. of ICLR. Feature learning in deep neural networks - studies on speech recognition tasks, (2013). arXiv:1301.3605v3.
-
-
-
-
59
-
-
0028996854
-
in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP). WSJCAM0: A British English Speech Corpus for Large Vocabulary Continuous Speech Recognition (Detroit, Michigan
-
T Robinson, J Fransen, D Pye, J Foote, S Renals, in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP). WSJCAM0: A British English Speech Corpus for Large Vocabulary Continuous Speech Recognition (Detroit, Michigan, USA, 1995), pp. 81–84.
-
(1995)
USA
, pp. 81-84
-
-
Robinson, T.1
Fransen, J.2
Pye, D.3
Foote, J.4
Renals, S.5
-
60
-
-
33846217002
-
in IEEE Workshop on Automatic Speech Recognition and Understanding. The Multi-Channel Wall Street Journal Audio Visual Corpus (MC-WSJ-AV): Specification and Initial Experiments (San Juan
-
M Lincoln, I McCowan, J Vepa, HK Maganti, in IEEE Workshop on Automatic Speech Recognition and Understanding. The Multi-Channel Wall Street Journal Audio Visual Corpus (MC-WSJ-AV): Specification and Initial Experiments (San Juan, Puerto Rico, 2005), pp. 357–362.
-
(2005)
Puerto Rico
, pp. 357-362
-
-
Lincoln, M.1
McCowan, I.2
Vepa, J.3
Maganti, H.K.4
-
61
-
-
84938640137
-
in Linguistic Data Consortium (LDC). CSR-I (WSJ0) Complete (Philadelphia
-
J Garofalo, D Graff, D Paul, D Pallett, in Linguistic Data Consortium (LDC). CSR-I (WSJ0) Complete (Philadelphia, USA, 2007).
-
(2007)
USA
-
-
-
62
-
-
84892187452
-
-
RA Gopinath, in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), 2. Maximum Likelihood Modeling with Gaussian Distributions for Classification (Seattle, WA, USA, 1998), pp. 661–664.
-
-
-
-
63
-
-
84937852689
-
-
Florence, Italy
-
Y Tachioka, T Narita, F Weninger, S Watanabe, in Proc. of the REVERB Challenge. Dual system combination approach for various reverberant environments with dereverberation Techniques (Florence, Italy, 2014).
-
(2014)
in Proc. of the REVERB Challenge. Dual system combination approach for various reverberant environments with dereverberation Techniques
-
-
Tachioka, Y.1
Narita, T.2
Weninger, F.3
Watanabe, S.4
-
64
-
-
51449103447
-
-
F Grézl, P Fousek, in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP). Optimizing Bottle-Neck Features for LVCSR (Las Vegas, NV, USA, 2008), pp. 4729–4732.
-
-
-
-
65
-
-
84932612278
-
in Proc. Interspeech. Sequence-discriminative training of deep neural networks (Lyon
-
K Veselý, A Ghoshal, L Burget, D Povey, in Proc. Interspeech. Sequence-discriminative training of deep neural networks (Lyon, France, 2013), pp. 2345–2349.
-
(2013)
France
, pp. 2345-2349
-
-
Veselý, K.1
Ghoshal, A.2
Burget, L.3
Povey, D.4
-
66
-
-
84874282188
-
-
J Li, D Yu, J-T Huang, Y Gong, in IEEE Workshop on Spoken Language Technology. Improving Wideband Speech Recognition using Mixed-Bandwidth Training Data in CD-DNN-HMM (Miami, FL, USA, 2012), pp. 131–136.
-
-
-
-
67
-
-
84910065702
-
in Proc. Interspeech. Acoustic modeling with deep neural networks using raw time signal for LVCSR (Singapore
-
Z Tüske, P Golik, R Schlüter, H Ney, in Proc. Interspeech. Acoustic modeling with deep neural networks using raw time signal for LVCSR (Singapore, 2014), pp. 890–894.
-
(2014)
, pp. 890-894
-
-
Tüske, Z.1
Golik, P.2
Schlüter, R.3
Ney, H.4