SCOPUS Information Retrieval Platform

Eurasip Journal on Advances in Signal Processing

Volume 2015, Issue 1, 2015, Pages

Front-end technologies for robust ASR in reverberant environments—spectral enhancement-based dereverberation and auditory modulation filterbank features

(8) Xiong, Feifei a,b Meyer, Bernd T b Moritz, Niko a,b Rehr, Robert b Anemüller, Jörn b Gerkmann, Timo b Doclo, Simon a,b Goetze, Stefan a,b

a FRAUNHOFER IDMT (Germany)

b UNIVERSITY OF OLDENBURG (Germany)

Author keywords

Auditory modulation filterbank; Automatic speech recognition; Deep neural network; Dereverberation; REVERB challenge

Indexed keywords

AMPLITUDE MODULATION; ARCHITECTURAL ACOUSTICS; EXTRACTION; FEATURE EXTRACTION; FILTER BANKS; GAUSSIAN DISTRIBUTION; HIDDEN MARKOV MODELS; HYBRID SYSTEMS; IMPULSE RESPONSE; MARKOV PROCESSES; MODULATION; NETWORK ARCHITECTURE; NOISE ABATEMENT; POWER SPECTRAL DENSITY; REVERBERATION; SPECTRAL DENSITY; SPEECH; SPEECH ENHANCEMENT; TRELLIS CODES;

AUTOMATIC SPEECH RECOGNITION; DEEP NEURAL NETWORKS; DEREVERBERATION; MODULATION FILTERBANK; REVERB CHALLENGE;

SPEECH RECOGNITION;

EID: 84938591218 PISSN: 16876172 EISSN: 16876180 Source Type: Journal
DOI: 10.1186/s13634-015-0256-4 Document Type: Article

Times cited : (24)

References (67)

1
- 50449083999
- Distant Speech Recognition
- M Wölfel, J McDonough, Distant Speech Recognition (John Wiley & Sons Ltd, United Kingdom, 2009).
- (2009) John Wiley & Sons Ltd, United Kingdom
- Wölfel, M.¹ McDonough, J.²

2
- 85032751613
- Making Machines Understand Us in Reverberant Rooms: Robustness against Reverberation for Automatic Speech Recognition
- T Yoshioka, A Sehr, M Delcroix, K Kinoshita, R Maas, T Nakatani, W Kellermann, Making Machines Understand Us in Reverberant Rooms: Robustness against Reverberation for Automatic Speech Recognition. IEEE Signal Process. Mag.29(6), 114–126 (2012).
- (2012) IEEE Signal Process. Mag. , vol.29 , Issue.6 , pp. 114-126
- Yoshioka, T.¹ Sehr, A.² Delcroix, M.³ Kinoshita, K.⁴ Maas, R.⁵ Nakatani, T.⁶ Kellermann, W.⁷

3
- 51449084820
- Single- and Multi-Microphone Speech Dereverberation using Spectral Enhancement
- EAP Habets, Single- and Multi-Microphone Speech Dereverberation using Spectral Enhancement. PhD thesis (University of Eindhoven, Eindhoven, The Netherlands, 2007).
- (2007) PhD thesis, University of Eindhoven, Eindhoven, The Netherlands
- Habets, E.A.P.¹

4
- 77955698459
- Speech dereverberation based on variance-normalized delayed linear prediction
- T Nakatani, T Yoshioka, K Kinoshita, M Miyoshi, B-H Juang, Speech dereverberation based on variance-normalized delayed linear prediction. IEEE Trans. Audio, Speech, Lang. Process.18(7), 1717–1731 (2010).
- (2010) IEEE Trans. Audio, Speech, Lang. Process. , vol.18 , Issue.7 , pp. 1717-1731
- Nakatani, T.¹ Yoshioka, T.² Kinoshita, K.³ Miyoshi, M.⁴ Juang, B.-H.⁵

5
- 84880538217
- Regularization for partial multichannel equalization for speech dereverberation
- I Kodrasi, S Goetze, S Doclo, Regularization for partial multichannel equalization for speech dereverberation. IEEE Trans. Audio, Speech Lang. Process.21(9), 1879–1890 (2013).
- (2013) IEEE Trans. Audio, Speech Lang. Process. , vol.21 , Issue.9 , pp. 1879-1890
- Kodrasi, I.¹ Goetze, S.² Doclo, S.³

6
- 80051627812
- Amplitude modulation spectrogram based features for robust speech recognition in noisy and reverberant environments
- N Moritz, J Anemüller, B Kollmeier, in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP). Amplitude modulation spectrogram based features for robust speech recognition in noisy and reverberant environments (Prague, Czech Republic, 2011), pp. 5492–5495.
- (2011) IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), Prague, Czech Republic , pp. 5492-5495
- Moritz, N.¹ Anemüller, J.² Kollmeier, B.³

7
- 77955683144
- Reverberation model-based decoding in the Logmelspec domain for robust distant-talking speech recognition
- A Sehr, R Maas, W Kellermann, Reverberation model-based decoding in the Logmelspec domain for robust distant-talking speech recognition. IEEE Trans. Audio, Speech Lang. Process.18(7), 1676–1691 (2010).
- (2010) IEEE Trans. Audio, Speech Lang. Process. , vol.18 , Issue.7 , pp. 1676-1691
- Sehr, A.¹ Maas, R.² Kellermann, W.³

8
- 84910043152
- The REVERB Challenge: a common evaluation framework for dereverberation and recognition of reverberant speech
- K Kinoshita, M Delcroix, T Yoshioka, T Nakatani, E Habets, R Haeb-Umbach, V Leutnant, A Sehr, W Kellermann, R Maas, S Gannot, B Raj, in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA). The REVERB Challenge: a common evaluation framework for dereverberation and recognition of reverberant speech (New Paltz, NY, USA, 2013).
- (2013) IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NY, USA
- Kinoshita, K.¹ Delcroix, M.² Yoshioka, T.³ Nakatani, T.⁴ Habets, E.⁵ Haeb-Umbach, R.⁶ Leutnant, V.⁷ Sehr, A.⁸ Kellermann, W.⁹ Maas, R.¹⁰ Gannot, S.¹¹ Raj, B.¹²

9
- 84933559258
- Joint dereverberation and noise reduction using beamforming and a single-channel speech enhancement scheme
- B Cauchi, I Kodrasi, R Rehr, S Gerlach, A Jukić, T Gerkmann, S Doclo, S Goetze, in Proc. of the REVERB Challenge. Joint dereverberation and noise reduction using beamforming and a single-channel speech enhancement scheme (Florence, Italy, 2014).
- (2014) Proc. of the REVERB Challenge, Florence, Italy
- Cauchi, B.¹ Kodrasi, I.² Rehr, R.³ Gerlach, S.⁴ Jukić, A.⁵ Gerkmann, T.⁶ Doclo, S.⁷ Goetze, S.⁸

10
- 84937882289
- The MERL/MELCO/TUM System for the REVERB Challenge using Deep Recurrent Neural Network Feature Enhancement
- F Weninger, S Watanabe, J Le Roux, JR Hershey, Y Tachioka, J Geiger, B Schuller, G Rigoll, in Proc. of the REVERB Challenge. The MERL/MELCO/TUM System for the REVERB Challenge using Deep Recurrent Neural Network Feature Enhancement (Florence, Italy, 2014).
- (2014) Proc. of the REVERB Challenge, Florence, Italy
- Weninger, F.¹ Watanabe, S.² Le Roux, J.³ Hershey, J.R.⁴ Tachioka, Y.⁵ Geiger, J.⁶ Schuller, B.⁷ Rigoll, G.⁸

11
- 84933559263
- Linear prediction-based dereverberation with advanced speech enhancement and recognition technologies for the REVERB Challenge
- M Delcroix, T Yoshioka, A Ogawa, Y Kubo, M Fujimoto, N Ito, K Kinoshita, M Espi, T Hori, T Nakatani, A Nakamura, in Proc. of the REVERB Challenge. Linear prediction-based dereverberation with advanced speech enhancement and recognition technologies for the REVERB Challenge (Florence, Italy, 2014).
- (2014) Proc. of the REVERB Challenge, Florence, Italy
- Delcroix, M.¹ Yoshioka, T.² Ogawa, A.³ Kubo, Y.⁴ Fujimoto, M.⁵ Ito, N.⁶ Kinoshita, K.⁷ Espi, M.⁸ Hori, T.⁹ Nakatani, T.¹⁰ Nakamura, A.¹¹

12
- 84955462883
- Robust ASR in reverberant environments using temporal cepstrum smoothing for speech enhancement and an amplitude modulation filterbank for feature extraction
- F Xiong, N Moritz, R Rehr, J Anemüller, BT Meyer, T Gerkmann, S Doclo, S Goetze, in Proc. of the REVERB Challenge. Robust ASR in reverberant environments using temporal cepstrum smoothing for speech enhancement and an amplitude modulation filterbank for feature extraction (Florence, Italy, 2014).
- (2014) Proc. of the REVERB Challenge, Florence, Italy
- Xiong, F.¹ Moritz, N.² Rehr, R.³ Anemüller, J.⁴ Meyer, B.T.⁵ Gerkmann, T.⁶ Doclo, S.⁷ Goetze, S.⁸

13
- 0003822743
- The HTK Book (for HTK Version 3.4)
- S Young, G Evermann, M Gales, T Hain, D Kershaw, XA Liu, G Moore, J Odell, D Ollason, D Povey, V Valtchev, P Woodland, The HTK Book (for HTK Version 3.4) (Cambridge University Engineering Department, Cambridge, 2009).
- (2009) Cambridge University Engineering Department, Cambridge
- Young, S.¹ Evermann, G.² Gales, M.³ Hain, T.⁴ Kershaw, D.⁵ Liu, X.A.⁶ Moore, G.⁷ Odell, J.⁸ Ollason, D.⁹ Povey, D.¹⁰ Valtchev, V.¹¹ Woodland, P.¹²

14
- 84938640123
- The Kaldi speech recognition toolkit
- D Povey, A Ghoshal, G Boulianne, L Burget, O Glembek, N Goel, M Hannemann, P Motlíček, Y Qian, P Schwarz, J Silovský, G Stemmer, K Veselý, in IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU). The Kaldi speech recognition toolkit (Big Island, HI, USA, 2011).
- (2011) IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), Big Island, HI, USA
- Povey, D.¹ Ghoshal, A.² Boulianne, G.³ Burget, L.⁴ Glembek, O.⁵ Goel, N.⁶ Hannemann, M.⁷ Motlíček, P.⁸ Qian, Y.⁹ Schwarz, P.¹⁰ Silovský, J.¹¹ Stemmer, G.¹² Veselý, K.¹³

15
- 84938640124
- F Grézl, M Karafiát, S Kontár, J Černocký, in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), 4. Probabilistic and bottle-neck features for LVCSR of meetings (Honolulu, HI, USA, 2007), pp. 757–760.

16
- 78049502526
- The subspace Gaussian mixture model - a structured model for speech recognition
- D Povey, L Burget, M Agarwal, P Akyazi, F Kai, A Ghoshal, O Glembek, N Goel, M Karafiát, A Rastrow, RC Rose, P Schwarz, S Thomas, The subspace Gaussian mixture model - a structured model for speech recognition. Comput. Speech Lang.25(2), 404–439 (2011).
- (2011) Comput. Speech Lang. , vol.25 , Issue.2 , pp. 404-439
- Povey, D.¹ Burget, L.² Agarwal, M.³ Akyazi, P.⁴ Kai, F.⁵ Ghoshal, A.⁶ Glembek, O.⁷ Goel, N.⁸ Karafiát, M.⁹ Rastrow, A.¹⁰ Rose, R.C.¹¹ Schwarz, P.¹² Thomas, S.¹³

17
- 85032751458
- Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups
- G Hinton, L Deng, D Yu, GE Dahl, A Mohamed, N Jaitly, A Senior, V Vanhoucke, P Nguyen, TN Sainath, B Kingsbury, Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups. IEEE Signal Process. Mag.29(6), 82–97 (2012).
- (2012) IEEE Signal Process. Mag. , vol.29 , Issue.6 , pp. 82-97
- Hinton, G.¹ Deng, L.² Yu, D.³ Dahl, G.E.⁴ Mohamed, A.⁵ Jaitly, N.⁶ Senior, A.⁷ Vanhoucke, V.⁸ Nguyen, P.⁹ Sainath, T.N.¹⁰ Kingsbury, B.¹¹

18
- 84865752675
- Reverberation Modeling for Robust Distant-Talking Speech Recognition
- A Sehr, Reverberation Modeling for Robust Distant-Talking Speech Recognition. PhD thesis, Friedrich-Alexander-Universität Erlangen-Nürnberg, (Germany, 2009).
- (2009) PhD thesis, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany
- Sehr, A.¹

19
- 84938640125
- DFT-based speech enhancement for robust automatic speech recognition
- C Breithaupt, R Martin, in ITG Conference on Voice Communication (Sprachkommunikation). DFT-based speech enhancement for robust automatic speech recognition (Aachen, Germany, 2008).
- (2008) ITG Conference on Voice Communication (Sprachkommunikation), Aachen, Germany
- Breithaupt, C.¹ Martin, R.²

20
- 84890492030
- An Investigation of deep neural networks for noise robust speech recognition
- M Seltzer, D Yu, Y Wang, in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP). An Investigation of deep neural networks for noise robust speech recognition (Vancouver, Canada, 2013), pp. 7398–7402.
- (2013) IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), Vancouver, Canada , pp. 7398-7402
- Seltzer, M.¹ Yu, D.² Wang, Y.³

21
- 51449096949
- C Breithaupt, M Krawczyk, R Martin, in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP). Parameterized MMSE Spectral magnitude estimation for the enhancement of noisy speech (Las Vegas, NV, USA, 2008), pp. 4037–4040.

22
- 0035396555
- Noise power spectral density estimation based on optimal smoothing and minimum statistics
- R Martin, Noise power spectral density estimation based on optimal smoothing and minimum statistics. IEEE Trans. Speech Audio Process.9(5), 504–512 (2001).
- (2001) IEEE Trans. Speech Audio Process. , vol.9 , Issue.5 , pp. 504-512
- Martin, R.¹

23
- 70350488536
- On the statistics of spectral amplitudes after variance reduction by temporal cepstrum smoothing and cepstral nulling
- T Gerkmann, R Martin, On the statistics of spectral amplitudes after variance reduction by temporal cepstrum smoothing and cepstral nulling. IEEE Trans. Signal Process.57(11), 4165–4174 (2009).
- (2009) IEEE Trans. Signal Process. , vol.57 , Issue.11 , pp. 4165-4174
- Gerkmann, T.¹ Martin, R.²

24
- 14344274593
- A new method based on spectral subtraction for speech dereverberation
- K Lebart, JM Boucher, PN Denbigh, A new method based on spectral subtraction for speech dereverberation. Acta Acustica united with Acustica. 87(3), 359–366 (2001).
- (2001) Acta Acustica united with Acustica , vol.87 , Issue.3 , pp. 359-366
- Lebart, K.¹ Boucher, J.M.² Denbigh, P.N.³

25
- 84938588729
- Room Acoustics
- H Kuttruff, Room Acoustics, 4th edn (Spon Press, London, 2000).
- (2000) 4th edn, Spon Press, London
- Kuttruff, H.¹

26
- 77955697587
- Late reverberant spectral variance estimation based on a statistical model
- EAP Habets, S Gannot, I Cohen, Late reverberant spectral variance estimation based on a statistical model. IEEE Signal Process. Lett.16(9), 770–773 (2009).
- (2009) IEEE Signal Process. Lett. , vol.16 , Issue.9 , pp. 770-773
- Habets, E.A.P.¹ Gannot, S.² Cohen, I.³

27
- 84890487970
- Blind Estimation of Reverberation Time based on Spectro-Temporal Modulation Filtering
- F Xiong, S Goetze, BT Meyer, in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP). Blind Estimation of Reverberation Time based on Spectro-Temporal Modulation Filtering (Vancouver, Canada, 2013), pp. 443–447.
- (2013) IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), Vancouver, Canada , pp. 443-447
- Xiong, F.¹ Goetze, S.² Meyer, B.T.³

28
- 84938640127
- Estimating room acoustic parameters for speech recognizer adaptation and combination in reverberant environments
- F Xiong, S Goetze, BT Meyer, in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP). Estimating room acoustic parameters for speech recognizer adaptation and combination in reverberant environments (Florence, Italy, 2014).
- (2014) IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), Florence, Italy
- Xiong, F.¹ Goetze, S.² Meyer, B.T.³

29
- 84938640128
- An auditory inspired amplitude modulation filter bank for robust feature extraction in automatic speech recognition
- N Moritz, J Anemüller, B Kollmeier, An auditory inspired amplitude modulation filter bank for robust feature extraction in automatic speech recognition. IEEE Trans. Audio, Speech and Language Processing. 23(11), 1926–1937 (2015).
- (2015) IEEE Trans. Audio, Speech and Language Processing , vol.23 , Issue.11 , pp. 1926-1937
- Moritz, N.¹ Anemüller, J.² Kollmeier, B.³

30
- 0024241221
- Periodicity coding in the inferior colliculus of the Cat. I. Neuronal Mechanisms
- G Langner, CE Schreiner, Periodicity coding in the inferior colliculus of the Cat. I. Neuronal Mechanisms. J. Neurophysiol.60(6), 1799–1822 (1988).
- (1988) J. Neurophysiol. , vol.60 , Issue.6 , pp. 1799-1822
- Langner, G.¹ Schreiner, C.E.²

31
- 34547509128
- Representation of phonemes in primary auditory cortex: how the brain analyzes speech
- N Mesgarani, S David, S Shamma, in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), 4. Representation of phonemes in primary auditory cortex: how the brain analyzes speech (Honolulu, HI, USA, 2007), pp. 765–768.
- (2007) IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), 4, Honolulu, HI, USA , pp. 765-768
- Mesgarani, N.¹ David, S.² Shamma, S.³

32
- 0030691985
- Modeling auditory processing of amplitude modulation. I, Detection and masking with narrow-band carriers
- T Dau, B Kollmeier, A Kohlrausch, Modeling auditory processing of amplitude modulation. I, Detection and masking with narrow-band carriers. J. Acoust. Soc. Am.102(5), 2892–2905 (1997).
- (1997) J. Acoust. Soc. Am. , vol.102 , Issue.5 , pp. 2892-2905
- Dau, T.¹ Kollmeier, B.² Kohlrausch, A.³

33
- 79953659090
- Robustness of spectro-temporal features against intrinsic and extrinsic variations in automatic speech recognition
- BT Meyer, B Kollmeier, Robustness of spectro-temporal features against intrinsic and extrinsic variations in automatic speech recognition. Speech Commun.53(5), 753–767 (2011).
- (2011) Speech Commun. , vol.53 , Issue.5 , pp. 753-767
- Meyer, B.T.¹ Kollmeier, B.²

34
- 0019053271
- Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
- SB Davis, P Mermelstein, Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Trans. Acoustics, Speech Signal Process.28(4), 357–366 (1980).
- (1980) IEEE Trans. Acoustics, Speech Signal Process. , vol.28 , Issue.4 , pp. 357-366
- Davis, S.B.¹ Mermelstein, P.²

35
- 0016067897
- Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification
- B Atal, Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification. J. Acoust. Soc. Am.55(6), 1304–1322 (1974).
- (1974) J. Acoust. Soc. Am. , vol.55 , Issue.6 , pp. 1304-1322
- Atal, B.¹

36
- 84938077304
- Combination of MVDR beamforming and single-channel spectral processing for enhancing noisy and reverberant speech
- B Cauchi, I Kodrasi, R Rehr, S Gerlach, A Jukić, T Gerkmann, S Doclo, S Goetze, Combination of MVDR beamforming and single-channel spectral processing for enhancing noisy and reverberant speech. EURASIP Journal on Advances in Signal Processing. 2015, 61 (2015).
- (2015) EURASIP Journal on Advances in Signal Processing , vol.2015 , pp. 61
- Cauchi, B.¹ Kodrasi, I.² Rehr, R.³ Gerlach, S.⁴ Jukić, A.⁵ Gerkmann, T.⁶ Doclo, S.⁷ Goetze, S.⁸

37
- 51449107956
- C Breithaupt, T Gerkmann, R Martin, in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP). A Novel A Priori SNR Estimation Approach based on Selective Cepstro-Temporal Smoothing (Las Vegas, NV, USA, 2008), pp. 4897–4900.

38
- 0021645331
- Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator
- Y Ephraim, D Malah, Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator. IEEE Trans. Acoustics, Speech Signal Process.32(6), 1109–1121 (1984).
- (1984) IEEE Trans. Acoustics, Speech Signal Process. , vol.32 , Issue.6 , pp. 1109-1121
- Ephraim, Y.¹ Malah, D.²

39
- 84893705681
- Binaural signal processing for enhanced speech recognition robustness in complex listening environments
- H Meutzner, A Schlesinger, S Zeiler, D Kolossa, in Proc. 2nd CHiME Workshop on Machine Listening in Multisource Environments. Binaural signal processing for enhanced speech recognition robustness in complex listening environments (Vancouver, Canada, 2013), pp. 7–12.
- (2013) Proc. 2nd CHiME Workshop on Machine Listening in Multisource Environments, Vancouver, Canada , pp. 7-12
- Meutzner, H.¹ Schlesinger, A.² Zeiler, S.³ Kolossa, D.⁴

40
- 84890476022
- Noise-robust reverberation time estimation using spectral decay distributions with reduced computational cost
- J Eaton, ND Gaubitch, PA Naylor, in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP). Noise-robust reverberation time estimation using spectral decay distributions with reduced computational cost (Vancouver, Canada, 2013), pp. 161–165.
- (2013) IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), Vancouver, Canada , pp. 161-165
- Eaton, J.¹ Gaubitch, N.D.² Naylor, P.A.³

41
- 84938640130
- A study on joint beamforming and spectral enhancement for robust speech recognition in reverberant environments
- F Xiong, BT Meyer, S Goetze, in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP). A study on joint beamforming and spectral enhancement for robust speech recognition in reverberant environments (Brisbane, Australia, 2015), pp. 5043–5047.
- (2015) IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), Brisbane, Australia , pp. 5043-5047
- Xiong, F.¹ Meyer, B.T.² Goetze, S.³

42
- 84938580018
- Noise Reduction Algorithms for Speech Communications - Statistical Analysis and Improved Estimation Procedures
- C Breithaupt, Noise Reduction Algorithms for Speech Communications - Statistical Analysis and Improved Estimation Procedures. PhD thesis (Ruhr-Universität Bochum, Bochum, Germany, 2008).
- (2008) PhD thesis, Ruhr-Universität Bochum, Bochum, Germany
- Breithaupt, C.¹

43
- 0035540087
- Computing the Confluent Hypergeometric Function, M(a,b,x)
- KE Muller, Computing the Confluent Hypergeometric Function, M(a,b,x). Numerische Mathematik. 90(1), 179–196 (2001).
- (2001) Numerische Mathematik , vol.90 , Issue.1 , pp. 179-196
- Muller, K.E.¹

44
- 84867584057
- On the application of reverberation suppression to robust speech recognition
- R Maas, EAP Habets, A Sehr, W Kellermann, in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP). On the application of reverberation suppression to robust speech recognition (Kyoto, Japan, 2012), pp. 297–300.
- (2012) IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), Kyoto, Japan , pp. 297-300
- Maas, R.¹ Habets, E.A.P.² Sehr, A.³ Kellermann, W.⁴

45
- 84865769808
- Comparing Different Flavors of Spectro-Temporal Features for ASR
- BT Meyer, SV Ravuri, MR Schädler, N Morgan, in Interspeech. Comparing Different Flavors of Spectro-Temporal Features for ASR (Florence, Italy, 2011), pp. 1269–1272.
- (2011) Interspeech, Florence, Italy , pp. 1269-1272
- Meyer, B.T.¹ Ravuri, S.V.² Schädler, M.R.³ Morgan, N.⁴

46
- 84863799482
- Spectro-temporal modulation subspace-spanning filter bank features for robust automatic speech recognition
- MR Schädler, BT Meyer, B Kollmeier, Spectro-temporal modulation subspace-spanning filter bank features for robust automatic speech recognition. J. Acoust. Soc. Am.131(5), 4134–4151 (2012).
- (2012) J. Acoust. Soc. Am. , vol.131 , Issue.5 , pp. 4134-4151
- Schädler, M.R.¹ Meyer, B.T.² Kollmeier, B.³

47
- 84938640132
- Neural Networks and Learning Machines
- S Haykin, Neural Networks and Learning Machines, 3rd edn (Prentice Hall, USA, 2008).
- (2008) 3rd edn, Prentice Hall, USA
- Haykin, S.¹

48
- 84938640133
- QuickNet package
- QuickNet package. http://www1.icsi.berkeley.edu/Speech/qn.html.

49
- 84953653955
- New method of measuring reverberation time
- MR Schroeder, New method of measuring reverberation time. J. Acoust. Soc. Amer.37(3), 409–412 (1965).
- (1965) J. Acoust. Soc. Amer. , vol.37 , Issue.3 , pp. 409-412
- Schroeder, M.R.¹

50
- 84865785753
- Improved Bottleneck Features using Pretrained Deep Neural Networks
- D Yu, ML Seltzer, in Proc. Interspeech. Improved Bottleneck Features using Pretrained Deep Neural Networks (Florence, Italy, 2011), pp. 237–240.
- (2011) Proc. Interspeech, Florence, Italy , pp. 237-240
- Yu, D.¹ Seltzer, M.L.²

51
- 0040856612
- Stochastic Modeling for Automatic Speech Recognition
- JK Baker, Stochastic Modeling for Automatic Speech Recognition. Speech Recognition. (DR Reddy, ed.), (New York: Academic, 1975).
- (1975) Speech Recognition (DR Reddy, ed.), New York: Academic
- Baker, J.K.¹

52
- 0022691022
- Maximum likelihood estimation for multivariate mixture observations of Markov chains
- BH Juang, S Levinson, M Sondhi, Maximum likelihood estimation for multivariate mixture observations of Markov chains. IEEE Trans. Inform. Theory. 32(2), 307–309 (1986).
- (1986) IEEE Trans. Inform. Theory , vol.32 , Issue.2 , pp. 307-309
- Juang, B.H.¹ Levinson, S.² Sondhi, M.³

53
- 80051649263
- A basis method for robust estimation of constrained MLLR
- D Povey, K Yao, in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP). A basis method for robust estimation of constrained MLLR (Prague, Czech Republic, 2011), pp. 4460–4463.
- (2011) IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), Prague, Czech Republic , pp. 4460-4463
- Povey, D.¹ Yao, K.²

54
- 51449120120
- D Povey, D Kanevsky, B Kingsbury, B Ramabhadran, G Saon, K Visweswariah, in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP). Boosted MMI for Model and Feature-Space Discriminative Training (Las Vegas, NV, USA, 2008), pp. 4057–4060.

55
- 44949182698
- Hypothesis spaces for minimum Bayes risk training in large vocabulary speech recognition
- M Gibson, T Hain, in Proc. Interspeech. Hypothesis spaces for minimum Bayes risk training in large vocabulary speech recognition (Pittsburgh, Pennsylvania, USA, 2006), pp. 2406–2409.
- (2006) Proc. Interspeech, Pittsburgh, Pennsylvania, USA , pp. 2406-2409
- Gibson, M.¹ Hain, T.²

56
- 84938640135
- DE Rumelhart, GE Hinton, RJ Williams, Learning Internal Representations by Error Propagation. Parallel distributed processing: Explorations in the microstructure of cognition. 1: Foundations. MIT Press (1986). ISBN:0-262-68053-X.

57
- 84055211743
- Acoustic modeling using deep belief networks
- A Mohamed, GE Dahl, G Hinton, Acoustic modeling using deep belief networks. IEEE Trans. Audio Speech Lang. Process.20(1), 14–22 (2012).
- (2012) IEEE Trans. Audio Speech Lang. Process. , vol.20 , Issue.1 , pp. 14-22
- Mohamed, A.¹ Dahl, G.E.² Hinton, G.³

58
- 85083953021
- D Yu, ML Seltzer, J Li, J-T Huang, F Seide, in Proc. of ICLR. Feature learning in deep neural networks - studies on speech recognition tasks, (2013). arXiv:1301.3605v3.

59
- 0028996854
- WSJCAM0: A British English Speech Corpus for Large Vocabulary Continuous Speech Recognition
- T Robinson, J Fransen, D Pye, J Foote, S Renals, in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP). WSJCAM0: A British English Speech Corpus for Large Vocabulary Continuous Speech Recognition (Detroit, Michigan, USA, 1995), pp. 81–84.
- (1995) IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), Detroit, Michigan, USA , pp. 81-84
- Robinson, T.¹ Fransen, J.² Pye, D.³ Foote, J.⁴ Renals, S.⁵

60
- 33846217002
- The Multi-Channel Wall Street Journal Audio Visual Corpus (MC-WSJ-AV): Specification and Initial Experiments
- M Lincoln, I McCowan, J Vepa, HK Maganti, in IEEE Workshop on Automatic Speech Recognition and Understanding. The Multi-Channel Wall Street Journal Audio Visual Corpus (MC-WSJ-AV): Specification and Initial Experiments (San Juan, Puerto Rico, 2005), pp. 357–362.
- (2005) IEEE Workshop on Automatic Speech Recognition and Understanding, San Juan, Puerto Rico , pp. 357-362
- Lincoln, M.¹ McCowan, I.² Vepa, J.³ Maganti, H.K.⁴

61
- 84938640137
- CSR-I (WSJ0) Complete
- J Garofalo, D Graff, D Paul, D Pallett, in Linguistic Data Consortium (LDC). CSR-I (WSJ0) Complete (Philadelphia, USA, 2007).
- (2007) Linguistic Data Consortium (LDC), Philadelphia, USA

62
- 84892187452
- RA Gopinath, in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), 2. Maximum Likelihood Modeling with Gaussian Distributions for Classification (Seattle, WA, USA, 1998), pp. 661–664.

63
- 84937852689
- Dual system combination approach for various reverberant environments with dereverberation Techniques
- Y Tachioka, T Narita, F Weninger, S Watanabe, in Proc. of the REVERB Challenge. Dual system combination approach for various reverberant environments with dereverberation Techniques (Florence, Italy, 2014).
- (2014) Proc. of the REVERB Challenge, Florence, Italy
- Tachioka, Y.¹ Narita, T.² Weninger, F.³ Watanabe, S.⁴

64
- 51449103447
- F Grézl, P Fousek, in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP). Optimizing Bottle-Neck Features for LVCSR (Las Vegas, NV, USA, 2008), pp. 4729–4732.

65
- 84932612278
- Sequence-discriminative training of deep neural networks
- K Veselý, A Ghoshal, L Burget, D Povey, in Proc. Interspeech. Sequence-discriminative training of deep neural networks (Lyon, France, 2013), pp. 2345–2349.
- (2013) Proc. Interspeech, Lyon, France , pp. 2345-2349
- Veselý, K.¹ Ghoshal, A.² Burget, L.³ Povey, D.⁴

66
- 84874282188
- J Li, D Yu, J-T Huang, Y Gong, in IEEE Workshop on Spoken Language Technology. Improving Wideband Speech Recognition using Mixed-Bandwidth Training Data in CD-DNN-HMM (Miami, FL, USA, 2012), pp. 131–136.

67
- 84910065702
- Acoustic modeling with deep neural networks using raw time signal for LVCSR
- Z Tüske, P Golik, R Schlüter, H Ney, in Proc. Interspeech. Acoustic modeling with deep neural networks using raw time signal for LVCSR (Singapore, 2014), pp. 890–894.
- (2014) Proc. Interspeech, Singapore , pp. 890-894
- Tüske, Z.¹ Golik, P.² Schlüter, R.³ Ney, H.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.