SCOPUS Information Search Platform

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Volume , Issue , 2013, Pages 6817-6821

Coupling binary masking and robust ASR

(2) Narayanan, Arun a Wang, Deliang a,b

a The Ohio State University (United States)

b OHIO STATE UNIVERSITY (United States)

Author keywords

Aurora 4; bidirectional speech decoder; Computational Auditory Scene Analysis; noise robust ASR

Indexed keywords

AURORA-4; BIDIRECTIONAL SPEECH DECODER; CEPSTRAL FEATURES; COMPUTATIONAL AUDITORY SCENE ANALYSIS; IDEAL BINARY MASK (IBM); NOISE ROBUST ASR; ROBUST AUTOMATIC SPEECH RECOGNITIONS (ASR); SPEECH SEPARATION;

ESTIMATION; SIGNAL PROCESSING; SPEECH PROCESSING; SPEECH RECOGNITION;

ACOUSTIC NOISE;

EID: 84890475416 PISSN: 15206149 EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ICASSP.2013.6638982 Document Type: Conference Paper

Times cited : (7)

References (34)

1
- 0028517164
- RASTA processing of speech
- H. Hermansky and N. Morgan, "RASTA processing of speech," IEEE Trans. Speech and Audio Process., vol. 2, pp. 578-589, 1994
- (1994) IEEE Trans. Speech and Audio Process. , vol.2 , pp. 578-589
- Hermansky, H.¹ Morgan, N.²

2
- 0442317754
- ETSI, ES 202 050 V1.1.4
- ETSI, ES 202 050 V1.1.4, "Speech processing transmission and quality aspects (STQ); Distributed speech recognition; Advanced front-end feature extraction algorithm; Compression algorithms," 2005
- (2005) Speech Processing Transmission and Quality Aspects (STQ); Distributed Speech Recognition; Advanced Front-end Feature Extraction Algorithm; Compression Algorithms

3
- 85135375893
- HMM recognition in noise using parallel model combination
- M. J. F. Gales and S. J. Young, "HMM recognition in noise using parallel model combination," in Proc. Eurospeech, 1993, vol. 2, pp. 837-840
- (1993) Proc. Eurospeech , vol.2 , pp. 837-840
- Gales, M.J.F.¹ Young, S.J.²

4
- 0032050110
- Maximum likelihood linear transformations for HMM-based speech recognition
- M. J. F. Gales, "Maximum likelihood linear transformations for HMM-based speech recognition," Comput. Speech Lang., vol. 12, pp. 75-98, 1998
- (1998) Comput. Speech Lang. , vol.12 , pp. 75-98
- Gales, M.J.F.¹

5
- 62249130045
- A unified framework of HMM adaptation with joint compensation of additive and convolutive distortions
- J. Li, L. Deng, D. Yu, Y. Gong, and A. Acero, "A unified framework of HMM adaptation with joint compensation of additive and convolutive distortions," Comput. Speech Lang., vol. 23, pp. 389-405, 2009
- (2009) Comput. Speech Lang. , vol.23 , pp. 389-405
- Li, J.¹ Deng, L.² Yu, D.³ Gong, Y.⁴ Acero, A.⁵

6
- 34447100796
- CRC Press, Boca Raton, Florida
- P. C. Loizou, Speech Enhancement: Theory and Practice, CRC Press, Boca Raton, Florida, 2007
- (2007) Speech Enhancement: Theory and Practice
- Loizou, P.C.¹

7
- 84867584623
- Improvements to VTS feature enhancement
- J. Droppo, L. Deng, and A. Acero, "Improvements to VTS feature enhancement," in Proc. IEEE ICASSP, 2012, pp. 4677-4680
- (2012) Proc. IEEE ICASSP , pp. 4677-4680
- Droppo, J.¹ Deng, L.² Acero, A.³

8
- 33750376174
- Model-based feature enhancement with uncertainty decoding for noise robust ASR
- V. Stouten, H. Van Hamme, and P. Wambacq, "Model-based feature enhancement with uncertainty decoding for noise robust ASR," Speech Commun., vol. 48, pp. 1502-1514, 2006
- (2006) Speech Commun. , vol.48 , pp. 1502-1514
- Stouten, V.¹ Van Hamme, H.² Wambacq, P.³

9
- 84878567715
- Advances in noise robust digit recognition using hybrid exemplar-based techniques
- J. F. Gemmeke and H. Van Hamme, "Advances in noise robust digit recognition using hybrid exemplar-based techniques," in Proc. Interspeech, 2012
- (2012) Proc. Interspeech
- Gemmeke, J.F.¹ Van Hamme, H.²

10
- 56249136428
- Transforming binary uncertainties for robust speech recognition
- S. Srinivasan and D. L. Wang, "Transforming binary uncertainties for robust speech recognition," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, pp. 2130-2140, 2007
- (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , pp. 2130-2140
- Srinivasan, S.¹ Wang, D.L.²

11
- 0031187171
- Speech recognition by machines and humans
- R. P. Lippmann, "Speech recognition by machines and humans," Speech Commun., vol. 22, pp. 1-16, 1997
- (1997) Speech Commun. , vol.22 , pp. 1-16
- Lippmann, R.P.¹

12
- 0003684441
- MIT Press, Cambridge, MA
- A. S. Bregman, Auditory Scene Analysis, MIT Press, Cambridge, MA, 1990
- (1990) Auditory Scene Analysis
- Bregman, A.S.¹

13
- 82255178542
- Wiley/IEEE Press, Hoboken, NJ
- D. L. Wang and G. J. Brown, Eds., Computational Auditory Scene Analysis: Principles, Algorithms, and Applications, Wiley/IEEE Press, Hoboken, NJ, 2006
- (2006) Computational Auditory Scene Analysis: Principles, Algorithms, and Applications
- Wang, D.L.¹ Brown, G.J.²

14
- 84892233308
- On ideal binary masks as the computational goal of auditory scene analysis
- P. Divenyi, Ed., Kluwer Academic, Boston, MA
- D. L. Wang, "On ideal binary masks as the computational goal of auditory scene analysis," in Speech Separation by Humans and Machines, P. Divenyi, Ed., pp. 181-197. Kluwer Academic, Boston, MA, 2005
- (2005) Speech Separation by Humans and Machines , pp. 181-197
- Wang, D.L.¹

15
- 70349161218
- Role of mask pattern in intelligibility of ideal binary-masked noisy speech
- U. Kjems, J. B. Boldt, M. S. Pedersen, T. Lunner, and D. L. Wang, "Role of mask pattern in intelligibility of ideal binary-masked noisy speech," J. Acoust. Soc. Am., vol. 126, pp. 1415-1426, 2009
- (2009) J. Acoust. Soc. Am. , vol.126 , pp. 1415-1426
- Kjems, U.¹ Boldt, J.B.² Pedersen, M.S.³ Lunner, T.⁴ Wang, D.L.⁵

16
- 0035342414
- Robust automatic speech recognition with missing and uncertain acoustic data
- M. P. Cooke, P. Greene, L. Josifovski, and A. Vizinho, "Robust automatic speech recognition with missing and uncertain acoustic data," Speech Commun., vol. 34, pp. 141-177, 2001
- (2001) Speech Commun. , vol.34 , pp. 141-177
- Cooke, M.P.¹ Greene, P.² Josifovski, L.³ Vizinho, A.⁴

17
- 85032752225
- Missing-feature approaches in speech recognition
- B. Raj and R. Stern, "Missing-feature approaches in speech recognition," IEEE Signal Process. Mag., vol. 22, pp. 101-116, 2005
- (2005) IEEE Signal Process. Mag. , vol.22 , pp. 101-116
- Raj, B.¹ Stern, R.²

18
- 84877594942
- Tech. Rep. OSU-CISRC-7/11-TR21, Depart. Comput. Sc. Eng., The Ohio State University, Columbus, Ohio, USA
- W. Hartmann, A. Narayanan, E. Fosler-Lussier, and D. L. Wang, "Nothing doing: Re-evaluating missing feature ASR," Tech. Rep. OSU-CISRC-7/11-TR21, Depart. Comput. Sc. Eng., The Ohio State University, Columbus, Ohio, USA, 2011, Available: ftp://ftp.cse.ohio-state.edu/pub/tech-report/2011
- (2011) Nothing Doing: Re-evaluating Missing Feature ASR
- Hartmann, W.¹ Narayanan, A.² Fosler-Lussier, E.³ Wang, D.L.⁴

19
- 85008054377
- Unvoiced speech segregation from nonspeech interference via CASA and spectral subtraction
- K. Hu and D. L. Wang, "Unvoiced speech segregation from nonspeech interference via CASA and spectral subtraction," IEEE Trans. Audio, Speech, Lang. Process., vol. 19, pp. 1600-1609, 2011
- (2011) IEEE Trans. Audio, Speech, Lang. Process. , vol.19 , pp. 1600-1609
- Hu, K.¹ Wang, D.L.²

20
- 11144316019
- Decoding speech in the presence of other sources
- J. Barker, M. P. Cooke, and D. P. W. Ellis, "Decoding speech in the presence of other sources," Speech Commun., vol. 45, pp. 5-25, 2005
- (2005) Speech Commun. , vol.45 , pp. 5-25
- Barker, J.¹ Cooke, M.P.² Ellis, D.P.W.³

21
- 33750311718
- Binary and ratio time-frequency masks for robust speech recognition
- S. Srinivasan, N. Roman, and D. L. Wang, "Binary and ratio time-frequency masks for robust speech recognition," Speech Commun., vol. 48, pp. 1486-1501, 2006
- (2006) Speech Commun. , vol.48 , pp. 1486-1501
- Srinivasan, S.¹ Roman, N.² Wang, D.L.³

22
- 84878618681
- Coupling identification and reconstruction of missing features for noise-robust automatic speech recognition
- N. Ma and J. Barker, "Coupling identification and reconstruction of missing features for noise-robust automatic speech recognition," in Proc. Interspeech, 2012
- (2012) Proc. Interspeech
- Ma, N.¹ Barker, J.²

23
- 70350038037
- Robust speech recognition by integrating speech separation and hypothesis testing
- S. Srinivasan and D. L. Wang, "Robust speech recognition by integrating speech separation and hypothesis testing," Speech Commun., vol. 52, pp. 72-81, 2010
- (2010) Speech Commun. , vol.52 , pp. 72-81
- Srinivasan, S.¹ Wang, D.L.²

24
- 84878392281
- Improved model selection for the ASR-driven binary mask
- W. Hartmann and E. Fosler-Lussier, "Improved model selection for the ASR-driven binary mask," in Proc. Interspeech, 2012
- (2012) Proc. Interspeech
- Hartmann, W.¹ Fosler-Lussier, E.²

25
- 77957739976
- Advances in missing feature techniques for robust large-vocabulary continuous speech recognition
- M. Van Segbroeck and H. Van Hamme, "Advances in missing feature techniques for robust large-vocabulary continuous speech recognition," IEEE Trans. Audio, Speech, Lang. Process., vol. 19, pp. 123-137, 2011
- (2011) IEEE Trans. Audio, Speech, Lang. Process. , vol.19 , pp. 123-137
- Van Segbroeck, M.¹ Van Hamme, H.²

26
- 84877621926
- The role of binary mask pattern in automatic speech recognition in background noise
- in press
- A. Narayanan and D. L. Wang, "The role of binary mask pattern in automatic speech recognition in background noise," J. Acoust. Soc. Am., 2013, in press
- (2013) J. Acoust. Soc. Am.
- Narayanan, A.¹ Wang, D.L.²

27
- 78649580634
- Robust speech recognition from binary masks
- A. Narayanan and D. L. Wang, "Robust speech recognition from binary masks," J. Acoust. Soc. Am., vol. 128, pp. EL217-222, 2010
- (2010) J. Acoust. Soc. Am. , vol.128 , pp. EL217-222
- Narayanan, A.¹ Wang, D.L.²

28
- 85009227702
- Analysis of the Aurora large vocabulary evaluations
- N. Parihar and J. Picone, "Analysis of the Aurora large vocabulary evaluations," in Proc. ECSCT, 2003, pp. 337-340
- (2003) Proc. ECSCT , pp. 337-340
- Parihar, N.¹ Picone, J.²

29
- 0003548585
- J. S. Garofolo, L. F. Lamel, W. M. Fisher, J. G. Fiscus, D. S. Pallett, and N. L. Dahlgren, "DARPA TIMIT acoustic phonetic continuous speech corpus," 1993 [Online]. Available: http://www.ldc.upenn.edu/Catalog/LDC93S1.html
- (1993) DARPA TIMIT Acoustic Phonetic Continuous Speech Corpus
- Garofolo, J.S.¹ Lamel, L.F.² Fisher, W.M.³ Fiscus, J.G.⁴ Pallett, D.S.⁵ Dahlgren, N.L.⁶

30
- 0003822743
- Cambridge University Publishing Department
- S. Young, G. Evermann, T. Hain, D. Kershaw, G. Moore, J. Odell, D. Ollason, D. Povey, V. Valtchev, and P. Woodland, The HTK Book, Cambridge University Publishing Department, 2002, [Online]. Available: http://htk.eng.cam.ac.uk
- (2002) The HTK Book
- Young, S.¹ Evermann, G.² Hain, T.³ Kershaw, D.⁴ Moore, G.⁵ Odell, J.⁶ Ollason, D.⁷ Povey, D.⁸ Valtchev, V.⁹ Woodland, P.¹⁰

31
- 84865682906
- A CASA-based system for long-term SNR estimation
- A. Narayanan and DeLiang Wang, "A CASA-based system for long-term SNR estimation," IEEE Trans. Audio, Speech, Lang. Process., vol. 20, pp. 2518-2527, 2012
- (2012) IEEE Trans. Audio, Speech, Lang. Process. , vol.20 , pp. 2518-2527
- Narayanan, A.¹ Wang, D.²

32
- 78049364397
- MMSE based noise PSD tracking with low complexity
- R.C. Hendriks, R. Heusdens, and J. Jensen, "MMSE based noise PSD tracking with low complexity," in Proc. IEEE ICASSP, 2010, pp. 4266-4269
- (2010) Proc. IEEE ICASSP , pp. 4266-4269
- Hendriks, R.C.¹ Heusdens, R.² Jensen, J.³

33
- 70349093614
- An algorithm that improves speech intelligibility in noise for normal-hearing listeners
- G. Kim, Y. Lu, Y. Hu, and P. Loizou, "An algorithm that improves speech intelligibility in noise for normal-hearing listeners," J. Acoust. Soc. Am., vol. 126, pp. 1486-1494, 2009
- (2009) J. Acoust. Soc. Am. , vol.126 , pp. 1486-1494
- Kim, G.¹ Lu, Y.² Hu, Y.³ Loizou, P.⁴

34
- 84870477511
- Exploring monaural features for classification-based speech segregation
- Y. Wang, K. Han, and D. Wang, "Exploring monaural features for classification-based speech segregation," IEEE Trans. Audio, Speech, Lang. Process., vol. 21, pp. 270-279, 2013
- (2013) IEEE Trans. Audio, Speech, Lang. Process. , vol.21 , pp. 270-279
- Wang, Y.¹ Han, K.² Wang, D.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.