SCOPUS Information Search Platform

Volume 20, Issue 1, 2011, Pages 74-84

ASR systems in noisy environment: Analysis and solutions for increasing noise robustness

CZECH TECHNICAL UNIVERSITY IN PRAGUE (Czech Republic)

Author keywords

Feature extraction; Front end; Noisy speech; Parameterization; Robust ASR; Robust speech recognition; Spectral subtraction; Voice activity detection

EID: 79955978656 PISSN: 12102512 EISSN: None Source Type: Journal
DOI: None Document Type: Article

Times cited: 15

References (36)

1
- 85079234583
- On the limitations of Cepstral features in noise
- OPENSHAW, J. P., MASON, J. S. On the limitations of Cepstral features in noise. In Proc. ICASSP, 1994, vol. 2, p. 49-52.
- (1994) Proc. ICASSP , vol.2 , pp. 49-52
- Openshaw, J.P.¹ Mason, J.S.²

2
- 85009065130
- A comparison of LPC and FFT-based acoustic features for noise robust ASR
- WET, F. de, CRANEN, B., VETH, J. de, BOVES, L. A comparison of LPC and FFT-based acoustic features for noise robust ASR. In Eurospeech 2001, p. 865-868.
- (2001) Eurospeech , pp. 865-868
- de Wet, F.¹ Cranen, B.² de Veth, J.³ Boves, L.⁴

3
- 33745208682
- Noise robust front-end for ASR using spectral subtraction, spectral flooring and cumulative distribution mapping
- CHOI, E. Noise robust front-end for ASR using spectral subtraction, spectral flooring and cumulative distribution mapping. In Proc. 10th Australian Int. Conf. on Speech Science and Technology, SST'04. Sydney (Australia), Dec. 2004, p. 451-456.
- (2004) Proc. 10th Australian Int. Conf. on Speech Science and Technology , pp. 451-456
- Choi, E.¹

4
- 0038000235
- Technical report CUED/F-INFENG/TR 135, Cambridge, England
- GALES, M. J. F., YOUNG, S. J. Parallel Model Combination for Speech Recognition in Noise. Technical report CUED/F-INFENG/TR 135, Cambridge, England, 1993.
- (1993) Parallel Model Combination for Speech Recognition in Noise
- Gales, M.J.F.¹ Young, S.J.²

5
- 0032658253
- Temporal patterns (TRAPs) in ASR of noisy speech
- HERMANSKY, H., SHARMA, S. Temporal patterns (TRAPs) in ASR of noisy speech. In ICASSP '99: Proc. of IEEE Int. Conf. on the Acoustics, Speech, and Signal Processing. Washington DC (USA), IEEE Computer Society, 1999, p. 289-292.
- (1999) ICASSP '99: Proc. of IEEE Int. Conf. on the Acoustics, Speech, and Signal Processing , pp. 289-292
- Hermansky, H.¹ Sharma, S.²

6
- 0019053271
- Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
- DAVIS, S., MERMELSTEIN, P. Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Transactions on Acoustics, Speech and Signal Processing, Aug 1980, vol. 28, p. 357-366.
- (1980) IEEE Transactions on Acoustics, Speech and Signal Processing , vol.28 , pp. 357-366
- Davis, S.¹ Mermelstein, P.²

7
- 0025041264
- Perceptual linear predictive (PLP) analysis of speech
- HERMANSKY, H. Perceptual linear predictive (PLP) analysis of speech. In Proc. JASA, April 1990, vol. 87, no. 4.
- (1990) In Proc. JASA , vol.87 , Issue.4
- Hermansky, H.¹

8
- 0024679985
- Quality improvement of LPC-processed noisy speech by using spectral subtraction
- KANG, G. S., FRANSEN, L. J. Quality improvement of LPC-processed noisy speech by using spectral subtraction. IEEE Trans. on ASSP, June 1989, vol. 37, no. 6, p. 939-942.
- (1989) IEEE Trans. on ASSP , vol.37 , Issue.6 , pp. 939-942
- Kang, G.S.¹ Fransen, L.J.²

9
- 65549171142
- Likelihood-maximizing-based multi-band spectral subtraction for robust speech recognition
- BABA ALI, B., SAMETI, H., SAFAYANI, M. Likelihood-maximizing-based multi-band spectral subtraction for robust speech recognition. EURASIP Journal on Advances in Signal Processing, 2009. Article ID 878105, 15 p.
- (2009) EURASIP Journal on Advances in Signal Processing , pp. 15
- Baba Ali, B.¹ Sameti, H.² Safayani, M.³

10
- 21444440065
- A system for Mandarin short phrase recognition on portable devices
- XU, C., LIU, Y., YANG, Y. S., et al. A system for Mandarin short phrase recognition on portable devices. In Proc. of Int. Symp. on Chinese Spoken Language Processing, 2004.
- (2004) Proc. of Int. Symp. on Chinese Spoken Language Processing
- Xu, C.¹ Liu, Y.² Yang, Y.S.³

11
- 0021645331
- Speech enhancement using a minimum mean square error short time spectral amplitude estimator
- EPHRAIM, Y., MALAH, D. Speech enhancement using a minimum mean square error short time spectral amplitude estimator. IEEE Trans. on ASSP, Dec. 1984, vol. 32, no. 6, p. 1109-1121.
- (1984) IEEE Trans. on ASSP , vol.32 , Issue.6 , pp. 1109-1121
- Ephraim, Y.¹ Malah, D.²

12
- 85009154262
- Modeling the mixtures of known noise and unknown unexpected noise for robust speech recognition
- MING, J., JANCOVIC, P., HANNA, P., STEWART, D. Modeling the mixtures of known noise and unknown unexpected noise for robust speech recognition. In Proc. of Eurospeech'2001. Aalborg (Denmark), 2001, p. 579-582.
- (2001) Proc. of Eurospeech'2001 , pp. 579-582
- Ming, J.¹ Jancovic, P.² Hanna, P.³ Stewart, D.⁴

13
- 0025681008
- Hidden Markov model decomposition of speech and noise
- VARGA, A. P., MOORE, R. E. Hidden Markov model decomposition of speech and noise. In Proc. ICASSP, 1990, p. 845-848.
- (1990) Proc. ICASSP , pp. 845-848
- Varga, A.P.¹ Moore, R.E.²

14
- 84867213516
- Eigen-MLLR environment/speaker compensation for robust speech recognition
- LIAO, Y. F., FANG, H. H., HSU, C. H. Eigen-MLLR environment/speaker compensation for robust speech recognition. In Proc. Interspeech'08. Brisbane (Australia), September 2008, p. 1249-1252.
- (2008) Proc. Interspeech'08 , pp. 1249-1252
- Liao, Y.F.¹ Fang, H.H.² Hsu, C.H.³

15
- 0029288633
- Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
- LEGGETTER, C. J., WOODLAND, P. C. Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models. Computer Speech & Language, April 1995, vol. 9, no. 2, p. 171-185.
- (1995) Computer Speech & Language , vol.9 , Issue.2 , pp. 171-185
- Leggetter, C.J.¹ Woodland, P.C.²

16
- 0028419019
- Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains
- GAUVAIN, J. L., LEE, C. H. Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains. IEEE Trans. on SAP, 1994, vol. 2, no. 2, p. 291-298.
- (1994) IEEE Trans. on SAP , vol.2 , Issue.2 , pp. 291-298
- Gauvain, J.L.¹ Lee, C.H.²

17
- 0032638028
- Database and online adaptation for improved speech recognition in car environments
- FISHER, A., STAHL, V. Database and online adaptation for improved speech recognition in car environments. In Proc. ICASSP'99, p. 445-448.
- Proc. ICASSP'99 , pp. 445-448
- Fisher, A.¹ Stahl, V.²

18
- 84946730259
- TRAP-TANDEM: Data-driven extraction of temporal features from speech
- HERMANSKY, H. TRAP-TANDEM: Data-driven extraction of temporal features from speech. In Proc. of ASRU'03. Martigny (Switzerland), 2003, p. 255-260.
- (2003) Proc. of ASRU'03 , pp. 255-260
- Hermansky, H.¹

19
- 0032638669
- Fitting the Mel scale
- UMESH, S., COHEN, L., NELSON, D. Fitting the Mel scale. In Proc. ICASSP, 1999, vol. 1, p. 217-220.
- (1999) Proc. ICASSP , vol.1 , pp. 217-220
- Umesh, S.¹ Cohen, L.² Nelson, D.³

20
- 33745193024
- Revising Perceptual Linear Prediction (PLP)
- HÖNIG, F., STEMMER, G., HACKER, C., BRUGNARA, F. Revising Perceptual Linear Prediction (PLP). In Eurospeech 2005, p. 2997-3000.
- (2005) Eurospeech , pp. 2997-3000
- Hönig, F.¹ Stemmer, G.² Hacker, C.³ Brugnara, F.⁴

21
- 33646767079
- Acoustic feature combination for robust speech recognition
- ZOLNAY, A., SCHLÜTER, R., NEY, H. Acoustic feature combination for robust speech recognition. In ICASSP'05. Philadelphia (PA, USA), March 2005, vol. 1, p. 457-460.
- (2005) ICASSP'05 , vol.1 , pp. 457-460
- Zolnay, A.¹ Schlüter, R.² Ney, H.³

22
- 34547539413
- Gamma tone features and feature combination for large vocabulary speech recognition
- SCHLÜTER, R., BEZRUKOV, I., WAGNER, H., NEY, H. Gamma tone features and feature combination for large vocabulary speech recognition. In ICASSP 2007. Honolulu (HI, USA), April 2007, p. 649-652.
- (2007) ICASSP 2007 , pp. 649-652
- Schlüter, R.¹ Bezrukov, I.² Wagner, H.³ Ney, H.⁴

23
- 85009115888
- An auditory system-based feature for robust speech recognition
- LI, Q., SOONG, F. K., SIOHAN, O. An auditory system-based feature for robust speech recognition. In Eurospeech 2001, p. 619-622.
- (2001) Eurospeech , pp. 619-622
- Li, Q.¹ Soong, F.K.² Siohan, O.³

24
- 84942591958
- The influence of a filter shape in the telephone-based recognition module using PLP parameterization
- PSUTKA, J., MÜLLER, L., PSUTKA, J. V. The influence of a filter shape in the telephone-based recognition module using PLP parameterization. In TSD 2001. Berlin, Springer-Verlag 2001, p. 222-228.
- (2001) TSD 2001 , pp. 222-228
- Psutka, J.¹ Müller, L.² Psutka, J.V.³

25
- 74549179210
- Comparison of MFCC and PLP parameterizations in the speaker independent continuous speech recognition task
- PSUTKA, J., MÜLLER, L., PSUTKA, J. V. Comparison of MFCC and PLP parameterizations in the speaker independent continuous speech recognition task. In Eurospeech 2001, p. 1813-1816.
- (2001) Eurospeech , pp. 1813-1816
- Psutka, J.¹ Müller, L.² Psutka, J.V.³

26
- 84871076921
- Additive noise and channel distortion-robust parameterization tool - performance evaluation on Aurora 2&3
- FOUSEK, P., POLLÁK, P. Additive noise and channel distortion-robust parameterization tool - performance evaluation on Aurora 2&3. In Eurospeech 2003, p. 1785-1788.
- (2003) Eurospeech , pp. 1785-1788
- Fousek, P.¹ Pollák, P.²

27
- 34548809160
- Computational auditory scene analysis and its application to robot audition: Five years experience
- OKUNO, H. G., OGATA, T., KOMATANI, K. Computational auditory scene analysis and its application to robot audition: Five years experience. In 2nd Int. Conf. on Informatics Research for Development of Knowledge Society Infrastructure. ICKS. IEEE Computer Society, Washington, DC, 2007, p. 69-76.
- (2007) 2nd Int. Conf. on Informatics Research for Development of Knowledge Society Infrastructure , pp. 69-76
- Okuno, H.G.¹ Ogata, T.² Komatani, K.³

28
- 0035396555
- Noise power spectral density estimation based on optimal smoothing and minimum statistics
- MARTIN, R. Noise power spectral density estimation based on optimal smoothing and minimum statistics. IEEE Trans. on Speech and Audio Processing, July 2001, vol. 9, no. 5, p. 504-512.
- (2001) IEEE Trans. on Speech and Audio Processing , vol.9 , Issue.5 , pp. 504-512
- Martin, R.¹

29
- 0242671638
- Extended spectral subtraction
- SOVKA, P., POLLÁK, P., KYBIC, J. Extended spectral subtraction. In European Signal Processing Conference (EUSIPCO'96). Trieste (Italy), September 1996.
- (1996) European Signal Processing Conference (EUSIPCO'96)
- Sovka, P.¹ Pollák, P.² Kybic, J.³

30
- 85009215795
- Noise reduction applied in real time speech recognition system
- NOVOTNY, J., MACHACEK, L. Noise reduction applied in real time speech recognition system. In Polish-Czech-Hungarian Workshop on Circuit Theory, Signal Processing, and Telecommunication Networks. Budapest (Hungary), September 2001.
- (2001) Polish-Czech-Hungarian Workshop on Circuit Theory, Signal Processing, and Telecommunication Networks
- Novotny, J.¹ Machacek, L.²

31
- 79955952939
- HTK speech recognition toolkit. [Online]. Ver. 3.3. July 2005. Available at: http://htk.eng.cam.ac.uk/
- (2005)

32
- 79955976718
- CtuCopy. [Online]. Ver. 3.0.11. Available at: http://noel.feld.cvut.cz/speechlab/en/download/CtuCopy_3.0.11.tar.bz2
- (2020)

33
- 79955944980
- SPEECON database distributed through the European Language Resources Association [Online]. Available at: http://catalog.elra.info/search_result.php?keywords=speecon&language=en&osCsid=66
- SPEECON database distributed through the European Language Resources Association [Online]

34
- 77949361917
- The AURORA experimental framework for the performance evaluations of speech recognition systems under noisy conditions
- HIRSCH, H. G., PEARCE, D. The AURORA experimental framework for the performance evaluations of speech recognition systems under noisy conditions. In ISCA ITRW ASR2000 Automatic Speech Recognition: Challenges for the Next Millennium. Paris (France), September 2000.
- (2000) ISCA ITRW ASR2000 Automatic Speech Recognition: Challenges for the Next Millennium
- Hirsch, H.G.¹ Pearce, D.²

35
- 79955945159
- Voice activity detection based on perceptual cepstral analysis
- RAJNOHA, J., POLLÁK, P. Voice activity detection based on perceptual cepstral analysis. In Technical Computing Prague 2008 [CD-ROM]. Prague: HUMUSOFT, 2008, vol. 1, p. 1-9. (in Czech).
- (2008) Technical Computing Prague 2008 [CD-ROM] , vol.1 , pp. 1-9
- Rajnoha, J.¹ Pollák, P.²

36
- 79955952136
- ETSI Distributed speech recognition ES 202 050 standard [online]. Available at: http://www.etsi.org/WebSite/Technologies/DistributedSpeechRecognition.aspx
- (2020) ETSI Distributed speech recognition ES 202 050 standard [online]

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.