Volume 130, Issue 5, 2011, Pages 3013-3027

An evaluation of objective measures for intelligibility prediction of time-frequency weighted noisy speech

Author keywords

[No Author keywords available]

Indexed keywords

NOISY SPEECH; OBJECTIVE MEASURE; SINGLE-CHANNEL; TIME FREQUENCY

EID: 81355153924     PISSN: 00014966     EISSN: None     Source Type: Journal    
DOI: 10.1121/1.3641373     Document Type: Article
Times cited: 48

References (49)
  • 2
    • 0036805081
    • Perceptual evaluation of speech quality (PESQ): The new ITU standard for end-to-end speech quality assessment. Part II - Psychoacoustic model
    • Beerends, J. G., Hekstra, A. P., Rix, A. W., and Hollier, M. P. (2002). Perceptual evaluation of speech quality (PESQ): The new ITU standard for end-to-end speech quality assessment. Part II - Psychoacoustic model, J. Audio Eng. Soc. 50, 765-778. (Pubitemid 35296264)
    • (2002) AES: Journal of the Audio Engineering Society , vol.50 , Issue.10 , pp. 765-778
    • Beerends, J.G.1    Hekstra, A.P.2    Rix, A.W.3    Hollier, M.P.4
  • 4
    • 78649295196
    • Extension of ITU-T recommendation P. 862 PESQ towards measuring speech intelligibility with vocoders
    • Beerends, J. G., van Wijngaarden, S., and van Buuren, R. (2005). Extension of ITU-T recommendation P. 862 PESQ towards measuring speech intelligibility with vocoders, TNO Technical Report.
    • (2005) TNO Technical Report
    • Beerends, J.G.1    Van Wijngaarden, S.2    Van Buuren, R.3
  • 5
    • 84863763285
    • A simple correlation-based model of intelligibility for nonlinear speech enhancement and separation
    • Boldt, J. B., and Ellis, D. P. W. (2009). A simple correlation-based model of intelligibility for nonlinear speech enhancement and separation, in Proceedings of EUSIPCO, pp. 1849-1853.
    • (2009) Proceedings of EUSIPCO , pp. 1849-1853
    • Boldt, J.B.1    Ellis, D.P.W.2
  • 6
    • 33845354768
    • Isolating the energetic component of speech-on-speech masking with ideal time-frequency segregation
    • DOI 10.1121/1.2363929
    • Brungart, D. S., Chang, P. S., Simpson, B. D., and Wang, D. L. (2006). Isolating the energetic component of speech-on-speech masking with ideal time-frequency segregation, J. Acoust. Soc. Am. 120, 4007-4018. DOI 10.1121/1.2363929 (Pubitemid 44888096)
    • (2006) Journal of the Acoustical Society of America , vol.120 , Issue.6 , pp. 4007-4018
    • Brungart, D.S.1    Chang, P.S.2    Simpson, B.D.3    Wang, D.4
  • 7
    • 79952871923
    • Prediction of speech intelligibility based on an auditory preprocessing model
    • DOI 10.1016/j.specom.2010.03.004
    • Christiansen, C., Pedersen, M. S., and Dau, T. (2010). Prediction of speech intelligibility based on an auditory preprocessing model, Speech Commun. 52, 678-692. DOI 10.1016/j.specom.2010.03.004
    • (2010) Speech Commun. , vol.52 , pp. 678-692
    • Christiansen, C.1    Pedersen, M.S.2    Dau, T.3
  • 8
    • 0029952425
    • A quantitative model of the 'effective' signal processing in the auditory system. I. Model structure
    • DOI 10.1121/1.414959
    • Dau, T., Püschel, D., and Kohlrausch, A. (1996). A quantitative model of the 'effective' signal processing in the auditory system. I. Model structure, J. Acoust. Soc. Am. 99, 3615-3622. DOI 10.1121/1.414959 (Pubitemid 26190250)
    • (1996) Journal of the Acoustical Society of America , vol.99 , Issue.6 , pp. 3615-3622
    • Dau, T.1    Püschel, D.2    Kohlrausch, A.3
  • 10
    • 60049084444
    • The concept of signal-to-noise ratio in the modulation domain and speech intelligibility
    • DOI 10.1121/1.3001713
    • Dubbelboer, F., and Houtgast, T. (2008). The concept of signal-to-noise ratio in the modulation domain and speech intelligibility, J. Acoust. Soc. Am. 124, 3937-3946. DOI 10.1121/1.3001713
    • (2008) J. Acoust. Soc. Am. , vol.124 , pp. 3937-3946
    • Dubbelboer, F.1    Houtgast, T.2
  • 11
    • 0038711696
    • A spectro-temporal modulation index (STMI) for assessment of speech intelligibility
    • DOI 10.1016/S0167-6393(02)00134-6
    • Elhilali, M., Chi, T., and Shamma, S. (2003). A spectro-temporal modulation index (STMI) for assessment of speech intelligibility, Speech Commun. 41, 331-348. DOI 10.1016/S0167-6393(02)00134-6
    • (2003) Speech Commun. , vol.41 , pp. 331-348
    • Elhilali, M.1    Chi, T.2    Shamma, S.3
  • 12
    • 0021645331
    • Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator
    • DOI 10.1109/TASSP.1984.1164453
    • Ephraim, Y., and Malah, D. (1984). Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator, IEEE Trans. Acoust. Speech Signal Process. 32, 1109-1121. DOI 10.1109/TASSP.1984.1164453 (Pubitemid 15159457)
    • (1984) IEEE Transactions on Acoustics, Speech, and Signal Processing , vol.32 , Issue.6 , pp. 1109-1121
    • Ephraim, Y.1    Malah, D.2
  • 13
    • 51449104842
    • Minimum mean-square error estimation of discrete Fourier coefficients with generalized gamma priors
    • DOI 10.1109/TASL.2007.899233
    • Erkelens, J. S., Hendriks, R. C., Heusdens, R., and Jensen, J. (2007). Minimum mean-square error estimation of discrete Fourier coefficients with generalized gamma priors, IEEE Trans. Audio Speech Lang. Process. 15, 1741-1752. DOI 10.1109/TASL.2007.899233
    • (2007) IEEE Trans. Audio Speech Lang. Process. , vol.15 , pp. 1741-1752
    • Erkelens, J.S.1    Hendriks, R.C.2    Heusdens, R.3    Jensen, J.4
  • 14
    • 84953657538
    • Factors governing the intelligibility of speech sounds
    • DOI 10.1121/1.1916407
    • French, N. R., and Steinberg, J. C. (1947). Factors governing the intelligibility of speech sounds, J. Acoust. Soc. Am. 19, 90-119. DOI 10.1121/1.1916407
    • (1947) J. Acoust. Soc. Am. , vol.19 , pp. 90-119
    • French, N.R.1    Steinberg, J.C.2
  • 15
    • 11144348189
    • Analysis of speech-based speech transmission index methods with implications for nonlinear operations
    • DOI 10.1121/1.1804628
    • Goldsworthy, R. L., and Greenberg, J. E. (2004). Analysis of speech-based speech transmission index methods with implications for nonlinear operations, J. Acoust. Soc. Am. 116, 3679-3689. DOI 10.1121/1.1804628 (Pubitemid 40029948)
    • (2004) Journal of the Acoustical Society of America , vol.116 , Issue.6 , pp. 3679-3689
    • Goldsworthy, R.L.1    Greenberg, J.E.2
  • 16
    • 0017097474
    • Distance measures for speech processing
    • Gray, Jr., A. H., and Markel, J. D. (1976). Distance measures for speech processing, IEEE Trans. Acoust. Speech Signal Process. 24, 380-391. DOI 10.1109/TASSP.1976.1162849 (Pubitemid 8091024)
    • (1976) IEEE Trans. Acoust. Speech Signal Process. , vol.24 , Issue.5 , pp. 380-391
    • Gray Jr., A.H.1    Markel, J.D.2
  • 19
    • 35248891610
    • A comparative intelligibility study of single-microphone noise reduction algorithms
    • DOI 10.1121/1.2766778
    • Hu, Y., and Loizou, P. C. (2007a). A comparative intelligibility study of single-microphone noise reduction algorithms, J. Acoust. Soc. Am. 122, 1777-1786. DOI 10.1121/1.2766778 (Pubitemid 47560539)
    • (2007) Journal of the Acoustical Society of America , vol.122 , Issue.3 , pp. 1777-1786
    • Hu, Y.1    Loizou, P.C.2
  • 20
    • 34447092407
    • Subjective comparison and evaluation of speech enhancement algorithms
    • DOI 10.1016/j.specom.2006.12.006, PII S0167639306001920
    • Hu, Y., and Loizou, P. C. (2007b). Subjective comparison and evaluation of speech enhancement algorithms, Speech Commun. 49, 588-601. DOI 10.1016/j.specom.2006.12.006 (Pubitemid 47031352)
    • (2007) Speech Communication , vol.49 , Issue.7-8 , pp. 588-601
    • Hu, Y.1    Loizou, P.C.2
  • 21
    • 44149106061
    • Evaluation of objective quality measures for speech enhancement
    • DOI 10.1109/TASL.2007.911054
    • Hu, Y., and Loizou, P. C. (2008). Evaluation of objective quality measures for speech enhancement, IEEE Trans. Audio Speech Lang. Process. 16, 229-238. DOI 10.1109/TASL.2007.911054
    • (2008) IEEE Trans. Audio Speech Lang. Process. , vol.16 , pp. 229-238
    • Hu, Y.1    Loizou, P.C.2
  • 22
    • 0014704814
    • A statistical method for estimation of speech spectral density and formant frequencies
    • Itakura, F., and Saito, S. (1970). A statistical method for estimation of speech spectral density and formant frequencies, Electron. Commun. Jpn. 53, 36-43.
    • (1970) Electron. Commun. Jpn. , vol.53 , pp. 36-43
    • Itakura, F.1    Saito, S.2
  • 23
    • 17644399140
    • Coherence and the speech intelligibility index
    • DOI 10.1121/1.1862575
    • Kates, J. M., and Arehart, K. H. (2005). Coherence and the speech intelligibility index, J. Acoust. Soc. Am. 117, 2224-2237. DOI 10.1121/1.1862575 (Pubitemid 40570480)
    • (2005) Journal of the Acoustical Society of America , vol.117 , Issue.4 , pp. 2224-2237
    • Kates, J.M.1    Arehart, K.H.2
  • 25
    • 70349161218
    • Role of mask pattern in intelligibility of ideal binary-masked noisy speech
    • DOI 10.1121/1.3179673
    • Kjems, U., Boldt, J. B., Pedersen, M. S., Lunner, T., and Wang, D. (2009). Role of mask pattern in intelligibility of ideal binary-masked noisy speech, J. Acoust. Soc. Am. 126, 1415-1426. DOI 10.1121/1.3179673
    • (2009) J. Acoust. Soc. Am. , vol.126 , pp. 1415-1426
    • Kjems, U.1    Boldt, J.B.2    Pedersen, M.S.3    Lunner, T.4    Wang, D.5
  • 28
    • 84889381426
    • Methods for the calculation and use of the articulation index
    • DOI 10.1121/1.1909094
    • Kryter, K. D. (1962). Methods for the calculation and use of the articulation index, J. Acoust. Soc. Am. 34, 1689-1697. DOI 10.1121/1.1909094
    • (1962) J. Acoust. Soc. Am. , vol.34 , pp. 1689-1697
    • Kryter, K.D.1
  • 30
    • 0032166975
    • Mimicking the human ear
    • Loizou, P. (1998). Mimicking the human ear, IEEE Sign. Process. Mag. 15, 101-130. DOI 10.1109/79.708543 (Pubitemid 128634179)
    • (1998) IEEE Signal Processing Magazine , vol.15 , Issue.5 , pp. 101-130
    • Loizou, P.C.1
  • 32
    • 0027868016
    • Evaluation of a noise reduction method - Comparison between observed scores and scores predicted from STI
    • Ludvigsen, C., Elberling, C., and Keidser, G. (1993). Evaluation of a noise reduction method - Comparison between observed scores and scores predicted from STI, Scand. Audiol. Suppl. 38, 50-55. (Pubitemid 23362792)
    • (1993) Scandinavian Audiology, Supplement , vol.22 , Issue.38 , pp. 50-55
    • Ludvigsen, C.1    Elberling, C.2    Keidser, G.3
  • 33
    • 65549157071
    • Objective measures for predicting speech intelligibility in noisy conditions based on new band-importance functions
    • DOI 10.1121/1.3097493
    • Ma, J., Hu, Y., and Loizou, P. (2009). Objective measures for predicting speech intelligibility in noisy conditions based on new band-importance functions, J. Acoust. Soc. Am. 125, 3387-3405. DOI 10.1121/1.3097493
    • (2009) J. Acoust. Soc. Am. , vol.125 , pp. 3387-3405
    • Ma, J.1    Hu, Y.2    Loizou, P.3
  • 34
    • 0035396555
    • Noise power spectral density estimation based on optimal smoothing and minimum statistics
    • DOI 10.1109/89.928915, PII S106366760104980X
    • Martin, R. (2001). Noise power spectral density estimation based on optimal smoothing and minimum statistics, IEEE Trans. Speech Audio Process. 9, 504-512. DOI 10.1109/89.928915 (Pubitemid 32631178)
    • (2001) IEEE Transactions on Speech and Audio Processing , vol.9 , Issue.5 , pp. 504-512
    • Martin, R.1
  • 35
    • 85009100883
    • Usefulness of phase spectrum in human speech perception
    • Paliwal, K. K., and Alsteris, L. (2003). Usefulness of phase spectrum in human speech perception, in Proceedings of Interspeech, pp. 2117-2120.
    • (2003) Proceedings of Interspeech , pp. 2117-2120
    • Paliwal, K.K.1    Alsteris, L.2
  • 37
    • 0029007678
    • Quantifying the relation between speech quality and speech intelligibility
    • Preminger, J., and Tasell, D. (1995). Quantifying the relation between speech quality and speech intelligibility, J. Speech Lang. Hear. Res. 38, 714.
    • (1995) J. Speech Lang. Hear. Res. , vol.38 , pp. 714
    • Preminger, J.1    Tasell, D.2
  • 39
    • 17644371385
    • A Speech Intelligibility Index-based approach to predict the speech reception threshold for sentences in fluctuating noise for normal-hearing listeners
    • DOI 10.1121/1.1861713
    • Rhebergen, K. S., and Versfeld, N. J. (2005). A Speech Intelligibility Index-based approach to predict the speech reception threshold for sentences in fluctuating noise for normal-hearing listeners, J. Acoust. Soc. Am. 117, 2181-2192. DOI 10.1121/1.1861713 (Pubitemid 40570476)
    • (2005) Journal of the Acoustical Society of America , vol.117 , Issue.4 , pp. 2181-2192
    • Rhebergen, K.S.1    Versfeld, N.J.2
  • 40
    • 0028823541
    • Speech recognition with primarily temporal cues
    • DOI 10.1126/science.270.5234.303
    • Shannon, R., Zeng, F., Kamath, V., Wygonski, J., and Ekelid, M. (1995). Speech recognition with primarily temporal cues, Science 270, 303. DOI 10.1126/science.270.5234.303
    • (1995) Science , vol.270 , pp. 303
    • Shannon, R.1    Zeng, F.2    Kamath, V.3    Wygonski, J.4    Ekelid, M.5
  • 44
    • 70450161547
    • An evaluation of objective quality measures for speech intelligibility prediction
    • Taal, C. H., Hendriks, R. C., Heusdens, R., Jensen, J., and Kjems, U. (2009). An evaluation of objective quality measures for speech intelligibility prediction, in Proceedings of Interspeech, pp. 1947-1950.
    • (2009) Proceedings of Interspeech , pp. 1947-1950
    • Taal, C.H.1    Hendriks, R.C.2    Heusdens, R.3    Jensen, J.4    Kjems, U.5
  • 47
    • 0037504237
    • Design, optimization and evaluation of a Danish sentence test in noise
    • Wagener, K., Josvassen, J. L., and Ardenkjaer, R. (2003). Design, optimization and evaluation of a Danish sentence test in noise, Int. J. Audiol. 42, 10-17. DOI 10.3109/14992020309056080 (Pubitemid 37372682)
    • (2003) International Journal of Audiology , vol.42 , Issue.1 , pp. 10-17
    • Wagener, K.1    Josvassen, J.L.2    Ardenkjaer, R.3
  • 48
    • 84892233308
    • On ideal binary mask as the computational goal of auditory scene analysis
    • edited by P. Divenyi (Springer, New York)
    • Wang, D. (2005). On ideal binary mask as the computational goal of auditory scene analysis, in Speech Separation by Humans and Machines, edited by P. Divenyi (Springer, New York), pp. 181-197.
    • (2005) Speech Separation by Humans and Machines , pp. 181-197
    • Wang, D.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.