Volume 131, Issue 5, 2012, Pages EL368-EL374

Spectro-temporal modulation energy based mask for robust speaker identification

Author keywords

[No Author keywords available]

Indexed keywords

AUDIO ACOUSTICS; LOUDSPEAKERS; MODULATION; SIGNAL TO NOISE RATIO; SPEECH; TENSORS

EID: 84863799485     PISSN: 00014966     EISSN: None     Source Type: Journal    
DOI: 10.1121/1.3697534     Document Type: Article
Times cited: 15

References (19)
  • 1
    • 0031233424
    • Speaker recognition: A tutorial
    • J. P. Campbell, "Speaker recognition: A tutorial," Proc. IEEE 85, 1437-1462 (1997).
    • (1997) Proc. IEEE , vol.85 , pp. 1437-1462
    • Campbell, J.P.1
  • 2
    • 0029355999
    • Speaker identification and verification using Gaussian mixture speaker models
    • D. A. Reynolds, "Speaker identification and verification using Gaussian mixture speaker models," Speech Commun. 17, 91-108 (1995).
    • (1995) Speech Commun. , vol.17 , pp. 91-108
    • Reynolds, D.A.1
  • 5
    • 57349117784
    • Auditory sparse representation for robust speaker recognition based on tensor structure
    • Q. Wu and L. Zhang, "Auditory sparse representation for robust speaker recognition based on tensor structure," EURASIP J. Audio, Speech, Music Process. 2008, 578612 (2008).
    • (2008) EURASIP J. Audio, Speech, Music Process. , vol.2008
    • Wu, Q.1    Zhang, L.2
  • 6
    • 70449360175
    • Modulation spectral features for robust far-field speaker identification
    • T. H. Falk and W.-Y. Chan, "Modulation spectral features for robust far-field speaker identification," IEEE Trans. Audio, Speech, Lang. Process. 18, 90-100 (2010).
    • (2010) IEEE Trans. Audio, Speech, Lang. Process. , vol.18 , pp. 90-100
    • Falk, T.H.1    Chan, W.-Y.2
  • 7
    • 0031187171
    • Speech recognition by machines and humans
    • R. P. Lippmann, "Speech recognition by machines and humans," Speech Commun. 22, 1-15 (1997).
    • (1997) Speech Commun. , vol.22 , pp. 1-15
    • Lippmann, R.P.1
  • 8
    • 84892233308
    • On ideal binary mask as the computational goal of auditory scene analysis
    • edited by P. Divenyi (Kluwer, Norwell, MA)
    • D. L. Wang, "On ideal binary mask as the computational goal of auditory scene analysis," in Speech Separation by Humans and Machines, edited by P. Divenyi (Kluwer, Norwell, MA, 2005), pp. 181-197.
    • (2005) Speech Separation by Humans and Machines , pp. 181-197
    • Wang, D.L.1
  • 9
    • 33845354768
    • Isolating the energetic component of speech-on-speech masking with ideal time-frequency segregation
    • D. S. Brungart, P. S. Chang, B. D. Simpson, and D. L. Wang, "Isolating the energetic component of speech-on-speech masking with ideal time-frequency segregation," J. Acoust. Soc. Am. 120 (6), 4007-4018 (2006).
    • (2006) J. Acoust. Soc. Am. , vol.120 , Issue.6 , pp. 4007-4018
    • Brungart, D.S.1    Chang, P.S.2    Simpson, B.D.3    Wang, D.L.4
  • 10
    • 64649103540
    • Speech intelligibility in background noise with ideal binary time-frequency masking
    • D. L. Wang, U. Kjems, M. S. Pedersen, J. B. Boldt, and T. Lunner, "Speech intelligibility in background noise with ideal binary time-frequency masking," J. Acoust. Soc. Am. 125 (4), 2336-2347 (2009).
    • (2009) J. Acoust. Soc. Am. , vol.125 , Issue.4 , pp. 2336-2347
    • Wang, D.L.1    Kjems, U.2    Pedersen, M.S.3    Boldt, J.B.4    Lunner, T.5
  • 11
    • 70349093614
    • An algorithm that improves speech intelligibility in noise for normal-hearing listeners
    • G. Kim, Y. Lu, Y. Hu, and P. C. Loizou, "An algorithm that improves speech intelligibility in noise for normal-hearing listeners," J. Acoust. Soc. Am. 126 (3), 1486-1494 (2009).
    • (2009) J. Acoust. Soc. Am. , vol.126 , Issue.3 , pp. 1486-1494
    • Kim, G.1    Lu, Y.2    Hu, Y.3    Loizou, P.C.4
  • 12
    • 23744508888
    • Multi-resolution spectro-temporal analysis of complex sounds
    • T. Chi, P. Ru, and S. A. Shamma, "Multi-resolution spectro-temporal analysis of complex sounds," J. Acoust. Soc. Am. 118 (2), 887-906 (2005).
    • (2005) J. Acoust. Soc. Am. , vol.118 , Issue.2 , pp. 887-906
    • Chi, T.1    Ru, P.2    Shamma, S.A.3
  • 14
    • 0038711696
    • A spectro-temporal modulation index (STMI) for assessment of speech intelligibility
    • M. Elhilali, T. Chi, and S. A. Shamma, "A spectro-temporal modulation index (STMI) for assessment of speech intelligibility," Speech Commun. 41, 331-348 (2003).
    • (2003) Speech Commun. , vol.41 , pp. 331-348
    • Elhilali, M.1    Chi, T.2    Shamma, S.A.3
  • 15
    • 33750368310
    • An audio-visual corpus for speech perception and automatic speech recognition
    • M. Cooke, J. Barker, S. Cunningham, and X. Shao, "An audio-visual corpus for speech perception and automatic speech recognition," J. Acoust. Soc. Am. 120 (5), 2421-2424 (2006).
    • (2006) J. Acoust. Soc. Am. , vol.120 , Issue.5 , pp. 2421-2424
    • Cooke, M.1    Barker, J.2    Cunningham, S.3    Shao, X.4
  • 16
    • 84890477287
    • Robust emotion recognition by spectro-temporal modulation statistic features
    • T.-S. Chi, L.-Y. Yeh, and C.-C. Hsu, "Robust emotion recognition by spectro-temporal modulation statistic features," J. Ambient Intell. Human. Comput. 3 (2), 47-60 (2012).
    • (2012) J. Ambient Intell. Human. Comput. , vol.3 , Issue.2 , pp. 47-60
    • Chi, T.-S.1    Yeh, L.-Y.2    Hsu, C.-C.3
  • 18
    • 0033884858
    • Speaker verification using adapted Gaussian mixture models
    • D. A. Reynolds, T. F. Quatieri, and R. B. Dunn, "Speaker verification using adapted Gaussian mixture models," Digital Signal Process. 10, 19-41 (2000).
    • (2000) Digital Signal Process. , vol.10 , pp. 19-41
    • Reynolds, D.A.1    Quatieri, T.F.2    Dunn, R.B.3
  • 19
    • 56249136428
    • Transforming binary uncertainties for robust speech recognition
    • S. Srinivasan and D. L. Wang, "Transforming binary uncertainties for robust speech recognition," IEEE Trans. Audio, Speech Lang. Process. 15 (7), 2130-2140 (2007).
    • (2007) IEEE Trans. Audio, Speech Lang. Process. , vol.15 , Issue.7 , pp. 2130-2140
    • Srinivasan, S.1    Wang, D.L.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.