



Volume 20, Issue 5, 2012, Pages 1608-1616

CASA-Based robust speaker identification

Author keywords

Computational auditory scene analysis (CASA); gammatone frequency cepstral coefficient (GFCC); ideal binary mask; robust speaker identification

Indexed keywords

AUDITORY PERCEPTION; CEPSTRAL COEFFICIENTS; COMPUTATIONAL AUDITORY SCENE ANALYSIS; IDEAL BINARY MASK; MARGINALIZATION; NOISY SPEECH; PERFORMANCE IMPROVEMENTS; RELATED SYSTEMS; ROBUST SPEAKER IDENTIFICATION; SPEAKER CHARACTERISTICS; SPEAKER RECOGNITION SYSTEM; TIME FREQUENCY;

EID: 84859024513     PISSN: 15587916     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASL.2012.2186803     Document Type: Article
Times cited : (161)

References (39)
  • 1
    • 65249170183
    • Speaker model clustering for efficient speaker identification in large population applications
    • May
    • V. R. Apsingekar and P. L. De Leon, "Speaker model clustering for efficient speaker identification in large population applications," IEEE Trans. Audio, Speech, Lang. Process., vol. 17, no. 4, pp. 848-853, May 2009.
    • (2009) IEEE Trans. Audio, Speech, Lang. Process. , vol.17 , Issue.4 , pp. 848-853
    • Apsingekar, V.R.1    De Leon, P.L.2
  • 3
    • 0031233424
    • Speaker recognition: A tutorial
    • PII S0018921997069478
    • J. P. Campbell, "Speaker recognition: A tutorial," Proc. IEEE, vol. 85, no. 9, pp. 1437-1462, Sep. 1997. (Pubitemid 127745630)
    • (1997) Proceedings of the IEEE , vol.85 , Issue.9 , pp. 1437-1462
    • Campbell, J.P.1
  • 4
    • 0035342414
    • Robust automatic speech recognition with missing and unreliable acoustic data
    • DOI 10.1016/S0167-6393(00)00034-0, PII S0167639300000340
    • M. Cooke, P. Green, L. Josifovski, and A. Vizinho, "Robust automatic speech recognition with missing and unreliable acoustic data," Speech Commun., vol. 34, pp. 267-285, 2001. (Pubitemid 32284867)
    • (2001) Speech Communication , vol.34 , Issue.3 , pp. 267-285
    • Cooke, M.1    Green, P.2    Josifovski, L.3    Vizinho, A.4
  • 6
    • 0019053271
    • Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
    • S. Davis and P. Mermelstein, "Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-28, no. 4, pp. 357-366, Aug. 1980. (Pubitemid 11464930)
    • (1980) IEEE Transactions on Acoustics, Speech, and Signal Processing , vol.ASSP-28 , Issue.4 , pp. 357-366
    • Davis, S.B.1    Mermelstein, P.2
  • 7
    • 85135252448
    • Missing features detection and handling for robust speaker verification
    • M. El-Maliki and A. Drygajlo, "Missing features detection and handling for robust speaker verification," in Proc. Eurospeech, 1999, pp. 975-978.
    • (1999) Proc. Eurospeech , pp. 975-978
    • El-Maliki, M.1    Drygajlo, A.2
  • 8
    • 69949172494
    • 40 years of progress in automatic speaker recognition
    • S. Furui, "40 years of progress in automatic speaker recognition," Lecture Notes Comput. Sci., vol. 5558, pp. 1050-1059, 2009.
    • (2009) Lecture Notes Comput. Sci. , vol.5558 , pp. 1050-1059
    • Furui, S.1
  • 9
    • 0019555090
    • Cepstral analysis technique for automatic speaker verification
    • S. Furui, "Cepstral analysis technique for automatic speaker verification," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-29, no. 2, pp. 254-272, Apr. 1981. (Pubitemid 11495877)
    • (1981) IEEE Transactions on Acoustics, Speech, and Signal Processing , vol.ASSP-29 , Issue.2 , pp. 254-272
    • Furui, S.1
  • 10
    • 0036296948
    • Noise-robust open-set speaker recognition using noise-dependent Gaussian mixture classifier
    • Y. Gong, "Noise-robust open-set speaker recognition using noise-dependent Gaussian mixture classifier," in Proc. ICASSP, 2002, pp. 133-136.
    • (2002) Proc. ICASSP , pp. 133-136
    • Gong, Y.1
  • 12
    • 85008056718
    • HMM-based multipitch tracking for noisy and reverberant speech
    • Jul.
    • Z. Jin and D. L. Wang, "HMM-based multipitch tracking for noisy and reverberant speech," IEEE Trans. Audio, Speech, Lang. Process., vol. 19, no. 5, pp. 1091-1102, Jul. 2011.
    • (2011) IEEE Trans. Audio, Speech, Lang. Process. , vol.19 , Issue.5 , pp. 1091-1102
    • Jin, Z.1    Wang, D.L.2
  • 13
    • 65249103478
    • A supervised learning approach to monaural segregation of reverberant speech
    • May
    • Z. Jin and D. L. Wang, "A supervised learning approach to monaural segregation of reverberant speech," IEEE Trans. Audio, Speech, Lang. Process., vol. 17, no. 4, pp. 625-638, May 2009.
    • (2009) IEEE Trans. Audio, Speech, Lang. Process. , vol.17 , Issue.4 , pp. 625-638
    • Jin, Z.1    Wang, D.L.2
  • 14
    • 0002560960
    • A database for speaker-independent digit recognition
    • R. G. Leonard, "A database for speaker-independent digit recognition," in Proc. ICASSP, 1984, pp. 328-331.
    • (1984) Proc. ICASSP , pp. 328-331
    • Leonard, R.G.1
  • 15
    • 78049408631
    • Robust speaker identification using an auditory-based feature
    • Q. Li and Y. Huang, "Robust speaker identification using an auditory-based feature," in Proc. ICASSP, 2010, pp. 4514-4517.
    • (2010) Proc. ICASSP , pp. 4514-4517
    • Li, Q.1    Huang, Y.2
  • 16
    • 58149196390
    • On the optimality of ideal binary time-frequency masks
    • Y. Li and D. L. Wang, "On the optimality of ideal binary time-frequency masks," Speech Commun., vol. 51, pp. 230-239, 2009.
    • (2009) Speech Commun. , vol.51 , pp. 230-239
    • Li, Y.1    Wang, D.L.2
  • 17
    • 0030125219
    • Speaker recognition using HMM composition in noisy environments
    • DOI 10.1006/csla.1996.0007
    • T. Matsui, T. Kanno, and S. Furui, "Speaker recognition using HMM composition in noisy environments," Comput. Speech Lang., vol. 10, pp. 107-116, 1996. (Pubitemid 126346924)
    • (1996) Computer Speech and Language , vol.10 , Issue.2 , pp. 107-116
    • Matsui, T.1    Kanno, T.2    Furui, S.3
  • 20
    • 85009227702
    • Analysis of the Aurora large vocabulary evaluations
    • N. Parihar and J. Picone, "Analysis of the Aurora large vocabulary evaluations," in Proc. Eurospeech, 2003, pp. 337-340.
    • (2003) Proc. Eurospeech , pp. 337-340
    • Parihar, N.1    Picone, J.2
  • 23
    • 51449083412
    • Robust speaker identification using combined feature selection and missing data recognition
    • D. Pullella, M. Kühne, and R. Togneri, "Robust speaker identification using combined feature selection and missing data recognition," in Proc. ICASSP, 2008, pp. 4833-4836.
    • (2008) Proc. ICASSP , pp. 4833-4836
    • Pullella, D.1    Kühne, M.2    Togneri, R.3
  • 24
    • 4644336054
    • Reconstruction of missing features for robust speech recognition
    • B. Raj, M. L. Seltzer, and R. M. Stern, "Reconstruction of missing features for robust speech recognition," Speech Commun., vol. 43, pp. 275-296, 2004.
    • (2004) Speech Commun. , vol.43 , pp. 275-296
    • Raj, B.1    Seltzer, M.L.2    Stern, R.M.3
  • 25
    • 85075924869
    • Comparison of background normalization methods for text-independent speaker verification
    • D. A. Reynolds, "Comparison of background normalization methods for text-independent speaker verification," in Proc. Eurospeech, 1997, pp. 963-966.
    • (1997) Proc. Eurospeech , pp. 963-966
    • Reynolds, D.A.1
  • 26
    • 0029355999
    • Speaker identification and verification using Gaussian mixture speaker models
    • D. A. Reynolds, "Speaker identification and verification using Gaussian mixture speaker models," Speech Commun., vol. 17, pp. 91-108, 1995.
    • (1995) Speech Commun. , vol.17 , pp. 91-108
    • Reynolds, D.A.1
  • 28
    • 0033889739
    • Speaker verification by human listeners: Experiments comparing human and machine performance using the NIST 1998 speaker evaluation data
    • A. Schmidt-Nielsen and T. H. Crystal, "Speaker verification by human listeners: Experiments comparing human and machine performance using the NIST 1998 speaker evaluation data," Digital Signal Process., vol. 10, pp. 249-266, 2000.
    • (2000) Digital Signal Process. , vol.10 , pp. 249-266
    • Schmidt-Nielsen, A.1    Crystal, T.H.2
  • 29
    • 34547499683
    • Incorporating auditory feature uncertainties in robust speaker identification
    • Y. Shao, S. Srinivasan, and D. L. Wang, "Incorporating auditory feature uncertainties in robust speaker identification," in Proc. ICASSP, 2007, pp. 277-280.
    • (2007) Proc. ICASSP , pp. 277-280
    • Shao, Y.1    Srinivasan, S.2    Wang, D.L.3
  • 30
    • 51449101666
    • Robust speaker identification using auditory features and computational auditory scene analysis
    • Y. Shao and D. L. Wang, "Robust speaker identification using auditory features and computational auditory scene analysis," in Proc. ICASSP, 2008, pp. 1589-1592.
    • (2008) Proc. ICASSP , pp. 1589-1592
    • Shao, Y.1    Wang, D.L.2
  • 31
    • 33947649051
    • Robust speaker recognition using binary time-frequency masks
    • Y. Shao and D. L. Wang, "Robust speaker recognition using binary time-frequency masks," in Proc. ICASSP, 2006, pp. 645-648.
    • (2006) Proc. ICASSP , pp. 645-648
    • Shao, Y.1    Wang, D.L.2
  • 33
    • 33750311718
    • Binary and ratio time-frequency masks for robust speech recognition
    • DOI 10.1016/j.specom.2006.09.003, PII S0167639306001129
    • S. Srinivasan, N. Roman, and D. L. Wang, "Binary and ratio time-frequency masks for robust speech recognition," Speech Commun., vol. 48, pp. 1486-1501, 2006. (Pubitemid 44634774)
    • (2006) Speech Communication , vol.48 , Issue.11 , pp. 1486-1501
    • Srinivasan, S.1    Roman, N.2    Wang, D.L.3
  • 34
    • 56249136428
    • Transforming binary uncertainties for robust speech recognition
    • Sep.
    • S. Srinivasan and D. L. Wang, "Transforming binary uncertainties for robust speech recognition," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 7, pp. 2130-2140, Sep. 2007.
    • (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , Issue.7 , pp. 2130-2140
    • Srinivasan, S.1    Wang, D.L.2
  • 35
    • 38849170676
    • Speech Processing, Transmission and Quality Aspects (STQ); Distributed Speech Recognition; Advanced Front-End Feature Extraction Algorithm; Compression Algorithms, ETSI ES 202 050 v1.1.4
    • ETSI Standard, "Speech Processing, Transmission and Quality Aspects (STQ); Distributed Speech Recognition; Advanced Front-End Feature Extraction Algorithm; Compression Algorithms," ETSI ES 202 050 v1.1.4, European Telecommunications Standards Institute, 2005.
    • (2005) Speech Processing, Transmission and Quality Aspects (STQ)
  • 36
    • 84892233308
    • On ideal binary mask as the computational goal of auditory scene analysis
    • P. Divenyi, Ed. Norwell, MA: Kluwer
    • D. L. Wang, "On ideal binary mask as the computational goal of auditory scene analysis," in Speech Separation by Humans and Machines, P. Divenyi, Ed. Norwell, MA: Kluwer, 2005, pp. 181-197.
    • (2005) Speech Separation by Humans and Machines , pp. 181-197
    • Wang, D.L.1
  • 38
    • 77957744636
    • Robust speaker recognition using denoised vocal source and vocal tract features
    • Jan.
    • N. Wang, P. C. Ching, N. Zheng, and T. Lee, "Robust speaker recognition using denoised vocal source and vocal tract features," IEEE Trans. Audio, Speech, Lang. Process., vol. 19, no. 1, pp. 196-205, Jan. 2011.
    • (2011) IEEE Trans. Audio, Speech, Lang. Process. , vol.19 , Issue.1 , pp. 196-205
    • Wang, N.1    Ching, P.C.2    Zheng, N.3    Lee, T.4


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.