Volume 15, Issue 5, 2007, Pages 1711-1723

Robust speaker recognition in noisy conditions

Author keywords

Missing feature theory; Multicondition training; Noise compensation; Noise modeling; Speaker recognition

Indexed keywords

MISSING-FEATURE THEORY; MULTICONDITION TRAINING; NOISE COMPENSATION; NOISE MODELING; SPEAKER RECOGNITION;

EID: 63249107289     PISSN: 15587916     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASL.2007.899278     Document Type: Article
Times cited : (234)

References (51)
  • 1
    • 0016067897
    • Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification
    • B. S. Atal, "Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification," J. Acoust. Soc. Amer., vol. 55, pp. 1304-1312, 1974.
    • (1974) J. Acoust. Soc. Amer , vol.55 , pp. 1304-1312
    • Atal, B.S.1
  • 3
    • 0028515984
    • Experimental evaluation of features for robust speaker identification
    • Oct
    • D. A. Reynolds, "Experimental evaluation of features for robust speaker identification," IEEE Trans. Speech Audio Process., vol. 2, no. 4, pp. 639-643, Oct. 1994.
    • (1994) IEEE Trans. Speech Audio Process , vol.2 , Issue.4 , pp. 639-643
    • Reynolds, D.A.1
  • 4
    • 0030247355
    • Robust speaker recognition: A feature-based approach
    • Sep
    • R. Mammone, X. Zhang, and R. P. Ramachandran, "Robust speaker recognition: A feature-based approach," IEEE Signal Process. Mag., vol. 13, no. 5, pp. 58-71, Sep. 1996.
    • (1996) IEEE Signal Process. Mag , vol.13 , Issue.5 , pp. 58-71
    • Mammone, R.1    Zhang, X.2    Ramachandran, R.P.3
  • 5
    • 0030353333
    • Comparison of text-independent speaker recognition methods on telephone speech with acoustic mismatch
    • Philadelphia, PA
    • S. van Vuuren, "Comparison of text-independent speaker recognition methods on telephone speech with acoustic mismatch," in Proc. ICSLP'96, Philadelphia, PA, 1996, pp. 1788-1791.
    • (1996) Proc. ICSLP'96 , pp. 1788-1791
    • van Vuuren, S.1
  • 6
    • 0026835134
    • Global optimization of a neural network-hidden Markov model hybrid
    • Mar
    • Y. Bengio, R. De Mori, G. Flammia, and R. Kompe, "Global optimization of a neural network-hidden Markov model hybrid," IEEE Trans. Neural Netw., vol. 3, no. 2, pp. 252-259, Mar. 1992.
    • (1992) IEEE Trans. Neural Netw , vol.3 , Issue.2 , pp. 252-259
    • Bengio, Y.1    De Mori, R.2    Flammia, G.3    Kompe, R.4
  • 7
    • 33747684554
    • Integrated optimization of feature transformation for speech recognition
    • Madrid, Spain
    • S. Euler, "Integrated optimization of feature transformation for speech recognition," in Proc. Eurospeech'95, Madrid, Spain, 1995, pp. 109-112.
    • (1995) Proc. Eurospeech'95 , pp. 109-112
    • Euler, S.1
  • 8
    • 85135185331
    • Discriminative feature and model design for automatic speech recognition
    • Rhodes, Greece
    • M. Rahim, Y. Bengio, and Y. Lecun, "Discriminative feature and model design for automatic speech recognition," in Proc. Eurospeech' 97, Rhodes, Greece, 1997, pp. 75-78.
    • (1997) Proc. Eurospeech' 97 , pp. 75-78
    • Rahim, M.1    Bengio, Y.2    Lecun, Y.3
  • 9
    • 0033746018
    • Robustness to telephone handset distortion in speaker recognition by discriminative feature design
    • L. P. Heck, Y. Konig, M. K. Sonmez, and M. Weintraub, "Robustness to telephone handset distortion in speaker recognition by discriminative feature design," Speech Commun., vol. 31, pp. 181-192, 2000.
    • (2000) Speech Commun , vol.31 , pp. 181-192
    • Heck, L.P.1    Konig, Y.2    Sonmez, M.K.3    Weintraub, M.4
  • 10
    • 0030247355
    • Robust speaker recognition-A feature-based approach
    • Sep
    • R. Mammone, X. Zhang, and R. P. Ramachandran, "Robust speaker recognition-A feature-based approach," IEEE Signal Process. Mag., vol. 13, no. 5, pp. 58-71, Sep. 1996.
    • (1996) IEEE Signal Process. Mag , vol.13 , Issue.5 , pp. 58-71
    • Mammone, R.1    Zhang, X.2    Ramachandran, R.P.3
  • 11
    • 84892149368
    • Magnitude-only estimation of handset nonlinearity with application to speaker recognition
    • Seattle, WA
    • T. F. Quatieri, D. A. Reynolds, and G. C. O'Leary, "Magnitude-only estimation of handset nonlinearity with application to speaker recognition," in Proc. ICASSP'98, Seattle, WA, 1998, pp. 745-748.
    • (1998) Proc. ICASSP'98 , pp. 745-748
    • Quatieri, T.F.1    Reynolds, D.A.2    O'Leary, G.C.3
  • 14
    • 0033884858
    • Speaker verification using adapted Gaussian mixture models
    • D. A. Reynolds, T. F. Quatieri, and R. B. Dunn, "Speaker verification using adapted Gaussian mixture models," Digital Signal Process., vol. 10, pp. 19-41, 2000.
    • (2000) Digital Signal Process , vol.10 , pp. 19-41
    • Reynolds, D.A.1    Quatieri, T.F.2    Dunn, R.B.3
  • 15
    • 0141702107
    • Feature and score normalization for speaker verification of cellular data
    • Hong Kong, China
    • C. Barras and J. L. Gauvain, "Feature and score normalization for speaker verification of cellular data," in Proc. ICASSP'03, Hong Kong, China, 2003, pp. 49-52.
    • (2003) Proc. ICASSP'03 , pp. 49-52
    • Barras, C.1    Gauvain, J.L.2
  • 16
    • 0033884857
    • Score normalization for text-independent speaker verification systems
    • R. Auckenthaler, M. Carey, and H. Lloyd-Thomas, "Score normalization for text-independent speaker verification systems," Digital Signal Process., vol. 10, pp. 42-54, 2000.
    • (2000) Digital Signal Process , vol.10 , pp. 42-54
    • Auckenthaler, R.1    Carey, M.2    Lloyd-Thomas, H.3
  • 17
    • 0032595177
    • Robust text-independent speaker identification over telephone channels
    • Sep
    • H. A. Murthy, F. Beaufays, L. P. Heck, and M. Weintraub, "Robust text-independent speaker identification over telephone channels," IEEE Trans. Speech Audio Process., vol. 7, no. 5, pp. 554-568, Sep. 1999.
    • (1999) IEEE Trans. Speech Audio Process , vol.7 , Issue.5 , pp. 554-568
    • Murthy, H.A.1    Beaufays, F.2    Heck, L.P.3    Weintraub, M.4
  • 18
    • 84988224855
    • A model-based transformational approach to robust speaker recognition
    • Beijing, China
    • R. Teunen, B. Shahshahani, and L. P. Heck, "A model-based transformational approach to robust speaker recognition," in Proc. ICSLP'00, Beijing, China, 2000, pp. 495-498.
    • (2000) Proc. ICSLP'00 , pp. 495-498
    • Teunen, R.1    Shahshahani, B.2    Heck, L.P.3
  • 19
    • 0033748244
    • Speaker verification over the telephone
    • L. F. Lamel and J. L. Gauvain, "Speaker verification over the telephone," Speech Commun., vol. 31, pp. 141-154, 2000.
    • (2000) Speech Commun , vol.31 , pp. 141-154
    • Lamel, L.F.1    Gauvain, J.L.2
  • 20
    • 85009167959
    • Environment adaptation for robust speaker verification
    • Geneva, Switzerland
    • K. K. Yiu, M. W. Mak, and S. Y. Kung, "Environment adaptation for robust speaker verification," in Proc. Eurospeech'03, Geneva, Switzerland, 2003, pp. 2973-2976.
    • (2003) Proc. Eurospeech'03 , pp. 2973-2976
    • Yiu, K.K.1    Mak, M.W.2    Kung, S.Y.3
  • 21
    • 0030371776
    • Overview of speaker enhancement techniques for automatic speaker recognition
    • Philadelphia, PA
    • J. Ortega-Garcia and L. Gonzalez-Rodriguez, "Overview of speaker enhancement techniques for automatic speaker recognition," in Proc. ICSLP'96, Philadelphia, PA, 1996, pp. 929-932.
    • (1996) Proc. ICSLP'96 , pp. 929-932
    • Ortega-Garcia, J.1    Gonzalez-Rodriguez, L.2
  • 22
    • 33646772812
    • An evaluation of VTS and IMM for speaker verification in noise
    • Geneva, Switzerland
    • Suhadi, S. Stan, T. Fingscheidt, and C. Beaugeant, "An evaluation of VTS and IMM for speaker verification in noise," in Proc. Eurospeech' 03, Geneva, Switzerland, 2003, pp. 1669-1672.
    • (2003) Proc. Eurospeech' 03 , pp. 1669-1672
    • Suhadi1    Stan, S.2    Fingscheidt, T.3    Beaugeant, C.4
  • 23
    • 85135375893
    • HMM recognition in noise using parallel model combination
    • Berlin, Germany
    • M. J. F. Gales and S. Young, "HMM recognition in noise using parallel model combination," in Proc. Eurospeech'93, Berlin, Germany, 1993, pp. 837-840.
    • (1993) Proc. Eurospeech'93 , pp. 837-840
    • Gales, M.J.F.1    Young, S.2
  • 24
    • 0030125219
    • Speaker recognition using HMM composition in noisy environments
    • T. Matsui, T. Kanno, and S. Furui, "Speaker recognition using HMM composition in noisy environments," Comput. Speech Lang., vol. 10, pp. 107-116, 1996.
    • (1996) Comput. Speech Lang , vol.10 , pp. 107-116
    • Matsui, T.1    Kanno, T.2    Furui, S.3
  • 25
    • 0034848879
    • Text-dependent speaker verification under noisy conditions using parallel model combination
    • Salt Lake City, UT
    • L. P. Wong and M. Russell, "Text-dependent speaker verification under noisy conditions using parallel model combination," in Proc. ICASSP'01, Salt Lake City, UT, 2001, pp. 457-460.
    • (2001) Proc. ICASSP'01 , pp. 457-460
    • Wong, L.P.1    Russell, M.2
  • 26
    • 0030649027
    • Jacobian approach to fast acoustic model adaptation
    • Munich, Germany
    • S. Sagayama, Y. Yamaguchi, S. Takahashi, and J. Takahashi, "Jacobian approach to fast acoustic model adaptation," in Proc. ICASSP'97, Munich, Germany, 1997, pp. 835-838.
    • (1997) Proc. ICASSP'97 , pp. 835-838
    • Sagayama, S.1    Yamaguchi, Y.2    Takahashi, S.3    Takahashi, J.4
  • 27
  • 28
    • 0030647921
    • Robust speaker recognition through acoustic array processing and spectral normalization
    • Munich, Germany
    • L. Gonzalez-Rodriguez and J. Ortega-Garcia, "Robust speaker recognition through acoustic array processing and spectral normalization," in Proc. ICASSP'97, Munich, Germany, 1997, pp. 1103-1106.
    • (1997) Proc. ICASSP'97 , pp. 1103-1106
    • Gonzalez-Rodriguez, L.1    Ortega-Garcia, J.2
  • 30
    • 0031619912
    • Speaker verification in noisy environment with combined spectral subtraction and missing data theory
    • Seattle, WA
    • A. Drygajlo and M. El-Maliki, "Speaker verification in noisy environment with combined spectral subtraction and missing data theory," in Proc. ICASSP'98, Seattle, WA, 1998, pp. 121-124.
    • (1998) Proc. ICASSP'98 , pp. 121-124
    • Drygajlo, A.1    El-Maliki, M.2
  • 31
    • 0033748161
    • Localization and selection of speaker-specific information with statistical modelling
    • L. Besacier, J. F. Bonastre, and C. Fredouille, "Localization and selection of speaker-specific information with statistical modelling," Speech Commun., vol. 31, pp. 89-106, 2000.
    • (2000) Speech Commun , vol.31 , pp. 89-106
    • Besacier, L.1    Bonastre, J.F.2    Fredouille, C.3
  • 32
    • 4544349444
    • Universal compensation-An approach to noisy speech recognition assuming no knowledge of noise
    • Montreal, QC, Canada
    • J. Ming, "Universal compensation-An approach to noisy speech recognition assuming no knowledge of noise," in Proc. ICASSP'04, Montreal, QC, Canada, 2004, pp. I.961-I.964.
    • (2004) Proc. ICASSP'04
    • Ming, J.1
  • 33
    • 33646782289
    • Speaker identification in unknown noisy conditions-A universal compensation approach
    • Philadelphia, PA
    • J. Ming, D. Stewart, and S. Vaseghi, "Speaker identification in unknown noisy conditions-A universal compensation approach," in Proc. ICASSP'05, Philadelphia, PA, 2005, pp. 617-620.
    • (2005) Proc. ICASSP'05 , pp. 617-620
    • Ming, J.1    Stewart, D.2    Vaseghi, S.3
  • 34
    • 0030355935
    • A new ASR approach based on independent processing and recombination of partial frequency bands
    • Philadelphia, PA
    • H. Bourlard and S. Dupont, "A new ASR approach based on independent processing and recombination of partial frequency bands," in Proc. ICSLP'96, Philadelphia, PA, 1996, pp. 426-429.
    • (1996) Proc. ICSLP'96 , pp. 426-429
    • Bourlard, H.1    Dupont, S.2
  • 35
    • 0030365517
    • Towards ASR on partially corrupted speech
    • Philadelphia, PA
    • H. Hermansky, S. Tibrewala, and M. Pavel, "Towards ASR on partially corrupted speech," in Proc. ICSLP'96, Philadelphia, PA, 1996, pp. 462-465.
    • (1996) Proc. ICSLP'96 , pp. 462-465
    • Hermansky, H.1    Tibrewala, S.2    Pavel, M.3
  • 36
    • 0023263708
    • Multi-style training for robust isolated-word speech recognition
    • Dallas, TX
    • R. P. Lippmann, E. A. Martin, and D. B. Paul, "Multi-style training for robust isolated-word speech recognition," in Proc. ICASSP'87, Dallas, TX, 1987, pp. 705-708.
    • (1987) Proc. ICASSP'87 , pp. 705-708
    • Lippmann, R.P.1    Martin, E.A.2    Paul, D.B.3
  • 37
    • 85009070292
    • Large-vocabulary speech recognition under adverse acoustic environments
    • Beijing, China
    • L. Deng, A. Acero, M. Plumpe, and X.-D. Huang, "Large-vocabulary speech recognition under adverse acoustic environments," in Proc. ICSLP'00, Beijing, China, 2000, pp. 806-809.
    • (2000) Proc. ICSLP'00 , pp. 806-809
    • Deng, L.1    Acero, A.2    Plumpe, M.3    Huang, X.-D.4
  • 38
    • 0036754943
    • Robust speech recognition using probabilistic union models
    • Sep
    • J. Ming, P. Jancovic, and F. J. Smith, "Robust speech recognition using probabilistic union models," IEEE Trans. Speech Audio Process., vol. 10, no. 6, pp. 403-414, Sep. 2002.
    • (2002) IEEE Trans. Speech Audio Process , vol.10 , Issue.6 , pp. 403-414
    • Ming, J.1    Jancovic, P.2    Smith, F.J.3
  • 39
    • 33646410695
    • A posterior union model with applications to robust speech and speaker recognition
    • Article ID 75390
    • J. Ming, J. Lin, and F. J. Smith, "A posterior union model with applications to robust speech and speaker recognition," EURASIP J. Appl. Signal Process., vol. 2006, pp. 1-12, 2006, Article ID 75390.
    • (2006) EURASIP J. Appl. Signal Process , vol.2006 , pp. 1-12
    • Ming, J.1    Lin, J.2    Smith, F.J.3
  • 40
    • 0030682302
    • HTIMIT and LLHDB: Speech corpora for the study of handset transducer effects
    • Munich, Germany
    • D. A. Reynolds, "HTIMIT and LLHDB: Speech corpora for the study of handset transducer effects," in Proc. ICASSP'97, Munich, Germany, 1997, pp. 1535-1538.
    • (1997) Proc. ICASSP'97 , pp. 1535-1538
    • Reynolds, D.A.1
  • 41
    • 0029355999
    • Speaker identification and verification using Gaussian mixture speaker models
    • D. A. Reynolds, "Speaker identification and verification using Gaussian mixture speaker models," Speech Commun., vol. 17, pp. 91-108, 1995.
    • (1995) Speech Commun , vol.17 , pp. 91-108
    • Reynolds, D.A.1
  • 42
    • 0028996867
    • CTIMIT: A speech corpus for the cellular environment with applications to automatic speech recognition
    • Detroit, MI
    • K. L. Brown and E. B. George, "CTIMIT: A speech corpus for the cellular environment with applications to automatic speech recognition," in Proc. ICASSP'95, Detroit, MI, 1995, pp. 105-108.
    • (1995) Proc. ICASSP'95 , pp. 105-108
    • Brown, K.L.1    George, E.B.2
  • 43
    • 0025680225
    • NTIMIT: A phonetically balanced, continuous speech telephone bandwidth speech database
    • Albuquerque, NM
    • C. Jankowski, A. Kalyanswamy, S. Basson, and J. Spitz, "NTIMIT: A phonetically balanced, continuous speech telephone bandwidth speech database," in Proc. ICASSP'90, Albuquerque, NM, 1990, pp. 109-112.
    • (1990) Proc. ICASSP'90 , pp. 109-112
    • Jankowski, C.1    Kalyanswamy, A.2    Basson, S.3    Spitz, J.4
  • 44
    • 0032091375
    • Text-independent speaker recognition using non-linear frame likelihood transformation
    • K. P. Markov and S. Nakagawa, "Text-independent speaker recognition using non-linear frame likelihood transformation," Speech Commun., vol. 24, pp. 193-209, 1998.
    • (1998) Speech Commun , vol.24 , pp. 193-209
    • Markov, K.P.1    Nakagawa, S.2
  • 45
    • 85135144525
    • On the decorrelation of the filter-bank energies in speech recognition
    • Madrid, Spain
    • C. Nadeu, J. Hernando, and M. Gorricho, "On the decorrelation of the filter-bank energies in speech recognition," in Proc. Eurospeech'95, Madrid, Spain, 1995, pp. 1381-1384.
    • (1995) Proc. Eurospeech'95 , pp. 1381-1384
    • Nadeu, C.1    Hernando, J.2    Gorricho, M.3
  • 46
    • 0038338247
    • Decorrelated and liftered filter-bank energies for robust speech recognition
    • Budapest, Hungary
    • K. K. Paliwal, "Decorrelated and liftered filter-bank energies for robust speech recognition," in Proc. Eurospeech'99, Budapest, Hungary, 1999, pp. 85-88.
    • (1999) Proc. Eurospeech'99 , pp. 85-88
    • Paliwal, K.K.1
  • 47
    • 0027465491
    • The Lombard reflex and its role on human listeners and automatic speech recognizer
    • J.-C. Junqua, "The Lombard reflex and its role on human listeners and automatic speech recognizer," J. Acoust. Soc. Amer., vol. 93, pp. 510-524, 1993.
    • (1993) J. Acoust. Soc. Amer , vol.93 , pp. 510-524
    • Junqua, J.-C.1
  • 48
    • 0030283741
    • Analysis and compensation of speech under stress and noise for environmental robustness in speech recognition
    • J. H. L. Hansen, "Analysis and compensation of speech under stress and noise for environmental robustness in speech recognition," Speech Commun., vol. 20, pp. 151-173, 1996.
    • (1996) Speech Commun , vol.20 , pp. 151-173
    • Hansen, J.H.L.1
  • 49
    • 0030365546
    • Experiments of speech recognition in a noisy and reverberant environment using a microphone array and HMM adaptation
    • Trento, Italy
    • D. Giuliani, M. Omologo, and P. Svaizer, "Experiments of speech recognition in a noisy and reverberant environment using a microphone array and HMM adaptation," in Proc. ICSLP'96, Trento, Italy, 1996, pp. 1329-1332.
    • (1996) Proc. ICSLP'96 , pp. 1329-1332
    • Giuliani, D.1    Omologo, M.2    Svaizer, P.3
  • 50
    • 42749101361
    • The MIT mobile device speaker verification corpus: Data collection and preliminary experiments
    • San Juan, Puerto Rico
    • R. Woo, A. Park, and T. J. Hazen, "The MIT mobile device speaker verification corpus: Data collection and preliminary experiments," in Proc. IEEE Odyssey 2006-The Speaker and Language Recognition Workshop, San Juan, Puerto Rico, 2006, pp. 1-6. [Online]. Available: http://groups.csail.mit.edu/sls/mdsvc
    • (2006) Proc. IEEE Odyssey 2006-The Speaker and Language Recognition Workshop , pp. 1-6
    • Woo, R.1    Park, A.2    Hazen, T.J.3
  • 51
    • 33947622673
    • Speaker verification over handheld devices with realistic noisy speech data
    • Toulouse, France
    • J. Ming, T. J. Hazen, and J. R. Glass, "Speaker verification over handheld devices with realistic noisy speech data," in Proc. ICASSP'06, Toulouse, France, 2006, pp. 637-640.
    • (2006) Proc. ICASSP'06 , pp. 637-640
    • Ming, J.1    Hazen, T.J.2    Glass, J.R.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.