



Volume 20, Issue 8, 2012, Pages 2280-2290

Evaluation of speaker verification security and detection of HMM-based synthetic speech

Author keywords

Security; speaker recognition; speech synthesis

Indexed keywords

AVERAGE-VOICE; BACKGROUND MODEL; EQUAL ERROR RATE; FEATURE-BASED; GAUSSIAN MIXTURE MODEL-UNIVERSAL BACKGROUND MODELS; MODEL ADAPTATION; RELATIVE PHASE; RELIABLE DETECTION; SECURITY; SPEAKER MODEL; SPEAKER RECOGNITION; SPEAKER VERIFICATION; SPEECH CORPORA; SYNTHETIC SPEECH; TARGET SPEAKER; TEXT-TO-SPEECH SYNTHESIZERS; TRAINING DATA; WALL STREET JOURNAL;

EID: 84865369980     PISSN: 15587916     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASL.2012.2201472     Document Type: Article
Times cited : (222)

References (54)
  • 2
    • 0029355724
    • Likelihood normalization for speaker verification using a phoneme- and speaker-independent model
    • T. Matsui and S. Furui, "Likelihood normalization for speaker verification using a phoneme- and speaker-independent model," Speech Commun., vol. 17, no. 1-2, pp. 109-116, Aug. 1995.
    • (1995) Speech Commun. , vol.17 , Issue.1-2 , pp. 109-116
    • Matsui, T.1    Furui, S.2
  • 4
    • 82055208361
    • Revisiting Carl Bildt's impostor: Would a speaker verification system foil him?
    • Audio- and Video-Based Biometric Person Authentication Third International Conference, AVBPA 2001 Halmstad, Sweden, June 6-8, 2001 Proceedings
    • K. Sullivan and J. Pelecanos, "Revisiting Carl Bildt's impostor: Would a speaker verification system foil him?," in Audio- and Video-Based Biometric Person Authentication, ser. Lecture Notes in Computer Science, J. Bigun and F. Smeraldi, Eds. Berlin/Heidelberg, Germany: Springer, 2001, vol. 2091, pp. 144-149. (Pubitemid 33291998)
    • (2001) Lecture Notes in Computer Science , vol.2091 , pp. 144-149
    • Sullivan, K.P.H.1    Pelecanos, J.2
  • 5
    • 85114982170
    • Speech pre-processing against intentional imposture in speaker recognition
    • D. Genoud and G. Chollet, "Speech pre-processing against intentional imposture in speaker recognition," in Proc. Int. Conf. Spoken Lang. Process. (ICSLP), Dec. 1998, vol. 2, pp. 105-108.
    • (1998) Proc. Int. Conf. Spoken Lang. Process. (ICSLP) , vol.2 , pp. 105-108
    • Genoud, D.1    Chollet, G.2
  • 7
    • 85135261394
    • Vulnerability in speaker verification - a study of possible technical imposter techniques
    • J. Lindberg and M. Blomberg, "Vulnerability in speaker verification - a study of possible technical imposter techniques," in Proc. Eur. Conf. Speech Commun. Technol. (Eurospeech), 1999, vol. 3, pp. 1211-1214.
    • (1999) Proc. Eur. Conf. Speech Commun. Technol. (Eurospeech) , vol.3 , pp. 1211-1214
    • Lindberg, J.1    Blomberg, M.2
  • 12
    • 0033746751
    • Forensic voice identification in France
    • L.-J. Boë, "Forensic voice identification in France," Speech Commun., vol. 31, no. 2-3, pp. 205-224, 2000.
    • (2000) Speech Commun. , vol.31 , Issue.2-3 , pp. 205-224
    • Boë, L.-J.1
  • 16
    • 85009077529
    • Imposture using synthetic speech against speaker verification based on spectrum and pitch
    • T. Masuko, K. Tokuda, and T. Kobayashi, "Imposture using synthetic speech against speaker verification based on spectrum and pitch," in Proc. Int. Conf. Spoken Lang. Process. (ICSLP), 2000, vol. 2, pp. 302-305.
    • (2000) Proc. Int. Conf. Spoken Lang. Process. (ICSLP) , vol.2 , pp. 302-305
    • Masuko, T.1    Tokuda, K.2    Kobayashi, T.3
  • 19
    • 33645887246
    • Support vector machines using GMM supervectors for speaker verification
    • W. M. Campbell, D. E. Sturim, and D. A. Reynolds, "Support vector machines using GMM supervectors for speaker verification," IEEE Signal Process. Lett., vol. 13, no. 5, pp. 308-311, May 2006.
    • (2006) IEEE Signal Process. Lett. , vol.13 , Issue.5 , pp. 308-311
    • Campbell, W.M.1    Sturim, D.E.2    Reynolds, D.A.3
  • 20
    • 65249096207
    • Combining derivative and parametric kernels for speaker verification
    • C. Longworth and M. Gales, "Combining derivative and parametric kernels for speaker verification," IEEE Trans. Audio, Speech, Lang. Process., vol. 17, no. 4, pp. 748-757, May 2009.
    • (2009) IEEE Trans. Audio, Speech, Lang. Process. , vol.17 , Issue.4 , pp. 748-757
    • Longworth, C.1    Gales, M.2
  • 21
    • 67651002140
    • Statistical parametric speech synthesis
    • H. Zen, K. Tokuda, and A. W. Black, "Statistical parametric speech synthesis," Speech Commun., vol. 51, no. 11, pp. 1039-1064, Nov. 2009.
    • (2009) Speech Commun. , vol.51 , Issue.11 , pp. 1039-1064
    • Zen, H.1    Tokuda, K.2    Black, A.W.3
  • 30
    • 84966398940
    • Optimising selection of units from speech database for concatenative synthesis
    • A. Black and N. Campbell, "Optimising selection of units from speech database for concatenative synthesis," Proc. Eurospeech-95, pp. 581-584, Sep. 1995.
    • (1995) Proc. Eurospeech-95 , pp. 581-584
    • Black, A.1    Campbell, N.2
  • 31
    • 67650790758
    • The Blizzard Challenge 2008
    • V. Karaiskos, S. King, R. A. J. Clark, and C. Mayo, "The Blizzard challenge 2008," in Proc. Blizzard Challenge, Sep. 2008 [Online]. Available: http://festvox.org/blizzard/bc2008/summary-Blizzard2008.pdf
    • (2008) Proc. Blizzard Challenge
    • Karaiskos, V.1    King, S.2    Clark, R.A.J.3    Mayo, C.4
  • 32
    • 70449126171
    • The HTS-2008 system: Yet another evaluation of the speaker-adaptive HMM-based speech synthesis system in the 2008 Blizzard Challenge
    • J. Yamagishi, H. Zen, Y.-J. Wu, T. Toda, and K. Tokuda, "The HTS-2008 system: Yet another evaluation of the speaker-adaptive HMM-based speech synthesis system in the 2008 Blizzard Challenge," in Proc. Blizzard Challenge, Sep. 2008 [Online]. Available: http://festvox.org/blizzard/bc2008/hts-Blizzard2008.pdf
    • (2008) Proc. Blizzard Challenge
    • Yamagishi, J.1    Zen, H.2    Wu, Y.-J.3    Toda, T.4    Tokuda, K.5
  • 33
    • 0032673049
    • Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
    • H. Kawahara, I. Masuda-Katsuse, and A. Cheveigné, "Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds," Speech Commun., vol. 27, pp. 187-207, 1999.
    • (1999) Speech Commun. , vol.27 , pp. 187-207
    • Kawahara, H.1    Masuda-Katsuse, I.2    Cheveigné, A.3
  • 34
    • 33846405723
    • Details of the Nitech HMM-based speech synthesis system for the Blizzard Challenge 2005
    • DOI 10.1093/ietisy/e90-1.1.325
    • H. Zen, T. Toda, M. Nakamura, and K. Tokuda, "Details of Nitech HMM-based speech synthesis system for the Blizzard Challenge 2005," IEICE Trans. Inf. Syst., vol. E90-D, no. 1, pp. 325-333, Jan. 2007. (Pubitemid 46145336)
    • (2007) IEICE Transactions on Information and Systems , vol.E90-D , Issue.1 , pp. 325-333
    • Zen, H.1    Toda, T.2    Nakamura, M.3    Tokuda, K.4
  • 35
    • 44449177634
    • A hidden semi-Markov model-based speech synthesis system
    • H. Zen, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "A hidden semi-Markov model-based speech synthesis system," IEICE Trans. Inf. Syst., vol. E90-D, no. 5, pp. 825-834, May 2007.
    • (2007) IEICE Trans. Inf. Syst. , vol.E90-D , Issue.5 , pp. 825-834
    • Zen, H.1    Tokuda, K.2    Masuko, T.3    Kobayashi, T.4    Kitamura, T.5
  • 38
    • 0032050110
    • Maximum likelihood linear transformations for HMM-based speech recognition
    • M. Gales, "Maximum likelihood linear transformations for HMM-based speech recognition," Comput. Speech Lang., vol. 12, no. 2, pp. 75-98, 1998. (Pubitemid 128383747)
    • (1998) Computer Speech and Language , vol.12 , Issue.2 , pp. 75-98
    • Gales, M.J.F.1
  • 39
    • 67650854725
    • Analysis of speaker adaptation algorithms for HMM-based speech synthesis and a constrained SMAPLR adaptation algorithm
    • J. Yamagishi, T. Kobayashi, Y. Nakano, K. Ogata, and J. Isogai, "Analysis of speaker adaptation algorithms for HMM-based speech synthesis and a constrained SMAPLR adaptation algorithm," IEEE Trans. Audio, Speech, Lang. Process., vol. 17, no. 1, pp. 66-83, 2009.
    • (2009) IEEE Trans. Audio, Speech, Lang. Process. , vol.17 , Issue.1 , pp. 66-83
    • Yamagishi, J.1    Kobayashi, T.2    Nakano, Y.3    Ogata, K.4    Isogai, J.5
  • 40
    • 38549096029
    • A speech parameter generation algorithm considering global variance for HMM-based speech synthesis
    • T. Toda and K. Tokuda, "A speech parameter generation algorithm considering global variance for HMM-based speech synthesis," IEICE Trans. Inf. Syst., vol. E90-D, no. 5, pp. 816-824, May 2007.
    • (2007) IEICE Trans. Inf. Syst. , vol.E90-D , Issue.5 , pp. 816-824
    • Toda, T.1    Tokuda, K.2
  • 41
    • 0025543906
    • Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones
    • E. Moulines and F. Charpentier, "Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones," Speech Commun., vol. 9, no. 5-6, pp. 453-468, 1990.
    • (1990) Speech Commun. , vol.9 , Issue.5-6 , pp. 453-468
    • Moulines, E.1    Charpentier, F.2
  • 44
    • 63349083218
    • Simple representation of signal phase for harmonic speech models
    • I. Saratxaga, I. Hernaez, D. Erro, E. Navas, and J. Sanchez, "Simple representation of signal phase for harmonic speech models," Electron. Lett., vol. 45, pp. 381-383, 2009.
    • (2009) Electron. Lett. , vol.45 , pp. 381-383
    • Saratxaga, I.1    Hernaez, I.2    Erro, D.3    Navas, E.4    Sanchez, J.5
  • 47
    • 34547503468
    • Evaluation of pitch detection algorithms under real conditions
    • I. Luengo, I. Saratxaga, E. Navas, I. Hernaez, J. Sanchez, and I. Sainz, "Evaluation of pitch detection algorithms under real conditions," in Proc. ICASSP '07, Honolulu, HI, Apr. 2007, pp. 1057-1060.
    • (2007) Proc. ICASSP '07 , pp. 1057-1060
    • Luengo, I.1    Saratxaga, I.2    Navas, E.3    Hernaez, I.4    Sanchez, J.5    Sainz, I.6
  • 48
    • 84865379198
    • Wall Street Journal Corpus, 2010. [Online]. Available: http://www.ldc.upenn.edu
    • (2010) Wall Street Journal Corpus
  • 49
    • 0012330750
    • The design for the Wall Street Journal-based CSR corpus
    • D. B. Paul and J. M. Baker, "The design for the Wall Street Journal-based CSR corpus," in Proc. Workshop Speech Natural Lang., 1992, pp. 357-362.
    • (1992) Proc. Workshop Speech Natural Lang. , pp. 357-362
    • Paul, D.B.1    Baker, J.M.2
  • 51
    • 0033884857
    • Score normalization for text-independent speaker verification systems
    • DOI 10.1006/dspr.1999.0360
    • R. Auckenthaler, M. Carey, and H. Lloyd-Thomas, "Score normalization for text-independent speaker verification systems," Digital Signal Process., vol. 10, no. 1, pp. 42-54, 2000. (Pubitemid 30592165)
    • (2000) Digital Signal Processing: A Review Journal , vol.10 , Issue.1 , pp. 42-54
    • Auckenthaler, R.1    Carey, M.2    Lloyd-Thomas, H.3
  • 53
    • 0020703324
    • Mel log spectrum approximation (MLSA) filter for speech synthesis
    • S. Imai, K. Sumita, and C. Furuichi, "Mel log spectrum approximation (MLSA) filter for speech synthesis," Electron. Commun. Japan (Part I: Commun.), vol. 66, no. 2, pp. 10-18, 1983 [Online]. Available: http://dx.doi.org/10.1002/ecja.4400660203 (Pubitemid 14491503)
    • (1983) Electronics & Communications in Japan , vol.66 , Issue.2 , pp. 10-18
    • Imai, S.1    Sumita, K.2    Furuichi, C.3
  • 54
    • 84874199000
    • Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT
    • H. Kawahara, J. Estill, and O. Fujimura, "Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT," in Proc. Models Anal. Vocal Emissions for Biomed. Applicat. (MAVEBA), 2001, pp. 1-6.
    • (2001) Proc. Models Anal. Vocal Emissions for Biomed. Applicat. (MAVEBA) , pp. 1-6
    • Kawahara, H.1    Estill, J.2    Fujimura, O.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.