Volume , Issue , 2013, Pages

Voice conversion and spoofing attack on speaker verification systems

Author keywords

[No Author keywords available]

Indexed keywords

ANTI-SPOOFING; MARKET ADOPTION; ONLINE COMMERCE; SPEAKER VERIFICATION; SPEAKER VERIFICATION SYSTEM; SPOOFING ATTACKS; USER AUTHENTICATION; VOICE CONVERSION;

EID: 84893302435     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/APSIPA.2013.6694344     Document Type: Conference Paper
Times cited : (57)

References (81)
  • 2
    • 33751542948
    • Speaker verification security improvement by means of speech watermarking
    • DOI 10.1016/j.specom.2006.06.010, PII S0167639306000653
    • Marcos Faundez-Zanuy, Martin Hagmüller, and Gernot Kubin, "Speaker verification security improvement by means of speech watermarking," Speech communication, vol. 48, no. 12, pp. 1608-1619, 2006. (Pubitemid 44829871)
    • (2006) Speech Communication , vol.48 , Issue.12 , pp. 1608-1619
    • Faundez-Zanuy, M.1    Hagmuller, M.2    Kubin, G.3
  • 4
    • 84867605072
    • Speaker verification performance degradation against spoofing and tampering attacks
    • Jesús Villalba and Eduardo Lleida, "Speaker verification performance degradation against spoofing and tampering attacks," in FALA 10 workshop, 2010.
    • (2010) FALA 10 Workshop
    • Villalba, J.1    Lleida, E.2
  • 8
    • 0029765811
    • Unit selection in a concatenative speech synthesis system using a large speech database
    • Andrew J Hunt and A.W. Black, "Unit selection in a concatenative speech synthesis system using a large speech database," in Proc. ICASSP, 1996.
    • (1996) Proc. ICASSP
    • Hunt, A.J.1    Black, A.W.2
  • 9
    • 67651002140
    • Statistical parametric speech synthesis
    • Heiga Zen, Keiichi Tokuda, and A.W. Black, "Statistical parametric speech synthesis," Speech Communication, vol. 51, no. 11, pp. 1039-1064, 2009.
    • (2009) Speech Communication , vol.51 , Issue.11 , pp. 1039-1064
    • Zen, H.1    Tokuda, K.2    Black, A.W.3
  • 11
    • 85135274466
    • On the security of HMM-based speaker verification systems against imposture using synthetic speech
    • Takashi Masuko, Takafumi Hitotsumatsu, Keiichi Tokuda, and Takao Kobayashi, "On the security of HMM-based speaker verification systems against imposture using synthetic speech," in Proc. EUROSPEECH, 1999.
    • (1999) Proc. Eurospeech
    • Masuko, T.1    Hitotsumatsu, T.2    Tokuda, K.3    Kobayashi, T.4
  • 12
    • 85009077529
    • Imposture using synthetic speech against speaker verification based on spectrum and pitch
    • Takashi Masuko, Keiichi Tokuda, and Takao Kobayashi, "Imposture using synthetic speech against speaker verification based on spectrum and pitch," in Proc. ICSLP, 2000.
    • (2000) Proc. ICSLP
    • Masuko, T.1    Tokuda, K.2    Kobayashi, T.3
  • 13
    • 85009119461
    • A robust speaker verification system against imposture using an HMM-based speech synthesis system
    • Takayuki Satoh, Takashi Masuko, Takao Kobayashi, and Keiichi Tokuda, "A robust speaker verification system against imposture using an HMM-based speech synthesis system," in Proc. Eurospeech, 2001.
    • (2001) Proc. Eurospeech
    • Satoh, T.1    Masuko, T.2    Kobayashi, T.3    Tokuda, K.4
  • 15
    • 67650854725
    • Analysis of speaker adaptation algorithms for HMM-based speech synthesis and a constrained SMAPLR adaptation algorithm
    • Junichi Yamagishi, Takao Kobayashi, Yuji Nakano, Katsumi Ogata, and Juri Isogai, "Analysis of speaker adaptation algorithms for HMM-based speech synthesis and a constrained SMAPLR adaptation algorithm," IEEE Transactions on Audio, Speech, and Language Processing, vol. 17, no. 1, pp. 66-83, 2009.
    • (2009) IEEE Transactions on Audio, Speech, and Language Processing , vol.17 , Issue.1 , pp. 66-83
    • Yamagishi, J.1    Kobayashi, T.2    Nakano, Y.3    Ogata, K.4    Isogai, J.5
  • 16
    • 70350125882
    • An overview of text-independent speaker recognition: From features to supervectors
    • January
    • Tomi Kinnunen and Haizhou Li, "An overview of text-independent speaker recognition: from features to supervectors," Speech Communication, vol. 52, no. 1, pp. 12-40, January 2010.
    • (2010) Speech Communication , vol.52 , Issue.1 , pp. 12-40
    • Kinnunen, T.1    Li, H.2
  • 17
    • 0033884858
    • Speaker verification using adapted Gaussian mixture models
    • DOI 10.1006/dspr.1999.0361
    • Douglas A Reynolds, Thomas F Quatieri, and Robert B Dunn, "Speaker verification using adapted Gaussian mixture models," Digital Signal Processing, vol. 10, no. 1, pp. 19-41, 2000. (Pubitemid 30592166)
    • (2000) Digital Signal Processing: A Review Journal , vol.10 , Issue.1 , pp. 19-41
    • Reynolds, D.A.1    Quatieri, T.F.2    Dunn, R.B.3
  • 18
    • 33645887246
    • Support vector machines using GMM supervectors for speaker verification
    • William M Campbell, Douglas E Sturim, and Douglas A Reynolds, "Support vector machines using GMM supervectors for speaker verification," IEEE Signal Processing Letters, vol. 13, no. 5, pp. 308-311, 2006.
    • (2006) IEEE Signal Processing Letters , vol.13 , Issue.5 , pp. 308-311
    • Campbell, W.M.1    Sturim, D.E.2    Reynolds, D.A.3
  • 19
    • 84872169719
    • Advances in channel compensation for SVM speaker recognition
    • Alex Solomonoff, William M Campbell, and Ian Boardman, "Advances in channel compensation for SVM speaker recognition," in Proc. ICASSP.
    • Proc. ICASSP
    • Solomonoff, A.1    Campbell, W.M.2    Boardman, I.3
  • 21
    • 44949114401
    • Within-class covariance normalization for SVM-based speaker recognition
    • Andrew O Hatch, Sachin Kajarekar, and Andreas Stolcke, "Within-class covariance normalization for SVM-based speaker recognition," in Proc. ICSLP, 2006.
    • (2006) Proc. ICSLP
    • Hatch, A.O.1    Kajarekar, S.2    Stolcke, A.3
  • 22
    • 33947637189
    • Joint factor analysis of speaker and session variability: Theory and algorithms
    • P. Kenny, "Joint factor analysis of speaker and session variability: theory and algorithms," technical report CRIM-06/08-14, 2006.
    • (2006) Technical Report CRIM-06/08-14
    • Kenny, P.1
  • 28
    • 84890510678
    • Improving speaker identification robustness to highly channel-degraded speech through multiple system fusion
    • Mitchell McLaren, Nicolas Scheffer, Martin Graciarena, Luciana Ferrer, and Yun Lei, "Improving speaker identification robustness to highly channel-degraded speech through multiple system fusion," in Proc. ICASSP, 2013.
    • (2013) Proc. ICASSP
    • McLaren, M.1    Scheffer, N.2    Graciarena, M.3    Ferrer, L.4    Lei, Y.5
  • 29
    • 0029254176
    • Transformation of formants for voice conversion using artificial neural networks
    • M. Narendranath, H.A. Murthy, S. Rajendran, and B. Yegnanarayana, "Transformation of formants for voice conversion using artificial neural networks," Speech communication, vol. 16, no. 2, pp. 207-216, 1995.
    • (1995) Speech Communication , vol.16 , Issue.2 , pp. 207-216
    • Narendranath, M.1    Murthy, H.A.2    Rajendran, S.3    Yegnanarayana, B.4
  • 30
    • 0031623661
    • Spectral voice conversion for text-to-speech synthesis
    • Alexander Kain and Michael W Macon, "Spectral voice conversion for text-to-speech synthesis," in Proc. ICASSP, 1998.
    • (1998) Proc. ICASSP
    • Kain, A.1    Macon, M.W.2
  • 31
    • 0032026483
    • Continuous probabilistic transform for voice conversion
    • PII S1063667698017386
    • Yannis Stylianou, Olivier Cappé, and Eric Moulines, "Continuous probabilistic transform for voice conversion," IEEE Transactions on Speech and Audio Processing, vol. 6, no. 2, pp. 131-142, 1998. (Pubitemid 128720639)
    • (1998) IEEE Transactions on Speech and Audio Processing , vol.6 , Issue.2 , pp. 131-142
    • Stylianou, Y.1    Cappe, O.2    Moulines, E.3
  • 32
    • 57749193836
    • Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory
    • Tomoki Toda, A.W. Black, and Keiichi Tokuda, "Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory," IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, no. 8, pp. 2222-2235, 2007.
    • (2007) IEEE Transactions on Audio, Speech, and Language Processing , vol.15 , Issue.8 , pp. 2222-2235
    • Toda, T.1    Black, A.W.2    Tokuda, K.3
  • 34
    • 34047247202
    • Voice conversion using duration-embedded Bi-HMMs for expressive speech synthesis
    • DOI 10.1109/TASL.2006.876112
    • Chung-Hsien Wu, Chi-Chun Hsia, Te-Hsien Liu, and Jhing-Fa Wang, "Voice conversion using duration-embedded bi-HMMs for expressive speech synthesis," IEEE Transactions on Audio, Speech, and Language Processing, vol. 14, no. 4, pp. 1109-1116, 2006. (Pubitemid 46547608)
    • (2006) IEEE Transactions on Audio, Speech and Language Processing , vol.14 , Issue.4 , pp. 1109-1116
    • Wu, C.-H.1    Hsia, C.-C.2    Liu, T.-H.3    Wang, J.-F.4
  • 35
    • 34547520011
    • A novel method for prosody prediction in voice conversion
    • Elina E Helander and Jani Nurminen, "A novel method for prosody prediction in voice conversion," in ICASSP, 2007.
    • (2007) ICASSP
    • Helander, E.E.1    Nurminen, J.2
  • 36
    • 79959842826
    • Text-independent F0 transformation with non-parallel data for voice conversion
    • Z.-Z. Wu, Tomi Kinnunen, E.S. Chng, and Haizhou Li, "Text-independent F0 transformation with non-parallel data for voice conversion," in Proc. Interspeech, 2010.
    • (2010) Proc. Interspeech
    • Wu, Z.-Z.1    Kinnunen, T.2    Chng, E.S.3    Li, H.4
  • 37
    • 77953726259
    • Pitch and duration transformation with non-parallel data
    • Damien Lolive, Nelly Barbot, and Olivier Boeffard, "Pitch and duration transformation with non-parallel data," Proc. Speech Prosody, pp. 111-114, 2008.
    • (2008) Proc. Speech Prosody , pp. 111-114
    • Lolive, D.1    Barbot, N.2    Boeffard, O.3
  • 39
    • 77953725318
    • INCA algorithm for training voice conversion systems from nonparallel corpora
    • Daniel Erro, Asunción Moreno, and Antonio Bonafonte, "INCA algorithm for training voice conversion systems from nonparallel corpora," IEEE Transactions on Audio, Speech, and Language Processing, vol. 18, no. 5, pp. 944-953, 2010.
    • (2010) IEEE Transactions on Audio, Speech, and Language Processing , vol.18 , Issue.5 , pp. 944-953
    • Erro, D.1    Moreno, A.2    Bonafonte, A.3
  • 40
    • 0032673049
    • Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
    • Hideki Kawahara, Ikuyo Masuda-Katsuse, and Alain de Cheveigné, "Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds," Speech communication, vol. 27, no. 3, pp. 187-207, 1999.
    • (1999) Speech Communication , vol.27 , Issue.3 , pp. 187-207
    • Kawahara, H.1    Masuda-Katsuse, I.2    De Cheveigné, A.3
  • 41
    • 0035127703
    • Applying the harmonic plus noise model in concatenative speech synthesis
    • DOI 10.1109/89.890068
    • Yannis Stylianou, "Applying the harmonic plus noise model in concatenative speech synthesis," IEEE Transactions on Speech and Audio Processing, vol. 9, no. 1, pp. 21-29, 2001. (Pubitemid 32130684)
    • (2001) IEEE Transactions on Speech and Audio Processing , vol.9 , Issue.1 , pp. 21-29
    • Stylianou, Y.1
  • 45
    • 85131821539
    • Mel-generalized cepstral analysis - A unified approach to speech spectral estimation
    • Keiichi Tokuda, Takao Kobayashi, Takashi Masuko, and Satoshi Imai, "Mel-generalized cepstral analysis - a unified approach to speech spectral estimation," in Proc. ICSLP, 1994.
    • (1994) Proc. ICSLP
    • Tokuda, K.1    Kobayashi, T.2    Masuko, T.3    Imai, S.4
  • 46
    • 51449107658
    • LSF mapping for voice conversion with very small training sets
    • Elina Helander, Jani Nurminen, and Moncef Gabbouj, "LSF mapping for voice conversion with very small training sets," in Proc. ICASSP, 2008.
    • (2008) Proc. ICASSP
    • Helander, E.1    Nurminen, J.2    Gabbouj, M.3
  • 47
  • 48
    • 0021157408
    • Line spectrum pair (LSP) and speech data compression
    • Frank Soong and Biing-Hwang Juang, "Line spectrum pair (LSP) and speech data compression," in Proc. ICASSP, 1984.
    • (1984) Proc. ICASSP
    • Soong, F.1    Juang, B.-H.2
  • 50
    • 0033154052
    • Speaker transformation algorithm using segmental codebooks (STASC)
    • Levent M Arslan, "Speaker transformation algorithm using segmental codebooks (STASC)," Speech Communication, vol. 28, no. 3, pp. 211-226, 1999.
    • (1999) Speech Communication , vol.28 , Issue.3 , pp. 211-226
    • Arslan, L.M.1
  • 52
    • 84905560807
    • Voice conversion with smoothed GMM and MAP adaptation
    • Yining Chen, Min Chu, Eric Chang, Jia Liu, and Runsheng Liu, "Voice conversion with smoothed GMM and MAP adaptation," in Proc. Eurospeech, 2003.
    • (2003) Proc. Eurospeech
    • Chen, Y.1    Chu, M.2    Chang, E.3    Liu, J.4    Liu, R.5
  • 55
    • 84869384026
    • Mixture of factor analyzers using priors from non-parallel speech for voice conversion
    • Zhizheng Wu, Tomi Kinnunen, E.S. Chng, and Haizhou Li, "Mixture of factor analyzers using priors from non-parallel speech for voice conversion," IEEE Signal Processing Letters, vol. 19, no. 12, pp. 914-917, 2012.
    • (2012) IEEE Signal Processing Letters , vol.19 , Issue.12 , pp. 914-917
    • Wu, Z.1    Kinnunen, T.2    Chng, E.S.3    Li, H.4
  • 57
    • 80053068819
    • Voice conversion using support vector regression
    • P Song, YQ Bao, L Zhao, and CR Zou, "Voice conversion using support vector regression," Electronics letters, vol. 47, no. 18, pp. 1045-1046, 2011.
    • (2011) Electronics Letters , vol.47 , Issue.18 , pp. 1045-1046
    • Song, P.1    Bao, Y.Q.2    Zhao, L.3    Zou, C.R.4
  • 63
    • 84857498745
    • Voice conversion using dynamic frequency warping with amplitude scaling, for parallel or nonparallel corpora
    • Elizabeth Godoy, Olivier Rosec, and Thierry Chonavel, "Voice conversion using dynamic frequency warping with amplitude scaling, for parallel or nonparallel corpora," IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 4, pp. 1313-1323, 2012.
    • (2012) IEEE Transactions on Audio, Speech, and Language Processing , vol.20 , Issue.4 , pp. 1313-1323
    • Godoy, E.1    Rosec, O.2    Chonavel, T.3
  • 64
    • 84872177757
    • Parametric voice conversion based on bilinear frequency warping plus amplitude scaling
    • Daniel Erro, Eva Navas, and Inma Hernaez, "Parametric voice conversion based on bilinear frequency warping plus amplitude scaling," IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 3, pp. 556-566, 2013.
    • (2013) IEEE Transactions on Audio, Speech, and Language Processing , vol.21 , Issue.3 , pp. 556-566
    • Erro, D.1    Navas, E.2    Hernaez, I.3
  • 65
  • 68
    • 84906276055
    • Exemplar-based unit selection for voice conversion utilizing temporal information
    • Zhizheng Wu, Tuomas Virtanen, Tomi Kinnunen, E.S. Chng, and Haizhou Li, "Exemplar-based unit selection for voice conversion utilizing temporal information," in Proc. Interspeech, 2013.
    • (2013) Proc. Interspeech
    • Wu, Z.1    Virtanen, T.2    Kinnunen, T.3    Chng, E.S.4    Li, H.5
  • 70
    • 84906275384
    • Vulnerability evaluation of speaker verification under voice conversion spoofing: The effect of text constraints
    • Zhizheng Wu, Anthony Larcher, Kong Aik Lee, E.S. Chng, Tomi Kinnunen, and Haizhou Li, "Vulnerability evaluation of speaker verification under voice conversion spoofing: the effect of text constraints," in Proc. Interspeech, 2013.
    • (2013) Proc. Interspeech
    • Wu, Z.1    Larcher, A.2    Lee, K.A.3    Chng, E.S.4    Kinnunen, T.5    Li, H.6
  • 71
    • 65349113532
    • Artificial impostor voice transformation effects on false acceptance rates
    • Jean-Francois Bonastre, Driss Matrouf, and Corinne Fredouille, "Artificial impostor voice transformation effects on false acceptance rates," in Proc. Interspeech, 2007.
    • (2007) Proc. Interspeech
    • Bonastre, J.-F.1    Matrouf, D.2    Fredouille, C.3
  • 73
    • 84867600098
    • Vulnerability of speaker verification systems against voice conversion spoofing attacks: The case of telephone speech
    • Tomi Kinnunen, Z.-Z. Wu, Kong Aik Lee, Filip Sedlak, E.S. Chng, and Haizhou Li, "Vulnerability of speaker verification systems against voice conversion spoofing attacks: The case of telephone speech," in Proc. ICASSP, 2012.
    • (2012) Proc. ICASSP
    • Kinnunen, T.1    Wu, Z.-Z.2    Lee, K.A.3    Sedlak, F.4    Chng, E.S.5    Li, H.6
  • 74
    • 84906234851
    • Voice transformation-based spoofing of text-dependent speaker verification systems
    • Zvi Kons and Hagai Aronowitz, "Voice transformation-based spoofing of text-dependent speaker verification systems," in Proc. Interspeech, 2013.
    • (2013) Proc. Interspeech
    • Kons, Z.1    Aronowitz, H.2
  • 76
    • 84878465724
    • RSR2015: Database for text-dependent speaker verification using multiple passphrases
    • Anthony Larcher, Kong-Aik Lee, Bin Ma, and Haizhou Li, "RSR2015: Database for text-dependent speaker verification using multiple passphrases," in Proc. Interspeech, 2012.
    • (2012) Proc. Interspeech
    • Larcher, A.1    Lee, K.-A.2    Ma, B.3    Li, H.4
  • 77
  • 78
    • 84878410960
    • Detecting converted speech and natural speech for anti-spoofing attack in speaker recognition
    • Zhizheng Wu, E.S. Chng, and Haizhou Li, "Detecting converted speech and natural speech for anti-spoofing attack in speaker recognition," in Proc. Interspeech, 2012.
    • (2012) Proc. Interspeech
    • Wu, Z.1    Chng, E.S.2    Li, H.3
  • 79
    • 84878412793
    • Spoofing countermeasures for the protection of automatic speaker recognition systems against attacks with artificial signals
    • Federico Alegre, Ravichander Vipperla, Nicholas Evans, et al., "Spoofing countermeasures for the protection of automatic speaker recognition systems against attacks with artificial signals," in Proc. Interspeech, 2012.
    • (2012) Proc. Interspeech
    • Alegre, F.1    Vipperla, R.2    Evans, N.3
  • 80
    • 84890543945
    • Synthetic speech detection using temporal modulation feature
    • Zhizheng Wu, Xiong Xiao, E.S. Chng, and Haizhou Li, "Synthetic speech detection using temporal modulation feature," in Proc. ICASSP, 2013.
    • (2013) Proc. ICASSP
    • Wu, Z.1    Xiao, X.2    Chng, E.S.3    Li, H.4
  • 81
    • 84890542394
    • Spoofing countermeasures to protect automatic speaker verification from voice conversion
    • Federico Alegre, Asmaa Amehraye, and Nicholas Evans, "Spoofing countermeasures to protect automatic speaker verification from voice conversion," in Proc. ICASSP, 2013.
    • (2013) Proc. ICASSP
    • Alegre, F.1    Amehraye, A.2    Evans, N.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.