2017, Pages 4900-4904

Training algorithm to deceive Anti-Spoofing Verification for DNN-based speech synthesis

Author keywords

anti spoofing verification; DNN based speech synthesis; generative adversarial training; multitask learning; training algorithm


EID: 85023772724     PISSN: 15206149     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ICASSP.2017.7953088     Document Type: Conference Paper
Times cited : (32)

References (31)
  • 1
    • 67651002140
    • Statistical parametric speech synthesis
    • H. Zen, K. Tokuda, and A. Black, "Statistical parametric speech synthesis," Speech Communication, Vol. 51, no. 11, pp. 1039-1064, 2009.
    • (2009) Speech Communication , vol.51 , Issue.11 , pp. 1039-1064
    • Zen, H.1    Tokuda, K.2    Black, A.3
  • 3
    • 85032750981
    • Deep learning for acoustic modeling in parametric speech generation: A systematic review of existing techniques and future trends
    • Z. H. Ling, S. Y. Kang, H. Zen, A. Senior, M. Schuster, X. J. Qian, H. Meng, and L. Deng, "Deep learning for acoustic modeling in parametric speech generation: A systematic review of existing techniques and future trends," IEEE Signal Processing Magazine, Vol. 32, no. 3, pp. 35-52, 2015.
    • (2015) IEEE Signal Processing Magazine , vol.32 , Issue.3 , pp. 35-52
    • Ling, Z.H.1    Kang, S.Y.2    Zen, H.3    Senior, A.4    Schuster, M.5    Qian, X.J.6    Meng, H.7    Deng, L.8
  • 4
    • 33846429403
    • Minimum generation error training for HMM-based speech synthesis
    • Toulouse, France, May
    • Y. J. Wu and R. H. Wang, "Minimum generation error training for HMM-based speech synthesis," in Proc. ICASSP, Toulouse, France, May 2006, pp. 89-92.
    • (2006) Proc. ICASSP , pp. 89-92
    • Wu, Y.J.1    Wang, R.H.2
  • 5
    • 84978086501
    • Improving trajectory modeling for DNN-based speech synthesis by using stacked bottleneck features and minimum trajectory error training
    • Z. Wu and S. King, "Improving trajectory modeling for DNN-based speech synthesis by using stacked bottleneck features and minimum trajectory error training," IEEE Transactions on Audio, Speech, and Language Processing, Vol. 24, no. 7, pp. 1255-1265, 2016.
    • (2016) IEEE Transactions on Audio, Speech, and Language Processing , vol.24 , Issue.7 , pp. 1255-1265
    • Wu, Z.1    King, S.2
  • 7
    • 84994234512
    • Objective evaluation using association between dimensions within spectral features for statistical parametric speech synthesis
    • California, U.S.A., Sep.
    • Y. Ijima, T. Asami, and H. Mizuno, "Objective evaluation using association between dimensions within spectral features for statistical parametric speech synthesis," in Proc. INTERSPEECH, California, U.S.A., Sep. 2016, pp. 337-341.
    • (2016) Proc. INTERSPEECH , pp. 337-341
    • Ijima, Y.1    Asami, T.2    Mizuno, H.3
  • 8
    • 57749193836
    • Voice conversion based on maximum likelihood estimation of spectral parameter trajectory
    • T. Toda, A. W. Black, and K. Tokuda, "Voice conversion based on maximum likelihood estimation of spectral parameter trajectory," IEEE Transactions on Audio, Speech, and Language Processing, Vol. 15, no. 8, pp. 2222-2235, 2007.
    • (2007) IEEE Transactions on Audio, Speech, and Language Processing , vol.15 , Issue.8 , pp. 2222-2235
    • Toda, T.1    Black, A.W.2    Tokuda, K.3
  • 9
    • 84878387899
    • Histogram-based spectral equalization for HMM-based speech synthesis using mel-LSP
    • Portland, U.S.A., Sep.
    • Y. Ohtani, M. Tamura, M. Morita, T. Kagoshima, and M. Akamine, "Histogram-based spectral equalization for HMM-based speech synthesis using mel-LSP," in Proc. INTERSPEECH, Portland, U.S.A., Sep. 2012.
    • (2012) Proc. INTERSPEECH
    • Ohtani, Y.1    Tamura, M.2    Morita, M.3    Kagoshima, T.4    Akamine, M.5
  • 11
    • 84946033919
    • Modulation spectrum-constrained trajectory training algorithm for GMM-based voice conversion
    • Brisbane, Australia, Apr.
    • S. Takamichi, T. Toda, A. W. Black, and S. Nakamura, "Modulation spectrum-constrained trajectory training algorithm for GMM-based voice conversion," in Proc. ICASSP, Brisbane, Australia, Apr. 2015, pp. 4859-4863.
    • (2015) Proc. ICASSP , pp. 4859-4863
    • Takamichi, S.1    Toda, T.2    Black, A.W.3    Nakamura, S.4
  • 12
    • 84973375140
    • Trajectory training considering global variance for speech synthesis based on neural networks
    • Shanghai, China, Mar.
    • K. Hashimoto, K. Oura, Y. Nankaku, and K. Tokuda, "Trajectory training considering global variance for speech synthesis based on neural networks," in Proc. ICASSP, Shanghai, China, Mar. 2016, pp. 5600-5604.
    • (2016) Proc. ICASSP , pp. 5600-5604
    • Hashimoto, K.1    Oura, K.2    Nankaku, Y.3    Tokuda, K.4
  • 13
    • 84910088495
    • Analysis of spectral enhancement using global variance in HMM-based speech synthesis
    • MAX Atria, Singapore, May
    • T. Nose and A. Ito, "Analysis of spectral enhancement using global variance in HMM-based speech synthesis," in Proc. INTERSPEECH, MAX Atria, Singapore, May 2014, pp. 2917-2921.
    • (2014) Proc. INTERSPEECH , pp. 2917-2921
    • Nose, T.1    Ito, A.2
  • 14
    • 84890490547
    • Statistical parametric speech synthesis using deep neural networks
    • Vancouver, Canada, May
    • H. Zen, A. Senior, and M. Schuster, "Statistical parametric speech synthesis using deep neural networks," in Proc. ICASSP, Vancouver, Canada, May 2013, pp. 7962-7966.
    • (2013) Proc. ICASSP , pp. 7962-7966
    • Zen, H.1    Senior, A.2    Schuster, M.3
  • 16
    • 84959178048
    • Robust deep feature for spoofing detection - The SJTU system for ASVspoof 2015 challenge
    • Dresden, Germany, Sep.
    • N. Chen, Y. Qian, H. Dinkel, B. Chen, and K. Yu, "Robust deep feature for spoofing detection - the SJTU system for ASVspoof 2015 Challenge," in Proc. INTERSPEECH, Dresden, Germany, Sep. 2015, pp. 2097-2101.
    • (2015) Proc. INTERSPEECH , pp. 2097-2101
    • Chen, N.1    Qian, Y.2    Dinkel, H.3    Chen, B.4    Yu, K.5
  • 18
    • 33746600649
    • Reducing the dimensionality of data with neural networks
    • G. E. Hinton and R. R. Salakhutdinov, "Reducing the dimensionality of data with neural networks," Science, Vol. 313, no. 5786, pp. 504-507, 2006.
    • (2006) Science , vol.313 , Issue.5786 , pp. 504-507
    • Hinton, G.E.1    Salakhutdinov, R.R.2
  • 19
    • 84946045510
    • Unidirectional long short-term memory recurrent neural network with recurrent output layer for low-latency speech synthesis
    • Brisbane, Australia, Apr.
    • H. Zen and H. Sak, "Unidirectional long short-term memory recurrent neural network with recurrent output layer for low-latency speech synthesis," in Proc. ICASSP, Brisbane, Australia, Apr. 2015, pp. 4470-4474.
    • (2015) Proc. ICASSP , pp. 4470-4474
    • Zen, H.1    Sak, H.2
  • 20
    • 84959090360
    • Multitask learning deep neural networks for speech feature denoising
    • Dresden, Germany, Sep.
    • B. Huang, D. Ke, H. Zheng, B. Xu, Y. Xu, and K. Su, "Multitask learning deep neural networks for speech feature denoising," in Proc. INTERSPEECH, Dresden, Germany, Sep. 2015, pp. 2464-2468.
    • (2015) Proc. INTERSPEECH , pp. 2464-2468
    • Huang, B.1    Ke, D.2    Zheng, H.3    Xu, B.4    Xu, Y.5    Su, K.6
  • 22
    • 84984985889
    • "Why should I trust you?": Explaining the predictions of any classifier
    • San Francisco, U.S.A., Aug.
    • M. T. Ribeiro, S. Singh, and C. Guestrin, "'Why should I trust you?': Explaining the predictions of any classifier," in Proc. KDD, San Francisco, U.S.A., Aug. 2016, pp. 1135-1144.
    • (2016) Proc. KDD , pp. 1135-1144
    • Ribeiro, M.T.1    Singh, S.2    Guestrin, C.3
  • 25
    • 84874199000
    • Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT
    • Firenze, Italy, Sep.
    • H. Kawahara, Jo Estill, and O. Fujimura, "Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT," in MAVEBA 2001, Firenze, Italy, Sep. 2001, pp. 1-6.
    • (2001) MAVEBA 2001 , pp. 1-6
    • Kawahara, H.1    Estill, J.2    Fujimura, O.3
  • 26
    • 44949143155
    • Maximum likelihood voice conversion based on GMM with STRAIGHT mixed excitation
    • Pittsburgh, U.S.A., Sep.
    • Y. Ohtani, T. Toda, H. Saruwatari, and K. Shikano, "Maximum likelihood voice conversion based on GMM with STRAIGHT mixed excitation," in Proc. INTERSPEECH, Pittsburgh, U.S.A., Sep. 2006, pp. 2266-2269.
    • (2006) Proc. INTERSPEECH , pp. 2266-2269
    • Ohtani, Y.1    Toda, T.2    Saruwatari, H.3    Shikano, K.4
  • 27
    • 0032673049
    • Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
    • H. Kawahara, I. Masuda-Katsuse, and A. D. Cheveigne, "Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds," Speech Communication, Vol. 27, no. 3-4, pp. 187-207, 1999.
    • (1999) Speech Communication , vol.27 , Issue.3-4 , pp. 187-207
    • Kawahara, H.1    Masuda-Katsuse, I.2    Cheveigne, A.D.3
  • 29
    • 84862294866
    • Deep sparse rectifier neural networks
    • Fort Lauderdale, U.S.A., Apr.
    • X. Glorot, A. Bordes, and Y. Bengio, "Deep sparse rectifier neural networks," in Proc. AISTATS, Fort Lauderdale, U.S.A., Apr. 2011, pp. 315-323.
    • (2011) Proc. AISTATS , pp. 315-323
    • Glorot, X.1    Bordes, A.2    Bengio, Y.3
  • 30
    • 80052250414
    • Adaptive subgradient methods for online learning and stochastic optimization
    • J. Duchi, E. Hazan, and Y. Singer, "Adaptive subgradient methods for online learning and stochastic optimization," Journal of Machine Learning Research, Vol. 12, pp. 2121-2159, 2011.
    • (2011) Journal of Machine Learning Research , vol.12 , pp. 2121-2159
    • Duchi, J.1    Hazan, E.2    Singer, Y.3
  • 31
    • 0031573117
    • Long short-term memory
    • S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, Vol. 9, no. 8, pp. 1735-1780, 1997.
    • (1997) Neural Computation , vol.9 , Issue.8 , pp. 1735-1780
    • Hochreiter, S.1    Schmidhuber, J.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.