Volume 88, 2017, Pages 65-82

An overview of voice conversion systems

Author keywords

Overview; Survey; Voice conversion

Indexed keywords

COMPUTER SIMULATION; SURVEYING;

EID: 85010399617     PISSN: 01676393     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.specom.2017.01.008     Document Type: Review
Times cited : (305)

References (305)
  • 2
    • 84930664922
    • VOCAINE the vocoder and applications in speech synthesis
    • Agiomyrgiannakis, Y., VOCAINE the vocoder and applications in speech synthesis. Proceedings of the ICASSP, 2015.
    • (2015) Proceedings of the ICASSP
    • Agiomyrgiannakis, Y.1
  • 3
    • 70349208681
    • ARX-LF-based source-filter methods for voice modification and transformation
    • Agiomyrgiannakis, Y., Rosec, O., ARX-LF-based source-filter methods for voice modification and transformation. Proceedings of the ICASSP, 2009.
    • (2009) Proceedings of the ICASSP
    • Agiomyrgiannakis, Y.1    Rosec, O.2
  • 4
    • 84905227265
    • Voice conversion based on non-negative matrix factorization using phoneme-categorized dictionary
    • Aihara, R., Nakashika, T., Takiguchi, T., Ariki, Y., Voice conversion based on non-negative matrix factorization using phoneme-categorized dictionary. Proceedings of the ICASSP, 2014.
    • (2014) Proceedings of the ICASSP
    • Aihara, R.1    Nakashika, T.2    Takiguchi, T.3    Ariki, Y.4
  • 5
    • 84890519936
    • Individuality-preserving voice conversion for articulation disorders based on non-negative matrix factorization
    • Aihara, R., Takashima, R., Takiguchi, T., Ariki, Y., Individuality-preserving voice conversion for articulation disorders based on non-negative matrix factorization. Proceedings of the ICASSP, 2013.
    • (2013) Proceedings of the ICASSP
    • Aihara, R.1    Takashima, R.2    Takiguchi, T.3    Ariki, Y.4
  • 6
    • 84946095434
    • Activity-mapping non-negative matrix factorization for exemplar-based voice conversion
    • Aihara, R., Takiguchi, T., Ariki, Y., Activity-mapping non-negative matrix factorization for exemplar-based voice conversion. Proceedings of the ICASSP, 2015.
    • (2015) Proceedings of the ICASSP
    • Aihara, R.1    Takiguchi, T.2    Ariki, Y.3
  • 7
    • 84959090646
    • Many-to-many voice conversion based on multiple non-negative matrix factorization
    • Aihara, R., Takiguchi, T., Ariki, Y., Many-to-many voice conversion based on multiple non-negative matrix factorization. Proceedings of the INTERSPEECH, 2015.
    • (2015) Proceedings of the INTERSPEECH
    • Aihara, R.1    Takiguchi, T.2    Ariki, Y.3
  • 8
    • 84949924136
    • Exemplar-based emotional voice conversion using non-negative matrix factorization
    • Aihara, R., Ueda, R., Takiguchi, T., Ariki, Y., Exemplar-based emotional voice conversion using non-negative matrix factorization. Proceedings of the APSIPA, 2014, 10.1109/APSIPA.2014.7041640.
    • (2014) Proceedings of the APSIPA
    • Aihara, R.1    Ueda, R.2    Takiguchi, T.3    Ariki, Y.4
  • 9
    • 84890542394
    • Spoofing countermeasures to protect automatic speaker verification from voice conversion
    • Alegre, F., Amehraye, A., Evans, N., Spoofing countermeasures to protect automatic speaker verification from voice conversion. Proceedings of the ICASSP, 2013.
    • (2013) Proceedings of the ICASSP
    • Alegre, F.1    Amehraye, A.2    Evans, N.3
  • 11
    • 0033154052
    • Speaker transformation algorithm using segmental codebooks (STASC)
    • Arslan, L.M., Speaker transformation algorithm using segmental codebooks (STASC). Speech Commun. 28:3 (1999), 211–226.
    • (1999) Speech Commun. , vol.28 , Issue.3 , pp. 211-226
    • Arslan, L.M.1
  • 12
    • 84863268465
    • Voice conversion by codebook mapping of line spectral frequencies and excitation spectrum
    • Arslan, L.M., Talkin, D., Voice conversion by codebook mapping of line spectral frequencies and excitation spectrum. Proceedings of the EUROSPEECH, 1997.
    • (1997) Proceedings of the EUROSPEECH
    • Arslan, L.M.1    Talkin, D.2
  • 13
    • 0031643805
    • Speaker transformation using sentence HMM based alignments and detailed prosody modification
    • Arslan, L.M., Talkin, D., Speaker transformation using sentence HMM based alignments and detailed prosody modification. Proceedings of the ICASSP, 1998.
    • (1998) Proceedings of the ICASSP
    • Arslan, L.M.1    Talkin, D.2
  • 19
    • 0031104132
    • Application of speech conversion to alaryngeal speech enhancement
    • Bi, N., Qi, Y., Application of speech conversion to alaryngeal speech enhancement. IEEE Trans. Speech Audio Process. 5:2 (1997), 97–105.
    • (1997) IEEE Trans. Speech Audio Process. , vol.5 , Issue.2 , pp. 97-105
    • Bi, N.1    Qi, Y.2
  • 24
    • 84910104946
    • Voice conversion using generative trained deep neural networks with multiple frame spectral envelopes
    • Chen, L.-H., Ling, Z.-H., Dai, L.-R., Voice conversion using generative trained deep neural networks with multiple frame spectral envelopes. Proceedings of the INTERSPEECH, 2014.
    • (2014) Proceedings of the INTERSPEECH
    • Chen, L.-H.1    Ling, Z.-H.2    Dai, L.-R.3
  • 26
    • 84906225084
    • Joint spectral distribution modeling using restricted Boltzmann machines for voice conversion
    • Chen, L.-H., Ling, Z.-H., Song, Y., Dai, L.-R., Joint spectral distribution modeling using restricted Boltzmann machines for voice conversion. Proceedings of the INTERSPEECH, 2013.
    • (2013) Proceedings of the INTERSPEECH
    • Chen, L.-H.1    Ling, Z.-H.2    Song, Y.3    Dai, L.-R.4
  • 27
    • 84994337398
    • The USTC system for voice conversion challenge 2016: neural network based approaches for spectrum, aperiodicity and F0 conversion
    • Chen, L.-H., Liu, L.-J., Ling, Z.-H., Jiang, Y., Dai, L.-R., The USTC system for voice conversion challenge 2016: neural network based approaches for spectrum, aperiodicity and F0 conversion. Proceedings of the INTERSPEECH, 2016.
    • (2016) Proceedings of the INTERSPEECH
    • Chen, L.-H.1    Liu, L.-J.2    Ling, Z.-H.3    Jiang, Y.4    Dai, L.-R.5
  • 30
    • 0029253818
    • Glottal source modeling for voice conversion
    • Childers, D.G., Glottal source modeling for voice conversion. Speech Commun. 16:2 (1995), 127–138.
    • (1995) Speech Commun. , vol.16 , Issue.2 , pp. 127-138
    • Childers, D.G.1
  • 33
    • 84867216755
    • The linear transformation of LF glottal waveforms for voice conversion
    • Del Pozo, A., Young, S., The linear transformation of LF glottal waveforms for voice conversion. Proceedings of the INTERSPEECH, 2008.
    • (2008) Proceedings of the INTERSPEECH
    • Del Pozo, A.1    Young, S.2
  • 35
    • 84874403435
    • Singing voice conversion method based on many-to-many eigenvoice conversion and training data generation using a singing-to-singing synthesis system
    • Doi, H., Toda, T., Nakano, T., Goto, M., Nakamura, S., Singing voice conversion method based on many-to-many eigenvoice conversion and training data generation using a singing-to-singing synthesis system. Proceedings of the APSIPA, 2012.
    • (2012) Proceedings of the APSIPA
    • Doi, H.1    Toda, T.2    Nakano, T.3    Goto, M.4    Nakamura, S.5
  • 38
    • 33947629275
    • Residual conversion versus prediction on voice morphing systems
    • Duxans, H., Bonafonte, A., Residual conversion versus prediction on voice morphing systems. Proceedings of the ICASSP, 2006.
    • (2006) Proceedings of the ICASSP
    • Duxans, H.1    Bonafonte, A.2
  • 41
    • 84946210905
    • A new method for pitch prediction from spectral envelope and its application in voice conversion
    • En-Najjary, T., Rosec, O., Chonavel, T., A new method for pitch prediction from spectral envelope and its application in voice conversion. Proceedings of the INTERSPEECH, 2003.
    • (2003) Proceedings of the INTERSPEECH
    • En-Najjary, T.1    Rosec, O.2    Chonavel, T.3
  • 42
    • 85010449478
    • A voice conversion method based on joint pitch and spectral envelope transformation
    • En-Najjary, T., Rosec, O., Chonavel, T., A voice conversion method based on joint pitch and spectral envelope transformation. Proceedings of the INTERSPEECH, 2004.
    • (2004) Proceedings of the INTERSPEECH
    • En-Najjary, T.1    Rosec, O.2    Chonavel, T.3
  • 45
    • 84913585254
    • Interpretable parametric voice conversion functions based on Gaussian mixture models and constrained transformations
    • Erro, D., Alonso, A., Serrano, L., Navas, E., Hernaez, I., Interpretable parametric voice conversion functions based on Gaussian mixture models and constrained transformations. Comput. Speech Lang. 30:1 (2015), 3–15.
    • (2015) Comput. Speech Lang. , vol.30 , Issue.1 , pp. 3-15
    • Erro, D.1    Alonso, A.2    Serrano, L.3    Navas, E.4    Hernaez, I.5
  • 47
    • 56149106209
    • Frame alignment method for cross-lingual voice conversion
    • Erro, D., Moreno, A., Frame alignment method for cross-lingual voice conversion. Proceedings of the INTERSPEECH, 2007.
    • (2007) Proceedings of the INTERSPEECH
    • Erro, D.1    Moreno, A.2
  • 49
    • 77953725318
    • INCA algorithm for training voice conversion systems from nonparallel corpora
    • Erro, D., Moreno, A., Bonafonte, A., INCA algorithm for training voice conversion systems from nonparallel corpora. IEEE Trans. Audio Speech Lang. Process. 18:5 (2010), 944–953.
    • (2010) IEEE Trans. Audio Speech Lang. Process. , vol.18 , Issue.5 , pp. 944-953
    • Erro, D.1    Moreno, A.2    Bonafonte, A.3
  • 51
    • 84878409257
    • Iterative MMSE estimation of vocal tract length normalization factors for voice transformation
    • Erro, D., Navas, E., Hernáez, I., Iterative MMSE estimation of vocal tract length normalization factors for voice transformation. Proceedings of the INTERSPEECH, 2012.
    • (2012) Proceedings of the INTERSPEECH
    • Erro, D.1    Navas, E.2    Hernáez, I.3
  • 52
    • 51449121679
    • On combining statistical methods and frequency warping for high-quality voice conversion
    • Erro, D., Polyakova, T., Moreno, A., On combining statistical methods and frequency warping for high-quality voice conversion. Proceedings of the ICASSP, 2008.
    • (2008) Proceedings of the ICASSP
    • Erro, D.1    Polyakova, T.2    Moreno, A.3
  • 55
    • 84986212974
    • A waveform representation framework for high-quality statistical parametric speech synthesis
    • arXiv preprint arXiv:1510.01443
    • Fan, B., Lee, S.W., Tian, X., Xie, L., Dong, M., A waveform representation framework for high-quality statistical parametric speech synthesis. Proceedings of the APSIPA, 2015, arXiv preprint arXiv:1510.01443.
    • (2015) Proceedings of the APSIPA
    • Fan, B.1    Lee, S.W.2    Tian, X.3    Xie, L.4    Dong, M.5
  • 57
    • 0022667694
    • Speaker-independent isolated word recognition using dynamic features of speech spectrum
    • Furui, S., Speaker-independent isolated word recognition using dynamic features of speech spectrum. IEEE Transactions on Acoustics, Speech and Signal Processing 34:1 (1986), 52–59.
    • (1986) IEEE Transactions on Acoustics, Speech and Signal Processing , vol.34 , Issue.1 , pp. 52-59
    • Furui, S.1
  • 58
    • 6344222337
    • DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus CD-ROM. NIST Speech Disc 1-1.1
    • NASA STI/Recon Technical Report N
    • Garofolo, J.S., Lamel, L.F., Fisher, W.M., Fiscus, J.G., Pallett, D.S., DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus CD-ROM. NIST Speech Disc 1-1.1. NASA STI/Recon Technical Report N 93 (1993), 27403.
    • (1993) NASA STI/Recon Technical Report N , vol.93 , pp. 27403
    • Garofolo, J.S.1    Lamel, L.F.2    Fisher, W.M.3    Fiscus, J.G.4    Pallett, D.S.5
  • 62
  • 63
    • 84906241950
    • Assessing the intelligibility impact of vowel space expansion via clear speech-inspired frequency warping
    • Godoy, E., Koutsogiannaki, M., Stylianou, Y., Assessing the intelligibility impact of vowel space expansion via clear speech-inspired frequency warping. Proceedings of the INTERSPEECH, 2013.
    • (2013) Proceedings of the INTERSPEECH
    • Godoy, E.1    Koutsogiannaki, M.2    Stylianou, Y.3
  • 64
    • 84890562746
    • Approaching speech intelligibility enhancement with inspiration from Lombard and clear speaking styles
    • Godoy, E., Koutsogiannaki, M., Stylianou, Y., Approaching speech intelligibility enhancement with inspiration from Lombard and clear speaking styles. Comput. Speech Lang. 28:2 (2014), 629–647.
    • (2014) Comput. Speech Lang. , vol.28 , Issue.2 , pp. 629-647
    • Godoy, E.1    Koutsogiannaki, M.2    Stylianou, Y.3
  • 65
    • 70450186582
    • Alleviating the one-to-many mapping problem in voice conversion with context-dependent modelling
    • Godoy, E., Rosec, O., Chonavel, T., Alleviating the one-to-many mapping problem in voice conversion with context-dependent modelling. Proceedings of the INTERSPEECH, 2009.
    • (2009) Proceedings of the INTERSPEECH
    • Godoy, E.1    Rosec, O.2    Chonavel, T.3
  • 67
    • 78650273608
    • Speech spectral envelope estimation through explicit control of peak evolution in time
    • Godoy, E., Rosec, O., Chonavel, T., Speech spectral envelope estimation through explicit control of peak evolution in time. Proceedings of the ISSPA, 2010.
    • (2010) Proceedings of the ISSPA
    • Godoy, E.1    Rosec, O.2    Chonavel, T.3
  • 68
    • 84865717274
    • Spectral envelope transformation using DFW and amplitude scaling for voice conversion with parallel or nonparallel corpora
    • Godoy, E., Rosec, O., Chonavel, T., Spectral envelope transformation using DFW and amplitude scaling for voice conversion with parallel or nonparallel corpora. Proceedings of the INTERSPEECH, 2011.
    • (2011) Proceedings of the INTERSPEECH
    • Godoy, E.1    Rosec, O.2    Chonavel, T.3
  • 69
    • 84857498745
    • Voice conversion using dynamic frequency warping with amplitude scaling, for parallel or nonparallel corpora
    • Godoy, E., Rosec, O., Chonavel, T., Voice conversion using dynamic frequency warping with amplitude scaling, for parallel or nonparallel corpora. IEEE Trans. Audio Speech Lang. Process. 20:4 (2012), 1313–1323.
    • (2012) IEEE Trans. Audio Speech Lang. Process. , vol.20 , Issue.4 , pp. 1313-1323
    • Godoy, E.1    Rosec, O.2    Chonavel, T.3
  • 70
    • 84912138720
    • Improving segmental GMM based voice conversion method with target frame selection
    • Gu, H.-Y., Tsai, S.-F., Improving segmental GMM based voice conversion method with target frame selection. Proceedings of the ISCSLP, 2014.
    • (2014) Proceedings of the ISCSLP
    • Gu, H.-Y.1    Tsai, S.-F.2
  • 72
    • 85010456787
    • Spectral mapping method for voice conversion using speaker selection and vector field smoothing
    • Hashimoto, M., Higuchi, N., Spectral mapping method for voice conversion using speaker selection and vector field smoothing. Proceedings of the EUROSPEECH, 1995.
    • (1995) Proceedings of the EUROSPEECH
    • Hashimoto, M.1    Higuchi, N.2
  • 73
    • 0030351582
    • Training data selection for voice conversion using speaker selection and vector field smoothing
    • Hashimoto, M., Higuchi, N., Training data selection for voice conversion using speaker selection and vector field smoothing. Proceedings of the ICSLP, 1996.
    • (1996) Proceedings of the ICSLP
    • Hashimoto, M.1    Higuchi, N.2
  • 81
    • 56149114123
    • On the importance of pure prosody in the perception of speaker identity
    • Helander, E.E., Nurminen, J., On the importance of pure prosody in the perception of speaker identity. Proceedings of the INTERSPEECH, 2007.
    • (2007) Proceedings of the INTERSPEECH
    • Helander, E.E.1    Nurminen, J.2
  • 82
    • 77956795483
    • Esophageal speech enhancement based on statistical voice conversion with Gaussian mixture models
    • Doi, H., Nakamura, K., Toda, T., Saruwatari, H., Shikano, K., Esophageal speech enhancement based on statistical voice conversion with Gaussian mixture models. IEICE Trans. Inf. Syst. 93:9 (2010), 2472–2482.
    • (2010) IEICE Trans. Inf. Syst. , vol.93 , Issue.9 , pp. 2472-2482
    • Doi, H.1    Nakamura, K.2    Toda, T.3    Saruwatari, H.4    Shikano, K.5
  • 83
    • 0024880831
    • Multilayer feedforward networks are universal approximators
    • Hornik, K., Stinchcombe, M., White, H., Multilayer feedforward networks are universal approximators. Neural Netw. 2:5 (1989), 359–366.
    • (1989) Neural Netw. , vol.2 , Issue.5 , pp. 359-366
    • Hornik, K.1    Stinchcombe, M.2    White, H.3
  • 85
    • 34548216761
    • Conversion function clustering and selection using linguistic and spectral information for emotional voice conversion
    • Hsia, C.-C., Wu, C.-H., Wu, J.-Q., Conversion function clustering and selection using linguistic and spectral information for emotional voice conversion. IEEE Trans. Comput. 56:9 (2007), 1245–1254.
    • (2007) IEEE Trans. Comput. , vol.56 , Issue.9 , pp. 1245-1254
    • Hsia, C.-C.1    Wu, C.-H.2    Wu, J.-Q.3
  • 89
    • 0020596154
    • Cepstral analysis synthesis on the mel frequency scale
    • Imai, S., Cepstral analysis synthesis on the mel frequency scale. Proceedings of the ICASSP, 1983.
    • (1983) Proceedings of the ICASSP
    • Imai, S.1
  • 91
    • 0020703324
    • Mel log spectrum approximation (MLSA) filter for speech synthesis
    • Imai, S., Sumita, K., Furuichi, C., Mel log spectrum approximation (MLSA) filter for speech synthesis. Electron. Commun. Japan 66:2 (1983), 10–18.
    • (1983) Electron. Commun. Japan , vol.66 , Issue.2 , pp. 10-18
    • Imai, S.1    Sumita, K.2    Furuichi, C.3
  • 93
    • 84938935270
    • A system for transforming the emotion in speech: combining data-driven conversion techniques for prosody and voice quality
    • Inanoglu, Z., Young, S., A system for transforming the emotion in speech: combining data-driven conversion techniques for prosody and voice quality. Proceedings of the INTERSPEECH, 2007, 490–493.
    • (2007) Proceedings of the INTERSPEECH , pp. 490-493
    • Inanoglu, Z.1    Young, S.2
  • 94
    • 58149203393
    • Data-driven emotion conversion in spoken English
    • Inanoglu, Z., Young, S., Data-driven emotion conversion in spoken English. Speech Commun. 51:3 (2009), 268–283.
    • (2009) Speech Commun. , vol.51 , Issue.3 , pp. 268-283
    • Inanoglu, Z.1    Young, S.2
  • 95
    • 85064715894
    • Speech spectrum transformation by speaker interpolation
    • IEEE
    • Iwahashi, N., Sagisaka, Y., Speech spectrum transformation by speaker interpolation. Proceedings of the ICASSP. Vol. 1, 1994, IEEE, I–461.
    • (1994) Proceedings of the ICASSP. Vol. 1 , pp. I-461
    • Iwahashi, N.1    Sagisaka, Y.2
  • 96
    • 0029251946
    • Speech spectrum conversion based on speaker interpolation and multi-functional representation with weighting by radial basis function networks
    • Iwahashi, N., Sagisaka, Y., Speech spectrum conversion based on speaker interpolation and multi-functional representation with weighting by radial basis function networks. Speech Commun. 16:2 (1995), 139–151.
    • (1995) Speech Commun. , vol.16 , Issue.2 , pp. 139-151
    • Iwahashi, N.1    Sagisaka, Y.2
  • 97
    • 0031623661
    • Spectral voice conversion for text-to-speech synthesis
    • Kain, A., Macon, M.W., Spectral voice conversion for text-to-speech synthesis. Proceedings of the ICASSP, 1998.
    • (1998) Proceedings of the ICASSP
    • Kain, A.1    Macon, M.W.2
  • 98
    • 84984905455
    • Text-to-speech voice adaptation from sparse training data
    • Kain, A., Macon, M.W., Text-to-speech voice adaptation from sparse training data. Proceedings of the ICSLP, 1998.
    • (1998) Proceedings of the ICSLP
    • Kain, A.1    Macon, M.W.2
  • 99
    • 0034841948
    • Design and evaluation of a voice conversion algorithm based on spectral envelope mapping and residual prediction
    • Kain, A., Macon, M.W., Design and evaluation of a voice conversion algorithm based on spectral envelope mapping and residual prediction. Proceedings of the ICASSP, 2001.
    • (2001) Proceedings of the ICASSP
    • Kain, A.1    Macon, M.W.2
  • 100
    • 77953816641
    • Unit-selection text-to-speech synthesis using an asynchronous interpolation model
    • Kain, A., van Santen, J.P., Unit-selection text-to-speech synthesis using an asynchronous interpolation model. Proceedings of the SSW, 2007.
    • (2007) Proceedings of the SSW
    • Kain, A.1    van Santen, J.P.2
  • 101
    • 70349210296
    • Using speech transformation to increase speech intelligibility for the hearing- and speaking-impaired
    • Kain, A., Van Santen, J., Using speech transformation to increase speech intelligibility for the hearing- and speaking-impaired. Proceedings of the ICASSP, 2009.
    • (2009) Proceedings of the ICASSP
    • Kain, A.1    Van Santen, J.2
  • 105
    • 33947698917
    • Applying pitch target model to convert F0 contour for expressive Mandarin speech synthesis
    • Kang, Y., Tao, J., Xu, B., Applying pitch target model to convert F0 contour for expressive Mandarin speech synthesis. Proceedings of the ICASSP, 2006.
    • (2006) Proceedings of the ICASSP
    • Kang, Y.1    Tao, J.2    Xu, B.3
  • 106
    • 0032673049
    • Restructuring speech representations using a pitch-adaptive time–frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
    • Kawahara, H., Masuda-Katsuse, I., De Cheveigné, A., Restructuring speech representations using a pitch-adaptive time–frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds. Speech Commun. 27:3 (1999), 187–207.
    • (1999) Speech Commun. , vol.27 , Issue.3 , pp. 187-207
    • Kawahara, H.1    Masuda-Katsuse, I.2    De Cheveigné, A.3
  • 107
    • 51449108867
    • TANDEM-STRAIGHT: a temporally stable power spectral representation for periodic signals and applications to interference-free spectrum, F0, and aperiodicity estimation
    • Kawahara, H., Morise, M., Takahashi, T., Nisimura, R., Irino, T., Banno, H., TANDEM-STRAIGHT: a temporally stable power spectral representation for periodic signals and applications to interference-free spectrum, F0, and aperiodicity estimation. Proceedings of the ICASSP, 2008.
    • (2008) Proceedings of the ICASSP
    • Kawahara, H.1    Morise, M.2    Takahashi, T.3    Nisimura, R.4    Irino, T.5    Banno, H.6
  • 109
    • 85135141647
    • Hidden Markov model based voice conversion using dynamic characteristics of speaker
    • Kim, E.-K., Lee, S., Oh, Y.-H., Hidden Markov model based voice conversion using dynamic characteristics of speaker. Proceedings of the EUROSPEECH, 1997.
    • (1997) Proceedings of the EUROSPEECH
    • Kim, E.-K.1    Lee, S.2    Oh, Y.-H.3
  • 113
    • 84905248157
    • Simple and artefact-free spectral modifications for enhancing the intelligibility of casual speech
    • Koutsogiannaki, M., Stylianou, Y., Simple and artefact-free spectral modifications for enhancing the intelligibility of casual speech. Proceedings of the ICASSP, 2014.
    • (2014) Proceedings of the ICASSP
    • Koutsogiannaki, M.1    Stylianou, Y.2
  • 114
    • 84908466787
    • Using phone and diphone based acoustic models for voice conversion: a step towards creating voice fonts
    • Kumar, A., Verma, A., Using phone and diphone based acoustic models for voice conversion: a step towards creating voice fonts. Proceedings of the ICME, 2003.
    • (2003) Proceedings of the ICME
    • Kumar, A.1    Verma, A.2
  • 115
    • 0029256373
    • Acoustic characteristics of speaker individuality: control and conversion
    • Kuwabara, H., Sagisaka, Y., Acoustic characteristics of speaker individuality: control and conversion. Speech Commun. 16:2 (1995), 165–173.
    • (1995) Speech Commun. , vol.16 , Issue.2 , pp. 165-173
    • Kuwabara, H.1    Sagisaka, Y.2
  • 116
    • 84964555662
    • Speaker intonation adaptation for transforming text-to-speech synthesis speaker identity
    • Langarani, M.S.E., van Santen, J., Speaker intonation adaptation for transforming text-to-speech synthesis speaker identity. Proceedings of the ASRU, 2015.
    • (2015) Proceedings of the ASRU
    • Langarani, M.S.E.1    van Santen, J.2
  • 118
    • 84890501677
    • Voice conversion by mapping the spectral and prosodic features using support vector machine
    • Springer
    • Laskar, R.H., Talukdar, F.A., Bhattacharjee, R., Das, S., Voice conversion by mapping the spectral and prosodic features using support vector machine. Applications of Soft Computing, 2009, Springer, 519–528.
    • (2009) Applications of Soft Computing , pp. 519-528
    • Laskar, R.H.1    Talukdar, F.A.2    Bhattacharjee, R.3    Das, S.4
  • 120
    • 44949210554
    • MAP-based adaptation for speech conversion using adaptation data selection and non-parallel training
    • Lee, C.-H., Wu, C.-H., MAP-based adaptation for speech conversion using adaptation data selection and non-parallel training. Proceedings of the INTERSPEECH, 2006.
    • (2006) Proceedings of the INTERSPEECH
    • Lee, C.-H.1    Wu, C.-H.2
  • 121
    • 38149065136
    • Statistical approach for voice personality transformation
    • Lee, K.-S., Statistical approach for voice personality transformation. IEEE Trans. Audio Speech Lang. Process. 15:2 (2007), 641–651.
    • (2007) IEEE Trans. Audio Speech Lang. Process. , vol.15 , Issue.2 , pp. 641-651
    • Lee, K.-S.1
  • 122
    • 84896464538
    • A unit selection approach for voice transformation
    • Lee, K.-S., A unit selection approach for voice transformation. Speech Commun. 60 (2014), 30–43.
    • (2014) Speech Commun. , vol.60 , pp. 30-43
    • Lee, K.-S.1
  • 123
    • 84876489382
    • Emotional speech conversion based on spectrum-prosody dual transformation
    • Li, B., Xiao, Z., Shen, Y., Zhou, Q., Tao, Z., Emotional speech conversion based on spectrum-prosody dual transformation. Proceedings of the ICSP, 2012.
    • (2012) Proceedings of the ICSP
    • Li, B.1    Xiao, Z.2    Shen, Y.3    Zhou, Q.4    Tao, Z.5
  • 124
    • 85032750981
    • Deep learning for acoustic modeling in parametric speech generation: A systematic review of existing techniques and future trends
    • Ling, Z.-H., Kang, S.-Y., Zen, H., Senior, A., Schuster, M., Qian, X.-J., Meng, H.M., Deng, L., Deep learning for acoustic modeling in parametric speech generation: A systematic review of existing techniques and future trends. IEEE Signal Process. Mag. 32:3 (2015), 35–52.
    • (2015) IEEE Signal Process. Mag. , vol.32 , Issue.3 , pp. 35-52
    • Ling, Z.-H.1    Kang, S.-Y.2    Zen, H.3    Senior, A.4    Schuster, M.5    Qian, X.-J.6    Meng, H.M.7    Deng, L.8
  • 125
    • 84905223323
    • Using bidirectional associative memories for joint spectral envelope modeling in voice conversion
    • Liu, L.-J., Chen, L.-H., Ling, Z.-H., Dai, L.-R., Using bidirectional associative memories for joint spectral envelope modeling in voice conversion. Proceedings of the ICASSP, 2014.
    • (2014) Proceedings of the ICASSP
    • Liu, L.-J.1    Chen, L.-H.2    Ling, Z.-H.3    Dai, L.-R.4
  • 126
    • 84946076200
    • Spectral conversion using deep neural networks trained with multi-source speakers
    • Liu, L.-J., Chen, L.-H., Ling, Z.-H., Dai, L.-R., Spectral conversion using deep neural networks trained with multi-source speakers. Proceedings of the ICASSP, 2015.
    • (2015) Proceedings of the ICASSP
    • Liu, L.-J.1    Chen, L.-H.2    Ling, Z.-H.3    Dai, L.-R.4
  • 131
    • 84905269973
    • Multimodal voice conversion using non-negative matrix factorization in noisy environments
    • Masaka, K., Aihara, R., Takiguchi, T., Ariki, Y., Multimodal voice conversion using non-negative matrix factorization in noisy environments. Proceedings of the ICASSP, 2014.
    • (2014) Proceedings of the ICASSP
    • Masaka, K.1    Aihara, R.2    Takiguchi, T.3    Ariki, Y.4
  • 132
    • 34547534995
    • Cost reduction of training mapping function based on multistep voice conversion
    • Masuda, T., Shozakai, M., Cost reduction of training mapping function based on multistep voice conversion. Proceedings of the ICASSP, 2007.
    • (2007) Proceedings of the ICASSP
    • Masuda, T.1    Shozakai, M.2
  • 133
    • 0015677419
    • Multidimensional representation of personal quality of vowels and its acoustical correlates
    • Matsumoto, H., Hiki, S., Sone, T., Nimura, T., Multidimensional representation of personal quality of vowels and its acoustical correlates. IEEE Trans. Audio Electroacoust. 21:5 (1973), 428–436.
    • (1973) IEEE Trans. Audio Electroacoust. , vol.21 , Issue.5 , pp. 428-436
    • Matsumoto, H.1    Hiki, S.2    Sone, T.3    Nimura, T.4
  • 134
    • 85007685968
    • Unsupervised speaker adaptation from short utterances based on a minimized fuzzy objective function
    • Matsumoto, H., Yamashita, Y., Unsupervised speaker adaptation from short utterances based on a minimized fuzzy objective function. J. Acoust. Soc. Japan (E) 14:5 (1993), 353–361.
    • (1993) J. Acoust. Soc. Japan (E) , vol.14 , Issue.5 , pp. 353-361
    • Matsumoto, H.1    Yamashita, Y.2
  • 137
    • 84994251909
    • Deep bidirectional LSTM modeling of timbre and prosody for emotional voice conversion
    • Ming, H., Huang, D., Xie, L., Wu, J., Dong, M., Li, H., Deep bidirectional LSTM modeling of timbre and prosody for emotional voice conversion. Proceedings of the INTERSPEECH, 2016.
    • (2016) Proceedings of the INTERSPEECH
    • Ming, H.1    Huang, D.2    Xie, L.3    Wu, J.4    Dong, M.5    Li, H.6
  • 138
    • 0029256372
    • Voice conversion algorithm based on piecewise linear conversion rules of formant frequency and spectrum tilt
    • Mizuno, H., Abe, M., Voice conversion algorithm based on piecewise linear conversion rules of formant frequency and spectrum tilt. Speech Commun. 16:2 (1995), 153–164.
    • (1995) Speech Commun. , vol.16 , Issue.2 , pp. 153-164
    • Mizuno, H.1    Abe, M.2
  • 140
    • 84946685887
    • Voice conversion using deep neural networks with speaker-independent pre-training
    • Mohammadi, S.H., Kain, A., Voice conversion using deep neural networks with speaker-independent pre-training. Proceedings of the SLT, 2014.
    • (2014) Proceedings of the SLT
    • Mohammadi, S.H.1    Kain, A.2
  • 141
    • 84959173289
    • Semi-supervised training of a voice conversion mapping function using a joint-autoencoder
    • Mohammadi, S.H., Kain, A., Semi-supervised training of a voice conversion mapping function using a joint-autoencoder. Proceedings of the INTERSPEECH, 2015.
    • (2015) Proceedings of the INTERSPEECH
    • Mohammadi, S.H.1    Kain, A.2
  • 142
    • 84994219829
    • A voice conversion mapping function based on a stacked joint-autoencoder
    • Mohammadi, S.H., Kain, A., A voice conversion mapping function based on a stacked joint-autoencoder. Proceedings of the INTERSPEECH, 2016.
    • (2016) Proceedings of the INTERSPEECH
    • Mohammadi, S.H.1    Kain, A.2
  • 144
    • 84908519225
    • CheapTrick, a spectral envelope estimator for high-quality speech synthesis
    • Morise, M., CheapTrick, a spectral envelope estimator for high-quality speech synthesis. Speech Commun. 67 (2015), 1–7.
    • (2015) Speech Commun. , vol.67 , pp. 1-7
    • Morise, M.1
  • 145
    • 84976902575
    • WORLD: a vocoder-based high-quality speech synthesis system for real-time applications
    • Morise, M., Yokomori, F., Ozawa, K., WORLD: a vocoder-based high-quality speech synthesis system for real-time applications. IEICE Trans. Inf. Syst., 2016.
    • (2016) IEICE Trans. Inf. Syst.
    • Morise, M.1    Yokomori, F.2    Ozawa, K.3
  • 147
    • 0036753077
    • Reconstruction of speech from whispers
    • Morris, R.W., Clements, M.A., Reconstruction of speech from whispers. Med. Eng. Phys. 24:7 (2002), 515–520.
    • (2002) Med. Eng. Phys. , vol.24 , Issue.7 , pp. 515-520
    • Morris, R.W.1    Clements, M.A.2
  • 149
    • 4544297119
    • Non-parallel training for voice conversion by maximum likelihood constrained adaptation
    • Mouchtaris, A., Van der Spiegel, J., Mueller, P., Non-parallel training for voice conversion by maximum likelihood constrained adaptation. Proceedings of the ICASSP, 2004.
    • (2004) Proceedings of the ICASSP
    • Mouchtaris, A.1    Van der Spiegel, J.2    Mueller, P.3
  • 150
    • 11244303645
    • A spectral conversion approach to the iterative Wiener filter for speech enhancement
    • Mouchtaris, A., Van der Spiegel, J., Mueller, P., A spectral conversion approach to the iterative Wiener filter for speech enhancement. Proceedings of the ICME, 2004.
    • (2004) Proceedings of the ICME
    • Mouchtaris, A.1    Van der Spiegel, J.2    Mueller, P.3
  • 151
    • 34047245444
    • Nonparallel training for voice conversion based on a parameter adaptation approach
    • Mouchtaris, A., Van der Spiegel, J., Mueller, P., Nonparallel training for voice conversion based on a parameter adaptation approach. IEEE Trans. Audio Speech Lang. Process. 14:3 (2006), 952–963.
    • (2006) IEEE Trans. Audio Speech Lang. Process. , vol.14 , Issue.3 , pp. 952-963
    • Mouchtaris, A.1    Van der Spiegel, J.2    Mueller, P.3
  • 152
    • 0025543906
    • Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones
    • Moulines, E., Charpentier, F., Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones. Speech Commun. 9:5 (1990), 453–467.
    • (1990) Speech Commun. , vol.9 , Issue.5 , pp. 453-467
    • Moulines, E.1    Charpentier, F.2
  • 155
    • 85010461545
    • A speech communication aid system for total laryngectomies using voice conversion of body transmitted artificial speech
    • Nakamura, K., Toda, T., Saruwatari, H., Shikano, K., A speech communication aid system for total laryngectomies using voice conversion of body transmitted artificial speech. J. Acoust. Soc. Am., 120(5), 2006, 3351.
    • (2006) J. Acoust. Soc. Am. , vol.120 , Issue.5 , pp. 3351
    • Nakamura, K.1    Toda, T.2    Saruwatari, H.3    Shikano, K.4
  • 156
    • 80052698826
    • Speaking-aid systems using GMM-based voice conversion for electrolaryngeal speech
    • Nakamura, K., Toda, T., Saruwatari, H., Shikano, K., Speaking-aid systems using GMM-based voice conversion for electrolaryngeal speech. Speech Commun. 54:1 (2012), 134–146.
    • (2012) Speech Commun. , vol.54 , Issue.1 , pp. 134-146
    • Nakamura, K.1    Toda, T.2    Saruwatari, H.3    Shikano, K.4
  • 158
    • 84910087396
    • High-order sequence modeling using speaker-dependent recurrent temporal restricted Boltzmann machines for voice conversion
    • Nakashika, T., Takiguchi, T., Ariki, Y., High-order sequence modeling using speaker-dependent recurrent temporal restricted Boltzmann machines for voice conversion. Proceedings of the INTERSPEECH, 2014.
    • (2014) Proceedings of the INTERSPEECH
    • Nakashika, T.1    Takiguchi, T.2    Ariki, Y.3
  • 160
    • 84923867813
    • Voice conversion using RNN pre-trained by recurrent temporal restricted Boltzmann machines
    • Nakashika, T., Takiguchi, T., Ariki, Y., Voice conversion using RNN pre-trained by recurrent temporal restricted Boltzmann machines. IEEE/ACM Trans. Audio Speech Lang. Process. 23:3 (2015), 580–587, 10.1109/TASLP.2014.2379589.
    • (2015) IEEE/ACM Trans. Audio Speech Lang. Process. , vol.23 , Issue.3 , pp. 580-587
    • Nakashika, T.1    Takiguchi, T.2    Ariki, Y.3
  • 161
    • 84924309945
    • Voice conversion using speaker-dependent conditional restricted Boltzmann machine
    • Nakashika, T., Takiguchi, T., Ariki, Y., Voice conversion using speaker-dependent conditional restricted Boltzmann machine. EURASIP J. Audio Speech Music Process. 2015:1 (2015), 1–12.
    • (2015) EURASIP J. Audio Speech Music Process. , vol.2015 , Issue.1 , pp. 1-12
    • Nakashika, T.1    Takiguchi, T.2    Ariki, Y.3
  • 162
    • 84984920236
    • Non-parallel training in voice conversion using an adaptive restricted Boltzmann machine
    • Nakashika, T., Takiguchi, T., Minami, Y., Non-parallel training in voice conversion using an adaptive restricted Boltzmann machine. IEEE/ACM Trans. Audio Speech Lang. Process. 24:11 (2016), 2032–2045.
    • (2016) IEEE/ACM Trans. Audio Speech Lang. Process. , vol.24 , Issue.11 , pp. 2032-2045
    • Nakashika, T.1    Takiguchi, T.2    Minami, Y.3
  • 163
    • 84901766069
    • Voice conversion based on speaker-dependent restricted Boltzmann machines
    • Nakashika, T., Takiguchi, T., Ariki, Y., Voice conversion based on speaker-dependent restricted Boltzmann machines. IEICE Trans. Inf. Syst. 97:6 (2014), 1403–1410.
    • (2014) IEICE Trans. Inf. Syst. , vol.97 , Issue.6 , pp. 1403-1410
    • Nakashika, T.1    Takiguchi, T.2    Ariki, Y.3
  • 164
    • 78149241363
    • Spectral conversion based on statistical models including time-sequence matching
    • Nankaku, Y., Nakamura, K., Toda, T., Tokuda, K., Spectral conversion based on statistical models including time-sequence matching. Proceedings of the SSW, 2007.
    • (2007) Proceedings of the SSW
    • Nankaku, Y.1    Nakamura, K.2    Toda, T.3    Tokuda, K.4
  • 165
    • 0029254176
    • Transformation of formants for voice conversion using artificial neural networks
    • Narendranath, M., Murthy, H.A., Rajendran, S., Yegnanarayana, B., Transformation of formants for voice conversion using artificial neural networks. Speech Commun. 16:2 (1995), 207–216.
    • (1995) Speech Commun. , vol.16 , Issue.2 , pp. 207-216
    • Narendranath, M.1    Murthy, H.A.2    Rajendran, S.3    Yegnanarayana, B.4
  • 167
    • 67649297853
    • Spectral modification for voice gender conversion using temporal decomposition
    • Nguyen, B.P., Akagi, M., Spectral modification for voice gender conversion using temporal decomposition. J. Signal Process., 2007.
    • (2007) J. Signal Process.
    • Nguyen, B.P.1    Akagi, M.2
  • 168
    • 85010381832
    • Phoneme-based spectral voice conversion using temporal decomposition and Gaussian mixture model
    • Nguyen, B.P., Akagi, M., Phoneme-based spectral voice conversion using temporal decomposition and Gaussian mixture model. Proceedings of the ICCE, 2008.
    • (2008) Proceedings of the ICCE
    • Nguyen, B.P.1    Akagi, M.2
  • 169
    • 84867055711
    • Voice transformation using radial basis function
    • Springer
    • Nirmal, J., Patnaik, S., Zaveri, M.A., Voice transformation using radial basis function. Proceedings of the TITC, 2013, Springer, 345–351.
    • (2013) Proceedings of the TITC , pp. 345-351
    • Nirmal, J.1    Patnaik, S.2    Zaveri, M.A.3
  • 170
    • 84905573362
    • Voice conversion using general regression neural network
    • Nirmal, J., Zaveri, M., Patnaik, S., Kachare, P., Voice conversion using general regression neural network. Appl. Soft Comput. 24 (2014), 1–12.
    • (2014) Appl. Soft Comput. , vol.24 , pp. 1-12
    • Nirmal, J.1    Zaveri, M.2    Patnaik, S.3    Kachare, P.4
  • 177
    • 2142655909
    • Interpolation properties of linear prediction parametric representations.
    • Paliwal, K.K., Interpolation properties of linear prediction parametric representations. Proceedings of the EUROSPEECH, 1995.
    • (1995) Proceedings of the EUROSPEECH
    • Paliwal, K.K.1
  • 178
    • 0033692729
    • Narrowband to wideband conversion of speech using GMM based transformation
    • Park, K.-Y., Kim, H.S., Narrowband to wideband conversion of speech using GMM based transformation. Proceedings of the ICASSP, 2000.
    • (2000) Proceedings of the ICASSP
    • Park, K.-Y.1    Kim, H.S.2
  • 180
    • 0032664931
    • An experimental study of speaker verification sensitivity to computer voice-altered imposters
    • Pellom, B.L., Hansen, J.H., An experimental study of speaker verification sensitivity to computer voice-altered imposters. Proceedings of the ICASSP, 1999.
    • (1999) Proceedings of the ICASSP
    • Pellom, B.L.1    Hansen, J.H.2
  • 183
    • 27644522706
    • Vocal tract normalization equals linear transformation in cepstral space
    • Pitz, M., Ney, H., Vocal tract normalization equals linear transformation in cepstral space. Speech Audio Process. IEEE Trans. 13:5 (2005), 930–944.
    • (2005) Speech Audio Process. IEEE Trans. , vol.13 , Issue.5 , pp. 930-944
    • Pitz, M.1    Ney, H.2
  • 185
    • 70450171770
    • A novel technique for voice conversion based on style and content decomposition with bilinear models.
    • Popa, V., Nurminen, J., Gabbouj, M., A novel technique for voice conversion based on style and content decomposition with bilinear models. Proceedings of the INTERSPEECH, 2009.
    • (2009) Proceedings of the INTERSPEECH
    • Popa, V.1    Nurminen, J.2    Gabbouj, M.3
  • 186
    • 84971616451
    • A study of bilinear models in voice conversion
    • Popa, V., Nurminen, J., Gabbouj, M., et al. A study of bilinear models in voice conversion. J. Signal Inf. Process., 2(02), 2011, 125.
    • (2011) J. Signal Inf. Process. , vol.2 , Issue.2 , pp. 125
    • Popa, V.1    Nurminen, J.2    Gabbouj, M.3
  • 189
    • 33751438738
    • Non-linear frequency scale mapping for voice conversion in text-to-speech system with cepstral description
    • Přibilová, A., Přibil, J., Non-linear frequency scale mapping for voice conversion in text-to-speech system with cepstral description. Speech Commun. 48:12 (2006), 1691–1703.
    • (2006) Speech Commun. , vol.48 , Issue.12 , pp. 1691-1703
    • Přibilová, A.1    Přibil, J.2
  • 190
    • 84865763441
    • A study on bag of Gaussian model with application to voice conversion.
    • Qiao, Y., Tong, T., Minematsu, N., A study on bag of Gaussian model with application to voice conversion. Proceedings of the INTERSPEECH, 2011, 657–660.
    • (2011) Proceedings of the INTERSPEECH , pp. 657-660
    • Qiao, Y.1    Tong, T.2    Minematsu, N.3
  • 193
    • 85036464413
    • Novel pre-processing using outlier removal in voice conversion
    • Rao, S.V., Shah, N.J., Patil, H.A., Novel pre-processing using outlier removal in voice conversion. Proceedings of the SSW, 2016.
    • (2016) Proceedings of the SSW
    • Rao, S.V.1    Shah, N.J.2    Patil, H.A.3
  • 195
    • 0030359624
    • Voice conversion based on topological feature maps and time-variant filtering
    • Rinscheid, A., Voice conversion based on topological feature maps and time-variant filtering. Proceedings of the ICSLP, 1996.
    • (1996) Proceedings of the ICSLP
    • Rinscheid, A.1
  • 196
    • 0034847662
    • Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs
    • Rix, A.W., Beerends, J.G., Hollier, M.P., Hekstra, A.P., Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs. Proceedings of the ICASSP, 2001.
    • (2001) Proceedings of the ICASSP
    • Rix, A.W.1    Beerends, J.G.2    Hollier, M.P.3    Hekstra, A.P.4
  • 199
    • 33748468528
    • Dynamic programming approach to voice transformation
    • Salor, Ö., Demirekler, M., Dynamic programming approach to voice transformation. Speech Commun. 48:10 (2006), 1262–1272.
    • (2006) Speech Commun. , vol.48 , Issue.10 , pp. 1262-1272
    • Salor, Ö.1    Demirekler, M.2
  • 202
    • 70450149422
    • Voice conversion based on mapping formants
    • Shuang, Z., Bakis, R., Qin, Y., Voice conversion based on mapping formants. TC-STAR WSST, 2006, 219–223.
    • (2006) TC-STAR WSST , pp. 219-223
    • Shuang, Z.1    Bakis, R.2    Qin, Y.3
  • 203
    • 51449112440
    • Voice conversion by combining frequency warping with unit selection
    • Shuang, Z., Meng, F., Qin, Y., Voice conversion by combining frequency warping with unit selection. Proceedings of the ICASSP, 2008.
    • (2008) Proceedings of the ICASSP
    • Shuang, Z.1    Meng, F.2    Qin, Y.3
  • 204
  • 205
    • 80053068819
    • Voice conversion using support vector regression
    • Song, P., Bao, Y., Zhao, L., Zou, C., Voice conversion using support vector regression. Electron. Lett. 47:18 (2011), 1045–1046.
    • (2011) Electron. Lett. , vol.47 , Issue.18 , pp. 1045-1046
    • Song, P.1    Bao, Y.2    Zhao, L.3    Zou, C.4
  • 211
    • 84946027999
    • Voice conversion using deep bidirectional long short-term memory based recurrent neural networks
    • Sun, L., Kang, S., Li, K., Meng, H., Voice conversion using deep bidirectional long short-term memory based recurrent neural networks. Proceedings of the ICASSP, 2015.
    • (2015) Proceedings of the ICASSP
    • Sun, L.1    Kang, S.2    Li, K.3    Meng, H.4
  • 212
    • 78650542860
    • Voice conversion: State-of-the-art and future work
    • Sündermann, D., Voice conversion: State-of-the-art and future work. Fortschritte der Akustik, 31(2), 2005, 735.
    • (2005) Fortschritte der Akustik , vol.31 , Issue.2 , pp. 735
    • Sündermann, D.1
  • 213
    • 84888623995
    • Text-independent voice conversion
    • Universitätsbibliothek der Universität der Bundeswehr München Ph.D. thesis.
    • Sündermann, D., Text-independent voice conversion, 2008, Universitätsbibliothek der Universität der Bundeswehr München Ph.D. thesis.
    • (2008)
    • Sündermann, D.1
  • 220
    • 84946794248
    • An automatic segmentation and mapping approach for voice conversion parameter training
    • Sündermann, D., Ney, H., An automatic segmentation and mapping approach for voice conversion parameter training. Proceedings of the AST, 2003.
    • (2003) Proceedings of the AST
    • Sündermann, D.1    Ney, H.2
  • 224
    • 84959166270
    • Modulation spectrum-constrained trajectory training algorithm for GMM-based voice conversion
    • Takamichi, S., Toda, T., Black, A.W., Nakamura, S., Modulation spectrum-constrained trajectory training algorithm for GMM-based voice conversion. Proceedings of the ICASSP, 2015.
    • (2015) Proceedings of the ICASSP
    • Takamichi, S.1    Toda, T.2    Black, A.W.3    Nakamura, S.4
  • 227
    • 80051619373
    • One sentence voice adaptation using GMM-based frequency-warping and shift with a sub-band basis spectrum model
    • Tamura, M., Morita, M., Kagoshima, T., Akamine, M., One sentence voice adaptation using GMM-based frequency-warping and shift with a sub-band basis spectrum model. Proceedings of the ICASSP, 2011.
    • (2011) Proceedings of the ICASSP
    • Tamura, M.1    Morita, M.2    Kagoshima, T.3    Akamine, M.4
  • 228
    • 84905244240
    • A hybrid approach to electrolaryngeal speech enhancement based on spectral subtraction and statistical voice conversion.
    • Tanaka, K., Toda, T., Neubig, G., Sakti, S., Nakamura, S., A hybrid approach to electrolaryngeal speech enhancement based on spectral subtraction and statistical voice conversion. Proceedings of the INTERSPEECH, 2013.
    • (2013) Proceedings of the INTERSPEECH
    • Tanaka, K.1    Toda, T.2    Neubig, G.3    Sakti, S.4    Nakamura, S.5
  • 230
    • 34047263010
    • Prosody conversion from neutral speech to emotional speech
    • Tao, J., Kang, Y., Li, A., Prosody conversion from neutral speech to emotional speech. IEEE Trans. Audio Speech Lang. Process. 14:4 (2006), 1145–1154.
    • (2006) IEEE Trans. Audio Speech Lang. Process. , vol.14 , Issue.4 , pp. 1145-1154
    • Tao, J.1    Kang, Y.2    Li, A.3
  • 233
    • 84912079352
    • Correlation-based frequency warping for voice conversion
    • IEEE
    • Tian, X., Wu, Z., Lee, S., Chng, E.S., Correlation-based frequency warping for voice conversion. Proceedings of the ISCSLP, 2014, IEEE, 211–215.
    • (2014) Proceedings of the ISCSLP , pp. 211-215
    • Tian, X.1    Wu, Z.2    Lee, S.3    Chng, E.S.4
  • 236
    • 0003747605
    • Statistical Analysis of Finite Mixture Distributions
    • Wiley New York
    • Titterington, D.M., Smith, A.F., Makov, U.E., et al. Statistical Analysis of Finite Mixture Distributions. Vol. 7, 1985, Wiley New York.
    • (1985) , vol.7
    • Titterington, D.M.1    Smith, A.F.2    Makov, U.E.3
  • 237
  • 238
    • 33646779506
    • Spectral conversion based on maximum likelihood estimation considering global variance of converted parameter.
    • Toda, T., Black, A.W., Tokuda, K., Spectral conversion based on maximum likelihood estimation considering global variance of converted parameter. Proceedings of the ICASSP, 2005.
    • (2005) Proceedings of the ICASSP
    • Toda, T.1    Black, A.W.2    Tokuda, K.3
  • 239
    • 57749193836
    • Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory
    • Toda, T., Black, A.W., Tokuda, K., Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory. IEEE Trans. Audio Speech Lang. Process. 15:8 (2007), 2222–2235.
    • (2007) IEEE Trans. Audio Speech Lang. Process. , vol.15 , Issue.8 , pp. 2222-2235
    • Toda, T.1    Black, A.W.2    Tokuda, K.3
  • 240
    • 38649140222
    • Statistical mapping between articulatory movements and acoustic spectrum using a Gaussian mixture model
    • Toda, T., Black, A.W., Tokuda, K., Statistical mapping between articulatory movements and acoustic spectrum using a Gaussian mixture model. Speech Commun. 50:3 (2008), 215–227.
    • (2008) Speech Commun. , vol.50 , Issue.3 , pp. 215-227
    • Toda, T.1    Black, A.W.2    Tokuda, K.3
  • 241
  • 242
    • 84865698185
    • Statistical voice conversion techniques for body-conducted unvoiced speech enhancement
    • Toda, T., Nakagiri, M., Shikano, K., Statistical voice conversion techniques for body-conducted unvoiced speech enhancement. IEEE Trans. Audio Speech Lang. Process. 20:9 (2012), 2505–2517.
    • (2012) IEEE Trans. Audio Speech Lang. Process. , vol.20 , Issue.9 , pp. 2505-2517
    • Toda, T.1    Nakagiri, M.2    Shikano, K.3
  • 245
    • 34547496175
    • One-to-many and many-to-one voice conversion based on eigenvoices
    • Toda, T., Ohtani, Y., Shikano, K., One-to-many and many-to-one voice conversion based on eigenvoices. Proceedings of the ICASSP, 2007.
    • (2007) Proceedings of the ICASSP
    • Toda, T.1    Ohtani, Y.2    Shikano, K.3
  • 247
    • 0034842552
    • Voice conversion algorithm based on Gaussian mixture model with dynamic frequency warping of STRAIGHT spectrum
    • Toda, T., Saruwatari, H., Shikano, K., Voice conversion algorithm based on Gaussian mixture model with dynamic frequency warping of STRAIGHT spectrum. Proceedings of the ICASSP, 2001.
    • (2001) Proceedings of the ICASSP
    • Toda, T.1    Saruwatari, H.2    Shikano, K.3
  • 250
    • 84946077883
    • Directly modeling speech waveforms by neural networks for statistical parametric speech synthesis
    • Tokuda, K., Zen, H., Directly modeling speech waveforms by neural networks for statistical parametric speech synthesis. Proceedings of the ICASSP, 2015.
    • (2015) Proceedings of the ICASSP
    • Tokuda, K.1    Zen, H.2
  • 251
    • 76849105528
    • Improvement to a NAM-captured whisper-to-speech system
    • Tran, V.-A., Bailly, G., Lœvenbruck, H., Toda, T., Improvement to a NAM-captured whisper-to-speech system. Speech Commun. 52:4 (2010), 314–326.
    • (2010) Speech Commun. , vol.52 , Issue.4 , pp. 314-326
    • Tran, V.-A.1    Bailly, G.2    Lœvenbruck, H.3    Toda, T.4
  • 252
  • 254
    • 85009179173
    • Voice conversion methods for vocal tract and pitch contour modification.
    • Türk, O., Arslan, L.M., Voice conversion methods for vocal tract and pitch contour modification. Proceedings of the INTERSPEECH, 2003.
    • (2003) Proceedings of the INTERSPEECH
    • Türk, O.1    Arslan, L.M.2
  • 256
    • 33746653351
    • Robust processing techniques for voice conversion
    • Turk, O., Arslan, L.M., Robust processing techniques for voice conversion. Comput. Speech Lang. 20:4 (2006), 441–467.
    • (2006) Comput. Speech Lang. , vol.20 , Issue.4 , pp. 441-467
    • Turk, O.1    Arslan, L.M.2
  • 258
    • 84867219635
    • A comparison of voice conversion methods for transforming voice quality in emotional speech synthesis.
    • Türk, O., Schröder, M., A comparison of voice conversion methods for transforming voice quality in emotional speech synthesis. Proceedings of the INTERSPEECH, 2008.
    • (2008) Proceedings of the INTERSPEECH
    • Türk, O.1    Schröder, M.2
  • 259
    • 77953699443
    • Evaluation of expressive speech synthesis with voice conversion and copy resynthesis techniques
    • Turk, O., Schroder, M., Evaluation of expressive speech synthesis with voice conversion and copy resynthesis techniques. IEEE Trans. Audio Speech Lang. Process. 18:5 (2010), 965–973.
    • (2010) IEEE Trans. Audio Speech Lang. Process. , vol.18 , Issue.5 , pp. 965-973
    • Turk, O.1    Schroder, M.2
  • 260
    • 34547806096
    • A self-organizing map with twin units capable of describing a nonlinear input–output relation applied to speech code vector mapping
    • Uchino, E., Yano, K., Azetsu, T., A self-organizing map with twin units capable of describing a nonlinear input–output relation applied to speech code vector mapping. Inf. Sci. 177:21 (2007), 4634–4644.
    • (2007) Inf. Sci. , vol.177 , Issue.21 , pp. 4634-4644
    • Uchino, E.1    Yano, K.2    Azetsu, T.3
  • 262
    • 85010316399
    • Voice Conversion Using Frame Selection
    • Reporte Interno Laboratorio de Comunicaciones-UNMdP
    • Uriz, A., Agüero, P.D., Erro, D., Bonafonte, A., Voice Conversion Using Frame Selection. 2008 Reporte Interno Laboratorio de Comunicaciones-UNMdP.
    • (2008)
    • Uriz, A.1    Agüero, P.D.2    Erro, D.3    Bonafonte, A.4
  • 266
    • 0026880275
    • Voice transformation using PSOLA technique
    • Valbret, H., Moulines, E., Tubach, J.P., Voice transformation using PSOLA technique. Speech Commun. 11:2 (1992), 175–187.
    • (1992) Speech Commun. , vol.11 , Issue.2 , pp. 175-187
    • Valbret, H.1    Moulines, E.2    Tubach, J.P.3
  • 268
  • 269
    • 77956828655
    • Voice fonts for individuality representation and transformation
    • Verma, A., Kumar, A., Voice fonts for individuality representation and transformation. ACM Trans. Speech Lang. Process. (TSLP), 2(1), 2005, 4.
    • (2005) ACM Trans. Speech Lang. Process. (TSLP) , vol.2 , Issue.1 , pp. 4
    • Verma, A.1    Kumar, A.2
  • 271
    • 84960864752
    • Observation-model error compensation for enhanced spectral envelope transformation in voice conversion
    • Villavicencio, F., Bonada, J., Hisaminato, Y., Observation-model error compensation for enhanced spectral envelope transformation in voice conversion. Proceedings of the MLSP, 2015.
    • (2015) Proceedings of the MLSP
    • Villavicencio, F.1    Bonada, J.2    Hisaminato, Y.3
  • 272
    • 34547541173
    • A new method for speech synthesis and transformation based on an ARX-LF source-filter decomposition and HNM modeling
    • Vincent, D., Rosec, O., Chonavel, T., A new method for speech synthesis and transformation based on an ARX-LF source-filter decomposition and HNM modeling. Proceedings of the ICASSP, 2007.
    • (2007) Proceedings of the ICASSP
    • Vincent, D.1    Rosec, O.2    Chonavel, T.3
  • 273
    • 0003557444
    • Verbmobil: Foundations of Speech-to-Speech Translation
    • Springer Science & Business Media
    • Wahlster, W., Verbmobil: Foundations of Speech-to-Speech Translation. 2000, Springer Science & Business Media.
    • (2000)
    • Wahlster, W.1
  • 274
    • 84902959938
    • Emotional voice conversion for Mandarin using tone nucleus model–small corpus and high efficiency
    • Wang, M., Wen, M., Hirose, K., Minematsu, N., Emotional voice conversion for Mandarin using tone nucleus model–small corpus and high efficiency. Proceedings of the Speech Prosody, 2012.
    • (2012) Proceedings of the Speech Prosody
    • Wang, M.1    Wen, M.2    Hirose, K.3    Minematsu, N.4
  • 275
    • 84988295463
    • Multi-level prosody and spectrum conversion for emotional speech synthesis
    • Wang, Z., Yu, Y., Multi-level prosody and spectrum conversion for emotional speech synthesis. Proceedings of the ICSP, 2014.
    • (2014) Proceedings of the ICSP
    • Wang, Z.1    Yu, Y.2
  • 276
    • 85009266993
    • Transformation of spectral envelope for voice conversion based on radial basis function networks
    • Watanabe, T., Murakami, T., Namba, M., Hoya, T., Ishida, Y., Transformation of spectral envelope for voice conversion based on radial basis function networks. Proceedings of the ICSLP, 2002.
    • (2002) Proceedings of the ICSLP
    • Watanabe, T.1    Murakami, T.2    Namba, M.3    Hoya, T.4    Ishida, Y.5
  • 278
    • 85133164470
    • Multidimensional scaling of systems in the Voice Conversion Challenge 2016
    • Wester, M., Wu, Z., Yamagishi, J., Multidimensional scaling of systems in the Voice Conversion Challenge 2016. Proceedings of the SSW, 2016.
    • (2016) Proceedings of the SSW
    • Wester, M.1    Wu, Z.2    Yamagishi, J.3
  • 279
    • 33646815712
    • The MOCHA-TIMIT articulatory database
    • Queen Margaret University College
    • Wrench, A., The MOCHA-TIMIT articulatory database. 1999, Queen Margaret University College.
    • (1999)
    • Wrench, A.1
  • 280
    • 34047247202
    • Voice conversion using duration-embedded bi-HMMs for expressive speech synthesis
    • Wu, C.-H., Hsia, C.-C., Liu, T.-H., Wang, J.-F., Voice conversion using duration-embedded bi-HMMs for expressive speech synthesis. IEEE Trans. Audio Speech Lang. Process. 14:4 (2006), 1109–1116.
    • (2006) IEEE Trans. Audio Speech Lang. Process. , vol.14 , Issue.4 , pp. 1109-1116
    • Wu, C.-H.1    Hsia, C.-C.2    Liu, T.-H.3    Wang, J.-F.4
  • 282
    • 84889579519
    • Conditional restricted Boltzmann machine for voice conversion
    • Wu, Z., Chng, E.S., Li, H., Conditional restricted Boltzmann machine for voice conversion. Proceedings of the ChinaSIP, 2013.
    • (2013) Proceedings of the ChinaSIP
    • Wu, Z.1    Chng, E.S.2    Li, H.3
  • 283
    • 84910071877
    • Joint nonnegative matrix factorization for exemplar-based voice conversion
    • Wu, Z., Chng, E.S., Li, H., Joint nonnegative matrix factorization for exemplar-based voice conversion. Proceedings of the INTERSPEECH, 2014.
    • (2014) Proceedings of the INTERSPEECH
    • Wu, Z.1    Chng, E.S.2    Li, H.3
  • 284
    • 79959842826
    • Text-independent F0 transformation with non-parallel data for voice conversion.
    • Wu, Z., Kinnunen, T., Chng, E., Li, H., Text-independent F0 transformation with non-parallel data for voice conversion. Proceedings of the INTERSPEECH, 2010.
    • (2010) Proceedings of the INTERSPEECH
    • Wu, Z.1    Kinnunen, T.2    Chng, E.3    Li, H.4
  • 285
    • 84869384026
    • Mixture of factor analyzers using priors from non-parallel speech for voice conversion
    • Wu, Z., Kinnunen, T., Chng, E.S., Li, H., Mixture of factor analyzers using priors from non-parallel speech for voice conversion. IEEE Signal Process. Lett. 19:12 (2012), 914–917.
    • (2012) IEEE Signal Process. Lett. , vol.19 , Issue.12 , pp. 914-917
    • Wu, Z.1    Kinnunen, T.2    Chng, E.S.3    Li, H.4
  • 286
    • 84906275384
    • Vulnerability evaluation of speaker verification under voice conversion spoofing: the effect of text constraints.
    • Wu, Z., Larcher, A., Lee, K.-A., Chng, E., Kinnunen, T., Li, H., Vulnerability evaluation of speaker verification under voice conversion spoofing: the effect of text constraints. Proceedings of the INTERSPEECH, 2013.
    • (2013) Proceedings of the INTERSPEECH
    • Wu, Z.1    Larcher, A.2    Lee, K.-A.3    Chng, E.4    Kinnunen, T.5    Li, H.6
  • 287
    • 84956723787
    • Voice conversion versus speaker verification: an overview
    • Wu, Z., Li, H., Voice conversion versus speaker verification: an overview. APSIPA Trans. Signal Inf. Process., 3, 2014, e17.
    • (2014) APSIPA Trans. Signal Inf. Process. , vol.3 , pp. e17
    • Wu, Z.1    Li, H.2
  • 288
    • 84911369131
    • Exemplar-based sparse representation with residual compensation for voice conversion
    • Wu, Z., Virtanen, T., Chng, E.S., Li, H., Exemplar-based sparse representation with residual compensation for voice conversion. IEEE/ACM Trans. Audio Speech Lang. Process. (TASLP) 22:10 (2014), 1506–1521.
    • (2014) IEEE/ACM Trans. Audio Speech Lang. Process. (TASLP) , vol.22 , Issue.10 , pp. 1506-1521
    • Wu, Z.1    Virtanen, T.2    Chng, E.S.3    Li, H.4
  • 293
    • 84890539284
    • Voice conversion based on Gaussian processes by coherent and asymmetric training with limited training data
    • Xu, N., Tang, Y., Bao, J., Jiang, A., Liu, X., Yang, Z., Voice conversion based on Gaussian processes by coherent and asymmetric training with limited training data. Speech Commun. 58 (2014), 124–138.
    • (2014) Speech Commun. , vol.58 , pp. 124-138
    • Xu, N.1    Tang, Y.2    Bao, J.3    Jiang, A.4    Liu, X.5    Yang, Z.6
  • 294
    • 84855906479
    • Speech synthesis technologies for individuals with vocal disabilities: voice banking and reconstruction
    • Yamagishi, J., Veaux, C., King, S., Renals, S., Speech synthesis technologies for individuals with vocal disabilities: voice banking and reconstruction. Acoust. Sci. Technol. 33:1 (2012), 1–5.
    • (2012) Acoust. Sci. Technol. , vol.33 , Issue.1 , pp. 1-5
    • Yamagishi, J.1    Veaux, C.2    King, S.3    Renals, S.4
  • 295
    • 85009224898
    • Perceptually weighted linear transformations for voice conversion
    • Ye, H., Young, S., Perceptually weighted linear transformations for voice conversion. Proceedings of the INTERSPEECH, 2003.
    • (2003) Proceedings of the INTERSPEECH
    • Ye, H.1    Young, S.2
  • 297
    • 34047254509
    • Quality-enhanced voice morphing using maximum likelihood transformations
    • Ye, H., Young, S., Quality-enhanced voice morphing using maximum likelihood transformations. IEEE Trans. Audio Speech Lang. Process. 14:4 (2006), 1301–1312.
    • (2006) IEEE Trans. Audio Speech Lang. Process. , vol.14 , Issue.4 , pp. 1301-1312
    • Ye, H.1    Young, S.2
  • 300
    • 78149260085
    • Continuous stochastic feature mapping based on trajectory HMMs
    • Zen, H., Nankaku, Y., Tokuda, K., Continuous stochastic feature mapping based on trajectory HMMs. IEEE Trans. Audio Speech Lang. Process. 19:2 (2011), 417–430.
    • (2011) IEEE Trans. Audio Speech Lang. Process. , vol.19 , Issue.2 , pp. 417-430
    • Zen, H.1    Nankaku, Y.2    Tokuda, K.3
  • 301
    • 33646812247
    • Voice conversion based on weighted least squares estimation criterion and residual prediction from pitch contour
    • Springer
    • Zhang, J., Sun, J., Dai, B., Voice conversion based on weighted least squares estimation criterion and residual prediction from pitch contour. Affective Computing and Intelligent Interaction, 2005, Springer, 326–333.
    • (2005) Affective Computing and Intelligent Interaction , pp. 326-333
    • Zhang, J.1    Sun, J.2    Dai, B.3
  • 303
  • 305
    • 84871520443
    • Improving the quality of standard GMM-based voice conversion systems by considering physically motivated linear transformations
    • Springer
    • Zorilă, T.-C., Erro, D., Hernaez, I., Improving the quality of standard GMM-based voice conversion systems by considering physically motivated linear transformations. Advances in Speech and Language Technologies for Iberian Languages, 2012, Springer, 30–39.
    • (2012) Advances in Speech and Language Technologies for Iberian Languages , pp. 30-39
    • Zorilă, T.-C.1    Erro, D.2    Hernaez, I.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.