SCOPUS Information Search Platform

Speech Communication

Volume 88, 2017, Pages 65-82

An overview of voice conversion systems

Mohammadi, Seyed Hamidreza (a); Kain, Alexander (a)

(a) Oregon Health and Science University (United States)

Author keywords

Overview; Survey; Voice conversion

Indexed keywords

COMPUTER SIMULATION; SURVEYING;

EXISTING SYSTEMS; LINGUISTIC INFORMATION; OVERVIEW; REAL-WORLD; SPEECH QUALITY; SPEECH SIGNALS; TARGET SPEAKER; VOICE CONVERSION;

COMPUTER APPLICATIONS;

EID: 85010399617 PISSN: 01676393 EISSN: None Source Type: Journal
DOI: 10.1016/j.specom.2017.01.008 Document Type: Review

Times cited : (305)

References (305)

1
- 0023739214
- Voice conversion through vector quantization
- Abe, M., Nakamura, S., Shikano, K., Kuwabara, H., Voice conversion through vector quantization. Proceedings of the ICASSP, 1988.
- (1988) Proceedings of the ICASSP
- Abe, M.¹ Nakamura, S.² Shikano, K.³ Kuwabara, H.⁴

2
- 84930664922
- VOCAINE the vocoder and applications in speech synthesis
- Agiomyrgiannakis, Y., VOCAINE the vocoder and applications in speech synthesis. Proceedings of the ICASSP, 2015.
- (2015) Proceedings of the ICASSP
- Agiomyrgiannakis, Y.¹

3
- 70349208681
- ARX-LF-based source-filter methods for voice modification and transformation
- Agiomyrgiannakis, Y., Rosec, O., ARX-LF-based source-filter methods for voice modification and transformation. Proceedings of the ICASSP, 2009.
- (2009) Proceedings of the ICASSP
- Agiomyrgiannakis, Y.¹ Rosec, O.²

4
- 84905227265
- Voice conversion based on non-negative matrix factorization using phoneme-categorized dictionary
- Aihara, R., Nakashika, T., Takiguchi, T., Ariki, Y., Voice conversion based on non-negative matrix factorization using phoneme-categorized dictionary. Proceedings of the ICASSP, 2014.
- (2014) Proceedings of the ICASSP
- Aihara, R.¹ Nakashika, T.² Takiguchi, T.³ Ariki, Y.⁴

5
- 84890519936
- Individuality-preserving voice conversion for articulation disorders based on non-negative matrix factorization
- Aihara, R., Takashima, R., Takiguchi, T., Ariki, Y., Individuality-preserving voice conversion for articulation disorders based on non-negative matrix factorization. Proceedings of the ICASSP, 2013.
- (2013) Proceedings of the ICASSP
- Aihara, R.¹ Takashima, R.² Takiguchi, T.³ Ariki, Y.⁴

6
- 84946095434
- Activity-mapping non-negative matrix factorization for exemplar-based voice conversion
- Aihara, R., Takiguchi, T., Ariki, Y., Activity-mapping non-negative matrix factorization for exemplar-based voice conversion. Proceedings of the ICASSP, 2015.
- (2015) Proceedings of the ICASSP
- Aihara, R.¹ Takiguchi, T.² Ariki, Y.³

7
- 84959090646
- Many-to-many voice conversion based on multiple non-negative matrix factorization
- Aihara, R., Takiguchi, T., Ariki, Y., Many-to-many voice conversion based on multiple non-negative matrix factorization. Proceedings of the INTERSPEECH, 2015.
- (2015) Proceedings of the INTERSPEECH
- Aihara, R.¹ Takiguchi, T.² Ariki, Y.³

8
- 84949924136
- Exemplar-based emotional voice conversion using non-negative matrix factorization
- Aihara, R., Ueda, R., Takiguchi, T., Ariki, Y., Exemplar-based emotional voice conversion using non-negative matrix factorization. Proceedings of the APSIPA, 2014, 10.1109/APSIPA.2014.7041640.
- (2014) Proceedings of the APSIPA
- Aihara, R.¹ Ueda, R.² Takiguchi, T.³ Ariki, Y.⁴

9
- 84890542394
- Spoofing countermeasures to protect automatic speaker verification from voice conversion
- Alegre, F., Amehraye, A., Evans, N., Spoofing countermeasures to protect automatic speaker verification from voice conversion. Proceedings of the ICASSP, 2013.
- (2013) Proceedings of the ICASSP
- Alegre, F.¹ Amehraye, A.² Evans, N.³

10
- 84878399314
- Festvox: Tools for creation and analyses of large speech corpora
- Anumanchipalli, G.K., Prahallad, K., Black, A.W., Festvox: Tools for creation and analyses of large speech corpora. Workshop on Very Large Scale Phonetics Research, UPenn, Philadelphia, 2011.
- (2011) Workshop on Very Large Scale Phonetics Research, UPenn, Philadelphia
- Anumanchipalli, G.K.¹ Prahallad, K.² Black, A.W.³

11
- 0033154052
- Speaker transformation algorithm using segmental codebooks (STASC)
- Arslan, L.M., Speaker transformation algorithm using segmental codebooks (STASC). Speech Commun. 28:3 (1999), 211–226.
- (1999) Speech Commun. , vol.28 , Issue.3 , pp. 211-226
- Arslan, L.M.¹

12
- 84863268465
- Voice conversion by codebook mapping of line spectral frequencies and excitation spectrum
- Arslan, L.M., Talkin, D., Voice conversion by codebook mapping of line spectral frequencies and excitation spectrum. Proceedings of the EUROSPEECH, 1997.
- (1997) Proceedings of the EUROSPEECH
- Arslan, L.M.¹ Talkin, D.²

13
- 0031643805
- Speaker transformation using sentence HMM based alignments and detailed prosody modification
- Arslan, L.M., Talkin, D., Speaker transformation using sentence HMM based alignments and detailed prosody modification. Proceedings of the ICASSP, 1998.
- (1998) Proceedings of the ICASSP
- Arslan, L.M.¹ Talkin, D.²

14
- 84905277636
- Foreign accent conversion through voice morphing
- Aryal, S., Felps, D., Gutierrez-Osuna, R., Foreign accent conversion through voice morphing. Proceedings of the INTERSPEECH, 2013.
- (2013) Proceedings of the INTERSPEECH
- Aryal, S.¹ Felps, D.² Gutierrez-Osuna, R.³

15
- 84906281619
- Real-time voice conversion using artificial neural networks with rectified linear units
- Azarov, E., Vashkevich, M., Likhachov, D., Petrovsky, A., Real-time voice conversion using artificial neural networks with rectified linear units. Proceedings of the INTERSPEECH, 2013.
- (2013) Proceedings of the INTERSPEECH
- Azarov, E.¹ Vashkevich, M.² Likhachov, D.³ Petrovsky, A.⁴

16
- 56149092168
- On the limitations of voice conversion techniques in emotion identification tasks
- Barra, R., Montero, J.M., Macias-Guarasa, J., Gutiérrez-Arriola, J., Ferreiros, J., Pardo, J.M., On the limitations of voice conversion techniques in emotion identification tasks. Proceedings of the INTERSPEECH, 2007.
- (2007) Proceedings of the INTERSPEECH
- Barra, R.¹ Montero, J.M.² Macias-Guarasa, J.³ Gutiérrez-Arriola, J.⁴ Ferreiros, J.⁵ Pardo, J.M.⁶

17
- 84865754815
- Voice conversion using GMM with enhanced global variance
- Benisty, H., Malah, D., Voice conversion using GMM with enhanced global variance. Proceedings of the INTERSPEECH, 2011.
- (2011) Proceedings of the INTERSPEECH
- Benisty, H.¹ Malah, D.²

18
- 84941241899
- Sequential voice conversion using grid-based approximation
- Benisty, H., Malah, D., Crammer, K., Sequential voice conversion using grid-based approximation. Proceedings of the IEEEI, 2014.
- (2014) Proceedings of the IEEEI
- Benisty, H.¹ Malah, D.² Crammer, K.³

19
- 0031104132
- Application of speech conversion to alaryngeal speech enhancement
- Bi, N., Qi, Y., Application of speech conversion to alaryngeal speech enhancement. IEEE Trans. Speech Audio Process. 5:2 (1997), 97–105.
- (1997) IEEE Trans. Speech Audio Process. , vol.5 , Issue.2 , pp. 97-105
- Bi, N.¹ Qi, Y.²

20
- 79961212205
- TC-STAR: Specifications of language resources and evaluation for speech synthesis
- Bonafonte, A., Höge, H., Kiss, I., Moreno, A., Ziegenhain, U., van den Heuvel, H., Hain, H.-U., Wang, X.S., Garcia, M.-N., TC-STAR: Specifications of language resources and evaluation for speech synthesis. Proceedings of the LREC, 2006.
- (2006) Proceedings of the LREC
- Bonafonte, A.¹ Höge, H.² Kiss, I.³ Moreno, A.⁴ Ziegenhain, U.⁵ van den Heuvel, H.⁶ Hain, H.-U.⁷ Wang, X.S.⁸ Garcia, M.-N.⁹

21
- 85009161180
- Voice morphing system for impersonating in karaoke applications
- Cano, P., Loscos, A., Bonada, J., De Boer, M., Serra, X., Voice morphing system for impersonating in karaoke applications. Proceedings of the ICMC, 2000.
- (2000) Proceedings of the ICMC
- Cano, P.¹ Loscos, A.² Bonada, J.³ De Boer, M.⁴ Serra, X.⁵

22
- 5444259197
- On the construction of a pitch conversion system
- Ceyssens, T., Verhelst, W., Wambacq, P., On the construction of a pitch conversion system. Proceedings of the EUSIPCO, 2002.
- (2002) Proceedings of the EUSIPCO
- Ceyssens, T.¹ Verhelst, W.² Wambacq, P.³

23
- 5444243681
- Speaker-specific pitch contour modeling and modification
- Chappell, D.T., Hansen, J.H., Speaker-specific pitch contour modeling and modification. Proceedings of the ICASSP, 1998.
- (1998) Proceedings of the ICASSP
- Chappell, D.T.¹ Hansen, J.H.²

24
- 84910104946
- Voice conversion using generative trained deep neural networks with multiple frame spectral envelopes
- Chen, L.-H., Ling, Z.-H., Dai, L.-R., Voice conversion using generative trained deep neural networks with multiple frame spectral envelopes. Proceedings of the INTERSPEECH, 2014.
- (2014) Proceedings of the INTERSPEECH
- Chen, L.-H.¹ Ling, Z.-H.² Dai, L.-R.³

25
- 84921735339
- Voice conversion using deep neural networks with layer-wise generative training
- Chen, L.-H., Ling, Z.-H., Liu, L.-J., Dai, L.-R., Voice conversion using deep neural networks with layer-wise generative training. IEEE/ACM Trans. Audio Speech Language Process. (TASLP) 22:12 (2014), 1859–1872.
- (2014) IEEE/ACM Trans. Audio Speech Language Process. (TASLP) , vol.22 , Issue.12 , pp. 1859-1872
- Chen, L.-H.¹ Ling, Z.-H.² Liu, L.-J.³ Dai, L.-R.⁴

26
- 84906225084
- Joint spectral distribution modeling using restricted boltzmann machines for voice conversion
- Chen, L.-H., Ling, Z.-H., Song, Y., Dai, L.-R., Joint spectral distribution modeling using restricted boltzmann machines for voice conversion. Proceedings of the INTERSPEECH, 2013.
- (2013) Proceedings of the INTERSPEECH
- Chen, L.-H.¹ Ling, Z.-H.² Song, Y.³ Dai, L.-R.⁴

27
- 84994337398
- The USTC system for voice conversion challenge 2016: neural network based approaches for spectrum, aperiodicity and F0 conversion
- Chen, L.-H., Liu, L.-J., Ling, Z.-H., Jiang, Y., Dai, L.-R., The USTC system for voice conversion challenge 2016: neural network based approaches for spectrum, aperiodicity and F0 conversion. Proceedings of the INTERSPEECH, 2016.
- (2016) Proceedings of the INTERSPEECH
- Chen, L.-H.¹ Liu, L.-J.² Ling, Z.-H.³ Jiang, Y.⁴ Dai, L.-R.⁵

28
- 84905560807
- Voice conversion with smoothed GMM and MAP adaptation
- Chen, Y., Chu, M., Chang, E., Liu, J., Liu, R., Voice conversion with smoothed GMM and MAP adaptation. Proceedings of the EUROSPEECH, 2003.
- (2003) Proceedings of the EUROSPEECH
- Chen, Y.¹ Chu, M.² Chang, E.³ Liu, J.⁴ Liu, R.⁵

29
- 0022203520
- Voice conversion: Factors responsible for quality
- Childers, D., Yegnanarayana, B., Wu, K., Voice conversion: Factors responsible for quality. Proceedings of the ICASSP, 1985.
- (1985) Proceedings of the ICASSP
- Childers, D.¹ Yegnanarayana, B.² Wu, K.³

30
- 0029253818
- Glottal source modeling for voice conversion
- Childers, D.G., Glottal source modeling for voice conversion. Speech Commun. 16:2 (1995), 127–138.
- (1995) Speech Commun. , vol.16 , Issue.2 , pp. 127-138
- Childers, D.G.¹

31
- 0024680919
- Voice conversion
- Childers, D.G., Wu, K., Hicks, D., Yegnanarayana, B., Voice conversion. Speech Commun. 8:2 (1989), 147–158.
- (1989) Speech Commun. , vol.8 , Issue.2 , pp. 147-158
- Childers, D.G.¹ Wu, K.² Hicks, D.³ Yegnanarayana, B.⁴

32
- 85010346137
- Instituto Superior Técnico Master's Thesis
- Correia, M.J.R.F., Anti-Spoofing: Speaker Verification vs. Voice Conversion, 2014, Instituto Superior Técnico Master's Thesis.
- (2014) Anti-Spoofing: Speaker Verification vs. Voice Conversion
- Correia, M.J.R.F.¹

33
- 84867216755
- The linear transformation of LF glottal waveforms for voice conversion
- Del Pozo, A., Young, S., The linear transformation of LF glottal waveforms for voice conversion. Proceedings of the INTERSPEECH, 2008.
- (2008) Proceedings of the INTERSPEECH
- Del Pozo, A.¹ Young, S.²

34
- 77953707533
- Spectral mapping using artificial neural networks for voice conversion
- Desai, S., Black, A.W., Yegnanarayana, B., Prahallad, K., Spectral mapping using artificial neural networks for voice conversion. IEEE Trans. Audio Speech Lang. Process. 18:5 (2010), 954–964.
- (2010) IEEE Trans. Audio Speech Lang. Process. , vol.18 , Issue.5 , pp. 954-964
- Desai, S.¹ Black, A.W.² Yegnanarayana, B.³ Prahallad, K.⁴

35
- 84874403435
- Singing voice conversion method based on many-to-many eigenvoice conversion and training data generation using a singing-to-singing synthesis system
- Doi, H., Toda, T., Nakano, T., Goto, M., Nakamura, S., Singing voice conversion method based on many-to-many eigenvoice conversion and training data generation using a singing-to-singing synthesis system. Proceedings of the APSIPA, 2012.
- (2012) Proceedings of the APSIPA
- Doi, H.¹ Toda, T.² Nakano, T.³ Goto, M.⁴ Nakamura, S.⁵

36
- 34547496196
- Towards a voice conversion system based on frame selection
- Dutoit, T., Holzapfel, A., Jottrand, M., Moinet, A., Perez, J., Stylianou, Y., Towards a voice conversion system based on frame selection. Proceedings of the ICASSP, 2007.
- (2007) Proceedings of the ICASSP
- Dutoit, T.¹ Holzapfel, A.² Jottrand, M.³ Moinet, A.⁴ Perez, J.⁵ Stylianou, Y.⁶

37
- 56149088450
- Universitat Politecnica de Catalunya, Barcelona, Spain Ph.D. thesis.
- Duxans, H., Voice Conversion applied to Text-to-Speech systems, 2006, Universitat Politecnica de Catalunya, Barcelona, Spain Ph.D. thesis.
- (2006) Voice Conversion applied to Text-to-Speech systems
- Duxans, H.¹

38
- 33947629275
- Residual conversion versus prediction on voice morphing systems
- Duxans, H., Bonafonte, A., Residual conversion versus prediction on voice morphing systems. Proceedings of the ICASSP, 2006.
- (2006) Proceedings of the ICASSP
- Duxans, H.¹ Bonafonte, A.²

39
- 84994241109
- Including dynamic and phonetic information in voice conversion systems
- Duxans, H., Bonafonte, A., Kain, A., Van Santen, J., Including dynamic and phonetic information in voice conversion systems. Proceedings of the ICSLP, 2004.
- (2004) Proceedings of the ICSLP
- Duxans, H.¹ Bonafonte, A.² Kain, A.³ Van Santen, J.⁴

40
- 79951758789
- Voice conversion of non-aligned data using unit selection
- Duxans, H., Erro, D., Pérez, J., Diego, F., Bonafonte, A., Moreno, A., Voice conversion of non-aligned data using unit selection. TC-STAR WSST, 2006.
- (2006) TC-STAR WSST
- Duxans, H.¹ Erro, D.² Pérez, J.³ Diego, F.⁴ Bonafonte, A.⁵ Moreno, A.⁶

41
- 84946210905
- A new method for pitch prediction from spectral envelope and its application in voice conversion
- En-Najjary, T., Rosec, O., Chonavel, T., A new method for pitch prediction from spectral envelope and its application in voice conversion. Proceedings of the INTERSPEECH, 2003.
- (2003) Proceedings of the INTERSPEECH
- En-Najjary, T.¹ Rosec, O.² Chonavel, T.³

42
- 85010449478
- A voice conversion method based on joint pitch and spectral envelope transformation
- En-Najjary, T., Rosec, O., Chonavel, T., A voice conversion method based on joint pitch and spectral envelope transformation. Proceedings of the INTERSPEECH, 2004.
- (2004) Proceedings of the INTERSPEECH
- En-Najjary, T.¹ Rosec, O.² Chonavel, T.³

43
- 77949522811
- Why does unsupervised pre-training help deep learning?
- Erhan, D., Bengio, Y., Courville, A., Manzagol, P.A., Vincent, P., Bengio, S., Why does unsupervised pre-training help deep learning? J. Mach. Learn. Res. 11 (2010), 625–660.
- (2010) J. Mach. Learn. Res. , vol.11 , pp. 625-660
- Erhan, D.¹ Bengio, Y.² Courville, A.³ Manzagol, P.A.⁴ Vincent, P.⁵ Bengio, S.⁶

44
- 84888241651
- Towards physically interpretable parametric voice conversion functions
- Springer
- Erro, D., Alonso, A., Serrano, L., Navas, E., Hernáez, I., Towards physically interpretable parametric voice conversion functions. Advances in Nonlinear Speech Processing, 2013, Springer, 75–82.
- (2013) Advances in Nonlinear Speech Processing , pp. 75-82
- Erro, D.¹ Alonso, A.² Serrano, L.³ Navas, E.⁴ Hernáez, I.⁵

45
- 84913585254
- Interpretable parametric voice conversion functions based on gaussian mixture models and constrained transformations
- Erro, D., Alonso, A., Serrano, L., Navas, E., Hernaez, I., Interpretable parametric voice conversion functions based on gaussian mixture models and constrained transformations. Comput. Speech Lang. 30:1 (2015), 3–15.
- (2015) Comput. Speech Lang. , vol.30 , Issue.1 , pp. 3-15
- Erro, D.¹ Alonso, A.² Serrano, L.³ Navas, E.⁴ Hernaez, I.⁵

46
- 84994385904
- ML parameter generation with a reformulated MGE training criterion—participation in the voice conversion challenge 2016
- Erro, D., Alonso, A., Serrano, L., Tavarez, D., Odriozola, I., Sarasola, X., Del Blanco, E., Sanchez, J., Saratxaga, I., Navas, E., et al. ML parameter generation with a reformulated MGE training criterion—participation in the voice conversion challenge 2016. Proceedings of the INTERSPEECH, 2016.
- (2016) Proceedings of the INTERSPEECH
- Erro, D.¹ Alonso, A.² Serrano, L.³ Tavarez, D.⁴ Odriozola, I.⁵ Sarasola, X.⁶ Del Blanco, E.⁷ Sanchez, J.⁸ Saratxaga, I.⁹ Navas, E.¹⁰

47
- 56149106209
- Frame alignment method for cross-lingual voice conversion
- Erro, D., Moreno, A., Frame alignment method for cross-lingual voice conversion. Proceedings of the INTERSPEECH, 2007.
- (2007) Proceedings of the INTERSPEECH
- Erro, D.¹ Moreno, A.²

48
- 56149096085
- Weighted frequency warping for voice conversion
- Erro, D., Moreno, A., Weighted frequency warping for voice conversion. Proceedings of the INTERSPEECH, 2007.
- (2007) Proceedings of the INTERSPEECH
- Erro, D.¹ Moreno, A.²

49
- 77953725318
- INCA algorithm for training voice conversion systems from nonparallel corpora
- Erro, D., Moreno, A., Bonafonte, A., INCA algorithm for training voice conversion systems from nonparallel corpora. IEEE Trans. Audio Speech Lang. Process. 18:5 (2010), 944–953.
- (2010) IEEE Trans. Audio Speech Lang. Process. , vol.18 , Issue.5 , pp. 944-953
- Erro, D.¹ Moreno, A.² Bonafonte, A.³

50
- 77953727123
- Voice conversion based on weighted frequency warping
- Erro, D., Moreno, A., Bonafonte, A., Voice conversion based on weighted frequency warping. IEEE Trans. Audio Speech Lang. Process. 18:5 (2010), 922–931.
- (2010) IEEE Trans. Audio Speech Lang. Process. , vol.18 , Issue.5 , pp. 922-931
- Erro, D.¹ Moreno, A.² Bonafonte, A.³

51
- 84878409257
- Iterative MMSE estimation of vocal tract length normalization factors for voice transformation
- Erro, D., Navas, E., Hernáez, I., Iterative MMSE estimation of vocal tract length normalization factors for voice transformation. Proceedings of the INTERSPEECH, 2012.
- (2012) Proceedings of the INTERSPEECH
- Erro, D.¹ Navas, E.² Hernáez, I.³

52
- 51449121679
- On combining statistical methods and frequency warping for high-quality voice conversion
- Erro, D., Polyakova, T., Moreno, A., On combining statistical methods and frequency warping for high-quality voice conversion. Proceedings of the ICASSP, 2008.
- (2008) Proceedings of the ICASSP
- Erro, D.¹ Polyakova, T.² Moreno, A.³

53
- 84865795787
- Improved HNM-based vocoder for statistical synthesizers
- Erro, D., Sainz, I., Navas, E., Hernáez, I., Improved HNM-based vocoder for statistical synthesizers. Proceedings of the INTERSPEECH, 2011.
- (2011) Proceedings of the INTERSPEECH
- Erro, D.¹ Sainz, I.² Navas, E.³ Hernáez, I.⁴

54
- 84865743085
- Quality improvement of voice conversion systems based on trellis structured vector quantization
- Eslami, M., Sheikhzadeh, H., Sayadiyan, A., Quality improvement of voice conversion systems based on trellis structured vector quantization. Twelfth Annual Conference of the International Speech Communication Association, 2011.
- (2011) Twelfth Annual Conference of the International Speech Communication Association
- Eslami, M.¹ Sheikhzadeh, H.² Sayadiyan, A.³

55
- 84986212974
- A waveform representation framework for high-quality statistical parametric speech synthesis
- arXiv preprint arXiv:1510.01443
- Fan, B., Lee, S.W., Tian, X., Xie, L., Dong, M., A waveform representation framework for high-quality statistical parametric speech synthesis. Proceedings of the APSIPA, 2015 arXiv preprint arXiv:1510.01443.
- (2015) Proceedings of the APSIPA
- Fan, B.¹ Lee, S.W.² Tian, X.³ Xie, L.⁴ Dong, M.⁵

56
- 84905179995
- High-individuality voice conversion based on concatenative speech synthesis
- Fujii, K., Okawa, J., Suigetsu, K., High-individuality voice conversion based on concatenative speech synthesis. World Academy of Science, Engineering and Technology, 2, 2007, 1.
- (2007) World Academy of Science, Engineering and Technology , vol.2 , pp. 1
- Fujii, K.¹ Okawa, J.² Suigetsu, K.³

57
- 0022667694
- Speaker-independent isolated word recognition using dynamic features of speech spectrum
- Furui, S., Speaker-independent isolated word recognition using dynamic features of speech spectrum. IEEE Transactions on Acoustics, Speech and Signal Processing 34:1 (1986), 52–59.
- (1986) IEEE Transactions on Acoustics, Speech and Signal Processing , vol.34 , Issue.1 , pp. 52-59
- Furui, S.¹

58
- 6344222337
- DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus CD-ROM. NIST Speech Disc 1-1.1
- NASA STI, Recon Technical Report N
- Garofolo, J.S., Lamel, L.F., Fisher, W.M., Fiscus, J.G., Pallett, D.S., DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus CD-ROM. NIST Speech Disc 1-1.1. 93, 1993, NASA STI, Recon Technical Report N, 27403.
- (1993) , vol.93 , pp. 27403
- Garofolo, J.S.¹ Lamel, L.F.² Fisher, W.M.³ Fiscus, J.G.⁴ Pallett, D.S.⁵

59
- 84919915933
- Voice conversion based on feature combination with limited training data
- Ghorbandoost, M., Sayadiyan, A., Ahangar, M., Sheikhzadeh, H., Shahrebabaki, A.S., Amini, J., Voice conversion based on feature combination with limited training data. Speech Commun. 67 (2015), 113–128.
- (2015) Speech Commun. , vol.67 , pp. 113-128
- Ghorbandoost, M.¹ Sayadiyan, A.² Ahangar, M.³ Sheikhzadeh, H.⁴ Shahrebabaki, A.S.⁵ Amini, J.⁶

60
- 85009212516
- Transforming f0 contours
- Gillett, B., King, S., Transforming f0 contours. Proceedings of the EUROSPEECH, 2003.
- (2003) Proceedings of the EUROSPEECH
- Gillett, B.¹ King, S.²

61
- 84862277874
- Understanding the difficulty of training deep feedforward neural networks
- Glorot, X., Bengio, Y., Understanding the difficulty of training deep feedforward neural networks. International Conference on Artificial Intelligence and Statistics, 2010, 249–256.
- (2010) International Conference on Artificial Intelligence and Statistics , pp. 249-256
- Glorot, X.¹ Bengio, Y.²

62
- 84872578524
- Deep sparse rectifier neural networks
- Glorot, X., Bordes, A., Bengio, Y., Deep sparse rectifier neural networks. Aistats, 2011.
- (2011) Aistats
- Glorot, X.¹ Bordes, A.² Bengio, Y.³

63
- 84906241950
- Assessing the intelligibility impact of vowel space expansion via clear speech-inspired frequency warping
- Godoy, E., Koutsogiannaki, M., Stylianou, Y., Assessing the intelligibility impact of vowel space expansion via clear speech-inspired frequency warping. Proceedings of the INTERSPEECH, 2013.
- (2013) Proceedings of the INTERSPEECH
- Godoy, E.¹ Koutsogiannaki, M.² Stylianou, Y.³

64
- 84890562746
- Approaching speech intelligibility enhancement with inspiration from lombard and clear speaking styles
- Godoy, E., Koutsogiannaki, M., Stylianou, Y., Approaching speech intelligibility enhancement with inspiration from lombard and clear speaking styles. Comput. Speech. Lang. 28:2 (2014), 629–647.
- (2014) Comput. Speech. Lang. , vol.28 , Issue.2 , pp. 629-647
- Godoy, E.¹ Koutsogiannaki, M.² Stylianou, Y.³

65
- 70450186582
- Alleviating the one-to-many mapping problem in voice conversion with context-dependent modelling
- Godoy, E., Rosec, O., Chonavel, T., Alleviating the one-to-many mapping problem in voice conversion with context-dependent modelling. Proceedings of the INTERSPEECH, 2009.
- (2009) Proceedings of the INTERSPEECH
- Godoy, E.¹ Rosec, O.² Chonavel, T.³

66
- 85010285285
- On transforming spectral peaks in voice conversion
- Godoy, E., Rosec, O., Chonavel, T., On transforming spectral peaks in voice conversion. Proceedings of the SSW, 2010.
- (2010) Proceedings of the SSW
- Godoy, E.¹ Rosec, O.² Chonavel, T.³

67
- 78650273608
- Speech spectral envelope estimation through explicit control of peak evolution in time
- Godoy, E., Rosec, O., Chonavel, T., Speech spectral envelope estimation through explicit control of peak evolution in time. Proceedings of the ISSPA, 2010.
- (2010) Proceedings of the ISSPA
- Godoy, E.¹ Rosec, O.² Chonavel, T.³

68
- 84865717274
- Spectral envelope transformation using DFW and amplitude scaling for voice conversion with parallel or nonparallel corpora
- Godoy, E., Rosec, O., Chonavel, T., Spectral envelope transformation using DFW and amplitude scaling for voice conversion with parallel or nonparallel corpora. Proceedings of the INTERSPEECH, 2011.
- (2011) Proceedings of the INTERSPEECH
- Godoy, E.¹ Rosec, O.² Chonavel, T.³

69
- 84857498745
- Voice conversion using dynamic frequency warping with amplitude scaling, for parallel or nonparallel corpora
- Godoy, E., Rosec, O., Chonavel, T., Voice conversion using dynamic frequency warping with amplitude scaling, for parallel or nonparallel corpora. IEEE Trans. Audio Speech Lang. Process. 20:4 (2012), 1313–1323.
- (2012) IEEE Trans. Audio Speech Lang. Process. , vol.20 , Issue.4 , pp. 1313-1323
- Godoy, E.¹ Rosec, O.² Chonavel, T.³

70
- 84912138720
- Improving segmental GMM based voice conversion method with target frame selection
- Gu, H.-Y., Tsai, S.-F., Improving segmental GMM based voice conversion method with target frame selection. Proceedings of the ISCSLP, 2014.
- (2014) Proceedings of the ISCSLP
- Gu, H.-Y.¹ Tsai, S.-F.²

71
- 56149097756
- F0 transformation within the voice conversion framework
- Hanzlíček, Z., Matoušek, J., F0 transformation within the voice conversion framework. Proceedings of the INTERSPEECH, 2007.
- (2007) Proceedings of the INTERSPEECH
- Hanzlíček, Z.¹ Matoušek, J.²

72
- 85010456787
- Spectral mapping method for voice conversion using speaker selection and vector field smoothing
- Hashimoto, M., Higuchi, N., Spectral mapping method for voice conversion using speaker selection and vector field smoothing. Proceedings of the EUROSPEECH, 1995.
- (1995) Proceedings of the EUROSPEECH
- Hashimoto, M.¹ Higuchi, N.²

73
- 0030351582
- Training data selection for voice conversion using speaker selection and vector field smoothing
- Hashimoto, M., Higuchi, N., Training data selection for voice conversion using speaker selection and vector field smoothing. Proceedings of the ICSLP, 1996.
- (1996) Proceedings of the ICSLP
- Hashimoto, M.¹ Higuchi, N.²

74
- 85010438368
- Analysis of LSF frame selection in voice conversion
- Helander, E., Nurminen, J., Gabbouj, M., Analysis of LSF frame selection in voice conversion. Proceedings of the SPECOM, 2007.
- (2007) Proceedings of the SPECOM
- Helander, E.¹ Nurminen, J.² Gabbouj, M.³

75
- 51449107658
- LSF mapping for voice conversion with very small training sets
- Helander, E., Nurminen, J., Gabbouj, M., LSF mapping for voice conversion with very small training sets. Proceedings of the ICASSP, 2008.
- (2008) Proceedings of the ICASSP
- Helander, E.¹ Nurminen, J.² Gabbouj, M.³

76
- 84867198185
- On the impact of alignment on voice conversion performance
- Helander, E., Schwarz, J., Nurminen, J., Silen, H., Gabbouj, M., On the impact of alignment on voice conversion performance. Proceedings of the INTERSPEECH, 2008.
- (2008) Proceedings of the INTERSPEECH
- Helander, E.¹ Schwarz, J.² Nurminen, J.³ Silen, H.⁴ Gabbouj, M.⁵

77
- 79959836789
- Maximum a posteriori voice conversion using sequential monte carlo methods
- Helander, E., Silén, H., Míguez, J., Gabbouj, M., Maximum a posteriori voice conversion using sequential monte carlo methods. Proceedings of the INTERSPEECH, 2010.
- (2010) Proceedings of the INTERSPEECH
- Helander, E.¹ Silén, H.² Míguez, J.³ Gabbouj, M.⁴

78
- 84856141218
- Voice conversion using dynamic kernel partial least squares regression
- Helander, E., Silén, H., Virtanen, T., Gabbouj, M., Voice conversion using dynamic kernel partial least squares regression. IEEE Trans. Audio Speech Lang. Process. 20:3 (2012), 806–817.
- (2012) IEEE Trans. Audio Speech Lang. Process. , vol.20 , Issue.3 , pp. 806-817
- Helander, E.¹ Silén, H.² Virtanen, T.³ Gabbouj, M.⁴

79
- 77953712499
- Voice conversion using partial least squares regression
- Helander, E., Virtanen, T., Nurminen, J., Gabbouj, M., Voice conversion using partial least squares regression. IEEE Trans. Audio Speech Lang. Process. 18:5 (2010), 912–921.
- (2010) IEEE Trans. Audio Speech Lang. Process. , vol.18 , Issue.5 , pp. 912-921
- Helander, E.¹ Virtanen, T.² Nurminen, J.³ Gabbouj, M.⁴

80
- 34547520011
- A novel method for prosody prediction in voice conversion
- Helander, E.E., Nurminen, J., A novel method for prosody prediction in voice conversion. Proceedings of the ICASSP, 2007.
- (2007) Proceedings of the ICASSP
- Helander, E.E.¹ Nurminen, J.²

81
- 56149114123
- On the importance of pure prosody in the perception of speaker identity
- Helander, E.E., Nurminen, J., On the importance of pure prosody in the perception of speaker identity. Proceedings of the INTERSPEECH, 2007.
- (2007) Proceedings of the INTERSPEECH
- Helander, E.E.¹ Nurminen, J.²

82
- 77956795483
- Esophageal speech enhancement based on statistical voice conversion with gaussian mixture models
- Hironori, D., Nakamura, K., Tomoki, T., Saruwatari, H., Shikano, K., Esophageal speech enhancement based on statistical voice conversion with gaussian mixture models. IEICE Trans. Inf. Syst. 93:9 (2010), 2472–2482.
- (2010) IEICE Trans. Inf. Syst. , vol.93 , Issue.9 , pp. 2472-2482
- Hironori, D.¹ Nakamura, K.² Tomoki, T.³ Saruwatari, H.⁴ Shikano, K.⁵

83
- 0024880831
- Multilayer feedforward networks are universal approximators
- Hornik, K., Stinchcombe, M., White, H., Multilayer feedforward networks are universal approximators. Neural Netw. 2:5 (1989), 359–366.
- (1989) Neural Netw. , vol.2 , Issue.5 , pp. 359-366
- Hornik, K.¹ Stinchcombe, M.² White, H.³

84
- 85010449824
- Duration-embedded bi-HMM for expressive voice conversion
- Hsia, C.-C., Wu, C.-H., Liu, T.-H., Duration-embedded bi-HMM for expressive voice conversion. Proceedings of the INTERSPEECH, 2005.
- (2005) Proceedings of the INTERSPEECH
- Hsia, C.-C.¹ Wu, C.-H.² Liu, T.-H.³

85
- 34548216761
- Conversion function clustering and selection using linguistic and spectral information for emotional voice conversion
- Hsia, C.-C., Wu, C.-H., Wu, J.-Q., Conversion function clustering and selection using linguistic and spectral information for emotional voice conversion. IEEE Trans. Comput. 56:9 (2007), 1245–1254.
- (2007) IEEE Trans. Comput. , vol.56 , Issue.9 , pp. 1245-1254
- Hsia, C.-C.¹ Wu, C.-H.² Wu, J.-Q.³

86
- 85075288991
- An automatic voice conversion evaluation strategy based on perceptual background noise distortion and speaker similarity
- Huang, D.-Y., Xie, L., Siu, Y., Lee, W., Wu, J., Ming, H., Tian, X., Zhang, S., Ding, C., Li, M., Nguyen, Q.H., Dong, M., Li, H., An automatic voice conversion evaluation strategy based on perceptual background noise distortion and speaker similarity. Proceedings of the SSW, 2016.
- (2016) Proceedings of the SSW
- Huang, D.-Y.¹ Xie, L.² Siu, Y.³ Lee, W.⁴ Wu, J.⁵ Ming, H.⁶ Tian, X.⁷ Zhang, S.⁸ Ding, C.⁹ Li, M.¹⁰ Nguyen, Q.H.¹¹ Dong, M.¹² Li, H.¹³

87
- 84893234191
- Incorporating global variance in the training phase of GMM-based voice conversion
- Hwang, H.-T., Tsao, Y., Wang, H.-M., Wang, Y.-R., Chen, S.-H., Incorporating global variance in the training phase of GMM-based voice conversion. Proceedings of the APSIPA, 2013.
- (2013) Proceedings of the APSIPA
- Hwang, H.-T.¹ Tsao, Y.² Wang, H.-M.³ Wang, Y.-R.⁴ Chen, S.-H.⁵

88
- 84878415076
- A study of mutual information for GMM-based spectral conversion
- Hwang, H.-T., Tsao, Y., Wang, H.-M., Wang, Y.-R., Chen, S.-H., et al. A study of mutual information for GMM-based spectral conversion. Proceedings of the INTERSPEECH, 2012.
- (2012) Proceedings of the INTERSPEECH
- Hwang, H.-T.¹ Tsao, Y.² Wang, H.-M.³ Wang, Y.-R.⁴ Chen, S.-H.⁵

89
- 0020596154
- Cepstral analysis synthesis on the mel frequency scale
- Imai, S., Cepstral analysis synthesis on the mel frequency scale. Proceedings of the ICASSP, 1983.
- (1983) Proceedings of the ICASSP
- Imai, S.¹

90
- 84863739383
- Speech signal processing toolkit (SPTK), version 3.3
- Imai, S., Kobayashi, T., Tokuda, K., Masuko, T., Koishida, K., Sako, S., Zen, H., Speech signal processing toolkit (SPTK), version 3.3. 2009.
- (2009)
- Imai, S.¹ Kobayashi, T.² Tokuda, K.³ Masuko, T.⁴ Koishida, K.⁵ Sako, S.⁶ Zen, H.⁷

91
- 0020703324
- Mel log spectrum approximation (MLSA) filter for speech synthesis
- Imai, S., Sumita, K., Furuichi, C., Mel log spectrum approximation (MLSA) filter for speech synthesis. Electron. Commun. Japan 66:2 (1983), 10–18.
- (1983) Electron. Commun. Japan , vol.66 , Issue.2 , pp. 10-18
- Imai, S.¹ Sumita, K.² Furuichi, C.³

92
- 33947693233
- St. Edmunds College, University of Cambridge Master's Thesis
- Inanoglu, Z., Transforming Pitch in a Voice Conversion Framework, 2003, St. Edmunds College, University of Cambridge Master's Thesis.
- (2003) Transforming Pitch in a Voice Conversion Framework
- Inanoglu, Z.¹

93
- 84938935270
- A system for transforming the emotion in speech: combining data-driven conversion techniques for prosody and voice quality
- Inanoglu, Z., Young, S., A system for transforming the emotion in speech: combining data-driven conversion techniques for prosody and voice quality. Proceedings of the INTERSPEECH, 2007, 490–493.
- (2007) Proceedings of the INTERSPEECH , pp. 490-493
- Inanoglu, Z.¹ Young, S.²

94
- 58149203393
- Data-driven emotion conversion in spoken English
- Inanoglu, Z., Young, S., Data-driven emotion conversion in spoken English. Speech Commun. 51:3 (2009), 268–283.
- (2009) Speech Commun. , vol.51 , Issue.3 , pp. 268-283
- Inanoglu, Z.¹ Young, S.²

95
- 85064715894
- Speech spectrum transformation by speaker interpolation
- IEEE
- Iwahashi, N., Sagisaka, Y., Speech spectrum transformation by speaker interpolation. Proceedings of the ICASSP. Vol. 1, 1994, IEEE, I–461.
- (1994) Proceedings of the ICASSP. Vol. 1 , pp. I-461
- Iwahashi, N.¹ Sagisaka, Y.²

96
- 0029251946
- Speech spectrum conversion based on speaker interpolation and multi-functional representation with weighting by radial basis function networks
- Iwahashi, N., Sagisaka, Y., Speech spectrum conversion based on speaker interpolation and multi-functional representation with weighting by radial basis function networks. Speech Commun. 16:2 (1995), 139–151.
- (1995) Speech Commun. , vol.16 , Issue.2 , pp. 139-151
- Iwahashi, N.¹ Sagisaka, Y.²

97
- 0031623661
- Spectral voice conversion for text-to-speech synthesis
- Kain, A., Macon, M.W., Spectral voice conversion for text-to-speech synthesis. Proceedings of the ICASSP, 1998.
- (1998) Proceedings of the ICASSP
- Kain, A.¹ Macon, M.W.²

98
- 84984905455
- Text-to-speech voice adaptation from sparse training data
- Kain, A., Macon, M.W., Text-to-speech voice adaptation from sparse training data. Proceedings of the ICSLP, 1998.
- (1998) Proceedings of the ICSLP
- Kain, A.¹ Macon, M.W.²

99
- 0034841948
- Design and evaluation of a voice conversion algorithm based on spectral envelope mapping and residual prediction
- Kain, A., Macon, M.W., Design and evaluation of a voice conversion algorithm based on spectral envelope mapping and residual prediction. Proceedings of the ICASSP, 2001.
- (2001) Proceedings of the ICASSP
- Kain, A.¹ Macon, M.W.²

100
- 77953816641
- Unit-selection text-to-speech synthesis using an asynchronous interpolation model
- Kain, A., van Santen, J.P., Unit-selection text-to-speech synthesis using an asynchronous interpolation model. Proceedings of the SSW, 2007.
- (2007) Proceedings of the SSW
- Kain, A.¹ van Santen, J.P.²

101
- 70349210296
- Using speech transformation to increase speech intelligibility for the hearing- and speaking-impaired
- Kain, A., Van Santen, J., Using speech transformation to increase speech intelligibility for the hearing- and speaking-impaired. Proceedings of the ICASSP, 2009.
- (2009) Proceedings of the ICASSP
- Kain, A.¹ Van Santen, J.²

102
- 4444285698
- Oregon Health & Science University Ph.D. thesis
- Kain, A.B., High Resolution Voice Transformation, 2001, Oregon Health & Science University Ph.D. thesis.
- (2001) High Resolution Voice Transformation
- Kain, A.B.¹

103
- 34447635527
- Improving the intelligibility of dysarthric speech
- Kain, A.B., Hosom, J.-P., Niu, X., van Santen, J.P., Fried-Oken, M., Staehely, J., Improving the intelligibility of dysarthric speech. Speech Commun. 49:9 (2007), 743–759.
- (2007) Speech Commun. , vol.49 , Issue.9 , pp. 743-759
- Kain, A.B.¹ Hosom, J.-P.² Niu, X.³ van Santen, J.P.⁴ Fried-Oken, M.⁵ Staehely, J.⁶

104
- 33646785078
- A hybrid GMM and codebook mapping method for spectral conversion
- Springer
- Kang, Y., Shuang, Z., Tao, J., Zhang, W., Xu, B., A hybrid GMM and codebook mapping method for spectral conversion. Affective Computing and Intelligent Interaction, 2005, Springer, 303–310.
- (2005) Affective Computing and Intelligent Interaction , pp. 303-310
- Kang, Y.¹ Shuang, Z.² Tao, J.³ Zhang, W.⁴ Xu, B.⁵

105
- 33947698917
- Applying pitch target model to convert f0 contour for expressive Mandarin speech synthesis
- Kang, Y., Tao, J., Xu, B., Applying pitch target model to convert f0 contour for expressive Mandarin speech synthesis. Proceedings of the ICASSP, 2006.
- (2006) Proceedings of the ICASSP
- Kang, Y.¹ Tao, J.² Xu, B.³

106
- 0032673049
- Restructuring speech representations using a pitch-adaptive time–frequency smoothing and an instantaneous-frequency-based f0 extraction: Possible role of a repetitive structure in sounds
- Kawahara, H., Masuda-Katsuse, I., De Cheveigné, A., Restructuring speech representations using a pitch-adaptive time–frequency smoothing and an instantaneous-frequency-based f0 extraction: Possible role of a repetitive structure in sounds. Speech Commun. 27:3 (1999), 187–207.
- (1999) Speech Commun. , vol.27 , Issue.3 , pp. 187-207
- Kawahara, H.¹ Masuda-Katsuse, I.² De Cheveigné, A.³

107
- 51449108867
- TANDEM-STRAIGHT: a temporally stable power spectral representation for periodic signals and applications to interference-free spectrum, f0, and aperiodicity estimation
- Kawahara, H., Morise, M., Takahashi, T., Nisimura, R., Irino, T., Banno, H., TANDEM-STRAIGHT: a temporally stable power spectral representation for periodic signals and applications to interference-free spectrum, f0, and aperiodicity estimation. Proceedings of the ICASSP, 2008.
- (2008) Proceedings of the ICASSP
- Kawahara, H.¹ Morise, M.² Takahashi, T.³ Nisimura, R.⁴ Irino, T.⁵ Banno, H.⁶

108
- 84876497245
- GMM-based voice conversion applied to emotional speech synthesis
- Kawanami, H., Iwami, Y., Toda, T., Saruwatari, H., Shikano, K., GMM-based voice conversion applied to emotional speech synthesis. Proceedings of the EUROSPEECH, 2003.
- (2003) Proceedings of the EUROSPEECH
- Kawanami, H.¹ Iwami, Y.² Toda, T.³ Saruwatari, H.⁴ Shikano, K.⁵

109
- 85135141647
- Hidden Markov model based voice conversion using dynamic characteristics of speaker
- Kim, E.-K., Lee, S., Oh, Y.-H., Hidden Markov model based voice conversion using dynamic characteristics of speaker. Proceedings of the EUROSPEECH, 1997.
- (1997) Proceedings of the EUROSPEECH
- Kim, E.-K.¹ Lee, S.² Oh, Y.-H.³

110
- 84905262778
- An investigation of acoustic features for singing voice conversion based on perceptual age
- Kobayashi, K., Doi, H., Toda, T., Nakano, T., Goto, M., Neubig, G., Sakti, S., Nakamura, S., An investigation of acoustic features for singing voice conversion based on perceptual age. Proceedings of the INTERSPEECH, 2013.
- (2013) Proceedings of the INTERSPEECH
- Kobayashi, K.¹ Doi, H.² Toda, T.³ Nakano, T.⁴ Goto, M.⁵ Neubig, G.⁶ Sakti, S.⁷ Nakamura, S.⁸

111
- 84959111811
- Statistical singing voice conversion based on direct waveform modification with global variance
- Kobayashi, K., Toda, T., Neubig, G., Sakti, S., Nakamura, S., Statistical singing voice conversion based on direct waveform modification with global variance. Proceedings of the INTERSPEECH, 2015.
- (2015) Proceedings of the INTERSPEECH
- Kobayashi, K.¹ Toda, T.² Neubig, G.³ Sakti, S.⁴ Nakamura, S.⁵

112
- 85090475413
- The CMU Arctic speech databases
- Kominek, J., Black, A.W., The CMU Arctic speech databases. Proceedings of the SSW, 2004.
- (2004) Proceedings of the SSW
- Kominek, J.¹ Black, A.W.²

113
- 84905248157
- Simple and artefact-free spectral modifications for enhancing the intelligibility of casual speech
- Koutsogiannaki, M., Stylianou, Y., Simple and artefact-free spectral modifications for enhancing the intelligibility of casual speech. Proceedings of the ICASSP, 2014.
- (2014) Proceedings of the ICASSP
- Koutsogiannaki, M.¹ Stylianou, Y.²

114
- 84908466787
- Using phone and diphone based acoustic models for voice conversion: a step towards creating voice fonts
- Kumar, A., Verma, A., Using phone and diphone based acoustic models for voice conversion: a step towards creating voice fonts. Proceedings of the ICME, 2003.
- (2003) Proceedings of the ICME
- Kumar, A.¹ Verma, A.²

115
- 0029256373
- Acoustic characteristics of speaker individuality: control and conversion
- Kuwabara, H., Sagisaka, Y., Acoustic characteristics of speaker individuality: control and conversion. Speech Commun. 16:2 (1995), 165–173.
- (1995) Speech Commun. , vol.16 , Issue.2 , pp. 165-173
- Kuwabara, H.¹ Sagisaka, Y.²

116
- 84964555662
- Speaker intonation adaptation for transforming text-to-speech synthesis speaker identity
- Langarani, M.S.E., van Santen, J., Speaker intonation adaptation for transforming text-to-speech synthesis speaker identity. Proceedings of the ASRU, 2015.
- (2015) Proceedings of the ASRU
- Langarani, M.S.E.¹ van Santen, J.²

117
- 84865847955
- Comparing ANN and GMM in a voice conversion framework
- Laskar, R., Chakrabarty, D., Talukdar, F., Rao, K.S., Banerjee, K., Comparing ANN and GMM in a voice conversion framework. Appl. Soft Comput. 12:11 (2012), 3332–3342.
- (2012) Appl. Soft Comput. , vol.12 , Issue.11 , pp. 3332-3342
- Laskar, R.¹ Chakrabarty, D.² Talukdar, F.³ Rao, K.S.⁴ Banerjee, K.⁵

118
- 84890501677
- Voice conversion by mapping the spectral and prosodic features using support vector machine
- Springer
- Laskar, R.H., Talukdar, F.A., Bhattacharjee, R., Das, S., Voice conversion by mapping the spectral and prosodic features using support vector machine. Applications of Soft Computing, 2009, Springer, 519–528.
- (2009) Applications of Soft Computing , pp. 519-528
- Laskar, R.H.¹ Talukdar, F.A.² Bhattacharjee, R.³ Das, S.⁴

119
- 84910030281
- Voice expression conversion with factorised HMM-TTS models
- Latorre, J., Wan, V., Yanagisawa, K., Voice expression conversion with factorised HMM-TTS models. Proceedings of the INTERSPEECH, 2014.
- (2014) Proceedings of the INTERSPEECH
- Latorre, J.¹ Wan, V.² Yanagisawa, K.³

120
- 44949210554
- MAP-based adaptation for speech conversion using adaptation data selection and non-parallel training
- Lee, C.-H., Wu, C.-H., MAP-based adaptation for speech conversion using adaptation data selection and non-parallel training. Proceedings of the INTERSPEECH, 2006.
- (2006) Proceedings of the INTERSPEECH
- Lee, C.-H.¹ Wu, C.-H.²

121
- 38149065136
- Statistical approach for voice personality transformation
- Lee, K.-S., Statistical approach for voice personality transformation. IEEE Trans. Audio Speech Lang. Process. 15:2 (2007), 641–651.
- (2007) IEEE Trans. Audio Speech Lang. Process. , vol.15 , Issue.2 , pp. 641-651
- Lee, K.-S.¹

122
- 84896464538
- A unit selection approach for voice transformation
- Lee, K.-S., A unit selection approach for voice transformation. Speech Commun. 60 (2014), 30–43.
- (2014) Speech Commun. , vol.60 , pp. 30-43
- Lee, K.-S.¹

123
- 84876489382
- Emotional speech conversion based on spectrum-prosody dual transformation
- Li, B., Xiao, Z., Shen, Y., Zhou, Q., Tao, Z., Emotional speech conversion based on spectrum-prosody dual transformation. Proceedings of the ICSP, 2012.
- (2012) Proceedings of the ICSP
- Li, B.¹ Xiao, Z.² Shen, Y.³ Zhou, Q.⁴ Tao, Z.⁵

124
- 85032750981
- Deep learning for acoustic modeling in parametric speech generation: A systematic review of existing techniques and future trends
- Ling, Z.-H., Kang, S.-Y., Zen, H., Senior, A., Schuster, M., Qian, X.-J., Meng, H.M., Deng, L., Deep learning for acoustic modeling in parametric speech generation: A systematic review of existing techniques and future trends. Signal Process. Mag. IEEE 32:3 (2015), 35–52.
- (2015) Signal Process. Mag. IEEE , vol.32 , Issue.3 , pp. 35-52
- Ling, Z.-H.¹ Kang, S.-Y.² Zen, H.³ Senior, A.⁴ Schuster, M.⁵ Qian, X.-J.⁶ Meng, H.M.⁷ Deng, L.⁸

125
- 84905223323
- Using bidirectional associative memories for joint spectral envelope modeling in voice conversion
- Liu, L.-J., Chen, L.-H., Ling, Z.-H., Dai, L.-R., Using bidirectional associative memories for joint spectral envelope modeling in voice conversion. Proceedings of the ICASSP, 2014.
- (2014) Proceedings of the ICASSP
- Liu, L.-J.¹ Chen, L.-H.² Ling, Z.-H.³ Dai, L.-R.⁴

126
- 84946076200
- Spectral conversion using deep neural networks trained with multi-source speakers
- Liu, L.-J., Chen, L.-H., Ling, Z.-H., Dai, L.-R., Spectral conversion using deep neural networks trained with multi-source speakers. Proceedings of the ICASSP, 2015.
- (2015) Proceedings of the ICASSP
- Liu, L.-J.¹ Chen, L.-H.² Ling, Z.-H.³ Dai, L.-R.⁴

127
- 77953726259
- Pitch and duration transformation with non-parallel data
- Lolive, D., Barbot, N., Boeffard, O., Pitch and duration transformation with non-parallel data. Proceedings of the Speech Prosody, 2008.
- (2008) Proceedings of the Speech Prosody
- Lolive, D.¹ Barbot, N.² Boeffard, O.³

128
- 84905188122
- Voice conversion: a critical survey
- Machado, A.F., Queiroz, M., Voice conversion: a critical survey. Proceedings of the SMC, 2010.
- (2010) Proceedings of the SMC
- Machado, A.F.¹ Queiroz, M.²

129
- 85059803513
- Speaker conversion through non-linear frequency warping of STRAIGHT spectrum
- Maeda, N., Banno, H., Kajita, S., Takeda, K., Itakura, F., Speaker conversion through non-linear frequency warping of STRAIGHT spectrum. Proceedings of the EUROSPEECH, 1999.
- (1999) Proceedings of the EUROSPEECH
- Maeda, N.¹ Banno, H.² Kajita, S.³ Takeda, K.⁴ Itakura, F.⁵

130
- 34548785064
- Voice conversion using nonlinear principal component analysis
- Makki, B., Seyedsalehi, S., Sadati, N., Hosseini, M.N., Voice conversion using nonlinear principal component analysis. Proceedings of the CIISP, 2007.
- (2007) Proceedings of the CIISP
- Makki, B.¹ Seyedsalehi, S.² Sadati, N.³ Hosseini, M.N.⁴

131
- 84905269973
- Multimodal voice conversion using non-negative matrix factorization in noisy environments
- Masaka, K., Aihara, R., Takiguchi, T., Ariki, Y., Multimodal voice conversion using non-negative matrix factorization in noisy environments. Proceedings of the ICASSP, 2014.
- (2014) Proceedings of the ICASSP
- Masaka, K.¹ Aihara, R.² Takiguchi, T.³ Ariki, Y.⁴

132
- 34547534995
- Cost reduction of training mapping function based on multistep voice conversion
- Masuda, T., Shozakai, M., Cost reduction of training mapping function based on multistep voice conversion. Proceedings of the ICASSP, 2007.
- (2007) Proceedings of the ICASSP
- Masuda, T.¹ Shozakai, M.²

133
- 0015677419
- Multidimensional representation of personal quality of vowels and its acoustical correlates
- Matsumoto, H., Hiki, S., Sone, T., Nimura, T., Multidimensional representation of personal quality of vowels and its acoustical correlates. IEEE Trans. Audio Electroacoust. 21:5 (1973), 428–436.
- (1973) IEEE Trans. Audio Electroacoust. , vol.21 , Issue.5 , pp. 428-436
- Matsumoto, H.¹ Hiki, S.² Sone, T.³ Nimura, T.⁴

134
- 85007685968
- Unsupervised speaker adaptation from short utterances based on a minimized fuzzy objective function
- Matsumoto, H., Yamashita, Y., Unsupervised speaker adaptation from short utterances based on a minimized fuzzy objective function. J. Acoust. Soc. Japan (E) 14:5 (1993), 353–361.
- (1993) J. Acoust. Soc. Japan (E) , vol.14 , Issue.5 , pp. 353-361
- Matsumoto, H.¹ Yamashita, Y.²

135
- 56149098813
- Comparing GMM-based speech transformation systems
- Mesbahi, L., Barreaud, V., Boeffard, O., Comparing GMM-based speech transformation systems. Proceedings of the INTERSPEECH, 2007.
- (2007) Proceedings of the INTERSPEECH
- Mesbahi, L.¹ Barreaud, V.² Boeffard, O.³

136
- 85047459969
- GMM-based speech transformation systems under data reduction
- Mesbahi, L., Barreaud, V., Boeffard, O., GMM-based speech transformation systems under data reduction. Proceedings of the SSW, 2007.
- (2007) Proceedings of the SSW
- Mesbahi, L.¹ Barreaud, V.² Boeffard, O.³

137
- 84994251909
- Deep bidirectional LSTM modeling of timbre and prosody for emotional voice conversion
- Ming, H., Huang, D., Xie, L., Wu, J., Li, M.D.H., Deep bidirectional LSTM modeling of timbre and prosody for emotional voice conversion. Proceedings of the INTERSPEECH, 2016.
- (2016) Proceedings of the INTERSPEECH
- Ming, H.¹ Huang, D.² Xie, L.³ Wu, J.⁴ Li, M.D.H.⁵

138
- 0029256372
- Voice conversion algorithm based on piecewise linear conversion rules of formant frequency and spectrum tilt
- Mizuno, H., Abe, M., Voice conversion algorithm based on piecewise linear conversion rules of formant frequency and spectrum tilt. Speech Commun. 16:2 (1995), 153–164.
- (1995) Speech Commun. , vol.16 , Issue.2 , pp. 153-164
- Mizuno, H.¹ Abe, M.²

139
- 84890475857
- Transmutative voice conversion
- Mohammadi, S.H., Kain, A., Transmutative voice conversion. Proceedings of the ICASSP, 2013.
- (2013) Proceedings of the ICASSP
- Mohammadi, S.H.¹ Kain, A.²

140
- 84946685887
- Voice conversion using deep neural networks with speaker-independent pre-training
- Mohammadi, S.H., Kain, A., Voice conversion using deep neural networks with speaker-independent pre-training. Proceedings of the SLT, 2014.
- (2014) Proceedings of the SLT
- Mohammadi, S.H.¹ Kain, A.²

141
- 84959173289
- Semi-supervised training of a voice conversion mapping function using a joint-autoencoder
- Mohammadi, S.H., Kain, A., Semi-supervised training of a voice conversion mapping function using a joint-autoencoder. Proceedings of the INTERSPEECH, 2015.
- (2015) Proceedings of the INTERSPEECH
- Mohammadi, S.H.¹ Kain, A.²

142
- 84994219829
- A voice conversion mapping function based on a stacked joint-autoencoder
- Mohammadi, S.H., Kain, A., A voice conversion mapping function based on a stacked joint-autoencoder. Proceedings of the INTERSPEECH, 2016.
- (2016) Proceedings of the INTERSPEECH
- Mohammadi, S.H.¹ Kain, A.²

143
- 84878384703
- Making conversational vowels more clear
- Mohammadi, S.H., Kain, A., van Santen, J.P., Making conversational vowels more clear. Proceedings of the INTERSPEECH, 2012.
- (2012) Proceedings of the INTERSPEECH
- Mohammadi, S.H.¹ Kain, A.² van Santen, J.P.³

144
- 84908519225
- CheapTrick, a spectral envelope estimator for high-quality speech synthesis
- Morise, M., CheapTrick, a spectral envelope estimator for high-quality speech synthesis. Speech Commun. 67 (2015), 1–7.
- (2015) Speech Commun. , vol.67 , pp. 1-7
- Morise, M.¹

145
- 84976902575
- WORLD: a vocoder-based high-quality speech synthesis system for real-time applications
- Morise, M., Yokomori, F., Ozawa, K., WORLD: a vocoder-based high-quality speech synthesis system for real-time applications. IEICE Trans. Inf. Syst., 2016.
- (2016) IEICE Trans. Inf. Syst.
- Morise, M.¹ Yokomori, F.² Ozawa, K.³

146
- 84878384415
- Synthetic f0 can effectively convey speaker ID in delexicalized speech
- Morley, E., Klabbers, E., van Santen, J.P., Kain, A., Mohammadi, S.H., Synthetic f0 can effectively convey speaker ID in delexicalized speech. Proceedings of the INTERSPEECH, 2012.
- (2012) Proceedings of the INTERSPEECH
- Morley, E.¹ Klabbers, E.² van Santen, J.P.³ Kain, A.⁴ Mohammadi, S.H.⁵

147
- 0036753077
- Reconstruction of speech from whispers
- Morris, R.W., Clements, M.A., Reconstruction of speech from whispers. Med. Eng. Phys. 24:7 (2002), 515–520.
- (2002) Med. Eng. Phys. , vol.24 , Issue.7 , pp. 515-520
- Morris, R.W.¹ Clements, M.A.²

148
- 34547552192
- Conditional vector quantization for voice conversion
- Mouchtaris, A., Agiomyrgiannakis, Y., Stylianou, Y., Conditional vector quantization for voice conversion. Proceedings of the ICASSP, 2007.
- (2007) Proceedings of the ICASSP
- Mouchtaris, A.¹ Agiomyrgiannakis, Y.² Stylianou, Y.³

149
- 4544297119
- Non-parallel training for voice conversion by maximum likelihood constrained adaptation
- Mouchtaris, A., Van der Spiegel, J., Mueller, P., Non-parallel training for voice conversion by maximum likelihood constrained adaptation. Proceedings of the ICASSP, 2004.
- (2004) Proceedings of the ICASSP
- Mouchtaris, A.¹ Van der Spiegel, J.² Mueller, P.³

150
- 11244303645
- A spectral conversion approach to the iterative Wiener filter for speech enhancement
- Mouchtaris, A., Van der Spiegel, J., Mueller, P., A spectral conversion approach to the iterative Wiener filter for speech enhancement. Proceedings of the ICME, 2004.
- (2004) Proceedings of the ICME
- Mouchtaris, A.¹ Van der Spiegel, J.² Mueller, P.³

151
- 34047245444
- Nonparallel training for voice conversion based on a parameter adaptation approach
- Mouchtaris, A., Van der Spiegel, J., Mueller, P., Nonparallel training for voice conversion based on a parameter adaptation approach. IEEE Trans. Audio Speech Lang. Process. 14:3 (2006), 952–963.
- (2006) IEEE Trans. Audio Speech Lang. Process. , vol.14 , Issue.3 , pp. 952-963
- Mouchtaris, A.¹ Van der Spiegel, J.² Mueller, P.³

152
- 0025543906
- Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones
- Moulines, E., Charpentier, F., Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones. Speech Commun. 9:5 (1990), 453–467.
- (1990) Speech Commun. , vol.9 , Issue.5 , pp. 453-467
- Moulines, E.¹ Charpentier, F.²

153
- 84867211725
- Low-delay voice conversion based on maximum likelihood estimation of spectral parameter trajectory
- Muramatsu, T., Ohtani, Y., Toda, T., Saruwatari, H., Shikano, K., Low-delay voice conversion based on maximum likelihood estimation of spectral parameter trajectory. Proceedings of the INTERSPEECH, 2008.
- (2008) Proceedings of the INTERSPEECH
- Muramatsu, T.¹ Ohtani, Y.² Toda, T.³ Saruwatari, H.⁴ Shikano, K.⁵

154
- 44949187612
- Improving body transmitted unvoiced speech with statistical voice conversion
- Nakagiri, M., Toda, T., Kashioka, H., Shikano, K., Improving body transmitted unvoiced speech with statistical voice conversion. Proceedings of the INTERSPEECH, 2006.
- (2006) Proceedings of the INTERSPEECH
- Nakagiri, M.¹ Toda, T.² Kashioka, H.³ Shikano, K.⁴

155
- 85010461545
- A speech communication aid system for total laryngectomees using voice conversion of body transmitted artificial speech
- Nakamura, K., Toda, T., Saruwatari, H., Shikano, K., A speech communication aid system for total laryngectomees using voice conversion of body transmitted artificial speech. J. Acoust. Soc. Am., 120(5), 2006, 3351.
- (2006) J. Acoust. Soc. Am. , vol.120 , Issue.5 , pp. 3351
- Nakamura, K.¹ Toda, T.² Saruwatari, H.³ Shikano, K.⁴

156
- 80052698826
- Speaking-aid systems using GMM-based voice conversion for electrolaryngeal speech
- Nakamura, K., Toda, T., Saruwatari, H., Shikano, K., Speaking-aid systems using GMM-based voice conversion for electrolaryngeal speech. Speech Commun. 54:1 (2012), 134–146.
- (2012) Speech Commun. , vol.54 , Issue.1 , pp. 134-146
- Nakamura, K.¹ Toda, T.² Saruwatari, H.³ Shikano, K.⁴

157
- 84906280857
- Voice conversion in high-order eigen space using deep belief nets
- Nakashika, T., Takashima, R., Takiguchi, T., Ariki, Y., Voice conversion in high-order eigen space using deep belief nets. Proceedings of the INTERSPEECH, 2013.
- (2013) Proceedings of the INTERSPEECH
- Nakashika, T.¹ Takashima, R.² Takiguchi, T.³ Ariki, Y.⁴

158
- 84910087396
- High-order sequence modeling using speaker-dependent recurrent temporal restricted Boltzmann machines for voice conversion
- Nakashika, T., Takiguchi, T., Ariki, Y., High-order sequence modeling using speaker-dependent recurrent temporal restricted Boltzmann machines for voice conversion. Proceedings of the INTERSPEECH, 2014.
- (2014) Proceedings of the INTERSPEECH
- Nakashika, T.¹ Takiguchi, T.² Ariki, Y.³

159
- 84946019814
- Sparse nonlinear representation for voice conversion
- Nakashika, T., Takiguchi, T., Ariki, Y., Sparse nonlinear representation for voice conversion. Proceedings of the ICME, 2015.
- (2015) Proceedings of the ICME
- Nakashika, T.¹ Takiguchi, T.² Ariki, Y.³

160
- 84923867813
- Voice conversion using RNN pre-trained by recurrent temporal restricted Boltzmann machines
- Nakashika, T., Takiguchi, T., Ariki, Y., Voice conversion using RNN pre-trained by recurrent temporal restricted Boltzmann machines. IEEE/ACM Trans. Audio Speech Lang. Process. 23:3 (2015), 580–587, 10.1109/TASLP.2014.2379589.
- (2015) IEEE/ACM Trans. Audio Speech Lang. Process. , vol.23 , Issue.3 , pp. 580-587
- Nakashika, T.¹ Takiguchi, T.² Ariki, Y.³

161
- 84924309945
- Voice conversion using speaker-dependent conditional restricted Boltzmann machine
- Nakashika, T., Takiguchi, T., Ariki, Y., Voice conversion using speaker-dependent conditional restricted Boltzmann machine. EURASIP J. Audio Speech Music Process. 2015:1 (2015), 1–12.
- (2015) EURASIP J. Audio Speech Music Process. , vol.2015 , Issue.1 , pp. 1-12
- Nakashika, T.¹ Takiguchi, T.² Ariki, Y.³

162
- 84984920236
- Non-parallel training in voice conversion using an adaptive restricted Boltzmann machine
- Nakashika, T., Takiguchi, T., Minami, Y., Non-parallel training in voice conversion using an adaptive restricted Boltzmann machine. IEEE/ACM Trans. Audio Speech Lang. Process. 24:11 (2016), 2032–2045.
- (2016) IEEE/ACM Trans. Audio Speech Lang. Process. , vol.24 , Issue.11 , pp. 2032-2045
- Nakashika, T.¹ Takiguchi, T.² Minami, Y.³

163
- 84901766069
- Voice conversion based on speaker-dependent restricted Boltzmann machines
- Nakashika, T., Takiguchi, T., Ariki, Y., Voice conversion based on speaker-dependent restricted Boltzmann machines. IEICE Trans. Inf. Syst. 97:6 (2014), 1403–1410.
- (2014) IEICE Trans. Inf. Syst. , vol.97 , Issue.6 , pp. 1403-1410
- Nakashika, T.¹ Takiguchi, T.² Ariki, Y.³

164
- 78149241363
- Spectral conversion based on statistical models including time-sequence matching
- Nankaku, Y., Nakamura, K., Toda, T., Tokuda, K., Spectral conversion based on statistical models including time-sequence matching. Proceedings of the SSW, 2007.
- (2007) Proceedings of the SSW
- Nankaku, Y.¹ Nakamura, K.² Toda, T.³ Tokuda, K.⁴

165
- 0029254176
- Transformation of formants for voice conversion using artificial neural networks
- Narendranath, M., Murthy, H.A., Rajendran, S., Yegnanarayana, B., Transformation of formants for voice conversion using artificial neural networks. Speech Commun. 16:2 (1995), 207–216.
- (1995) Speech Commun. , vol.16 , Issue.2 , pp. 207-216
- Narendranath, M.¹ Murthy, H.A.² Rajendran, S.³ Yegnanarayana, B.⁴

166
- 78650302965
- Japan Advanced Institute of Science and Technology Ph.D. thesis
- Nguyen, B.P., Studies on Spectral Modification in Voice Transformation, 2009, Japan Advanced Institute of Science and Technology Ph.D. thesis.
- (2009) Studies on Spectral Modification in Voice Transformation
- Nguyen, B.P.¹

167
- 67649297853
- Spectral modification for voice gender conversion using temporal decomposition
- Nguyen, B.P., Akagi, M., Spectral modification for voice gender conversion using temporal decomposition. J. Signal Process., 2007.
- (2007) J. Signal Process.
- Nguyen, B.P.¹ Akagi, M.²

168
- 85010381832
- Phoneme-based spectral voice conversion using temporal decomposition and Gaussian mixture model
- Nguyen, B.P., Akagi, M., Phoneme-based spectral voice conversion using temporal decomposition and Gaussian mixture model. Proceedings of the ICCE, 2008.
- (2008) Proceedings of the ICCE
- Nguyen, B.P.¹ Akagi, M.²

169
- 84867055711
- Voice transformation using radial basis function
- Springer
- Nirmal, J., Patnaik, S., Zaveri, M.A., Voice transformation using radial basis function. Proceedings of the TITC, 2013, Springer, 345–351.
- (2013) Proceedings of the TITC , pp. 345-351
- Nirmal, J.¹ Patnaik, S.² Zaveri, M.A.³

170
- 84905573362
- Voice conversion using general regression neural network
- Nirmal, J., Zaveri, M., Patnaik, S., Kachare, P., Voice conversion using general regression neural network. Appl. Soft Comput. 24 (2014), 1–12.
- (2014) Appl. Soft Comput. , vol.24 , pp. 1-12
- Nirmal, J.¹ Zaveri, M.² Patnaik, S.³ Kachare, P.⁴

171
- 34547527563
- A parametric approach for voice conversion
- Nurminen, J., Popa, V., Tian, J., Tang, Y., Kiss, I., A parametric approach for voice conversion. TCSTAR WSST, 2006, 225–229.
- (2006) TCSTAR WSST , pp. 225-229
- Nurminen, J.¹ Popa, V.² Tian, J.³ Tang, Y.⁴ Kiss, I.⁵

172
- 56149116066
- Voicing level control with application in voice conversion
- Nurminen, J., Tian, J., Popa, V., Voicing level control with application in voice conversion. Proceedings of the INTERSPEECH, 2007.
- (2007) Proceedings of the INTERSPEECH
- Nurminen, J.¹ Tian, J.² Popa, V.³

173
- 78649963192
- Nara Institute of Science and Technology
- Ohtani, Y., Techniques for Improving Voice Conversion Based on Eigenvoices, 2010, Nara Institute of Science and Technology.
- (2010) Techniques for Improving Voice Conversion Based on Eigenvoices
- Ohtani, Y.¹

174
- 44949143155
- Maximum likelihood voice conversion based on GMM with STRAIGHT mixed excitation
- Ohtani, Y., Toda, T., Saruwatari, H., Shikano, K., Maximum likelihood voice conversion based on GMM with STRAIGHT mixed excitation. Proceedings of the INTERSPEECH, 2006.
- (2006) Proceedings of the INTERSPEECH
- Ohtani, Y.¹ Toda, T.² Saruwatari, H.³ Shikano, K.⁴

175
- 70450194389
- Many-to-many eigenvoice conversion with reference voice
- Ohtani, Y., Toda, T., Saruwatari, H., Shikano, K., Many-to-many eigenvoice conversion with reference voice. Proceedings of the INTERSPEECH, 2009.
- (2009) Proceedings of the INTERSPEECH
- Ohtani, Y.¹ Toda, T.² Saruwatari, H.³ Shikano, K.⁴

176
- 78049398713
- Non-parallel training for many-to-many eigenvoice conversion
- Ohtani, Y., Toda, T., Saruwatari, H., Shikano, K., Non-parallel training for many-to-many eigenvoice conversion. Proceedings of the ICASSP, 2010.
- (2010) Proceedings of the ICASSP
- Ohtani, Y.¹ Toda, T.² Saruwatari, H.³ Shikano, K.⁴

177
- 2142655909
- Interpolation properties of linear prediction parametric representations
- Paliwal, K.K., Interpolation properties of linear prediction parametric representations. Proceedings of the EUROSPEECH, 1995.
- (1995) Proceedings of the EUROSPEECH
- Paliwal, K.K.¹

178
- 0033692729
- Narrowband to wideband conversion of speech using GMM based transformation
- Park, K.-Y., Kim, H.S., Narrowband to wideband conversion of speech using GMM based transformation. Proceedings of the ICASSP, 2000.
- (2000) Proceedings of the ICASSP
- Park, K.-Y.¹ Kim, H.S.²

179
- 33044494135
- Edinburgh University Ph.D. thesis
- Patterson, D.J., A Linguistic Approach to Pitch Range Modelling, 2000, Edinburgh University Ph.D. thesis.
- (2000) A Linguistic Approach to Pitch Range Modelling
- Patterson, D.J.¹

180
- 0032664931
- An experimental study of speaker verification sensitivity to computer voice-altered imposters
- Pellom, B.L., Hansen, J.H., An experimental study of speaker verification sensitivity to computer voice-altered imposters. Proceedings of the ICASSP, 1999.
- (1999) Proceedings of the ICASSP
- Pellom, B.L.¹ Hansen, J.H.²

181
- 51449094434
- Voice conversion with linear prediction residual estimaton
- Percybrooks, W.S., Moore, E., Voice conversion with linear prediction residual estimaton. Proceedings of the ICASSP, 2008.
- (2008) Proceedings of the ICASSP
- Percybrooks, W.S.¹ Moore, E.²

182
- 84865737668
- Gaussian process experts for voice conversion
- Pilkington, N.C., Zen, H., Gales, M.J., et al. Gaussian process experts for voice conversion. Proceedings of the INTERSPEECH, 2011.
- (2011) Proceedings of the INTERSPEECH
- Pilkington, N.C.¹ Zen, H.² Gales, M.J.³

183
- 27644522706
- Vocal tract normalization equals linear transformation in cepstral space
- Pitz, M., Ney, H., Vocal tract normalization equals linear transformation in cepstral space. IEEE Trans. Speech Audio Process. 13:5 (2005), 930–944.
- (2005) IEEE Trans. Speech Audio Process. , vol.13 , Issue.5 , pp. 930-944
- Pitz, M.¹ Ney, H.²

184
- 85010372628
- The University of Tokyo Master's thesis.
- Pongkittiphan, T., Eigenvoice-Based Character Conversion and its Evaluations, 2012, The University of Tokyo Master's thesis.
- (2012) Eigenvoice-Based Character Conversion and its Evaluations
- Pongkittiphan, T.¹

185
- 70450171770
- A novel technique for voice conversion based on style and content decomposition with bilinear models
- Popa, V., Nurminen, J., Gabbouj, M., A novel technique for voice conversion based on style and content decomposition with bilinear models. Proceedings of the INTERSPEECH, 2009.
- (2009) Proceedings of the INTERSPEECH
- Popa, V.¹ Nurminen, J.² Gabbouj, M.³

186
- 84971616451
- A study of bilinear models in voice conversion
- Popa, V., Nurminen, J., Gabbouj, M., et al. A study of bilinear models in voice conversion. J. Signal Inf. Process., 2(02), 2011, 125.
- (2011) J. Signal Inf. Process. , vol.2 , Issue.2 , pp. 125
- Popa, V.¹ Nurminen, J.² Gabbouj, M.³

187
- 84867594339
- Local linear transformation for voice conversion
- Popa, V., Silen, H., Nurminen, J., Gabbouj, M., Local linear transformation for voice conversion. Proceedings of the ICASSP, 2012.
- (2012) Proceedings of the ICASSP
- Popa, V.¹ Silen, H.² Nurminen, J.³ Gabbouj, M.⁴

188
- 85010372637
- University of Cambridge Ph.D. thesis.
- Pozo, A., Voice Source and Duration Modelling for Voice Conversion and Speech Repair, 2008, University of Cambridge Ph.D. thesis.
- (2008) Voice Source and Duration Modelling for Voice Conversion and Speech Repair
- Pozo, A.¹

189
- 33751438738
- Non-linear frequency scale mapping for voice conversion in text-to-speech system with cepstral description
- Přibilová, A., Přibil, J., Non-linear frequency scale mapping for voice conversion in text-to-speech system with cepstral description. Speech Commun. 48:12 (2006), 1691–1703.
- (2006) Speech Commun. , vol.48 , Issue.12 , pp. 1691-1703
- Přibilová, A.¹ Přibil, J.²

190
- 84865763441
- A study on bag of Gaussian model with application to voice conversion
- Qiao, Y., Tong, T., Minematsu, N., A study on bag of Gaussian model with application to voice conversion. Proceedings of the INTERSPEECH, 2011, 657–660.
- (2011) Proceedings of the INTERSPEECH , pp. 657-660
- Qiao, Y.¹ Tong, T.² Minematsu, N.³

191
- 85010391480
- Tecnico Lisboa Master's thesis.
- Ramos, M.V., Voice Conversion with Deep Learning, 2016, Tecnico Lisboa Master's thesis.
- (2016) Voice Conversion with Deep Learning
- Ramos, M.V.¹

192
- 38149073264
- Voice transformation by mapping the features at syllable level
- Springer
- Rao, K.S., Laskar, R., Koolagudi, S.G., Voice transformation by mapping the features at syllable level. Pattern Recognition and Machine Intelligence, 2007, Springer, 479–486.
- (2007) Pattern Recognition and Machine Intelligence , pp. 479-486
- Rao, K.S.¹ Laskar, R.² Koolagudi, S.G.³

193
- 85036464413
- Novel pre-processing using outlier removal in voice conversion
- Rao, S.V., Shah, N.J., Patil, H.A., Novel pre-processing using outlier removal in voice conversion. Proceedings of the SSW, 2016.
- (2016) Proceedings of the SSW
- Rao, S.V.¹ Shah, N.J.² Patil, H.A.³

194
- 85009195247
- Probability models of formant parameters for voice conversion
- Rentzos, D., Qin, S.V., Ho, C.-H., Turajlic, E., Probability models of formant parameters for voice conversion. Proceedings of the EUROSPEECH, 2003.
- (2003) Proceedings of the EUROSPEECH
- Rentzos, D.¹ Qin, S.V.² Ho, C.-H.³ Turajlic, E.⁴

195
- 0030359624
- Voice conversion based on topological feature maps and time-variant filtering
- Rinscheid, A., Voice conversion based on topological feature maps and time-variant filtering. Proceedings of the ICSLP, 1996.
- (1996) Proceedings of the ICSLP
- Rinscheid, A.¹

196
- 0034847662
- Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs
- Rix, A.W., Beerends, J.G., Hollier, M.P., Hekstra, A.P., Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs. Proceedings of the ICASSP, 2001.
- (2001) Proceedings of the ICASSP
- Rix, A.W.¹ Beerends, J.G.² Hollier, M.P.³ Hekstra, A.P.⁴

197
- 84859768504
- Statistical voice conversion based on noisy channel model
- Saito, D., Watanabe, S., Nakamura, A., Minematsu, N., Statistical voice conversion based on noisy channel model. IEEE Trans. Audio Speech Lang. Process. 20:6 (2012), 1784–1794.
- (2012) IEEE Trans. Audio Speech Lang. Process. , vol.20 , Issue.6 , pp. 1784-1794
- Saito, D.¹ Watanabe, S.² Nakamura, A.³ Minematsu, N.⁴

198
- 84865798483
- One-to-many voice conversion based on tensor representation of speaker space
- Saito, D., Yamamoto, K., Minematsu, N., Hirose, K., One-to-many voice conversion based on tensor representation of speaker space. Proceedings of the INTERSPEECH, 2011.
- (2011) Proceedings of the INTERSPEECH
- Saito, D.¹ Yamamoto, K.² Minematsu, N.³ Hirose, K.⁴

199
- 33748468528
- Dynamic programming approach to voice transformation
- Salor, Ö., Demirekler, M., Dynamic programming approach to voice transformation. Speech Commun. 48:10 (2006), 1262–1272.
- (2006) Speech Commun. , vol.48 , Issue.10 , pp. 1262-1272
- Salor, Ö.¹ Demirekler, M.²

200
- 84910089725
- Hierarchical modeling of f0 contours for voice conversion
- Sanchez, G., Silen, H., Nurminen, J., Gabbouj, M., Hierarchical modeling of f0 contours for voice conversion. Proceedings of the INTERSPEECH, 2014.
- (2014) Proceedings of the INTERSPEECH
- Sanchez, G.¹ Silen, H.² Nurminen, J.³ Gabbouj, M.⁴

201
- 0026394044
- Speaker adaptation and voice conversion by codebook mapping
- Shikano, K., Nakamura, S., Abe, M., Speaker adaptation and voice conversion by codebook mapping. IEEE International Symposium on Circuits and Systems, 1991, 594–597.
- (1991) IEEE International Symposium on Circuits and Systems , pp. 594-597
- Shikano, K.¹ Nakamura, S.² Abe, M.³

202
- 70450149422
- Voice conversion based on mapping formants
- Shuang, Z., Bakis, R., Qin, Y., Voice conversion based on mapping formants. TC-STAR WSST, 2006, 219–223.
- (2006) TC-STAR WSST , pp. 219-223
- Shuang, Z.¹ Bakis, R.² Qin, Y.³

203
- 51449112440
- Voice conversion by combining frequency warping with unit selection
- Shuang, Z., Meng, F., Qin, Y., Voice conversion by combining frequency warping with unit selection. Proceedings of the ICASSP, 2008.
- (2008) Proceedings of the ICASSP
- Shuang, Z.¹ Meng, F.² Qin, Y.³

204
- 85009076640
- A novel voice conversion system based on codebook mapping with phoneme-tied weighting
- Shuang, Z.-W., Wang, Z.-X., Ling, Z.-H., Wang, R.-H., A novel voice conversion system based on codebook mapping with phoneme-tied weighting. Proceedings of the ICSLP, 2004.
- (2004) Proceedings of the ICSLP
- Shuang, Z.-W.¹ Wang, Z.-X.² Ling, Z.-H.³ Wang, R.-H.⁴

205
- 80053068819
- Voice conversion using support vector regression
- Song, P., Bao, Y., Zhao, L., Zou, C., Voice conversion using support vector regression. Electron. Lett. 47:18 (2011), 1045–1046.
- (2011) Electron. Lett. , vol.47 , Issue.18 , pp. 1045-1046
- Song, P.¹ Bao, Y.² Zhao, L.³ Zou, C.⁴

206
- 84865718211
- Uniform speech parameterization for multi-form segment synthesis
- Sorin, A., Shechtman, S., Pollet, V., Uniform speech parameterization for multi-form segment synthesis. Proceedings of the INTERSPEECH, 2011.
- (2011) Proceedings of the INTERSPEECH
- Sorin, A.¹ Shechtman, S.² Pollet, V.³

207
- 84904163933
- Dropout: a simple way to prevent neural networks from overfitting
- Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R., Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15:1 (2014), 1929–1958.
- (2014) J. Mach. Learn. Res. , vol.15 , Issue.1 , pp. 1929-1958
- Srivastava, N.¹ Hinton, G.² Krizhevsky, A.³ Sutskever, I.⁴ Salakhutdinov, R.⁵

208
- 0003447548
- Ecole Nationale Supérieure des Télécommunications Ph.D. thesis
- Stylianou, I., Harmonic Plus Noise Models for Speech, Combined with Statistical Methods, for Speech and Speaker Modification, 1996, Ecole Nationale Supérieure des Télécommunications Ph.D. thesis.
- (1996) Harmonic Plus Noise Models for Speech, Combined with Statistical Methods, for Speech and Speaker Modification
- Stylianou, I.¹

209
- 70349197715
- Voice transformation: a survey
- Stylianou, Y., Voice transformation: a survey. Proceedings of the ICASSP, 2009.
- (2009) Proceedings of the ICASSP
- Stylianou, Y.¹

210
- 0032026483
- Continuous probabilistic transform for voice conversion
- Stylianou, Y., Cappé, O., Moulines, E., Continuous probabilistic transform for voice conversion. IEEE Trans. Speech Audio Process. 6:2 (1998), 131–142.
- (1998) IEEE Trans. Speech Audio Process. , vol.6 , Issue.2 , pp. 131-142
- Stylianou, Y.¹ Cappé, O.² Moulines, E.³

211
- 84946027999
- Voice conversion using deep bidirectional long short-term memory based recurrent neural networks
- Sun, L., Kang, S., Li, K., Meng, H., Voice conversion using deep bidirectional long short-term memory based recurrent neural networks. Proceedings of the ICASSP, 2015.
- (2015) Proceedings of the ICASSP
- Sun, L.¹ Kang, S.² Li, K.³ Meng, H.⁴

212
- 78650542860
- Voice conversion: State-of-the-art and future work
- Sündermann, D., Voice conversion: State-of-the-art and future work. Fortschritte der Akustik, 31(2), 2005, 735.
- (2005) Fortschritte der Akustik , vol.31 , Issue.2 , pp. 735
- Sündermann, D.¹

213
- 84888623995
- Universitätsbibliothek der Universität der Bundeswehr München Ph.D. thesis.
- Sündermann, D., Text-independent voice conversion, 2008, Universitätsbibliothek der Universität der Bundeswehr München Ph.D. thesis.
- (2008) Text-independent voice conversion
- Sündermann, D.¹

214
- 85010452452
- Voice conversion using exclusively unaligned training data
- Sündermann, D., Bonafonte, A., Höge, H., Ney, H., Voice conversion using exclusively unaligned training data. Proceedings of the ACL/SEPLN, 2004.
- (2004) Proceedings of the ACL/SEPLN
- Sündermann, D.¹ Bonafonte, A.² Höge, H.³ Ney, H.⁴

215
- 85009084358
- A first step towards text-independent voice conversion
- Sündermann, D., Bonafonte, A., Ney, H., Höge, H., A first step towards text-independent voice conversion. Proceedings of the ICSLP, 2004.
- (2004) Proceedings of the ICSLP
- Sündermann, D.¹ Bonafonte, A.² Ney, H.³ Höge, H.⁴

216
- 33646767751
- A study on residual prediction techniques for voice conversion
- Sündermann, D., Bonafonte, A., Ney, H., Höge, H., A study on residual prediction techniques for voice conversion. Proceedings of the ICASSP, 2005.
- (2005) Proceedings of the ICASSP
- Sündermann, D.¹ Bonafonte, A.² Ney, H.³ Höge, H.⁴

217
- 33947623206
- Text-independent voice conversion based on unit selection
- Sündermann, D., Höge, H., Bonafonte, A., Ney, H., Black, A., Narayanan, S., Text-independent voice conversion based on unit selection. Proceedings of the ICASSP, 2006.
- (2006) Proceedings of the ICASSP
- Sündermann, D.¹ Höge, H.² Bonafonte, A.³ Ney, H.⁴ Black, A.⁵ Narayanan, S.⁶

218
- 84905187320
- TC-Star: cross-language voice conversion revisited
- TC-Star Workshop
- Sündermann, D., Höge, H., Bonafonte, A., Ney, H., Hirschberg, J., TC-Star: cross-language voice conversion revisited. TC-Star Workshop, 2006.
- (2006) TC-Star Workshop
- Sündermann, D.¹ Höge, H.² Bonafonte, A.³ Ney, H.⁴ Hirschberg, J.⁵

219
- 44949241666
- Text-independent cross-language voice conversion
- Sündermann, D., Höge, H., Bonafonte, A., Ney, H., Hirschberg, J., Text-independent cross-language voice conversion. Proceedings of the INTERSPEECH, 2006.
- (2006) Proceedings of the INTERSPEECH
- Sündermann, D.¹ Höge, H.² Bonafonte, A.³ Ney, H.⁴ Hirschberg, J.⁵

220
- 84946794248
- An automatic segmentation and mapping approach for voice conversion parameter training
- Sündermann, D., Ney, H., An automatic segmentation and mapping approach for voice conversion parameter training. Proceedings of the AST, 2003.
- (2003) Proceedings of the AST
- Sündermann, D.¹ Ney, H.²

221
- 84946753271
- VTLN-based cross-language voice conversion
- Sündermann, D., Ney, H., Höge, H., VTLN-based cross-language voice conversion. Proceedings of the ASRU, 2003.
- (2003) Proceedings of the ASRU
- Sündermann, D.¹ Ney, H.² Höge, H.³

222
- 84946045633
- Wavelets for intonation modeling in HMM speech synthesis
- Suni, A.S., Aalto, D., Raitio, T., Alku, P., Vainio, M., et al. Wavelets for intonation modeling in HMM speech synthesis. Proceedings of the SSW, 2013.
- (2013) Proceedings of the SSW
- Suni, A.S.¹ Aalto, D.² Raitio, T.³ Alku, P.⁴ Vainio, M.⁵

223
- 84949926049
- Modulation spectrum-based post-filter for GMM-based voice conversion
- Takamichi, S., Toda, T., Black, A.W., Nakamura, S., Modulation spectrum-based post-filter for GMM-based voice conversion. Proceedings of the APSIPA, 2014.
- (2014) Proceedings of the APSIPA
- Takamichi, S.¹ Toda, T.² Black, A.W.³ Nakamura, S.⁴

224
- 84959166270
- Modulation spectrum-constrained trajectory training algorithm for GMM-based voice conversion
- Takamichi, S., Toda, T., Black, A.W., Nakamura, S., Modulation spectrum-constrained trajectory training algorithm for GMM-based voice conversion. Proceedings of the ICASSP, 2015.
- (2015) Proceedings of the ICASSP
- Takamichi, S.¹ Toda, T.² Black, A.W.³ Nakamura, S.⁴

225
- 84905271796
- Noise-robust voice conversion based on spectral mapping on sparse space
- Takashima, R., Aihara, R., Takiguchi, T., Ariki, Y., Noise-robust voice conversion based on spectral mapping on sparse space. Proceedings of the SSW, 2013.
- (2013) Proceedings of the SSW
- Takashima, R.¹ Aihara, R.² Takiguchi, T.³ Ariki, Y.⁴

226
- 84874248255
- Exemplar-based voice conversion in noisy environment
- Takashima, R., Takiguchi, T., Ariki, Y., Exemplar-based voice conversion in noisy environment. Proceedings of the SLT, 2012.
- (2012) Proceedings of the SLT
- Takashima, R.¹ Takiguchi, T.² Ariki, Y.³

227
- 80051619373
- One sentence voice adaptation using GMM-based frequency-warping and shift with a sub-band basis spectrum model
- Tamura, M., Morita, M., Kagoshima, T., Akamine, M., One sentence voice adaptation using GMM-based frequency-warping and shift with a sub-band basis spectrum model. Proceedings of the ICASSP, 2011.
- (2011) Proceedings of the ICASSP
- Tamura, M.¹ Morita, M.² Kagoshima, T.³ Akamine, M.⁴

228
- 84905244240
- A hybrid approach to electrolaryngeal speech enhancement based on spectral subtraction and statistical voice conversion
- Tanaka, K., Toda, T., Neubig, G., Sakti, S., Nakamura, S., A hybrid approach to electrolaryngeal speech enhancement based on spectral subtraction and statistical voice conversion. Proceedings of the INTERSPEECH, 2013.
- (2013) Proceedings of the INTERSPEECH
- Tanaka, K.¹ Toda, T.² Neubig, G.³ Sakti, S.⁴ Nakamura, S.⁵

229
- 84867203066
- Maximum a posteriori adaptation for many-to-one eigenvoice conversion
- Tani, D., Toda, T., Ohtani, Y., Saruwatari, H., Shikano, K., Maximum a posteriori adaptation for many-to-one eigenvoice conversion. Proceedings of the INTERSPEECH, 2008.
- (2008) Proceedings of the INTERSPEECH
- Tani, D.¹ Toda, T.² Ohtani, Y.³ Saruwatari, H.⁴ Shikano, K.⁵

230
- 34047263010
- Prosody conversion from neutral speech to emotional speech
- Tao, J., Kang, Y., Li, A., Prosody conversion from neutral speech to emotional speech. IEEE Trans. Audio Speech Lang. Process. 14:4 (2006), 1145–1154.
- (2006) IEEE Trans. Audio Speech Lang. Process. , vol.14 , Issue.4 , pp. 1145-1154
- Tao, J.¹ Kang, Y.² Li, A.³

231
- 77953724495
- Supervisory data alignment for text-independent voice conversion
- Tao, J., Zhang, M., Nurminen, J., Tian, J., Wang, X., Supervisory data alignment for text-independent voice conversion. IEEE Trans. Audio Speech Lang. Process. 18:5 (2010), 932–943.
- (2010) IEEE Trans. Audio Speech Lang. Process. , vol.18 , Issue.5 , pp. 932-943
- Tao, J.¹ Zhang, M.² Nurminen, J.³ Tian, J.⁴ Wang, X.⁵

232
- 85061833826
- Two vocoder techniques for neutral to emotional timbre conversion
- Tesser, F., Zovato, E., Nicolao, M., Cosi, P., Two vocoder techniques for neutral to emotional timbre conversion. Proceedings of the SSW, 2010.
- (2010) Proceedings of the SSW
- Tesser, F.¹ Zovato, E.² Nicolao, M.³ Cosi, P.⁴

233
- 84912079352
- Correlation-based frequency warping for voice conversion
- IEEE
- Tian, X., Wu, Z., Lee, S., Chng, E.S., Correlation-based frequency warping for voice conversion. Proceedings of the ISCSLP, 2014, IEEE, 211–215.
- (2014) Proceedings of the ISCSLP , pp. 211-215
- Tian, X.¹ Wu, Z.² Lee, S.³ Chng, E.S.⁴

234
- 84946020861
- Sparse representation for frequency warping based voice conversion
- Tian, X., Wu, Z., Lee, S.W., Hy, N.Q., Chng, E.S., Dong, M., Sparse representation for frequency warping based voice conversion. Proceedings of the ICASSP, 2015.
- (2015) Proceedings of the ICASSP
- Tian, X.¹ Wu, Z.² Lee, S.W.³ Hy, N.Q.⁴ Chng, E.S.⁵ Dong, M.⁶

235
- 84959163883
- System fusion for high-performance voice conversion
- Tian, X., Wu, Z., Lee, S.W., Hy, N.Q., Dong, M., Chng, E.S., System fusion for high-performance voice conversion. Proceedings of the INTERSPEECH, 2015.
- (2015) Proceedings of the INTERSPEECH
- Tian, X.¹ Wu, Z.² Lee, S.W.³ Hy, N.Q.⁴ Dong, M.⁵ Chng, E.S.⁶

236
- 0003747605
- Statistical Analysis of Finite Mixture Distributions
- Wiley New York
- Titterington, D.M., Smith, A.F., Makov, U.E., et al. Statistical Analysis of Finite Mixture Distributions. Vol. 7, 1985, Wiley New York.
- (1985) , vol.7
- Titterington, D.M.¹ Smith, A.F.² Makov, U.E.³

237
- 84937840625
- Acoustic-to-articulatory inversion mapping with Gaussian mixture model
- Toda, T., Black, A.W., Tokuda, K., Acoustic-to-articulatory inversion mapping with Gaussian mixture model. Proceedings of the INTERSPEECH, 2004.
- (2004) Proceedings of the INTERSPEECH
- Toda, T.¹ Black, A.W.² Tokuda, K.³

238
- 33646779506
- Spectral conversion based on maximum likelihood estimation considering global variance of converted parameter
- Toda, T., Black, A.W., Tokuda, K., Spectral conversion based on maximum likelihood estimation considering global variance of converted parameter. Proceedings of the ICASSP, 2005.
- (2005) Proceedings of the ICASSP
- Toda, T.¹ Black, A.W.² Tokuda, K.³

239
- 57749193836
- Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory
- Toda, T., Black, A.W., Tokuda, K., Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory. IEEE Trans. Audio Speech Lang. Process. 15:8 (2007), 2222–2235.
- (2007) IEEE Trans. Audio Speech Lang. Process. , vol.15 , Issue.8 , pp. 2222-2235
- Toda, T.¹ Black, A.W.² Tokuda, K.³

240
- 38649140222
- Statistical mapping between articulatory movements and acoustic spectrum using a Gaussian mixture model
- Toda, T., Black, A.W., Tokuda, K., Statistical mapping between articulatory movements and acoustic spectrum using a Gaussian mixture model. Speech Commun. 50:3 (2008), 215–227.
- (2008) Speech Commun. , vol.50 , Issue.3 , pp. 215-227
- Toda, T.¹ Black, A.W.² Tokuda, K.³

241
- 84878390910
- Implementation of computationally efficient real-time voice conversion
- Toda, T., Muramatsu, T., Banno, H., Implementation of computationally efficient real-time voice conversion. Proceedings of the INTERSPEECH, 2012.
- (2012) Proceedings of the INTERSPEECH
- Toda, T.¹ Muramatsu, T.² Banno, H.³

242
- 84865698185
- Statistical voice conversion techniques for body-conducted unvoiced speech enhancement
- Toda, T., Nakagiri, M., Shikano, K., Statistical voice conversion techniques for body-conducted unvoiced speech enhancement. IEEE Trans. Audio Speech Lang. Process. 20:9 (2012), 2505–2517.
- (2012) IEEE Trans. Audio Speech Lang. Process. , vol.20 , Issue.9 , pp. 2505-2517
- Toda, T.¹ Nakagiri, M.² Shikano, K.³

243
- 84897939966
- Alaryngeal speech enhancement based on one-to-many eigenvoice conversion
- Toda, T., Nakamura, K., Saruwatari, H., Shikano, K., et al. Alaryngeal speech enhancement based on one-to-many eigenvoice conversion. IEEE/ACM Trans. Audio Speech Lang. Process. 22:1 (2014), 172–183.
- (2014) IEEE/ACM Trans. Audio Speech Lang. Process. , vol.22 , Issue.1 , pp. 172-183
- Toda, T.¹ Nakamura, K.² Saruwatari, H.³ Shikano, K.⁴

244
- 34547512822
- Eigenvoice conversion based on Gaussian mixture model
- Toda, T., Ohtani, Y., Shikano, K., Eigenvoice conversion based on Gaussian mixture model. Proceedings of the INTERSPEECH, 2006.
- (2006) Proceedings of the INTERSPEECH
- Toda, T.¹ Ohtani, Y.² Shikano, K.³

245
- 34547496175
- One-to-many and many-to-one voice conversion based on eigenvoices
- Toda, T., Ohtani, Y., Shikano, K., One-to-many and many-to-one voice conversion based on eigenvoices. Proceedings of the ICASSP, 2007.
- (2007) Proceedings of the ICASSP
- Toda, T.¹ Ohtani, Y.² Shikano, K.³

246
- 84994361374
- The voice conversion challenge 2016
- Toda, T., Saito, D., Villavicencio, F., Yamagishi, J., Wester, M., Wu, Z., Chen, L.-H., et al. The voice conversion challenge 2016. Proceedings of the INTERSPEECH, 2016.
- (2016) Proceedings of the INTERSPEECH
- Toda, T.¹ Saito, D.² Villavicencio, F.³ Yamagishi, J.⁴ Wester, M.⁵ Wu, Z.⁶ Chen, L.-H.⁷

247
- 0034842552
- Voice conversion algorithm based on Gaussian mixture model with dynamic frequency warping of STRAIGHT spectrum
- Toda, T., Saruwatari, H., Shikano, K., Voice conversion algorithm based on Gaussian mixture model with dynamic frequency warping of STRAIGHT spectrum. Proceedings of the ICASSP, 2001.
- (2001) Proceedings of the ICASSP
- Toda, T.¹ Saruwatari, H.² Shikano, K.³

248
- 84995686462
- NAM-to-speech conversion with Gaussian mixture models
- Toda, T., Shikano, K., NAM-to-speech conversion with Gaussian mixture models. Proceedings of the INTERSPEECH, 2005.
- (2005) Proceedings of the INTERSPEECH
- Toda, T.¹ Shikano, K.²

249
- 0028996993
- Speech parameter generation from HMM using dynamic features
- Tokuda, K., Kobayashi, T., Imai, S., Speech parameter generation from HMM using dynamic features. Proceedings of the ICASSP, 1995.
- (1995) Proceedings of the ICASSP
- Tokuda, K.¹ Kobayashi, T.² Imai, S.³

250
- 84946077883
- Directly modeling speech waveforms by neural networks for statistical parametric speech synthesis
- Tokuda, K., Zen, H., Directly modeling speech waveforms by neural networks for statistical parametric speech synthesis. Proceedings of the ICASSP, 2015.
- (2015) Proceedings of the ICASSP
- Tokuda, K.¹ Zen, H.²

251
- 76849105528
- Improvement to a NAM-captured whisper-to-speech system
- Tran, V.-A., Bailly, G., Lœvenbruck, H., Toda, T., Improvement to a NAM-captured whisper-to-speech system. Speech Commun. 52:4 (2010), 314–326.
- (2010) Speech Commun. , vol.52 , Issue.4 , pp. 314-326
- Tran, V.-A.¹ Bailly, G.² Lœvenbruck, H.³ Toda, T.⁴

252
- 0141479037
- Evaluation of methods for parametric formant transformation in voice conversion
- Turajlic, E., Rentzos, D., Vaseghi, S., Ho, C.-H., Evaluation of methods for parametric formant transformation in voice conversion. Proceedings of the ICASSP, 2003.
- (2003) Proceedings of the ICASSP
- Turajlic, E.¹ Rentzos, D.² Vaseghi, S.³ Ho, C.-H.⁴

253
- 77950029784
- Bogaziçi University Ph.D. thesis.
- Türk, O., Cross-Lingual Voice Conversion, 2007, Bogaziçi University Ph.D. thesis.
- (2007) Cross-Lingual Voice Conversion
- Türk, O.¹

254
- 85009179173
- Voice conversion methods for vocal tract and pitch contour modification
- Türk, O., Arslan, L.M., Voice conversion methods for vocal tract and pitch contour modification. Proceedings of the INTERSPEECH, 2003.
- (2003) Proceedings of the INTERSPEECH
- Türk, O.¹ Arslan, L.M.²

255
- 84863647359
- Donor selection for voice conversion
- Türk, O., Arslan, L.M., Donor selection for voice conversion. Proceedings of the EUSIPCO, 2005.
- (2005) Proceedings of the EUSIPCO
- Türk, O.¹ Arslan, L.M.²

256
- 33746653351
- Robust processing techniques for voice conversion
- Türk, O., Arslan, L.M., Robust processing techniques for voice conversion. Comput. Speech Lang. 20:4 (2006), 441–467.
- (2006) Comput. Speech Lang. , vol.20 , Issue.4 , pp. 441-467
- Türk, O.¹ Arslan, L.M.²

257
- 70349207267
- Application of voice conversion for cross-language rap singing transformation
- Turk, O., Buyuk, O., Haznedaroglu, A., Arslan, L.M., Application of voice conversion for cross-language rap singing transformation. Proceedings of the ICASSP, 2009.
- (2009) Proceedings of the ICASSP
- Turk, O.¹ Buyuk, O.² Haznedaroglu, A.³ Arslan, L.M.⁴

258
- 84867219635
- A comparison of voice conversion methods for transforming voice quality in emotional speech synthesis
- Türk, O., Schröder, M., A comparison of voice conversion methods for transforming voice quality in emotional speech synthesis. Proceedings of the INTERSPEECH, 2008.
- (2008) Proceedings of the INTERSPEECH
- Türk, O.¹ Schröder, M.²

259
- 77953699443
- Evaluation of expressive speech synthesis with voice conversion and copy resynthesis techniques
- Türk, O., Schröder, M., Evaluation of expressive speech synthesis with voice conversion and copy resynthesis techniques. IEEE Trans. Audio Speech Lang. Process. 18:5 (2010), 965–973.
- (2010) IEEE Trans. Audio Speech Lang. Process. , vol.18 , Issue.5 , pp. 965-973
- Türk, O.¹ Schröder, M.²

260
- 34547806096
- A self-organizing map with twin units capable of describing a nonlinear input–output relation applied to speech code vector mapping
- Uchino, E., Yano, K., Azetsu, T., A self-organizing map with twin units capable of describing a nonlinear input–output relation applied to speech code vector mapping. Inf. Sci. 177:21 (2007), 4634–4644.
- (2007) Inf. Sci. , vol.177 , Issue.21 , pp. 4634-4644
- Uchino, E.¹ Yano, K.² Azetsu, T.³

261
- 85010284658
- Voice conversion using frame selection and warping functions
- Uriz, A., Aguero, P., Tulli, J., Gonzalez, E., Bonafonte, A., Voice conversion using frame selection and warping functions. Proceedings of the RPIC, 2009.
- (2009) Proceedings of the RPIC
- Uriz, A.¹ Aguero, P.² Tulli, J.³ Gonzalez, E.⁴ Bonafonte, A.⁵

262
- 85010316399
- Voice Conversion Using Frame Selection
- Reporte Interno Laboratorio de Comunicaciones-UNMdP
- Uriz, A., Agüero, P.D., Erro, D., Bonafonte, A., Voice Conversion Using Frame Selection. 2008 Reporte Interno Laboratorio de Comunicaciones-UNMdP.
- (2008)
- Uriz, A.¹ Agüero, P.D.² Erro, D.³ Bonafonte, A.⁴

263
- 70450204589
- Voice conversion using k-histograms and frame selection
- Uriz, A.J., Agüero, P.D., Bonafonte, A., Tulli, J.C., Voice conversion using k-histograms and frame selection. Proceedings of the INTERSPEECH, 2009.
- (2009) Proceedings of the INTERSPEECH
- Uriz, A.J.¹ Agüero, P.D.² Bonafonte, A.³ Tulli, J.C.⁴

264
- 44949104276
- Voice conversion based on mixtures of factor analyzers
- Uto, Y., Nankaku, Y., Toda, T., Lee, A., Tokuda, K., Voice conversion based on mixtures of factor analyzers. Proceedings of the ICSLP, 2006.
- (2006) Proceedings of the ICSLP
- Uto, Y.¹ Nankaku, Y.² Toda, T.³ Lee, A.⁴ Tokuda, K.⁵

265
- 85010815133
- Voice transformation using PSOLA technique
- Valbret, H., Moulines, E., Tubach, J.-P., Voice transformation using PSOLA technique. Proceedings of the ICASSP, 1992.
- (1992) Proceedings of the ICASSP
- Valbret, H.¹ Moulines, E.² Tubach, J.-P.³

266
- 0026880275
- Voice transformation using PSOLA technique
- Valbret, H., Moulines, E., Tubach, J.P., Voice transformation using PSOLA technique. Speech Commun. 11:2 (1992), 175–187.
- (1992) Speech Commun. , vol.11 , Issue.2 , pp. 175-187
- Valbret, H.¹ Moulines, E.² Tubach, J.P.³

267
- 84959108894
- Towards minimum perceptual error training for DNN-based speech synthesis
- Valentini-Botinhao, C., Wu, Z., King, S., Towards minimum perceptual error training for DNN-based speech synthesis. Proceedings of the INTERSPEECH, 2015.
- (2015) Proceedings of the INTERSPEECH
- Valentini-Botinhao, C.¹ Wu, Z.² King, S.³

268
- 84865747520
- Intonation conversion from neutral to expressive speech
- Veaux, C., Rodet, X., Intonation conversion from neutral to expressive speech. Proceedings of the INTERSPEECH, 2011.
- (2011) Proceedings of the INTERSPEECH
- Veaux, C.¹ Rodet, X.²

269
- 77956828655
- Voice fonts for individuality representation and transformation
- Verma, A., Kumar, A., Voice fonts for individuality representation and transformation. ACM Trans. Speech Lang. Process. (TSLP), 2(1), 2005, 4.
- (2005) ACM Trans. Speech Lang. Process. (TSLP) , vol.2 , Issue.1 , pp. 4
- Verma, A.¹ Kumar, A.²

270
- 79959827418
- Applying voice conversion to concatenative singing-voice synthesis
- Villavicencio, F., Bonada, J., Applying voice conversion to concatenative singing-voice synthesis. Proceedings of the INTERSPEECH, 2010.
- (2010) Proceedings of the INTERSPEECH
- Villavicencio, F.¹ Bonada, J.²

271
- 84960864752
- Observation-model error compensation for enhanced spectral envelope transformation in voice conversion
- Villavicencio, F., Bonada, J., Hisaminato, Y., Observation-model error compensation for enhanced spectral envelope transformation in voice conversion. Proceedings of the MLSP, 2015.
- (2015) Proceedings of the MLSP
- Villavicencio, F.¹ Bonada, J.² Hisaminato, Y.³

272
- 34547541173
- A new method for speech synthesis and transformation based on an ARX-LF source-filter decomposition and HNM modeling
- Vincent, D., Rosec, O., Chonavel, T., A new method for speech synthesis and transformation based on an ARX-LF source-filter decomposition and HNM modeling. Proceedings of the ICASSP, 2007.
- (2007) Proceedings of the ICASSP
- Vincent, D.¹ Rosec, O.² Chonavel, T.³

273
- 0003557444
- Verbmobil: Foundations of Speech-to-Speech Translation
- Springer Science & Business Media
- Wahlster, W., Verbmobil: Foundations of Speech-to-Speech Translation. 2000, Springer Science & Business Media.
- (2000)
- Wahlster, W.¹

274
- 84902959938
- Emotional voice conversion for Mandarin using tone nucleus model – small corpus and high efficiency
- Wang, M., Wen, M., Hirose, K., Minematsu, N., Emotional voice conversion for Mandarin using tone nucleus model – small corpus and high efficiency. Proceedings of the Speech Prosody, 2012.
- (2012) Proceedings of the Speech Prosody
- Wang, M.¹ Wen, M.² Hirose, K.³ Minematsu, N.⁴

275
- 84988295463
- Multi-level prosody and spectrum conversion for emotional speech synthesis
- Wang, Z., Yu, Y., Multi-level prosody and spectrum conversion for emotional speech synthesis. Proceedings of the ICSP, 2014.
- (2014) Proceedings of the ICSP
- Wang, Z.¹ Yu, Y.²

276
- 85009266993
- Transformation of spectral envelope for voice conversion based on radial basis function networks
- Watanabe, T., Murakami, T., Namba, M., Hoya, T., Ishida, Y., Transformation of spectral envelope for voice conversion based on radial basis function networks. Proceedings of the ICSLP, 2002.
- (2002) Proceedings of the ICSLP
- Watanabe, T.¹ Murakami, T.² Namba, M.³ Hoya, T.⁴ Ishida, Y.⁵

277
- 84994351528
- Analysis of the voice conversion challenge 2016 evaluation results
- Wester, M., Wu, Z., Yamagishi, J., Analysis of the voice conversion challenge 2016 evaluation results. Proceedings of the INTERSPEECH, 2016.
- (2016) Proceedings of the INTERSPEECH
- Wester, M.¹ Wu, Z.² Yamagishi, J.³

278
- 85133164470
- Multidimensional scaling of systems in the Voice Conversion Challenge 2016
- Wester, M., Wu, Z., Yamagishi, J., Multidimensional scaling of systems in the Voice Conversion Challenge 2016. Proceedings of the SSW, 2016.
- (2016) Proceedings of the SSW
- Wester, M.¹ Wu, Z.² Yamagishi, J.³

279
- 33646815712
- The MOCHA-TIMIT articulatory database
- Queen Margaret University College
- Wrench, A., The MOCHA-TIMIT articulatory database. 1999, Queen Margaret University College.
- (1999)
- Wrench, A.¹

280
- 34047247202
- Voice conversion using duration-embedded bi-HMMs for expressive speech synthesis
- Wu, C.-H., Hsia, C.-C., Liu, T.-H., Wang, J.-F., Voice conversion using duration-embedded bi-HMMs for expressive speech synthesis. IEEE Trans. Audio Speech Lang. Process. 14:4 (2006), 1109–1116.
- (2006) IEEE Trans. Audio Speech Lang. Process. , vol.14 , Issue.4 , pp. 1109-1116
- Wu, C.-H.¹ Hsia, C.-C.² Liu, T.-H.³ Wang, J.-F.⁴

281
- 84994247053
- Locally linear embedding for exemplar-based spectral conversion
- Wu, Y.-C., Hwang, H.-T., Hsu, C.-C., Tsao, Y., Wang, H.-M., Locally linear embedding for exemplar-based spectral conversion. Proceedings of the INTERSPEECH, 2016.
- (2016) Proceedings of the INTERSPEECH
- Wu, Y.-C.¹ Hwang, H.-T.² Hsu, C.-C.³ Tsao, Y.⁴ Wang, H.-M.⁵

282
- 84889579519
- Conditional restricted Boltzmann machine for voice conversion
- Wu, Z., Chng, E.S., Li, H., Conditional restricted Boltzmann machine for voice conversion. Proceedings of the ChinaSIP, 2013.
- (2013) Proceedings of the ChinaSIP
- Wu, Z.¹ Chng, E.S.² Li, H.³

283
- 84910071877
- Joint nonnegative matrix factorization for exemplar-based voice conversion
- Wu, Z., Chng, E.S., Li, H., Joint nonnegative matrix factorization for exemplar-based voice conversion. Proceedings of the INTERSPEECH, 2014.
- (2014) Proceedings of the INTERSPEECH
- Wu, Z.¹ Chng, E.S.² Li, H.³

284
- 79959842826
- Text-independent F0 transformation with non-parallel data for voice conversion
- Wu, Z., Kinnunen, T., Chng, E., Li, H., Text-independent F0 transformation with non-parallel data for voice conversion. Proceedings of the INTERSPEECH, 2010.
- (2010) Proceedings of the INTERSPEECH
- Wu, Z.¹ Kinnunen, T.² Chng, E.³ Li, H.⁴

285
- 84869384026
- Mixture of factor analyzers using priors from non-parallel speech for voice conversion
- Wu, Z., Kinnunen, T., Chng, E.S., Li, H., Mixture of factor analyzers using priors from non-parallel speech for voice conversion. IEEE Signal Process. Lett. 19:12 (2012), 914–917.
- (2012) IEEE Signal Process. Lett. , vol.19 , Issue.12 , pp. 914-917
- Wu, Z.¹ Kinnunen, T.² Chng, E.S.³ Li, H.⁴

286
- 84906275384
- Vulnerability evaluation of speaker verification under voice conversion spoofing: the effect of text constraints
- Wu, Z., Larcher, A., Lee, K.-A., Chng, E., Kinnunen, T., Li, H., Vulnerability evaluation of speaker verification under voice conversion spoofing: the effect of text constraints. Proceedings of the INTERSPEECH, 2013.
- (2013) Proceedings of the INTERSPEECH
- Wu, Z.¹ Larcher, A.² Lee, K.-A.³ Chng, E.⁴ Kinnunen, T.⁵ Li, H.⁶

287
- 84956723787
- Voice conversion versus speaker verification: an overview
- Wu, Z., Li, H., Voice conversion versus speaker verification: an overview. APSIPA Trans. Signal Inf. Process., 3, 2014, e17.
- (2014) APSIPA Trans. Signal Inf. Process. , vol.3 , pp. e17
- Wu, Z.¹ Li, H.²

288
- 84911369131
- Exemplar-based sparse representation with residual compensation for voice conversion
- Wu, Z., Virtanen, T., Chng, E.S., Li, H., Exemplar-based sparse representation with residual compensation for voice conversion. IEEE/ACM Trans. Audio Speech Lang. Process. (TASLP) 22:10 (2014), 1506–1521.
- (2014) IEEE/ACM Trans. Audio Speech Lang. Process. (TASLP) , vol.22 , Issue.10 , pp. 1506-1521
- Wu, Z.¹ Virtanen, T.² Chng, E.S.³ Li, H.⁴

289
- 84906276055
- Exemplar-based unit selection for voice conversion utilizing temporal information
- Wu, Z., Virtanen, T., Kinnunen, T., Chng, E., Li, H., Exemplar-based unit selection for voice conversion utilizing temporal information. Proceedings of the INTERSPEECH, 2013.
- (2013) Proceedings of the INTERSPEECH
- Wu, Z.¹ Virtanen, T.² Kinnunen, T.³ Chng, E.⁴ Li, H.⁵

290
- 84901803470
- Exemplar-based voice conversion using non-negative spectrogram deconvolution
- Wu, Z., Virtanen, T., Kinnunen, T., Chng, E.S., Li, H., Exemplar-based voice conversion using non-negative spectrogram deconvolution. Proceedings of the SSW, 2013.
- (2013) Proceedings of the SSW
- Wu, Z.¹ Virtanen, T.² Kinnunen, T.³ Chng, E.S.⁴ Li, H.⁵

291
- 84910087395
- Sequence error (SE) minimization training of neural network for voice conversion
- Xie, F.-L., Qian, Y., Fan, Y., Soong, F.K., Li, H., Sequence error (SE) minimization training of neural network for voice conversion. Proceedings of the INTERSPEECH, 2014.
- (2014) Proceedings of the INTERSPEECH
- Xie, F.-L.¹ Qian, Y.² Fan, Y.³ Soong, F.K.⁴ Li, H.⁵

292
- 84912078522
- Pitch transformation in neural network based voice conversion
- Xie, F.-L., Qian, Y., Soong, F.K., Li, H., Pitch transformation in neural network based voice conversion. Proceedings of the ISCSLP, 2014.
- (2014) Proceedings of the ISCSLP
- Xie, F.-L.¹ Qian, Y.² Soong, F.K.³ Li, H.⁴

293
- 84890539284
- Voice conversion based on Gaussian processes by coherent and asymmetric training with limited training data
- Xu, N., Tang, Y., Bao, J., Jiang, A., Liu, X., Yang, Z., Voice conversion based on Gaussian processes by coherent and asymmetric training with limited training data. Speech Commun. 58 (2014), 124–138.
- (2014) Speech Commun. , vol.58 , pp. 124-138
- Xu, N.¹ Tang, Y.² Bao, J.³ Jiang, A.⁴ Liu, X.⁵ Yang, Z.⁶

294
- 84855906479
- Speech synthesis technologies for individuals with vocal disabilities: voice banking and reconstruction
- Yamagishi, J., Veaux, C., King, S., Renals, S., Speech synthesis technologies for individuals with vocal disabilities: voice banking and reconstruction. Acoust. Sci. Technol. 33:1 (2012), 1–5.
- (2012) Acoust. Sci. Technol. , vol.33 , Issue.1 , pp. 1-5
- Yamagishi, J.¹ Veaux, C.² King, S.³ Renals, S.⁴

295
- 85009224898
- Perceptually weighted linear transformations for voice conversion
- Ye, H., Young, S., Perceptually weighted linear transformations for voice conversion. Proceedings of the INTERSPEECH, 2003.
- (2003) Proceedings of the INTERSPEECH
- Ye, H.¹ Young, S.²

296
- 85010372765
- Voice conversion for unknown speakers
- Ye, H., Young, S., Voice conversion for unknown speakers. Proceedings of the INTERSPEECH, 2004.
- (2004) Proceedings of the INTERSPEECH
- Ye, H.¹ Young, S.²

297
- 34047254509
- Quality-enhanced voice morphing using maximum likelihood transformations
- Ye, H., Young, S., Quality-enhanced voice morphing using maximum likelihood transformations. IEEE Trans. Audio Speech Lang. Process. 14:4 (2006), 1301–1312.
- (2006) IEEE Trans. Audio Speech Lang. Process. , vol.14 , Issue.4 , pp. 1301-1312
- Ye, H.¹ Young, S.²

298
- 51849135434
- Voice conversion using HMM combined with GMM
- Yue, Z., Zou, X., Jia, Y., Wang, H., Voice conversion using HMM combined with GMM. Proceedings of the CISP, 2008.
- (2008) Proceedings of the CISP
- Yue, Z.¹ Zou, X.² Jia, Y.³ Wang, H.⁴

299
- 70349218136
- Voice conversion based on simultaneous modelling of spectrum and F0
- Yutani, K., Uto, Y., Nankaku, Y., Lee, A., Tokuda, K., Voice conversion based on simultaneous modelling of spectrum and F0. Proceedings of the ICASSP, 2009.
- (2009) Proceedings of the ICASSP
- Yutani, K.¹ Uto, Y.² Nankaku, Y.³ Lee, A.⁴ Tokuda, K.⁵

300
- 78149260085
- Continuous stochastic feature mapping based on trajectory HMMs
- Zen, H., Nankaku, Y., Tokuda, K., Continuous stochastic feature mapping based on trajectory HMMs. IEEE Trans. Audio Speech Lang. Process. 19:2 (2011), 417–430.
- (2011) IEEE Trans. Audio Speech Lang. Process. , vol.19 , Issue.2 , pp. 417-430
- Zen, H.¹ Nankaku, Y.² Tokuda, K.³

301
- 33646812247
- Voice conversion based on weighted least squares estimation criterion and residual prediction from pitch contour
- Springer
- Zhang, J., Sun, J., Dai, B., Voice conversion based on weighted least squares estimation criterion and residual prediction from pitch contour. Affective Computing and Intelligent Interaction, 2005, Springer, 326–333.
- (2005) Affective Computing and Intelligent Interaction , pp. 326-333
- Zhang, J.¹ Sun, J.² Dai, B.³

302
- 70349215699
- Phoneme cluster based state mapping for text-independent voice conversion
- Zhang, M., Tao, J., Nurminen, J., Tian, J., Wang, X., Phoneme cluster based state mapping for text-independent voice conversion. Proceedings of the ICASSP, 2009.
- (2009) Proceedings of the ICASSP
- Zhang, M.¹ Tao, J.² Nurminen, J.³ Tian, J.⁴ Wang, X.⁵

303
- 51449121435
- Text-independent voice conversion based on state mapped codebook
- Zhang, M., Tao, J., Tian, J., Wang, X., Text-independent voice conversion based on state mapped codebook. Proceedings of the ICASSP, 2008.
- (2008) Proceedings of the ICASSP
- Zhang, M.¹ Tao, J.² Tian, J.³ Wang, X.⁴

304
- 0030681728
- A formant vocoder based on mixtures of Gaussians
- Zolfaghari, P., Robinson, T., A formant vocoder based on mixtures of Gaussians. Proceedings of the ICASSP, 1997.
- (1997) Proceedings of the ICASSP
- Zolfaghari, P.¹ Robinson, T.²

305
- 84871520443
- Improving the quality of standard GMM-based voice conversion systems by considering physically motivated linear transformations
- Springer
- Zorilă, T.-C., Erro, D., Hernaez, I., Improving the quality of standard GMM-based voice conversion systems by considering physically motivated linear transformations. Advances in Speech and Language Technologies for Iberian Languages, 2012, Springer, 30–39.
- (2012) Advances in Speech and Language Technologies for Iberian Languages , pp. 30-39
- Zorilă, T.-C.¹ Erro, D.² Hernaez, I.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.