-
1
-
-
0023739214
-
Voice conversion through vector quantization
-
Abe, M., Nakamura, S., Shikano, K., Kuwabara, H., Voice conversion through vector quantization. Proceedings of the ICASSP, 1988.
-
(1988)
Proceedings of the ICASSP
-
-
Abe, M.1
Nakamura, S.2
Shikano, K.3
Kuwabara, H.4
-
2
-
-
84930664922
-
VOCAINE the vocoder and applications in speech synthesis
-
Agiomyrgiannakis, Y., VOCAINE the vocoder and applications in speech synthesis. Proceedings of the ICASSP, 2015.
-
(2015)
Proceedings of the ICASSP
-
-
Agiomyrgiannakis, Y.1
-
3
-
-
70349208681
-
ARX-LF-based source-filter methods for voice modification and transformation
-
Agiomyrgiannakis, Y., Rosec, O., ARX-LF-based source-filter methods for voice modification and transformation. Proceedings of the ICASSP, 2009.
-
(2009)
Proceedings of the ICASSP
-
-
Agiomyrgiannakis, Y.1
Rosec, O.2
-
4
-
-
84905227265
-
Voice conversion based on non-negative matrix factorization using phoneme-categorized dictionary
-
Aihara, R., Nakashika, T., Takiguchi, T., Ariki, Y., Voice conversion based on non-negative matrix factorization using phoneme-categorized dictionary. Proceedings of the ICASSP, 2014.
-
(2014)
Proceedings of the ICASSP
-
-
Aihara, R.1
Nakashika, T.2
Takiguchi, T.3
Ariki, Y.4
-
5
-
-
84890519936
-
Individuality-preserving voice conversion for articulation disorders based on non-negative matrix factorization
-
Aihara, R., Takashima, R., Takiguchi, T., Ariki, Y., Individuality-preserving voice conversion for articulation disorders based on non-negative matrix factorization. Proceedings of the ICASSP, 2013.
-
(2013)
Proceedings of the ICASSP
-
-
Aihara, R.1
Takashima, R.2
Takiguchi, T.3
Ariki, Y.4
-
6
-
-
84946095434
-
Activity-mapping non-negative matrix factorization for exemplar-based voice conversion
-
Aihara, R., Takiguchi, T., Ariki, Y., Activity-mapping non-negative matrix factorization for exemplar-based voice conversion. Proceedings of the ICASSP, 2015.
-
(2015)
Proceedings of the ICASSP
-
-
Aihara, R.1
Takiguchi, T.2
Ariki, Y.3
-
7
-
-
84959090646
-
Many-to-many voice conversion based on multiple non-negative matrix factorization
-
Aihara, R., Takiguchi, T., Ariki, Y., Many-to-many voice conversion based on multiple non-negative matrix factorization. Proceedings of the INTERSPEECH, 2015.
-
(2015)
Proceedings of the INTERSPEECH
-
-
Aihara, R.1
Takiguchi, T.2
Ariki, Y.3
-
8
-
-
84949924136
-
Exemplar-based emotional voice conversion using non-negative matrix factorization
-
Aihara, R., Ueda, R., Takiguchi, T., Ariki, Y., Exemplar-based emotional voice conversion using non-negative matrix factorization. Proceedings of the APSIPA, 2014, 10.1109/APSIPA.2014.7041640.
-
(2014)
Proceedings of the APSIPA
-
-
Aihara, R.1
Ueda, R.2
Takiguchi, T.3
Ariki, Y.4
-
9
-
-
84890542394
-
Spoofing countermeasures to protect automatic speaker verification from voice conversion
-
Alegre, F., Amehraye, A., Evans, N., Spoofing countermeasures to protect automatic speaker verification from voice conversion. Proceedings of the ICASSP, 2013.
-
(2013)
Proceedings of the ICASSP
-
-
Alegre, F.1
Amehraye, A.2
Evans, N.3
-
10
-
-
84878399314
-
Festvox: Tools for creation and analyses of large speech corpora
-
Anumanchipalli, G.K., Prahallad, K., Black, A.W., Festvox: Tools for creation and analyses of large speech corpora. Workshop on Very Large Scale Phonetics Research, UPenn, Philadelphia, 2011.
-
(2011)
Workshop on Very Large Scale Phonetics Research, UPenn, Philadelphia
-
-
Anumanchipalli, G.K.1
Prahallad, K.2
Black, A.W.3
-
11
-
-
0033154052
-
Speaker transformation algorithm using segmental codebooks (STASC)
-
Arslan, L.M., Speaker transformation algorithm using segmental codebooks (STASC). Speech Commun. 28:3 (1999), 211–226.
-
(1999)
Speech Commun.
, vol.28
, Issue.3
, pp. 211-226
-
-
Arslan, L.M.1
-
12
-
-
84863268465
-
Voice conversion by codebook mapping of line spectral frequencies and excitation spectrum
-
Arslan, L.M., Talkin, D., Voice conversion by codebook mapping of line spectral frequencies and excitation spectrum. Proceedings of the EUROSPEECH, 1997.
-
(1997)
Proceedings of the EUROSPEECH
-
-
Arslan, L.M.1
Talkin, D.2
-
13
-
-
0031643805
-
Speaker transformation using sentence HMM based alignments and detailed prosody modification
-
Arslan, L.M., Talkin, D., Speaker transformation using sentence HMM based alignments and detailed prosody modification. Proceedings of the ICASSP, 1998.
-
(1998)
Proceedings of the ICASSP
-
-
Arslan, L.M.1
Talkin, D.2
-
14
-
-
84905277636
-
Foreign accent conversion through voice morphing.
-
Aryal, S., Felps, D., Gutierrez-Osuna, R., Foreign accent conversion through voice morphing. Proceedings of the INTERSPEECH, 2013.
-
(2013)
Proceedings of the INTERSPEECH
-
-
Aryal, S.1
Felps, D.2
Gutierrez-Osuna, R.3
-
15
-
-
84906281619
-
Real-time voice conversion using artificial neural networks with rectified linear units
-
Azarov, E., Vashkevich, M., Likhachov, D., Petrovsky, A., Real-time voice conversion using artificial neural networks with rectified linear units. Proceedings of the INTERSPEECH, 2013.
-
(2013)
Proceedings of the INTERSPEECH
-
-
Azarov, E.1
Vashkevich, M.2
Likhachov, D.3
Petrovsky, A.4
-
16
-
-
56149092168
-
On the limitations of voice conversion techniques in emotion identification tasks
-
Barra, R., Montero, J.M., Macias-Guarasa, J., Gutiérrez-Arriola, J., Ferreiros, J., Pardo, J.M., On the limitations of voice conversion techniques in emotion identification tasks. Proceedings of the INTERSPEECH, 2007.
-
(2007)
Proceedings of the INTERSPEECH
-
-
Barra, R.1
Montero, J.M.2
Macias-Guarasa, J.3
Gutiérrez-Arriola, J.4
Ferreiros, J.5
Pardo, J.M.6
-
18
-
-
84941241899
-
Sequential voice conversion using grid-based approximation
-
Benisty, H., Malah, D., Crammer, K., Sequential voice conversion using grid-based approximation. Proceedings of the IEEEI, 2014.
-
(2014)
Proceedings of the IEEEI
-
-
Benisty, H.1
Malah, D.2
Crammer, K.3
-
19
-
-
0031104132
-
Application of speech conversion to alaryngeal speech enhancement
-
Bi, N., Qi, Y., Application of speech conversion to alaryngeal speech enhancement. IEEE Trans. Speech Audio Process. 5:2 (1997), 97–105.
-
(1997)
IEEE Trans. Speech Audio Process.
, vol.5
, Issue.2
, pp. 97-105
-
-
Bi, N.1
Qi, Y.2
-
20
-
-
79961212205
-
TC-STAR: Specifications of language resources and evaluation for speech synthesis
-
Bonafonte, A., Höge, H., Kiss, I., Moreno, A., Ziegenhain, U., van den Heuvel, H., Hain, H.-U., Wang, X.S., Garcia, M.-N., TC-STAR: Specifications of language resources and evaluation for speech synthesis. Proceedings of the LREC, 2006.
-
(2006)
Proceedings of the LREC
-
-
Bonafonte, A.1
Höge, H.2
Kiss, I.3
Moreno, A.4
Ziegenhain, U.5
van den Heuvel, H.6
Hain, H.-U.7
Wang, X.S.8
Garcia, M.-N.9
-
21
-
-
85009161180
-
Voice morphing system for impersonating in karaoke applications
-
Cano, P., Loscos, A., Bonada, J., De Boer, M., Serra, X., Voice morphing system for impersonating in karaoke applications. Proceedings of the ICMC, 2000.
-
(2000)
Proceedings of the ICMC
-
-
Cano, P.1
Loscos, A.2
Bonada, J.3
De Boer, M.4
Serra, X.5
-
22
-
-
5444259197
-
On the construction of a pitch conversion system
-
Ceyssens, T., Verhelst, W., Wambacq, P., On the construction of a pitch conversion system. Proceedings of the EUSIPCO, 2002.
-
(2002)
Proceedings of the EUSIPCO
-
-
Ceyssens, T.1
Verhelst, W.2
Wambacq, P.3
-
24
-
-
84910104946
-
Voice conversion using generative trained deep neural networks with multiple frame spectral envelopes
-
Chen, L.-H., Ling, Z.-H., Dai, L.-R., Voice conversion using generative trained deep neural networks with multiple frame spectral envelopes. Proceedings of the INTERSPEECH, 2014.
-
(2014)
Proceedings of the INTERSPEECH
-
-
Chen, L.-H.1
Ling, Z.-H.2
Dai, L.-R.3
-
25
-
-
84921735339
-
Voice conversion using deep neural networks with layer-wise generative training
-
Chen, L.-H., Ling, Z.-H., Liu, L.-J., Dai, L.-R., Voice conversion using deep neural networks with layer-wise generative training. IEEE/ACM Trans. Audio Speech Language Process. (TASLP) 22:12 (2014), 1859–1872.
-
(2014)
IEEE/ACM Trans. Audio Speech Language Process. (TASLP)
, vol.22
, Issue.12
, pp. 1859-1872
-
-
Chen, L.-H.1
Ling, Z.-H.2
Liu, L.-J.3
Dai, L.-R.4
-
26
-
-
84906225084
-
Joint spectral distribution modeling using restricted Boltzmann machines for voice conversion
-
Chen, L.-H., Ling, Z.-H., Song, Y., Dai, L.-R., Joint spectral distribution modeling using restricted Boltzmann machines for voice conversion. Proceedings of the INTERSPEECH, 2013.
-
(2013)
Proceedings of the INTERSPEECH
-
-
Chen, L.-H.1
Ling, Z.-H.2
Song, Y.3
Dai, L.-R.4
-
27
-
-
84994337398
-
The USTC system for voice conversion challenge 2016: neural network based approaches for spectrum, aperiodicity and F0 conversion
-
Chen, L.-H., Liu, L.-J., Ling, Z.-H., Jiang, Y., Dai, L.-R., The USTC system for voice conversion challenge 2016: neural network based approaches for spectrum, aperiodicity and F0 conversion. Proceedings of the INTERSPEECH, 2016.
-
(2016)
Proceedings of the INTERSPEECH
-
-
Chen, L.-H.1
Liu, L.-J.2
Ling, Z.-H.3
Jiang, Y.4
Dai, L.-R.5
-
28
-
-
84905560807
-
Voice conversion with smoothed GMM and MAP adaptation
-
Chen, Y., Chu, M., Chang, E., Liu, J., Liu, R., Voice conversion with smoothed GMM and MAP adaptation. Proceedings of the EUROSPEECH, 2003.
-
(2003)
Proceedings of the EUROSPEECH
-
-
Chen, Y.1
Chu, M.2
Chang, E.3
Liu, J.4
Liu, R.5
-
29
-
-
0022203520
-
Voice conversion: Factors responsible for quality
-
Childers, D., Yegnanarayana, B., Wu, K., Voice conversion: Factors responsible for quality. Proceedings of the ICASSP, 1985.
-
(1985)
Proceedings of the ICASSP
-
-
Childers, D.1
Yegnanarayana, B.2
Wu, K.3
-
30
-
-
0029253818
-
Glottal source modeling for voice conversion
-
Childers, D.G., Glottal source modeling for voice conversion. Speech Commun. 16:2 (1995), 127–138.
-
(1995)
Speech Commun.
, vol.16
, Issue.2
, pp. 127-138
-
-
Childers, D.G.1
-
31
-
-
0024680919
-
Voice conversion
-
Childers, D.G., Wu, K., Hicks, D., Yegnanarayana, B., Voice conversion. Speech Commun. 8:2 (1989), 147–158.
-
(1989)
Speech Commun.
, vol.8
, Issue.2
, pp. 147-158
-
-
Childers, D.G.1
Wu, K.2
Hicks, D.3
Yegnanarayana, B.4
-
33
-
-
84867216755
-
The linear transformation of LF glottal waveforms for voice conversion.
-
Del Pozo, A., Young, S., The linear transformation of LF glottal waveforms for voice conversion. Proceedings of the INTERSPEECH, 2008.
-
(2008)
Proceedings of the INTERSPEECH
-
-
Del Pozo, A.1
Young, S.2
-
34
-
-
77953707533
-
Spectral mapping using artificial neural networks for voice conversion
-
Desai, S., Black, A.W., Yegnanarayana, B., Prahallad, K., Spectral mapping using artificial neural networks for voice conversion. IEEE Trans. Audio Speech Lang. Process. 18:5 (2010), 954–964.
-
(2010)
IEEE Trans. Audio Speech Lang. Process.
, vol.18
, Issue.5
, pp. 954-964
-
-
Desai, S.1
Black, A.W.2
Yegnanarayana, B.3
Prahallad, K.4
-
35
-
-
84874403435
-
Singing voice conversion method based on many-to-many eigenvoice conversion and training data generation using a singing-to-singing synthesis system
-
Doi, H., Toda, T., Nakano, T., Goto, M., Nakamura, S., Singing voice conversion method based on many-to-many eigenvoice conversion and training data generation using a singing-to-singing synthesis system. Proceedings of the APSIPA, 2012.
-
(2012)
Proceedings of the APSIPA
-
-
Doi, H.1
Toda, T.2
Nakano, T.3
Goto, M.4
Nakamura, S.5
-
36
-
-
34547496196
-
Towards a voice conversion system based on frame selection
-
Dutoit, T., Holzapfel, A., Jottrand, M., Moinet, A., Perez, J., Stylianou, Y., Towards a voice conversion system based on frame selection. Proceedings of the ICASSP, 2007.
-
(2007)
Proceedings of the ICASSP
-
-
Dutoit, T.1
Holzapfel, A.2
Jottrand, M.3
Moinet, A.4
Perez, J.5
Stylianou, Y.6
-
38
-
-
33947629275
-
Residual conversion versus prediction on voice morphing systems
-
Duxans, H., Bonafonte, A., Residual conversion versus prediction on voice morphing systems. Proceedings of the ICASSP, 2006.
-
(2006)
Proceedings of the ICASSP
-
-
Duxans, H.1
Bonafonte, A.2
-
39
-
-
84994241109
-
Including dynamic and phonetic information in voice conversion systems
-
Duxans, H., Bonafonte, A., Kain, A., Van Santen, J., Including dynamic and phonetic information in voice conversion systems. Proceedings of the ICSLP, 2004.
-
(2004)
Proceedings of the ICSLP
-
-
Duxans, H.1
Bonafonte, A.2
Kain, A.3
Van Santen, J.4
-
40
-
-
79951758789
-
Voice conversion of non-aligned data using unit selection
-
Duxans, H., Erro, D., Pérez, J., Diego, F., Bonafonte, A., Moreno, A., Voice conversion of non-aligned data using unit selection. TC-STAR WSST, 2006.
-
(2006)
TC-STAR WSST
-
-
Duxans, H.1
Erro, D.2
Pérez, J.3
Diego, F.4
Bonafonte, A.5
Moreno, A.6
-
41
-
-
84946210905
-
A new method for pitch prediction from spectral envelope and its application in voice conversion
-
En-Najjary, T., Rosec, O., Chonavel, T., A new method for pitch prediction from spectral envelope and its application in voice conversion. Proceedings of the INTERSPEECH, 2003.
-
(2003)
Proceedings of the INTERSPEECH
-
-
En-Najjary, T.1
Rosec, O.2
Chonavel, T.3
-
42
-
-
85010449478
-
A voice conversion method based on joint pitch and spectral envelope transformation.
-
En-Najjary, T., Rosec, O., Chonavel, T., A voice conversion method based on joint pitch and spectral envelope transformation. Proceedings of the INTERSPEECH, 2004.
-
(2004)
Proceedings of the INTERSPEECH
-
-
En-Najjary, T.1
Rosec, O.2
Chonavel, T.3
-
43
-
-
77949522811
-
Why does unsupervised pre-training help deep learning?
-
Erhan, D., Bengio, Y., Courville, A., Manzagol, P.A., Vincent, P., Bengio, S., Why does unsupervised pre-training help deep learning? J. Mach. Learn. Res. 11 (2010), 625–660.
-
(2010)
J. Mach. Learn. Res.
, vol.11
, pp. 625-660
-
-
Erhan, D.1
Bengio, Y.2
Courville, A.3
Manzagol, P.A.4
Vincent, P.5
Bengio, S.6
-
44
-
-
84888241651
-
Towards physically interpretable parametric voice conversion functions
-
Springer
-
Erro, D., Alonso, A., Serrano, L., Navas, E., Hernáez, I., Towards physically interpretable parametric voice conversion functions. Advances in Nonlinear Speech Processing, 2013, Springer, 75–82.
-
(2013)
Advances in Nonlinear Speech Processing
, pp. 75-82
-
-
Erro, D.1
Alonso, A.2
Serrano, L.3
Navas, E.4
Hernáez, I.5
-
45
-
-
84913585254
-
Interpretable parametric voice conversion functions based on Gaussian mixture models and constrained transformations
-
Erro, D., Alonso, A., Serrano, L., Navas, E., Hernaez, I., Interpretable parametric voice conversion functions based on Gaussian mixture models and constrained transformations. Comput. Speech Lang. 30:1 (2015), 3–15.
-
(2015)
Comput. Speech Lang.
, vol.30
, Issue.1
, pp. 3-15
-
-
Erro, D.1
Alonso, A.2
Serrano, L.3
Navas, E.4
Hernaez, I.5
-
46
-
-
84994385904
-
ML parameter generation with a reformulated MGE training criterion—participation in the Voice Conversion Challenge 2016
-
Erro, D., Alonso, A., Serrano, L., Tavarez, D., Odriozola, I., Sarasola, X., Del Blanco, E., Sanchez, J., Saratxaga, I., Navas, E., et al. ML parameter generation with a reformulated MGE training criterion—participation in the Voice Conversion Challenge 2016. Proceedings of the INTERSPEECH, 2016.
-
(2016)
Proceedings of the INTERSPEECH
-
-
Erro, D.1
Alonso, A.2
Serrano, L.3
Tavarez, D.4
Odriozola, I.5
Sarasola, X.6
Del Blanco, E.7
Sanchez, J.8
Saratxaga, I.9
Navas, E.10
-
49
-
-
77953725318
-
INCA algorithm for training voice conversion systems from nonparallel corpora
-
Erro, D., Moreno, A., Bonafonte, A., INCA algorithm for training voice conversion systems from nonparallel corpora. IEEE Trans. Audio Speech Lang. Process. 18:5 (2010), 944–953.
-
(2010)
IEEE Trans. Audio Speech Lang. Process.
, vol.18
, Issue.5
, pp. 944-953
-
-
Erro, D.1
Moreno, A.2
Bonafonte, A.3
-
50
-
-
77953727123
-
Voice conversion based on weighted frequency warping
-
Erro, D., Moreno, A., Bonafonte, A., Voice conversion based on weighted frequency warping. IEEE Trans. Audio Speech Lang. Process. 18:5 (2010), 922–931.
-
(2010)
IEEE Trans. Audio Speech Lang. Process.
, vol.18
, Issue.5
, pp. 922-931
-
-
Erro, D.1
Moreno, A.2
Bonafonte, A.3
-
51
-
-
84878409257
-
Iterative MMSE estimation of vocal tract length normalization factors for voice transformation.
-
Erro, D., Navas, E., Hernáez, I., Iterative MMSE estimation of vocal tract length normalization factors for voice transformation. Proceedings of the INTERSPEECH, 2012.
-
(2012)
Proceedings of the INTERSPEECH
-
-
Erro, D.1
Navas, E.2
Hernáez, I.3
-
52
-
-
51449121679
-
On combining statistical methods and frequency warping for high-quality voice conversion
-
Erro, D., Polyakova, T., Moreno, A., On combining statistical methods and frequency warping for high-quality voice conversion. Proceedings of the ICASSP, 2008.
-
(2008)
Proceedings of the ICASSP
-
-
Erro, D.1
Polyakova, T.2
Moreno, A.3
-
53
-
-
84865795787
-
Improved HNM-based vocoder for statistical synthesizers.
-
Erro, D., Sainz, I., Navas, E., Hernáez, I., Improved HNM-based vocoder for statistical synthesizers. Proceedings of the INTERSPEECH, 2011.
-
(2011)
Proceedings of the INTERSPEECH
-
-
Erro, D.1
Sainz, I.2
Navas, E.3
Hernáez, I.4
-
54
-
-
84865743085
-
Quality improvement of voice conversion systems based on trellis structured vector quantization
-
Eslami, M., Sheikhzadeh, H., Sayadiyan, A., Quality improvement of voice conversion systems based on trellis structured vector quantization. Twelfth Annual Conference of the International Speech Communication Association, 2011.
-
(2011)
Twelfth Annual Conference of the International Speech Communication Association
-
-
Eslami, M.1
Sheikhzadeh, H.2
Sayadiyan, A.3
-
55
-
-
84986212974
-
A waveform representation framework for high-quality statistical parametric speech synthesis
-
arXiv preprint arXiv:1510.01443
-
Fan, B., Lee, S.W., Tian, X., Xie, L., Dong, M., A waveform representation framework for high-quality statistical parametric speech synthesis. Proceedings of the APSIPA, 2015, arXiv preprint arXiv:1510.01443.
-
(2015)
Proceedings of the APSIPA
-
-
Fan, B.1
Lee, S.W.2
Tian, X.3
Xie, L.4
Dong, M.5
-
56
-
-
84905179995
-
High-individuality voice conversion based on concatenative speech synthesis
-
Fujii, K., Okawa, J., Suigetsu, K., High-individuality voice conversion based on concatenative speech synthesis. World Academy of Science, Engineering and Technology, 2, 2007, 1.
-
(2007)
World Academy of Science, Engineering and Technology
, vol.2
, pp. 1
-
-
Fujii, K.1
Okawa, J.2
Suigetsu, K.3
-
57
-
-
0022667694
-
Speaker-independent isolated word recognition using dynamic features of speech spectrum
-
Furui, S., Speaker-independent isolated word recognition using dynamic features of speech spectrum. IEEE Transactions on Acoustics, Speech and Signal Processing 34:1 (1986), 52–59.
-
(1986)
IEEE Transactions on Acoustics, Speech and Signal Processing
, vol.34
, Issue.1
, pp. 52-59
-
-
Furui, S.1
-
58
-
-
6344222337
-
DARPA TIMIT Acoustic-Phonetic Continous Speech Corpus CD-ROM. NIST Speech Disc 1-1.1
-
NASA STI/Recon Technical Report N
-
Garofolo, J.S., Lamel, L.F., Fisher, W.M., Fiscus, J.G., Pallett, D.S., DARPA TIMIT Acoustic-Phonetic Continous Speech Corpus CD-ROM. NIST Speech Disc 1-1.1. NASA STI/Recon Technical Report N 93 (1993), 27403.
-
(1993)
, vol.93
, pp. 27403
-
-
Garofolo, J.S.1
Lamel, L.F.2
Fisher, W.M.3
Fiscus, J.G.4
Pallett, D.S.5
-
59
-
-
84919915933
-
Voice conversion based on feature combination with limited training data
-
Ghorbandoost, M., Sayadiyan, A., Ahangar, M., Sheikhzadeh, H., Shahrebabaki, A.S., Amini, J., Voice conversion based on feature combination with limited training data. Speech Commun. 67 (2015), 113–128.
-
(2015)
Speech Commun.
, vol.67
, pp. 113-128
-
-
Ghorbandoost, M.1
Sayadiyan, A.2
Ahangar, M.3
Sheikhzadeh, H.4
Shahrebabaki, A.S.5
Amini, J.6
-
62
-
-
84872578524
-
Deep sparse rectifier neural networks.
-
Glorot, X., Bordes, A., Bengio, Y., Deep sparse rectifier neural networks. Aistats, 2011.
-
(2011)
Aistats
-
-
Glorot, X.1
Bordes, A.2
Bengio, Y.3
-
63
-
-
84906241950
-
Assessing the intelligibility impact of vowel space expansion via clear speech-inspired frequency warping.
-
Godoy, E., Koutsogiannaki, M., Stylianou, Y., Assessing the intelligibility impact of vowel space expansion via clear speech-inspired frequency warping. Proceedings of the INTERSPEECH, 2013.
-
(2013)
Proceedings of the INTERSPEECH
-
-
Godoy, E.1
Koutsogiannaki, M.2
Stylianou, Y.3
-
64
-
-
84890562746
-
Approaching speech intelligibility enhancement with inspiration from Lombard and clear speaking styles
-
Godoy, E., Koutsogiannaki, M., Stylianou, Y., Approaching speech intelligibility enhancement with inspiration from Lombard and clear speaking styles. Comput. Speech Lang. 28:2 (2014), 629–647.
-
(2014)
Comput. Speech Lang.
, vol.28
, Issue.2
, pp. 629-647
-
-
Godoy, E.1
Koutsogiannaki, M.2
Stylianou, Y.3
-
65
-
-
70450186582
-
Alleviating the one-to-many mapping problem in voice conversion with context-dependent modelling
-
Godoy, E., Rosec, O., Chonavel, T., Alleviating the one-to-many mapping problem in voice conversion with context-dependent modelling. Proceedings of the INTERSPEECH, 2009.
-
(2009)
Proceedings of the INTERSPEECH
-
-
Godoy, E.1
Rosec, O.2
Chonavel, T.3
-
66
-
-
85010285285
-
On transforming spectral peaks in voice conversion
-
Godoy, E., Rosec, O., Chonavel, T., On transforming spectral peaks in voice conversion. Proceedings of the SSW, 2010.
-
(2010)
Proceedings of the SSW
-
-
Godoy, E.1
Rosec, O.2
Chonavel, T.3
-
67
-
-
78650273608
-
Speech spectral envelope estimation through explicit control of peak evolution in time
-
Godoy, E., Rosec, O., Chonavel, T., Speech spectral envelope estimation through explicit control of peak evolution in time. Proceedings of the ISSPA, 2010.
-
(2010)
Proceedings of the ISSPA
-
-
Godoy, E.1
Rosec, O.2
Chonavel, T.3
-
68
-
-
84865717274
-
Spectral envelope transformation using DFW and amplitude scaling for voice conversion with parallel or nonparallel corpora
-
Godoy, E., Rosec, O., Chonavel, T., Spectral envelope transformation using DFW and amplitude scaling for voice conversion with parallel or nonparallel corpora. Proceedings of the INTERSPEECH, 2011.
-
(2011)
Proceedings of the INTERSPEECH
-
-
Godoy, E.1
Rosec, O.2
Chonavel, T.3
-
69
-
-
84857498745
-
Voice conversion using dynamic frequency warping with amplitude scaling, for parallel or nonparallel corpora
-
Godoy, E., Rosec, O., Chonavel, T., Voice conversion using dynamic frequency warping with amplitude scaling, for parallel or nonparallel corpora. IEEE Trans. Audio Speech Lang. Process. 20:4 (2012), 1313–1323.
-
(2012)
IEEE Trans. Audio Speech Lang. Process.
, vol.20
, Issue.4
, pp. 1313-1323
-
-
Godoy, E.1
Rosec, O.2
Chonavel, T.3
-
70
-
-
84912138720
-
Improving segmental GMM based voice conversion method with target frame selection
-
Gu, H.-Y., Tsai, S.-F., Improving segmental GMM based voice conversion method with target frame selection. Proceedings of the ISCSLP, 2014.
-
(2014)
Proceedings of the ISCSLP
-
-
Gu, H.-Y.1
Tsai, S.-F.2
-
72
-
-
85010456787
-
Spectral mapping method for voice conversion using speaker selection and vector field smoothing
-
Hashimoto, M., Higuchi, N., Spectral mapping method for voice conversion using speaker selection and vector field smoothing. Proceedings of the EUROSPEECH, 1995.
-
(1995)
Proceedings of the EUROSPEECH
-
-
Hashimoto, M.1
Higuchi, N.2
-
73
-
-
0030351582
-
Training data selection for voice conversion using speaker selection and vector field smoothing
-
Hashimoto, M., Higuchi, N., Training data selection for voice conversion using speaker selection and vector field smoothing. Proceedings of the ICSLP, 1996.
-
(1996)
Proceedings of the ICSLP
-
-
Hashimoto, M.1
Higuchi, N.2
-
74
-
-
85010438368
-
Analysis of LSF frame selection in voice conversion
-
Helander, E., Nurminen, J., Gabbouj, M., Analysis of LSF frame selection in voice conversion. Proceedings of the SPECOM, 2007.
-
(2007)
Proceedings of the SPECOM
-
-
Helander, E.1
Nurminen, J.2
Gabbouj, M.3
-
75
-
-
51449107658
-
LSF mapping for voice conversion with very small training sets
-
Helander, E., Nurminen, J., Gabbouj, M., LSF mapping for voice conversion with very small training sets. Proceedings of the ICASSP, 2008.
-
(2008)
Proceedings of the ICASSP
-
-
Helander, E.1
Nurminen, J.2
Gabbouj, M.3
-
76
-
-
84867198185
-
On the impact of alignment on voice conversion performance
-
Helander, E., Schwarz, J., Nurminen, J., Silen, H., Gabbouj, M., On the impact of alignment on voice conversion performance. Proceedings of the INTERSPEECH, 2008.
-
(2008)
Proceedings of the INTERSPEECH
-
-
Helander, E.1
Schwarz, J.2
Nurminen, J.3
Silen, H.4
Gabbouj, M.5
-
77
-
-
79959836789
-
Maximum a posteriori voice conversion using sequential Monte Carlo methods.
-
Helander, E., Silén, H., Míguez, J., Gabbouj, M., Maximum a posteriori voice conversion using sequential Monte Carlo methods. Proceedings of the INTERSPEECH, 2010.
-
(2010)
Proceedings of the INTERSPEECH
-
-
Helander, E.1
Silén, H.2
Míguez, J.3
Gabbouj, M.4
-
78
-
-
84856141218
-
Voice conversion using dynamic kernel partial least squares regression
-
Helander, E., Silén, H., Virtanen, T., Gabbouj, M., Voice conversion using dynamic kernel partial least squares regression. IEEE Trans. Audio Speech Lang. Process. 20:3 (2012), 806–817.
-
(2012)
IEEE Trans. Audio Speech Lang. Process.
, vol.20
, Issue.3
, pp. 806-817
-
-
Helander, E.1
Silén, H.2
Virtanen, T.3
Gabbouj, M.4
-
79
-
-
77953712499
-
Voice conversion using partial least squares regression
-
Helander, E., Virtanen, T., Nurminen, J., Gabbouj, M., Voice conversion using partial least squares regression. IEEE Trans. Audio Speech Lang. Process. 18:5 (2010), 912–921.
-
(2010)
IEEE Trans. Audio Speech Lang. Process.
, vol.18
, Issue.5
, pp. 912-921
-
-
Helander, E.1
Virtanen, T.2
Nurminen, J.3
Gabbouj, M.4
-
81
-
-
56149114123
-
On the importance of pure prosody in the perception of speaker identity
-
Helander, E.E., Nurminen, J., On the importance of pure prosody in the perception of speaker identity. Proceedings of the INTERSPEECH, 2007.
-
(2007)
Proceedings of the INTERSPEECH
-
-
Helander, E.E.1
Nurminen, J.2
-
82
-
-
77956795483
-
Esophageal speech enhancement based on statistical voice conversion with Gaussian mixture models
-
Hironori, D., Nakamura, K., Tomoki, T., Saruwatari, H., Shikano, K., Esophageal speech enhancement based on statistical voice conversion with Gaussian mixture models. IEICE Trans. Inf. Syst. 93:9 (2010), 2472–2482.
-
(2010)
IEICE Trans. Inf. Syst.
, vol.93
, Issue.9
, pp. 2472-2482
-
-
Hironori, D.1
Nakamura, K.2
Tomoki, T.3
Saruwatari, H.4
Shikano, K.5
-
83
-
-
0024880831
-
Multilayer feedforward networks are universal approximators
-
Hornik, K., Stinchcombe, M., White, H., Multilayer feedforward networks are universal approximators. Neural Netw. 2:5 (1989), 359–366.
-
(1989)
Neural Netw.
, vol.2
, Issue.5
, pp. 359-366
-
-
Hornik, K.1
Stinchcombe, M.2
White, H.3
-
84
-
-
85010449824
-
Duration-embedded bi-HMM for expressive voice conversion.
-
Hsia, C.-C., Wu, C.-H., Liu, T.-H., Duration-embedded bi-HMM for expressive voice conversion. Proceedings of the INTERSPEECH, 2005.
-
(2005)
Proceedings of the INTERSPEECH
-
-
Hsia, C.-C.1
Wu, C.-H.2
Liu, T.-H.3
-
85
-
-
34548216761
-
Conversion function clustering and selection using linguistic and spectral information for emotional voice conversion
-
Hsia, C.-C., Wu, C.-H., Wu, J.-Q., Conversion function clustering and selection using linguistic and spectral information for emotional voice conversion. IEEE Trans. Comput. 56:9 (2007), 1245–1254.
-
(2007)
IEEE Trans. Comput.
, vol.56
, Issue.9
, pp. 1245-1254
-
-
Hsia, C.-C.1
Wu, C.-H.2
Wu, J.-Q.3
-
86
-
-
85075288991
-
An automatic voice conversion evaluation strategy based on perceptual background noise distortion and speaker similarity
-
Huang, D.-Y., Xie, L., Siu, Y., Lee, W., Wu, J., Ming, H., Tian, X., Zhang, S., Ding, C., Li, M., Nguyen, Q.H., Dong, M., Li, H., An automatic voice conversion evaluation strategy based on perceptual background noise distortion and speaker similarity. Proceedings of the SSW, 2016.
-
(2016)
Proceedings of the SSW
-
-
Huang, D.-Y.1
Xie, L.2
Siu, Y.3
Lee, W.4
Wu, J.5
Ming, H.6
Tian, X.7
Zhang, S.8
Ding, C.9
Li, M.10
Nguyen, Q.H.11
Dong, M.12
Li, H.13
-
87
-
-
84893234191
-
Incorporating global variance in the training phase of GMM-based voice conversion
-
Hwang, H.-T., Tsao, Y., Wang, H.-M., Wang, Y.-R., Chen, S.-H., Incorporating global variance in the training phase of GMM-based voice conversion. Proceedings of the APSIPA, 2013.
-
(2013)
Proceedings of the APSIPA
-
-
Hwang, H.-T.1
Tsao, Y.2
Wang, H.-M.3
Wang, Y.-R.4
Chen, S.-H.5
-
88
-
-
84878415076
-
A study of mutual information for GMM-based spectral conversion.
-
Hwang, H.-T., Tsao, Y., Wang, H.-M., Wang, Y.-R., Chen, S.-H., et al. A study of mutual information for GMM-based spectral conversion. Proceedings of the INTERSPEECH, 2012.
-
(2012)
Proceedings of the INTERSPEECH
-
-
Hwang, H.-T.1
Tsao, Y.2
Wang, H.-M.3
Wang, Y.-R.4
Chen, S.-H.5
-
89
-
-
0020596154
-
Cepstral analysis synthesis on the mel frequency scale
-
Imai, S., Cepstral analysis synthesis on the mel frequency scale. Proceedings of the ICASSP, 1983.
-
(1983)
Proceedings of the ICASSP
-
-
Imai, S.1
-
90
-
-
84863739383
-
Speech signal processing toolkit (SPTK), version 3.3
-
Imai, S., Kobayashi, T., Tokuda, K., Masuko, T., Koishida, K., Sako, S., Zen, H., Speech signal processing toolkit (SPTK), version 3.3. 2009.
-
(2009)
-
-
Imai, S.1
Kobayashi, T.2
Tokuda, K.3
Masuko, T.4
Koishida, K.5
Sako, S.6
Zen, H.7
-
91
-
-
0020703324
-
Mel log spectrum approximation (MLSA) filter for speech synthesis
-
Imai, S., Sumita, K., Furuichi, C., Mel log spectrum approximation (MLSA) filter for speech synthesis. Electron. Commun. Japan 66:2 (1983), 10–18.
-
(1983)
Electron. Commun. Japan
, vol.66
, Issue.2
, pp. 10-18
-
-
Imai, S.1
Sumita, K.2
Furuichi, C.3
-
93
-
-
84938935270
-
A system for transforming the emotion in speech: combining data-driven conversion techniques for prosody and voice quality.
-
Inanoglu, Z., Young, S., A system for transforming the emotion in speech: combining data-driven conversion techniques for prosody and voice quality. Proceedings of the INTERSPEECH, 2007, 490–493.
-
(2007)
Proceedings of the INTERSPEECH
, pp. 490-493
-
-
Inanoglu, Z.1
Young, S.2
-
94
-
-
58149203393
-
Data-driven emotion conversion in spoken English
-
Inanoglu, Z., Young, S., Data-driven emotion conversion in spoken English. Speech Commun. 51:3 (2009), 268–283.
-
(2009)
Speech Commun.
, vol.51
, Issue.3
, pp. 268-283
-
-
Inanoglu, Z.1
Young, S.2
-
95
-
-
85064715894
-
Speech spectrum transformation by speaker interpolation
-
IEEE
-
Iwahashi, N., Sagisaka, Y., Speech spectrum transformation by speaker interpolation. Proceedings of the ICASSP. Vol. 1, 1994, IEEE, I–461.
-
(1994)
Proceedings of the ICASSP. Vol. 1
, pp. I-461
-
-
Iwahashi, N.1
Sagisaka, Y.2
-
96
-
-
0029251946
-
Speech spectrum conversion based on speaker interpolation and multi-functional representation with weighting by radial basis function networks
-
Iwahashi, N., Sagisaka, Y., Speech spectrum conversion based on speaker interpolation and multi-functional representation with weighting by radial basis function networks. Speech Commun. 16:2 (1995), 139–151.
-
(1995)
Speech Commun.
, vol.16
, Issue.2
, pp. 139-151
-
-
Iwahashi, N.1
Sagisaka, Y.2
-
97
-
-
0031623661
-
Spectral voice conversion for text-to-speech synthesis
-
Kain, A., Macon, M.W., Spectral voice conversion for text-to-speech synthesis. Proceedings of the ICASSP, 1998.
-
(1998)
Proceedings of the ICASSP
-
-
Kain, A.1
Macon, M.W.2
-
98
-
-
84984905455
-
Text-to-speech voice adaptation from sparse training data.
-
Kain, A., Macon, M.W., Text-to-speech voice adaptation from sparse training data. Proceedings of the ICSLP, 1998.
-
(1998)
Proceedings of the ICSLP
-
-
Kain, A.1
Macon, M.W.2
-
99
-
-
0034841948
-
Design and evaluation of a voice conversion algorithm based on spectral envelope mapping and residual prediction
-
Kain, A., Macon, M.W., Design and evaluation of a voice conversion algorithm based on spectral envelope mapping and residual prediction. Proceedings of the ICASSP, 2001.
-
(2001)
Proceedings of the ICASSP
-
-
Kain, A.1
Macon, M.W.2
-
100
-
-
77953816641
-
Unit-selection text-to-speech synthesis using an asynchronous interpolation model.
-
Kain, A., van Santen, J.P., Unit-selection text-to-speech synthesis using an asynchronous interpolation model. Proceedings of the SSW, 2007.
-
(2007)
Proceedings of the SSW
-
-
Kain, A.1
van Santen, J.P.2
-
101
-
-
70349210296
-
Using speech transformation to increase speech intelligibility for the hearing- and speaking-impaired
-
Kain, A., Van Santen, J., Using speech transformation to increase speech intelligibility for the hearing- and speaking-impaired. Proceedings of the ICASSP, 2009.
-
(2009)
Proceedings of the ICASSP
-
-
Kain, A.1
Van Santen, J.2
-
103
-
-
34447635527
-
Improving the intelligibility of dysarthric speech
-
Kain, A.B., Hosom, J.-P., Niu, X., van Santen, J.P., Fried-Oken, M., Staehely, J., Improving the intelligibility of dysarthric speech. Speech Commun. 49:9 (2007), 743–759.
-
(2007)
Speech Commun.
, vol.49
, Issue.9
, pp. 743-759
-
-
Kain, A.B.1
Hosom, J.-P.2
Niu, X.3
van Santen, J.P.4
Fried-Oken, M.5
Staehely, J.6
-
104
-
-
33646785078
-
A hybrid GMM and codebook mapping method for spectral conversion
-
Springer
-
Kang, Y., Shuang, Z., Tao, J., Zhang, W., Xu, B., A hybrid GMM and codebook mapping method for spectral conversion. Affective Computing and Intelligent Interaction, 2005, Springer, 303–310.
-
(2005)
Affective Computing and Intelligent Interaction
, pp. 303-310
-
-
Kang, Y.1
Shuang, Z.2
Tao, J.3
Zhang, W.4
Xu, B.5
-
105
-
-
33947698917
-
Applying pitch target model to convert f0 contour for expressive Mandarin speech synthesis
-
Kang, Y., Tao, J., Xu, B., Applying pitch target model to convert f0 contour for expressive Mandarin speech synthesis. Proceedings of the ICASSP, 2006.
-
(2006)
Proceedings of the ICASSP
-
-
Kang, Y.1
Tao, J.2
Xu, B.3
-
106
-
-
0032673049
-
Restructuring speech representations using a pitch-adaptive time–frequency smoothing and an instantaneous-frequency-based f0 extraction: Possible role of a repetitive structure in sounds
-
Kawahara, H., Masuda-Katsuse, I., De Cheveigné, A., Restructuring speech representations using a pitch-adaptive time–frequency smoothing and an instantaneous-frequency-based f0 extraction: Possible role of a repetitive structure in sounds. Speech Commun. 27:3 (1999), 187–207.
-
(1999)
Speech Commun.
, vol.27
, Issue.3
, pp. 187-207
-
-
Kawahara, H.1
Masuda-Katsuse, I.2
De Cheveigné, A.3
-
107
-
-
51449108867
-
TANDEM-STRAIGHT: a temporally stable power spectral representation for periodic signals and applications to interference-free spectrum, f0, and aperiodicity estimation
-
Kawahara, H., Morise, M., Takahashi, T., Nisimura, R., Irino, T., Banno, H., TANDEM-STRAIGHT: a temporally stable power spectral representation for periodic signals and applications to interference-free spectrum, f0, and aperiodicity estimation. Proceedings of the ICASSP, 2008.
-
(2008)
Proceedings of the ICASSP
-
-
Kawahara, H.1
Morise, M.2
Takahashi, T.3
Nisimura, R.4
Irino, T.5
Banno, H.6
-
108
-
-
84876497245
-
GMM-based voice conversion applied to emotional speech synthesis
-
Kawanami, H., Iwami, Y., Toda, T., Saruwatari, H., Shikano, K., GMM-based voice conversion applied to emotional speech synthesis. Proceedings of the EUROSPEECH, 2003.
-
(2003)
Proceedings of the EUROSPEECH
-
-
Kawanami, H.1
Iwami, Y.2
Toda, T.3
Saruwatari, H.4
Shikano, K.5
-
109
-
-
85135141647
-
Hidden Markov model based voice conversion using dynamic characteristics of speaker
-
Kim, E.-K., Lee, S., Oh, Y.-H., Hidden Markov model based voice conversion using dynamic characteristics of speaker. Proceedings of the EUROSPEECH, 1997.
-
(1997)
Proceedings of the EUROSPEECH
-
-
Kim, E.-K.1
Lee, S.2
Oh, Y.-H.3
-
110
-
-
84905262778
-
An investigation of acoustic features for singing voice conversion based on perceptual age.
-
Kobayashi, K., Doi, H., Toda, T., Nakano, T., Goto, M., Neubig, G., Sakti, S., Nakamura, S., An investigation of acoustic features for singing voice conversion based on perceptual age. Proceedings of the INTERSPEECH, 2013.
-
(2013)
Proceedings of the INTERSPEECH
-
-
Kobayashi, K.1
Doi, H.2
Toda, T.3
Nakano, T.4
Goto, M.5
Neubig, G.6
Sakti, S.7
Nakamura, S.8
-
111
-
-
84959111811
-
Statistical singing voice conversion based on direct waveform modification with global variance
-
Kobayashi, K., Toda, T., Neubig, G., Sakti, S., Nakamura, S., Statistical singing voice conversion based on direct waveform modification with global variance. Proceedings of the INTERSPEECH, 2015.
-
(2015)
Proceedings of the INTERSPEECH
-
-
Kobayashi, K.1
Toda, T.2
Neubig, G.3
Sakti, S.4
Nakamura, S.5
-
113
-
-
84905248157
-
Simple and artefact-free spectral modifications for enhancing the intelligibility of casual speech
-
Koutsogiannaki, M., Stylianou, Y., Simple and artefact-free spectral modifications for enhancing the intelligibility of casual speech. Proceedings of the ICASSP, 2014.
-
(2014)
Proceedings of the ICASSP
-
-
Koutsogiannaki, M.1
Stylianou, Y.2
-
114
-
-
84908466787
-
Using phone and diphone based acoustic models for voice conversion: a step towards creating voice fonts
-
Kumar, A., Verma, A., Using phone and diphone based acoustic models for voice conversion: a step towards creating voice fonts. Proceedings of the ICME, 2003.
-
(2003)
Proceedings of the ICME
-
-
Kumar, A.1
Verma, A.2
-
115
-
-
0029256373
-
Acoustic characteristics of speaker individuality: control and conversion
-
Kuwabara, H., Sagisaka, Y., Acoustic characteristics of speaker individuality: control and conversion. Speech Commun. 16:2 (1995), 165–173.
-
(1995)
Speech Commun.
, vol.16
, Issue.2
, pp. 165-173
-
-
Kuwabara, H.1
Sagisaka, Y.2
-
116
-
-
84964555662
-
Speaker intonation adaptation for transforming text-to-speech synthesis speaker identity
-
Langarani, M.S.E., van Santen, J., Speaker intonation adaptation for transforming text-to-speech synthesis speaker identity. Proceedings of the ASRU, 2015.
-
(2015)
Proceedings of the ASRU
-
-
Langarani, M.S.E.1
van Santen, J.2
-
117
-
-
84865847955
-
Comparing ANN and GMM in a voice conversion framework
-
Laskar, R., Chakrabarty, D., Talukdar, F., Rao, K.S., Banerjee, K., Comparing ANN and GMM in a voice conversion framework. Appl. Soft Comput. 12:11 (2012), 3332–3342.
-
(2012)
Appl. Soft Comput.
, vol.12
, Issue.11
, pp. 3332-3342
-
-
Laskar, R.1
Chakrabarty, D.2
Talukdar, F.3
Rao, K.S.4
Banerjee, K.5
-
118
-
-
84890501677
-
Voice conversion by mapping the spectral and prosodic features using support vector machine
-
Springer
-
Laskar, R.H., Talukdar, F.A., Bhattacharjee, R., Das, S., Voice conversion by mapping the spectral and prosodic features using support vector machine. Applications of Soft Computing, 2009, Springer, 519–528.
-
(2009)
Applications of Soft Computing
, pp. 519-528
-
-
Laskar, R.H.1
Talukdar, F.A.2
Bhattacharjee, R.3
Das, S.4
-
119
-
-
84910030281
-
Voice expression conversion with factorised HMM-TTS models
-
Latorre, J., Wan, V., Yanagisawa, K., Voice expression conversion with factorised HMM-TTS models. Proceedings of the INTERSPEECH, 2014.
-
(2014)
Proceedings of the INTERSPEECH
-
-
Latorre, J.1
Wan, V.2
Yanagisawa, K.3
-
120
-
-
44949210554
-
MAP-based adaptation for speech conversion using adaptation data selection and non-parallel training
-
Lee, C.-H., Wu, C.-H., MAP-based adaptation for speech conversion using adaptation data selection and non-parallel training. Proceedings of the INTERSPEECH, 2006.
-
(2006)
Proceedings of the INTERSPEECH
-
-
Lee, C.-H.1
Wu, C.-H.2
-
121
-
-
38149065136
-
Statistical approach for voice personality transformation
-
Lee, K.-S., Statistical approach for voice personality transformation. IEEE Trans. Audio Speech Lang. Process. 15:2 (2007), 641–651.
-
(2007)
IEEE Trans. Audio Speech Lang. Process.
, vol.15
, Issue.2
, pp. 641-651
-
-
Lee, K.-S.1
-
122
-
-
84896464538
-
A unit selection approach for voice transformation
-
Lee, K.-S., A unit selection approach for voice transformation. Speech Commun. 60 (2014), 30–43.
-
(2014)
Speech Commun.
, vol.60
, pp. 30-43
-
-
Lee, K.-S.1
-
123
-
-
84876489382
-
Emotional speech conversion based on spectrum-prosody dual transformation
-
Li, B., Xiao, Z., Shen, Y., Zhou, Q., Tao, Z., Emotional speech conversion based on spectrum-prosody dual transformation. Proceedings of the ICSP, 2012.
-
(2012)
Proceedings of the ICSP
-
-
Li, B.1
Xiao, Z.2
Shen, Y.3
Zhou, Q.4
Tao, Z.5
-
124
-
-
85032750981
-
Deep learning for acoustic modeling in parametric speech generation: A systematic review of existing techniques and future trends
-
Ling, Z.-H., Kang, S.-Y., Zen, H., Senior, A., Schuster, M., Qian, X.-J., Meng, H.M., Deng, L., Deep learning for acoustic modeling in parametric speech generation: A systematic review of existing techniques and future trends. IEEE Signal Process. Mag. 32:3 (2015), 35–52.
-
(2015)
IEEE Signal Process. Mag.
, vol.32
, Issue.3
, pp. 35-52
-
-
Ling, Z.-H.1
Kang, S.-Y.2
Zen, H.3
Senior, A.4
Schuster, M.5
Qian, X.-J.6
Meng, H.M.7
Deng, L.8
-
125
-
-
84905223323
-
Using bidirectional associative memories for joint spectral envelope modeling in voice conversion
-
Liu, L.-J., Chen, L.-H., Ling, Z.-H., Dai, L.-R., Using bidirectional associative memories for joint spectral envelope modeling in voice conversion. Proceedings of the ICASSP, 2014.
-
(2014)
Proceedings of the ICASSP
-
-
Liu, L.-J.1
Chen, L.-H.2
Ling, Z.-H.3
Dai, L.-R.4
-
126
-
-
84946076200
-
Spectral conversion using deep neural networks trained with multi-source speakers
-
Liu, L.-J., Chen, L.-H., Ling, Z.-H., Dai, L.-R., Spectral conversion using deep neural networks trained with multi-source speakers. Proceedings of the ICASSP, 2015.
-
(2015)
Proceedings of the ICASSP
-
-
Liu, L.-J.1
Chen, L.-H.2
Ling, Z.-H.3
Dai, L.-R.4
-
127
-
-
77953726259
-
Pitch and duration transformation with non-parallel data
-
Lolive, D., Barbot, N., Boeffard, O., Pitch and duration transformation with non-parallel data. Proceedings of the Speech Prosody, 2008.
-
(2008)
Proceedings of the Speech Prosody
-
-
Lolive, D.1
Barbot, N.2
Boeffard, O.3
-
129
-
-
85059803513
-
Speaker conversion through non-linear frequency warping of STRAIGHT spectrum
-
Maeda, N., Banno, H., Kajita, S., Takeda, K., Itakura, F., Speaker conversion through non-linear frequency warping of STRAIGHT spectrum. Proceedings of the EUROSPEECH, 1999.
-
(1999)
Proceedings of the EUROSPEECH
-
-
Maeda, N.1
Banno, H.2
Kajita, S.3
Takeda, K.4
Itakura, F.5
-
130
-
-
34548785064
-
Voice conversion using nonlinear principal component analysis
-
Makki, B., Seyedsalehi, S., Sadati, N., Hosseini, M.N., Voice conversion using nonlinear principal component analysis. Proceedings of the CIISP, 2007.
-
(2007)
Proceedings of the CIISP
-
-
Makki, B.1
Seyedsalehi, S.2
Sadati, N.3
Hosseini, M.N.4
-
131
-
-
84905269973
-
Multimodal voice conversion using non-negative matrix factorization in noisy environments
-
Masaka, K., Aihara, R., Takiguchi, T., Ariki, Y., Multimodal voice conversion using non-negative matrix factorization in noisy environments. Proceedings of the ICASSP, 2014.
-
(2014)
Proceedings of the ICASSP
-
-
Masaka, K.1
Aihara, R.2
Takiguchi, T.3
Ariki, Y.4
-
132
-
-
34547534995
-
Cost reduction of training mapping function based on multistep voice conversion
-
Masuda, T., Shozakai, M., Cost reduction of training mapping function based on multistep voice conversion. Proceedings of the ICASSP, 2007.
-
(2007)
Proceedings of the ICASSP
-
-
Masuda, T.1
Shozakai, M.2
-
133
-
-
0015677419
-
Multidimensional representation of personal quality of vowels and its acoustical correlates
-
Matsumoto, H., Hiki, S., Sone, T., Nimura, T., Multidimensional representation of personal quality of vowels and its acoustical correlates. IEEE Trans. Audio Electroacoust. 21:5 (1973), 428–436.
-
(1973)
IEEE Trans. Audio Electroacoust.
, vol.21
, Issue.5
, pp. 428-436
-
-
Matsumoto, H.1
Hiki, S.2
Sone, T.3
Nimura, T.4
-
134
-
-
85007685968
-
Unsupervised speaker adaptation from short utterances based on a minimized fuzzy objective function.
-
Matsumoto, H., Yamashita, Y., Unsupervised speaker adaptation from short utterances based on a minimized fuzzy objective function. J. Acoust. Soc. Japan (E) 14:5 (1993), 353–361.
-
(1993)
J. Acoust. Soc. Japan (E)
, vol.14
, Issue.5
, pp. 353-361
-
-
Matsumoto, H.1
Yamashita, Y.2
-
135
-
-
56149098813
-
Comparing GMM-based speech transformation systems
-
Mesbahi, L., Barreaud, V., Boeffard, O., Comparing GMM-based speech transformation systems. Proceedings of the INTERSPEECH, 2007.
-
(2007)
Proceedings of the INTERSPEECH
-
-
Mesbahi, L.1
Barreaud, V.2
Boeffard, O.3
-
136
-
-
85047459969
-
GMM-based speech transformation systems under data reduction
-
Mesbahi, L., Barreaud, V., Boeffard, O., GMM-based speech transformation systems under data reduction. Proceedings of the SSW, 2007.
-
(2007)
Proceedings of the SSW
-
-
Mesbahi, L.1
Barreaud, V.2
Boeffard, O.3
-
137
-
-
84994251909
-
Deep bidirectional LSTM modeling of timbre and prosody for emotional voice conversion
-
Ming, H., Huang, D., Xie, L., Wu, J., Li, M.D.H., Deep bidirectional LSTM modeling of timbre and prosody for emotional voice conversion. Proceedings of the INTERSPEECH, 2016.
-
(2016)
Proceedings of the INTERSPEECH
-
-
Ming, H.1
Huang, D.2
Xie, L.3
Wu, J.4
Li, M.D.H.5
-
138
-
-
0029256372
-
Voice conversion algorithm based on piecewise linear conversion rules of formant frequency and spectrum tilt
-
Mizuno, H., Abe, M., Voice conversion algorithm based on piecewise linear conversion rules of formant frequency and spectrum tilt. Speech Commun. 16:2 (1995), 153–164.
-
(1995)
Speech Commun.
, vol.16
, Issue.2
, pp. 153-164
-
-
Mizuno, H.1
Abe, M.2
-
140
-
-
84946685887
-
Voice conversion using deep neural networks with speaker-independent pre-training
-
Mohammadi, S.H., Kain, A., Voice conversion using deep neural networks with speaker-independent pre-training. Proceedings of the SLT, 2014.
-
(2014)
Proceedings of the SLT
-
-
Mohammadi, S.H.1
Kain, A.2
-
141
-
-
84959173289
-
Semi-supervised training of a voice conversion mapping function using a joint-autoencoder
-
Mohammadi, S.H., Kain, A., Semi-supervised training of a voice conversion mapping function using a joint-autoencoder. Proceedings of the INTERSPEECH, 2015.
-
(2015)
Proceedings of the INTERSPEECH
-
-
Mohammadi, S.H.1
Kain, A.2
-
142
-
-
84994219829
-
A voice conversion mapping function based on a stacked joint-autoencoder
-
Mohammadi, S.H., Kain, A., A voice conversion mapping function based on a stacked joint-autoencoder. Proceedings of the INTERSPEECH, 2016.
-
(2016)
Proceedings of the INTERSPEECH
-
-
Mohammadi, S.H.1
Kain, A.2
-
143
-
-
84878384703
-
Making conversational vowels more clear.
-
Mohammadi, S.H., Kain, A., van Santen, J.P., Making conversational vowels more clear. Proceedings of the INTERSPEECH, 2012.
-
(2012)
Proceedings of the INTERSPEECH
-
-
Mohammadi, S.H.1
Kain, A.2
van Santen, J.P.3
-
144
-
-
84908519225
-
CheapTrick, a spectral envelope estimator for high-quality speech synthesis
-
Morise, M., CheapTrick, a spectral envelope estimator for high-quality speech synthesis. Speech Commun. 67 (2015), 1–7.
-
(2015)
Speech Commun.
, vol.67
, pp. 1-7
-
-
Morise, M.1
-
145
-
-
84976902575
-
WORLD: a vocoder-based high-quality speech synthesis system for real-time applications
-
Morise, M., Yokomori, F., Ozawa, K., WORLD: a vocoder-based high-quality speech synthesis system for real-time applications. IEICE Trans. Inf. Syst., 2016.
-
(2016)
IEICE Trans. Inf. Syst.
-
-
Morise, M.1
Yokomori, F.2
Ozawa, K.3
-
146
-
-
84878384415
-
Synthetic f0 can effectively convey speaker ID in delexicalized speech
-
Morley, E., Klabbers, E., van Santen, J.P., Kain, A., Mohammadi, S.H., Synthetic f0 can effectively convey speaker ID in delexicalized speech. Proceedings of the INTERSPEECH, 2012.
-
(2012)
Proceedings of the INTERSPEECH
-
-
Morley, E.1
Klabbers, E.2
van Santen, J.P.3
Kain, A.4
Mohammadi, S.H.5
-
147
-
-
0036753077
-
Reconstruction of speech from whispers
-
Morris, R.W., Clements, M.A., Reconstruction of speech from whispers. Med. Eng. Phys. 24:7 (2002), 515–520.
-
(2002)
Med. Eng. Phys.
, vol.24
, Issue.7
, pp. 515-520
-
-
Morris, R.W.1
Clements, M.A.2
-
148
-
-
34547552192
-
Conditional vector quantization for voice conversion
-
Mouchtaris, A., Agiomyrgiannakis, Y., Stylianou, Y., Conditional vector quantization for voice conversion. Proceedings of the ICASSP, 2007.
-
(2007)
Proceedings of the ICASSP
-
-
Mouchtaris, A.1
Agiomyrgiannakis, Y.2
Stylianou, Y.3
-
149
-
-
4544297119
-
Non-parallel training for voice conversion by maximum likelihood constrained adaptation
-
Mouchtaris, A., Van der Spiegel, J., Mueller, P., Non-parallel training for voice conversion by maximum likelihood constrained adaptation. Proceedings of the ICASSP, 2004.
-
(2004)
Proceedings of the ICASSP
-
-
Mouchtaris, A.1
Van der Spiegel, J.2
Mueller, P.3
-
150
-
-
11244303645
-
A spectral conversion approach to the iterative Wiener filter for speech enhancement
-
Mouchtaris, A., Van der Spiegel, J., Mueller, P., A spectral conversion approach to the iterative Wiener filter for speech enhancement. Proceedings of the ICME, 2004.
-
(2004)
Proceedings of the ICME
-
-
Mouchtaris, A.1
Van der Spiegel, J.2
Mueller, P.3
-
151
-
-
34047245444
-
Nonparallel training for voice conversion based on a parameter adaptation approach
-
Mouchtaris, A., Van der Spiegel, J., Mueller, P., Nonparallel training for voice conversion based on a parameter adaptation approach. IEEE Trans. Audio Speech Lang. Process. 14:3 (2006), 952–963.
-
(2006)
IEEE Trans. Audio Speech Lang. Process.
, vol.14
, Issue.3
, pp. 952-963
-
-
Mouchtaris, A.1
Van der Spiegel, J.2
Mueller, P.3
-
152
-
-
0025543906
-
Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones
-
Moulines, E., Charpentier, F., Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones. Speech Commun. 9:5 (1990), 453–467.
-
(1990)
Speech Commun.
, vol.9
, Issue.5
, pp. 453-467
-
-
Moulines, E.1
Charpentier, F.2
-
153
-
-
84867211725
-
Low-delay voice conversion based on maximum likelihood estimation of spectral parameter trajectory
-
Muramatsu, T., Ohtani, Y., Toda, T., Saruwatari, H., Shikano, K., Low-delay voice conversion based on maximum likelihood estimation of spectral parameter trajectory. Proceedings of the INTERSPEECH, 2008.
-
(2008)
Proceedings of the INTERSPEECH
-
-
Muramatsu, T.1
Ohtani, Y.2
Toda, T.3
Saruwatari, H.4
Shikano, K.5
-
154
-
-
44949187612
-
Improving body transmitted unvoiced speech with statistical voice conversion
-
Nakagiri, M., Toda, T., Kashioka, H., Shikano, K., Improving body transmitted unvoiced speech with statistical voice conversion. Proceedings of the INTERSPEECH, 2006.
-
(2006)
Proceedings of the INTERSPEECH
-
-
Nakagiri, M.1
Toda, T.2
Kashioka, H.3
Shikano, K.4
-
155
-
-
85010461545
-
A speech communication aid system for total laryngectomies using voice conversion of body transmitted artificial speech
-
Nakamura, K., Toda, T., Saruwatari, H., Shikano, K., A speech communication aid system for total laryngectomies using voice conversion of body transmitted artificial speech. J. Acoust. Soc. Am. 120:5 (2006), 3351.
-
(2006)
J. Acoust. Soc. Am.
, vol.120
, Issue.5
, pp. 3351
-
-
Nakamura, K.1
Toda, T.2
Saruwatari, H.3
Shikano, K.4
-
156
-
-
80052698826
-
Speaking-aid systems using GMM-based voice conversion for electrolaryngeal speech
-
Nakamura, K., Toda, T., Saruwatari, H., Shikano, K., Speaking-aid systems using GMM-based voice conversion for electrolaryngeal speech. Speech Commun. 54:1 (2012), 134–146.
-
(2012)
Speech Commun.
, vol.54
, Issue.1
, pp. 134-146
-
-
Nakamura, K.1
Toda, T.2
Saruwatari, H.3
Shikano, K.4
-
157
-
-
84906280857
-
Voice conversion in high-order eigen space using deep belief nets
-
Nakashika, T., Takashima, R., Takiguchi, T., Ariki, Y., Voice conversion in high-order eigen space using deep belief nets. Proceedings of the INTERSPEECH, 2013.
-
(2013)
Proceedings of the INTERSPEECH
-
-
Nakashika, T.1
Takashima, R.2
Takiguchi, T.3
Ariki, Y.4
-
158
-
-
84910087396
-
High-order sequence modeling using speaker-dependent recurrent temporal restricted Boltzmann machines for voice conversion
-
Nakashika, T., Takiguchi, T., Ariki, Y., High-order sequence modeling using speaker-dependent recurrent temporal restricted Boltzmann machines for voice conversion. Proceedings of the INTERSPEECH, 2014.
-
(2014)
Proceedings of the INTERSPEECH
-
-
Nakashika, T.1
Takiguchi, T.2
Ariki, Y.3
-
159
-
-
84946019814
-
Sparse nonlinear representation for voice conversion
-
Nakashika, T., Takiguchi, T., Ariki, Y., Sparse nonlinear representation for voice conversion. Proceedings of the ICME, 2015.
-
(2015)
Proceedings of the ICME
-
-
Nakashika, T.1
Takiguchi, T.2
Ariki, Y.3
-
160
-
-
84923867813
-
Voice conversion using RNN pre-trained by recurrent temporal restricted Boltzmann machines
-
Nakashika, T., Takiguchi, T., Ariki, Y., Voice conversion using RNN pre-trained by recurrent temporal restricted Boltzmann machines. IEEE/ACM Trans. Audio Speech Lang. Process. 23:3 (2015), 580–587, 10.1109/TASLP.2014.2379589.
-
(2015)
IEEE/ACM Trans. Audio Speech Lang. Process.
, vol.23
, Issue.3
, pp. 580-587
-
-
Nakashika, T.1
Takiguchi, T.2
Ariki, Y.3
-
161
-
-
84924309945
-
Voice conversion using speaker-dependent conditional restricted Boltzmann machine
-
Nakashika, T., Takiguchi, T., Ariki, Y., Voice conversion using speaker-dependent conditional restricted Boltzmann machine. EURASIP J. Audio Speech Music Process. 2015:1 (2015), 1–12.
-
(2015)
EURASIP J. Audio Speech Music Process.
, vol.2015
, Issue.1
, pp. 1-12
-
-
Nakashika, T.1
Takiguchi, T.2
Ariki, Y.3
-
162
-
-
84984920236
-
Non-parallel training in voice conversion using an adaptive restricted Boltzmann machine
-
Nakashika, T., Takiguchi, T., Minami, Y., Non-parallel training in voice conversion using an adaptive restricted Boltzmann machine. IEEE/ACM Trans. Audio Speech Lang. Process. 24:11 (2016), 2032–2045.
-
(2016)
IEEE/ACM Trans. Audio Speech Lang. Process.
, vol.24
, Issue.11
, pp. 2032-2045
-
-
Nakashika, T.1
Takiguchi, T.2
Minami, Y.3
-
163
-
-
84901766069
-
Voice conversion based on speaker-dependent restricted Boltzmann machines
-
Nakashika, T., Takiguchi, T., Ariki, Y., Voice conversion based on speaker-dependent restricted Boltzmann machines. IEICE Trans. Inf. Syst. 97:6 (2014), 1403–1410.
-
(2014)
IEICE Trans. Inf. Syst.
, vol.97
, Issue.6
, pp. 1403-1410
-
-
Nakashika, T.1
Takiguchi, T.2
Ariki, Y.3
-
164
-
-
78149241363
-
Spectral conversion based on statistical models including time-sequence matching
-
Nankaku, Y., Nakamura, K., Toda, T., Tokuda, K., Spectral conversion based on statistical models including time-sequence matching. Proceedings of the SSW, 2007.
-
(2007)
Proceedings of the SSW
-
-
Nankaku, Y.1
Nakamura, K.2
Toda, T.3
Tokuda, K.4
-
165
-
-
0029254176
-
Transformation of formants for voice conversion using artificial neural networks
-
Narendranath, M., Murthy, H.A., Rajendran, S., Yegnanarayana, B., Transformation of formants for voice conversion using artificial neural networks. Speech Commun. 16:2 (1995), 207–216.
-
(1995)
Speech Commun.
, vol.16
, Issue.2
, pp. 207-216
-
-
Narendranath, M.1
Murthy, H.A.2
Rajendran, S.3
Yegnanarayana, B.4
-
167
-
-
67649297853
-
Spectral modification for voice gender conversion using temporal decomposition
-
Nguyen, B.P., Akagi, M., Spectral modification for voice gender conversion using temporal decomposition. J. Signal Process., 2007.
-
(2007)
J. Signal Process.
-
-
Nguyen, B.P.1
Akagi, M.2
-
168
-
-
85010381832
-
Phoneme-based spectral voice conversion using temporal decomposition and Gaussian mixture model
-
Nguyen, B.P., Akagi, M., Phoneme-based spectral voice conversion using temporal decomposition and Gaussian mixture model. Proceedings of the ICCE, 2008.
-
(2008)
Proceedings of the ICCE
-
-
Nguyen, B.P.1
Akagi, M.2
-
169
-
-
84867055711
-
Voice transformation using radial basis function
-
Springer
-
Nirmal, J., Patnaik, S., Zaveri, M.A., Voice transformation using radial basis function. Proceedings of the TITC, 2013, Springer, 345–351.
-
(2013)
Proceedings of the TITC
, pp. 345-351
-
-
Nirmal, J.1
Patnaik, S.2
Zaveri, M.A.3
-
170
-
-
84905573362
-
Voice conversion using general regression neural network
-
Nirmal, J., Zaveri, M., Patnaik, S., Kachare, P., Voice conversion using general regression neural network. Appl. Soft Comput. 24 (2014), 1–12.
-
(2014)
Appl. Soft Comput.
, vol.24
, pp. 1-12
-
-
Nirmal, J.1
Zaveri, M.2
Patnaik, S.3
Kachare, P.4
-
171
-
-
34547527563
-
A parametric approach for voice conversion
-
Nurminen, J., Popa, V., Tian, J., Tang, Y., Kiss, I., A parametric approach for voice conversion. TCSTAR WSST, 2006, 225–229.
-
(2006)
TCSTAR WSST
, pp. 225-229
-
-
Nurminen, J.1
Popa, V.2
Tian, J.3
Tang, Y.4
Kiss, I.5
-
172
-
-
56149116066
-
Voicing level control with application in voice conversion
-
Nurminen, J., Tian, J., Popa, V., Voicing level control with application in voice conversion. Proceedings of the INTERSPEECH, 2007.
-
(2007)
Proceedings of the INTERSPEECH
-
-
Nurminen, J.1
Tian, J.2
Popa, V.3
-
174
-
-
44949143155
-
Maximum likelihood voice conversion based on GMM with STRAIGHT mixed excitation
-
Ohtani, Y., Toda, T., Saruwatari, H., Shikano, K., Maximum likelihood voice conversion based on GMM with STRAIGHT mixed excitation. Proceedings of the INTERSPEECH, 2006.
-
(2006)
Proceedings of the INTERSPEECH
-
-
Ohtani, Y.1
Toda, T.2
Saruwatari, H.3
Shikano, K.4
-
175
-
-
70450194389
-
Many-to-many eigenvoice conversion with reference voice
-
Ohtani, Y., Toda, T., Saruwatari, H., Shikano, K., Many-to-many eigenvoice conversion with reference voice. Proceedings of the INTERSPEECH, 2009.
-
(2009)
Proceedings of the INTERSPEECH
-
-
Ohtani, Y.1
Toda, T.2
Saruwatari, H.3
Shikano, K.4
-
176
-
-
78049398713
-
Non-parallel training for many-to-many eigenvoice conversion
-
Ohtani, Y., Toda, T., Saruwatari, H., Shikano, K., Non-parallel training for many-to-many eigenvoice conversion. Proceedings of the ICASSP, 2010.
-
(2010)
Proceedings of the ICASSP
-
-
Ohtani, Y.1
Toda, T.2
Saruwatari, H.3
Shikano, K.4
-
177
-
-
2142655909
-
Interpolation properties of linear prediction parametric representations.
-
Paliwal, K.K., Interpolation properties of linear prediction parametric representations. Proceedings of the EUROSPEECH, 1995.
-
(1995)
Proceedings of the EUROSPEECH
-
-
Paliwal, K.K.1
-
178
-
-
0033692729
-
Narrowband to wideband conversion of speech using GMM based transformation
-
Park, K.-Y., Kim, H.S., Narrowband to wideband conversion of speech using GMM based transformation. Proceedings of the ICASSP, 2000.
-
(2000)
Proceedings of the ICASSP
-
-
Park, K.-Y.1
Kim, H.S.2
-
180
-
-
0032664931
-
An experimental study of speaker verification sensitivity to computer voice-altered imposters
-
Pellom, B.L., Hansen, J.H., An experimental study of speaker verification sensitivity to computer voice-altered imposters. Proceedings of the ICASSP, 1999.
-
(1999)
Proceedings of the ICASSP
-
-
Pellom, B.L.1
Hansen, J.H.2
-
182
-
-
84865737668
-
Gaussian process experts for voice conversion
-
Pilkington, N.C., Zen, H., Gales, M.J., et al. Gaussian process experts for voice conversion. Proceedings of the INTERSPEECH, 2011.
-
(2011)
Proceedings of the INTERSPEECH
-
-
Pilkington, N.C.1
Zen, H.2
Gales, M.J.3
-
183
-
-
27644522706
-
Vocal tract normalization equals linear transformation in cepstral space
-
Pitz, M., Ney, H., Vocal tract normalization equals linear transformation in cepstral space. IEEE Trans. Speech Audio Process. 13:5 (2005), 930–944.
-
(2005)
IEEE Trans. Speech Audio Process.
, vol.13
, Issue.5
, pp. 930-944
-
-
Pitz, M.1
Ney, H.2
-
185
-
-
70450171770
-
A novel technique for voice conversion based on style and content decomposition with bilinear models.
-
Popa, V., Nurminen, J., Gabbouj, M., A novel technique for voice conversion based on style and content decomposition with bilinear models. Proceedings of the INTERSPEECH, 2009.
-
(2009)
Proceedings of the INTERSPEECH
-
-
Popa, V.1
Nurminen, J.2
Gabbouj, M.3
-
186
-
-
84971616451
-
A study of bilinear models in voice conversion
-
Popa, V., Nurminen, J., Gabbouj, M., et al. A study of bilinear models in voice conversion. J. Signal Inf. Process. 2:2 (2011), 125.
-
(2011)
J. Signal Inf. Process.
, vol.2
, Issue.2
, pp. 125
-
-
Popa, V.1
Nurminen, J.2
Gabbouj, M.3
-
187
-
-
84867594339
-
Local linear transformation for voice conversion
-
Popa, V., Silen, H., Nurminen, J., Gabbouj, M., Local linear transformation for voice conversion. Proceedings of the ICASSP, 2012.
-
(2012)
Proceedings of the ICASSP
-
-
Popa, V.1
Silen, H.2
Nurminen, J.3
Gabbouj, M.4
-
189
-
-
33751438738
-
Non-linear frequency scale mapping for voice conversion in text-to-speech system with cepstral description
-
Přibilová, A., Přibil, J., Non-linear frequency scale mapping for voice conversion in text-to-speech system with cepstral description. Speech Commun. 48:12 (2006), 1691–1703.
-
(2006)
Speech Commun.
, vol.48
, Issue.12
, pp. 1691-1703
-
-
Přibilová, A.1
Přibil, J.2
-
190
-
-
84865763441
-
A study on bag of Gaussian model with application to voice conversion
-
Qiao, Y., Tong, T., Minematsu, N., A study on bag of Gaussian model with application to voice conversion. Proceedings of the INTERSPEECH, 2011, 657–660.
-
(2011)
Proceedings of the INTERSPEECH
, pp. 657-660
-
-
Qiao, Y.1
Tong, T.2
Minematsu, N.3
-
192
-
-
38149073264
-
Voice transformation by mapping the features at syllable level
-
Springer
-
Rao, K.S., Laskar, R., Koolagudi, S.G., Voice transformation by mapping the features at syllable level. Pattern Recognition and Machine Intelligence, 2007, Springer, 479–486.
-
(2007)
Pattern Recognition and Machine Intelligence
, pp. 479-486
-
-
Rao, K.S.1
Laskar, R.2
Koolagudi, S.G.3
-
193
-
-
85036464413
-
Novel pre-processing using outlier removal in voice conversion
-
Rao, S.V., Shah, N.J., Patil, H.A., Novel pre-processing using outlier removal in voice conversion. Proceedings of the SSW, 2016.
-
(2016)
Proceedings of the SSW
-
-
Rao, S.V.1
Shah, N.J.2
Patil, H.A.3
-
194
-
-
85009195247
-
Probability models of formant parameters for voice conversion
-
Rentzos, D., Qin, S.V., Ho, C.-H., Turajlic, E., Probability models of formant parameters for voice conversion. Proceedings of the EUROSPEECH, 2003.
-
(2003)
Proceedings of the EUROSPEECH
-
-
Rentzos, D.1
Qin, S.V.2
Ho, C.-H.3
Turajlic, E.4
-
195
-
-
0030359624
-
Voice conversion based on topological feature maps and time-variant filtering
-
Rinscheid, A., Voice conversion based on topological feature maps and time-variant filtering. Proceedings of the ICSLP, 1996.
-
(1996)
Proceedings of the ICSLP
-
-
Rinscheid, A.1
-
196
-
-
0034847662
-
Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs
-
Rix, A.W., Beerends, J.G., Hollier, M.P., Hekstra, A.P., Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs. Proceedings of the ICASSP, 2001.
-
(2001)
Proceedings of the ICASSP
-
-
Rix, A.W.1
Beerends, J.G.2
Hollier, M.P.3
Hekstra, A.P.4
-
197
-
-
84859768504
-
Statistical voice conversion based on noisy channel model
-
Saito, D., Watanabe, S., Nakamura, A., Minematsu, N., Statistical voice conversion based on noisy channel model. IEEE Trans. Audio Speech Lang. Process. 20:6 (2012), 1784–1794.
-
(2012)
IEEE Trans. Audio Speech Lang. Process.
, vol.20
, Issue.6
, pp. 1784-1794
-
-
Saito, D.1
Watanabe, S.2
Nakamura, A.3
Minematsu, N.4
-
198
-
-
84865798483
-
One-to-many voice conversion based on tensor representation of speaker space
-
Saito, D., Yamamoto, K., Minematsu, N., Hirose, K., One-to-many voice conversion based on tensor representation of speaker space. Proceedings of the INTERSPEECH, 2011.
-
(2011)
Proceedings of the INTERSPEECH
-
-
Saito, D.1
Yamamoto, K.2
Minematsu, N.3
Hirose, K.4
-
199
-
-
33748468528
-
Dynamic programming approach to voice transformation
-
Salor, Ö., Demirekler, M., Dynamic programming approach to voice transformation. Speech communication 48:10 (2006), 1262–1272.
-
(2006)
Speech communication
, vol.48
, Issue.10
, pp. 1262-1272
-
-
Salor, Ö.1
Demirekler, M.2
-
200
-
-
84910089725
-
Hierarchical modeling of F0 contours for voice conversion
-
Sanchez, G., Silen, H., Nurminen, J., Gabbouj, M., Hierarchical modeling of F0 contours for voice conversion. Proceedings of the INTERSPEECH, 2014.
-
(2014)
Proceedings of the INTERSPEECH
-
-
Sanchez, G.1
Silen, H.2
Nurminen, J.3
Gabbouj, M.4
-
201
-
-
0026394044
-
Speaker adaptation and voice conversion by codebook mapping
-
Shikano, K., Nakamura, S., Abe, M., Speaker adaptation and voice conversion by codebook mapping. IEEE International Symposium on Circuits and Systems, 1991, 594–597.
-
(1991)
IEEE International Symposium on Circuits and Systems
, pp. 594-597
-
-
Shikano, K.1
Nakamura, S.2
Abe, M.3
-
202
-
-
70450149422
-
Voice conversion based on mapping formants
-
Shuang, Z., Bakis, R., Qin, Y., Voice conversion based on mapping formants. TC-STAR WSST, 2006, 219–223.
-
(2006)
TC-STAR WSST
, pp. 219-223
-
-
Shuang, Z.1
Bakis, R.2
Qin, Y.3
-
203
-
-
51449112440
-
Voice conversion by combining frequency warping with unit selection
-
Shuang, Z., Meng, F., Qin, Y., Voice conversion by combining frequency warping with unit selection. Proceedings of the ICASSP, 2008.
-
(2008)
Proceedings of the ICASSP
-
-
Shuang, Z.1
Meng, F.2
Qin, Y.3
-
204
-
-
85009076640
-
A novel voice conversion system based on codebook mapping with phoneme-tied weighting
-
Shuang, Z.-W., Wang, Z.-X., Ling, Z.-H., Wang, R.-H., A novel voice conversion system based on codebook mapping with phoneme-tied weighting. Proceedings of the ICSLP, 2004.
-
(2004)
Proceedings of the ICSLP
-
-
Shuang, Z.-W.1
Wang, Z.-X.2
Ling, Z.-H.3
Wang, R.-H.4
-
205
-
-
80053068819
-
Voice conversion using support vector regression
-
Song, P., Bao, Y., Zhao, L., Zou, C., Voice conversion using support vector regression. Electron. Lett. 47:18 (2011), 1045–1046.
-
(2011)
Electron. Lett.
, vol.47
, Issue.18
, pp. 1045-1046
-
-
Song, P.1
Bao, Y.2
Zhao, L.3
Zou, C.4
-
206
-
-
84865718211
-
Uniform speech parameterization for multi-form segment synthesis
-
Sorin, A., Shechtman, S., Pollet, V., Uniform speech parameterization for multi-form segment synthesis. Proceedings of the INTERSPEECH, 2011.
-
(2011)
Proceedings of the INTERSPEECH
-
-
Sorin, A.1
Shechtman, S.2
Pollet, V.3
-
207
-
-
84904163933
-
Dropout: a simple way to prevent neural networks from overfitting
-
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R., Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15:1 (2014), 1929–1958.
-
(2014)
J. Mach. Learn. Res.
, vol.15
, Issue.1
, pp. 1929-1958
-
-
Srivastava, N.1
Hinton, G.2
Krizhevsky, A.3
Sutskever, I.4
Salakhutdinov, R.5
-
210
-
-
0032026483
-
Continuous probabilistic transform for voice conversion
-
Stylianou, Y., Cappé, O., Moulines, E., Continuous probabilistic transform for voice conversion. IEEE Trans. Speech Audio Process. 6:2 (1998), 131–142.
-
(1998)
IEEE Trans. Speech Audio Process.
, vol.6
, Issue.2
, pp. 131-142
-
-
Stylianou, Y.1
Cappé, O.2
Moulines, E.3
-
211
-
-
84946027999
-
Voice conversion using deep bidirectional long short-term memory based recurrent neural networks
-
Sun, L., Kang, S., Li, K., Meng, H., Voice conversion using deep bidirectional long short-term memory based recurrent neural networks. Proceedings of the ICASSP, 2015.
-
(2015)
Proceedings of the ICASSP
-
-
Sun, L.1
Kang, S.2
Li, K.3
Meng, H.4
-
212
-
-
78650542860
-
Voice conversion: State-of-the-art and future work
-
Sündermann, D., Voice conversion: State-of-the-art and future work. Fortschritte der Akustik, 31(2), 2005, 735.
-
(2005)
Fortschritte der Akustik
, vol.31
, Issue.2
, pp. 735
-
-
Sündermann, D.1
-
213
-
-
84888623995
-
Text-independent voice conversion
-
Universitätsbibliothek der Universität der Bundeswehr München Ph.D. thesis.
-
Sündermann, D., Text-independent voice conversion, 2008, Universitätsbibliothek der Universität der Bundeswehr München Ph.D. thesis.
-
(2008)
Text-independent voice conversion
-
-
Sündermann, D.1
-
214
-
-
85010452452
-
Voice conversion using exclusively unaligned training data
-
Sündermann, D., Bonafonte, A., Höge, H., Ney, H., Voice conversion using exclusively unaligned training data. Proceedings of the ACL/SEPLN, 2004.
-
(2004)
Proceedings of the ACL/SEPLN
-
-
Sündermann, D.1
Bonafonte, A.2
Höge, H.3
Ney, H.4
-
215
-
-
85009084358
-
A first step towards text-independent voice conversion
-
Sündermann, D., Bonafonte, A., Ney, H., Höge, H., A first step towards text-independent voice conversion. Proceedings of the ICSLP, 2004.
-
(2004)
Proceedings of the ICSLP
-
-
Sündermann, D.1
Bonafonte, A.2
Ney, H.3
Höge, H.4
-
216
-
-
33646767751
-
A study on residual prediction techniques for voice conversion
-
Sündermann, D., Bonafonte, A., Ney, H., Höge, H., A study on residual prediction techniques for voice conversion. Proceedings of the ICASSP, 2005.
-
(2005)
Proceedings of the ICASSP
-
-
Sündermann, D.1
Bonafonte, A.2
Ney, H.3
Höge, H.4
-
217
-
-
33947623206
-
Text-independent voice conversion based on unit selection
-
Sündermann, D., Höge, H., Bonafonte, A., Ney, H., Black, A., Narayanan, S., Text-independent voice conversion based on unit selection. Proceedings of the ICASSP, 2006.
-
(2006)
Proceedings of the ICASSP
-
-
Sündermann, D.1
Höge, H.2
Bonafonte, A.3
Ney, H.4
Black, A.5
Narayanan, S.6
-
218
-
-
84905187320
-
TC-Star: cross-language voice conversion revisited
-
TC-Star Workshop
-
Sündermann, D., Höge, H., Bonafonte, A., Ney, H., Hirschberg, J., TC-Star: cross-language voice conversion revisited. TC-Star Workshop, 2006.
-
(2006)
TC-Star Workshop
-
-
Sündermann, D.1
Höge, H.2
Bonafonte, A.3
Ney, H.4
Hirschberg, J.5
-
219
-
-
44949241666
-
Text-independent cross-language voice conversion
-
Sündermann, D., Höge, H., Bonafonte, A., Ney, H., Hirschberg, J., Text-independent cross-language voice conversion. Proceedings of the INTERSPEECH, 2006.
-
(2006)
Proceedings of the INTERSPEECH
-
-
Sündermann, D.1
Höge, H.2
Bonafonte, A.3
Ney, H.4
Hirschberg, J.5
-
220
-
-
84946794248
-
An automatic segmentation and mapping approach for voice conversion parameter training
-
Sündermann, D., Ney, H., An automatic segmentation and mapping approach for voice conversion parameter training. Proceedings of the AST, 2003.
-
(2003)
Proceedings of the AST
-
-
Sündermann, D.1
Ney, H.2
-
221
-
-
84946753271
-
VTLN-based cross-language voice conversion
-
Sündermann, D., Ney, H., Höge, H., VTLN-based cross-language voice conversion. Proceedings of the ASRU, 2003.
-
(2003)
Proceedings of the ASRU
-
-
Sündermann, D.1
Ney, H.2
Höge, H.3
-
222
-
-
84946045633
-
Wavelets for intonation modeling in HMM speech synthesis
-
Suni, A.S., Aalto, D., Raitio, T., Alku, P., Vainio, M., et al. Wavelets for intonation modeling in HMM speech synthesis. Proceedings of the SSW, 2013.
-
(2013)
Proceedings of the SSW
-
-
Suni, A.S.1
Aalto, D.2
Raitio, T.3
Alku, P.4
Vainio, M.5
-
223
-
-
84949926049
-
Modulation spectrum-based post-filter for GMM-based voice conversion
-
Takamichi, S., Toda, T., Black, A.W., Nakamura, S., Modulation spectrum-based post-filter for GMM-based voice conversion. Proceedings of the APSIPA, 2014.
-
(2014)
Proceedings of the APSIPA
-
-
Takamichi, S.1
Toda, T.2
Black, A.W.3
Nakamura, S.4
-
224
-
-
84959166270
-
Modulation spectrum-constrained trajectory training algorithm for GMM-based voice conversion
-
Takamichi, S., Toda, T., Black, A.W., Nakamura, S., Modulation spectrum-constrained trajectory training algorithm for GMM-based voice conversion. Proceedings of the ICASSP, 2015.
-
(2015)
Proceedings of the ICASSP
-
-
Takamichi, S.1
Toda, T.2
Black, A.W.3
Nakamura, S.4
-
225
-
-
84905271796
-
Noise-robust voice conversion based on spectral mapping on sparse space
-
Takashima, R., Aihara, R., Takiguchi, T., Ariki, Y., Noise-robust voice conversion based on spectral mapping on sparse space. Proceedings of the SSW, 2013.
-
(2013)
Proceedings of the SSW
-
-
Takashima, R.1
Aihara, R.2
Takiguchi, T.3
Ariki, Y.4
-
226
-
-
84874248255
-
Exemplar-based voice conversion in noisy environment
-
Takashima, R., Takiguchi, T., Ariki, Y., Exemplar-based voice conversion in noisy environment. Proceedings of the SLT, 2012.
-
(2012)
Proceedings of the SLT
-
-
Takashima, R.1
Takiguchi, T.2
Ariki, Y.3
-
227
-
-
80051619373
-
One sentence voice adaptation using GMM-based frequency-warping and shift with a sub-band basis spectrum model
-
Tamura, M., Morita, M., Kagoshima, T., Akamine, M., One sentence voice adaptation using GMM-based frequency-warping and shift with a sub-band basis spectrum model. Proceedings of the ICASSP, 2011.
-
(2011)
Proceedings of the ICASSP
-
-
Tamura, M.1
Morita, M.2
Kagoshima, T.3
Akamine, M.4
-
228
-
-
84905244240
-
A hybrid approach to electrolaryngeal speech enhancement based on spectral subtraction and statistical voice conversion
-
Tanaka, K., Toda, T., Neubig, G., Sakti, S., Nakamura, S., A hybrid approach to electrolaryngeal speech enhancement based on spectral subtraction and statistical voice conversion. Proceedings of the INTERSPEECH, 2013.
-
(2013)
Proceedings of the INTERSPEECH
-
-
Tanaka, K.1
Toda, T.2
Neubig, G.3
Sakti, S.4
Nakamura, S.5
-
229
-
-
84867203066
-
Maximum a posteriori adaptation for many-to-one eigenvoice conversion
-
Tani, D., Toda, T., Ohtani, Y., Saruwatari, H., Shikano, K., Maximum a posteriori adaptation for many-to-one eigenvoice conversion. Proceedings of the INTERSPEECH, 2008.
-
(2008)
Proceedings of the INTERSPEECH
-
-
Tani, D.1
Toda, T.2
Ohtani, Y.3
Saruwatari, H.4
Shikano, K.5
-
230
-
-
34047263010
-
Prosody conversion from neutral speech to emotional speech
-
Tao, J., Kang, Y., Li, A., Prosody conversion from neutral speech to emotional speech. IEEE Trans. Audio Speech Lang. Process. 14:4 (2006), 1145–1154.
-
(2006)
IEEE Trans. Audio Speech Lang. Process.
, vol.14
, Issue.4
, pp. 1145-1154
-
-
Tao, J.1
Kang, Y.2
Li, A.3
-
231
-
-
77953724495
-
Supervisory data alignment for text-independent voice conversion
-
Tao, J., Zhang, M., Nurminen, J., Tian, J., Wang, X., Supervisory data alignment for text-independent voice conversion. IEEE Trans. Audio Speech Lang. Process. 18:5 (2010), 932–943.
-
(2010)
IEEE Trans. Audio Speech Lang. Process.
, vol.18
, Issue.5
, pp. 932-943
-
-
Tao, J.1
Zhang, M.2
Nurminen, J.3
Tian, J.4
Wang, X.5
-
232
-
-
85061833826
-
Two vocoder techniques for neutral to emotional timbre conversion.
-
Tesser, F., Zovato, E., Nicolao, M., Cosi, P., Two vocoder techniques for neutral to emotional timbre conversion. Proceedings of the SSW, 2010.
-
(2010)
Proceedings of the SSW
-
-
Tesser, F.1
Zovato, E.2
Nicolao, M.3
Cosi, P.4
-
233
-
-
84912079352
-
Correlation-based frequency warping for voice conversion
-
IEEE
-
Tian, X., Wu, Z., Lee, S., Chng, E.S., Correlation-based frequency warping for voice conversion. Proceedings of the ISCSLP, 2014, IEEE, 211–215.
-
(2014)
Proceedings of the ISCSLP
, pp. 211-215
-
-
Tian, X.1
Wu, Z.2
Lee, S.3
Chng, E.S.4
-
234
-
-
84946020861
-
Sparse representation for frequency warping based voice conversion
-
Tian, X., Wu, Z., Lee, S.W., Hy, N.Q., Chng, E.S., Dong, M., Sparse representation for frequency warping based voice conversion. Proceedings of the ICASSP, 2015.
-
(2015)
Proceedings of the ICASSP
-
-
Tian, X.1
Wu, Z.2
Lee, S.W.3
Hy, N.Q.4
Chng, E.S.5
Dong, M.6
-
235
-
-
84959163883
-
System fusion for high-performance voice conversion
-
Tian, X., Wu, Z., Lee, S.W., Hy, N.Q., Dong, M., Chng, E.S., System fusion for high-performance voice conversion. Proceedings of the INTERSPEECH, 2015.
-
(2015)
Proceedings of the INTERSPEECH
-
-
Tian, X.1
Wu, Z.2
Lee, S.W.3
Hy, N.Q.4
Dong, M.5
Chng, E.S.6
-
236
-
-
0003747605
-
Statistical Analysis of Finite Mixture Distributions
-
Wiley New York
-
Titterington, D.M., Smith, A.F., Makov, U.E., et al. Statistical Analysis of Finite Mixture Distributions. Vol. 7, 1985, Wiley New York.
-
(1985)
, vol.7
-
-
Titterington, D.M.1
Smith, A.F.2
Makov, U.E.3
-
237
-
-
84937840625
-
Acoustic-to-articulatory inversion mapping with Gaussian mixture model
-
Toda, T., Black, A.W., Tokuda, K., Acoustic-to-articulatory inversion mapping with Gaussian mixture model. Proceedings of the INTERSPEECH, 2004.
-
(2004)
Proceedings of the INTERSPEECH
-
-
Toda, T.1
Black, A.W.2
Tokuda, K.3
-
238
-
-
33646779506
-
Spectral conversion based on maximum likelihood estimation considering global variance of converted parameter
-
Toda, T., Black, A.W., Tokuda, K., Spectral conversion based on maximum likelihood estimation considering global variance of converted parameter. Proceedings of the ICASSP, 2005.
-
(2005)
Proceedings of the ICASSP
-
-
Toda, T.1
Black, A.W.2
Tokuda, K.3
-
239
-
-
57749193836
-
Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory
-
Toda, T., Black, A.W., Tokuda, K., Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory. IEEE Trans. Audio Speech Lang. Process. 15:8 (2007), 2222–2235.
-
(2007)
IEEE Trans. Audio Speech Lang. Process.
, vol.15
, Issue.8
, pp. 2222-2235
-
-
Toda, T.1
Black, A.W.2
Tokuda, K.3
-
240
-
-
38649140222
-
Statistical mapping between articulatory movements and acoustic spectrum using a Gaussian mixture model
-
Toda, T., Black, A.W., Tokuda, K., Statistical mapping between articulatory movements and acoustic spectrum using a Gaussian mixture model. Speech Commun. 50:3 (2008), 215–227.
-
(2008)
Speech Commun.
, vol.50
, Issue.3
, pp. 215-227
-
-
Toda, T.1
Black, A.W.2
Tokuda, K.3
-
241
-
-
84878390910
-
Implementation of computationally efficient real-time voice conversion
-
Toda, T., Muramatsu, T., Banno, H., Implementation of computationally efficient real-time voice conversion. Proceedings of the INTERSPEECH, 2012.
-
(2012)
Proceedings of the INTERSPEECH
-
-
Toda, T.1
Muramatsu, T.2
Banno, H.3
-
242
-
-
84865698185
-
Statistical voice conversion techniques for body-conducted unvoiced speech enhancement
-
Toda, T., Nakagiri, M., Shikano, K., Statistical voice conversion techniques for body-conducted unvoiced speech enhancement. IEEE Trans. Audio Speech Lang. Process. 20:9 (2012), 2505–2517.
-
(2012)
IEEE Trans. Audio Speech Lang. Process.
, vol.20
, Issue.9
, pp. 2505-2517
-
-
Toda, T.1
Nakagiri, M.2
Shikano, K.3
-
243
-
-
84897939966
-
Alaryngeal speech enhancement based on one-to-many eigenvoice conversion
-
Toda, T., Nakamura, K., Saruwatari, H., Shikano, K., et al. Alaryngeal speech enhancement based on one-to-many eigenvoice conversion. IEEE/ACM Trans. Audio Speech Lang. Process. 22:1 (2014), 172–183.
-
(2014)
IEEE/ACM Trans. Audio Speech Lang. Process.
, vol.22
, Issue.1
, pp. 172-183
-
-
Toda, T.1
Nakamura, K.2
Saruwatari, H.3
Shikano, K.4
-
244
-
-
34547512822
-
Eigenvoice conversion based on Gaussian mixture model
-
Toda, T., Ohtani, Y., Shikano, K., Eigenvoice conversion based on Gaussian mixture model. Proceedings of the INTERSPEECH, 2006.
-
(2006)
Proceedings of the INTERSPEECH
-
-
Toda, T.1
Ohtani, Y.2
Shikano, K.3
-
245
-
-
34547496175
-
One-to-many and many-to-one voice conversion based on eigenvoices
-
Toda, T., Ohtani, Y., Shikano, K., One-to-many and many-to-one voice conversion based on eigenvoices. Proceedings of the ICASSP, 2007.
-
(2007)
Proceedings of the ICASSP
-
-
Toda, T.1
Ohtani, Y.2
Shikano, K.3
-
246
-
-
84994361374
-
The voice conversion challenge 2016
-
Toda, T., Saito, D., Villavicencio, F., Yamagishi, J., Wester, M., Wu, Z., Chen, L.-H., et al. The voice conversion challenge 2016. Proceedings of the INTERSPEECH, 2016.
-
(2016)
Proceedings of the INTERSPEECH
-
-
Toda, T.1
Saito, D.2
Villavicencio, F.3
Yamagishi, J.4
Wester, M.5
Wu, Z.6
Chen, L.-H.7
-
247
-
-
0034842552
-
Voice conversion algorithm based on Gaussian mixture model with dynamic frequency warping of STRAIGHT spectrum
-
Toda, T., Saruwatari, H., Shikano, K., Voice conversion algorithm based on Gaussian mixture model with dynamic frequency warping of STRAIGHT spectrum. Proceedings of the ICASSP, 2001.
-
(2001)
Proceedings of the ICASSP
-
-
Toda, T.1
Saruwatari, H.2
Shikano, K.3
-
249
-
-
0028996993
-
Speech parameter generation from HMM using dynamic features
-
Tokuda, K., Kobayashi, T., Imai, S., Speech parameter generation from HMM using dynamic features. Proceedings of the ICASSP, 1995.
-
(1995)
Proceedings of the ICASSP
-
-
Tokuda, K.1
Kobayashi, T.2
Imai, S.3
-
250
-
-
84946077883
-
Directly modeling speech waveforms by neural networks for statistical parametric speech synthesis
-
Tokuda, K., Zen, H., Directly modeling speech waveforms by neural networks for statistical parametric speech synthesis. Proceedings of the ICASSP, 2015.
-
(2015)
Proceedings of the ICASSP
-
-
Tokuda, K.1
Zen, H.2
-
251
-
-
76849105528
-
Improvement to a NAM-captured whisper-to-speech system
-
Tran, V.-A., Bailly, G., Lœvenbruck, H., Toda, T., Improvement to a NAM-captured whisper-to-speech system. Speech Commun. 52:4 (2010), 314–326.
-
(2010)
Speech Commun.
, vol.52
, Issue.4
, pp. 314-326
-
-
Tran, V.-A.1
Bailly, G.2
Lœvenbruck, H.3
Toda, T.4
-
252
-
-
0141479037
-
Evaluation of methods for parametric formant transformation in voice conversion
-
Turajlic, E., Rentzos, D., Vaseghi, S., Ho, C.-H., Evaluation of methods for parametric formant transformation in voice conversion. Proceedings of the ICASSP, 2003.
-
(2003)
Proceedings of the ICASSP
-
-
Turajlic, E.1
Rentzos, D.2
Vaseghi, S.3
Ho, C.-H.4
-
254
-
-
85009179173
-
Voice conversion methods for vocal tract and pitch contour modification
-
Türk, O., Arslan, L.M., Voice conversion methods for vocal tract and pitch contour modification. Proceedings of the INTERSPEECH, 2003.
-
(2003)
Proceedings of the INTERSPEECH
-
-
Türk, O.1
Arslan, L.M.2
-
256
-
-
33746653351
-
Robust processing techniques for voice conversion
-
Türk, O., Arslan, L.M., Robust processing techniques for voice conversion. Comput. Speech Lang. 20:4 (2006), 441–467.
-
(2006)
Comput. Speech Lang.
, vol.20
, Issue.4
, pp. 441-467
-
-
Türk, O.1
Arslan, L.M.2
-
257
-
-
70349207267
-
Application of voice conversion for cross-language rap singing transformation
-
Türk, O., Buyuk, O., Haznedaroglu, A., Arslan, L.M., Application of voice conversion for cross-language rap singing transformation. Proceedings of the ICASSP, 2009.
-
(2009)
Proceedings of the ICASSP
-
-
Türk, O.1
Buyuk, O.2
Haznedaroglu, A.3
Arslan, L.M.4
-
258
-
-
84867219635
-
A comparison of voice conversion methods for transforming voice quality in emotional speech synthesis
-
Türk, O., Schröder, M., A comparison of voice conversion methods for transforming voice quality in emotional speech synthesis. Proceedings of the INTERSPEECH, 2008.
-
(2008)
Proceedings of the INTERSPEECH
-
-
Türk, O.1
Schröder, M.2
-
259
-
-
77953699443
-
Evaluation of expressive speech synthesis with voice conversion and copy resynthesis techniques
-
Türk, O., Schröder, M., Evaluation of expressive speech synthesis with voice conversion and copy resynthesis techniques. IEEE Trans. Audio Speech Lang. Process. 18:5 (2010), 965–973.
-
(2010)
IEEE Trans. Audio Speech Lang. Process.
, vol.18
, Issue.5
, pp. 965-973
-
-
Türk, O.1
Schröder, M.2
-
260
-
-
34547806096
-
A self-organizing map with twin units capable of describing a nonlinear input–output relation applied to speech code vector mapping
-
Uchino, E., Yano, K., Azetsu, T., A self-organizing map with twin units capable of describing a nonlinear input–output relation applied to speech code vector mapping. Inf. Sci. 177:21 (2007), 4634–4644.
-
(2007)
Inf. Sci.
, vol.177
, Issue.21
, pp. 4634-4644
-
-
Uchino, E.1
Yano, K.2
Azetsu, T.3
-
261
-
-
85010284658
-
Voice conversion using frame selection and warping functions
-
Uriz, A., Aguero, P., Tulli, J., Gonzalez, E., Bonafonte, A., Voice conversion using frame selection and warping functions. Proceedings of the RPIC, 2009.
-
(2009)
Proceedings of the RPIC
-
-
Uriz, A.1
Aguero, P.2
Tulli, J.3
Gonzalez, E.4
Bonafonte, A.5
-
262
-
-
85010316399
-
Voice Conversion Using Frame Selection
-
Reporte Interno Laboratorio de Comunicaciones-UNMdP
-
Uriz, A., Agüero, P.D., Erro, D., Bonafonte, A., Voice Conversion Using Frame Selection. 2008 Reporte Interno Laboratorio de Comunicaciones-UNMdP.
-
(2008)
-
-
Uriz, A.1
Agüero, P.D.2
Erro, D.3
Bonafonte, A.4
-
263
-
-
70450204589
-
Voice conversion using k-histograms and frame selection
-
Uriz, A.J., Agüero, P.D., Bonafonte, A., Tulli, J.C., Voice conversion using k-histograms and frame selection. Proceedings of the INTERSPEECH, 2009.
-
(2009)
Proceedings of the INTERSPEECH
-
-
Uriz, A.J.1
Agüero, P.D.2
Bonafonte, A.3
Tulli, J.C.4
-
264
-
-
44949104276
-
Voice conversion based on mixtures of factor analyzers
-
Uto, Y., Nankaku, Y., Toda, T., Lee, A., Tokuda, K., Voice conversion based on mixtures of factor analyzers. Proceedings of the ICSLP, 2006.
-
(2006)
Proceedings of the ICSLP
-
-
Uto, Y.1
Nankaku, Y.2
Toda, T.3
Lee, A.4
Tokuda, K.5
-
265
-
-
85010815133
-
Voice transformation using PSOLA technique
-
Valbret, H., Moulines, E., Tubach, J.-P., Voice transformation using PSOLA technique. Proceedings of the ICASSP, 1992.
-
(1992)
Proceedings of the ICASSP
-
-
Valbret, H.1
Moulines, E.2
Tubach, J.-P.3
-
266
-
-
0026880275
-
Voice transformation using PSOLA technique
-
Valbret, H., Moulines, E., Tubach, J.P., Voice transformation using PSOLA technique. Speech Commun. 11:2 (1992), 175–187.
-
(1992)
Speech Commun.
, vol.11
, Issue.2
, pp. 175-187
-
-
Valbret, H.1
Moulines, E.2
Tubach, J.P.3
-
267
-
-
84959108894
-
Towards minimum perceptual error training for DNN-based speech synthesis
-
Valentini-Botinhao, C., Wu, Z., King, S., Towards minimum perceptual error training for DNN-based speech synthesis. Proceedings of the INTERSPEECH, 2015.
-
(2015)
Proceedings of the INTERSPEECH
-
-
Valentini-Botinhao, C.1
Wu, Z.2
King, S.3
-
269
-
-
77956828655
-
Voice fonts for individuality representation and transformation
-
Verma, A., Kumar, A., Voice fonts for individuality representation and transformation. ACM Trans. Speech Lang. Process. (TSLP), 2(1), 2005, 4.
-
(2005)
ACM Trans. Speech Lang. Process. (TSLP)
, vol.2
, Issue.1
, pp. 4
-
-
Verma, A.1
Kumar, A.2
-
271
-
-
84960864752
-
Observation-model error compensation for enhanced spectral envelope transformation in voice conversion
-
Villavicencio, F., Bonada, J., Hisaminato, Y., Observation-model error compensation for enhanced spectral envelope transformation in voice conversion. Proceedings of the MLSP, 2015.
-
(2015)
Proceedings of the MLSP
-
-
Villavicencio, F.1
Bonada, J.2
Hisaminato, Y.3
-
272
-
-
34547541173
-
A new method for speech synthesis and transformation based on an ARX-LF source-filter decomposition and HNM modeling
-
Vincent, D., Rosec, O., Chonavel, T., A new method for speech synthesis and transformation based on an ARX-LF source-filter decomposition and HNM modeling. Proceedings of the ICASSP, 2007.
-
(2007)
Proceedings of the ICASSP
-
-
Vincent, D.1
Rosec, O.2
Chonavel, T.3
-
273
-
-
0003557444
-
Verbmobil: Foundations of Speech-to-Speech Translation
-
Springer Science & Business Media
-
Wahlster, W., Verbmobil: Foundations of Speech-to-Speech Translation. 2000, Springer Science & Business Media.
-
(2000)
-
-
Wahlster, W.1
-
274
-
-
84902959938
-
Emotional voice conversion for Mandarin using tone nucleus model–small corpus and high efficiency
-
Wang, M., Wen, M., Hirose, K., Minematsu, N., Emotional voice conversion for Mandarin using tone nucleus model–small corpus and high efficiency. Proceedings of the Speech Prosody, 2012.
-
(2012)
Proceedings of the Speech Prosody
-
-
Wang, M.1
Wen, M.2
Hirose, K.3
Minematsu, N.4
-
275
-
-
84988295463
-
Multi-level prosody and spectrum conversion for emotional speech synthesis
-
Wang, Z., Yu, Y., Multi-level prosody and spectrum conversion for emotional speech synthesis. Proceedings of the ICSP, 2014.
-
(2014)
Proceedings of the ICSP
-
-
Wang, Z.1
Yu, Y.2
-
276
-
-
85009266993
-
Transformation of spectral envelope for voice conversion based on radial basis function networks
-
Watanabe, T., Murakami, T., Namba, M., Hoya, T., Ishida, Y., Transformation of spectral envelope for voice conversion based on radial basis function networks. Proceedings of the ICSLP, 2002.
-
(2002)
Proceedings of the ICSLP
-
-
Watanabe, T.1
Murakami, T.2
Namba, M.3
Hoya, T.4
Ishida, Y.5
-
277
-
-
84994351528
-
Analysis of the voice conversion challenge 2016 evaluation results
-
Wester, M., Wu, Z., Yamagishi, J., Analysis of the voice conversion challenge 2016 evaluation results. Proceedings of the INTERSPEECH, 2016.
-
(2016)
Proceedings of the INTERSPEECH
-
-
Wester, M.1
Wu, Z.2
Yamagishi, J.3
-
278
-
-
85133164470
-
Multidimensional scaling of systems in the voice conversion challenge 2016
-
Wester, M., Wu, Z., Yamagishi, J., Multidimensional scaling of systems in the voice conversion challenge 2016. Proceedings of the SSW, 2016.
-
(2016)
Proceedings of the SSW
-
-
Wester, M.1
Wu, Z.2
Yamagishi, J.3
-
279
-
-
33646815712
-
The MOCHA-TIMIT articulatory database
-
Queen Margaret University College
-
Wrench, A., The MOCHA-TIMIT articulatory database. 1999, Queen Margaret University College.
-
(1999)
-
-
Wrench, A.1
-
280
-
-
34047247202
-
Voice conversion using duration-embedded bi-HMMs for expressive speech synthesis
-
Wu, C.-H., Hsia, C.-C., Liu, T.-H., Wang, J.-F., Voice conversion using duration-embedded bi-HMMs for expressive speech synthesis. IEEE Trans. Audio Speech Lang. Process. 14:4 (2006), 1109–1116.
-
(2006)
IEEE Trans. Audio Speech Lang. Process.
, vol.14
, Issue.4
, pp. 1109-1116
-
-
Wu, C.-H.1
Hsia, C.-C.2
Liu, T.-H.3
Wang, J.-F.4
-
281
-
-
84994247053
-
Locally linear embedding for exemplar-based spectral conversion
-
Wu, Y.-C., Hwang, H.-T., Hsu, C.-C., Tsao, Y., Wang, H.-M., Locally linear embedding for exemplar-based spectral conversion. Proceedings of the INTERSPEECH, 2016.
-
(2016)
Proceedings of the INTERSPEECH
-
-
Wu, Y.-C.1
Hwang, H.-T.2
Hsu, C.-C.3
Tsao, Y.4
Wang, H.-M.5
-
282
-
-
84889579519
-
Conditional restricted Boltzmann machine for voice conversion
-
Wu, Z., Chng, E.S., Li, H., Conditional restricted Boltzmann machine for voice conversion. Proceedings of the ChinaSIP, 2013.
-
(2013)
Proceedings of the ChinaSIP
-
-
Wu, Z.1
Chng, E.S.2
Li, H.3
-
283
-
-
84910071877
-
Joint nonnegative matrix factorization for exemplar-based voice conversion
-
Wu, Z., Chng, E.S., Li, H., Joint nonnegative matrix factorization for exemplar-based voice conversion. Proceedings of the INTERSPEECH, 2014.
-
(2014)
Proceedings of the INTERSPEECH
-
-
Wu, Z.1
Chng, E.S.2
Li, H.3
-
284
-
-
79959842826
-
Text-independent F0 transformation with non-parallel data for voice conversion
-
Wu, Z., Kinnunen, T., Chng, E., Li, H., Text-independent F0 transformation with non-parallel data for voice conversion. Proceedings of the INTERSPEECH, 2010.
-
(2010)
Proceedings of the INTERSPEECH
-
-
Wu, Z.1
Kinnunen, T.2
Chng, E.3
Li, H.4
-
285
-
-
84869384026
-
Mixture of factor analyzers using priors from non-parallel speech for voice conversion
-
Wu, Z., Kinnunen, T., Chng, E.S., Li, H., Mixture of factor analyzers using priors from non-parallel speech for voice conversion. IEEE Signal Process. Lett. 19:12 (2012), 914–917.
-
(2012)
IEEE Signal Process. Lett.
, vol.19
, Issue.12
, pp. 914-917
-
-
Wu, Z.1
Kinnunen, T.2
Chng, E.S.3
Li, H.4
-
286
-
-
84906275384
-
Vulnerability evaluation of speaker verification under voice conversion spoofing: the effect of text constraints
-
Wu, Z., Larcher, A., Lee, K.-A., Chng, E., Kinnunen, T., Li, H., Vulnerability evaluation of speaker verification under voice conversion spoofing: the effect of text constraints. Proceedings of the INTERSPEECH, 2013.
-
(2013)
Proceedings of the INTERSPEECH
-
-
Wu, Z.1
Larcher, A.2
Lee, K.-A.3
Chng, E.4
Kinnunen, T.5
Li, H.6
-
287
-
-
84956723787
-
Voice conversion versus speaker verification: an overview
-
Wu, Z., Li, H., Voice conversion versus speaker verification: an overview. APSIPA Trans. Signal Inf. Process., 3, 2014, e17.
-
(2014)
APSIPA Trans. Signal Inf. Process.
, vol.3
, pp. e17
-
-
Wu, Z.1
Li, H.2
-
288
-
-
84911369131
-
Exemplar-based sparse representation with residual compensation for voice conversion
-
Wu, Z., Virtanen, T., Chng, E.S., Li, H., Exemplar-based sparse representation with residual compensation for voice conversion. IEEE/ACM Trans. Audio Speech Lang. Process. (TASLP) 22:10 (2014), 1506–1521.
-
(2014)
IEEE/ACM Trans. Audio Speech Lang. Process. (TASLP)
, vol.22
, Issue.10
, pp. 1506-1521
-
-
Wu, Z.1
Virtanen, T.2
Chng, E.S.3
Li, H.4
-
289
-
-
84906276055
-
Exemplar-based unit selection for voice conversion utilizing temporal information
-
Wu, Z., Virtanen, T., Kinnunen, T., Chng, E., Li, H., Exemplar-based unit selection for voice conversion utilizing temporal information. Proceedings of the INTERSPEECH, 2013.
-
(2013)
Proceedings of the INTERSPEECH
-
-
Wu, Z.1
Virtanen, T.2
Kinnunen, T.3
Chng, E.4
Li, H.5
-
290
-
-
84901803470
-
Exemplar-based voice conversion using non-negative spectrogram deconvolution
-
Wu, Z., Virtanen, T., Kinnunen, T., Chng, E.S., Li, H., Exemplar-based voice conversion using non-negative spectrogram deconvolution. Proceedings of the SSW, 2013.
-
(2013)
Proceedings of the SSW
-
-
Wu, Z.1
Virtanen, T.2
Kinnunen, T.3
Chng, E.S.4
Li, H.5
-
291
-
-
84910087395
-
Sequence error (SE) minimization training of neural network for voice conversion
-
Xie, F.-L., Qian, Y., Fan, Y., Soong, F.K., Li, H., Sequence error (SE) minimization training of neural network for voice conversion. Proceedings of the INTERSPEECH, 2014.
-
(2014)
Proceedings of the INTERSPEECH
-
-
Xie, F.-L.1
Qian, Y.2
Fan, Y.3
Soong, F.K.4
Li, H.5
-
292
-
-
84912078522
-
Pitch transformation in neural network based voice conversion
-
Xie, F.-L., Qian, Y., Soong, F.K., Li, H., Pitch transformation in neural network based voice conversion. Proceedings of the ISCSLP, 2014.
-
(2014)
Proceedings of the ISCSLP
-
-
Xie, F.-L.1
Qian, Y.2
Soong, F.K.3
Li, H.4
-
293
-
-
84890539284
-
Voice conversion based on Gaussian processes by coherent and asymmetric training with limited training data
-
Xu, N., Tang, Y., Bao, J., Jiang, A., Liu, X., Yang, Z., Voice conversion based on Gaussian processes by coherent and asymmetric training with limited training data. Speech Commun. 58 (2014), 124–138.
-
(2014)
Speech Commun.
, vol.58
, pp. 124-138
-
-
Xu, N.1
Tang, Y.2
Bao, J.3
Jiang, A.4
Liu, X.5
Yang, Z.6
-
294
-
-
84855906479
-
Speech synthesis technologies for individuals with vocal disabilities: voice banking and reconstruction
-
Yamagishi, J., Veaux, C., King, S., Renals, S., Speech synthesis technologies for individuals with vocal disabilities: voice banking and reconstruction. Acoust. Sci. Technol. 33:1 (2012), 1–5.
-
(2012)
Acoust. Sci. Technol.
, vol.33
, Issue.1
, pp. 1-5
-
-
Yamagishi, J.1
Veaux, C.2
King, S.3
Renals, S.4
-
295
-
-
85009224898
-
Perceptually weighted linear transformations for voice conversion
-
Ye, H., Young, S., Perceptually weighted linear transformations for voice conversion. Proceedings of the INTERSPEECH, 2003.
-
(2003)
Proceedings of the INTERSPEECH
-
-
Ye, H.1
Young, S.2
-
297
-
-
34047254509
-
Quality-enhanced voice morphing using maximum likelihood transformations
-
Ye, H., Young, S., Quality-enhanced voice morphing using maximum likelihood transformations. IEEE Trans. Audio Speech Lang. Process. 14:4 (2006), 1301–1312.
-
(2006)
IEEE Trans. Audio Speech Lang. Process.
, vol.14
, Issue.4
, pp. 1301-1312
-
-
Ye, H.1
Young, S.2
-
298
-
-
51849135434
-
Voice conversion using HMM combined with GMM
-
Yue, Z., Zou, X., Jia, Y., Wang, H., Voice conversion using HMM combined with GMM. Proceedings of the CISP, 2008.
-
(2008)
Proceedings of the CISP
-
-
Yue, Z.1
Zou, X.2
Jia, Y.3
Wang, H.4
-
299
-
-
70349218136
-
Voice conversion based on simultaneous modelling of spectrum and F0
-
Yutani, K., Uto, Y., Nankaku, Y., Lee, A., Tokuda, K., Voice conversion based on simultaneous modelling of spectrum and F0. Proceedings of the ICASSP, 2009.
-
(2009)
Proceedings of the ICASSP
-
-
Yutani, K.1
Uto, Y.2
Nankaku, Y.3
Lee, A.4
Tokuda, K.5
-
300
-
-
78149260085
-
Continuous stochastic feature mapping based on trajectory HMMs
-
Zen, H., Nankaku, Y., Tokuda, K., Continuous stochastic feature mapping based on trajectory HMMs. IEEE Trans. Audio Speech Lang. Process. 19:2 (2011), 417–430.
-
(2011)
IEEE Trans. Audio Speech Lang. Process.
, vol.19
, Issue.2
, pp. 417-430
-
-
Zen, H.1
Nankaku, Y.2
Tokuda, K.3
-
301
-
-
33646812247
-
Voice conversion based on weighted least squares estimation criterion and residual prediction from pitch contour
-
Springer
-
Zhang, J., Sun, J., Dai, B., Voice conversion based on weighted least squares estimation criterion and residual prediction from pitch contour. Affective Computing and Intelligent Interaction, 2005, Springer, 326–333.
-
(2005)
Affective Computing and Intelligent Interaction
, pp. 326-333
-
-
Zhang, J.1
Sun, J.2
Dai, B.3
-
302
-
-
70349215699
-
Phoneme cluster based state mapping for text-independent voice conversion
-
Zhang, M., Tao, J., Nurminen, J., Tian, J., Wang, X., Phoneme cluster based state mapping for text-independent voice conversion. Proceedings of the ICASSP, 2009.
-
(2009)
Proceedings of the ICASSP
-
-
Zhang, M.1
Tao, J.2
Nurminen, J.3
Tian, J.4
Wang, X.5
-
303
-
-
51449121435
-
Text-independent voice conversion based on state mapped codebook
-
Zhang, M., Tao, J., Tian, J., Wang, X., Text-independent voice conversion based on state mapped codebook. Proceedings of the ICASSP, 2008.
-
(2008)
Proceedings of the ICASSP
-
-
Zhang, M.1
Tao, J.2
Tian, J.3
Wang, X.4
-
305
-
-
84871520443
-
Improving the quality of standard GMM-based voice conversion systems by considering physically motivated linear transformations
-
Springer
-
Zorilă, T.-C., Erro, D., Hernaez, I., Improving the quality of standard GMM-based voice conversion systems by considering physically motivated linear transformations. Advances in Speech and Language Technologies for Iberian Languages, 2012, Springer, 30–39.
-
(2012)
Advances in Speech and Language Technologies for Iberian Languages
, pp. 30-39
-
-
Zorilă, T.-C.1
Erro, D.2
Hernaez, I.3