SCOPUS Information Search Platform

Computer Speech and Language

Volume 30, Issue 1, 2015, Pages 3-15

Interpretable parametric voice conversion functions based on Gaussian mixture models and constrained transformations

(5) Erro, Daniel a,b; Alonso, Agustin a; Serrano, Luis a; Navas, Eva a; Hernaez, Inma a

a UNIVERSITY OF THE BASQUE COUNTRY UPV EHU (Spain)

b BASQUE FOUNDATION FOR SCIENCE (Spain)

Author keywords

Amplitude scaling; Frequency warping; Gaussian mixture models; Spectral tilt; Voice conversion

Indexed keywords

GAUSSIAN DISTRIBUTION;

AMPLITUDE SCALING; FREQUENCY WARPING; GAUSSIAN MIXTURE MODEL; SPECTRAL TILT; VOICE CONVERSION;

SPEECH PROCESSING;

EID: 84913585254 PISSN: 08852308 EISSN: 10958363 Source Type: Journal
DOI: 10.1016/j.csl.2014.03.001 Document Type: Article

Times cited: (18)

References (31)

1
- 0023739214
- Voice conversion through vector quantization
- M. Abe, S. Nakamura, K. Shikano, and H. Kuwabara Voice conversion through vector quantization Proc. ICASSP 1988 655 658
- (1988) Proc. ICASSP , pp. 655-658
- Abe, M.¹ Nakamura, S.² Shikano, K.³ Kuwabara, H.⁴

2
- 0033154052
- Speaker transformation algorithm using segmental codebooks (STASC)
- L.M. Arslan Speaker transformation algorithm using segmental codebooks (STASC) Speech Communication 28 1999 211 226
- (1999) Speech Communication , vol.28 , pp. 211-226
- Arslan, L.M.¹

3
- 84865754815
- Voice conversion using GMM with enhanced global variance
- H. Benisty, and D. Malah Voice conversion using GMM with enhanced global variance Proc. Interspeech 2011 669 672
- (2011) Proc. Interspeech , pp. 669-672
- Benisty, H.¹ Malah, D.²

4
- 84875231469
- Evaluating the intelligibility benefit of speech modifications in known noise conditions
- M. Cooke, C. Mayo, C. Valentini-Botinhao, Y. Stylianou, B. Sauert, and Y. Tang Evaluating the intelligibility benefit of speech modifications in known noise conditions Speech Communication 55 2013 572 585
- (2013) Speech Communication , vol.55 , pp. 572-585
- Cooke, M.¹ Mayo, C.² Valentini-Botinhao, C.³ Stylianou, Y.⁴ Sauert, B.⁵ Tang, Y.⁶

5
- 77953707533
- Spectral mapping using artificial neural networks for voice conversion
- S. Desai, A.W. Black, B. Yegnanarayana, and K. Prahallad Spectral mapping using artificial neural networks for voice conversion IEEE Transactions on Audio, Speech and Language Processing 18 2010 954 964
- (2010) IEEE Transactions on Audio, Speech and Language Processing , vol.18 , pp. 954-964
- Desai, S.¹ Black, A.W.² Yegnanarayana, B.³ Prahallad, K.⁴

6
- 84994241109
- Including dynamic and phonetic information in voice conversion systems
- H. Duxans, A. Bonafonte, A. Kain, and J. Van Santen Including dynamic and phonetic information in voice conversion systems Proc. ICSLP 2004 1193 1196
- (2004) Proc. ICSLP , pp. 1193-1196
- Duxans, H.¹ Bonafonte, A.² Kain, A.³ Van Santen, J.⁴

7
- 77953727123
- Voice conversion based on weighted frequency warping
- D. Erro, A. Moreno, and A. Bonafonte Voice conversion based on weighted frequency warping IEEE Transactions on Audio, Speech and Language Processing 18 2010 922 931
- (2010) IEEE Transactions on Audio, Speech and Language Processing , vol.18 , pp. 922-931
- Erro, D.¹ Moreno, A.² Bonafonte, A.³

8
- 80051629671
- HNM-based MFCC + F0 extractor applied to statistical speech synthesis
- D. Erro, I. Sainz, E. Navas, and I. Hernaez HNM-based MFCC + F0 extractor applied to statistical speech synthesis Proc. ICASSP 2011 4728 4731
- (2011) Proc. ICASSP , pp. 4728-4731
- Erro, D.¹ Sainz, I.² Navas, E.³ Hernaez, I.⁴

9
- 84878409257
- Iterative MMSE estimation of vocal tract length normalization factors for voice transformation
- D. Erro, E. Navas, and I. Hernaez Iterative MMSE estimation of vocal tract length normalization factors for voice transformation Proc. Interspeech 2012 86 89
- (2012) Proc. Interspeech , pp. 86-89
- Erro, D.¹ Navas, E.² Hernaez, I.³

10
- 84888241651
- Towards physically interpretable parametric voice conversion functions
- D. Erro, A. Alonso, L. Serrano, E. Navas, and I. Hernaez Towards physically interpretable parametric voice conversion functions Lecture Notes in Artificial Intelligence 7911 2013 75 82
- (2013) Lecture Notes in Artificial Intelligence , vol.7911 , pp. 75-82
- Erro, D.¹ Alonso, A.² Serrano, L.³ Navas, E.⁴ Hernaez, I.⁵

11
- 84872177757
- Parametric voice conversion based on bilinear frequency warping plus amplitude scaling
- D. Erro, E. Navas, and I. Hernaez Parametric voice conversion based on bilinear frequency warping plus amplitude scaling IEEE Transactions on Audio, Speech and Language Processing 21 2013 556 566
- (2013) IEEE Transactions on Audio, Speech and Language Processing , vol.21 , pp. 556-566
- Erro, D.¹ Navas, E.² Hernaez, I.³

12
- 84857498745
- Voice conversion using dynamic frequency warping with amplitude scaling, for parallel or nonparallel corpora
- E. Godoy, O. Rosec, and T. Chonavel Voice conversion using dynamic frequency warping with amplitude scaling, for parallel or nonparallel corpora IEEE Transactions on Audio, Speech and Language Processing 20 2012 1313 1323
- (2012) IEEE Transactions on Audio, Speech and Language Processing , vol.20 , pp. 1313-1323
- Godoy, E.¹ Rosec, O.² Chonavel, T.³

13
- 77953712499
- Voice conversion using partial least squares regression
- E. Helander, T. Virtanen, J. Nurminen, and M. Gabbouj Voice conversion using partial least squares regression IEEE Transactions on Audio, Speech and Language Processing 18 2010 912 921
- (2010) IEEE Transactions on Audio, Speech and Language Processing , vol.18 , pp. 912-921
- Helander, E.¹ Virtanen, T.² Nurminen, J.³ Gabbouj, M.⁴

14
- 4444285698
- High Resolution Voice Transformation
- A. Kain High Resolution Voice Transformation 2001 Oregon Health and Science University Portland, Oregon, USA
- (2001) High Resolution Voice Transformation
- Kain, A.¹

15
- 33646773080
- J. Kominek, and A.W. Black CMU Arctic Databases for Speech Synthesis 2003
- (2003) CMU Arctic Databases for Speech Synthesis
- Kominek, J.¹ Black, A.W.²

16
- 78049373493
- Pronunciation variation generation for spontaneous speech synthesis using state-based voice transformation
- C.H. Lee, C.H. Wu, and J.C. Guo Pronunciation variation generation for spontaneous speech synthesis using state-based voice transformation Proc. ICASSP 2010 4826 4829
- (2010) Proc. ICASSP , pp. 4826-4829
- Lee, C.H.¹ Wu, C.H.² Guo, J.C.³

17
- 0032657747
- Speaker adaptation with all-pass transforms
- J. McDonough, and W. Byrne Speaker adaptation with all-pass transforms Proc. ICASSP 1999 757 760
- (1999) Proc. ICASSP , pp. 757-760
- McDonough, J.¹ Byrne, W.²

18
- 58149209073
- Voice conversion: State of the art and perspectives
- E. Moulines, and Y. Sagisaka Voice conversion: state of the art and perspectives Speech Communication 16 1995 125 126
- (1995) Speech Communication , vol.16 , pp. 125-126
- Moulines, E.¹ Sagisaka, Y.²

19
- 0029254176
- Transformation of formants for voice conversion using artificial neural networks
- M. Narendranath, H.A. Murthy, S. Rajendran, and B. Yegnanarayana Transformation of formants for voice conversion using artificial neural networks Speech Communication 16 1995 207 216
- (1995) Speech Communication , vol.16 , pp. 207-216
- Narendranath, M.¹ Murthy, H.A.² Rajendran, S.³ Yegnanarayana, B.⁴

20
- 27644522706
- Vocal tract normalization equals linear transformation in cepstral space
- M. Pitz, and H. Ney Vocal tract normalization equals linear transformation in cepstral space IEEE Transactions on Speech and Audio Processing 13 2005 930 944
- (2005) IEEE Transactions on Speech and Audio Processing , vol.13 , pp. 930-944
- Pitz, M.¹ Ney, H.²

21
- 4544361661
- Voice conversion through transformation of spectral and intonation features
- D. Rentzos, S. Vaseghi, Q. Yan, and C.H. Ho Voice conversion through transformation of spectral and intonation features Proc. ICASSP 2004 21 24
- (2004) Proc. ICASSP , pp. 21-24
- Rentzos, D.¹ Vaseghi, S.² Yan, Q.³ Ho, C.H.⁴

22
- 84921668468
- Versatile speech databases for high quality synthesis for Basque
- I. Sainz, D. Erro, E. Navas, I. Hernaez, J. Sanchez, I. Saratxaga, and I. Odriozola Versatile speech databases for high quality synthesis for Basque Proc. LREC 2012 3308 3312
- (2012) Proc. LREC , pp. 3308-3312
- Sainz, I.¹ Erro, D.² Navas, E.³ Hernaez, I.⁴ Sanchez, J.⁵ Saratxaga, I.⁶ Odriozola, I.⁷

23
- 34547507542
- Frequency warping based on mapping formant parameters
- Z. Shuang, R. Bakis, S. Shechtman, D. Chazan, and Y. Qin Frequency warping based on mapping formant parameters Proc. Interspeech 2006 2290 2293
- (2006) Proc. Interspeech , pp. 2290-2293
- Shuang, Z.¹ Bakis, R.² Shechtman, S.³ Chazan, D.⁴ Qin, Y.⁵

24
- 0032026483
- Continuous probabilistic transform for voice conversion
- Y. Stylianou, O. Cappe, and E. Moulines Continuous probabilistic transform for voice conversion IEEE Transactions on Speech and Audio Processing 6 1998 131 142
- (1998) IEEE Transactions on Speech and Audio Processing , vol.6 , pp. 131-142
- Stylianou, Y.¹ Cappe, O.² Moulines, E.³

25
- 84948175540
- VTLN-based voice conversion
- D. Suendermann, and H. Ney VTLN-based voice conversion Proc. ISSPIT 2003 556 559
- (2003) Proc. ISSPIT , pp. 556-559
- Suendermann, D.¹ Ney, H.²

26
- 80051619373
- One sentence voice adaptation using GMM-based frequency-warping and shift with a sub-band basis spectrum model
- M. Tamura, M. Morita, T. Kagoshima, and M. Akamine One sentence voice adaptation using GMM-based frequency-warping and shift with a sub-band basis spectrum model Proc. ICASSP 2011 5124 5127
- (2011) Proc. ICASSP , pp. 5124-5127
- Tamura, M.¹ Morita, M.² Kagoshima, T.³ Akamine, M.⁴

27
- 84946236688
- High quality voice conversion based on Gaussian mixture model with dynamic frequency warping
- T. Toda, H. Saruwatari, and K. Shikano High quality voice conversion based on Gaussian mixture model with dynamic frequency warping Proc. Interspeech 2001 349 352
- (2001) Proc. Interspeech , pp. 349-352
- Toda, T.¹ Saruwatari, H.² Shikano, K.³

28
- 57749193836
- Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory
- T. Toda, A.W. Black, and K. Tokuda Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory IEEE Transactions on Audio, Speech and Language Processing 15 2007 2222 2235
- (2007) IEEE Transactions on Audio, Speech and Language Processing , vol.15 , pp. 2222-2235
- Toda, T.¹ Black, A.W.² Tokuda, K.³

29
- 0026880275
- Voice transformation using PSOLA technique
- H. Valbret, E. Moulines, and J.P. Tubach Voice transformation using PSOLA technique Speech Communication 11 1992 175 187
- (1992) Speech Communication , vol.11 , pp. 175-187
- Valbret, H.¹ Moulines, E.² Tubach, J.P.³

30
- 78149260085
- Continuous stochastic feature mapping based on trajectory HMMs
- H. Zen, Y. Nankaku, and K. Tokuda Continuous stochastic feature mapping based on trajectory HMMs IEEE Transactions on Audio, Speech and Language Processing 19 2011 417 430
- (2011) IEEE Transactions on Audio, Speech and Language Processing , vol.19 , pp. 417-430
- Zen, H.¹ Nankaku, Y.² Tokuda, K.³

31
- 84871520443
- Improving the quality of standard GMM-based voice conversion systems by considering physically motivated linear transformations
- T.C. Zorila, D. Erro, and I. Hernaez Improving the quality of standard GMM-based voice conversion systems by considering physically motivated linear transformations Communications in Computer and Information Science 328 2012 30 39
- (2012) Communications in Computer and Information Science , vol.328 , pp. 30-39
- Zorila, T.C.¹ Erro, D.² Hernaez, I.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.