SCOPUS Information Retrieval Platform

Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, ISCSLP 2014

Volume , Issue , 2014, Pages 197-200

Pitch transformation in neural network based voice conversion

(4) Xie, Feng Long a,b; Qian, Yao b; Soong, Frank K. b; Li, Haifeng a

a HARBIN INSTITUTE OF TECHNOLOGY (China)

b MICROSOFT RESEARCH ASIA (China)

Author keywords

neural network; pitch; voice conversion

Indexed keywords

WAVELET DECOMPOSITION;

CONVERSION SYSTEMS; DISCONTINUITY PROPERTY; F0 CONTOURS; NORMALIZED TRANSFORMATIONS; PITCH; RESEARCH TOPICS; SPECTRAL FEATURE; VOICE CONVERSION;

NEURAL NETWORKS;

EID: 84912078522 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ISCSLP.2014.6936599 Document Type: Conference Paper

Times cited : (9)

References (24)

1
- 0032026483
- Continuous probabilistic transform for voice conversion
- Y. Stylianou, O. Cappe, and E. Moulines, "Continuous probabilistic transform for voice conversion," IEEE Trans. on Audio Speech and Language Processing, vol. 6, no. 2, pp. 131-142, 1998.
- (1998) IEEE Trans. on Audio Speech and Language Processing , vol.6 , Issue.2 , pp. 131-142
- Stylianou, Y.¹ Cappe, O.² Moulines, E.³

2
- 57749193836
- Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory
- T. Toda, A. Black, and K. Tokuda, "Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory," IEEE Trans. on Audio Speech and Language Processing, vol. 15, no. 8, pp. 2222-2235, 2007.
- (2007) IEEE Trans. on Audio Speech and Language Processing , vol.15 , Issue.8 , pp. 2222-2235
- Toda, T.¹ Black, A.² Tokuda, K.³

3
- 0029254176
- Transformation of formants for voice conversion using artificial neural networks
- M. Narendranath, H. A. Murthy, S. Rajendran, and B. Yegnanarayana, "Transformation of formants for voice conversion using artificial neural networks," Speech Commun., vol. 16, no. 2, pp. 207-216, 1995.
- (1995) Speech Commun. , vol.16 , Issue.2 , pp. 207-216
- Narendranath, M.¹ Murthy, H.A.² Rajendran, S.³ Yegnanarayana, B.⁴

4
- 77953707533
- Spectral mapping using artificial neural networks for voice conversion
- S. Desai, A. Black, B. Yegnanarayana, K. Prahallad, "Spectral Mapping Using Artificial Neural Networks for Voice Conversion," IEEE Trans. on Audio Speech and Language Processing, vol. 18, no. 5, pp. 954-964, 2010.
- (2010) IEEE Trans. on Audio Speech and Language Processing , vol.18 , Issue.5 , pp. 954-964
- Desai, S.¹ Black, A.² Yegnanarayana, B.³ Prahallad, K.⁴

5
- 84910068272
- Continuous wavelet transform for analysis of speech prosody
- M. Vainio, A. Suni, and D. Aalto, "Continuous wavelet transform for analysis of speech prosody," TRASP, pp. 78-81, 2013.
- (2013) TRASP , pp. 78-81
- Vainio, M.¹ Suni, A.² Aalto, D.³

6
- 85089106384
- Estimating phrase curves in the general superpositional intonation model
- Pittsburgh
- J. P. H. van Santen, T. Mishra, E. Klabbers, "Estimating phrase curves in the general superpositional intonation model," Proc. 5th ISCA speech synthesis workshop, Pittsburgh, 2004.
- (2004) Proc. 5th ISCA Speech Synthesis Workshop
- Van Santen, J.P.H.¹ Mishra, T.² Klabbers, E.³

7
- 84867198266
- Incorporating durational modification in voice transformation
- A. R. Toth and A. W. Black, "Incorporating durational modification in voice transformation", in Proc. Interspeech, pp. 1088-1091, 2008.
- (2008) Proc. Interspeech , pp. 1088-1091
- Toth, A.R.¹ Black, A.W.²

8
- 85010815133
- Voice transformation using PSOLA technique
- H. Valbret, E. Moulines, and J. P. Tubach, "Voice transformation using PSOLA technique," in Proc. ICASSP, pp. 145-148, 1992.
- (1992) Proc. ICASSP , pp. 145-148
- Valbret, H.¹ Moulines, E.² Tubach, J.P.³

9
- 0023739214
- Voice conversion through vector quantization
- M. Abe, S. Nakamura, K. Shikano, and H. Kuwabara, "Voice conversion through vector quantization," in Proc. ICASSP, pp. 655-658, 1988.
- (1988) Proc. ICASSP , pp. 655-658
- Abe, M.¹ Nakamura, S.² Shikano, K.³ Kuwabara, H.⁴

10
- 84863934040
- Duration modelling in voice conversion using artificial neural networks
- R. Srikanth, B. Bajibabu, K. Prahallad, "Duration modelling in voice conversion using artificial neural networks," in Proc. IWSSIP, pp. 556-559, 2012.
- (2012) Proc. IWSSIP , pp. 556-559
- Srikanth, R.¹ Bajibabu, B.² Prahallad, K.³

11
- 56149097756
- F0 transformation within the voice conversion framework
- Z. Hanzlicek, J. Matousek, "F0 transformation within the voice conversion framework," in Proc. Interspeech, pp. 1961-1964, 2007.
- (2007) Proc. Interspeech , pp. 1961-1964
- Hanzlicek, Z.¹ Matousek, J.²

12
- 60849084576
- Multi-layer F0 modelling for HMM-based speech synthesis
- C. Wang, Z. Ling, B. Zhang, and L. Dai, "Multi-layer F0 modelling for HMM-based speech synthesis," in Proc. ISCSLP, pp. 129-132, 2008.
- (2008) Proc. ISCSLP , pp. 129-132
- Wang, C.¹ Ling, Z.² Zhang, B.³ Dai, L.⁴

13
- 79959844205
- A hierarchical F0 modelling method for HMM-based speech synthesis
- M. Lei, Y. Wu, F. K. Soong, Z. Ling, and L. Dai, "A hierarchical F0 modelling method for HMM-based speech synthesis," Proc. of Interspeech, 2010.
- (2010) Proc. of Interspeech
- Lei, M.¹ Wu, Y.² Soong, F.K.³ Ling, Z.⁴ Dai, L.⁵

14
- 44049085520
- High quality voice conversion through phoneme based linear mapping functions with STRAIGHT for Mandarin
- K. Liu, J. Zhang, and Y. Yan, "High quality voice conversion through phoneme based linear mapping functions with STRAIGHT for Mandarin," in Proc. 4th Int. Conf. Fuzzy Syst. Knowl. Discovery (FSKD 2007), 2007, vol. 4, pp. 410-414.
- (2007) Proc. 4th Int. Conf. Fuzzy Syst. Knowl. Discovery (FSKD 2007) , vol.4 , pp. 410-414
- Liu, K.¹ Zhang, J.² Yan, Y.³

15
- 84906225084
- Joint spectral distribution modeling using restricted Boltzmann machines for voice conversion
- L. Chen, Z. Ling, Y. Song, L. Dai, "Joint Spectral Distribution Modeling Using Restricted Boltzmann Machines for Voice Conversion," in Proc. Interspeech, pp. 3053-3056, 2013.
- (2013) Proc. Interspeech , pp. 3053-3056
- Chen, L.¹ Ling, Z.² Song, Y.³ Dai, L.⁴

16
- 0027887751
- Stochastic gradient techniques for the efficient simulation of high-speed networks using importance sampling
- M. Devetsikiotis, W. A. Al-Qaq, J. A. Freebersyser, J. K. Townsend, "Stochastic gradient techniques for the efficient simulation of high-speed networks using importance sampling," in Proc. GLOBECOM, pp. 751-756, 1993.
- (1993) Proc. GLOBECOM , pp. 751-756
- Devetsikiotis, M.¹ Al-Qaq, W.A.² Freebersyser, J.A.³ Townsend, J.K.⁴

17
- 33646773080
- The CMU ARCTIC databases for speech synthesis
- Language Technologies Institute, Carnegie Mellon University
- J. Kominek and A. Black, "The CMU ARCTIC databases for speech synthesis," Tech. Rep. CMU-LTI-03-177, Language Technologies Institute, Carnegie Mellon University, 2003.
- (2003) Tech. Rep. CMU-LTI-03-177
- Kominek, J.¹ Black, A.²

18
- 84865725683
- Growing a spoken language interface on Amazon Mechanical Turk
- I. McGraw, J. Glass and S. Seneff, "Growing a Spoken Language Interface on Amazon Mechanical Turk", in Proc. of Interspeech, 2011.
- (2011) Proc. of Interspeech
- McGraw, I.¹ Glass, J.² Seneff, S.³

19
- 84910090364
- Line spectral pairs based voice conversion using radial basis function
- J. H. Nirmal, S. Patnaik, Mukesh A. Zaveri, "Line Spectral Pairs Based Voice Conversion Using Radial Basis Function," Int. J. on Signal & Image Processing, vol. 4, no. 2, pp. 26-33, 2013.
- (2013) Int. J. on Signal & Image Processing , vol.4 , Issue.2 , pp. 26-33
- Nirmal, J.H.¹ Patnaik, S.² Zaveri, M.A.³

20
- 84888246669
- Nonlinear pitch modification in voice conversion using artificial neural networks
- Bajibabu Bollepalli, Jonas Beskow, Joakim Gustafson, "Nonlinear Pitch Modification in Voice Conversion Using Artificial Neural Networks", Advances in Nonlinear Speech Processing Lecture Notes in Computer Science, Vol. 7911, pp. 97-103, 2013.
- (2013) Advances in Nonlinear Speech Processing Lecture Notes in Computer Science , vol.7911 , pp. 97-103
- Bollepalli, B.¹ Beskow, J.² Gustafson, J.³

21
- 84867199771
- Simultaneous conversion of duration and spectrum based on statistical models including time-sequence matching
- K. Yutani, Y. Utoi, Y. Nankaku, T. Toda, K. Tokuda, "Simultaneous Conversion of Duration and Spectrum Based on Statistical Models Including Time-Sequence Matching," in Proc. INTERSPEECH, pp. 1072-1075, 2008.
- (2008) Proc. INTERSPEECH , pp. 1072-1075
- Yutani, K.¹ Utoi, Y.² Nankaku, Y.³ Toda, T.⁴ Tokuda, K.⁵

22
- 78149241363
- Spectral conversion based on statistical models including time-sequence matching
- Y. Nankaku, K. Nakamura, T. Toda, K. Tokuda, "Spectral Conversion based on Statistical Models Including Time-Sequence Matching", in Proc. 6th ISCA speech synthesis workshop, pp. 333-338, 2007.
- (2007) Proc. 6th ISCA Speech Synthesis Workshop , pp. 333-338
- Nankaku, Y.¹ Nakamura, K.² Toda, T.³ Tokuda, K.⁴

23
- 80051624021
- Improved F0 modeling and generation in voice conversion
- A. Kunikoshi, Y. Qian, F. K. Soong, and N. Minematsu, "Improved F0 modeling and generation in voice conversion," in Proc. ICASSP, 2011.
- (2011) Proc. ICASSP
- Kunikoshi, A.¹ Qian, Y.² Soong, F.K.³ Minematsu, N.⁴

24
- 84910087395
- Sequence Error (SE) minimization training of neural network for voice conversion
- Accepted
- F.-L. Xie, Y. Qian, F. K. Soong, H. Li, "Sequence Error (SE) Minimization Training of Neural Network for Voice Conversion," in Proc. Interspeech, 2014 (accepted).
- (2014) Proc. Interspeech
- Xie, F.-L.¹ Qian, Y.² Soong, F.K.³ Li, H.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.