SCOPUS Information Search Platform

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

Volume 2015-January, 2015, Pages 2749-2753

Many-to-many voice conversion based on multiple non-negative matrix factorization

(3) Aihara, Ryo a Takiguchi, Tetsuya a Ariki, Yasuo a

a KOBE UNIVERSITY (Japan)

Author keywords

Exemplar-based; Many-to-many; NMF; Speech synthesis; Voice conversion

Indexed keywords

FACTORIZATION; GAUSSIAN DISTRIBUTION; SPEECH COMMUNICATION; SPEECH PROCESSING; SPEECH SYNTHESIS;

EXEMPLAR-BASED; GAUSSIAN MIXTURE MODEL; MANY-TO-MANY; NOISE ROBUSTNESS; NONNEGATIVE MATRIX FACTORIZATION; PARALLEL TRAINING; VOICE CONVERSION; VOICE QUALITY CONTROLS;

MATRIX ALGEBRA;

EID: 84959090646 PISSN: 2308457X EISSN: 19909772 Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (3)

References (27)

1
- 0032026483
- Continuous probabilistic transform for voice conversion
- Y. Stylianou, O. Cappe, and E. Moulines, "Continuous probabilistic transform for voice conversion," IEEE Trans. Speech and Audio Processing, vol. 6, no. 2, pp. 131-142, 1998.
- (1998) IEEE Trans. Speech and Audio Processing , vol.6 , Issue.2 , pp. 131-142
- Stylianou, Y.¹ Cappe, O.² Moulines, E.³

2
- 80052698826
- Speaking-aid systems using GMM-based voice conversion for electrolaryngeal speech
- K. Nakamura, T. Toda, H. Saruwatari, and K. Shikano, "Speaking-aid systems using GMM-based voice conversion for electrolaryngeal speech," Speech Communication, vol. 54, no. 1, pp. 134-146, 2012.
- (2012) Speech Communication , vol.54 , Issue.1 , pp. 134-146
- Nakamura, K.¹ Toda, T.² Saruwatari, H.³ Shikano, K.⁴

3
- 0031623661
- Spectral voice conversion for text-to-speech synthesis
- A. Kain and M. W. Macon, "Spectral voice conversion for text-to-speech synthesis," in Proc. ICASSP, vol. 1, pp. 285-288, 1998.
- (1998) Proc. ICASSP , vol.1 , pp. 285-288
- Kain, A.¹ Macon, M.W.²

4
- 84910069658
- A mel-cepstral analysis technique restoring high frequency components from low-sampling-rate speech
- K. Nakamura, K. Hashimoto, K. Oura, Y. Nankaku, and K. Tokuda, "A mel-cepstral analysis technique restoring high frequency components from low-sampling-rate speech," in Proc. Interspeech, pp. 2494-2498, 2014.
- (2014) Proc. Interspeech , pp. 2494-2498
- Nakamura, K.¹ Hashimoto, K.² Oura, K.³ Nankaku, Y.⁴ Tokuda, K.⁵

5
- 84910024857
- GMM-based bandwidth extension using sub-band basis spectrum model
- Y. Ohtani, M. Tamura, M. Morita, and M. Akamine, "GMM-based bandwidth extension using sub-band basis spectrum model," in Proc. Interspeech, pp. 2489-2493, 2014.
- (2014) Proc. Interspeech , pp. 2489-2493
- Ohtani, Y.¹ Tamura, M.² Morita, M.³ Akamine, M.⁴

6
- 0023739214
- Esophageal speech enhancement based on statistical voice conversion with Gaussian mixture models
- M. Abe, S. Nakamura, K. Shikano, and H. Kuwabara, "Esophageal speech enhancement based on statistical voice conversion with Gaussian mixture models," in Proc. ICASSP, pp. 655-658, 1988.
- (1988) Proc. ICASSP , pp. 655-658
- Abe, M.¹ Nakamura, S.² Shikano, K.³ Kuwabara, H.⁴

7
- 0026880275
- Voice transformation using PSOLA technique
- H. Valbret, E. Moulines, and J. P. Tubach, "Voice transformation using PSOLA technique," Speech Communication, vol. 11, no. 2-3, pp. 175-187, 1992.
- (1992) Speech Communication , vol.11 , Issue.2-3 , pp. 175-187
- Valbret, H.¹ Moulines, E.² Tubach, J.P.³

8
- 57749193836
- Voice conversion based on maximum likelihood estimation of spectral parameter trajectory
- T. Toda, A. Black, and K. Tokuda, "Voice conversion based on maximum likelihood estimation of spectral parameter trajectory," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 8, pp. 2222-2235, 2007.
- (2007) IEEE Trans. Audio, Speech, Lang. Process , vol.15 , Issue.8 , pp. 2222-2235
- Toda, T.¹ Black, A.² Tokuda, K.³

9
- 77953712499
- Voice conversion using partial least squares regression
- E. Helander, T. Virtanen, J. Nurminen, and M. Gabbouj, "Voice conversion using partial least squares regression," IEEE Trans. Audio, Speech, Lang. Process., vol. 18, no. 5, pp. 912-921, 2010.
- (2010) IEEE Trans. Audio, Speech, Lang. Process , vol.18 , Issue.5 , pp. 912-921
- Helander, E.¹ Virtanen, T.² Nurminen, J.³ Gabbouj, M.⁴

10
- 84874248255
- Exemplar-based voice conversion in noisy environment
- R. Takashima, T. Takiguchi, and Y. Ariki, "Exemplar-based voice conversion in noisy environment," in Proc. SLT, pp. 313-317, 2012.
- (2012) Proc. SLT , pp. 313-317
- Takashima, R.¹ Takiguchi, T.² Ariki, Y.³

11
- 84911369131
- Exemplar-based sparse representation with residual compensation for voice conversion
- Z. Wu, T. Virtanen, E. S. Chng, and H. Li, "Exemplar-based sparse representation with residual compensation for voice conversion," IEEE Trans. Audio, Speech, Lang. Process., vol. 22, no. 10, pp. 1506-1521, 2014.
- (2014) IEEE Trans. Audio, Speech, Lang. Process , vol.22 , Issue.10 , pp. 1506-1521
- Wu, Z.¹ Virtanen, T.² Chng, E.S.³ Li, H.⁴

12
- 84901806271
- Noise-robust voice conversion based on sparse spectral mapping using non-negative matrix factorization
- R. Aihara, R. Takashima, T. Takiguchi, and Y. Ariki, "Noise-robust voice conversion based on sparse spectral mapping using non-negative matrix factorization," IEICE Transactions on Information and Systems, vol. E97-D, no. 6, pp. 1411-1418, 2014.
- (2014) IEICE Transactions on Information and Systems , vol.E97-D , Issue.6 , pp. 1411-1418
- Aihara, R.¹ Takashima, R.² Takiguchi, T.³ Ariki, Y.⁴

13
- 84898964201
- Algorithms for non-negative matrix factorization
- D. D. Lee and H. S. Seung, "Algorithms for non-negative matrix factorization," Neural Information Processing System, pp. 556-562, 2001.
- (2001) Neural Information Processing System , pp. 556-562
- Lee, D.D.¹ Seung, H.S.²

14
- 44949110218
- Single-channel speech separation using sparse non-negative matrix factorization
- M. N. Schmidt and R. K. Olsson, "Single-channel speech separation using sparse non-negative matrix factorization," in Proc. Interspeech, 2006.
- (2006) Proc. Interspeech
- Schmidt, M.N.¹ Olsson, R.K.²

15
- 50249152311
- Monaural sound source separation by non-negative matrix factorization with temporal continuity and sparseness criteria
- T. Virtanen, "Monaural sound source separation by non-negative matrix factorization with temporal continuity and sparseness criteria," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 3, pp. 1066-1074, 2007.
- (2007) IEEE Trans. Audio, Speech, Lang. Process , vol.15 , Issue.3 , pp. 1066-1074
- Virtanen, T.¹

16
- 79960657803
- Exemplar-based sparse representations for noise robust automatic speech recognition
- J. F. Gemmeke, T. Virtanen, and A. Hurmalainen, "Exemplar-based sparse representations for noise robust automatic speech recognition," IEEE Trans. Audio, Speech, Lang. Process., vol. 19, no. 7, pp. 2067-2080, 2011.
- (2011) IEEE Trans. Audio, Speech, Lang. Process , vol.19 , Issue.7 , pp. 2067-2080
- Gemmeke, J.F.¹ Virtanen, T.² Hurmalainen, A.³

17
- 84905227265
- Voice conversion based on non-negative matrix factorization using phoneme-categorized dictionary
- R. Aihara, T. Nakashika, T. Takiguchi, and Y. Ariki, "Voice conversion based on non-negative matrix factorization using phoneme-categorized dictionary," in Proc. ICASSP, pp. 7944-7948, 2014.
- (2014) Proc. ICASSP , pp. 7944-7948
- Aihara, R.¹ Nakashika, T.² Takiguchi, T.³ Ariki, Y.⁴

18
- 84901801701
- A preliminary demonstration of exemplar-based voice conversion for articulation disorders using an individuality-preserving dictionary
- R. Aihara, R. Takashima, T. Takiguchi, and Y. Ariki, "A preliminary demonstration of exemplar-based voice conversion for articulation disorders using an individuality-preserving dictionary," EURASIP Journal on Audio, Speech, and Music Processing, vol. 2014: 5, doi: 10.1186/1687-4722-2014-5, 2014.
- (2014) EURASIP Journal on Audio, Speech, and Music Processing , vol.2014 , pp. 5
- Aihara, R.¹ Takashima, R.² Takiguchi, T.³ Ariki, Y.⁴

19
- 84910091291
- Multimodal exemplar-based voice conversion using lip features in noisy environments
- K. Masaka, R. Aihara, T. Takiguchi, and Y. Ariki, "Multimodal exemplar-based voice conversion using lip features in noisy environments," in Proc. INTERSPEECH, pp. 1159-1163, 2014.
- (2014) Proc. INTERSPEECH , pp. 1159-1163
- Masaka, K.¹ Aihara, R.² Takiguchi, T.³ Ariki, Y.⁴

20
- 44949210554
- MAP-based adaptation for speech conversion using adaptation data selection and non-parallel training
- C. H. Lee and C. H. Wu, "MAP-based adaptation for speech conversion using adaptation data selection and non-parallel training," in Proc. INTERSPEECH, pp. 2254-2257, 2006.
- (2006) Proc. INTERSPEECH , pp. 2254-2257
- Lee, C.H.¹ Wu, C.H.²

21
- 34047245444
- Nonparallel training for voice conversion based on a parameter adaptation approach
- A. Mouchtaris, J. Van der Spiegel, and P. Mueller, "Nonparallel training for voice conversion based on a parameter adaptation approach," IEEE Trans. Audio, Speech, Lang. Process., vol. 14, no. 3, pp. 952-963, 2006.
- (2006) IEEE Trans. Audio, Speech, Lang. Process , vol.14 , Issue.3 , pp. 952-963
- Mouchtaris, A.¹ Van der Spiegel, J.² Mueller, P.³

22
- 34547512822
- Eigenvoice conversion based on Gaussian mixture model
- T. Toda, Y. Ohtani, and K. Shikano, "Eigenvoice conversion based on Gaussian mixture model," in Proc. Interspeech, pp. 2446-2449, 2006.
- (2006) Proc. Interspeech , pp. 2446-2449
- Toda, T.¹ Ohtani, Y.² Shikano, K.³

23
- 70450194389
- Many-to-many eigenvoice conversion with reference voice
- Y. Ohtani, T. Toda, H. Saruwatari, and K. Shikano, "Many-to-many eigenvoice conversion with reference voice," in Proc. Interspeech, pp. 1623-1626, 2009.
- (2009) Proc. Interspeech , pp. 1623-1626
- Ohtani, Y.¹ Toda, T.² Saruwatari, H.³ Shikano, K.⁴

24
- 84865798483
- One-to-many voice conversion based on tensor representation of speaker space
- D. Saito, K. Yamamoto, N. Minematsu, and K. Hirose, "One-to-many voice conversion based on tensor representation of speaker space," in Proc. INTERSPEECH, pp. 653-656, 2011.
- (2011) Proc. INTERSPEECH , pp. 653-656
- Saito, D.¹ Yamamoto, K.² Minematsu, N.³ Hirose, K.⁴

25
- 0025475528
- ATR Japanese speech database as a tool of speech recognition and synthesis
- A. Kurematsu, K. Takeda, Y. Sagisaka, S. Katagiri, H. Kuwabara, and K. Shikano, "ATR Japanese speech database as a tool of speech recognition and synthesis," Speech Communication, vol. 9, pp. 357-363, 1990.
- (1990) Speech Communication , vol.9 , pp. 357-363
- Kurematsu, A.¹ Takeda, K.² Sagisaka, Y.³ Katagiri, S.⁴ Kuwabara, H.⁵ Shikano, K.⁶

26
- 33750915991
- STRAIGHT, exploitation of the other aspect of vocoder: Perceptually isomorphic decomposition of speech sounds
- H. Kawahara, "STRAIGHT, exploitation of the other aspect of vocoder: Perceptually isomorphic decomposition of speech sounds," Acoustical Science and Technology, pp. 349-353, 2006.
- (2006) Acoustical Science and Technology , pp. 349-353
- Kawahara, H.¹

27
- 84901788410
- INTERNATIONAL TELECOMMUNICATION UNION, ITU-T Recommendation P.800
- INTERNATIONAL TELECOMMUNICATION UNION, "Methods for objective and subjective assessment of quality," ITU-T Recommendation P.800, 2003.
- (2003) Methods for Objective and Subjective Assessment of Quality

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.