SCOPUS Information Retrieval Platform

2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2014

Volume , Issue , 2014, Pages

Exemplar-based emotional voice conversion using non-negative matrix factorization

(4) Aihara, Ryo ᵃ Ueda, Reina ᵃ Takiguchi, Tetsuya ᵃ Ariki, Yasuo ᵃ

ᵃ KOBE UNIVERSITY (Japan)

Author keywords

[No Author keywords available]

Indexed keywords

FACE RECOGNITION; MATRIX ALGEBRA; SPEECH COMMUNICATION; SPEECH PROCESSING;

COMPUTATIONAL TIME; EMOTIONAL VOICES; EXEMPLAR-BASED; NONNEGATIVE MATRIX FACTORIZATION; OBJECTIVE AND SUBJECTIVE EVALUATIONS; SET ALGORITHM; SOURCE SPECTRUM; SPEECH SIGNALS;

FACTORIZATION;

EID: 84949924136 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/APSIPA.2014.7041640 Document Type: Conference Paper

Times cited : (29)

References (31)

1
- 84966398940
- Optimising selection of units from speech database for concatenative synthesis
- A. W. Black and N. Campbell, "Optimising selection of units from speech database for concatenative synthesis," in EUROSPEECH, pp. 581-584, 1995.
- (1995) EUROSPEECH , pp. 581-584
- Black, A.W.¹ Campbell, N.²

2
- 0033708106
- Speech parameter generation algorithms for HMM based speech synthesis
- K. Tokuda, T. Yoshimura, T. Masuko, T. Kobayashi, and T. Kitamura, "Speech parameter generation algorithms for HMM based speech synthesis," in ICASSP, pp. 1315-1318, 2000.
- (2000) ICASSP , pp. 1315-1318
- Tokuda, K.¹ Yoshimura, T.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

3
- 85008006694
- A robust speaker adaptive HMM based text to speech synthesis
- J. Yamagishi, T. Nose, H. Zen, Z. H. Ling, T. Toda, K. Tokuda, S. King, and S. Renals, "A robust speaker adaptive HMM based text to speech synthesis," IEEE Trans. Speech Audio Lang. Process., pp. 1208-1230, 2009.
- (2009) IEEE Trans. Speech Audio Lang. Process , pp. 1208-1230
- Yamagishi, J.¹ Nose, T.² Zen, H.³ Ling, Z.H.⁴ Toda, T.⁵ Tokuda, K.⁶ King, S.⁷ Renals, S.⁸

4
- 77949913458
- Analysis of statistical parametric and unit selection speech synthesis systems applied to emotional speech
- R. Barra Chicote, J. Yamagishi, S. King, J. M. Montero, and J. Macias Guarasa, "Analysis of statistical parametric and unit selection speech synthesis systems applied to emotional speech," Speech Communication, vol. 52, pp. 394-404, 2010.
- (2010) Speech Communication , vol.52 , pp. 394-404
- Barra Chicote, R.¹ Yamagishi, J.² King, S.³ Montero, J.M.⁴ Macias Guarasa, J.⁵

5
- 0032026483
- Continuous probabilistic transform for voice conversion
- Y. Stylianou, O. Cappe, and E. Moulines, "Continuous probabilistic transform for voice conversion," IEEE Trans. Speech and Audio Processing, vol. 6, no. 2, pp. 131-142, 1998.
- (1998) IEEE Trans. Speech and Audio Processing , vol.6 , Issue.2 , pp. 131-142
- Stylianou, Y.¹ Cappe, O.² Moulines, E.³

6
- 0031623661
- Spectral voice conversion for text to speech synthesis
- A. Kain and M. W. Macon, "Spectral voice conversion for text to speech synthesis," in ICASSP, vol. 1, pp. 285-288, 1998.
- (1998) ICASSP , vol.1 , pp. 285-288
- Kain, A.¹ Macon, M.W.²

7
- 78649328053
- Survey on speech emotion recognition: Features, classification schemes, and databases
- M. El Ayadi, M. S. Kamel, and F. Karray, "Survey on speech emotion recognition: Features, classification schemes, and databases," Pattern Recognition, vol. 44, 2011.
- (2011) Pattern Recognition , vol.44
- El Ayadi, M.¹ Kamel, M.S.² Karray, F.³

8
- 84874248255
- Exemplar based voice conversion in noisy environment
- R. Takashima, T. Takiguchi, and Y. Ariki, "Exemplar based voice conversion in noisy environment," in SLT, pp. 313-317, 2012.
- (2012) SLT , pp. 313-317
- Takashima, R.¹ Takiguchi, T.² Ariki, Y.³

9
- 79960657803
- Exemplar based sparse representations for noise robust automatic speech recognition
- J. F. Gemmeke, T. Virtanen, and A. Hurmalainen, "Exemplar based sparse representations for noise robust automatic speech recognition," IEEE Trans. Audio, Speech and Language Processing, vol. 19, no. 7, pp. 2067-2080, 2011.
- (2011) IEEE Trans. Audio, Speech and Language Processing , vol.19 , Issue.7 , pp. 2067-2080
- Gemmeke, J.F.¹ Virtanen, T.² Hurmalainen, A.³

10
- 84898964201
- Algorithms for non negative matrix factorization
- D. D. Lee and H. S. Seung, "Algorithms for non negative matrix factorization," Neural Information Processing System, pp. 556-562, 2001.
- (2001) Neural Information Processing System , pp. 556-562
- Lee, D.D.¹ Seung, H.S.²

11
- 50249152311
- Monaural sound source separation by non negative matrix factorization with temporal continuity and sparseness criteria
- T. Virtanen, "Monaural sound source separation by non negative matrix factorization with temporal continuity and sparseness criteria," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 3, pp. 1066-1074, 2007.
- (2007) IEEE Trans. Audio, Speech, Lang. Process , vol.15 , Issue.3 , pp. 1066-1074
- Virtanen, T.¹

12
- 44949110218
- Single channel speech separation using sparse non negative matrix factorization
- M. N. Schmidt and R. K. Olsson, "Single channel speech separation using sparse non negative matrix factorization," in Interspeech, 2006.
- (2006) Interspeech
- Schmidt, M.N.¹ Olsson, R.K.²

13
- 84905268745
- Active set newton algorithm for non negative sparse coding of audio
- T. Virtanen, B. Raj, J. F. Gemmeke, and H. Van hamme, "Active set newton algorithm for non negative sparse coding of audio," in ICASSP, pp. 3116-3120, 2014.
- (2014) ICASSP , pp. 3116-3120
- Virtanen, T.¹ Raj, B.² Gemmeke, J.F.³ Van Hamme, H.⁴

14
- 0023739214
- Esophageal speech enhancement based on statistical voice conversion with Gaussian mixture models
- M. Abe, S. Nakamura, K. Shikano, and H. Kuwabara, "Esophageal speech enhancement based on statistical voice conversion with Gaussian mixture models," in Proc. ICASSP, pp. 655-658, 1988.
- (1988) Proc. ICASSP , pp. 655-658
- Abe, M.¹ Nakamura, S.² Shikano, K.³ Kuwabara, H.⁴

15
- 0026880275
- Voice transformation using PSOLA technique
- H. Valbret, E. Moulines, and J. P. Tubach, "Voice transformation using PSOLA technique," Speech Communication, vol. 11, no. 2-3, pp. 175-187, 1992.
- (1992) Speech Communication , vol.11 , Issue.2-3 , pp. 175-187
- Valbret, H.¹ Moulines, E.² Tubach, J.P.³

16
- 57749193836
- Voice conversion based on maximum likelihood estimation of spectral parameter trajectory
- T. Toda, A. Black, and K. Tokuda, "Voice conversion based on maximum likelihood estimation of spectral parameter trajectory," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 8, pp. 2222-2235, 2007.
- (2007) IEEE Trans. Audio, Speech, Lang. Process , vol.15 , Issue.8 , pp. 2222-2235
- Toda, T.¹ Black, A.² Tokuda, K.³

17
- 77953712499
- Voice conversion using partial least squares regression
- E. Helander, T. Virtanen, J. Nurminen, and M. Gabbouj, "Voice conversion using partial least squares regression," IEEE Trans. Audio, Speech, Lang. Process., vol. 18, no. 5, pp. 912-921, 2010.
- (2010) IEEE Trans. Audio, Speech, Lang. Process , vol.18 , Issue.5 , pp. 912-921
- Helander, E.¹ Virtanen, T.² Nurminen, J.³ Gabbouj, M.⁴

18
- 44949210554
- Map based adaptation for speech conversion using adaptation data selection and non parallel training
- C. H. Lee and C. H. Wu, "Map based adaptation for speech conversion using adaptation data selection and non parallel training," in Interspeech, pp. 2254-2257, 2006.
- (2006) Interspeech , pp. 2254-2257
- Lee, C.H.¹ Wu, C.H.²

19
- 34547512822
- Eigenvoice conversion based on Gaussian mixture model
- T. Toda, Y. Ohtani, and K. Shikano, "Eigenvoice conversion based on Gaussian mixture model," in Interspeech, pp. 2446-2449, 2006.
- (2006) Interspeech , pp. 2446-2449
- Toda, T.¹ Ohtani, Y.² Shikano, K.³

20
- 84865798483
- One to many voice conversion based on tensor representation of speaker space
- D. Saito, K. Yamamoto, N. Minematsu, and K. Hirose, "One to many voice conversion based on tensor representation of speaker space," in Interspeech, pp. 653-656, 2011.
- (2011) Interspeech , pp. 653-656
- Saito, D.¹ Yamamoto, K.² Minematsu, N.³ Hirose, K.⁴

21
- 84905227265
- Voice conversion based on non negative matrix factorization using phoneme categorized dictionary
- R. Aihara, T. Nakashika, T. Takiguchi, and Y. Ariki, "Voice conversion based on non negative matrix factorization using phoneme categorized dictionary," in ICASSP, pp. 7944-7948, 2014.
- (2014) ICASSP , pp. 7944-7948
- Aihara, R.¹ Nakashika, T.² Takiguchi, T.³ Ariki, Y.⁴

22
- 84905269973
- Multimodal voice conversion using non negative matrix factorization in noisy environments
- K. Masaka, R. Aihara, T. Takiguchi, and Y. Ariki, "Multimodal voice conversion using non negative matrix factorization in noisy environments," ICASSP2014, pp. 1561-1565, 2014.
- (2014) ICASSP2014 , pp. 1561-1565
- Masaka, K.¹ Aihara, R.² Takiguchi, T.³ Ariki, Y.⁴

23
- 84890519936
- Individuality-preserving voice conversion for articulation disorders based on nonnegative matrix factorization
- R. Aihara, R. Takashima, T. Takiguchi, and Y. Ariki, "Individuality-preserving voice conversion for articulation disorders based on Nonnegative Matrix Factorization," in ICASSP, pp. 8037-8040, 2013.
- (2013) ICASSP , pp. 8037-8040
- Aihara, R.¹ Takashima, R.² Takiguchi, T.³ Ariki, Y.⁴

24
- 34247615610
- Emotional speech synthesis using subspace constraints in prosody
- S. Mori, T. Moriyama, and S. Ozawa, "Emotional speech synthesis using subspace constraints in prosody," IEEE Conference on Multimedia and Expo, pp. 1093-1096, 2006.
- (2006) IEEE Conference on Multimedia and Expo , pp. 1093-1096
- Mori, S.¹ Moriyama, T.² Ozawa, S.³

25
- 77955722263
- Hierarchical prosody conversion using regression based clustering for emotional synthesis
- C. H. Wu, C. C. Hsia, and C. H. Lee, "Hierarchical prosody conversion using regression based clustering for emotional synthesis," IEEE Trans. Audio, Speech and Lang Proc., 2010.
- (2010) IEEE Trans. Audio, Speech and Lang Proc
- Wu, C.H.¹ Hsia, C.C.² Lee, C.H.³

26
- 56149126461
- GMM based voice conversion applied to emotional speech synthesis
- H. Kawanami, Y. Iwami, T. Toda, H. Saruwatari, and K. Shikano, "GMM based voice conversion applied to emotional speech synthesis," IEEE Trans. Speech and Audio Proc., 1999.
- (1999) IEEE Trans. Speech and Audio Proc
- Kawanami, H.¹ Iwami, Y.² Toda, T.³ Saruwatari, H.⁴ Shikano, K.⁵

27
- 84865747520
- Intonation conversion from neutral to expressive speech
- C. Veaux and X. Rodet, "Intonation conversion from neutral to expressive speech," in Interspeech, pp. 2765-2768, 2011.
- (2011) Interspeech , pp. 2765-2768
- Veaux, C.¹ Rodet, X.²

28
- 84890451203
- GMM based emotional voice conversion using spectrum and prosody features
- R. Aihara, R. Takashima, T. Takiguchi, and Y. Ariki, "GMM based emotional voice conversion using spectrum and prosody features," American Journal of Signal Processing, vol. 2, no. 5, pp. 134-138, 2012.
- (2012) American Journal of Signal Processing , vol.2 , Issue.5 , pp. 134-138
- Aihara, R.¹ Takashima, R.² Takiguchi, T.³ Ariki, Y.⁴

29
- 33750915991
- STRAIGHT, exploitation of the other aspect of vocoder: Perceptually isomorphic decomposition of speech sounds
- H. Kawahara, "STRAIGHT, exploitation of the other aspect of vocoder: Perceptually isomorphic decomposition of speech sounds," Acoustical Science and Technology, pp. 349-353, 2006.
- (2006) Acoustical Science and Technology , pp. 349-353
- Kawahara, H.¹

30
- 0025475528
- ATR Japanese speech database as a tool of speech recognition and synthesis
- A. Kurematsu, K. Takeda, Y. Sagisaka, S. Katagiri, H. Kuwabara, and K. Shikano, "ATR Japanese speech database as a tool of speech recognition and synthesis," Speech Communication, vol. 9, pp. 357-363, 1990.
- (1990) Speech Communication , vol.9 , pp. 357-363
- Kurematsu, A.¹ Takeda, K.² Sagisaka, Y.³ Katagiri, S.⁴ Kuwabara, H.⁵ Shikano, K.⁶

31
- 84905271796
- Noise robust voice conversion based on spectral mapping on sparse space
- R. Takashima, R. Aihara, T. Takiguchi, and Y. Ariki, "Noise robust voice conversion based on spectral mapping on sparse space," SSW8, 8th ISCA Speech Synthesis Workshop, pp. 71-75, 2013.
- (2013) SSW8, 8th ISCA Speech Synthesis Workshop , pp. 71-75
- Takashima, R.¹ Aihara, R.² Takiguchi, T.³ Ariki, Y.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.