SCOPUS Information Retrieval Platform

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

Volume , Issue , 2014, Pages 1159-1163

Multimodal exemplar-based voice conversion using lip features in noisy environments

(4) Masaka, Kenta (a); Aihara, Ryo (a); Takiguchi, Tetsuya (a); Ariki, Yasuo (a)

(a) KOBE UNIVERSITY (Japan)

Author keywords

Image features; Multimodal; Noise robustness; Non negative matrix factorization; Voice conversion

Indexed keywords

ACOUSTIC NOISE; COST FUNCTIONS; FACE RECOGNITION; FACTORIZATION; SIGNAL PROCESSING; SPEECH COMMUNICATION;

IMAGE FEATURES; MULTI-MODAL; NOISE ROBUSTNESS; NONNEGATIVE MATRIX FACTORIZATION; VOICE CONVERSION;

SPEECH PROCESSING;

EID: 84910091291 PISSN: 2308457X EISSN: 19909772 Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (6)

References (28)

1
- 84898964201
- Algorithms for non-negative matrix factorization
- D. D. Lee and H. S. Seung, "Algorithms for non-negative matrix factorization," Neural Information Processing System, pp. 556-562, 2001.
- (2001) Neural Information Processing System , pp. 556-562
- Lee, D.D.¹ Seung, H.S.²

2
- 50249152311
- Monaural sound source separation by non-negative matrix factorization with temporal continuity and sparseness criteria
- T. Virtanen, "Monaural sound source separation by non-negative matrix factorization with temporal continuity and sparseness criteria," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 3, pp. 1066-1074, 2007.
- (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , Issue.3 , pp. 1066-1074
- Virtanen, T.¹

3
- 44949110218
- Single-channel speech separation using sparse non-negative matrix factorization
- M. N. Schmidt and R. K. Olsson, "Single-channel speech separation using sparse non-negative matrix factorization," in Interspeech, 2006.
- (2006) Interspeech
- Schmidt, M.N.¹ Olsson, R.K.²

4
- 79960657803
- Exemplar-based sparse representations for noise robust automatic speech recognition
- J. F. Gemmeke, T. Virtanen, and A. Hurmalainen, "Exemplar-based sparse representations for noise robust automatic speech recognition," IEEE Trans. Audio, Speech and Language Processing, vol. 19, no. 7, pp. 2067-2080, 2011.
- (2011) IEEE Trans. Audio, Speech and Language Processing , vol.19 , Issue.7 , pp. 2067-2080
- Gemmeke, J.F.¹ Virtanen, T.² Hurmalainen, A.³

5
- 84874248255
- Exemplar-based voice conversion in noisy environment
- R. Takashima, T. Takiguchi, and Y. Ariki, "Exemplar-based voice conversion in noisy environment," in SLT, pp. 313-317, 2012.
- (2012) SLT , pp. 313-317
- Takashima, R.¹ Takiguchi, T.² Ariki, Y.³

6
- 0032026483
- Continuous probabilistic transform for voice conversion
- Y. Stylianou, O. Cappe, and E. Moulines, "Continuous probabilistic transform for voice conversion," IEEE Trans. Speech and Audio Processing, vol. 6, no. 2, pp. 131-142, 1998.
- (1998) IEEE Trans. Speech and Audio Processing , vol.6 , Issue.2 , pp. 131-142
- Stylianou, Y.¹ Cappe, O.² Moulines, E.³

7
- 0031624666
- Discriminative training of HMM stream exponents for audio-visual speech recognition
- G. Potamianos and H. P. Graf, "Discriminative training of HMM stream exponents for audio-visual speech recognition," in ICASSP, pp. 3733-3736, 1998.
- (1998) ICASSP , pp. 3733-3736
- Potamianos, G.¹ Graf, H.P.²

8
- 0042954451
- Late integration in audio-visual continuous speech recognition
- A. Verma, T. Faruquie, C. Neti, S. Basu, and A. Senior, "Late integration in audio-visual continuous speech recognition," in ASRU, 1999.
- (1999) ASRU
- Verma, A.¹ Faruquie, T.² Neti, C.³ Basu, S.⁴ Senior, A.⁵

9
- 0029747053
- Integrating audio and visual information to provide highly robust speech recognition
- M. J. Tomlinson, M. J. Russell, and N. M. Brooke, "Integrating audio and visual information to provide highly robust speech recognition," in ICASSP, pp. 821-824, 1996.
- (1996) ICASSP , pp. 821-824
- Tomlinson, M.J.¹ Russell, M.J.² Brooke, N.M.³

10
- 84871395683
- Robust AAM-based audio-visual speech recognition against face direction changes
- Y. Komai, N. Yang, T. Takiguchi, and Y. Ariki, "Robust AAM-based audio-visual speech recognition against face direction changes," ACM Multimedia, pp. 1161-1164, 2012.
- (2012) ACM Multimedia , pp. 1161-1164
- Komai, Y.¹ Yang, N.² Takiguchi, T.³ Ariki, Y.⁴

11
- 0035363218
- Active appearance models
- T. F. Cootes, G. J. Edwards, and C. J. Taylor, "Active appearance models," IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 681-685, 2001.
- (2001) IEEE Transactions on Pattern Analysis and Machine Intelligence , pp. 681-685
- Cootes, T.F.¹ Edwards, G.J.² Taylor, C.J.³

12
- 84865747520
- Intonation conversion from neutral to expressive speech
- C. Veaux and X. Rodet, "Intonation conversion from neutral to expressive speech," in Interspeech, pp. 2765-2768, 2011.
- (2011) Interspeech , pp. 2765-2768
- Veaux, C.¹ Rodet, X.²

13
- 84890451203
- GMM-based emotional voice conversion using spectrum and prosody features
- R. Aihara, R. Takashima, T. Takiguchi, and Y. Ariki, "GMM-based emotional voice conversion using spectrum and prosody features," American Journal of Signal Processing, vol. 2, no. 5, 2012.
- (2012) American Journal of Signal Processing , vol.2 , Issue.5
- Aihara, R.¹ Takashima, R.² Takiguchi, T.³ Ariki, Y.⁴

14
- 80052698826
- Speaking-aid systems using GMM-based voice conversion for electrolaryngeal speech
- K. Nakamura, T. Toda, H. Saruwatari, and K. Shikano, "Speaking-aid systems using GMM-based voice conversion for electrolaryngeal speech," Speech Communication, vol. 54, no. 1, pp. 134-146, 2012.
- (2012) Speech Communication , vol.54 , Issue.1 , pp. 134-146
- Nakamura, K.¹ Toda, T.² Saruwatari, H.³ Shikano, K.⁴

15
- 84890519936
- Individuality-preserving voice conversion for articulation disorders based on non-negative matrix factorization
- R. Aihara, R. Takashima, T. Takiguchi, and Y. Ariki, "Individuality-preserving voice conversion for articulation disorders based on non-negative matrix factorization," in ICASSP, pp. 8037-8040, 2013.
- (2013) ICASSP , pp. 8037-8040
- Aihara, R.¹ Takashima, R.² Takiguchi, T.³ Ariki, Y.⁴

16
- 84905224579
- Speaking aid system for total laryngectomees using voice conversion of body transmitted artificial speech
- K. Nakamura, T. Toda, H. Saruwatari, and K. Shikano, "Speaking aid system for total laryngectomees using voice conversion of body transmitted artificial speech," in Interspeech, pp. 148-151, 2006.
- (2006) Interspeech , pp. 148-151
- Nakamura, K.¹ Toda, T.² Saruwatari, H.³ Shikano, K.⁴

17
- 0031623661
- Spectral voice conversion for text-to-speech synthesis
- A. Kain and M. W. Macon, "Spectral voice conversion for text-to-speech synthesis," in ICASSP, vol. 1, pp. 285-288, 1998.
- (1998) ICASSP , vol.1 , pp. 285-288
- Kain, A.¹ Macon, M.W.²

18
- 0023739214
- Esophageal speech enhancement based on statistical voice conversion with Gaussian mixture models
- M. Abe, S. Nakamura, K. Shikano, and H. Kuwabara, "Esophageal speech enhancement based on statistical voice conversion with Gaussian mixture models," in ICASSP, pp. 655-658, 1988.
- (1988) ICASSP , pp. 655-658
- Abe, M.¹ Nakamura, S.² Shikano, K.³ Kuwabara, H.⁴

19
- 0026880275
- Voice transformation using PSOLA technique
- H. Valbret, E. Moulines, and J. P. Tubach, "Voice transformation using PSOLA technique," Speech Communication, vol. 11, no. 2-3, pp. 175-187, 1992.
- (1992) Speech Communication , vol.11 , Issue.2-3 , pp. 175-187
- Valbret, H.¹ Moulines, E.² Tubach, J.P.³

20
- 57749193836
- Voice conversion based on maximum likelihood estimation of spectral parameter trajectory
- T. Toda, A. Black, and K. Tokuda, "Voice conversion based on maximum likelihood estimation of spectral parameter trajectory," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 8, pp. 2222-2235, 2007.
- (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , Issue.8 , pp. 2222-2235
- Toda, T.¹ Black, A.² Tokuda, K.³

21
- 77953712499
- Voice conversion using partial least squares regression
- E. Helander, T. Virtanen, J. Nurminen, and M. Gabbouj, "Voice conversion using partial least squares regression," IEEE Trans. Audio, Speech, Lang. Process., vol. 18, no. 5, pp. 912-921, 2010.
- (2010) IEEE Trans. Audio, Speech, Lang. Process. , vol.18 , Issue.5 , pp. 912-921
- Helander, E.¹ Virtanen, T.² Nurminen, J.³ Gabbouj, M.⁴

22
- 44949210554
- MAP-based adaptation for speech conversion using adaptation data selection and non-parallel training
- C. H. Lee and C. H. Wu, "MAP-based adaptation for speech conversion using adaptation data selection and non-parallel training," in Interspeech, pp. 2254-2257, 2006.
- (2006) Interspeech , pp. 2254-2257
- Lee, C.H.¹ Wu, C.H.²

23
- 34547512822
- Eigenvoice conversion based on Gaussian mixture model
- T. Toda, Y. Ohtani, and K. Shikano, "Eigenvoice conversion based on Gaussian mixture model," in Interspeech, pp. 2446-2449, 2006.
- (2006) Interspeech , pp. 2446-2449
- Toda, T.¹ Ohtani, Y.² Shikano, K.³

24
- 84865798483
- One-to-many voice conversion based on tensor representation of speaker space
- D. Saito, K. Yamamoto, N. Minematsu, and K. Hirose, "One-to-many voice conversion based on tensor representation of speaker space," in Interspeech, pp. 653-656, 2011.
- (2011) Interspeech , pp. 653-656
- Saito, D.¹ Yamamoto, K.² Minematsu, N.³ Hirose, K.⁴

25
- 84905269973
- Multimodal voice conversion using non-negative matrix factorization in noisy environments
- K. Masaka, R. Aihara, T. Takiguchi, and Y. Ariki, "Multimodal voice conversion using non-negative matrix factorization in noisy environments," in ICASSP 2014, 2014.
- (2014) ICASSP 2014
- Masaka, K.¹ Aihara, R.² Takiguchi, T.³ Ariki, Y.⁴

26
- 0025475528
- ATR Japanese speech database as a tool of speech recognition and synthesis
- A. Kurematsu, K. Takeda, Y. Sagisaka, S. Katagiri, H. Kuwabara, and K. Shikano, "ATR Japanese speech database as a tool of speech recognition and synthesis," Speech Communication, vol. 9, pp. 357-363, 1990.
- (1990) Speech Communication , vol.9 , pp. 357-363
- Kurematsu, A.¹ Takeda, K.² Sagisaka, Y.³ Katagiri, S.⁴ Kuwabara, H.⁵ Shikano, K.⁶

27
- 85009089413
- HMM-based text-to-audio-visual speech synthesis - image-based approach
- S. Sako, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "HMM-based text-to-audio-visual speech synthesis - image-based approach," ICSLP, vol. III, pp. 25-28, 2000.
- (2000) ICSLP , vol.3 , pp. 25-28
- Sako, S.¹ Tokuda, K.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

28
- 70349094936
- CENSREC-1-C: An evaluation framework for voice activity detection under noisy environments
- N. Kitaoka, T. Yamada, S. Tsuge, C. Miyajima, K. Yamamoto, T. Nishiura, M. Nakayama, Y. Denda, M. Fujimoto, T. Takiguchi, S. Tamura, S. Matsuda, T. Ogawa, S. Kuroiwa, K. Takeda, and S. Nakamura, "CENSREC-1-C: An evaluation framework for voice activity detection under noisy environments," Acoustical Science and Technology, vol. 30, no. 5, pp. 363-371, 2009.
- (2009) Acoustical Science and Technology , vol.30 , Issue.5 , pp. 363-371
- Kitaoka, N.¹ Yamada, T.² Tsuge, S.³ Miyajima, C.⁴ Yamamoto, K.⁵ Nishiura, T.⁶ Nakayama, M.⁷ Denda, Y.⁸ Fujimoto, M.⁹ Takiguchi, T.¹⁰ Tamura, S.¹¹ Matsuda, S.¹² Ogawa, T.¹³ Kuroiwa, S.¹⁴ Takeda, K.¹⁵ Nakamura, S.¹⁶

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.