SCOPUS Information Retrieval Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 20, Issue 9, 2012, Pages 2505-2517

Statistical voice conversion techniques for body-conducted unvoiced speech enhancement

(3) Toda, Tomoki a; Nakagiri, Mikihiro b; Shikano, Kiyohiro a

a NARA INSTITUTE OF SCIENCE AND TECHNOLOGY (Japan)

b PANASONIC CORPORATION (Japan)

Author keywords

body-conducted unvoiced speech; nonaudible murmur; silent speech; voice conversion; whispered voice

Indexed keywords

ACOUSTIC FEATURES; CONVERSION MODEL; EXPERIMENTAL EVALUATION; FUNDAMENTAL FREQUENCY CONTOUR; GAUSSIAN MIXTURE MODELS; NON-AUDIBLE MURMUR; SPEECH SOUNDS; STATISTICAL APPROACH; UNVOICED SPEECH; VOICE CONVERSION; VOICE CONVERSION TECHNIQUES;

DEGRADATION; MICROPHONES; SPEECH ENHANCEMENT; SPEECH INTELLIGIBILITY;

SPEECH RECOGNITION;

EID: 84865698185 PISSN: 15587916 EISSN: None Source Type: Journal
DOI: 10.1109/TASL.2012.2205241 Document Type: Article

Times cited : (188)

References (32)

1
- 76849116340
- Silent speech interfaces
- B. Denby, T. Schultz, K. Honda, T. Hueber, J. M. Gilbert, and J. S. Brumberg, "Silent speech interfaces," Speech Commun., vol. 52, no. 4, pp. 270-287, 2010.
- (2010) Speech Commun. , vol.52 , Issue.4 , pp. 270-287
- Denby, B.¹ Schultz, T.² Honda, K.³ Hueber, T.⁴ Gilbert, J.M.⁵ Brumberg, J.S.⁶

2
- 84890487256
- Adaptation for soft whisper recognition using a throat microphone
- Jeju Island, Korea Sep.
- S.-C. Jou, T. Schultz, and A. Waibel, "Adaptation for soft whisper recognition using a throat microphone," in Proc. INTERSPEECH, Jeju Island, Korea, Sep. 2004, pp. 1493-1496.
- (2004) Proc. INTERSPEECH , pp. 1493-1496
- Jou, S.-C.¹ Schultz, T.² Waibel, A.³

3
- 76849099234
- Modeling coarticulation in EMG-based continuous speech recognition
- T. Schultz and M. Wand, "Modeling coarticulation in EMG-based continuous speech recognition," Speech Commun., vol. 52, no. 4, pp. 341-353, 2010.
- (2010) Speech Commun. , vol.52 , Issue.4 , pp. 341-353
- Schultz, T.¹ Wand, M.²

4
- 76849104115
- Development of a silent speech interface driven by ultrasound and optical images of the tongue and lips
- T. Hueber, E.-L. Benaroya, G. Chollet, B. Denby, G. Dreyfus, and M. Stone, "Development of a silent speech interface driven by ultrasound and optical images of the tongue and lips," Speech Commun., vol. 52, no. 4, pp. 288-300, 2010.
- (2010) Speech Commun. , vol.52 , Issue.4 , pp. 288-300
- Hueber, T.¹ Benaroya, E.-L.² Chollet, G.³ Denby, B.⁴ Dreyfus, G.⁵ Stone, M.⁶

5
- 38649090114
- Multisensory processing for speech enhancement and magnitude-normalized spectra for speech modeling
- DOI 10.1016/j.specom.2007.09.002, PII S0167639307001501
- A. Subramanya, Z. Zhang, Z. Liu, and A. Acero, "Multisensory processing for speech enhancement and magnitude-normalized spectra for speech modeling," Speech Commun., vol. 50, no. 3, pp. 228-243, 2008. (Pubitemid 351172472)
- (2008) Speech Communication , vol.50 , Issue.3 , pp. 228-243
- Subramanya, A.¹ Zhang, Z.² Liu, Z.³ Acero, A.⁴

6
- 32244438249
- Non-audible murmur (NAM) recognition
- Y. Nakajima, H. Kashioka, N. Campbell, and K. Shikano, "Non-audible murmur (NAM) recognition," IEICE Trans. Inf. Syst., vol. E89-D, no. 1, pp. 1-8, 2006.
- (2006) IEICE Trans. Inf. Syst. , vol.E89-D , Issue.1 , pp. 1-8
- Nakajima, Y.¹ Kashioka, H.² Campbell, N.³ Shikano, K.⁴

7
- 76849086888
- Silent-speech enhancement using body-conducted vocal-tract resonance signals
- T. Hirahara, M. Otani, S. Shimizu, T. Toda, K. Nakamura, Y. Nakajima, and K. Shikano, "Silent-speech enhancement using body-conducted vocal-tract resonance signals," Speech Commun., vol. 52, no. 4, pp. 301-313, 2010.
- (2010) Speech Commun. , vol.52 , Issue.4 , pp. 301-313
- Hirahara, T.¹ Otani, M.² Shimizu, S.³ Toda, T.⁴ Nakamura, K.⁵ Nakajima, Y.⁶ Shikano, K.⁷

8
- 70450162011
- Technologies for processing body-conducted speech detected with non-audible murmur microphone
- Brighton, U.K. Sep.
- T. Toda, K. Nakamura, T. Nagai, T. Kaino, Y. Nakajima, and K. Shikano, "Technologies for processing body-conducted speech detected with non-audible murmur microphone," in Proc. INTERSPEECH, Brighton, U.K., Sep. 2009, pp. 632-635.
- (2009) Proc. INTERSPEECH , pp. 632-635
- Toda, T.¹ Nakamura, K.² Nagai, T.³ Kaino, T.⁴ Nakajima, Y.⁵ Shikano, K.⁶

9
- 77955720028
- Analysis and recognition of NAM speech using HMM distances and visual information
- Aug
- P. Heracleous, V.-A. Tran, T. Nagai, and K. Shikano, "Analysis and recognition of NAM speech using HMM distances and visual information," IEEE Trans. Audio, Speech, Lang. Process., vol. 18, no. 6, pp. 1528-1538, Aug. 2010.
- (2010) IEEE Trans. Audio, Speech, Lang. Process. , vol.18 , Issue.6 , pp. 1528-1538
- Heracleous, P.¹ Tran, V.-A.² Nagai, T.³ Shikano, K.⁴

10
- 0032026483
- Continuous probabilistic transform for voice conversion
- PII S1063667698017386
- Y. Stylianou, O. Cappé, and E. Moulines, "Continuous probabilistic transform for voice conversion," IEEE Trans. Speech Audio Process., vol. 6, no. 2, pp. 131-142, Mar. 1998. (Pubitemid 128720639)
- (1998) IEEE Transactions on Speech and Audio Processing , vol.6 , Issue.2 , pp. 131-142
- Stylianou, Y.¹ Cappe, O.² Moulines, E.³

11
- 57749193836
- Voice conversion based on maximum likelihood estimation of spectral parameter trajectory
- Nov
- T. Toda, A. W. Black, and K. Tokuda, "Voice conversion based on maximum likelihood estimation of spectral parameter trajectory," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 8, pp. 2222-2235, Nov. 2007.
- (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , Issue.8 , pp. 2222-2235
- Toda, T.¹ Black, A.W.² Tokuda, K.³

12
- 33745214435
- NAM-to-speech conversion with Gaussian mixture models
- 9th European Conference on Speech Communication and Technology, Eurospeech Interspeech
- T. Toda and K. Shikano, "NAM-to-speech conversion with Gaussian mixture models," in Proc. INTERSPEECH, Lisbon, Portugal, Sep. 2005, pp. 1957-1960. (Pubitemid 43908472)
- (2005) 9th European Conference on Speech Communication and Technology , pp. 1957-1960
- Toda, T.¹ Shikano, K.²

13
- 44949187612
- Improving body transmitted unvoiced speech with statistical voice conversion
- Pittsburgh, PA Sep.
- M. Nakagiri, T. Toda, H. Saruwatari, and K. Shikano, "Improving body transmitted unvoiced speech with statistical voice conversion," in Proc. INTERSPEECH, Pittsburgh, PA, Sep. 2006, pp. 2270-2273.
- (2006) Proc. INTERSPEECH , pp. 2270-2273
- Nakagiri, M.¹ Toda, T.² Saruwatari, H.³ Shikano, K.⁴

14
- 70349200844
- Voice conversion for various types of body transmitted speech
- Taipei, Taiwan Apr.
- T. Toda, K. Nakamura, H. Sekimoto, and K. Shikano, "Voice conversion for various types of body transmitted speech," in Proc. ICASSP, Taipei, Taiwan, Apr. 2009, pp. 3601-3604.
- (2009) Proc. ICASSP , pp. 3601-3604
- Toda, T.¹ Nakamura, K.² Sekimoto, H.³ Shikano, K.⁴

15
- 76849105528
- Improvement to a NAM-captured whisper-to-speech system
- V.-A. Tran, G. Bailly, H. Loevenbruck, and T. Toda, "Improvement to a NAM-captured whisper-to-speech system," Speech Commun., vol. 52, no. 4, pp. 314-326, 2010.
- (2010) Speech Commun. , vol.52 , Issue.4 , pp. 314-326
- Tran, V.-A.¹ Bailly, G.² Loevenbruck, H.³ Toda, T.⁴

16
- 33745217604
- Remodeling of the sensor for Non-Audible Murmur (NAM)
- 9th European Conference on Speech Communication and Technology, Eurospeech Interspeech
- Y. Nakajima, H. Kashioka, K. Shikano, and N. Campbell, "Remodeling of the sensor for non-audible murmur (NAM)," in Proc. INTERSPEECH, Lisbon, Portugal, Sep. 2005, pp. 389-392. (Pubitemid 43908081)
- (2005) 9th European Conference on Speech Communication and Technology , pp. 389-392
- Nakajima, Y.¹ Kashioka, H.² Shikano, K.³ Campbell, N.⁴

17
- 0031623661
- Spectral voice conversion for text-to-speech synthesis
- Seattle, WA May
- A. Kain and M. W. Macon, "Spectral voice conversion for text-to-speech synthesis," in Proc. ICASSP, Seattle, WA, May 1998, pp. 285-288.
- (1998) Proc. ICASSP , pp. 285-288
- Kain, A.¹ Macon, M.W.²

18
- 0033708106
- Speech parameter generation algorithms for HMM-based speech synthesis
- Istanbul, Turkey Jun.
- K. Tokuda, T. Yoshimura, T. Masuko, T. Kobayashi, and T. Kitamura, "Speech parameter generation algorithms for HMM-based speech synthesis," in Proc. ICASSP, Istanbul, Turkey, Jun. 2000, pp. 1315-1318.
- (2000) Proc. ICASSP , pp. 1315-1318
- Tokuda, K.¹ Yoshimura, T.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

19
- 67651002140
- Statistical parametric speech synthesis
- H. Zen, K. Tokuda, and A. W. Black, "Statistical parametric speech synthesis," Speech Commun., vol. 51, no. 11, pp. 1039-1064, 2009.
- (2009) Speech Commun. , vol.51 , Issue.11 , pp. 1039-1064
- Zen, H.¹ Tokuda, K.² Black, A.W.³

20
- 38649140222
- Statistical mapping between articulatory movements and acoustic spectrum using a Gaussian mixture model
- DOI 10.1016/j.specom.2007.09.001, PII S0167639307001495
- T. Toda, A. W. Black, and K. Tokuda, "Statistical mapping between articulatory movements and acoustic spectrum using a Gaussian mixture model," Speech Commun., vol. 50, no. 3, pp. 215-227, 2008. (Pubitemid 351172471)
- (2008) Speech Communication , vol.50 , Issue.3 , pp. 215-227
- Toda, T.¹ Black, A.W.² Tokuda, K.³

21
- 84874199000
- Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT
- Firenze, Italy Sep.
- H. Kawahara, J. Estill, and O. Fujimura, "Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT," in Proc. MAVEBA, Firenze, Italy, Sep. 2001.
- (2001) Proc. MAVEBA
- Kawahara, H.¹ Estill, J.² Fujimura, O.³

22
- 44949143155
- Maximum likelihood voice conversion based on GMM with STRAIGHT mixed excitation
- Pittsburgh, PA Sep.
- Y. Ohtani, T. Toda, H. Saruwatari, and K. Shikano, "Maximum likelihood voice conversion based on GMM with STRAIGHT mixed excitation," in Proc. INTERSPEECH, Pittsburgh, PA, Sep. 2006, pp. 2266-2269.
- (2006) Proc. INTERSPEECH , pp. 2266-2269
- Ohtani, Y.¹ Toda, T.² Saruwatari, H.³ Shikano, K.⁴

23
- 6644226630
- A large-scale Japanese speech database
- Kobe, Japan Nov.
- Y. Sagisaka, K. Takeda, M. Abe, S. Katagiri, T. Umeda, and H. Kuwabara, "A large-scale Japanese speech database," in Proc. ICSLP90, Kobe, Japan, Nov. 1990, pp. 1089-1092.
- (1990) Proc. ICSLP90 , pp. 1089-1092
- Sagisaka, Y.¹ Takeda, K.² Abe, M.³ Katagiri, S.⁴ Umeda, T.⁵ Kuwabara, H.⁶

24
- 84865690183
- JNAS: Japanese Newspaper Article Sentences [Online]. Available:
- JNAS: Japanese Newspaper Article Sentences, [Online]. Available: http://www.milab.is.tsukuba.ac.jp/jnas/instruct.html

25
- 85131821539
- Mel-generalized cepstral analysis - A unified approach to speech spectral estimation
- Yokohama, Japan Sep.
- K. Tokuda, T. Kobayashi, T. Masuko, and S. Imai, "Mel-generalized cepstral analysis - A unified approach to speech spectral estimation," in Proc. ICSLP, Yokohama, Japan, Sep. 1994, pp. 1043-1045.
- (1994) Proc. ICSLP , pp. 1043-1045
- Tokuda, K.¹ Kobayashi, T.² Masuko, T.³ Imai, S.⁴

26
- 0032673049
- Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based extraction: Possible role of a repetitive structure in sounds
- H. Kawahara, I. Masuda-Katsuse, and A. de Cheveigné, "Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based extraction: Possible role of a repetitive structure in sounds," Speech Commun., vol. 27, no. 3-4, pp. 187-207, 1999.
- (1999) Speech Commun. , vol.27 , Issue.3-4 , pp. 187-207
- Kawahara, H.¹ Masuda-Katsuse, I.² De Cheveigné, A.³

27
- 84928118106
- Fixed point analysis of frequency to instantaneous frequency mapping for accurate estimation of F0 and periodicity
- Budapest, Hungary Sep.
- H. Kawahara, H. Katayose, A. de Cheveigné, and R. D. Patterson, "Fixed point analysis of frequency to instantaneous frequency mapping for accurate estimation of F0 and periodicity," in Proc. EUROSPEECH, Budapest, Hungary, Sep. 1999, pp. 2781-2784.
- (1999) Proc. EUROSPEECH , pp. 2781-2784
- Kawahara, H.¹ Katayose, H.² De Cheveigné, A.³ Patterson, R.D.⁴

28
- 85016140477
- An adaptive algorithm for mel-cepstral analysis of speech
- San Francisco, CA Mar.
- T. Fukada, K. Tokuda, T. Kobayashi, and S. Imai, "An adaptive algorithm for mel-cepstral analysis of speech," in Proc. ICASSP, San Francisco, CA, Mar. 1992, vol. 1, pp. 137-140.
- (1992) Proc. ICASSP , vol.1 , pp. 137-140
- Fukada, T.¹ Tokuda, K.² Kobayashi, T.³ Imai, S.⁴

29
- 84867211725
- Low-delay voice conversion based on maximum likelihood estimation of spectral parameter trajectory
- Brisbane, Australia Sep.
- T. Muramatsu, Y. Ohtani, T. Toda, H. Saruwatari, and K. Shikano, "Low-delay voice conversion based on maximum likelihood estimation of spectral parameter trajectory," in Proc. INTERSPEECH, Brisbane, Australia, Sep. 2008, pp. 1076-1079.
- (2008) Proc. INTERSPEECH , pp. 1076-1079
- Muramatsu, T.¹ Ohtani, Y.² Toda, T.³ Saruwatari, H.⁴ Shikano, K.⁵

30
- 70349223901
- Acoustic compensation methods for body transmitted speech conversion
- Taipei, Taiwan Apr.
- D. Miyamoto, K. Nakamura, T. Toda, H. Saruwatari, and K. Shikano, "Acoustic compensation methods for body transmitted speech conversion," in Proc. ICASSP, Taipei, Taiwan, Apr. 2009, pp. 3901-3904.
- (2009) Proc. ICASSP , pp. 3901-3904
- Miyamoto, D.¹ Nakamura, K.² Toda, T.³ Saruwatari, H.⁴ Shikano, K.⁵

31
- 34547496175
- One-to-many and many-to-one voice conversion based on eigenvoices
- Apr.
- T. Toda, Y. Ohtani, and K. Shikano, "One-to-many and many-to-one voice conversion based on eigenvoices," in Proc. ICASSP, Apr. 2007, pp. 1249-1252.
- (2007) Proc. ICASSP , pp. 1249-1252
- Toda, T.¹ Ohtani, Y.² Shikano, K.³

32
- 70450194389
- Many-to-many eigenvoice conversion with reference voice
- Brighton, U.K. Sep.
- Y. Ohtani, T. Toda, H. Saruwatari, and K. Shikano, "Many-to-many eigenvoice conversion with reference voice," in Proc. INTERSPEECH, Brighton, U.K., Sep. 2009, pp. 1623-1626.
- (2009) Proc. INTERSPEECH , pp. 1623-1626
- Ohtani, Y.¹ Toda, T.² Saruwatari, H.³ Shikano, K.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.