SCOPUS Information Retrieval Platform

Speech Communication

Volume 52, Issue 4, 2010, Pages 314-326

Improvement to a NAM-captured whisper-to-speech system

Authors (4): Tran, Viet Anh (a); Bailly, Gérard (a); Lœvenbruck, Hélène (a); Toda, Tomoki (b)

(a) GIPSA LAB (France)

(b) NARA INSTITUTE OF SCIENCE AND TECHNOLOGY (Japan)

Author keywords

Audiovisual voice conversion; Non audible murmur; Silent speech interface; Whispered speech

Indexed keywords

COMPUTER-MEDIATED COMMUNICATION; DIMENSIONALITY REDUCTION; INPUT AND OUTPUTS; LINEAR DISCRIMINANT ANALYSIS; NON-AUDIBLE MURMUR; SPECTRAL ENVELOPES; SPEECH INTERFACE; SPEECH SYSTEMS; SUBJECTIVE EVALUATIONS; SUBJECTIVE TESTS; SYNTHESIZED SPEECH; TIME WINDOWS; UNVOICED SPEECH; VOICE CONVERSION; VOICED SEGMENT; VOICING DECISION; WHISPERED SPEECH;

DISCRIMINANT ANALYSIS; ESTIMATION; MAGNETOSTRICTIVE DEVICES; NEURAL NETWORKS; PRINCIPAL COMPONENT ANALYSIS; SPEECH COMMUNICATION; SPEECH RECOGNITION; VEHICLE ROUTING; WINDOWS;

SPEECH INTELLIGIBILITY;

EID: 76849105528
PISSN: 01676393
EISSN: None
Source Type: Journal
DOI: 10.1016/j.specom.2009.11.005
Document Type: Article

Times cited: 39

References (34)

1
- 0036656541
- Three-dimensional linear articulatory modeling of tongue, lips and face based on MRI and video images
- Badin P., Bailly G., Revéret L., Baciu M., Segebarth C., and Savariaux C. Three-dimensional linear articulatory modeling of tongue, lips and face based on MRI and video images. J. Phonetics 30 3 (2002) 533-553
- (2002) J. Phonetics , vol.30 , Issue.3 , pp. 533-553
- Badin, P.¹ Bailly, G.² Revéret, L.³ Baciu, M.⁴ Segebarth, C.⁵ Savariaux, C.⁶

2
- 0142216141
- Audiovisual speech synthesis
- Bailly G., Bérar M., Elisei F., and Odisio M. Audiovisual speech synthesis. Internat. J. Speech Technol. 6 (2003) 331-346
- (2003) Internat. J. Speech Technol. , vol.6 , pp. 331-346
- Bailly, G.¹ Bérar, M.² Elisei, F.³ Odisio, M.⁴

3
- 84925684753
- Speaking with Smile or Disgust: Data and Models
- Bailly, G., Bégault, A., Elisei, F., Badin, P., 2008. Speaking with Smile or Disgust: Data and Models. AVSP, Tangalooma, Australia, pp. 111-116.
- (2008) AVSP, Tangalooma , pp. 111-116
- Bailly, G.¹ Bégault, A.² Elisei, F.³ Badin, P.⁴

4
- 34047156634
- Bett, B.J., Jorgensen, C., 2005. Small Vocabulary Recognition Using Surface Electromyography in an Acoustically Harsh Environment. NASA, Ames Research Center, Moffett Field, CA, p. 16.
- (2005) Small Vocabulary Recognition Using Surface Electromyography in an Acoustically Harsh Environment , pp. 16
- Bett, B.J.¹ Jorgensen, C.²

5
- 70450174881
- Coleman, J., Grabe, E., Braun, B., 2002. Larynx movements and intonation in whispered speech. Summary of Research Supported by British Academy.
- (2002) Larynx movements and intonation in whispered speech
- Coleman, J.¹ Grabe, E.² Braun, B.³

6
- 0035363218
- Active appearance models
- Cootes T.F., Edwards G.J., and Taylor C.J. Active appearance models. IEEE Trans. Pattern Anal. Machine Intell. 23 6 (2001) 681-685
- (2001) IEEE Trans. Pattern Anal. Machine Intell. , vol.23 , Issue.6 , pp. 681-685
- Cootes, T.F.¹ Edwards, G.J.² Taylor, C.J.³

7
- 76849101274
- Comportements laryngés en voix chuchotée, étude en caméra ultra-rapide
- Crevier-Buchman, L., Vincent, C., Hans, S., 2008. Comportements laryngés en voix chuchotée, étude en caméra ultra-rapide. In: Actes du 64ième Congrès de la Société Française de Phoniatrie et des Pathologies de la Communication, Paris, October 2008.
- (2008) Actes du 64ième Congrès de la Société Française de Phoniatrie et des Pathologies de la Communication
- Crevier-Buchman, L.¹ Vincent, C.² Hans, S.³

8
- 34547521941
- A tissue-conductive acoustic sensor applied in speech recognition for privacy
- Heracleous, P., Nakajima, Y., Saruwatari, H., Shikano, K., 2005. A tissue-conductive acoustic sensor applied in speech recognition for privacy. In: Internat. Conf. on Smart Objects and Ambient Intelligence, Grenoble, France, pp. 93-98.
- (2005) Internat. Conf. on Smart Objects and Ambient Intelligence , pp. 93-98
- Heracleous, P.¹ Nakajima, Y.² Saruwatari, H.³ Shikano, K.⁴

9
- 0029972858
- Perceived pitch of whispered vowels - relationship with formant frequencies: a preliminary study
- Higashikawa M., Nakai K., Sakakura A., and Takahashi H. Perceived pitch of whispered vowels - relationship with formant frequencies: a preliminary study. J. Voice 10 2 (1996) 155-158
- (1996) J. Voice , vol.10 , Issue.2 , pp. 155-158
- Higashikawa, M.¹ Nakai, K.² Sakakura, A.³ Takahashi, H.⁴

10
- 34547554405
- EigenTongue feature extraction for an ultrasound-based silent speech interface
- Hueber, T., Aversano, G., Chollet, G., Denby, B., Dreyfus, G., Oussar, Y., Roussel, P., Stone, M., 2007a. EigenTongue feature extraction for an ultrasound-based silent speech interface. In: IEEE Internat. Conf. on Acoustics, Speech and Signal Processing, Honolulu, Hawaii, pp. 1245-1248.
- (2007) IEEE Internat. Conf. on Acoustics, Speech and Signal Processing , pp. 1245-1248
- Hueber, T.¹ Aversano, G.² Chollet, G.³ Denby, B.⁴ Dreyfus, G.⁵ Oussar, Y.⁶ Roussel, P.⁷ Stone, M.⁸

11
- 67650558346
- Continuous-speech Phone Recognition from Ultrasound and Optical Images of the Tongue and Lips
- Hueber, T., Chollet, G., Denby, B., Dreyfus, G., Stone, M., 2007b. Continuous-speech Phone Recognition from Ultrasound and Optical Images of the Tongue and Lips. InterSpeech, Antwerp, Belgium, pp. 658-661.
- (2007) InterSpeech , pp. 658-661
- Hueber, T.¹ Chollet, G.² Denby, B.³ Dreyfus, G.⁴ Stone, M.⁵

12
- 76849099671
- Hueber, T., Chollet, G., Denby, B., Stone, M., Zouari, L., 2007c. Ouisper: corpus-based synthesis driven by articulatory data. In: Internat. Cong. of Phonetic Sciences, Saarbrücken, Germany, pp. 2193-2196.

13
- 84867208175
- Towards a Segmental Vocoder Driven by Ultrasound and Optical Images of the Tongue and Lips
- Hueber, T., Chollet, G., Denby, B., Dreyfus, G., Stone, M., 2008a. Towards a Segmental Vocoder Driven by Ultrasound and Optical Images of the Tongue and Lips. InterSpeech, Brisbane, Australia, pp. 2028-2031.
- (2008) InterSpeech , pp. 2028-2031
- Hueber, T.¹ Chollet, G.² Denby, B.³ Dreyfus, G.⁴ Stone, M.⁵

14
- 84867195703
- Phone Recognition from Ultrasound and Optical Video Sequences for a Silent Speech Interface
- Hueber, T., Chollet, G., Denby, B., Dreyfus, G., Stone, M., 2008b. Phone Recognition from Ultrasound and Optical Video Sequences for a Silent Speech Interface. InterSpeech, Brisbane, Australia, pp. 2032-2035.
- (2008) InterSpeech , pp. 2032-2035
- Hueber, T.¹ Chollet, G.² Denby, B.³ Dreyfus, G.⁴ Stone, M.⁵

15
- 0014887733
- The electromyographic study of verbal hallucinations
- Inouye T., and Shimizu A. The electromyographic study of verbal hallucinations. J. Nerv. Mental Dis. 151 (1970) 415-422
- (1970) J. Nerv. Mental Dis. , vol.151 , pp. 415-422
- Inouye, T.¹ Shimizu, A.²

16
- 27544482614
- Web browser control using EMG-based subvocal speech recognition
- Jorgensen, C., Binsted, K., 2005. Web browser control using EMG-based subvocal speech recognition. In: Proc. 38th Annual Hawaii Internat. Conf. on System Sciences (HICSS'05), Hawaii, p. 294c.
- (2005) Proc. 38th Annual Hawaii Internat. Conf. on System Sciences (HICSS'05)
- Jorgensen, C.¹ Binsted, K.²

17
- 44949257531
- Towards Continuous Speech Recognition Using Surface Electromyography
- Jou, S.-C., Schultz, T., Walliczek, M., Kraft, F., Waibel, A., 2006. Towards Continuous Speech Recognition Using Surface Electromyography. InterSpeech, Pittsburgh, PA, pp. 573-576.
- (2006) InterSpeech, Pittsburgh, PA , pp. 573-576
- Jou, S.-C.¹ Schultz, T.² Walliczek, M.³ Kraft, F.⁴ Waibel, A.⁵

18
- 0032673049
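- Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
- Kawahara H., Masuda-Katsuse I., and de Cheveigné A. Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds. Speech Comm. 27 3-4 (1999) 187-207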
- (1999) Speech Comm. , vol.27 , Issue.3-4 , pp. 187-207
- Kawahara, H.¹ Masuda-Katsuse, I.² de Cheveigné, A.³

19
- 44949187612
- Improving Body Transmitted Unvoiced Speech with Statistical Voice Conversion
- Nakagiri, M., Toda, T., Kashioka, H., Shikano, K., 2006. Improving Body Transmitted Unvoiced Speech with Statistical Voice Conversion. InterSpeech, Pittsburgh, PA, pp. 2270-2273.
- (2006) InterSpeech, Pittsburgh, PA , pp. 2270-2273
- Nakagiri, M.¹ Toda, T.² Kashioka, H.³ Shikano, K.⁴

20
- 0141520383
- Non-audible murmur recognition Input Interface using stethoscopic microphone attached to the skin
- Nakajima, Y., Kashioka, H., Shikano, K., Campbell, N., 2003. Non-audible murmur recognition Input Interface using stethoscopic microphone attached to the skin. In: Internat. Conf. on Acoustics, Speech and Signal Processing, pp. 708-711.
- (2003) Internat. Conf. on Acoustics, Speech and Signal Processing , pp. 708-711
- Nakajima, Y.¹ Kashioka, H.² Shikano, K.³ Campbell, N.⁴

21
- 84893587746
- Joint Audiovisual Speech Processing for Recognition and Enhancement
- Potamianos, G., Neti, C., Deligne, S., 2003. Joint Audiovisual Speech Processing for Recognition and Enhancement. Auditory-Visual Speech Processing, St Jorioz, France, pp. 95-104.
- (2003) Auditory-Visual Speech Processing, St Jorioz , pp. 95-104
- Potamianos, G.¹ Neti, C.² Deligne, S.³

22
- 84870292720
- MOTHER: A new generation of talking heads providing a flexible articulatory control for video-realistic speech animation
- Revéret, L., Bailly, G., Badin, P., 2000. MOTHER: a new generation of talking heads providing a flexible articulatory control for video-realistic speech animation. In: Internat. Conf. on Speech and Language Processing, Beijing, China, pp. 755-758.
- (2000) Internat. Conf. on Speech and Language Processing , pp. 755-758
- Revéret, L.¹ Bailly, G.² Badin, P.³

23
- 10444247388
- Developing an audio-visual speech source separation algorithm
- Sodoyer D., Girin L., Jutten C., and Schwartz J.-L. Developing an audio-visual speech source separation algorithm. Speech Comm. 44 1-4 (2004) 113-125
- (2004) Speech Comm. , vol.44 , Issue.1-4 , pp. 113-125
- Sodoyer, D.¹ Girin, L.² Jutten, C.³ Schwartz, J.-L.⁴

24
- 0018701386
- Use of visual information for phonetic perception
- Summerfield Q. Use of visual information for phonetic perception. Phonetica 36 (1979) 314-331
- (1979) Phonetica , vol.36 , pp. 314-331
- Summerfield, Q.¹

25
- 0002955163
- Lips, teeth, and the benefits of lipreading
- Summerfield A., MacLeod A., McGrath M., and Brooke M. Lips, teeth, and the benefits of lipreading. In: Young A.W., and Ellis H.D. (Eds). Handbook of Research on Face Processing (1989), Elsevier Science Publishers, Amsterdam 223-233
- (1989) Handbook of Research on Face Processing , pp. 223-233
- Summerfield, A.¹ MacLeod, A.² McGrath, M.³ Brooke, M.⁴

26
- 33745214435
- Toda, T., Shikano, K., 2005. NAM-to-Speech Conversion with Gaussian Mixture Models. InterSpeech, Lisbon, Portugal, pp. 1957-1960.
- (2005) NAM-to-Speech Conversion with Gaussian Mixture Models. InterSpeech , pp. 1957-1960
- Toda, T.¹ Shikano, K.²

27
- 33745200051
- Toda, T., Tokuda, K., 2005. Speech Parameter Generation Algorithm Considering Global Variance for HMM-based Speech Synthesis. InterSpeech, Lisbon, Portugal, pp. 2801-2804.
- (2005) Speech Parameter Generation Algorithm Considering Global Variance for HMM-based Speech Synthesis. InterSpeech , pp. 2801-2804
- Toda, T.¹ Tokuda, K.²

28
- 70349200844
- Voice conversion for various types of body transmitted speech
- Toda, T., Nakamura, K., Sekimoto, H., Shikano, K., 2009. Voice conversion for various types of body transmitted speech. In: Proc. ICASSP. Taipei, Taiwan, pp. 3601-3604.
- (2009) Proc. ICASSP , pp. 3601-3604
- Toda, T.¹ Nakamura, K.² Sekimoto, H.³ Shikano, K.⁴

29
- 0033708106
- Speech parameter generation algorithms for HMM-based speech synthesis
- Tokuda, K., Yoshimura, T., Masuko, T., Kobayashi, T., Kitamura, T., 2000. Speech parameter generation algorithms for HMM-based speech synthesis. In: IEEE Internat. Conf. on Acoustics, Speech, and Signal Processing. Istanbul, Turkey, pp. 1315-1318.
- (2000) IEEE Internat. Conf. on Acoustics, Speech, and Signal Processing , pp. 1315-1318
- Tokuda, K.¹ Yoshimura, T.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

30
- 77949907441
- Predicting F0 and voicing from NAM-captured whispered speech
- Tran, V.-A., Bailly, G., Lœvenbruck, H., Toda, T., 2008. Predicting F0 and voicing from NAM-captured whispered speech. In: Proc. Speech Prosody. Campinas, Brazil, May.
- (2008) Proc. Speech Prosody
- Tran, V.-A.¹ Bailly, G.² Lœvenbruck, H.³ Toda, T.⁴

31
- 44949179587
- Sub-word Unit Based Non-audible Speech Recognition Using Surface Electromyography
- Walliczek, M., Kraft, F., Jou, S.-C., Schultz, T., Waibel, A., 2006. Sub-word Unit Based Non-audible Speech Recognition Using Surface Electromyography. InterSpeech, Pittsburgh, PA, pp. 1487-1490.
- (2006) InterSpeech, Pittsburgh, PA , pp. 1487-1490
- Walliczek, M.¹ Kraft, F.² Jou, S.-C.³ Schultz, T.⁴ Waibel, A.⁵

32
- 0003822743
- Young S., Kershaw D., Odell J., Ollason D., Valtchev V., and Woodland P. The HTK Book (1999), Entropic Ltd., Cambridge, United Kingdom
- (1999) The HTK Book
- Young, S.¹ Kershaw, D.² Odell, J.³ Ollason, D.⁴ Valtchev, V.⁵ Woodland, P.⁶

33
- 76849102588
- Zen, H., Nose, T., Yamagishi, J., Sako, S., Masuko, T., Black, A., Tokuda, K., 2007. The HMM-based Speech Synthesis System Version 2.0. Speech Synthesis Workshop, Bonn, Germany, pp. 294-299.

34
- 33745222547
- Physiological Study of Whispered Speech in Moroccan Arabic
- Zeroual, C., Esling, J., Crevier-Buchman, L., 2005. Physiological Study of Whispered Speech in Moroccan Arabic. In: Proc. InterSpeech. Lisbon, pp. 1069-1072.
- (2005) Proc. InterSpeech , pp. 1069-1072
- Zeroual, C.¹ Esling, J.² Crevier-Buchman, L.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.