SCOPUS Information Retrieval Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 22, Issue 1, 2014, Pages 172-183

Alaryngeal speech enhancement based on one-to-many eigenvoice conversion

(5) Doi, Hironori (a); Toda, Tomoki (a); Nakamura, Keigo (b); Saruwatari, Hiroshi (a); Shikano, Kiyohiro (a)

a NARA INSTITUTE OF SCIENCE AND TECHNOLOGY (Japan)

b RAKUTEN INC (Japan)

Author keywords

Alaryngeal speech; Eigenvoice conversion; Laryngectomees; Speech enhancement; Voice conversion

Indexed keywords

HANDICAPPED PERSONS; SPEECH ENHANCEMENT; SPEECH PRODUCTION AIDS; SPEECH PROCESSING;

ALARYNGEAL SPEECH; CONVERSION MODEL; EIGENVOICES; ELECTROLARYNGEAL SPEECH; ESOPHAGEAL SPEECH; LARYNGECTOMEES; SPEECH QUALITY; VOICE CONVERSION;

SPEECH;

EID: 84897939966 PISSN: 15587916 EISSN: None Source Type: Journal
DOI: 10.1109/TASLP.2013.2286917 Document Type: Article

Times cited : (59)

References (29)

1
- 77956027630
- Evaluation of extremely small sound source signals used in speaking-aid system with statistical voice conversion
- Jul.
- K. Nakamura, T. Toda, H. Saruwatari, and K. Shikano, "Evaluation of extremely small sound source signals used in speaking-aid system with statistical voice conversion," IEICE Trans. Inf. Syst., vol. E93-D, no. 7, pp. 1909-1917, Jul. 2010.
- (2010) IEICE Trans. Inf. Syst. , vol.E93-D , Issue.7 , pp. 1909-1917
- Nakamura, K.¹ Toda, T.² Saruwatari, H.³ Shikano, K.⁴

2
- 32244438249
- Non-Audible Murmur (NAM) recognition
- Jan.
- Y. Nakajima, H. Kashioka, N. Campbell, and K. Shikano, "Non-Audible Murmur (NAM) recognition," IEICE Trans. Inf. Syst., vol. E89-D, no. 1, pp. 1-8, Jan. 2006.
- (2006) IEICE Trans. Inf. Syst. , vol.E89-D , Issue.1 , pp. 1-8
- Nakajima, Y.¹ Kashioka, H.² Campbell, N.³ Shikano, K.⁴

3
- 79959833617
- The use of air-pressure sensor in electrolaryngeal speech enhancement based on statistical voice conversion
- K. Nakamura, T. Toda, H. Saruwatari, and K. Shikano, "The use of air-pressure sensor in electrolaryngeal speech enhancement based on statistical voice conversion," in Proc. Interspeech, Sep. 2010, pp. 1628-1631.
- Proc. Interspeech, Sep. 2010 , pp. 1628-1631
- Nakamura, K.¹ Toda, T.² Saruwatari, H.³ Shikano, K.⁴

4
- 33751234295
- Real-time clarification of esophageal speech using a comb filter
- A. Hisada and H. Sawada, "Real-time clarification of esophageal speech using a comb filter," in Proc. ICDVRAT, Sep. 2002, pp. 39-46.
- Proc. ICDVRAT, Sep. 2002 , pp. 39-46
- Hisada, A.¹ Sawada, H.²

5
- 0032653605
- Enhancement of esophageal speech using formant synthesis
- K. Matsui, N. Hara, N. Kobayashi, and H. Hirose, "Enhancement of esophageal speech using formant synthesis," in Proc. ICASSP, May 1999, pp. 1831-1834.
- Proc. ICASSP, May 1999 , pp. 1831-1834
- Matsui, K.¹ Hara, N.² Kobayashi, N.³ Hirose, H.⁴

6
- 77956916135
- Reconstruction of normal sounding speech for laryngectomy patients through a modified CELP codec
- Oct.
- H. R. Sharifzadeh, I. V. McLoughlin, and F. Ahmadi, "Reconstruction of normal sounding speech for laryngectomy patients through a modified CELP codec," IEEE Trans. Biomed. Eng., vol. 57, no. 10, pp. 2448-2458, Oct. 2010.
- (2010) IEEE Trans. Biomed. Eng. , vol.57 , Issue.10 , pp. 2448-2458
- Sharifzadeh, H.R.¹ McLoughlin, I.V.² Ahmadi, F.³

7
- 33646358742
- Enhancement of electrolarynx speech based on auditory masking
- DOI 10.1109/TBME.2006.872821, 1621138
- H. Liu, Q. Zhao, M. Wan, and S. Wang, "Enhancement of electrolarynx speech based on auditory masking," IEEE Trans. Biomed. Eng., vol. 53, no. 5, pp. 865-874, May 2006. (Pubitemid 43667746)
- (2006) IEEE Transactions on Biomedical Engineering , vol.53 , Issue.5 , pp. 865-874
- Liu, H.¹ Zhao, Q.² Wan, M.³ Wang, S.⁴

8
- 33749056139
- Enhancement and restoration of alaryngeal speech signals
- G. Aguilar-Torres, M. Nakano-Miyatake, and H. Perez-Meana, "Enhancement and restoration of alaryngeal speech signals," in Proc. 16th IEEE Int. Conf. Electron., Commun. Comput. (CONIELECOMP '06), Feb. 2006, p. 30.
- Proc. 16th IEEE Int. Conf. Electron., Commun. Comput. (CONIELECOMP '06), Feb. 2006 , p. 30
- Aguilar-Torres, G.¹ Nakano-Miyatake, M.² Perez-Meana, H.³

9
- 80051642767
- An evaluation of alaryngeal speech enhancement methods based on voice conversion techniques
- H. Doi, K. Nakamura, T. Toda, H. Saruwatari, and K. Shikano, "An evaluation of alaryngeal speech enhancement methods based on voice conversion techniques," in Proc. ICASSP, May 2011, pp. 5136-5139.
- Proc. ICASSP, May 2011 , pp. 5136-5139
- Doi, H.¹ Nakamura, K.² Toda, T.³ Saruwatari, H.⁴ Shikano, K.⁵

10
- 85004448479
- Voice conversion through vector quantization
- M. Abe, S. Nakamura, K. Shikano, and H. Kuwabara, "Voice conversion through vector quantization," J. Acoust. Soc. Jpn. (E), vol. 11, no. 2, pp. 71-76, 1990.
- (1990) J. Acoust. Soc. Jpn. (E) , vol.11 , Issue.2 , pp. 71-76
- Abe, M.¹ Nakamura, S.² Shikano, K.³ Kuwabara, H.⁴

11
- 0032026483
- Continuous probabilistic transform for voice conversion
- PII S1063667698017386
- Y. Stylianou, O. Cappe, and E. Moulines, "Continuous probabilistic transform for voice conversion," IEEE Trans. Speech Audio Process., vol. 6, no. 2, pp. 131-142, Mar. 1998. (Pubitemid 128720639)
- (1998) IEEE Transactions on Speech and Audio Processing , vol.6 , Issue.2 , pp. 131-142
- Stylianou, Y.¹ Cappe, O.² Moulines, E.³

12
- 34547496175
- One-to-many and many-to-one voice conversion based on eigenvoices
- T. Toda, Y. Ohtani, and K. Shikano, "One-to-many and many-to-one voice conversion based on eigenvoices," in Proc. ICASSP, Apr. 2007, pp. 1249-1252.
- Proc. ICASSP, Apr. 2007 , pp. 1249-1252
- Toda, T.¹ Ohtani, Y.² Shikano, K.³

13
- 77956795483
- Esophageal speech enhancement based on statistical voice conversion with Gaussian mixture models
- Sep.
- H. Doi, K. Nakamura, T. Toda, H. Saruwatari, and K. Shikano, "Esophageal speech enhancement based on statistical voice conversion with Gaussian mixture models," IEICE Trans. Inf. Syst., vol. E93-D, no. 9, pp. 2472-2482, Sep. 2010.
- (2010) IEICE Trans. Inf. Syst. , vol.E93-D , Issue.9 , pp. 2472-2482
- Doi, H.¹ Nakamura, K.² Toda, T.³ Saruwatari, H.⁴ Shikano, K.⁵

14
- 84928118106
- Fixed point analysis of frequency to instantaneous frequency mapping for accurate estimation of F0 and periodicity
- H. Kawahara, H. Katayose, A. Cheveigne, and R. D. Patterson, "Fixed point analysis of frequency to instantaneous frequency mapping for accurate estimation of F0 and periodicity," in Proc. EUROSPEECH, Sep. 1999, pp. 2781-2784.
- Proc. EUROSPEECH, Sep. 1999 , pp. 2781-2784
- Kawahara, H.¹ Katayose, H.² Cheveigne, A.³ Patterson, R.D.⁴

15
- 44949143155
- Maximum likelihood voice conversion based on GMM with STRAIGHT mixed excitation
- Sep.
- Y. Ohtani, T. Toda, H. Saruwatari, and K. Shikano, "Maximum likelihood voice conversion based on GMM with STRAIGHT mixed excitation," Interspeech '06-ICSLP, pp. 2266-2269, Sep. 2006.
- (2006) Interspeech '06-ICSLP , pp. 2266-2269
- Ohtani, Y.¹ Toda, T.² Saruwatari, H.³ Shikano, K.⁴

16
- 84874199000
- Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT
- H. Kawahara, J. Estill, and O. Fujimura, "Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT," in Proc. MAVEBA, Sep. 2001.
- Proc. MAVEBA, Sep. 2001
- Kawahara, H.¹ Estill, J.² Fujimura, O.³

17
- 0032673049
- Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based extraction: Possible role of a repetitive structure in sounds
- Apr.
- H. Kawahara, I. Masuda-Katsuse, and A. Cheveigne, "Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based extraction: Possible role of a repetitive structure in sounds," Speech Commun., vol. 27, no. 3-4, pp. 187-207, Apr. 1999.
- (1999) Speech Commun , vol.27 , Issue.3-4 , pp. 187-207
- Kawahara, H.¹ Masuda-Katsuse, I.² Cheveigne, A.³

18
- 76849086888
- Silent-speech enhancement using body-conducted vocal-tract resonance signals
- Apr.
- T. Hirahara, M. Otani, S. Shimizu, T. Toda, and K. Nakamura, "Silent-speech enhancement using body-conducted vocal-tract resonance signals," Speech Commun., vol. 52, no. 4, pp. 301-313, Apr. 2010.
- (2010) Speech Commun , vol.52 , Issue.4 , pp. 301-313
- Hirahara, T.¹ Otani, M.² Shimizu, S.³ Toda, T.⁴ Nakamura, K.⁵

19
- 57749193836
- Voice conversion based on maximum likelihood estimation of spectral parameter trajectory
- Nov.
- T. Toda, A.W. Black, and K. Tokuda, "Voice conversion based on maximum likelihood estimation of spectral parameter trajectory," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 8, pp. 2222-2235, Nov. 2007.
- (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , Issue.8 , pp. 2222-2235
- Toda, T.¹ Black, A.W.² Tokuda, K.³

20
- 0031623661
- Spectral voice conversion for text-to-speech synthesis
- A. Kain and M. W. Macon, "Spectral voice conversion for text-to-speech synthesis," in Proc. ICASSP, May 1998, pp. 285-288.
- Proc. ICASSP, May 1998 , pp. 285-288
- Kain, A.¹ Macon, M.W.²

21
- 0033708106
- Speech parameter generation algorithms for HMM-based speech synthesis
- K. Tokuda, T. Yoshimura, T. Masuko, T. Kobayashi, and T. Kitamura, "Speech parameter generation algorithms for HMM-based speech synthesis," in Proc. ICASSP, Jun. 2000, pp. 1315-1318.
- Proc. ICASSP, Jun. 2000 , pp. 1315-1318
- Tokuda, K.¹ Yoshimura, T.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

22
- 0034320005
- Rapid speaker adaptation in eigenvoice space
- DOI 10.1109/89.876308
- R. Kuhn, J. Junqua, P. Nguyen, and N. Niedzielski, "Rapid speaker adaptation in eigenvoice space," IEEE Trans. Speech Audio Process., vol. 8, no. 6, pp. 695-707, Nov. 2000. (Pubitemid 32025317)
- (2000) IEEE Transactions on Speech and Audio Processing , vol.8 , Issue.6 , pp. 695-707
- Kuhn, R.¹ Junqua, J.-C.² Nguyen, P.³ Niedzielski, N.⁴

23
- 38049064114
- Prediction of fundamental frequency and voicing from mel-frequency cepstral coefficients for unconstrained speech reconstruction
- Dec.
- B. Milner and X. Shao, "Prediction of fundamental frequency and voicing from mel-frequency cepstral coefficients for unconstrained speech reconstruction," IEEE Trans. Speech Audio Process., vol. 15, no. 1, pp. 24-33, Dec. 2007.
- (2007) IEEE Trans. Speech Audio Process. , vol.15 , Issue.1 , pp. 24-33
- Milner, B.¹ Shao, X.²

24
- 60049098432
- Analysis and prediction of acoustic speech features from mel-frequency cepstral coefficients in distributed speech recognition architectures
- Dec.
- J. Darch, B. Milner, and S. Vaseghi, "Analysis and prediction of acoustic speech features from mel-frequency cepstral coefficients in distributed speech recognition architectures," J. Acoust. Soc. Amer., vol. 124, no. 6, pp. 3989-4000, Dec. 2008.
- (2008) J. Acoust. Soc. Amer. , vol.124 , Issue.6 , pp. 3989-4000
- Darch, J.¹ Milner, B.² Vaseghi, S.³

25
- 38649140222
- Statistical mapping between articulatory movements and acoustic spectrum using a Gaussian mixture model
- DOI 10.1016/j.specom.2007.09.001, PII S0167639307001495
- T. Toda, A. W. Black, and K. Tokuda, "Statistical mapping between articulatory movements and acoustic spectrum using a Gaussian mixture model," Speech Commun., vol. 50, no. 3, pp. 215-227, Mar. 2008. (Pubitemid 351172471)
- (2008) Speech Communication , vol.50 , Issue.3 , pp. 215-227
- Toda, T.¹ Black, A.W.² Tokuda, K.³

26
- 84865698185
- Statistical voice conversion techniques for body-conducted unvoiced speech enhancement
- Sep.
- T. Toda, M. Nakagiri, and K. Shikano, "Statistical voice conversion techniques for body-conducted unvoiced speech enhancement," IEEE Trans. Audio, Speech, Lang. Process., vol. 20, no. 9, pp. 2505-2517, Sep. 2012.
- (2012) IEEE Trans. Audio, Speech, Lang. Process. , vol.20 , Issue.9 , pp. 2505-2517
- Toda, T.¹ Nakagiri, M.² Shikano, K.³

27
- 0030362995
- A compact model for speaker-adaptive training
- T. Anastasakos, J. McDonough, S. R. Schwartz, and J. Makhoul, "A compact model for speaker-adaptive training," in Proc. ICSLP, 1996, vol. 2, pp. 1137-1140.
- Proc. ICSLP, 1996 , vol.2 , pp. 1137-1140
- Anastasakos, T.¹ McDonough, J.² Schwartz, S.R.³ Makhoul, J.⁴

28
- 77952978184
- Adaptive training for voice conversion based on eigenvoices
- Jun.
- Y. Ohtani, T. Toda, H. Saruwatari, and K. Shikano, "Adaptive training for voice conversion based on eigenvoices," IEICE Trans. Inf. Syst., vol. E93-D, no. 6, pp. 1589-1598, Jun. 2010.
- (2010) IEICE Trans. Inf. Syst. , vol.E93-D , Issue.6 , pp. 1589-1598
- Ohtani, Y.¹ Toda, T.² Saruwatari, H.³ Shikano, K.⁴

29
- 85131821539
- Mel-generalized cepstral analysis - A unified approach to speech spectral estimation
- K. Tokuda, T. Kobayashi, T. Masuko, and S. Imai, "Mel-generalized cepstral analysis - a unified approach to speech spectral estimation," in Proc. ICSLP, Sep. 1994, pp. 1043-1045.
- Proc. ICSLP, Sep. 1994 , pp. 1043-1045
- Tokuda, K.¹ Kobayashi, T.² Masuko, T.³ Imai, S.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.