Volume , Issue , 2014, Pages 290-294

A postfilter to modify the modulation spectrum in HMM-based speech synthesis

Author keywords

global variance; HMM-based speech synthesis; modulation spectrum; over-smoothing; postfilter

Indexed keywords

MODULATION; SPEECH SYNTHESIS

EID: 84905234422     PISSN: 15206149     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ICASSP.2014.6853604     Document Type: Conference Paper
Times cited: 66

References (19)
  • 1
    • 67651002140
    • Statistical parametric speech synthesis
    • H. Zen, K. Tokuda, and A. Black. Statistical parametric speech synthesis. Speech Commun., Vol. 51, No. 11, pp. 1039-1064, 2009.
    • (2009) Speech Commun. , vol.51 , Issue.11 , pp. 1039-1064
    • Zen, H.1    Tokuda, K.2    Black, A.3
  • 3
    • 33847129573
    • Average-voice-based speech synthesis using HSMM-based speaker adaptation and adaptive training
    • J. Yamagishi and T. Kobayashi. Average-voice-based speech synthesis using HSMM-based speaker adaptation and adaptive training. IEICE Trans., Inf. and Syst., Vol. E90-D, No. 2, pp. 533-543, 2007.
    • (2007) IEICE Trans., Inf. and Syst. , vol.E90-D , Issue.2 , pp. 533-543
    • Yamagishi, J.1    Kobayashi, T.2
  • 4
    • 51449114529
    • A style control technique for HMM-based expressive speech synthesis
    • T. Nose, J. Yamagishi, T. Masuko, and T. Kobayashi. A style control technique for HMM-based expressive speech synthesis. IEICE Trans., Inf. and Syst., Vol. E90-D, No. 9, pp. 1406-1413, 2007.
    • (2007) IEICE Trans., Inf. and Syst. , vol.E90-D , Issue.9 , pp. 1406-1413
    • Nose, T.1    Yamagishi, J.2    Masuko, T.3    Kobayashi, T.4
  • 6
    • 38549096029
    • A speech parameter generation algorithm considering global variance for HMM-based speech synthesis
    • T. Toda and K. Tokuda. A speech parameter generation algorithm considering global variance for HMM-based speech synthesis. IEICE Trans., Vol. E90-D, No. 5, pp. 816-824, 2007.
    • (2007) IEICE Trans. , vol.E90-D , Issue.5 , pp. 816-824
    • Toda, T.1    Tokuda, K.2
  • 7
    • 0028287770
    • Effect of reducing slow temporal modulations on speech reception
    • R. Drullman, J.M. Festen, and R. Plomp. Effect of reducing slow temporal modulations on speech reception. J. Acoust. Soc. of America, Vol. 95, pp. 2670-2680, 1994.
    • (1994) J. Acoust. Soc. of America , vol.95 , pp. 2670-2680
    • Drullman, R.1    Festen, J.M.2    Plomp, R.3
  • 8
    • 70349212558
    • Phoneme recognition using spectral envelope and modulation frequency features
    • Taipei, Taiwan, April
    • S. Thomas, S. Ganapathy, and H. Hermansky. Phoneme recognition using spectral envelope and modulation frequency features. In Proc. ICASSP, pp. 4453-4456, Taipei, Taiwan, April 2009.
    • (2009) Proc. ICASSP , pp. 4453-4456
    • Thomas, S.1    Ganapathy, S.2    Hermansky, H.3
  • 9
    • 0033708106
    • Speech parameter generation algorithms for HMM-based speech synthesis
    • Istanbul, Turkey, June
    • K. Tokuda, T. Yoshimura, T. Masuko, T. Kobayashi, and T. Kitamura. Speech parameter generation algorithms for HMM-based speech synthesis. In Proc. ICASSP, pp. 1315-1318, Istanbul, Turkey, June 2000.
    • (2000) Proc. ICASSP , pp. 1315-1318
    • Tokuda, K.1    Yoshimura, T.2    Masuko, T.3    Kobayashi, T.4    Kitamura, T.5
  • 11
    • 85008023596
    • Continuous F0 modeling for HMM based statistical parametric speech synthesis
    • K. Yu and S. Young. Continuous F0 modeling for HMM based statistical parametric speech synthesis. IEEE Trans. Audio, Speech and Language, Vol. 19, No. 5, pp. 1071-1079, 2011.
    • (2011) IEEE Trans. Audio, Speech and Language , vol.19 , Issue.5 , pp. 1071-1079
    • Yu, K.1    Young, S.2
  • 12
    • 84905244240
    • A hybrid approach to electrolaryngeal speech enhancement based on spectral subtraction and statistical voice conversion
    • Lyon, France, Sep.
    • K. Tanaka, T. Toda, G. Neubig, S. Sakti, and S. Nakamura. A hybrid approach to electrolaryngeal speech enhancement based on spectral subtraction and statistical voice conversion. In Proc. INTERSPEECH, pp. 3067-3071, Lyon, France, Sep. 2013.
    • (2013) Proc. INTERSPEECH , pp. 3067-3071
    • Tanaka, K.1    Toda, T.2    Neubig, G.3    Sakti, S.4    Nakamura, S.5
  • 14
    • 84878390910
    • Implementation of computationally efficient real-time voice conversion
    • Portland, Oregon, U.S., Sept.
    • T. Toda, T. Muramatsu, and H. Banno. Implementation of computationally efficient real-time voice conversion. In Proc. INTERSPEECH, Portland, Oregon, U.S., Sept. 2012.
    • (2012) Proc. INTERSPEECH
    • Toda, T.1    Muramatsu, T.2    Banno, H.3
  • 17
    • 84874199000
    • Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT
    • Firenze, Italy, Sept.
    • H. Kawahara, J. Estill, and O. Fujimura. Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT. In MAVEBA 2001, pp. 1-6, Firenze, Italy, Sept. 2001.
    • (2001) MAVEBA 2001 , pp. 1-6
    • Kawahara, H.1    Estill, J.2    Fujimura, O.3
  • 18
    • 44949143155
    • Maximum likelihood voice conversion based on GMM with STRAIGHT mixed excitation
    • Pittsburgh, U.S.A., Sep.
    • Y. Ohtani, T. Toda, H. Saruwatari, and K. Shikano. Maximum likelihood voice conversion based on GMM with STRAIGHT mixed excitation. In Proc. INTERSPEECH, pp. 2266-2269, Pittsburgh, U.S.A., Sep. 2006.
    • (2006) Proc. INTERSPEECH , pp. 2266-2269
    • Ohtani, Y.1    Toda, T.2    Saruwatari, H.3    Shikano, K.4
  • 19
    • 0032673049
    • Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
    • H. Kawahara, I. Masuda-Katsuse, and A. de Cheveigné. Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds. Speech Commun., Vol. 27, No. 3-4, pp. 187-207, 1999.
    • (1999) Speech Commun. , vol.27 , Issue.3-4 , pp. 187-207
    • Kawahara, H.1    Masuda-Katsuse, I.2    de Cheveigné, A.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.