Volume 24, Issue 4, 2016, Pages 755-767

Postfilters to modify the modulation spectrum for statistical parametric speech synthesis

Author keywords

Clustergen; Global variance; GMM based voice conversion; HMM based text to speech; Modulation spectrum; Over smoothing; Post filter; Statistical parametric speech synthesis

Indexed keywords

COMMUNICATION CHANNELS (INFORMATION THEORY); GAUSSIAN DISTRIBUTION; HIDDEN MARKOV MODELS; MARKOV PROCESSES; MODULATION; PLASMA DIAGNOSTICS; SPEECH PROCESSING; SPEECH SYNTHESIS; STATISTICS; TRAJECTORIES; TRELLIS CODES;

EID: 84962834006     PISSN: 23299290     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASLP.2016.2522655     Document Type: Article
Times cited: 68

References (66)
  • 1
    • 84855906479
    • Speech synthesis technologies for individuals with vocal disabilities: Voice banking and reconstruction
    • J. Yamagishi, C. Veaux, S. King, and S. Renals, "Speech synthesis technologies for individuals with vocal disabilities: Voice banking and reconstruction," Acoust. Sci. Technol., vol. 33, pp. 1-5, 2012.
    • (2012) Acoust. Sci. Technol. , vol.33 , pp. 1-5
    • Yamagishi, J.1    Veaux, C.2    King, S.3    Renals, S.4
  • 2
    • 84905252904
    • An evaluation of excitation feature prediction in a hybrid approach to electrolaryngeal speech enhancement
    • Florence, Italy, May
    • K. Tanaka, T. Toda, G. Neubig, S. Sakti, and S. Nakamura, "An evaluation of excitation feature prediction in a hybrid approach to electrolaryngeal speech enhancement," in Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP), Florence, Italy, May 2014, pp. 4521-4525.
    • (2014) Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP) , pp. 4521-4525
    • Tanaka, K.1    Toda, T.2    Neubig, G.3    Sakti, S.4    Nakamura, S.5
  • 3
    • 84874403435
    • Singing voice conversion method based on many-to-many eigenvoice conversion and training data generation using a singing-to-singing synthesis system
    • Hollywood, CA, USA, Nov
    • H. Doi, T. Toda, T. Nakano, M. Goto, and S. Nakamura, "Singing voice conversion method based on many-to-many eigenvoice conversion and training data generation using a singing-to-singing synthesis system," in Proc. Asia Pac. Signal Inf. Process. Assoc. Annu. Summit Conf. (APSIPA ASC), Hollywood, CA, USA, Nov. 2012, pp. 1-6.
    • (2012) Proc. Asia Pac. Signal Inf. Process. Assoc. Annu. Summit Conf. (APSIPA ASC) , pp. 1-6
    • Doi, H.1    Toda, T.2    Nakano, T.3    Goto, M.4    Nakamura, S.5
  • 4
    • 84865743435
    • Speaker-adaptive speech synthesis based on eigenvoice conversion and language-dependent prosodic conversion in speech-to-speech translation
    • Florence, Italy, Aug
    • N. Hattori, T. Toda, H. Kawai, H. Saruwatari, and K. Shikano, "Speaker-adaptive speech synthesis based on eigenvoice conversion and language-dependent prosodic conversion in speech-to-speech translation," in Proc. INTERSPEECH, Florence, Italy, Aug. 2011, pp. 2769-2772.
    • (2011) Proc. INTERSPEECH , pp. 2769-2772
    • Hattori, N.1    Toda, T.2    Kawai, H.3    Saruwatari, H.4    Shikano, K.5
  • 5
    • 84905280452
    • Automatic discovery of a phonetic inventory for unwritten languages for statistical speech synthesis
    • Florence, Italy, May
    • P. K. Muthukumar and A. W. Black, "Automatic discovery of a phonetic inventory for unwritten languages for statistical speech synthesis," in Proc. Int. Conf. Acoust., Speech, Signal Process., Florence, Italy, May 2014, pp. 2594-2598.
    • (2014) Proc. Int. Conf. Acoust., Speech, Signal Process , pp. 2594-2598
    • Muthukumar, P.K.1    Black, A.W.2
  • 7
    • 84905223321
    • Regression approaches to perceptual age control in singing voice conversion
    • Florence, Italy, May
    • K. Kobayashi et al., "Regression approaches to perceptual age control in singing voice conversion," in Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP), Florence, Italy, May 2014, pp. 7954-7958.
    • (2014) Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP) , pp. 7954-7958
    • Kobayashi, K.1
  • 8
    • 85032750981
    • Deep learning for acoustic modeling in parametric speech generation: A systematic review of existing techniques and future trends
    • May
    • Z.-H. Ling et al., "Deep learning for acoustic modeling in parametric speech generation: A systematic review of existing techniques and future trends," IEEE Signal Process. Mag., vol. 32, no. 3, pp. 35-52, May 2015.
    • (2015) IEEE Signal Process. Mag. , vol.32 , Issue.3 , pp. 35-52
    • Ling, Z.-H.1
  • 9
    • 0023756465
    • Speech synthesis by rule using an optimal selection of non-uniform synthesis units
    • New York, NY, USA, Apr
    • Y. Sagisaka, "Speech synthesis by rule using an optimal selection of non-uniform synthesis units," in Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP), New York, NY, USA, Apr. 1988, pp. 679-682.
    • (1988) Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP) , pp. 679-682
    • Sagisaka, Y.1
  • 10
    • 0032026483
    • Continuous probabilistic transform for voice conversion
    • Mar
    • Y. Stylianou, O. Cappe, and E. Moulines, "Continuous probabilistic transform for voice conversion," IEEE Trans. Speech Audio Process., vol. 6, no. 2, pp. 131-142, Mar. 1998.
    • (1998) IEEE Trans. Speech Audio Process , vol.6 , Issue.2 , pp. 131-142
    • Stylianou, Y.1    Cappe, O.2    Moulines, E.3
  • 11
    • 67651002140
    • Statistical parametric speech synthesis
    • H. Zen, K. Tokuda, and A. Black, "Statistical parametric speech synthesis," Speech Commun., vol. 51, no. 11, pp. 1039-1064, 2009.
    • (2009) Speech Commun. , vol.51 , Issue.11 , pp. 1039-1064
    • Zen, H.1    Tokuda, K.2    Black, A.3
  • 13
    • 84876687945
    • Speech synthesis based on hidden Markov models
    • May
    • K. Tokuda, Y. Nankaku, T. Toda, H. Zen, J. Yamagishi, and K. Oura, "Speech synthesis based on hidden Markov models," Proc. IEEE, vol. 101, no. 5, pp. 1234-1252, May 2013.
    • (2013) Proc. IEEE , vol.101 , Issue.5 , pp. 1234-1252
    • Tokuda, K.1    Nankaku, Y.2    Toda, T.3    Zen, H.4    Yamagishi, J.5    Oura, K.6
  • 14
    • 57749193836
    • Voice conversion based on maximum likelihood estimation of spectral parameter trajectory
    • Nov
    • T. Toda, A. W. Black, and K. Tokuda, "Voice conversion based on maximum likelihood estimation of spectral parameter trajectory," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 8, pp. 2222-2235, Nov. 2007.
    • (2007) IEEE Trans. Audio, Speech, Lang. Process , vol.15 , Issue.8 , pp. 2222-2235
    • Toda, T.1    Black, A.W.2    Tokuda, K.3
  • 15
    • 44949232373
    • CLUSTERGEN: A statistical parametric synthesizer using trajectory modeling
    • Pittsburgh, PA, USA, Sep
    • A. W. Black, "CLUSTERGEN: A statistical parametric synthesizer using trajectory modeling," in Proc. INTERSPEECH, Pittsburgh, PA, USA, Sep. 2006.
    • (2006) Proc. INTERSPEECH
    • Black, A.W.1
  • 16
    • 84897902941
    • Statistical parametric speech synthesis based on Gaussian process regression
    • Apr
    • T. Koriyama, T. Nose, and T. Kobayashi, "Statistical parametric speech synthesis based on Gaussian process regression," IEEE J. Sel. Topics Signal Process., vol. 8, no. 2, pp. 173-183, Apr. 2014.
    • (2014) IEEE J. Sel. Topics Signal Process , vol.8 , Issue.2 , pp. 173-183
    • Koriyama, T.1    Nose, T.2    Kobayashi, T.3
  • 17
    • 84856141218
    • Voice conversion using dynamic kernel partial least squares regression
    • Mar
    • E. Helander, T. V. H. Silen, and M. Gabbouj, "Voice conversion using dynamic kernel partial least squares regression," IEEE Trans. Audio, Speech, Lang. Process., vol. 20, no. 3, pp. 806-817, Mar. 2012.
    • (2012) IEEE Trans. Audio, Speech, Lang. Process , vol.20 , Issue.3 , pp. 806-817
    • Helander, E.1    Silen, T.V.H.2    Gabbouj, M.3
  • 18
    • 84905262874
    • Deep mixture density networks for acoustic modeling in statistical parametric speech synthesis
    • Florence, Italy, May
    • H. Zen and A. Senior, "Deep mixture density networks for acoustic modeling in statistical parametric speech synthesis," in Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP), Florence, Italy, May 2014, pp. 3872-3876.
    • (2014) Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP) , pp. 3872-3876
    • Zen, H.1    Senior, A.2
  • 19
    • 84906280857
    • Voice conversion in high-order Eigen space using deep belief nets
    • Lyon, France, Aug
    • T. Nakashika, R. Takashima, T. Takiguchi, and Y. Ariki, "Voice conversion in high-order Eigen space using deep belief nets," in Proc. INTERSPEECH, Lyon, France, Aug. 2013, pp. 369-372.
    • (2013) Proc. INTERSPEECH , pp. 369-372
    • Nakashika, T.1    Takashima, R.2    Takiguchi, T.3    Ariki, Y.4
  • 20
    • 0029765811
    • Unit selection in a concatenative speech synthesis system using a large speech database
    • Atlanta, GA, USA, May
    • A. J. Hunt and A. Black, "Unit selection in a concatenative speech synthesis system using a large speech database," in Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP), Atlanta, GA, USA, May 1996, pp. 373-376.
    • (1996) Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP) , pp. 373-376
    • Hunt, A.J.1    Black, A.2
  • 21
    • 84988274722
    • An investigation of implementation performance analysis of DNN based speech synthesis system
    • Brighton, U.K
    • K. Oura, H. Zen, Y. Nankaku, A. Lee, and K. Tokuda, "An investigation of implementation performance analysis of DNN based speech synthesis system," in Proc. INTERSPEECH, Brighton, U.K., 2014, pp. 577-582.
    • (2014) Proc. INTERSPEECH , pp. 577-582
    • Oura, K.1    Zen, H.2    Nankaku, Y.3    Lee, A.4    Tokuda, K.5
  • 22
    • 33847129573
    • Average-voice-based speech synthesis using HSMM-based speaker adaptation and adaptive training
    • J. Yamagishi and T. Kobayashi, "Average-voice-based speech synthesis using HSMM-based speaker adaptation and adaptive training," IEICE Trans. Inf. Syst., vol. E90-D, no. 2, pp. 533-543, 2007.
    • (2007) IEICE Trans. Inf. Syst. , vol.E90-D , Issue.2 , pp. 533-543
    • Yamagishi, J.1    Kobayashi, T.2
  • 23
    • 51449114529
    • A style control technique for HMM-based expressive speech synthesis
    • T. Nose, J. Yamagishi, T. Masuko, and T. Kobayashi, "A style control technique for HMM-based expressive speech synthesis," IEICE Trans. Inf. Syst., vol. E90-D, no. 9, pp. 1406-1413, 2007.
    • (2007) IEICE Trans. Inf. Syst. , vol.E90-D , Issue.9 , pp. 1406-1413
    • Nose, T.1    Yamagishi, J.2    Masuko, T.3    Kobayashi, T.4
  • 24
    • 84878397811
    • Exploring rich expressive information from audiobook data using cluster adaptive training
    • Portland, OR, USA, Sep
    • L. Chen, M. J. F. Gales, L. Chen, K. Chin, K. Knull, and M. Akamine, "Exploring rich expressive information from audiobook data using cluster adaptive training," in Proc. INTERSPEECH, Portland, OR, USA, Sep. 2012.
    • (2012) Proc. INTERSPEECH
    • Chen, L.1    Gales, M.J.F.2    Chen, L.3    Chin, K.4    Knull, K.5    Akamine, M.6
  • 27
    • 0032673049
    • Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
    • H. Kawahara, I. Masuda-Katsuse, and A. D. Cheveigne, "Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds," Speech Commun., vol. 27, no. 3-4, pp. 187-207, 1999.
    • (1999) Speech Commun. , vol.27 , Issue.3-4 , pp. 187-207
    • Kawahara, H.1    Masuda-Katsuse, I.2    Cheveigne, A.D.3
  • 28
    • 85010837662
    • An attempt to develop a singing synthesizer by collaborative creation
    • Stockholm, Sweden, Aug
    • M. Morise, "An attempt to develop a singing synthesizer by collaborative creation," in Proc. Stockholm Music Acoust. Conf. (SMAC), Stockholm, Sweden, Aug. 2013, pp. 287-292.
    • (2013) Proc. Stockholm Music Acoust. Conf. (SMAC) , pp. 287-292
    • Morise, M.1
  • 29
    • 84930664922
    • Vocaine the vocoder and applications in speech synthesis
    • Brisbane, QLD, Australia, Apr
    • Y. Agiomyrgiannakis, "Vocaine the vocoder and applications in speech synthesis," in Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP), Brisbane, QLD, Australia, Apr. 2015, pp. 4230-4234.
    • (2015) Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP) , pp. 4230-4234
    • Agiomyrgiannakis, Y.1
  • 30
    • 84906279165
    • Optimizations and fitting procedures for the Liljencrants-Fant model for statistical parametric speech synthesis
    • Vancouver, BC, Canada, May
    • P. K. Muthukumar, A. W. Black, and H. T. Bunnell, "Optimizations and fitting procedures for the Liljencrants-Fant model for statistical parametric speech synthesis," in Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP), Vancouver, BC, Canada, May 2013, pp. 397-401.
    • (2013) Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP) , pp. 397-401
    • Muthukumar, P.K.1    Black, A.W.2    Bunnell, H.T.3
  • 31
    • 38549096029
    • A speech parameter generation algorithm considering global variance for HMM-based speech synthesis
    • T. Toda and K. Tokuda, "A speech parameter generation algorithm considering global variance for HMM-based speech synthesis," IEICE Trans., vol. E90-D, no. 5, pp. 816-824, 2007.
    • (2007) IEICE Trans , vol.E90-D , Issue.5 , pp. 816-824
    • Toda, T.1    Tokuda, K.2
  • 32
    • 84897862522
    • Parameter generation methods with rich context models for high-quality and flexible text-to-speech synthesis
    • Apr
    • S. Takamichi, T. Toda, Y. Shiga, S. Sakti, G. Neubig, and S. Nakamura, "Parameter generation methods with rich context models for high-quality and flexible text-to-speech synthesis," IEEE J. Sel. Topics Signal Process., vol. 8, no. 2, pp. 239-250, Apr. 2014.
    • (2014) IEEE J. Sel. Topics Signal Process , vol.8 , Issue.2 , pp. 239-250
    • Takamichi, S.1    Toda, T.2    Shiga, Y.3    Sakti, S.4    Neubig, G.5    Nakamura, S.6
  • 33
    • 84897832343
    • A parameter generation algorithm using local variance for HMM-based speech synthesis
    • Apr
    • T. Nose, V. Chunwijitra, and T. Kobayashi, "A parameter generation algorithm using local variance for HMM-based speech synthesis," IEEE J. Sel. Topics Signal Process., vol. 8, no. 2, pp. 221-228, Apr. 2014.
    • (2014) IEEE J. Sel. Topics Signal Process , vol.8 , Issue.2 , pp. 221-228
    • Nose, T.1    Chunwijitra, V.2    Kobayashi, T.3
  • 34
    • 84890495160
    • Fast, low-artifact speech synthesis considering global variance
    • Vancouver, BC, Canada, May
    • M. Shannon and W. Byrne, "Fast, low-artifact speech synthesis considering global variance," in Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP), Vancouver, BC, Canada, May 2013, pp. 7869-7873.
    • (2013) Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP) , pp. 7869-7873
    • Shannon, M.1    Byrne, W.2
  • 35
    • 84878390910
    • Implementation of computationally efficient real-time voice conversion
    • Portland, OR, USA, Sep
    • T. Toda, T. Muramatsu, and H. Banno, "Implementation of computationally efficient real-time voice conversion," in Proc. INTERSPEECH, Portland, OR, USA, Sep. 2012.
    • (2012) Proc. INTERSPEECH
    • Toda, T.1    Muramatsu, T.2    Banno, H.3
  • 36
    • 0028287770
    • Effect of reducing slow temporal modulations on speech reception
    • R. Drullman, J. M. Festen, and R. Plomp, "Effect of reducing slow temporal modulations on speech reception," J. Acoust. Soc. Amer., vol. 95, pp. 2670-2680, 1994.
    • (1994) J. Acoust. Soc. Amer. , vol.95 , pp. 2670-2680
    • Drullman, R.1    Festen, J.M.2    Plomp, R.3
  • 38
    • 84959088222
    • Reduction of reverberation effects in the MFCC modulation spectrum for improved classification of acoustic signals
    • Dresden, Germany, Sep
    • S. Gergen, A. Nagathil, and R. Martin, "Reduction of reverberation effects in the MFCC modulation spectrum for improved classification of acoustic signals," in Proc. INTERSPEECH, Dresden, Germany, Sep. 2015, pp. 1992-1995.
    • (2015) Proc. INTERSPEECH , pp. 1992-1995
    • Gergen, S.1    Nagathil, A.2    Martin, R.3
  • 40
    • 0027957839
    • Effect of temporal envelope smearing on speech perception
    • R. Drullman, J. M. Festen, and R. Plomp, "Effect of temporal envelope smearing on speech perception," J. Acoust. Soc. Amer., vol. 95, pp. 1053-1064, 1994.
    • (1994) J. Acoust. Soc. Amer. , vol.95 , pp. 1053-1064
    • Drullman, R.1    Festen, J.M.2    Plomp, R.3
  • 41
    • 0030369532
    • Intelligibility of speech with filtered time trajectories of spectral envelopes
    • T. Arai, M. Pavel, H. Hermansky, and C. Avendano, "Intelligibility of speech with filtered time trajectories of spectral envelopes," in Proc. 4th Int. Conf. Spoken Lang. (ICSLP), 1996, vol. 4, pp. 2490-2493.
    • (1996) Proc. 4th Int. Conf. Spoken Lang. (ICSLP) , vol.4 , pp. 2490-2493
    • Arai, T.1    Pavel, M.2    Hermansky, H.3    Avendano, C.4
  • 42
    • 84867211725
    • Low-delay voice conversion based on maximum likelihood estimation of spectral parameter trajectory
    • Brisbane, QLD, Australia, Sep
    • T. Muramatsu, Y. Ohtani, T. Toda, H. Saruwatari, and K. Shikano, "Low-delay voice conversion based on maximum likelihood estimation of spectral parameter trajectory," in Proc. INTERSPEECH, Brisbane, QLD, Australia, Sep. 2008, pp. 1076-1079.
    • (2008) Proc. INTERSPEECH , pp. 1076-1079
    • Muramatsu, T.1    Ohtani, Y.2    Toda, T.3    Saruwatari, H.4    Shikano, K.5
  • 43
    • 84946045510
    • Unidirectional long short-term memory recurrent neural network with recurrent output layer for low-latency speech synthesis
    • Brisbane, QLD, Australia, Apr
    • H. Zen and H. Sak, "Unidirectional long short-term memory recurrent neural network with recurrent output layer for low-latency speech synthesis," in Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP), Brisbane, QLD, Australia, Apr. 2015, pp. 4470-4474.
    • (2015) Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP) , pp. 4470-4474
    • Zen, H.1    Sak, H.2
  • 45
    • 84959144982
    • Modified modulation spectrum-based post-filter for HMM-based speech synthesis
    • Atlanta, GA, USA, Dec
    • S. Takamichi, T. Toda, A. W. Black, and S. Nakamura, "Modified modulation spectrum-based post-filter for HMM-based speech synthesis," in Proc. GlobalSIP, Atlanta, GA, USA, Dec. 2014, pp. 710-714.
    • (2014) Proc. GlobalSIP , pp. 710-714
    • Takamichi, S.1    Toda, T.2    Black, A.W.3    Nakamura, S.4
  • 47
    • 85009139544
    • Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis
    • Budapest, Hungary, Apr
    • T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis," in Proc. EUROSPEECH, Budapest, Hungary, Apr. 1999, pp. 2347-2350.
    • (1999) Proc. EUROSPEECH , pp. 2347-2350
    • Yoshimura, T.1    Tokuda, K.2    Masuko, T.3    Kobayashi, T.4    Kitamura, T.5
  • 49
    • 84989843139
    • MDL-based context-dependent subword modeling for speech recognition
    • K. Shinoda and T. Watanabe, "MDL-based context-dependent subword modeling for speech recognition," J. Acoust. Soc. Jpn. (E), vol. 28, no. 3, pp. 140-146, 2007.
    • (2007) J. Acoust. Soc. Jpn. (E) , vol.28 , Issue.3 , pp. 140-146
    • Shinoda, K.1    Watanabe, T.2
  • 50
    • 0038443474
    • Joint acoustic and modulation frequency
    • L. Atlas and S. A. Shamma, "Joint acoustic and modulation frequency," EURASIP J. Appl. Signal Process., vol. 7, pp. 668-675, 2003.
    • (2003) EURASIP J. Appl. Signal Process , vol.7 , pp. 668-675
    • Atlas, L.1    Shamma, S.A.2
  • 52
    • 85008023596
    • Continuous F0 modeling for HMM based statistical parametric speech synthesis
    • Jul
    • K. Yu and S. Young, "Continuous F0 modeling for HMM based statistical parametric speech synthesis," IEEE Trans. Audio, Speech, Lang. Process., vol. 19, no. 5, pp. 1071-1079, Jul. 2011.
    • (2011) IEEE Trans. Audio, Speech, Lang. Process , vol.19 , Issue.5 , pp. 1071-1079
    • Yu, K.1    Young, S.2
  • 53
    • 84905244240
    • A hybrid approach to electrolaryngeal speech enhancement based on spectral subtraction and statistical voice conversion
    • Lyon, France, Sep
    • K. Tanaka, T. Toda, G. Neubig, S. Sakti, and S. Nakamura, "A hybrid approach to electrolaryngeal speech enhancement based on spectral subtraction and statistical voice conversion," in Proc. Interspeech, Lyon, France, Sep. 2013, pp. 3067-3071.
    • (2013) Proc. Interspeech , pp. 3067-3071
    • Tanaka, K.1    Toda, T.2    Neubig, G.3    Sakti, S.4    Nakamura, S.5
  • 54
    • 84925160976
    • Cambridge, U.K.: Cambridge Univ. Press
    • P. Taylor, Text-To-Speech Synthesis. Cambridge, U.K.: Cambridge Univ. Press, 2009.
    • (2009) Text-To-Speech Synthesis
    • Taylor, P.1
  • 55
    • 84905283795
    • A frequency-weighted post-filtering transform for compensation of the over-smoothing effect in HMM-based speech synthesis
    • Florence, Italy, May
    • F. Eyben and Y. Agiomyrgiannakis, "A frequency-weighted post-filtering transform for compensation of the over-smoothing effect in HMM-based speech synthesis," in Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP), Florence, Italy, May 2014, pp. 275-279.
    • (2014) Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP) , pp. 275-279
    • Eyben, F.1    Agiomyrgiannakis, Y.2
  • 57
    • 84910100893
    • DNN-based stochastic postfilter for HMM-based speech synthesis
    • MAX Atria, Singapore, May
    • L.-H. Chen, T. Raitio, C.-V. Botinhao, J. Yamagishi, and Z.-H. Ling, "DNN-based stochastic postfilter for HMM-based speech synthesis," in Proc. INTERSPEECH, MAX Atria, Singapore, May 2014, pp. 1954-1958.
    • (2014) Proc. INTERSPEECH , pp. 1954-1958
    • Chen, L.-H.1    Raitio, T.2    Botinhao, C.-V.3    Yamagishi, J.4    Ling, Z.-H.5
  • 59
    • 84910088495
    • Analysis of spectral enhancement using global variance in HMM-based speech synthesis
    • MAX Atria, Singapore, May
    • T. Nose and A. Ito, "Analysis of spectral enhancement using global variance in HMM-based speech synthesis," in Proc. INTERSPEECH, MAX Atria, Singapore, May 2014, pp. 2917-2921.
    • (2014) Proc. INTERSPEECH , pp. 2917-2921
    • Nose, T.1    Ito, A.2
  • 60
    • 44449177634
    • Hidden semi-Markov model based speech synthesis system
    • H. Zen, K. Tokuda, T. K. T. Masuko, and T. Kitamura, "Hidden semi-Markov model based speech synthesis system," IEICE Trans. Inf. Syst., vol. E90-D, no. 5, pp. 825-834, 2007.
    • (2007) IEICE Trans. Inf. Syst. , vol.E90-D , Issue.5 , pp. 825-834
    • Zen, H.1    Tokuda, K.2    Masuko, T.K.T.3    Kitamura, T.4
  • 62
    • 84874199000
    • Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT
    • Firenze, Italy, Sep
    • H. Kawahara, J. Estill, and O. Fujimura, "Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT," in Proc. Int. Workshop Models Anal. Vocal Emissions Biomed. Appl. (MAVEBA), Firenze, Italy, Sep. 2001, pp. 1-6.
    • (2001) Proc. Int. Workshop Models Anal. Vocal Emissions Biomed. Appl. (MAVEBA) , pp. 1-6
    • Kawahara, H.1    Estill, J.2    Fujimura, O.3
  • 63
    • 44949143155
    • Maximum likelihood voice conversion based on GMM with STRAIGHT mixed excitation
    • Pittsburgh, PA, USA, Sep
    • Y. Ohtani, T. Toda, H. Saruwatari, and K. Shikano, "Maximum likelihood voice conversion based on GMM with STRAIGHT mixed excitation," in Proc. INTERSPEECH, Pittsburgh, PA, USA, Sep. 2006, pp. 2266-2269.
    • (2006) Proc. INTERSPEECH , pp. 2266-2269
    • Ohtani, Y.1    Toda, T.2    Saruwatari, H.3    Shikano, K.4
  • 66
    • 84870436666
    • Amazon Mechanical Turk [Online]. Available: https://www.mturk.com/.
    • Amazon Mechanical Turk


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.