Volume 19, Issue 1, 2011, Pages 153-165

HMM-based speech synthesis utilizing glottal inverse filtering

Author keywords

Glottal inverse filtering; hidden Markov model (HMM); speech synthesis

Indexed keywords

EXCITATION SIGNALS; GLOTTAL FLOW; GLOTTAL SOURCE; HIDDEN MARKOV MODEL (HMM); HMM-BASED SPEECH SYNTHESIS; INVERSE FILTERING; SOURCE SIGNALS; SPECTRAL FEATURE; SPEECH SYNTHESIZER; SYNTHESIS STAGES; SYNTHETIC SPEECH; TEXT INPUT; VOCAL-TRACTS; VOICE SOURCES;

EID: 77957744515     PISSN: 15587916     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASL.2010.2045239     Document Type: Article
Times cited : (188)

References (85)
  • 1
    • 85009139544
    • Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis
    • T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis", in Proc. Eurospeech, Sep. 1999, pp. 2347-2350.
    • (1999) Proc. Eurospeech , pp. 2347-2350
    • Yoshimura, T.1    Tokuda, K.2    Masuko, T.3    Kobayashi, T.4    Kitamura, T.5
  • 2
    • 85031628788
    • An algorithm for speech parameter generation from continuous mixture HMMs with dynamic features
    • K. Tokuda, T. Masuko, T. Yamada, T. Kobayashi, and S. Imai, "An algorithm for speech parameter generation from continuous mixture HMMs with dynamic features", in Proc. Eurospeech, 1995, vol. 1, pp. 757-760.
    • (1995) Proc. Eurospeech , vol.1 , pp. 757-760
    • Tokuda, K.1    Masuko, T.2    Yamada, T.3    Kobayashi, T.4    Imai, S.5
  • 3
    • 84966348891
    • An HMM-based speech synthesis system applied to English
    • K. Tokuda, H. Zen, and A. W. Black, "An HMM-based speech synthesis system applied to English", in Proc. IEEE Workshop Speech Synth., Sep. 2002, pp. 227-230.
    • (2002) Proc. IEEE Workshop Speech Synth. , pp. 227-230
    • Tokuda, K.1    Zen, H.2    Black, A.W.3
  • 4
    • 67651002140
    • Statistical parametric speech synthesis
    • H. Zen, K. Tokuda, and A. W. Black, "Statistical parametric speech synthesis", Speech Commun., vol. 51, no. 11, pp. 1039-1064, Nov. 2009.
    • (2009) Speech Commun. , vol.51 , Issue.11 , pp. 1039-1064
    • Zen, H.1    Tokuda, K.2    Black, A.W.3
  • 5
    • 67650854725
    • Analysis of speaker adaptation algorithms for HMM-based speech synthesis and a constrained SMAPLR adaptation algorithm
    • J. Yamagishi, T. Kobayashi, Y. Nakano, K. Ogata, and J. Isogai, "Analysis of speaker adaptation algorithms for HMM-based speech synthesis and a constrained SMAPLR adaptation algorithm", IEEE Trans. Audio, Speech, Lang. Process., vol. 17, no. 1, pp. 66-83, Jan. 2009.
    • (2009) IEEE Trans. Audio, Speech, Lang. Process. , vol.17 , Issue.1 , pp. 66-83
    • Yamagishi, J.1    Kobayashi, T.2    Nakano, Y.3    Ogata, K.4    Isogai, J.5
  • 6
    • 29144475179
    • Speech synthesis with various emotional expressions and speaking styles by style interpolation and morphing
    • DOI 10.1093/ietisy/e88-d.11.2484
    • M. Tachibana, J. Yamagishi, T. Masuko, and T. Kobayashi, "Speech synthesis with various emotional expressions and speaking styles by style interpolation and morphing", IEICE Trans. Inf. Syst., vol. E88-D, no. 11, pp. 2484-2491, Nov. 2005. (Pubitemid 41816793)
    • (2005) IEICE Transactions on Information and Systems , vol.E88-D , Issue.11 , pp. 2484-2491
    • Tachibana, M.1    Yamagishi, J.2    Masuko, T.3    Kobayashi, T.4
  • 7
    • 33846429403
    • Minimum generation error training for HMM-based speech synthesis
    • Y.-J. Wu and R.-H. Wang, "Minimum generation error training for HMM-based speech synthesis", in Proc. ICASSP, 2006, vol. 1, pp. 889-892.
    • (2006) Proc. ICASSP , vol.1 , pp. 889-892
    • Wu, Y.-J.1    Wang, R.-H.2
  • 8
    • 38549096029
    • A speech parameter generation algorithm considering global variance for HMM-based speech synthesis
    • T. Toda and K. Tokuda, "A speech parameter generation algorithm considering global variance for HMM-based speech synthesis", IEICE Trans. Inf. Syst., vol. E90-D, no. 5, pp. 816-824, May 2007.
    • (2007) IEICE Trans. Inf. Syst. , vol.E90-D , Issue.5 , pp. 816-824
    • Toda, T.1    Tokuda, K.2
  • 10
    • 0025321354
    • Analysis, synthesis, and perception of voice quality variations among female and male talkers
    • D. H. Klatt and L. C. Klatt, "Analysis, synthesis, and perception of voice quality variations among female and male talkers", J. Acoust. Soc. Amer., vol. 87, no. 2, pp. 820-857, Feb. 1990.
    • (1990) J. Acoust. Soc. Amer. , vol.87 , Issue.2 , pp. 820-857
    • Klatt, D.H.1    Klatt, L.C.2
  • 11
    • 0025786649
    • Vocal quality factors: Analysis, synthesis, and perception
    • D. G. Childers and C. K. Lee, "Vocal quality factors: Analysis, synthesis, and perception", J. Acoust. Soc. Amer., vol. 90, no. 5, pp. 2394-2410, Nov. 1991.
    • (1991) J. Acoust. Soc. Amer. , vol.90 , Issue.5 , pp. 2394-2410
    • Childers, D.G.1    Lee, C.K.2
  • 12
    • 0002884330
    • The government standard linear predictive coding algorithm: LPC-10
    • T. E. Tremain, "The government standard linear predictive coding algorithm: LPC-10", Speech Technol., vol. 1, pp. 40-49, Apr. 1982.
    • (1982) Speech Technol. , vol.1 , pp. 40-49
    • Tremain, T.E.1
  • 16
    • 33846406459
    • Two-band excitation for HMM-based speech synthesis
    • S. J. Kim and M. Hahn, "Two-band excitation for HMM-based speech synthesis", IEICE Trans. Inf. Syst., vol. E90-D, 2007.
    • (2007) IEICE Trans. Inf. Syst. , vol.E90-D
    • Kim, S.J.1    Hahn, M.2
  • 17
    • 33947684811
    • A four-parameter model of glottal flow
    • G. Fant, J. Liljencrants, and Q. Lin, "A four-parameter model of glottal flow", STL-QPSR, vol. 4, pp. 1-13, 1985.
    • (1985) STL-QPSR , vol.4 , pp. 1-13
    • Fant, G.1    Liljencrants, J.2    Lin, Q.3
  • 19
    • 0026372714
    • Experiments with voice modelling in speech synthesis
    • R. Carlson, B. Granström, and I. Karlsson, "Experiments with voice modelling in speech synthesis", Speech Commun., vol. 10, pp. 481-489, 1991.
    • (1991) Speech Commun. , vol.10 , pp. 481-489
    • Carlson, R.1    Granström, B.2    Karlsson, I.3
  • 20
    • 82155160991
    • Towards an improved modeling of the glottal source in statistical parametric speech synthesis
    • J. Cabral, S. Renals, K. Richmond, and J. Yamagishi, "Towards an improved modeling of the glottal source in statistical parametric speech synthesis", in Proc. 6th ISCA Workshop Speech Synth., Aug. 2007, pp. 113-118.
    • (2007) Proc. 6th ISCA Workshop Speech Synth. , pp. 113-118
    • Cabral, J.1    Renals, S.2    Richmond, K.3    Yamagishi, J.4
  • 21
    • 84867224654
    • Glottal spectral separation for parametric speech synthesis
    • J. Cabral, S. Renals, K. Richmond, and J. Yamagishi, "Glottal spectral separation for parametric speech synthesis", in Proc. Interspeech, 2008, pp. 1829-1832.
    • (2008) Proc. Interspeech , pp. 1829-1832
    • Cabral, J.1    Renals, S.2    Richmond, K.3    Yamagishi, J.4
  • 22
    • 0015699693
    • The influence of glottal waveform on the naturalness of speech from a parallel formant synthesizer
    • J. Holmes, "The influence of glottal waveform on the naturalness of speech from a parallel formant synthesizer", IEEE Trans. Audio Electroacoust., vol. AU-21, no. 3, pp. 298-305, Jun. 1973.
    • (1973) IEEE Trans. Audio Electroacoust. , vol.AU-21 , Issue.3 , pp. 298-305
    • Holmes, J.1
  • 23
    • 0026387469
    • Improving naturalness in text-to-speech synthesis using natural glottal source
    • K. Matsui, S. D. Pearson, K. Hata, and T. Kamai, "Improving naturalness in text-to-speech synthesis using natural glottal source", in Proc. ICASSP, Apr. 1991, vol. 2, pp. 769-772.
    • (1991) Proc. ICASSP , vol.2 , pp. 769-772
    • Matsui, K.1    Pearson, S.D.2    Hata, K.3    Kamai, T.4
  • 24
    • 0032875050
    • A method for generating natural-sounding speech stimuli for cognitive brain research
    • P. Alku, H. Tiitinen, and R. Näätänen, "A method for generating natural-sounding speech stimuli for cognitive brain research", Clinical Neurophysiol., vol. 110, pp. 1329-1333, 1999.
    • (1999) Clinical Neurophysiol. , vol.110 , pp. 1329-1333
    • Alku, P.1    Tiitinen, H.2    Näätänen, R.3
  • 25
    • 0032673049
    • Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
    • H. Kawahara, I. Masuda-Katsuse, and A. de Cheveigné, "Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds", Speech Commun., vol. 27, Apr. 1999.
    • (1999) Speech Commun. , vol.27
    • Kawahara, H.1    Masuda-Katsuse, I.2    de Cheveigné, A.3
  • 26
    • 33846405723
    • Details of the Nitech HMM-based speech synthesis for Blizzard Challenge 2005
    • H. Zen, T. Toda, M. Nakamura, and K. Tokuda, "Details of the Nitech HMM-based speech synthesis for Blizzard Challenge 2005", IEICE Trans. Inf. Syst., vol. E90-D, pp. 325-333, 2007.
    • (2007) IEICE Trans. Inf. Syst. , vol.E90-D , pp. 325-333
    • Zen, H.1    Toda, T.2    Nakamura, M.3    Tokuda, K.4
  • 29
    • 84867209230
    • HMM-based Finnish text-to-speech system utilizing glottal inverse filtering
    • T. Raitio, A. Suni, H. Pulakka, M. Vainio, and P. Alku, "HMM-based Finnish text-to-speech system utilizing glottal inverse filtering", in Proc. Interspeech, 2008, pp. 1881-1884.
    • (2008) Proc. Interspeech , pp. 1881-1884
    • Raitio, T.1    Suni, A.2    Pulakka, H.3    Vainio, M.4    Alku, P.5
  • 30
    • 84955013305
    • Nature of the vocal cord wave
    • R. L. Miller, "Nature of the vocal cord wave", J. Acoust. Soc. Amer., vol. 31, no. 6, pp. 667-677, Jun. 1959.
    • (1959) J. Acoust. Soc. Amer. , vol.31 , Issue.6 , pp. 667-677
    • Miller, R.L.1
  • 32
    • 0026881384
    • Glottal wave analysis with pitch synchronous iterative adaptive inverse filtering
    • P. Alku, "Glottal wave analysis with pitch synchronous iterative adaptive inverse filtering", Speech Commun., vol. 11, no. 2-3, pp. 109-118, 1992.
    • (1992) Speech Commun. , vol.11 , Issue.2-3 , pp. 109-118
    • Alku, P.1
  • 33
    • 0018653975
    • Least squares glottal inverse filtering from the acoustic speech waveform
    • D. Wong, J. Markel, and A. Gray, Jr., "Least squares glottal inverse filtering from the acoustic speech waveform", IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-27, no. 4, pp. 350-355, Aug. 1979.
    • (1979) IEEE Trans. Acoust., Speech, Signal Process. , vol.ASSP-27 , Issue.4 , pp. 350-355
    • Wong, D.1    Markel, J.2    Gray Jr., A.3
  • 34
    • 0016129045
    • Determination of the instant of glottal closure from the speech wave
    • H. Strube, "Determination of the instant of glottal closure from the speech wave", J. Acoust. Soc. Amer., vol. 56, no. 5, pp. 1625-1629, 1974.
    • (1974) J. Acoust. Soc. Amer. , vol.56 , Issue.5 , pp. 1625-1629
    • Strube, H.1
  • 35
    • 0032595183
    • Modeling of the glottal flow derivative waveform with application to speaker identification
    • M. Plumpe, T. Quatieri, and D. Reynolds, "Modeling of the glottal flow derivative waveform with application to speaker identification", IEEE Trans. Speech Audio Process., vol. 7, no. 5, pp. 569-585, Sep. 1999.
    • (1999) IEEE Trans. Speech Audio Process. , vol.7 , Issue.5 , pp. 569-585
    • Plumpe, M.1    Quatieri, T.2    Reynolds, D.3
  • 36
    • 0034945901
    • SIM-Simultaneous inverse filtering and matching of a glottal flow model for acoustic speech signals
    • M. Fröhlich, D. Michaelis, and H. Strube, "SIM-Simultaneous inverse filtering and matching of a glottal flow model for acoustic speech signals", J. Acoust. Soc. Amer., vol. 110, no. 1, pp. 479-488, 2001.
    • (2001) J. Acoust. Soc. Amer. , vol.110 , Issue.1 , pp. 479-488
    • Fröhlich, M.1    Michaelis, D.2    Strube, H.3
  • 37
    • 17444431936
    • Estimation of the vocal tract transfer function with application to glottal wave analysis
    • O. Akande and P. Murphy, "Estimation of the vocal tract transfer function with application to glottal wave analysis", Speech Commun., vol. 46, pp. 15-36, 2005.
    • (2005) Speech Commun. , vol.46 , pp. 15-36
    • Akande, O.1    Murphy, P.2
  • 38
    • 33751247257
    • Robust glottal source estimation based on joint source-filter model optimization
    • Q. Fu and P. Murphy, "Robust glottal source estimation based on joint source-filter model optimization", IEEE Trans. Audio, Speech, Lang. Process., vol. 14, no. 2, pp. 492-501, Mar. 2006.
    • (2006) IEEE Trans. Audio, Speech, Lang. Process. , vol.14 , Issue.2 , pp. 492-501
    • Fu, Q.1    Murphy, P.2
  • 39
    • 0028797112
    • Modeling the glottal volume-velocity waveform for three voice types
    • D. Childers and C. Ahn, "Modeling the glottal volume-velocity waveform for three voice types", J. Acoust. Soc. Amer., vol. 97, no. 1, pp. 505-519, 1995.
    • (1995) J. Acoust. Soc. Amer. , vol.97 , Issue.1 , pp. 505-519
    • Childers, D.1    Ahn, C.2
  • 41
    • 0026881761
    • On the relation between voice source parameters and prosodic features in connected speech
    • H. Strik and L. Boves, "On the relation between voice source parameters and prosodic features in connected speech", Speech Commun., vol. 11, pp. 167-174, 1992.
    • (1992) Speech Commun. , vol.11 , pp. 167-174
    • Strik, H.1    Boves, L.2
  • 43
    • 0016495091
    • Linear prediction: A tutorial review
    • J. Makhoul, "Linear prediction: A tutorial review", Proc. IEEE, vol. 63, no. 4, pp. 561-580, Apr. 1975.
    • (1975) Proc. IEEE , vol.63 , Issue.4 , pp. 561-580
    • Makhoul, J.1
  • 44
    • 0021892196
    • Automatic glottal inverse filtering from speech and electroglottographic signals
    • D. Veeneman and S. BeMent, "Automatic glottal inverse filtering from speech and electroglottographic signals", IEEE Trans. Acoustics, Speech, Signal Process., vol. ASSP-33, no. 2, pp. 369-377, Apr. 1985.
    • (1985) IEEE Trans. Acoustics, Speech, Signal Process. , vol.ASSP-33 , Issue.2 , pp. 369-377
    • Veeneman, D.1    BeMent, S.2
  • 45
    • 33745473319
    • Advanced methods for glottal wave extraction
    • J. Walker and P. Murphy, "Advanced methods for glottal wave extraction", in Nonlinear Analyses and Algorithms for Speech Processing, M. Faundez-Zanuy, Ed. et al. Berlin/Heidelberg, Germany: Springer, 2005, pp. 139-149.
    • (2005) Nonlinear Analyses and Algorithms for Speech Processing , pp. 139-149
    • Walker, J.1    Murphy, P.2
  • 46
    • 33750333146
    • Performance of glottal inverse filtering as tested by aeroelastic modelling of phonation and FE modelling of vocal tract
    • P. Alku, J. Horáček, M. Airas, F. Griffond-Boitier, and A.-M. Laukkanen, "Performance of glottal inverse filtering as tested by aeroelastic modelling of phonation and FE modelling of vocal tract", Acta Acust. United With Acust., vol. 92, pp. 717-724, 2006.
    • (2006) Acta Acust. United with Acust. , vol.92 , pp. 717-724
    • Alku, P.1    Horáček, J.2    Airas, M.3    Griffond-Boitier, F.4    Laukkanen, A.-M.5
  • 47
    • 32944458861
    • Estimation of the voice source from speech pressure signals: Evaluation of an inverse filtering technique using physical modelling of voice production
    • P. Alku, B. Story, and M. Airas, "Estimation of the voice source from speech pressure signals: Evaluation of an inverse filtering technique using physical modelling of voice production", Folia Phoniatrica et Logopaedica, vol. 58, no. 2, pp. 102-113, 2006.
    • (2006) Folia Phoniatrica et Logopaedica , vol.58 , Issue.2 , pp. 102-113
    • Alku, P.1    Story, B.2    Airas, M.3
  • 48
    • 0036339929
    • Normalized amplitude quotient for parameterization of the glottal flow
    • P. Alku, T. Bäckström, and E. Vilkman, "Normalized amplitude quotient for parameterization of the glottal flow", J. Acoust. Soc. Amer., vol. 112, no. 2, pp. 701-710, 2002.
    • (2002) J. Acoust. Soc. Amer. , vol.112 , Issue.2 , pp. 701-710
    • Alku, P.1    Bäckström, T.2    Vilkman, E.3
  • 49
    • 0037380186
    • The role of voice quality in communicating emotion, mood and attitude
    • C. Gobl and A. Ní Chasaide, "The role of voice quality in communicating emotion, mood and attitude", Speech Commun., vol. 40, no. 1-2, pp. 189-212, 2003.
    • (2003) Speech Commun. , vol.40 , Issue.1-2 , pp. 189-212
    • Gobl, C.1    Chasaide, A.N.2
  • 50
    • 0001292859
    • On the perception of emotions in speech: The role of voice quality
    • A.-M. Laukkanen, E. Vilkman, P. Alku, and H. Oksanen, "On the perception of emotions in speech: The role of voice quality", Logopedics Phoniatrics Vocology, vol. 22, no. 4, pp. 157-168, 1997.
    • (1997) Logopedics Phoniatrics Vocology , vol.22 , Issue.4 , pp. 157-168
    • Laukkanen, A.-M.1    Vilkman, E.2    Alku, P.3    Oksanen, H.4
  • 52
    • 0023407575
    • Review of text-to-speech conversion for English
    • D. Klatt, "Review of text-to-speech conversion for English", J. Acoust. Soc. Amer., vol. 82, no. 3, pp. 737-793, 1987.
    • (1987) J. Acoust. Soc. Amer. , vol.82 , Issue.3 , pp. 737-793
    • Klatt, D.1
  • 53
    • 0020750866
    • On the time domain properties of the two-pole model of the glottal waveform and implications for LPC
    • J. Deller, "On the time domain properties of the two-pole model of the glottal waveform and implications for LPC", Speech Commun., vol. 2, no. 1, pp. 57-63, 1983.
    • (1983) Speech Commun. , vol.2 , Issue.1 , pp. 57-63
    • Deller, J.1
  • 54
    • 0002557614
    • Line spectrum pair (LSP) and speech data compression
    • F. K. Soong and B.-H. Juang, "Line spectrum pair (LSP) and speech data compression", in Proc. ICASSP, Mar. 1984, vol. 9, pp. 37-40.
    • (1984) Proc. ICASSP , vol.9 , pp. 37-40
    • Soong, F.K.1    Juang, B.-H.2
  • 56
    • 0002557614
    • Line spectrum pair (LSP) and speech data compression
    • F. Soong and B.-H. Juang, "Line spectrum pair (LSP) and speech data compression", in Proc. ICASSP, Mar. 1984, vol. 9, pp. 37-40.
    • (1984) Proc. ICASSP , vol.9 , pp. 37-40
    • Soong, F.1    Juang, B.-H.2
  • 57
    • 0002077742
    • Quantization of LPC parameters
    • K. Paliwal and W. Kleijn, "Quantization of LPC parameters", in Speech Coding and Synthesis, W. Kleijn and K. Paliwal, Eds. Amsterdam, The Netherlands: Elsevier, 1995, ch. 12.
    • (1995) Speech Coding and Synthesis , ch. 12
    • Paliwal, K.1    Kleijn, W.2
  • 58
    • 0017367712
    • On the use of autocorrelation analysis for pitch detection
    • L. Rabiner, "On the use of autocorrelation analysis for pitch detection", IEEE Trans. Acoust., Speech, Signal Process., vol. 25, no. 1, pp. 24-33, 1977.
    • (1977) IEEE Trans. Acoust., Speech, Signal Process. , vol.25 , Issue.1 , pp. 24-33
    • Rabiner, L.1
  • 59
    • 0032945155
    • Perturbation-free measurement of the harmonics-to-noise ratio in voice signals using pitch synchronous harmonic analysis
    • P. Murphy, "Perturbation-free measurement of the harmonics-to-noise ratio in voice signals using pitch synchronous harmonic analysis", J. Acoust. Soc. Amer., vol. 105, no. 5, pp. 2866-2881, 1999.
    • (1999) J. Acoust. Soc. Amer. , vol.105 , Issue.5 , pp. 2866-2881
    • Murphy, P.1
  • 61
    • 0022685753
    • Continuously variable duration hidden Markov models for automatic speech recognition
    • S. Levinson, "Continuously variable duration hidden Markov models for automatic speech recognition", Computer Speech Lang., vol. 1, no. 1, pp. 29-45, 1986. (Pubitemid 17552445)
    • (1986) Computer Speech and Language , vol.1 , Issue.1 , pp. 29-45
    • Levinson, S.E.1
  • 62
    • 44449177634
    • A hidden semi-Markov model-based speech synthesis system
    • H. Zen, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "A hidden semi-Markov model-based speech synthesis system", IEICE Trans. Inf. Syst., vol. E90-D, no. 5, pp. 825-834, May 2007.
    • (2007) IEICE Trans. Inf. Syst. , vol.E90-D , Issue.5 , pp. 825-834
    • Zen, H.1    Tokuda, K.2    Masuko, T.3    Kobayashi, T.4    Kitamura, T.5
  • 63
    • 53049107776
    • Accent and prominence in Finnish speech synthesis
    • M. Vainio, A. Suni, and P. Sirjola, "Accent and prominence in Finnish speech synthesis", in Proc. 10th Int. Conf. Speech Comput. (Specom 2005), G. Kokkinakis, N. Fakotakis, E. Dermatas, and R. Potapova, Eds., Greece, Oct. 2005, pp. 309-312, Univ. of Patras.
    • (2005) Proc. 10th Int. Conf. Speech Comput. (Specom 2005) , pp. 309-312
    • Vainio, M.1    Suni, A.2    Sirjola, P.3
  • 64
    • 53049097235
    • Deep syntactic analysis and rule based accentuation in text-to-speech synthesis
    • A. Suni and M. Vainio, "Deep syntactic analysis and rule based accentuation in text-to-speech synthesis", in Proc. TSD'08: Proc. 11th Int. Conf. Text, Speech, Dialogue, 2008, pp. 535-542.
    • (2008) Proc. TSD'08: Proc. 11th Int. Conf. Text, Speech, Dialogue , pp. 535-542
    • Suni, A.1    Vainio, M.2
  • 65
    • 0033906251
    • MDL-based context-dependent subword modeling for speech recognition
    • K. Shinoda and T. Watanabe, "MDL-based context-dependent subword modeling for speech recognition", J. Acoust. Soc. Japan (E), vol. 21, pp. 79-86, Mar. 2000.
    • (2000) J. Acoust. Soc. Japan (E) , vol.21 , pp. 79-86
    • Shinoda, K.1    Watanabe, T.2
  • 66
    • 31744434085
    • Characterizing glottal jet turbulence
    • F. Alipour and R. Scherer, "Characterizing glottal jet turbulence", J. Acoust. Soc. Amer., vol. 119, no. 2, pp. 1063-1073, 2006.
    • (2006) J. Acoust. Soc. Amer. , vol.119 , Issue.2 , pp. 1063-1073
    • Alipour, F.1    Scherer, R.2
  • 70
    • 33745215669
    • An overview of Nitech HMM-based speech synthesis system for Blizzard Challenge 2005
    • H. Zen and T. Toda, "An overview of Nitech HMM-based speech synthesis system for Blizzard Challenge 2005", in Proc. Interspeech, Sep. 2005, pp. 93-96. (Pubitemid 43908009)
    • (2005) 9th European Conference on Speech Communication and Technology , pp. 93-96
    • Zen, H.1    Toda, T.2
  • 71
    • 0020596154
    • Cepstral analysis synthesis on the mel frequency scale
    • S. Imai, "Cepstral analysis synthesis on the mel frequency scale", in Proc. ICASSP, Apr. 1983, vol. 8, pp. 93-96.
    • (1983) Proc. ICASSP , vol.8 , pp. 93-96
    • Imai, S.1
  • 72
    • 84874199000
    • Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT
    • H. Kawahara, J. Estill, and O. Fujimura, "Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT", in Proc. 2nd Int. Workshop Models Anal. Vocal Emissions for Biomed. Applicat. (MAVEBA), Sep. 2001.
    • (2001) Proc. 2nd Int. Workshop Models Anal. Vocal Emissions for Biomed. Applicat. (MAVEBA)
    • Kawahara, H.1    Estill, J.2    Fujimura, O.3
  • 73
    • 44949143155
    • Maximum likelihood voice conversion based on GMM with STRAIGHT mixed excitation
    • Y. Ohtani, T. Toda, H. Saruwatari, and K. Shikano, "Maximum likelihood voice conversion based on GMM with STRAIGHT mixed excitation", in Proc. Interspeech, Sep. 2006, pp. 2266-2269.
    • (2006) Proc. Interspeech , pp. 2266-2269
    • Ohtani, Y.1    Toda, T.2    Saruwatari, H.3    Shikano, K.4
  • 74
    • 11144317887
    • Robust F0 estimation of speech signal using harmonicity measure based on instantaneous frequency
    • D. Arifianto, T. Tanaka, T. Masuko, and T. Kobayashi, "Robust F0 estimation of speech signal using harmonicity measure based on instantaneous frequency", IEICE Trans. Inf. Syst., vol. E87-D, no. 12, pp. 2812-2820, Dec. 2004.
    • (2004) IEICE Trans. Inf. Syst. , vol.E87-D , Issue.12 , pp. 2812-2820
    • Arifianto, D.1    Tanaka, T.2    Masuko, T.3    Kobayashi, T.4
  • 75
    • 84928118106
    • Fixed point analysis of frequency to instantaneous frequency mapping for accurate estimation of F0 and periodicity
    • H. Kawahara, H. Katayose, A. de Cheveigné, and R. Patterson, "Fixed point analysis of frequency to instantaneous frequency mapping for accurate estimation of F0 and periodicity", in Proc. Eurospeech, Sep. 1999, pp. 2781-2784.
    • (1999) Proc. Eurospeech , pp. 2781-2784
    • Kawahara, H.1    Katayose, H.2    de Cheveigné, A.3    Patterson, R.4
  • 76
    • 0001455934
    • A robust algorithm for pitch tracking (RAPT)
    • D. Talkin, "A robust algorithm for pitch tracking (RAPT)", in Speech Coding and Synthesis, W. Kleijn and K. Paliwal, Eds. Amsterdam, The Netherlands: Elsevier, 1995, pp. 495-518.
    • (1995) Speech Coding and Synthesis , pp. 495-518
    • Talkin, D.1
  • 77
    • 77957731953
    • ESPS Programs Version 5.0 Entropic Research Laboratory Inc., 1993
    • ESPS Programs Version 5.0 Entropic Research Laboratory Inc., 1993.
  • 78
    • 0025543906
    • Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones
    • E. Moulines and F. Charpentier, "Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones", Speech Commun., vol. 9, no. 5-6, pp. 453-467, Dec. 1990.
    • (1990) Speech Commun. , vol.9 , Issue.5-6 , pp. 453-467
    • Moulines, E.1    Charpentier, F.2
  • 79
    • 85016140477
    • An adaptive algorithm for mel-cepstral analysis of speech
    • T. Fukada, K. Tokuda, T. Kobayashi, and S. Imai, "An adaptive algorithm for mel-cepstral analysis of speech", in Proc. ICASSP, 1992, vol. 1, pp. 137-140.
    • (1992) Proc. ICASSP , vol.1 , pp. 137-140
    • Fukada, T.1    Tokuda, K.2    Kobayashi, T.3    Imai, S.4
  • 80
    • 77957743908
    • HTS, HMM-Based Speech Synthesis System
    • HTS, HMM-Based Speech Synthesis System, Apr. 2009 [Online]. Available: http://hts.sp.nitech.ac.jp
    • (2009)
  • 82
    • 0003450846
    • Methods for subjective determination of transmission quality
    • ITU, "Methods for subjective determination of transmission quality", Int. Telecomm. Union, Rec. ITU-T P.800, Aug. 1996.
    • (1996) Int. Telecomm. Union, Rec. ITU-T P.800
  • 83
    • 0030166343
    • The SUS test: A method for the assessment of text-to-speech synthesis intelligibility using semantically unpredictable sentences
    • C. Benoît, M. Grice, and V. Hazan, "The SUS test: A method for the assessment of text-to-speech synthesis intelligibility using semantically unpredictable sentences", Speech Commun., vol. 18, no. 4, pp. 381-392, 1996.
    • (1996) Speech Commun. , vol.18 , Issue.4 , pp. 381-392
    • Benoît, C.1    Grice, M.2    Hazan, V.3
  • 84
    • 0001884644
    • Individual comparisons by ranking methods
    • F. Wilcoxon, "Individual comparisons by ranking methods", Biometrics, vol. 1, pp. 80-83, 1945.
    • (1945) Biometrics , vol.1 , pp. 80-83
    • Wilcoxon, F.1
  • 85
    • 84867197177
    • Articulatory control of HMM-based parametric speech synthesis driven by phonetic knowledge
    • Z.-H. Ling, K. Richmond, J. Yamagishi, and R.-H. Wang, "Articulatory control of HMM-based parametric speech synthesis driven by phonetic knowledge", in Proc. Interspeech, Brisbane, Australia, Sep. 2008, pp. 573-576.
    • (2008) Proc. Interspeech, Brisbane, Australia , pp. 573-576
    • Ling, Z.-H.1    Richmond, K.2    Yamagishi, J.3    Wang, R.-H.4


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.