SCOPUS Information Search Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 19, Issue 5, 2011, Pages 1071-1079

Continuous F0 Modeling for HMM Based Statistical Parametric Speech Synthesis

Authors (2): Yu, Kai (a); Young, Steve (a)

(a) University of Cambridge (United Kingdom)

Author keywords

F0 modeling; hidden Markov model (HMM) based synthesis; statistical parametric speech synthesis; voicing classification

Indexed keywords

EID: 85008023596; PISSN: 15587916; EISSN: 15587924; Source Type: Journal
DOI: 10.1109/TASL.2010.2076805; Document Type: Article

Times cited: 123

References (27)

1
- 85009139544
- Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis
- T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, “Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis,” in Proc. Eurospeech, 1999, pp. 2347–2350.
- (1999) Proc. Eurospeech , pp. 2347-2350
- Yoshimura, T.¹ Tokuda, K.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

2
- 0032673049
- Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
- 4
- H. Kawahara, I. M. Katsuse, and A. D. Cheveigne, “Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds,” Speech Commun., vol. 27, no. 3–4, pp. 187–207, 1999.
- (1999) Speech Commun. , vol.27 , Issue.3 , pp. 187-207
- Kawahara, H.¹ Katsuse, I.M.² Cheveigne, A.D.³

3
- 84874199000
- Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system straight
- H. Kawahara, J. Estill, and O. Fujimura, “Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system straight,” in Proc. MAVEBA, 2001.
- (2001) Proc. MAVEBA
- Kawahara, H.¹ Estill, J.² Fujimura, O.³

4
- 0033708106
- Speech parameter generation algorithms for HMM-based speech synthesis
- K. Tokuda, T. Yoshimura, T. Masuko, T. Kobayashi, and T. Kitamura, “Speech parameter generation algorithms for HMM-based speech synthesis,” in Proc. ICASSP, 2000, pp. 1315–1318.
- (2000) Proc. ICASSP , pp. 1315-1318
- Tokuda, K.¹ Yoshimura, T.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

5
- 0020596154
- Cepstral analysis synthesis on the mel frequency scale
- S. Imai, “Cepstral analysis synthesis on the mel frequency scale,” in Proc. ICASSP, 1983, pp. 93–96.
- (1983) Proc. ICASSP , pp. 93-96
- Imai, S.¹

6
- 84928118106
- Fixed point analysis of frequency to instantaneous frequency mapping for accurate estimation of F0 and periodicity
- H. Kawahara, H. Katayose, A. D. Cheveigne, and R. D. Patterson, “Fixed point analysis of frequency to instantaneous frequency mapping for accurate estimation of F0 and periodicity,” in Proc. Eurospeech, 1999, pp. 2781–2784.
- (1999) Proc. Eurospeech , pp. 2781-2784
- Kawahara, H.¹ Katayose, H.² Cheveigne, A.D.³ Patterson, R.D.⁴

7
- 0004056285
- Upper Saddle River, NJ: Prentice-Hall
- X. Huang, A. Acero, and H. Hon, Spoken Language Processing. Upper Saddle River, NJ: Prentice-Hall, 2001.
- (2001) Spoken Language Processing
- Huang, X.¹ Acero, A.² Hon, H.³

8
- 21244491419
- A robust algorithm for pitch tracking (RAPT)
- Amsterdam, The Netherlands: Elsevier
- D. Talkin, “A robust algorithm for pitch tracking (RAPT),” in Speech Coding and Synthesis. Amsterdam, The Netherlands: Elsevier, 1995, pp. 497–516.
- (1995) Speech Coding and Synthesis , pp. 497-516
- Talkin, D.¹

9
- 0036522887
- Multi-space probability distribution HMM
- K. Tokuda, T. Masuko, N. Miyazaki, and T. Kobayashi, “Multi-space probability distribution HMM,” IEICE Trans. Inf. Syst., vol. E85-D, no. 3, pp. 455–464, 2002.
- (2002) IEICE Trans. Inf. Syst. , vol.E85-D , Issue.3 , pp. 455-464
- Tokuda, K.¹ Masuko, T.² Miyazaki, N.³ Kobayashi, T.⁴

10
- 33646796046
- Ph.D. dissertation, Nagoya Inst. of Technol., Nagoya, Japan
- T. Yoshimura, “Simultaneous modelling of phonetic and prosodic parameters, and characteristic conversion for HMM based text-to-speech systems,” Ph.D. dissertation, Nagoya Inst. of Technol., Nagoya, Japan, 2002.
- (2002) Simultaneous modelling of phonetic and prosodic parameters, and characteristic conversion for HMM based text-to-speech systems
- Yoshimura, T.¹

11
- 0037567970
- Pitch pattern generation using multi-space probability distribution HMM
- T. Masuko, K. Tokuda, N. Miyazaki, and T. Kobayashi, “Pitch pattern generation using multi-space probability distribution HMM,” IEICE Trans., vol. J83-D-II, no. 7, pp. 1600–1609, 2000.
- (2000) IEICE Trans. , vol.J83-D-II , Issue.7 , pp. 1600-1609
- Masuko, T.¹ Tokuda, K.² Miyazaki, N.³ Kobayashi, T.⁴

12
- 0023869369
- Lexical stress recognition using hidden Markov models
- G. J. Freij and F. Fallside, “Lexical stress recognition using hidden Markov models,” in Proc. ICASSP, 1988, pp. 135–138.
- (1988) Proc. ICASSP , pp. 135-138
- Freij, G.J.¹ Fallside, F.²

13
- 0028466266
- Modelling intonation contours at the phrase level using continuous density hidden Markov models
- U. Jensen, R. K. Moore, P. Dalsgaard, and B. Lindberg, “Modelling intonation contours at the phrase level using continuous density hidden Markov models,” Comput. Speech Lang., vol. 8, pp. 247–260, 1994.
- (1994) Comput. Speech Lang. , vol.8 , pp. 247-260
- Jensen, U.¹ Moore, R.K.² Dalsgaard, P.³ Lindberg, B.⁴

14
- 0032665603
- A dynamical system model for generating fundamental frequency for speech synthesis
- May
- K. N. Ross and M. Ostendorf, “A dynamical system model for generating fundamental frequency for speech synthesis,” IEEE Trans. Speech Audio Process., vol. 7, no. 3, pp. 295–309, May 1999.
- (1999) IEEE Trans. Speech Audio Process. , vol.7 , Issue.3 , pp. 295-309
- Ross, K.N.¹ Ostendorf, M.²

15
- 67650823157
- Probabilistic modeling of F0 in unvoiced regions in HMM based speech synthesis
- K. Yu, T. Toda, M. Gasic, S. Keizer, F. Mairesse, B. Thomson, and S. Young, “Probabilistic modeling of F0 in unvoiced regions in HMM based speech synthesis,” in Proc. ICASSP, 2009.
- (2009) Proc. ICASSP
- Yu, K.¹ Toda, T.² Gasic, M.³ Keizer, S.⁴ Mairesse, F.⁵ Thomson, B.⁶ Young, S.⁷

16
- 85008040661
- New York: McGraw-Hill
- A. Papoulis, Probability, Random Variables, and Stochastic Processes. New York: McGraw-Hill, 1984.
- (1984) Probability, Random Variables, and Stochastic Processes
- Papoulis, A.¹

17
- 80051646062
- A pitch pattern modeling technique using dynamic features on the border of voiced and unvoiced segments
- H. Zen, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, “A pitch pattern modeling technique using dynamic features on the border of voiced and unvoiced segments,” Tech. Rep. IEICE, vol. 101, no. 325, pp. 53–58, 2001.
- (2001) Tech. Rep. IEICE , vol.101 , Issue.325 , pp. 53-58
- Zen, H.¹ Tokuda, K.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

18
- 0028996993
- Speech parameter generation from HMM using dynamic features
- K. Tokuda, T. Kobayashi, and S. Imai, “Speech parameter generation from HMM using dynamic features,” in Proc. ICASSP, 1995, pp. 660–663.
- (1995) Proc. ICASSP , pp. 660-663
- Tokuda, K.¹ Kobayashi, T.² Imai, S.³

19
- 85135145174
- Acoustic modeling based on the MDL principle for speech recognition
- K. Shinoda and T. Watanabe, “Acoustic modeling based on the MDL principle for speech recognition,” in Proc. Eurospeech, 1997, pp. 99–102.
- (1997) Proc. Eurospeech , pp. 99-102
- Shinoda, K.¹ Watanabe, T.²

20
- 0011450865
- A study on pitch pattern generation using HMMs based on multi-space probability distributions
- 12
- N. Miyazaki, K. Tokuda, T. Masuko, and T. Kobayashi, “A study on pitch pattern generation using HMMs based on multi-space probability distributions,” Tech. Rep. IEICE, vol. SP98–12, 1998.
- (1998) Tech. Rep. IEICE , vol.SP98
- Miyazaki, N.¹ Tokuda, K.² Masuko, T.³ Kobayashi, T.⁴

21
- 51749120945
- On the convergence of cubic interpolating splines
- New York: Birkhauser
- T. Lyche and L. L. Schumaker, “On the convergence of cubic interpolating splines,” in Spline Functions and Approximation Theory. New York: Birkhauser, 1973, pp. 169–189.
- (1973) Spline Functions and Approximation Theory , pp. 169-189
- Lyche, T.¹ Schumaker, L.L.²

22
- 33646773080
- CMU ARCTIC Databases for Speech Synthesis
- Carnegie Mellon Univ., Pittsburgh, PA, Tech. Rep. CMU-LTI-03-177
- J. Kominek and A. Black, CMU ARCTIC Databases for Speech Synthesis, Lang. Technol. Inst., School of Comput. Sci., Carnegie Mellon Univ., Pittsburgh, PA, Tech. Rep. CMU-LTI-03-177, 2003.
- (2003) CMU ARCTIC Databases for Speech Synthesis
- Kominek, J.¹ Black, A.²

23
- 85008004838
- [Online]. Available: http://hts.sp.nitech.ac.jp
- HMM-Based Speech Synthesis System (HTS). [Online]. Available: http://hts.sp.nitech.ac.jp

24
- 33846405723
- Details of the Nitech HMM-based speech synthesis system for the Blizzard Challenge 2005
- H. Zen, T. Toda, M. Nakamura, and K. Tokuda, “Details of the Nitech HMM-based speech synthesis system for the Blizzard Challenge 2005,” IEICE Trans. Inf. Syst., vol. E90-D, no. 1, pp. 325–333, 2007.
- (2007) IEICE Trans. Inf. Syst. , vol.E90-D , Issue.1 , pp. 325-333
- Zen, H.¹ Toda, T.² Nakamura, M.³ Tokuda, K.⁴

25
- 0011510420
- Duration modeling in HMM-based speech synthesis system
- T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, “Duration modeling in HMM-based speech synthesis system,” in Proc. ICSLP, 1998, pp. 29–32.
- (1998) Proc. ICSLP , pp. 29-32
- Yoshimura, T.¹ Tokuda, K.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

26
- 33947110905
- State duration modeling for HMM-based speech synthesis
- H. Zen, K. Tokuda, T. Masuko, T. Yoshimura, T. Kobayashi, and T. Kitamura, “State duration modeling for HMM-based speech synthesis,” IEICE Trans. Inf. Syst., vol. E90-D, no. 3, pp. 692–693, 2007.
- (2007) IEICE Trans. Inf. Syst. , vol.E90-D , Issue.3 , pp. 692-693
- Zen, H.¹ Tokuda, K.² Masuko, T.³ Yoshimura, T.⁴ Kobayashi, T.⁵ Kitamura, T.⁶

27
- 38549096029
- A speech parameter generation algorithm considering global variance for HMM-based speech synthesis
- T. Toda and K. Tokuda, “A speech parameter generation algorithm considering global variance for HMM-based speech synthesis,” IEICE Trans. Inf. Syst., vol. E90-D, no. 5, pp. 816–824, 2007.
- (2007) IEICE Trans. Inf. Syst. , vol.E90-D , Issue.5 , pp. 816-824
- Toda, T.¹ Tokuda, K.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.