-
1
-
-
85009139544
-
Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis
-
T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, “Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis,” in Proc. Eurospeech, 1999, pp. 2347–2350.
-
(1999)
Proc. Eurospeech
, pp. 2347-2350
-
-
Yoshimura, T.1
Tokuda, K.2
Masuko, T.3
Kobayashi, T.4
Kitamura, T.5
-
2
-
-
0033708106
-
Speech parameter generation algorithms for HMM-based speech synthesis
-
K. Tokuda, T. Yoshimura, T. Masuko, T. Kobayashi, and T. Kitamura, “Speech parameter generation algorithms for HMM-based speech synthesis,” in Proc. ICASSP, 2000, pp. 1315–1318.
-
(2000)
Proc. ICASSP
, pp. 1315-1318
-
-
Tokuda, K.1
Yoshimura, T.2
Masuko, T.3
Kobayashi, T.4
Kitamura, T.5
-
3
-
-
85135109865
-
ATR μ-talk speech synthesis system
-
Y. Sagisaka, N. Kaiki, N. Iwahashi, and K. Mimura, “ATR μ-talk speech synthesis system,” in Proc. ICSLP, 1992, pp. 483–486.
-
(1992)
Proc. ICSLP
, pp. 483-486
-
-
Sagisaka, Y.1
Kaiki, N.2
Iwahashi, N.3
Mimura, K.4
-
4
-
-
0029765811
-
Unit selection in a concatenative speech synthesis system using a large speech database
-
A. Hunt and A. Black, “Unit selection in a concatenative speech synthesis system using a large speech database,” in Proc. ICASSP, 1996, pp. 373–376.
-
(1996)
Proc. ICASSP
, pp. 373-376
-
-
Hunt, A.1
Black, A.2
-
5
-
-
33847129573
-
Average-voice-based speech synthesis using HSMM-based speaker adaptation and adaptive training
-
J. Yamagishi and T. Kobayashi, “Average-voice-based speech synthesis using HSMM-based speaker adaptation and adaptive training,” IEICE Trans. Inf. Syst., vol. E90-D, no. 2, pp. 533–543, 2007.
-
(2007)
IEICE Trans. Inf. Syst.
, vol.E90-D
, Issue.2
, pp. 533-543
-
-
Yamagishi, J.1
Kobayashi, T.2
-
6
-
-
51449114529
-
A style control technique for HMM-based expressive speech synthesis
-
T. Nose, J. Yamagishi, and T. Kobayashi, “A style control technique for HMM-based expressive speech synthesis,” IEICE Trans. Inf. Syst., vol. E90-D, no. 9, pp. 1406–1413, 2007.
-
(2007)
IEICE Trans. Inf. Syst.
, vol.E90-D
, Issue.9
, pp. 1406-1413
-
-
Nose, T.1
Yamagishi, J.2
Kobayashi, T.3
-
7
-
-
67651002140
-
Statistical parametric speech synthesis
-
H. Zen, K. Tokuda, and A. W. Black, “Statistical parametric speech synthesis,” Speech Commun., vol. 51, no. 11, pp. 1039–1064, 2009.
-
(2009)
Speech Commun.
, vol.51
, Issue.11
, pp. 1039-1064
-
-
Zen, H.1
Tokuda, K.2
Black, A.W.3
-
8
-
-
33846410497
-
Speech parameter generation algorithm considering global variance for HMM-based speech synthesis
-
T. Toda and K. Tokuda, “Speech parameter generation algorithm considering global variance for HMM-based speech synthesis,” in Proc. Eurospeech, 2005, pp. 373–376.
-
(2005)
Proc. Eurospeech
, pp. 373-376
-
-
Toda, T.1
Tokuda, K.2
-
9
-
-
34547497133
-
Combining Gaussian mixture model with global variance term to improve the quality of an HMM-based polyglot speech synthesizer
-
J. Latorre, K. Iwano, and S. Furui, “Combining Gaussian mixture model with global variance term to improve the quality of an HMM-based polyglot speech synthesizer,” in Proc. ICASSP, 2007, pp. 1241–1244.
-
(2007)
Proc. ICASSP
, pp. 1241-1244
-
-
Latorre, J.1
Iwano, K.2
Furui, S.3
-
10
-
-
33749573927
-
Reformulating the HMM as a trajectory model by imposing explicit relationships between static and dynamic feature vector sequences
-
H. Zen, K. Tokuda, and T. Kitamura, “Reformulating the HMM as a trajectory model by imposing explicit relationships between static and dynamic feature vector sequences,” Comput. Speech Lang., vol. 21, no. 1, pp. 153–173, 2007.
-
(2007)
Comput. Speech Lang.
, vol.21
, Issue.1
, pp. 153-173
-
-
Zen, H.1
Tokuda, K.2
Kitamura, T.3
-
11
-
-
34547517493
-
Full HMM Training for Minimizing Generation Error in Synthesis
-
Y.-J. Wu, R.-H. Wang, and F. K. Soong, “Full HMM Training for Minimizing Generation Error in Synthesis,” in Proc. ICASSP, 2007, pp. 517–520.
-
(2007)
Proc. ICASSP
, pp. 517-520
-
-
Wu, Y.-J.1
Wang, R.-H.2
Soong, F.K.3
-
12
-
-
0003684449
-
-
New York: Springer
-
T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction. New York: Springer, 2001.
-
(2001)
The Elements of Statistical Learning: Data Mining, Inference, and Prediction
-
-
Hastie, T.1
Tibshirani, R.2
Friedman, J.3
-
13
-
-
85009257781
-
F0 generation for speech synthesis using a multi-tier approach
-
X.-J. Sun, “F0 generation for speech synthesis using a multi-tier approach,” in Proc. ICSLP, 2002, pp. 2077–2080.
-
(2002)
Proc. ICSLP
, pp. 2077-2080
-
-
Sun, X.-J.1
-
14
-
-
33646821329
-
Additive modeling of English F0 contour for speech synthesis
-
S. Sakai, “Additive modeling of English F0 contour for speech synthesis,” in Proc. ICASSP, 2005, pp. 277–280.
-
(2005)
Proc. ICASSP
, pp. 277-280
-
-
Sakai, S.1
-
15
-
-
48549095974
-
HMM-based trainable speech synthesis for Chinese
-
Y.-J. Wu and R.-H. Wang, “HMM-based trainable speech synthesis for Chinese,” J. Chinese Inf. Process., vol. 20, no. 4, pp. 75–81, 2006.
-
(2006)
J. Chinese Inf. Process.
, vol.20
, Issue.4
, pp. 75-81
-
-
Wu, Y.-J.1
Wang, R.-H.2
-
16
-
-
84867200235
-
Generating natural F0 trajectory with additive trees
-
Y. Qian, H. Liang, and F. K. Soong, “Generating natural F0 trajectory with additive trees,” in Proc. Interspeech, 2008, pp. 2126–2129.
-
(2008)
Proc. Interspeech
, pp. 2126-2129
-
-
Qian, Y.1
Liang, H.2
Soong, F.K.3
-
17
-
-
41049090228
-
Phone duration modeling using gradient tree boosting
-
J. Yamagishi, H. Kawai, and T. Kobayashi, “Phone duration modeling using gradient tree boosting,” Speech Commun., vol. 50, no. 5, pp. 405–415, 2008.
-
(2008)
Speech Commun.
, vol.50
, Issue.5
, pp. 405-415
-
-
Yamagishi, J.1
Kawai, H.2
Kobayashi, T.3
-
18
-
-
67650851610
-
Improved prosody generation by maximizing joint likelihood of state and longer units
-
Y. Qian, Z.-Z. Wu, and F. K. Soong, “Improved prosody generation by maximizing joint likelihood of state and longer units,” in Proc. ICASSP, 2009, pp. 3781–3784.
-
(2009)
Proc. ICASSP
, pp. 3781-3784
-
-
Qian, Y.1
Wu, Z.-Z.2
Soong, F.K.3
-
19
-
-
84867194192
-
Multilevel parametric-base F0 model for speech synthesis
-
J. Latorre and M. Akamine, “Multilevel parametric-base F0 model for speech synthesis,” in Proc. Interspeech, 2008, pp. 2274–2277.
-
(2008)
Proc. Interspeech
, pp. 2274-2277
-
-
Latorre, J.1
Akamine, M.2
-
20
-
-
33846442604
-
Investigation of state duration model based on gamma distribution for HMM-based speech synthesis
-
Y. Ishimatsu, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, “Investigation of state duration model based on gamma distribution for HMM-based speech synthesis,” IEICE Tech. Rep., vol. 101, no. 352, pp. 57–62, 2001.
-
(2001)
IEICE Tech. Rep.
, vol.101
, Issue.352
, pp. 57-62
-
-
Ishimatsu, Y.1
Tokuda, K.2
Masuko, T.3
Kobayashi, T.4
Kitamura, T.5
-
21
-
-
24144455395
-
Context-Dependent phoneme duration modeling with Tree-Based state tying
-
S. J. Park, M. W. Koo, and C. S. Jhon, “Context-Dependent phoneme duration modeling with Tree-Based state tying,” IEICE Trans. Inf. Syst., vol. E88-D, no. 3, pp. 662–666, 2005.
-
(2005)
IEICE Trans. Inf. Syst.
, vol.E88-D
, Issue.3
, pp. 662-666
-
-
Park, S.J.1
Koo, M.W.2
Jhon, C.S.3
-
22
-
-
84867218426
-
Duration refinement by jointly optimizing state and longer unit likelihood
-
B.-Y. Gao, Y. Qian, Z.-Z. Wu, and F. K. Soong, “Duration refinement by jointly optimizing state and longer unit likelihood,” in Proc. Interspeech, 2008, pp. 2266–2269.
-
(2008)
Proc. Interspeech
, pp. 2266-2269
-
-
Gao, B.-Y.1
Qian, Y.2
Wu, Z.-Z.3
Soong, F.K.4
-
23
-
-
51449117929
-
Modelling and synthesising F0 contours with the discrete cosine transform
-
J. Teutenberg, C. Watson, and P. Riddle, “Modelling and synthesising F0 contours with the discrete cosine transform,” in Proc. ICASSP, 2008, pp. 3973–3976.
-
(2008)
Proc. ICASSP
, pp. 3973-3976
-
-
Teutenberg, J.1
Watson, C.2
Riddle, P.3
-
24
-
-
60849112575
-
Modeling and generating tone contour with phrase intonation for Mandarin Chinese speech
-
Z.-Z. Wu, Y. Qian, F. K. Soong, and B. Zhang, “Modeling and generating tone contour with phrase intonation for Mandarin Chinese speech,” in Proc. ISCSLP, 2008, pp. 121–124.
-
(2008)
Proc. ISCSLP
, pp. 121-124
-
-
Wu, Z.-Z.1
Qian, Y.2
Soong, F.K.3
Zhang, B.4
-
25
-
-
85093445139
-
Duration modeling for HMM-based speech synthesis
-
T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, “Duration modeling for HMM-based speech synthesis,” in Proc. ICSLP, 1998, pp. 29–32.
-
(1998)
Proc. ICSLP
, pp. 29-32
-
-
Yoshimura, T.1
Tokuda, K.2
Masuko, T.3
Kobayashi, T.4
Kitamura, T.5
-
26
-
-
0036522887
-
Multi-space probability distribution HMM
-
K. Tokuda, T. Masuko, N. Miyazaki, and T. Kobayashi, “Multi-space probability distribution HMM,” IEICE Trans. Inf. Syst., vol. E85-D, no. 3, pp. 455–464, 2002.
-
(2002)
IEICE Trans. Inf. Syst.
, vol.E85-D
, Issue.3
, pp. 455-464
-
-
Tokuda, K.1
Masuko, T.2
Miyazaki, N.3
Kobayashi, T.4
-
28
-
-
0032673049
-
Restructuring speech representations using pitch-adaptive time-frequency smoothing and instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
-
H. Kawahara, I. Masuda-Katsuse, and A. de Cheveigné, “Restructuring speech representations using pitch-adaptive time-frequency smoothing and instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds,” Speech Commun., vol. 27, no. 3–4, pp. 187–207, 1999.
-
(1999)
Speech Commun.
, vol.27
, Issue.3-4
, pp. 187-207
-
-
Kawahara, H.1
Masuda-Katsuse, I.2
de Cheveigné, A.3
-
29
-
-
0001455934
-
A robust algorithm for pitch tracking (RAPT)
-
Amsterdam, The Netherlands: Elsevier
-
A. D. Talkin, “A robust algorithm for pitch tracking (RAPT),” in Speech Coding and Synthesis. Amsterdam, The Netherlands: Elsevier, 1995.
-
(1995)
Speech Coding and Synthesis
-
-
Talkin, A.D.1
-
30
-
-
0033906251
-
MDL-based context-dependent subword modeling for speech recognition
-
K. Shinoda and T. Watanabe, “MDL-based context-dependent subword modeling for speech recognition,” J. Acoust. Soc. Jpn. (E), vol. 21, no. 2, pp. 79–86, 2000.
-
(2000)
J. Acoust. Soc. Jpn(E)
, vol.21
, Issue.2
, pp. 79-86
-
-
Shinoda, K.1
Watanabe, T.2
-
31
-
-
67650851754
-
USTC system for Blizzard Challenge 2006: An improved HMM-based speech synthesis method
-
Z.-H. Ling, Y.-J. Wu, Y.-P. Wang, L. Qin, and R.-H. Wang, “USTC system for Blizzard Challenge 2006: An improved HMM-based speech synthesis method,” in Proc. Blizzard Challenge 2006 Workshop, 2006.
-
(2006)
Proc. Blizzard Challenge 2006 Workshop
-
-
Ling, Z.-H.1
Wu, Y.-J.2
Wang, Y.-P.3
Qin, L.4
Wang, R.-H.5