-
1
-
-
84910105608
-
Measuring a decade of progress in text-to-speech
-
S. King, "Measuring a decade of progress in text-to-speech," Loquens, vol. 1, no. 1, 2014.
-
(2014)
Loquens
, vol.1
, Issue.1
-
-
King, S.1
-
2
-
-
85133720638
-
The HMM-based speech synthesis system (HTS) version 2.0
-
H. Zen, T. Nose, J. Yamagishi, S. Sako, T. Masuko, A. Black, and K. Tokuda, "The HMM-based speech synthesis system (HTS) version 2.0," in Proc. SSW, vol. 6, 2007, pp. 294-299.
-
(2007)
Proc. SSW
, vol.6
, pp. 294-299
-
-
Zen, H.1
Nose, T.2
Yamagishi, J.3
Sako, S.4
Masuko, T.5
Black, A.6
Tokuda, K.7
-
3
-
-
0033708106
-
Speech parameter generation algorithms for HMM-based speech synthesis
-
K. Tokuda, T. Yoshimura, T. Masuko, T. Kobayashi, and T. Kitamura, "Speech parameter generation algorithms for HMM-based speech synthesis," in Proc. ICASSP, 2000, pp. 1315-1318.
-
(2000)
Proc. ICASSP
, pp. 1315-1318
-
-
Tokuda, K.1
Yoshimura, T.2
Masuko, T.3
Kobayashi, T.4
Kitamura, T.5
-
4
-
-
85009139544
-
Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis
-
T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis," in Proc. Eurospeech, 1999, pp. 2347-2350.
-
(1999)
Proc. Eurospeech
, pp. 2347-2350
-
-
Yoshimura, T.1
Tokuda, K.2
Masuko, T.3
Kobayashi, T.4
Kitamura, T.5
-
5
-
-
85008023596
-
Continuous F0 modeling for HMM based statistical parametric speech synthesis
-
K. Yu and S. Young, "Continuous F0 modeling for HMM based statistical parametric speech synthesis," IEEE T. Audio Speech, vol. 19, no. 5, pp. 1071-1079, 2011.
-
(2011)
IEEE T. Audio Speech
, vol.19
, Issue.5
, pp. 1071-1079
-
-
Yu, K.1
Young, S.2
-
6
-
-
84890490547
-
Statistical parametric speech synthesis using deep neural networks
-
H. Zen, A. Senior, and M. Schuster, "Statistical parametric speech synthesis using deep neural networks," in Proc. ICASSP, 2013, pp. 7962-7966.
-
(2013)
Proc. ICASSP
, pp. 7962-7966
-
-
Zen, H.1
Senior, A.2
Schuster, M.3
-
7
-
-
84973359646
-
From HMMs to DNNs: Where do the improvements come from?
-
O. Watts, G. E. Henter, T. Merritt, Z. Wu, and S. King, "From HMMs to DNNs: where do the improvements come from?" in Proc. ICASSP, 2016, pp. 5505-5509.
-
(2016)
Proc. ICASSP
, pp. 5505-5509
-
-
Watts, O.1
Henter, G.E.2
Merritt, T.3
Wu, Z.4
King, S.5
-
8
-
-
84946045510
-
Unidirectional long short-term memory recurrent neural network with recurrent output layer for low-latency speech synthesis
-
H. Zen and H. Sak, "Unidirectional long short-term memory recurrent neural network with recurrent output layer for low-latency speech synthesis," in Proc. ICASSP, 2015, pp. 4470-4474.
-
(2015)
Proc. ICASSP
, pp. 4470-4474
-
-
Zen, H.1
Sak, H.2
-
9
-
-
84973355618
-
Investigating gated recurrent networks for speech synthesis
-
Z. Wu and S. King, "Investigating gated recurrent networks for speech synthesis," in Proc. ICASSP, 2016, pp. 5140-5144.
-
(2016)
Proc. ICASSP
, pp. 5140-5144
-
-
Wu, Z.1
King, S.2
-
10
-
-
84910068142
-
Prosody contour prediction with long short-term memory, bidirectional, deep recurrent neural networks
-
R. Fernandez, A. Rendel, B. Ramabhadran, and R. Hoory, "Prosody contour prediction with long short-term memory, bidirectional, deep recurrent neural networks," in Proc. Interspeech, 2014, pp. 2268-2272.
-
(2014)
Proc. Interspeech
, pp. 2268-2272
-
-
Fernandez, R.1
Rendel, A.2
Ramabhadran, B.3
Hoory, R.4
-
11
-
-
84950159800
-
Modeling F0 trajectories in hierarchically structured deep neural networks
-
X. Yin, M. Lei, Y. Qian, F. K. Soong, L. He, Z.-H. Ling, and L.-R. Dai, "Modeling F0 trajectories in hierarchically structured deep neural networks," Speech Commun., vol. 76, pp. 82-92, 2016.
-
(2016)
Speech Commun.
, vol.76
, pp. 82-92
-
-
Yin, X.1
Lei, M.2
Qian, Y.3
Soong, F.K.4
He, L.5
Ling, Z.-H.6
Dai, L.-R.7
-
12
-
-
84904608062
-
SLAM: Automatic stylization and labelling of speech melody
-
N. Obin, J. Beliao, C. Veaux, and A. Lacheret, "SLAM: Automatic stylization and labelling of speech melody," in Proc. Speech Prosody, 2014, pp. 246-250.
-
(2014)
Proc. Speech Prosody
, pp. 246-250
-
-
Obin, N.1
Beliao, J.2
Veaux, C.3
Lacheret, A.4
-
13
-
-
84982995064
-
JNDSLAM: A SLAM extension for speech synthesis
-
R. Dall and X. Gonzalvo, "JNDSLAM: A SLAM extension for speech synthesis," in Proc. Speech Prosody, 2016, pp. 1024-1028.
-
(2016)
Proc. Speech Prosody
, pp. 1024-1028
-
-
Dall, R.1
Gonzalvo, X.2
-
14
-
-
84865748446
-
A statistical phrase/accent model for intonation modeling
-
G. K. Anumanchipalli, L. C. Oliveira, and A. W. Black, "A statistical phrase/accent model for intonation modeling," in Proc. Interspeech, 2011, pp. 1813-1816.
-
(2011)
Proc. Interspeech
, pp. 1813-1816
-
-
Anumanchipalli, G.K.1
Oliveira, L.C.2
Black, A.W.3
-
15
-
-
33749259827
-
Connectionist temporal classification: Labelling unsegmented sequence data with recurrent neural networks
-
A. Graves, S. Fernández, F. Gomez, and J. Schmidhuber, "Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks," in Proc. ICML, 2006, pp. 369-376.
-
(2006)
Proc. ICML
, pp. 369-376
-
-
Graves, A.1
Fernández, S.2
Gomez, F.3
Schmidhuber, J.4
-
16
-
-
84867194192
-
Multilevel parametric-base F0 model for speech synthesis
-
J. Latorre and M. Akamine, "Multilevel parametric-base F0 model for speech synthesis," in Proc. Interspeech, 2008, pp. 2274-2277.
-
(2008)
Proc. Interspeech
, pp. 2274-2277
-
-
Latorre, J.1
Akamine, M.2
-
17
-
-
51449117929
-
Modelling and synthesising F0 contours with the discrete cosine transform
-
J. Teutenberg, C. Watson, and P. Riddle, "Modelling and synthesising F0 contours with the discrete cosine transform," in Proc. ICASSP, 2008, pp. 3973-3976.
-
(2008)
Proc. ICASSP
, pp. 3973-3976
-
-
Teutenberg, J.1
Watson, C.2
Riddle, P.3
-
18
-
-
85008039410
-
Improved prosody generation by maximizing joint probability of state and longer units
-
Y. Qian, Z. Wu, B. Gao, and F. K. Soong, "Improved prosody generation by maximizing joint probability of state and longer units," IEEE T. Audio Speech, vol. 19, no. 6, pp. 1702-1710, 2011.
-
(2011)
IEEE T. Audio Speech
, vol.19
, Issue.6
, pp. 1702-1710
-
-
Qian, Y.1
Wu, Z.2
Gao, B.3
Soong, F.K.4
-
19
-
-
84946045633
-
Wavelets for intonation modeling in HMM speech synthesis
-
A. Suni, D. Aalto, T. Raitio, P. Alku, and M. Vainio, "Wavelets for intonation modeling in HMM speech synthesis," in Proc. SSW, vol. 8, 2013, pp. 285-290.
-
(2013)
Proc. SSW
, vol.8
, pp. 285-290
-
-
Suni, A.1
Aalto, D.2
Raitio, T.3
Alku, P.4
Vainio, M.5
-
20
-
-
84946044619
-
A multi-level representation of f0 using the continuous wavelet transform and the discrete cosine transform
-
M. S. Ribeiro and R. A. J. Clark, "A multi-level representation of f0 using the continuous wavelet transform and the discrete cosine transform," in Proc. ICASSP, 2015, pp. 4909-4913.
-
(2015)
Proc. ICASSP
, pp. 4909-4913
-
-
Ribeiro, M.S.1
Clark, R.A.J.2
-
21
-
-
84910044428
-
Modeling DCT parameterized F0 trajectory at intonation phrase level with DNN or decision tree
-
X. Yin, M. Lei, Y. Qian, F. K. Soong, L. He, Z.-H. Ling, and L.-R. Dai, "Modeling DCT parameterized F0 trajectory at intonation phrase level with DNN or decision tree," in Proc. Interspeech, 2014, pp. 2273-2277.
-
(2014)
Proc. Interspeech
, pp. 2273-2277
-
-
Yin, X.1
Lei, M.2
Qian, Y.3
Soong, F.K.4
He, L.5
Ling, Z.-H.6
Dai, L.-R.7
-
22
-
-
0001481529
-
Bark and ERB bilinear transforms
-
J. O. Smith III and J. S. Abel, "Bark and ERB bilinear transforms," IEEE T. Speech Audi. P., vol. 7, no. 6, pp. 697-708, 1999.
-
(1999)
IEEE T. Speech Audi. P.
, vol.7
, Issue.6
, pp. 697-708
-
-
Smith, J.O.1
Abel, J.S.2
-
23
-
-
0014129195
-
Hierarchical clustering schemes
-
S. C. Johnson, "Hierarchical clustering schemes," Psychometrika, vol. 32, no. 3, pp. 241-254, 1967.
-
(1967)
Psychometrika
, vol.32
, Issue.3
, pp. 241-254
-
-
Johnson, S.C.1
-
24
-
-
33947674781
-
Sub-phonetic modeling for capturing pronunciation variations for conversational speech synthesis
-
K. Prahallad, A. W. Black, and R. Mosur, "Sub-phonetic modeling for capturing pronunciation variations for conversational speech synthesis," in Proc. ICASSP, 2006, pp. I-853-I-856.
-
(2006)
Proc. ICASSP
, pp. I-853-I-856
-
-
Prahallad, K.1
Black, A.W.2
Mosur, R.3
-
25
-
-
84946033275
-
Deep neural networks employing multi-task learning and stacked bottleneck features for speech synthesis
-
Z. Wu, C. Valentini-Botinhao, O. Watts, and S. King, "Deep neural networks employing multi-task learning and stacked bottleneck features for speech synthesis," in Proc. ICASSP, 2015, pp. 4460-4464.
-
(2015)
Proc. ICASSP
, pp. 4460-4464
-
-
Wu, Z.1
Valentini-Botinhao, C.2
Watts, O.3
King, S.4
-
26
-
-
33750915991
-
STRAIGHT, exploitation of the other aspect of VOCODER: Perceptually isomorphic decomposition of speech sounds
-
H. Kawahara, "STRAIGHT, exploitation of the other aspect of VOCODER: Perceptually isomorphic decomposition of speech sounds," Acoust. Sci. Technol., vol. 27, no. 6, pp. 349-353, 2006.
-
(2006)
Acoust. Sci. Technol.
, vol.27
, Issue.6
, pp. 349-353
-
-
Kawahara, H.1
-
27
-
-
84994311603
-
-
Objective measurement of active speech level
-
Objective measurement of active speech level, ITU Recommendation ITU-T P.56, International Telecommunication Union, Telecommunication Standardization Sector, Geneva, Switzerland, March 2011.
-
(2011)
ITU Recommendation ITU-T P.56, International Telecommunication Union, Telecommunication Standardization Sector, Geneva, Switzerland, March
-
-
-
28
-
-
13344250603
-
-
Method for the subjective assessment of intermediate quality levels of coding systems
-
Method for the subjective assessment of intermediate quality levels of coding systems, ITU Recommendation ITU-R BS.1534-3, International Telecommunication Union Radiocommunication Sector, Geneva, Switzerland, October 2015.
-
(2015)
ITU Recommendation ITU-R BS.1534-3, International Telecommunication Union Radiocommunication Sector, Geneva, Switzerland, October
-
-
-
29
-
-
84994267677
-
A simple sequentially rejective multiple test procedure
-
S. Holm, "A simple sequentially rejective multiple test procedure," Scand. J. Stat., vol. 6, no. 2, pp. 65-70, 1979.
-
(1979)
Scand. J. Stat.
, vol.6
, Issue.2
, pp. 65-70
-
-
Holm, S.1