-
1
-
-
85009139544
-
Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis
-
T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis," in Proc. Eurospeech, 1999, pp. 2347-2350.
-
(1999)
Proc. Eurospeech
, pp. 2347-2350
-
-
Yoshimura, T.1
Tokuda, K.2
Masuko, T.3
Kobayashi, T.4
Kitamura, T.5
-
2
-
-
84890490547
-
Statistical parametric speech synthesis using deep neural networks
-
H. Zen, A. Senior, and M. Schuster, "Statistical parametric speech synthesis using deep neural networks," in Proc. ICASSP, 2013, pp. 7962-7966.
-
(2013)
Proc. ICASSP
, pp. 7962-7966
-
-
Zen, H.1
Senior, A.2
Schuster, M.3
-
3
-
-
0029765811
-
Unit selection in a concatenative speech synthesis system using a large speech database
-
A. Hunt and A. Black, "Unit selection in a concatenative speech synthesis system using a large speech database," in Proc. ICASSP, 1996, pp. 373-376.
-
(1996)
Proc. ICASSP
, pp. 373-376
-
-
Hunt, A.1
Black, A.2
-
4
-
-
0034842740
-
Adaptation of pitch and spectrum for HMM-based speech synthesis using MLLR
-
M. Tamura, T. Masuko, K. Tokuda, and T. Kobayashi, "Adaptation of pitch and spectrum for HMM-based speech synthesis using MLLR," in Proc. ICASSP, 2001, pp. 805-808.
-
(2001)
Proc. ICASSP
, pp. 805-808
-
-
Tamura, M.1
Masuko, T.2
Tokuda, K.3
Kobayashi, T.4
-
5
-
-
51449114529
-
A style control technique for HMM-based expressive speech synthesis
-
T. Nose, J. Yamagishi, T. Masuko, and T. Kobayashi, "A style control technique for HMM-based expressive speech synthesis," IEICE Trans. Inf. Syst., Vol. E90-D, no. 2, pp. 533-543, 2007.
-
(2007)
IEICE Trans. Inf. Syst.
, vol.E90-D
, Issue.2
, pp. 533-543
-
-
Nose, T.1
Yamagishi, J.2
Masuko, T.3
Kobayashi, T.4
-
6
-
-
79959839868
-
Quantized HMMs for low footprint text-to-speech synthesis
-
A. Gutkin, X. Gonzalvo, S. Breuer, and P. Taylor, "Quantized HMMs for low footprint text-to-speech synthesis," in Proc. Interspeech, 2010.
-
(2010)
Proc. Interspeech
-
-
Gutkin, A.1
Gonzalvo, X.2
Breuer, S.3
Taylor, P.4
-
7
-
-
67651002140
-
Statistical parametric speech synthesis
-
H. Zen, K. Tokuda, and A. Black, "Statistical parametric speech synthesis," Speech Commn., Vol. 51, no. 11, pp. 1039-1064, 2009.
-
(2009)
Speech Commn.
, vol.51
, Issue.11
, pp. 1039-1064
-
-
Zen, H.1
Tokuda, K.2
Black, A.3
-
8
-
-
0028996842
-
CELP coding based on mel-cepstral analysis
-
K. Koishida, K. Tokuda, T. Kobayashi, and S. Imai, "CELP coding based on mel-cepstral analysis," in Proc. ICASSP, 1995, pp. 33-36.
-
(1995)
Proc. ICASSP
, pp. 33-36
-
-
Koishida, K.1
Tokuda, K.2
Kobayashi, T.3
Imai, S.4
-
9
-
-
27144515530
-
Incorporating a mixed excitation model and postfilter into HMM-based text-to-speech synthesis
-
T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "Incorporating a mixed excitation model and postfilter into HMM-based text-to-speech synthesis," Syst. Comput. Jpn., Vol. 36, no. 12, pp. 43-50, 2005.
-
(2005)
Syst. Comput. Jpn.
, vol.36
, Issue.12
, pp. 43-50
-
-
Yoshimura, T.1
Tokuda, K.2
Masuko, T.3
Kobayashi, T.4
Kitamura, T.5
-
10
-
-
67650851754
-
USTC system for Blizzard Challenge 2006: an improved HMM-based speech synthesis method
-
Z. Ling, Y. Wu, Y. Wang, L. Qin, and R. Wang, "USTC system for Blizzard Challenge 2006: an improved HMM-based speech synthesis method," in Proc. Blizzard Challenge Workshop, 2006.
-
(2006)
Proc. Blizzard Challenge Workshop
-
-
Ling, Z.1
Wu, Y.2
Wang, Y.3
Qin, L.4
Wang, R.5
-
11
-
-
38549096029
-
A speech parameter generation algorithm considering global variance for HMM-based speech synthesis
-
T. Toda and K. Tokuda, "A speech parameter generation algorithm considering global variance for HMM-based speech synthesis," IEICE Trans. Inf. Syst., Vol. E90-D, no. 5, pp. 816-824, 2007.
-
(2007)
IEICE Trans. Inf. Syst.
, vol.E90-D
, Issue.5
, pp. 816-824
-
-
Toda, T.1
Tokuda, K.2
-
12
-
-
84878384520
-
Ways to implement global variance in statistical speech synthesis
-
H. Silén, E. Helander, J. Nurminen, and M. Gabbouj, "Ways to implement global variance in statistical speech synthesis," in Proc. Interspeech, 2012, pp. 1436-1439.
-
(2012)
Proc. Interspeech
, pp. 1436-1439
-
-
Silén, H.1
Helander, E.2
Nurminen, J.3
Gabbouj, M.4
-
13
-
-
84905234422
-
A postfilter to modify the modulation spectrum in HMM-based speech synthesis
-
S. Takamichi, T. Toda, G. Neubig, S. Sakti, and S. Nakamura, "A postfilter to modify the modulation spectrum in HMM-based speech synthesis," in Proc. ICASSP, 2014, pp. 290-294.
-
(2014)
Proc. ICASSP
, pp. 290-294
-
-
Takamichi, S.1
Toda, T.2
Neubig, G.3
Sakti, S.4
Nakamura, S.5
-
14
-
-
84910100893
-
DNN-based stochastic postfilter for HMM-based speech synthesis
-
L.-H. Chen, T. Raitio, C. Valentini-Botinhao, J. Yamagishi, and Z.-H. Ling, "DNN-based stochastic postfilter for HMM-based speech synthesis," in Proc. Interspeech, 2014, pp. 1954-1958.
-
(2014)
Proc. Interspeech
, pp. 1954-1958
-
-
Chen, L.-H.1
Raitio, T.2
Valentini-Botinhao, C.3
Yamagishi, J.4
Ling, Z.-H.5
-
15
-
-
84942607168
-
A deep generative architecture for postfiltering in statistical parametric speech synthesis
-
L.-H. Chen, T. Raitio, C. Valentini-Botinhao, Z.-H. Ling, and J. Yamagishi, "A deep generative architecture for postfiltering in statistical parametric speech synthesis," IEEE Trans. Audio Speech and Lang. Process., Vol. 23, no. 11, pp. 2003-2014, 2015.
-
(2015)
IEEE Trans. Audio Speech and Lang. Process.
, vol.23
, Issue.11
, pp. 2003-2014
-
-
Chen, L.-H.1
Raitio, T.2
Valentini-Botinhao, C.3
Ling, Z.-H.4
Yamagishi, J.5
-
16
-
-
84946074523
-
The effect of neural networks in statistical parametric speech synthesis
-
K. Hashimoto, K. Oura, Y. Nankaku, and K. Tokuda, "The effect of neural networks in statistical parametric speech synthesis," in Proc. ICASSP, 2015, pp. 4455-4459.
-
(2015)
Proc. ICASSP
, pp. 4455-4459
-
-
Hashimoto, K.1
Oura, K.2
Nankaku, Y.3
Tokuda, K.4
-
17
-
-
84937849144
-
Generative adversarial nets
-
I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, "Generative adversarial nets," in Proc. NIPS, 2014, pp. 2672-2680.
-
(2014)
Proc. NIPS
, pp. 2672-2680
-
-
Goodfellow, I.1
Pouget-Abadie, J.2
Mirza, M.3
Xu, B.4
Warde-Farley, D.5
Ozair, S.6
Courville, A.7
Bengio, Y.8
-
18
-
-
84965143571
-
Deep generative image models using a Laplacian pyramid of adversarial networks
-
E. Denton, S. Chintala, A. Szlam, and R. Fergus, "Deep generative image models using a Laplacian pyramid of adversarial networks," in Proc. NIPS, 2015, pp. 1486-1494.
-
(2015)
Proc. NIPS
, pp. 1486-1494
-
-
Denton, E.1
Chintala, S.2
Szlam, A.3
Fergus, R.4
-
19
-
-
85083950271
-
Unsupervised representation learning with deep convolutional generative adversarial networks
-
A. Radford, L. Metz, and S. Chintala, "Unsupervised representation learning with deep convolutional generative adversarial networks," in Proc. ICLR, 2016.
-
(2016)
Proc. ICLR
-
-
Radford, A.1
Metz, L.2
Chintala, S.3
-
20
-
-
84999041243
-
Autoencoding beyond pixels using a learned similarity metric
-
A. B. L. Larsen, S. K. Sønderby, and O. Winther, "Autoencoding beyond pixels using a learned similarity metric," in Proc. ICML, 2016.
-
(2016)
Proc. ICML
-
-
Larsen, A.B.L.1
Sønderby, S.K.2
Winther, O.3
-
22
-
-
84986274465
-
Deep residual learning for image recognition
-
K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proc. CVPR, 2016.
-
(2016)
Proc. CVPR
-
-
He, K.1
Zhang, X.2
Ren, S.3
Sun, J.4
-
23
-
-
84945230598
-
Fully convolutional networks for semantic segmentation
-
J. Long, E. Shelhamer, and T. Darrell, "Fully convolutional networks for semantic segmentation," in Proc. CVPR, 2015, pp. 3431-3440.
-
(2015)
Proc. CVPR
, pp. 3431-3440
-
-
Long, J.1
Shelhamer, E.2
Darrell, T.3
-
24
-
-
0032673049
-
Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
-
H. Kawahara, I. Masuda-Katsuse, and A. de Cheveigné, "Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds," Speech Commn., Vol. 27, no. 3, pp. 187-207, 1999.
-
(1999)
Speech Commn.
, vol.27
, Issue.3
, pp. 187-207
-
-
Kawahara, H.1
Masuda-Katsuse, I.2
De Cheveigné, A.3
-
25
-
-
77956509090
-
Rectified linear units improve restricted Boltzmann machines
-
V. Nair and G. E. Hinton, "Rectified linear units improve restricted Boltzmann machines," in Proc. ICML, 2010, pp. 807-814.
-
(2010)
Proc. ICML
, pp. 807-814
-
-
Nair, V.1
Hinton, G.E.2
-
26
-
-
84893676344
-
Rectifier nonlinearities improve neural network acoustic models
-
A. Maas, A. Y. Hannun, and A. Y. Ng, "Rectifier nonlinearities improve neural network acoustic models," in Proc. ICML, 2013.
-
(2013)
Proc. ICML
-
-
Maas, A.1
Hannun, A.Y.2
Ng, A.Y.3
-
28
-
-
84969991188
-
Adam: A method for stochastic optimization
-
D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," in Proc. ICLR, 2015.
-
(2015)
Proc. ICLR
-
-
Kingma, D.P.1
Ba, J.2
-
29
-
-
85083950260
-
A note on the evaluation of generative models
-
L. Theis, A. van den Oord, and M. Bethge, "A note on the evaluation of generative models," in Proc. ICLR, 2016.
-
(2016)
Proc. ICLR
-
-
Theis, L.1
Van den Oord, A.2
Bethge, M.3
-
30
-
-
84910047819
-
TTS synthesis with bidirectional LSTM based recurrent neural networks
-
Y. Fan, Y. Qian, F.-L. Xie, and F. K. Soong, "TTS synthesis with bidirectional LSTM based recurrent neural networks," in Proc. Interspeech, 2014, pp. 1964-1968.
-
(2014)
Proc. Interspeech
, pp. 1964-1968
-
-
Fan, Y.1
Qian, Y.2
Xie, F.-L.3
Soong, F.K.4