-
1
-
-
67651002140
-
Statistical parametric speech synthesis
-
H. Zen, K. Tokuda, and A. W. Black, "Statistical parametric speech synthesis," Speech Communication, vol. 51, no. 11, pp. 1039-1064, 2009.
-
(2009)
Speech Communication
, vol.51
, Issue.11
, pp. 1039-1064
-
-
Zen, H.1
Tokuda, K.2
Black, A.W.3
-
3
-
-
0029288633
-
Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
-
C. J. Leggetter and P. C. Woodland, "Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models," Computer Speech & Language, vol. 9, no. 2, pp. 171-185, 1995.
-
(1995)
Computer Speech & Language
, vol.9
, Issue.2
, pp. 171-185
-
-
Leggetter, C.J.1
Woodland, P.C.2
-
4
-
-
0028419019
-
Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains
-
J.-L. Gauvain and C.-H. Lee, "Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains," IEEE Trans. on Speech and Audio Processing, vol. 2, no. 2, pp. 291-298, 1994.
-
(1994)
IEEE Trans. on Speech and Audio Processing
, vol.2
, Issue.2
, pp. 291-298
-
-
Gauvain, J.-L.1
Lee, C.-H.2
-
5
-
-
67650854725
-
Analysis of speaker adaptation algorithms for HMM-based speech synthesis and a constrained SMAPLR adaptation algorithm
-
J. Yamagishi, T. Kobayashi, Y. Nakano, K. Ogata, and J. Isogai, "Analysis of speaker adaptation algorithms for HMM-based speech synthesis and a constrained SMAPLR adaptation algorithm," IEEE Trans. Audio, Speech and Language Processing, vol. 17, no. 1, pp. 66-83, 2009.
-
(2009)
IEEE Trans. Audio, Speech and Language Processing
, vol.17
, Issue.1
, pp. 66-83
-
-
Yamagishi, J.1
Kobayashi, T.2
Nakano, Y.3
Ogata, K.4
Isogai, J.5
-
6
-
-
85032751458
-
Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups
-
G. Hinton, L. Deng, D. Yu, G. Dahl, A. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. Sainath, and B. Kingsbury, "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups," IEEE Signal Processing Magazine, vol. 29, no. 6, pp. 82-97, 2012.
-
(2012)
IEEE Signal Processing Magazine
, vol.29
, Issue.6
, pp. 82-97
-
-
Hinton, G.1
Deng, L.2
Yu, D.3
Dahl, G.4
Mohamed, A.5
Jaitly, N.6
Senior, A.7
Vanhoucke, V.8
Nguyen, P.9
Sainath, T.10
Kingsbury, B.11
-
7
-
-
84890490547
-
Statistical parametric speech synthesis using deep neural networks
-
H. Zen, A. Senior, and M. Schuster, "Statistical parametric speech synthesis using deep neural networks," in Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), 2013.
-
(2013)
Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP)
-
-
Zen, H.1
Senior, A.2
Schuster, M.3
-
8
-
-
84901237776
-
Modeling spectral envelopes using restricted Boltzmann machines and deep belief networks for statistical parametric speech synthesis
-
Z.-H. Ling, L. Deng, and D. Yu, "Modeling spectral envelopes using Restricted Boltzmann Machines and Deep Belief Networks for statistical parametric speech synthesis," IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 10, pp. 2129-2139, 2013.
-
(2013)
IEEE Transactions on Audio, Speech, and Language Processing
, vol.21
, Issue.10
, pp. 2129-2139
-
-
Ling, Z.-H.1
Deng, L.2
Yu, D.3
-
9
-
-
84890527090
-
Multi-distribution deep belief network for speech synthesis
-
S. Kang, X. Qian, and H. Meng, "Multi-distribution deep belief network for speech synthesis," in Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), 2013.
-
(2013)
Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP)
-
-
Kang, S.1
Qian, X.2
Meng, H.3
-
10
-
-
84905251808
-
On the training aspects of deep neural network (DNN) for parametric TTS synthesis
-
Y. Qian, Y. Fan, W. Hu, and F. K. Soong, "On the training aspects of deep neural network (DNN) for parametric TTS synthesis," in Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), 2014.
-
(2014)
Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP)
-
-
Qian, Y.1
Fan, Y.2
Hu, W.3
Soong, F.K.4
-
12
-
-
84910047819
-
TTS synthesis with bidirectional LSTM based recurrent neural networks
-
Y. Fan, Y. Qian, F. Xie, and F. K. Soong, "TTS synthesis with bidirectional LSTM based recurrent neural networks," in Proc. Interspeech, 2014.
-
(2014)
Proc. Interspeech
-
-
Fan, Y.1
Qian, Y.2
Xie, F.3
Soong, F.K.4
-
13
-
-
84946033275
-
Deep neural networks employing multi-task learning and stacked bottleneck features for speech synthesis
-
Z. Wu, C. Valentini-Botinhao, O. Watts, and S. King, "Deep neural networks employing multi-task learning and stacked bottleneck features for speech synthesis," in Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), 2015.
-
(2015)
Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP)
-
-
Wu, Z.1
Valentini-Botinhao, C.2
Watts, O.3
King, S.4
-
14
-
-
84946036894
-
Modelling acoustic feature dependencies with artificial neural networks: Trajectory-RNADE
-
B. Uriá, I. Murray, S. Renals, and C. Valentini-Botinhao, "Modelling acoustic feature dependencies with artificial neural networks: Trajectory-RNADE," in Proc IEEE ICASSP, 2015.
-
(2015)
Proc IEEE ICASSP
-
-
Uriá, B.1
Murray, I.2
Renals, S.3
Valentini-Botinhao, C.4
-
15
-
-
84890542079
-
KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition
-
D. Yu, K. Yao, H. Su, G. Li, and F. Seide, "KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition," in Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), 2013.
-
(2013)
Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP)
-
-
Yu, D.1
Yao, K.2
Su, H.3
Li, G.4
Seide, F.5
-
16
-
-
84893691530
-
Speaker adaptation of neural network acoustic models using I-vectors
-
G. Saon, H. Soltau, D. Nahamoo, and M. Picheny, "Speaker adaptation of neural network acoustic models using I-vectors," in Proc IEEE ASRU, 2013, pp. 55-59.
-
(2013)
Proc IEEE ASRU
, pp. 55-59
-
-
Saon, G.1
Soltau, H.2
Nahamoo, D.3
Picheny, M.4
-
17
-
-
84910068089
-
Adaptation of deep neural network acoustic models using factorised i-vectors
-
P. Karanasou, Y. Wang, M. J. Gales, and P. C. Woodland, "Adaptation of deep neural network acoustic models using factorised i-vectors," in Proc. Interspeech, 2014.
-
(2014)
Proc. Interspeech
-
-
Karanasou, P.1
Wang, Y.2
Gales, M.J.3
Woodland, P.C.4
-
18
-
-
84921731072
-
Speaker adaptation of deep neural network based on discriminant codes
-
S. Xue, O. Abdel-Hamid, H. Jiang, L. Dai, and Q. Liu, "Speaker adaptation of deep neural network based on discriminant codes," IEEE Trans. Audio, Speech and Language Processing, vol. 22, no. 12, pp. 1713-1725, 2014.
-
(2014)
IEEE Trans. Audio, Speech and Language Processing
, vol.22
, Issue.12
, pp. 1713-1725
-
-
Xue, S.1
Abdel-Hamid, O.2
Jiang, H.3
Dai, L.4
Liu, Q.5
-
19
-
-
84905259145
-
I-vector-based speaker adaptation of deep neural networks for French broadcast audio transcription
-
V. Gupta, P. Kenny, P. Ouellet, and T. Stafylakis, "I-vector-based speaker adaptation of deep neural networks for French broadcast audio transcription," in Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), 2014.
-
(2014)
Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP)
-
-
Gupta, V.1
Kenny, P.2
Ouellet, P.3
Stafylakis, T.4
-
20
-
-
84983119674
-
Learning hidden unit contributions for unsupervised speaker adaptation of neural network acoustic models
-
P. Swietojanski and S. Renals, "Learning hidden unit contributions for unsupervised speaker adaptation of neural network acoustic models," in Proc. IEEE Spoken Language Technology Workshop, 2014.
-
(2014)
Proc. IEEE Spoken Language Technology Workshop
-
-
Swietojanski, P.1
Renals, S.2
-
21
-
-
84959166808
-
Preliminary work on speaker adaptation for DNN-based speech synthesis
-
Tech. Rep.
-
B. Potard, P. Motlicek, and D. Imseng, "Preliminary work on speaker adaptation for DNN-based speech synthesis," Idiap, Tech. Rep., 2015.
-
(2015)
Idiap
-
-
Potard, B.1
Motlicek, P.2
Imseng, D.3
-
22
-
-
79951609039
-
Front-end factor analysis for speaker verification
-
N. Dehak, P. Kenny, R. Dehak, P. Dumouchel, and P. Ouellet, "Front-end factor analysis for speaker verification," IEEE Trans. Audio, Speech and Language Processing, vol. 19, no. 4, pp. 788-798, 2011.
-
(2011)
IEEE Trans. Audio, Speech and Language Processing
, vol.19
, Issue.4
, pp. 788-798
-
-
Dehak, N.1
Kenny, P.2
Dehak, R.3
Dumouchel, P.4
Ouellet, P.5
-
23
-
-
84865733857
-
Analysis of i-vector length normalization in speaker recognition systems
-
D. Garcia-Romero and C. Y. Espy-Wilson, "Analysis of i-vector length normalization in speaker recognition systems," in Proc. Interspeech, 2011.
-
(2011)
Proc. Interspeech
-
-
Garcia-Romero, D.1
Espy-Wilson, C.Y.2
-
24
-
-
84896111913
-
ALIZE 3.0 - open source toolkit for state-of-the-art speaker recognition
-
A. Larcher, J.-F. Bonastre, B. G. Fauve, K.-A. Lee, C. Lévy, H. Li, J. S. Mason, and J.-Y. Parfait, "ALIZE 3.0 - open source toolkit for state-of-the-art speaker recognition," in Proc. Interspeech, 2013.
-
(2013)
Proc. Interspeech
-
-
Larcher, A.1
Bonastre, J.-F.2
Fauve, B.G.3
Lee, K.-A.4
Lévy, C.5
Li, H.6
Mason, J.S.7
Parfait, J.-Y.8
-
25
-
-
84946032695
-
Differentiable pooling for unsupervised speaker adaptation
-
P. Swietojanski and S. Renals, "Differentiable pooling for unsupervised speaker adaptation," in Proc. ICASSP, 2015.
-
(2015)
Proc. ICASSP
-
-
Swietojanski, P.1
Renals, S.2
-
26
-
-
84906225505
-
Rapid and effective speaker adaptation of convolutional neural network based models for speech recognition
-
O. Abdel-Hamid and H. Jiang, "Rapid and effective speaker adaptation of convolutional neural network based models for speech recognition," in Proc. Interspeech. ISCA, pp. 1248-1252.
-
Proc. Interspeech. ISCA
, pp. 1248-1252
-
-
Abdel-Hamid, O.1
Jiang, H.2
-
27
-
-
57749193836
-
Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory
-
T. Toda, A. W. Black, and K. Tokuda, "Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory," IEEE Trans. Audio, Speech and Language Processing, vol. 15, no. 8, pp. 2222-2235, 2007.
-
(2007)
IEEE Trans. Audio, Speech and Language Processing
, vol.15
, Issue.8
, pp. 2222-2235
-
-
Toda, T.1
Black, A.W.2
Tokuda, K.3
-
28
-
-
84894152556
-
The voice bank corpus: Design, collection and data analysis of a large regional accent speech database
-
C. Veaux, J. Yamagishi, and S. King, "The voice bank corpus: Design, collection and data analysis of a large regional accent speech database," in Proc. Int. Conf. Oriental COCOSDA, 2013.
-
(2013)
Proc. Int. Conf. Oriental COCOSDA
-
-
Veaux, C.1
Yamagishi, J.2
King, S.3
-
29
-
-
0032673049
-
Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
-
H. Kawahara, I. Masuda-Katsuse, and A. de Cheveigné, "Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds," Speech Communication, vol. 27, no. 3, pp. 187-207, 1999.
-
(1999)
Speech Communication
, vol.27
, Issue.3
, pp. 187-207
-
-
Kawahara, H.1
Masuda-Katsuse, I.2
De Cheveigné, A.3