-
1
-
-
84890490547
-
Statistical parametric speech synthesis using deep neural networks
-
H. Zen, A. Senior, and M. Schuster, "Statistical parametric speech synthesis using deep neural networks," Proceedings of ICASSP, pp. 7962-7966, 2013
-
(2013)
Proceedings of ICASSP
, pp. 7962-7966
-
-
Zen, H.1
Senior, A.2
Schuster, M.3
-
2
-
-
84901237776
-
Modeling spectral envelopes using restricted Boltzmann machines and deep belief networks for statistical parametric speech synthesis
-
Z.-H. Ling, L. Deng, and D. Yu, "Modeling spectral envelopes using restricted Boltzmann machines and deep belief networks for statistical parametric speech synthesis," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 21, pp. 2129-2139, 2013
-
(2013)
Audio, Speech, and Language Processing, IEEE Transactions on
, vol.21
, pp. 2129-2139
-
-
Ling, Z.-H.1
Deng, L.2
Yu, D.3
-
3
-
-
84910047819
-
TTS synthesis with bidirectional LSTM based recurrent neural networks
-
Y. Fan, Y. Qian, F. Xie, and F. K. Soong, "TTS synthesis with bidirectional LSTM based recurrent neural networks," Proceedings of Interspeech, pp. 1964-1968, 2014
-
(2014)
Proceedings of Interspeech
, pp. 1964-1968
-
-
Fan, Y.1
Qian, Y.2
Xie, F.3
Soong, F.K.4
-
4
-
-
84910068142
-
Prosody contour prediction with long short-term memory, bidirectional, deep recurrent neural networks
-
R. Fernandez, A. Rendel, B. Ramabhadran, and R. Hoory, "Prosody contour prediction with long short-term memory, bidirectional, deep recurrent neural networks," Proceedings of Interspeech, pp. 2268-2272, 2014
-
(2014)
Proceedings of Interspeech
, pp. 2268-2272
-
-
Fernandez, R.1
Rendel, A.2
Ramabhadran, B.3
Hoory, R.4
-
5
-
-
84973331276
-
A function-wise pre-training technique for constructing a deep neural network based spectral model in statistical parametric speech synthesis
-
S. Takaki, W. Zhenzhou, and J. Yamagishi, "A function-wise pre-training technique for constructing a deep neural network based spectral model in statistical parametric speech synthesis," Machine Learning in Spoken Language Processing (MLSLP), 2015
-
(2015)
Machine Learning in Spoken Language Processing (MLSLP)
-
-
Takaki, S.1
Zhenzhou, W.2
Yamagishi, J.3
-
6
-
-
84910100893
-
DNN-based stochastic postfilter for HMM-based speech synthesis
-
L.-H. Chen, T. Raitio, C. Valentini-Botinhao, J. Yamagishi, and Z.-H. Ling, "DNN-based stochastic postfilter for HMM-based speech synthesis," Proceedings of Interspeech, pp. 1954-1958, 2014
-
(2014)
Proceedings of Interspeech
, pp. 1954-1958
-
-
Chen, L.-H.1
Raitio, T.2
Valentini-Botinhao, C.3
Yamagishi, J.4
Ling, Z.-H.5
-
7
-
-
84959098005
-
Multiple feed-forward deep neural networks for statistical parametric speech synthesis
-
S. Takaki, S.-J. Kim, J. Yamagishi, and J.-J. Kim, "Multiple feed-forward deep neural networks for statistical parametric speech synthesis," Proceedings of Interspeech, pp. 2242-2246, 2015
-
(2015)
Proceedings of Interspeech
, pp. 2242-2246
-
-
Takaki, S.1
Kim, S.-J.2
Yamagishi, J.3
Kim, J.-J.4
-
8
-
-
84910065702
-
Acoustic modeling with deep neural networks using raw time signal for LVCSR
-
Z. Tuske, P. Golik, R. Schluter, and H. Ney, "Acoustic modeling with deep neural networks using raw time signal for LVCSR," Proceedings of Interspeech, pp. 890-894, 2014
-
(2014)
Proceedings of Interspeech
, pp. 890-894
-
-
Tuske, Z.1
Golik, P.2
Schluter, R.3
Ney, H.4
-
9
-
-
84973386429
-
Convolutional neural networks-based continuous speech recognition using raw speech signal
-
D. Palaz, M. Magimai-Doss, and R. Collobert, "Convolutional neural networks-based continuous speech recognition using raw speech signal,"
-
Journal
-
-
Palaz, D.1
Magimai-Doss, M.2
Collobert, R.3
-
10
-
-
84959168440
-
Learning the speech front-end with raw waveform CLDNNs
-
T. N. Sainath, R. J. Weiss, A. Senior, K. W. Wilson, and O. Vinyals, "Learning the speech front-end with raw waveform CLDNNs," Proceedings of Interspeech, pp. 1-5, 2015
-
(2015)
Proceedings of Interspeech
, pp. 1-5
-
-
Sainath, T.N.1
Weiss, R.J.2
Senior, A.3
Wilson, K.W.4
Vinyals, O.5
-
11
-
-
0032673049
-
Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
-
H. Kawahara, I. Masuda-Katsuse, and A. Cheveigne, "Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds," Speech Communication, vol. 27, pp. 187-207, 1999
-
(1999)
Speech Communication
, vol.27
, pp. 187-207
-
-
Kawahara, H.1
Masuda-Katsuse, I.2
Cheveigne, A.3
-
13
-
-
84908519225
-
CheapTrick, a spectral envelope estimator for high-quality speech synthesis
-
M. Morise, "CheapTrick, a spectral envelope estimator for high-quality speech synthesis," Speech Communication, vol. 67, pp. 1-7, 2015
-
(2015)
Speech Communication
, vol.67
, pp. 1-7
-
-
Morise, M.1
-
14
-
-
84937621994
-
Error evaluation of an F0-adaptive spectral envelope estimator in robustness against the additive noise and F0 error
-
M. Morise, "Error evaluation of an F0-adaptive spectral envelope estimator in robustness against the additive noise and F0 error," IEICE Transactions on Information and Systems, vol. E98-D, no. 7, pp. 1405-1408, 2015
-
(2015)
IEICE Transactions on Information and Systems
, vol.E98-D
, Issue.7
, pp. 1405-1408
-
-
Morise, M.1
-
15
-
-
84867593213
-
Autoencoder bottleneck features using deep belief networks
-
T. N. Sainath, B. Kingsbury, and B. Ramabhadran, "Autoencoder bottleneck features using deep belief networks," Proceedings of ICASSP, pp. 4153-4156, 2012
-
(2012)
Proceedings of ICASSP
, pp. 4153-4156
-
-
Sainath, T.N.1
Kingsbury, B.2
Ramabhadran, B.3
-
16
-
-
84890482429
-
Extracting deep bottleneck features using stacked auto-encoders
-
J. Gehring, Y. Miao, F. Metze, and A. Waibel, "Extracting deep bottleneck features using stacked auto-encoders," Proceedings of ICASSP, pp. 3377-3381, 2013
-
(2013)
Proceedings of ICASSP
, pp. 3377-3381
-
-
Gehring, J.1
Miao, Y.2
Metze, F.3
Waibel, A.4
-
17
-
-
84878409063
-
Recurrent neural networks for noise reduction in robust ASR
-
A. L. Maas, Q. V. Le, T. M. O'Neil, O. Vinyals, P. Nguyen, and A. Y. Ng, "Recurrent neural networks for noise reduction in robust ASR," Proceedings of Interspeech, pp. 22-25, 2012
-
(2012)
Proceedings of Interspeech
, pp. 22-25
-
-
Maas, A.L.1
Le, Q.V.2
O'Neil, T.M.3
Vinyals, O.4
Nguyen, P.5
Ng, A.Y.6
-
18
-
-
84906237188
-
Reverberant speech recognition based on denoising autoencoder
-
T. Ishii, H. Komiyama, T. Shinozaki, Y. Horiuchi, and S. Kuroiwa, "Reverberant speech recognition based on denoising autoencoder," Proceedings of Interspeech, pp. 3512-3516, 2013
-
(2013)
Proceedings of Interspeech
, pp. 3512-3516
-
-
Ishii, T.1
Komiyama, H.2
Shinozaki, T.3
Horiuchi, Y.4
Kuroiwa, S.5
-
19
-
-
84905259759
-
Speech feature denoising and dereverberation via deep autoencoders for noisy reverberant speech recognition
-
X. Feng, Y. Zhang, and J. Glass, "Speech feature denoising and dereverberation via deep autoencoders for noisy reverberant speech recognition," Proceedings of ICASSP, pp. 1778-1782, 2014
-
(2014)
Proceedings of ICASSP
, pp. 1778-1782
-
-
Feng, X.1
Zhang, Y.2
Glass, J.3
-
20
-
-
79959842828
-
Binary coding of speech spectrograms using a deep auto-encoder
-
L. Deng, M. Seltzer, D. Yu, A. Acero, A. Mohamed, and G. Hinton, "Binary coding of speech spectrograms using a deep auto-encoder," Proceedings of Interspeech, pp. 1692-1695, 2010
-
(2010)
Proceedings of Interspeech
, pp. 1692-1695
-
-
Deng, L.1
Seltzer, M.2
Yu, D.3
Acero, A.4
Mohamed, A.5
Hinton, G.6
-
21
-
-
84906262433
-
Speech enhancement based on deep denoising autoencoder
-
X. Lu, Y. Tsao, S. Matsuda, and C. Hori, "Speech enhancement based on deep denoising autoencoder," Proceedings of Interspeech, pp. 436-440, 2013
-
(2013)
Proceedings of Interspeech
, pp. 436-440
-
-
Lu, X.1
Tsao, Y.2
Matsuda, S.3
Hori, C.4
-
22
-
-
78049412607
-
An autoencoder neural-network based low-dimensionality approach to excitation modeling for HMM-based text-to-speech
-
R. Vishnubhotla, S. Fernandez, and B. Ramabhadran, "An autoencoder neural-network based low-dimensionality approach to excitation modeling for HMM-based text-to-speech," Proceedings of ICASSP, pp. 4614-4617, 2010
-
(2010)
Proceedings of ICASSP
, pp. 4614-4617
-
-
Vishnubhotla, R.1
Fernandez, S.2
Ramabhadran, B.3
-
23
-
-
84910068090
-
Deep neural network based trainable voice source model for synthesis of speech with varying vocal effort
-
T. Raitio, A. Suni, L. Juvela, M. Vainio, and P. Alku, "Deep neural network based trainable voice source model for synthesis of speech with varying vocal effort," Proceedings of Interspeech, pp. 1969-1973, 2014
-
(2014)
Proceedings of Interspeech
, pp. 1969-1973
-
-
Raitio, T.1
Suni, A.2
Juvela, L.3
Vainio, M.4
Alku, P.5
-
24
-
-
84973366354
-
A deep learning approach to data-driven parameterizations for statistical parametric speech synthesis
-
abs/1409.8558
-
P. K. Muthukumar and A. Black, "A deep learning approach to data-driven parameterizations for statistical parametric speech synthesis," CoRR, vol. abs/1409.8558, 2014
-
(2014)
CoRR
-
-
Muthukumar, P.K.1
Black, A.2
-
25
-
-
33746600649
-
Reducing the dimensionality of data with neural networks
-
G. E. Hinton and R. Salakhutdinov, "Reducing the dimensionality of data with neural networks," Science, vol. 313, no. 5786, pp. 504-507, 2006
-
(2006)
Science
, vol.313
, Issue.5786
, pp. 504-507
-
-
Hinton, G.E.1
Salakhutdinov, R.2
-
27
-
-
67651002140
-
Statistical parametric speech synthesis
-
H. Zen, K. Tokuda, and A. W. Black, "Statistical parametric speech synthesis," Speech Communication, vol. 51, pp. 1039-1064, 2009
-
(2009)
Speech Communication
, vol.51
, pp. 1039-1064
-
-
Zen, H.1
Tokuda, K.2
Black, A.W.3
-
28
-
-
33745200051
-
Speech parameter generation algorithm considering global variance for HMM-based speech synthesis
-
T. Toda and K. Tokuda, "Speech parameter generation algorithm considering global variance for HMM-based speech synthesis," Proceedings of Interspeech 2005, pp. 2801-2804, 2005
-
(2005)
Proceedings of Interspeech 2005
, pp. 2801-2804
-
-
Toda, T.1
Tokuda, K.2
-
29
-
-
79959836077
-
On generating combilex pronunciations via morphological analysis
-
K. Richmond, R. Clark, and S. Fitt, "On generating combilex pronunciations via morphological analysis," Proceedings of Interspeech, pp. 1974-1977, 2010
-
(2010)
Proceedings of Interspeech
, pp. 1974-1977
-
-
Richmond, K.1
Clark, R.2
Fitt, S.3