[1] J. Nurminen, H. Silen, and V. Popa, "Voice conversion," in Speech Enhancement, Modeling and Recognition - Algorithms and Applications, pp. 69-94, 2012.
[2] Z. W. Shuang, R. Bakis, S. Shechtman, and Y. Qin, "Frequency warping based on mapping formant parameters," in Interspeech, 2006.
[3] D. Erro and A. Moreno, "Weighted frequency warping for voice conversion," in Interspeech, 2007.
[4] G. Hinton, S. Osindero, and Y. Teh, "A fast learning algorithm for deep belief nets," Neural Computation, vol. 18, no. 7, pp. 1527-1554, 2006.
[5] A. Mohamed, G. E. Dahl, and G. Hinton, "Acoustic modeling using deep belief networks," IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 1, pp. 14-22, 2012.
[6] Y. Stylianou, O. Cappé, and E. Moulines, "Continuous probabilistic transform for voice conversion," IEEE Transactions on Speech and Audio Processing, vol. 6, no. 2, pp. 131-142, 1998.
[7] T. Toda, A. W. Black, and K. Tokuda, "Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory," IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, no. 8, pp. 2222-2235, 2007.
[8] S. Desai, E. V. Raghavendra, B. Yegnanarayana, A. W. Black, and K. Prahallad, "Voice conversion using artificial neural networks," in ICASSP, 2009.
[9] L. H. Chen, Z. H. Ling, Y. Song, and L. R. Dai, "Joint spectral distribution modeling using restricted Boltzmann machines for voice conversion," in Interspeech, 2013.
[10] L. H. Chen, Z. H. Ling, L. J. Liu, and L. R. Dai, "Voice conversion using deep neural networks with layer-wise generative training," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 22, no. 12, pp. 1859-1872, 2014.
[11] T. Nakashika, R. Takashima, T. Takiguchi, and Y. Ariki, "Voice conversion in high-order eigen space using deep belief nets," in Interspeech, 2013.
[12] T. Nakashika, T. Takiguchi, and Y. Ariki, "High-order sequence modeling using speaker-dependent recurrent temporal restricted Boltzmann machines for voice conversion," in Interspeech, 2014.
[13] Y. Bengio, P. Simard, and P. Frasconi, "Learning long-term dependencies with gradient descent is difficult," IEEE Transactions on Neural Networks, vol. 5, no. 2, pp. 157-166, 1994.
[14] A. Graves and J. Schmidhuber, "Framewise phoneme classification with bidirectional LSTM and other neural network architectures," Neural Networks, vol. 18, no. 5, pp. 602-610, 2005.
[15] S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997.
[16] F. A. Gers and J. Schmidhuber, "LSTM recurrent networks learn simple context-free and context-sensitive languages," IEEE Transactions on Neural Networks, vol. 12, no. 6, pp. 1333-1340, 2001.
[17] A. Graves, N. Jaitly, and A. R. Mohamed, "Hybrid speech recognition with deep bidirectional LSTM," in ASRU, 2013.
[19] M. Wöllmer, Z. X. Zhang, F. Weninger, B. Schuller, and G. Rigoll, "Feature enhancement by bidirectional LSTM networks for conversational speech recognition in highly nonstationary noise," in ICASSP, 2013.
[20] Y. C. Fan, Y. Qian, F. L. Xie, and F. K. Soong, "TTS synthesis with bidirectional LSTM based recurrent neural networks," in Interspeech, 2014.
[21] M. Schuster and K. K. Paliwal, "Bidirectional recurrent neural networks," IEEE Transactions on Signal Processing, vol. 45, no. 11, pp. 2673-2681, 1997.
[22] F. A. Gers, J. Schmidhuber, and F. Cummins, "Learning to forget: Continual prediction with LSTM," Neural Computation, vol. 12, no. 10, pp. 2451-2471, 2000.
[23] A. Graves, A. R. Mohamed, and G. E. Hinton, "Speech recognition with deep recurrent neural networks," in ICASSP, 2013, pp. 6645-6649.
[24] H. Kawahara, I. Masuda-Katsuse, and A. de Cheveigné, "Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds," Speech Communication, vol. 27, no. 3, pp. 187-207, 1999.
[25] Y. Ohtani, T. Toda, H. Saruwatari, and K. Shikano, "Maximum likelihood voice conversion based on GMM with STRAIGHT mixed excitation," in Proc. ICSLP, 2006.
[28] P. J. Werbos, "Backpropagation through time: What it does and how to do it," Proceedings of the IEEE, vol. 78, no. 10, pp. 1550-1560, 1990.
[29] K. Tokuda, T. Yoshimura, T. Masuko, T. Kobayashi, and T. Kitamura, "Speech parameter generation algorithms for HMM-based speech synthesis," in ICASSP, 2000.
[32] S. Y. Kang, X. J. Qian, and H. Meng, "Multi-distribution deep belief network for speech synthesis," in ICASSP, 2013.
[33] H. Zen, A. Senior, and M. Schuster, "Statistical parametric speech synthesis using deep neural networks," in ICASSP, 2013.
[34] S. Y. Kang and H. Meng, "Statistical parametric speech synthesis using weighted multi-distribution deep belief network," in Interspeech, 2014.