-
1
-
-
84865698185
-
Statistical voice conversion techniques for body-conducted unvoiced speech enhancement
-
T. Toda, M. Nakagiri, and K. Shikano, "Statistical voice conversion techniques for body-conducted unvoiced speech enhancement," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 20, no. 9, pp. 2505-2517, 2012.
-
(2012)
Audio, Speech, and Language Processing, IEEE Transactions on
, vol.20
, Issue.9
, pp. 2505-2517
-
-
Toda, T.1
Nakagiri, M.2
Shikano, K.3
-
2
-
-
67650657780
-
Foreign accent conversion in computer assisted pronunciation training
-
D. Felps, H. Bortfeld, and R. Gutierrez-Osuna, "Foreign accent conversion in computer assisted pronunciation training," Speech Communication, vol. 51, no. 10, pp. 920-932, 2009.
-
(2009)
Speech Communication
, vol.51
, Issue.10
, pp. 920-932
-
-
Felps, D.1
Bortfeld, H.2
Gutierrez-Osuna, R.3
-
3
-
-
0023739214
-
Voice conversion through vector quantization
-
M. Abe, S. Nakamura, K. Shikano, and H. Kuwabara, "Voice conversion through vector quantization," in Proc. ICASSP, 1988, pp. 655-658.
-
(1988)
Proc. ICASSP
, pp. 655-658
-
-
Abe, M.1
Nakamura, S.2
Shikano, K.3
Kuwabara, H.4
-
4
-
-
0032026483
-
Continuous probabilistic transform for voice conversion
-
Y. Stylianou, O. Cappe, and E. Moulines, "Continuous probabilistic transform for voice conversion," IEEE Trans. Speech and Audio Process., vol. 6, no. 2, pp. 131-142, Mar. 1998.
-
(1998)
IEEE Trans. Speech and Audio Process.
, vol.6
, Issue.2
, pp. 131-142
-
-
Stylianou, Y.1
Cappe, O.2
Moulines, E.3
-
5
-
-
0031623661
-
Spectral voice conversion for text-to-speech synthesis
-
A. Kain and M. Macon, "Spectral voice conversion for text-to-speech synthesis," in Proc. ICASSP, 1998, pp. 285-288.
-
(1998)
Proc. ICASSP
, pp. 285-288
-
-
Kain, A.1
Macon, M.2
-
6
-
-
57749193836
-
Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory
-
T. Toda, A. Black, and K. Tokuda, "Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory," IEEE Trans. Audio, Speech, and Lang. Process, vol. 15, no. 8, pp. 2222-2235, Nov. 2007.
-
(2007)
IEEE Trans. Audio, Speech, and Lang. Process
, vol.15
, Issue.8
, pp. 2222-2235
-
-
Toda, T.1
Black, A.2
Tokuda, K.3
-
7
-
-
85032751458
-
Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups
-
G. Hinton, L. Deng, D. Yu, G. E. Dahl, A.-r. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. N. Sainath et al., "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups," Signal Processing Magazine, IEEE, vol. 29, no. 6, pp. 82-97, 2012.
-
(2012)
Signal Processing Magazine
, vol.29
, Issue.6
, pp. 82-97
-
-
Hinton, G.1
Deng, L.2
Yu, D.3
Dahl, G.E.4
Mohamed, A.-R.5
Jaitly, N.6
Senior, A.7
Vanhoucke, V.8
Nguyen, P.9
Sainath, T.N.10
-
8
-
-
84890490547
-
Statistical parametric speech synthesis using deep neural networks
-
H. Zen, A. Senior, and M. Schuster, "Statistical parametric speech synthesis using deep neural networks," in Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on. IEEE, 2013, pp. 7962-7966.
-
(2013)
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on. IEEE
, pp. 7962-7966
-
-
Zen, H.1
Senior, A.2
Schuster, M.3
-
9
-
-
84906225084
-
Joint spectral distribution modeling using restricted Boltzmann machines for voice conversion
-
L.-H. Chen, Z.-H. Ling, Y. Song, and L.-R. Dai, "Joint spectral distribution modeling using restricted Boltzmann machines for voice conversion," in INTERSPEECH 2013, 14th Annual Conference of the International Speech Communication Association, Lyon, France, August 25-29, 2013, pp. 3052-3056.
-
(2013)
INTERSPEECH 2013, 14th Annual Conference of the International Speech Communication Association
, pp. 3052-3056
-
-
Chen, L.-H.1
Ling, Z.-H.2
Song, Y.3
Dai, L.-R.4
-
10
-
-
84906280857
-
Voice conversion in high-order eigen space using deep belief nets
-
T. Nakashika, R. Takashima, T. Takiguchi, and Y. Ariki, "Voice conversion in high-order eigen space using deep belief nets," in INTERSPEECH 2013, 14th Annual Conference of the International Speech Communication Association, Lyon, France, August 25-29, 2013, pp. 369-372.
-
(2013)
INTERSPEECH 2013, 14th Annual Conference of the International Speech Communication Association
, pp. 369-372
-
-
Nakashika, T.1
Takashima, R.2
Takiguchi, T.3
Ariki, Y.4
-
11
-
-
84923867813
-
Voice conversion using RNN pre-trained by recurrent temporal restricted Boltzmann machines
-
T. Nakashika, T. Takiguchi, and Y. Ariki, "Voice conversion using RNN pre-trained by recurrent temporal restricted Boltzmann machines," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 23, no. 3, pp. 580-587, March 2015.
-
(2015)
IEEE/ACM Transactions on Audio, Speech, and Language Processing
, vol.23
, Issue.3
, pp. 580-587
-
-
Nakashika, T.1
Takiguchi, T.2
Ariki, Y.3
-
12
-
-
84946027999
-
Voice conversion using deep bidirectional long short-term memory based recurrent neural networks
-
L. Sun, S. Kang, K. Li, and H. Meng, "Voice conversion using deep bidirectional long short-term memory based recurrent neural networks," in Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on. IEEE, 2015, pp. 4869-4873.
-
(2015)
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference On. IEEE
, pp. 4869-4873
-
-
Sun, L.1
Kang, S.2
Li, K.3
Meng, H.4
-
13
-
-
84921735339
-
Voice conversion using deep neural networks with layer-wise generative training
-
L.-H. Chen, Z.-H. Ling, L.-J. Liu, and L.-R. Dai, "Voice conversion using deep neural networks with layer-wise generative training," Audio, Speech, and Language Processing, IEEE/ACM Transactions on, vol. 22, no. 12, pp. 1859-1872, 2014.
-
(2014)
Audio, Speech, and Language Processing, IEEE/ACM Transactions on
, vol.22
, Issue.12
, pp. 1859-1872
-
-
Chen, L.-H.1
Ling, Z.-H.2
Liu, L.-J.3
Dai, L.-R.4
-
14
-
-
27644522706
-
Vocal tract normalization equals linear transformation in cepstral space
-
M. Pitz and H. Ney, "Vocal tract normalization equals linear transformation in cepstral space," Speech and Audio Processing, IEEE Transactions on, vol. 13, no. 5, pp. 930-944, Sep. 2005.
-
(2005)
Speech and Audio Processing, IEEE Transactions on
, vol.13
, Issue.5
, pp. 930-944
-
-
Pitz, M.1
Ney, H.2
-
15
-
-
84865785753
-
Improved bottleneck features using pretrained deep neural networks
-
D. Yu and M. L. Seltzer, "Improved bottleneck features using pretrained deep neural networks," in Interspeech, 2011, pp. 237-240.
-
(2011)
Interspeech
, pp. 237-240
-
-
Yu, D.1
Seltzer, M.L.2
-
16
-
-
84959118000
-
The Fisher corpus: A resource for the next generations of speech-to-text
-
C. Cieri, D. Miller, and K. Walker, "The Fisher corpus: A resource for the next generations of speech-to-text," in LREC, vol. 4, 2004, pp. 69-71.
-
(2004)
LREC
, vol.4
, pp. 69-71
-
-
Cieri, C.1
Miller, D.2
Walker, K.3
-
17
-
-
84906257669
-
Voice conversion for non-parallel datasets using dynamic kernel partial least squares regression
-
H. Silén, J. Nurminen, E. Helander, and M. Gabbouj, "Voice conversion for non-parallel datasets using dynamic kernel partial least squares regression," in Interspeech, 2013.
-
(2013)
Interspeech
-
-
Silén, H.1
Nurminen, J.2
Helander, E.3
Gabbouj, M.4
-
18
-
-
84946020861
-
Sparse representation for frequency warping based voice conversion
-
X. Tian, Z. Wu, S. W. Lee, and N. Q. Hy, "Sparse representation for frequency warping based voice conversion," in Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on, 2015.
-
(2015)
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
-
-
Tian, X.1
Wu, Z.2
Lee, S.W.3
Hy, N.Q.4
-
19
-
-
0032673049
-
Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
-
H. Kawahara, I. Masuda-Katsuse, and A. de Cheveigné, "Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds," Speech Communication, vol. 27, no. 3, pp. 187-208, 1999.
-
(1999)
Speech Communication
, vol.27
, Issue.3
, pp. 187-208
-
-
Kawahara, H.1
Masuda-Katsuse, I.2
De Cheveigné, A.3
-
20
-
-
84905283451
-
New methods in continuous Mandarin speech recognition
-
C. J. Chen, R. A. Gopinath, M. D. Monkowski, M. A. Picheny, and K. Shen, "New methods in continuous Mandarin speech recognition," in Eurospeech, 1997.
-
(1997)
Eurospeech
-
-
Chen, C.J.1
Gopinath, R.A.2
Monkowski, M.D.3
Picheny, M.A.4
Shen, K.5