-
1
-
-
85004448479
-
Voice conversion through vector quantization
-
M. Abe, S. Nakamura, and K. Shikano, "Voice conversion through vector quantization," The Journal of the Acoustical Society of Japan (E), vol. 11, no. 2, pp. 71-76, 1990.
-
(1990)
The Journal of the Acoustical Society of Japan (E)
, vol.11
, Issue.2
, pp. 71-76
-
-
Abe, M.1
Nakamura, S.2
Shikano, K.3
-
2
-
-
34447635527
-
Improving the intelligibility of dysarthric speech
-
A. B. Kain, J. P. Hosom, X. Niu, J. P. H. van Santen, M. Fried-Oken, and J. Staehely, "Improving the intelligibility of dysarthric speech," Speech Communication, vol. 49, no. 9, pp. 743-759, 2007.
-
(2007)
Speech Communication
, vol.49
, Issue.9
, pp. 743-759
-
-
Kain, A.B.1
Hosom, J.P.2
Niu, X.3
Van Santen, J.P.H.4
Fried-Oken, M.5
Staehely, J.6
-
3
-
-
84897939966
-
Alaryngeal speech enhancement based on one-to-many eigenvoice conversion
-
H. Doi, T. Toda, H. Saruwatari, and K. Shikano, "Alaryngeal speech enhancement based on one-to-many eigenvoice conversion," Audio, Speech, and Language Processing, IEEE/ACM Transactions on, vol. 22, no. 1, pp. 172-183, 2014.
-
(2014)
Audio, Speech, and Language Processing, IEEE/ACM Transactions on
, vol.22
, Issue.1
, pp. 172-183
-
-
Doi, H.1
Toda, T.2
Saruwatari, H.3
Shikano, K.4
-
4
-
-
58149203393
-
Data-driven emotion conversion in spoken English
-
Z. Inanoglu and S. Young, "Data-driven emotion conversion in spoken English," Speech Communication, vol. 51, no. 3, pp. 268-283, 2009.
-
(2009)
Speech Communication
, vol.51
, Issue.3
, pp. 268-283
-
-
Inanoglu, Z.1
Young, S.2
-
5
-
-
77953699443
-
Evaluation of expressive speech synthesis with voice conversion and copy resynthesis techniques
-
O. Türk and M. Schröder, "Evaluation of expressive speech synthesis with voice conversion and copy resynthesis techniques," Audio, Speech, and Language Processing, IEEE/ACM Transactions on, vol. 18, no. 5, pp. 965-973, 2010.
-
(2010)
Audio, Speech, and Language Processing, IEEE/ACM Transactions on
, vol.18
, Issue.5
, pp. 965-973
-
-
Türk, O.1
Schröder, M.2
-
6
-
-
79959827418
-
Applying voice conversion to concatenative singing-voice synthesis
-
F. Villavicencio and J. Bonada, "Applying voice conversion to concatenative singing-voice synthesis," in Proc. INTERSPEECH, 2010, pp. 2162-2165.
-
(2010)
Proc. INTERSPEECH
, pp. 2162-2165
-
-
Villavicencio, F.1
Bonada, J.2
-
7
-
-
84901767453
-
Voice timbre control based on perceived age in singing voice conversion
-
K. Kobayashi, T. Toda, H. Doi, T. Nakano, M. Goto, G. Neubig, S. Sakti, and S. Nakamura, "Voice timbre control based on perceived age in singing voice conversion," Information and Systems, IEICE Transactions on, vol. E97-D, no. 6, pp. 1419-1428, 2014.
-
(2014)
Information and Systems, IEICE Transactions on
, vol.E97-D
, Issue.6
, pp. 1419-1428
-
-
Kobayashi, K.1
Toda, T.2
Doi, H.3
Nakano, T.4
Goto, M.5
Neubig, G.6
Sakti, S.7
Nakamura, S.8
-
8
-
-
0038383054
-
On artificial bandwidth extension of telephone speech
-
P. Jax and P. Vary, "On artificial bandwidth extension of telephone speech," Signal Processing, vol. 83, no. 8, pp. 1707-1719, 2003.
-
(2003)
Signal Processing
, vol.83
, Issue.8
, pp. 1707-1719
-
-
Jax, P.1
Vary, P.2
-
9
-
-
84865698185
-
Statistical voice conversion techniques for body-conducted unvoiced speech enhancement
-
T. Toda, M. Nakagiri, and K. Shikano, "Statistical voice conversion techniques for body-conducted unvoiced speech enhancement," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 20, no. 9, pp. 2505-2517, 2012.
-
(2012)
Audio, Speech, and Language Processing, IEEE Transactions on
, vol.20
, Issue.9
, pp. 2505-2517
-
-
Toda, T.1
Nakagiri, M.2
Shikano, K.3
-
10
-
-
67650657780
-
Foreign accent conversion in computer assisted pronunciation training
-
D. Felps, H. Bortfeld, and R. Gutierrez-Osuna, "Foreign accent conversion in computer assisted pronunciation training," Speech Communication, vol. 51, no. 10, pp. 920-932, 2009.
-
(2009)
Speech Communication
, vol.51
, Issue.10
, pp. 920-932
-
-
Felps, D.1
Bortfeld, H.2
Gutierrez-Osuna, R.3
-
11
-
-
0023739214
-
Voice conversion through vector quantization
-
M. Abe, S. Nakamura, K. Shikano, and H. Kuwabara, "Voice conversion through vector quantization," in Proc. ICASSP, 1988, pp. 655-658.
-
(1988)
Proc. ICASSP
, pp. 655-658
-
-
Abe, M.1
Nakamura, S.2
Shikano, K.3
Kuwabara, H.4
-
12
-
-
0025892924
-
Statistical analysis of bilingual speaker's speech for cross-language voice conversion
-
M. Abe, K. Shikano, and H. Kuwabara, "Statistical analysis of bilingual speaker's speech for cross-language voice conversion," The Journal of the Acoustical Society of America, vol. 90, no. 1, pp. 76-82, 1991.
-
(1991)
The Journal of the Acoustical Society of America
, vol.90
, Issue.1
, pp. 76-82
-
-
Abe, M.1
Shikano, K.2
Kuwabara, H.3
-
13
-
-
0026880275
-
Voice transformation using PSOLA technique
-
H. Valbret, E. Moulines, and J. P. Tubach, "Voice transformation using PSOLA technique," Speech Communication, vol. 11, no. 2-3, pp. 175-187, 1992.
-
(1992)
Speech Communication
, vol.11
, Issue.2-3
, pp. 175-187
-
-
Valbret, H.1
Moulines, E.2
Tubach, J.P.3
-
14
-
-
33745216749
-
The Blizzard Challenge-2005: Evaluating corpus-based speech synthesis on common datasets
-
A. W. Black and K. Tokuda, "The Blizzard Challenge-2005: evaluating corpus-based speech synthesis on common datasets," in Proc. INTERSPEECH, 2005, pp. 77-80.
-
(2005)
Proc. INTERSPEECH
, pp. 77-80
-
-
Black, A.W.1
Tokuda, K.2
-
15
-
-
0035127703
-
Applying the harmonic plus noise model in concatenative speech synthesis
-
Y. Stylianou, "Applying the harmonic plus noise model in concatenative speech synthesis," Speech and Audio Processing, IEEE Transactions on, vol. 9, no. 1, pp. 21-29, 2001.
-
(2001)
Speech and Audio Processing, IEEE Transactions on
, vol.9
, Issue.1
, pp. 21-29
-
-
Stylianou, Y.1
-
16
-
-
0032673049
-
Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
-
H. Kawahara, I. Masuda-Katsuse, and A. de Cheveigné, "Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: possible role of a repetitive structure in sounds," Speech Communication, vol. 27, no. 3-4, pp. 187-207, 1999.
-
(1999)
Speech Communication
, vol.27
, Issue.3-4
, pp. 187-207
-
-
Kawahara, H.1
Masuda-Katsuse, I.2
De Cheveigné, A.3
-
17
-
-
84885499464
-
Optimal quantization of LSP parameters
-
F. Soong and B. Juang, "Optimal quantization of LSP parameters," Speech and Audio Processing, IEEE Transactions on, vol. 1, no. 1, pp. 15-24, 1993.
-
(1993)
Speech and Audio Processing, IEEE Transactions on
, vol.1
, Issue.1
, pp. 15-24
-
-
Soong, F.1
Juang, B.2
-
18
-
-
85131821539
-
Mel-generalized cepstral analysis-a unified approach to speech spectral estimation
-
K. Tokuda, T. Kobayashi, T. Masuko, and S. Imai, "Mel-generalized cepstral analysis-a unified approach to speech spectral estimation," in Proc. ICSLP, 1994, pp. 1043-1045.
-
(1994)
Proc. ICSLP
, pp. 1043-1045
-
-
Tokuda, K.1
Kobayashi, T.2
Masuko, T.3
Imai, S.4
-
19
-
-
84921735339
-
Voice conversion using deep neural networks with layer-wise generative training
-
L.-H. Chen, Z.-H. Ling, L.-J. Liu, and L.-R. Dai, "Voice conversion using deep neural networks with layer-wise generative training," Audio, Speech, and Language Processing, IEEE/ACM Transactions on, vol. 22, no. 12, pp. 1859-1872, 2014.
-
(2014)
Audio, Speech, and Language Processing, IEEE/ACM Transactions on
, vol.22
, Issue.12
, pp. 1859-1872
-
-
Chen, L.-H.1
Ling, Z.-H.2
Liu, L.-J.3
Dai, L.-R.4
-
20
-
-
44949143155
-
Maximum likelihood voice conversion based on GMM with STRAIGHT mixed excitation
-
Y. Ohtani, T. Toda, H. Saruwatari, and K. Shikano, "Maximum likelihood voice conversion based on GMM with STRAIGHT mixed excitation," in Proc. INTERSPEECH, 2006, pp. 2266-2269.
-
(2006)
Proc. INTERSPEECH
, pp. 2266-2269
-
-
Ohtani, Y.1
Toda, T.2
Saruwatari, H.3
Shikano, K.4
-
21
-
-
0034841948
-
Design and evaluation of a voice conversion algorithm based on spectral envelope mapping and residual prediction
-
A. Kain and M. W. Macon, "Design and evaluation of a voice conversion algorithm based on spectral envelope mapping and residual prediction," in Proc. ICASSP, 2001, pp. 813-816.
-
(2001)
Proc. ICASSP
, pp. 813-816
-
-
Kain, A.1
Macon, M.W.2
-
22
-
-
34047254509
-
Quality-enhanced voice morphing using maximum likelihood transformations
-
H. Ye and S. Young, "Quality-enhanced voice morphing using maximum likelihood transformations," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 14, no. 4, pp. 1301-1312, 2006.
-
(2006)
Audio, Speech, and Language Processing, IEEE Transactions on
, vol.14
, Issue.4
, pp. 1301-1312
-
-
Ye, H.1
Young, S.2
-
23
-
-
85009212516
-
Transforming F0 contours
-
B. Gillett and S. King, "Transforming F0 contours," in Proc. INTERSPEECH, 2003, pp. 101-104.
-
(2003)
Proc. INTERSPEECH
, pp. 101-104
-
-
Gillett, B.1
King, S.2
-
24
-
-
84928405078
-
Generative modeling of voice fundamental frequency contours
-
H. Kameoka, K. Yoshizato, T. Ishihara, K. Kadowaki, Y. Ohishi, and K. Kashino, "Generative modeling of voice fundamental frequency contours," Audio, Speech, and Language Processing, IEEE/ACM Transactions on, vol. 23, no. 6, pp. 1042-1053, 2015.
-
(2015)
Audio, Speech, and Language Processing, IEEE/ACM Transactions on
, vol.23
, Issue.6
, pp. 1042-1053
-
-
Kameoka, H.1
Yoshizato, K.2
Ishihara, T.3
Kadowaki, K.4
Ohishi, Y.5
Kashino, K.6
-
25
-
-
84867199771
-
Simultaneous conversion of duration and spectrum based on statistical models including time-sequence matching
-
K. Yutani, Y. Uto, Y. Nankaku, T. Toda, and K. Tokuda, "Simultaneous conversion of duration and spectrum based on statistical models including time-sequence matching," in Proc. INTERSPEECH, 2008, pp. 1072-1075.
-
(2008)
Proc. INTERSPEECH
, pp. 1072-1075
-
-
Yutani, K.1
Uto, Y.2
Nankaku, Y.3
Toda, T.4
Tokuda, K.5
-
26
-
-
0032026483
-
Continuous probabilistic transform for voice conversion
-
Y. Stylianou, O. Cappé, and E. Moulines, "Continuous probabilistic transform for voice conversion," Speech and Audio Processing, IEEE Transactions on, vol. 6, no. 2, pp. 131-142, 1998.
-
(1998)
Speech and Audio Processing, IEEE Transactions on
, vol.6
, Issue.2
, pp. 131-142
-
-
Stylianou, Y.1
Cappé, O.2
Moulines, E.3
-
27
-
-
57749193836
-
Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory
-
T. Toda, A. W. Black, and K. Tokuda, "Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 15, no. 8, pp. 2222-2235, 2007.
-
(2007)
Audio, Speech, and Language Processing, IEEE Transactions on
, vol.15
, Issue.8
, pp. 2222-2235
-
-
Toda, T.1
Black, A.W.2
Tokuda, K.3
-
28
-
-
84901766069
-
Voice conversion based on speaker-dependent restricted Boltzmann machines
-
T. Nakashika, T. Takiguchi, and Y. Ariki, "Voice conversion based on speaker-dependent restricted Boltzmann machines," Information and Systems, IEICE Transactions on, vol. E97-D, no. 6, pp. 1403-1410, 2014.
-
(2014)
Information and Systems, IEICE Transactions on
, vol.E97-D
, Issue.6
, pp. 1403-1410
-
-
Nakashika, T.1
Takiguchi, T.2
Ariki, Y.3
-
29
-
-
84856141218
-
Voice conversion using dynamic kernel partial least squares regression
-
E. Helander, H. Silén, T. Virtanen, and M. M. Gabbouj, "Voice conversion using dynamic kernel partial least squares regression," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 20, no. 3, pp. 806-817, 2012.
-
(2012)
Audio, Speech, and Language Processing, IEEE Transactions on
, vol.20
, Issue.3
, pp. 806-817
-
-
Helander, E.1
Silén, H.2
Virtanen, T.3
Gabbouj, M.M.4
-
30
-
-
84865737668
-
Gaussian process experts for voice conversion
-
N. Pilkington, H. Zen, and M. Gales, "Gaussian process experts for voice conversion," in Proc. INTERSPEECH, 2011, pp. 2761-2764.
-
(2011)
Proc. INTERSPEECH
, pp. 2761-2764
-
-
Pilkington, N.1
Zen, H.2
Gales, M.3
-
31
-
-
84890539284
-
Voice conversion based on Gaussian processes by coherent and asymmetric training with limited training data
-
N. Xu, Y. Tang, J. Bao, A. Jiang, X. Liu, and Z. Yang, "Voice conversion based on Gaussian processes by coherent and asymmetric training with limited training data," Speech Communication, vol. 58, pp. 124-138, 2014.
-
(2014)
Speech Communication
, vol.58
, pp. 124-138
-
-
Xu, N.1
Tang, Y.2
Bao, J.3
Jiang, A.4
Liu, X.5
Yang, Z.6
-
32
-
-
77953707533
-
Spectral mapping using artificial neural networks for voice conversion
-
S. Desai, A. Black, B. Yegnanarayana, and K. Prahallad, "Spectral mapping using artificial neural networks for voice conversion," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 18, no. 5, pp. 954-964, 2010.
-
(2010)
Audio, Speech, and Language Processing, IEEE Transactions on
, vol.18
, Issue.5
, pp. 954-964
-
-
Desai, S.1
Black, A.2
Yegnanarayana, B.3
Prahallad, K.4
-
33
-
-
84885055553
-
Exemplar-based voice conversion using sparse representation in noisy environments
-
R. Takashima, T. Takiguchi, and Y. Ariki, "Exemplar-based voice conversion using sparse representation in noisy environments," Information and Systems, IEICE Transactions on, vol. E96-A, no. 10, pp. 1946-1953, 2013.
-
(2013)
Information and Systems, IEICE Transactions on
, vol.E96-A
, Issue.10
, pp. 1946-1953
-
-
Takashima, R.1
Takiguchi, T.2
Ariki, Y.3
-
34
-
-
84911369131
-
Exemplar-based sparse representation with residual compensation for voice conversion
-
Z. Wu, T. Virtanen, E. Chng, and H. Li, "Exemplar-based sparse representation with residual compensation for voice conversion," Audio, Speech, and Language Processing, IEEE/ACM Transactions on, vol. 22, no. 10, pp. 1506-1521, 2014.
-
(2014)
Audio, Speech, and Language Processing, IEEE/ACM Transactions on
, vol.22
, Issue.10
, pp. 1506-1521
-
-
Wu, Z.1
Virtanen, T.2
Chng, E.3
Li, H.4
-
35
-
-
84946027999
-
Voice conversion using deep bidirectional long short-term memory based recurrent neural networks
-
L. Sun, S. Kang, K. Li, and H. Meng, "Voice conversion using deep bidirectional long short-term memory based recurrent neural networks," in Proc. ICASSP, 2015, pp. 4869-4873.
-
(2015)
Proc. ICASSP
, pp. 4869-4873
-
-
Sun, L.1
Kang, S.2
Li, K.3
Meng, H.4
-
36
-
-
84962834006
-
Post-filters to modify the modulation spectrum for statistical parametric speech synthesis
-
S. Takamichi, T. Toda, A. Black, G. Neubig, S. Sakti, and S. Nakamura, "Post-filters to modify the modulation spectrum for statistical parametric speech synthesis," Audio, Speech, and Language Processing, IEEE/ACM Transactions on, vol. 24, no. 4, pp. 757-767, 2016.
-
(2016)
Audio, Speech, and Language Processing, IEEE/ACM Transactions on
, vol.24
, Issue.4
, pp. 757-767
-
-
Takamichi, S.1
Toda, T.2
Black, A.3
Neubig, G.4
Sakti, S.5
Nakamura, S.6
-
37
-
-
77953727123
-
Voice conversion based on weighted frequency warping
-
D. Erro, A. Moreno, and A. Bonafonte, "Voice conversion based on weighted frequency warping," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 18, no. 5, pp. 922-931, 2010.
-
(2010)
Audio, Speech, and Language Processing, IEEE Transactions on
, vol.18
, Issue.5
, pp. 922-931
-
-
Erro, D.1
Moreno, A.2
Bonafonte, A.3
-
38
-
-
84919935005
-
Can we automatically transform speech recorded on common consumer devices in real-world environments into professional production quality speech?-a dataset, insights, and challenges
-
G. J. Mysore, "Can we automatically transform speech recorded on common consumer devices in real-world environments into professional production quality speech?-a dataset, insights, and challenges," IEEE Signal Processing Letters, vol. 22, no. 8, pp. 1006-1010, Aug 2015.
-
(2015)
IEEE Signal Processing Letters
, vol.22
, Issue.8
, pp. 1006-1010
-
-
Mysore, G.J.1
-
39
-
-
84994351528
-
Analysis of the Voice Conversion Challenge 2016 evaluation results
-
M. Wester, Z. Wu, and J. Yamagishi, "Analysis of the Voice Conversion Challenge 2016 evaluation results," in (submitted to) Interspeech, 2016.
-
(2016)
(submitted to) Interspeech
-
-
Wester, M.1
Wu, Z.2
Yamagishi, J.3
-
40
-
-
84962901047
-
Anti-spoofing for text-independent speaker verification: An initial database, comparison of countermeasures, and human performance
-
Z. Wu, P. L. De Leon, C. Demiroglu, A. Khodabakhsh, S. King, Z.-H. Ling, D. Saito, B. Stewart, T. Toda, M. Wester, and J. Yamagishi, "Anti-spoofing for text-independent speaker verification: An initial database, comparison of countermeasures, and human performance," Audio, Speech and Language Processing, IEEE/ACM Transactions on, vol. 24, pp. 768-783, 2016.
-
(2016)
Audio, Speech and Language Processing, IEEE/ACM Transactions on
, vol.24
, pp. 768-783
-
-
Wu, Z.1
De Leon, P.L.2
Demiroglu, C.3
Khodabakhsh, A.4
King, S.5
Ling, Z.-H.6
Saito, D.7
Stewart, B.8
Toda, T.9
Wester, M.10
Yamagishi, J.11