-
1
-
-
0031623661
-
Spectral voice conversion for text-to-speech synthesis
-
Alexander Kain and Michael W. Macon, "Spectral voice conversion for text-to-speech synthesis," in ICASSP, 1998, pp. 285-288.
-
(1998)
ICASSP
, pp. 285-288
-
-
Kain, A.1
Macon, M.W.2
-
2
-
-
84865747520
-
Intonation conversion from neutral to expressive speech
-
Christophe Veaux and X. Rodet, "Intonation conversion from neutral to expressive speech," in Interspeech, 2011, pp. 2765-2768.
-
(2011)
Interspeech
, pp. 2765-2768
-
-
Veaux, C.1
Rodet, X.2
-
3
-
-
80052698826
-
Speaking-aid systems using GMM-based voice conversion for electrolaryngeal speech
-
Keigo Nakamura, Tomoki Toda, Hiroshi Saruwatari, and Kiyohiro Shikano, "Speaking-aid systems using GMM-based voice conversion for electrolaryngeal speech," Speech Communication, vol. 54, no. 1, pp. 134-146, 2012.
-
(2012)
Speech Communication
, vol.54
, Issue.1
, pp. 134-146
-
-
Nakamura, K.1
Toda, T.2
Saruwatari, H.3
Shikano, K.4
-
4
-
-
0034855352
-
High-performance robust speech recognition using stereo training data
-
Li Deng, Alex Acero, Li Jiang, Jasha Droppo, and Xuedong Huang, "High-performance robust speech recognition using stereo training data," in ICASSP, 2001, pp. 301-304.
-
(2001)
ICASSP
, pp. 301-304
-
-
Deng, L.1
Acero, A.2
Jiang, L.3
Droppo, J.4
Huang, X.5
-
5
-
-
70450192197
-
Speech generation from hand gestures based on space mapping
-
Aki Kunikoshi, Yu Qiao, Nobuaki Minematsu, and Keikichi Hirose, "Speech generation from hand gestures based on space mapping," in Interspeech, 2009, pp. 308-311.
-
(2009)
Interspeech
, pp. 308-311
-
-
Kunikoshi, A.1
Qiao, Y.2
Minematsu, N.3
Hirose, K.4
-
6
-
-
84910091291
-
Multimodal exemplar-based voice conversion using lip features in noisy environments
-
Kenta Masaka, Ryo Aihara, Tetsuya Takiguchi, and Yasuo Ariki, "Multimodal exemplar-based voice conversion using lip features in noisy environments," in Interspeech, 2014, pp. 1159-1163.
-
(2014)
Interspeech
, pp. 1159-1163
-
-
Masaka, K.1
Aihara, R.2
Takiguchi, T.3
Ariki, Y.4
-
7
-
-
0021412027
-
Vector quantization
-
Robert Gray, "Vector quantization," IEEE ASSP Magazine, vol. 1, no. 2, pp. 4-29, 1984.
-
(1984)
IEEE ASSP Magazine
, vol.1
, Issue.2
, pp. 4-29
-
-
Gray, R.1
-
8
-
-
0026880275
-
Voice transformation using PSOLA technique
-
H. Valbret, E. Moulines, and Jean-Pierre Tubach, "Voice transformation using PSOLA technique," Speech Communication, vol. 11, no. 2, pp. 175-187, 1992.
-
(1992)
Speech Communication
, vol.11
, Issue.2
, pp. 175-187
-
-
Valbret, H.1
Moulines, E.2
Tubach, J.3
-
9
-
-
0032026483
-
Continuous probabilistic transform for voice conversion
-
Yannis Stylianou, Olivier Cappé, and Eric Moulines, "Continuous probabilistic transform for voice conversion," IEEE Trans. Speech, Audio Process., vol. 6, no. 2, pp. 131-142, 1998.
-
(1998)
IEEE Trans. Speech, Audio Process.
, vol.6
, Issue.2
, pp. 131-142
-
-
Stylianou, Y.1
Cappé, O.2
Moulines, E.3
-
10
-
-
57749193836
-
Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory
-
Tomoki Toda, Alan W. Black, and Keiichi Tokuda, "Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 8, pp. 2222-2235, 2007.
-
(2007)
IEEE Trans. Audio, Speech, Lang. Process.
, vol.15
, Issue.8
, pp. 2222-2235
-
-
Toda, T.1
Black, A.W.2
Tokuda, K.3
-
11
-
-
77953712499
-
Voice conversion using partial least squares regression
-
Elina Helander, Tuomas Virtanen, Jani Nurminen, and Moncef Gabbouj, "Voice conversion using partial least squares regression," IEEE Trans. Audio, Speech, Lang. Process., vol. 18, no. 5, pp. 912-921, 2010.
-
(2010)
IEEE Trans. Audio, Speech, Lang. Process.
, vol.18
, Issue.5
, pp. 912-921
-
-
Helander, E.1
Virtanen, T.2
Nurminen, J.3
Gabbouj, M.4
-
12
-
-
84865798483
-
One-to-many voice conversion based on tensor representation of speaker space
-
Daisuke Saito, Keisuke Yamamoto, Nobuaki Minematsu, and Keikichi Hirose, "One-to-many voice conversion based on tensor representation of speaker space," in Interspeech, 2011, pp. 653-656.
-
(2011)
Interspeech
, pp. 653-656
-
-
Saito, D.1
Yamamoto, K.2
Minematsu, N.3
Hirose, K.4
-
13
-
-
44949210554
-
MAP-based adaptation for speech conversion using adaptation data selection and nonparallel training
-
Chung-Han Lee and Chung-Hsien Wu, "MAP-based adaptation for speech conversion using adaptation data selection and nonparallel training," in Interspeech, 2006, pp. 2254-2257.
-
(2006)
Interspeech
, pp. 2254-2257
-
-
Lee, C.-H.1
Wu, C.-H.2
-
14
-
-
84901237776
-
Modeling spectral envelopes using restricted Boltzmann machines and deep belief networks for statistical parametric speech synthesis
-
Zhen-Hua Ling, Li Deng, and Dong Yu, "Modeling spectral envelopes using restricted Boltzmann machines and deep belief networks for statistical parametric speech synthesis," IEEE Trans. Audio, Speech, Lang. Process., no. 10, pp. 2129-2139, 2013.
-
(2013)
IEEE Trans. Audio, Speech, Lang. Process.
, Issue.10
, pp. 2129-2139
-
-
Ling, Z.-H.1
Deng, L.2
Yu, D.3
-
15
-
-
84906276055
-
Exemplar-based unit selection for voice conversion utilizing temporal information
-
Zhizheng Wu, Tuomas Virtanen, Tomi Kinnunen, Eng Siong Chng, and Haizhou Li, "Exemplar-based unit selection for voice conversion utilizing temporal information," in Interspeech, 2013, pp. 3057-3061.
-
(2013)
Interspeech
, pp. 3057-3061
-
-
Wu, Z.1
Virtanen, T.2
Kinnunen, T.3
Siong Chng, E.4
Li, H.5
-
16
-
-
84906281888
-
Alleviating the over-smoothing problem in GMM-based voice conversion with discriminative training
-
Hsin-Te Hwang, Yu Tsao, Hsin-Min Wang, Yih-Ru Wang, and Sin-Horng Chen, "Alleviating the over-smoothing problem in GMM-based voice conversion with discriminative training," in Interspeech, 2013, pp. 3062-3066.
-
(2013)
Interspeech
, pp. 3062-3066
-
-
Hwang, H.1
Tsao, Y.2
Wang, H.3
Wang, Y.4
Chen, S.5
-
18
-
-
84874248255
-
Exemplar-based voice conversion in noisy environment
-
Ryoichi Takashima, Tetsuya Takiguchi, and Yasuo Ariki, "Exemplar-based voice conversion in noisy environment," in SLT, 2012, pp. 313-317.
-
(2012)
SLT
, pp. 313-317
-
-
Takashima, R.1
Takiguchi, T.2
Ariki, Y.3
-
19
-
-
84905271796
-
Noise-robust voice conversion based on spectral mapping on sparse space
-
Ryoichi Takashima, Ryo Aihara, Tetsuya Takiguchi, and Yasuo Ariki, "Noise-robust voice conversion based on spectral mapping on sparse space," in SSW8, 2013, pp. 71-75.
-
(2013)
SSW8
, pp. 71-75
-
-
Takashima, R.1
Aihara, R.2
Takiguchi, T.3
Ariki, Y.4
-
20
-
-
70349197691
-
Voice conversion using artificial neural networks
-
Srinivas Desai, E. Veera Raghavendra, B. Yegnanarayana, Alan W. Black, and Kishore Prahallad, "Voice conversion using artificial neural networks," in ICASSP. IEEE, 2009, pp. 3893-3896.
-
(2009)
ICASSP. IEEE
, pp. 3893-3896
-
-
Desai, S.1
Veera Raghavendra, E.2
Yegnanarayana, B.3
Black, A.W.4
Prahallad, K.5
-
21
-
-
84906225084
-
Joint spectral distribution modeling using restricted Boltzmann machines for voice conversion
-
Ling-Hui Chen, Zhen-Hua Ling, Yan Song, and Li-Rong Dai, "Joint spectral distribution modeling using restricted Boltzmann machines for voice conversion," in Interspeech, 2013, pp. 3052-3056.
-
(2013)
Interspeech
, pp. 3052-3056
-
-
Chen, L.1
Ling, Z.2
Song, Y.3
Dai, L.4
-
22
-
-
84889579519
-
Conditional restricted Boltzmann machine for voice conversion
-
Zhizheng Wu, Eng Siong Chng, and Haizhou Li, "Conditional restricted Boltzmann machine for voice conversion," in ChinaSIP, 2013.
-
(2013)
ChinaSIP
-
-
Wu, Z.1
Siong Chng, E.2
Li, H.3
-
23
-
-
84864026688
-
Modeling human motion using binary latent variables
-
Graham W. Taylor, Geoffrey E. Hinton, and Sam T. Roweis, "Modeling human motion using binary latent variables," in Advances in neural information processing systems, 2006, pp. 1345-1352.
-
(2006)
Advances in Neural Information Processing Systems
, pp. 1345-1352
-
-
Taylor, G.W.1
Hinton, G.E.2
Roweis, S.T.3
-
24
-
-
84906280857
-
Voice conversion in high-order eigen space using deep belief nets
-
Toru Nakashika, Ryoichi Takashima, Tetsuya Takiguchi, and Yasuo Ariki, "Voice conversion in high-order eigen space using deep belief nets," in Interspeech, 2013, pp. 369-372.
-
(2013)
Interspeech
, pp. 369-372
-
-
Nakashika, T.1
Takashima, R.2
Takiguchi, T.3
Ariki, Y.4
-
25
-
-
85053885315
-
Speaker-dependent conditional restricted Boltzmann machine for voice conversion
-
Toru Nakashika, Tetsuya Takiguchi, and Yasuo Ariki, "Speaker-dependent conditional restricted Boltzmann machine for voice conversion," IEICE Technical Report SP2013-88, vol. 113, no. 366, pp. 83-88, 2013.
-
(2013)
IEICE Technical Report SP2013-88
, vol.113
, Issue.366
, pp. 83-88
-
-
Nakashika, T.1
Takiguchi, T.2
Ariki, Y.3
-
26
-
-
84055211743
-
Acoustic modeling using deep belief networks
-
Abdel-rahman Mohamed, George E. Dahl, and Geoffrey Hinton, "Acoustic modeling using deep belief networks," IEEE Trans. Audio, Speech, Lang. Process., vol. 20, no. 1, pp. 14-22, 2012.
-
(2012)
IEEE Trans. Audio, Speech, Lang. Process.
, vol.20
, Issue.1
, pp. 14-22
-
-
Mohamed, A.1
Dahl, G.E.2
Hinton, G.3
-
27
-
-
33745805403
-
A fast learning algorithm for deep belief nets
-
Geoffrey E. Hinton, Simon Osindero, and Yee-Whye Teh, "A fast learning algorithm for deep belief nets," Neural computation, vol. 18, no. 7, pp. 1527-1554, 2006.
-
(2006)
Neural Computation
, vol.18
, Issue.7
, pp. 1527-1554
-
-
Hinton, G.E.1
Osindero, S.2
Teh, Y.3
-
28
-
-
78149306047
-
3D object recognition with deep belief nets
-
Vinod Nair and Geoffrey E. Hinton, "3D object recognition with deep belief nets," in NIPS, 2009, pp. 1339-1347.
-
(2009)
NIPS
, pp. 1339-1347
-
-
Nair, V.1
Hinton, G.E.2
-
29
-
-
84991233704
-
A deep learning approach to machine transliteration
-
Thomas Deselaers, Saša Hasan, Oliver Bender, and Hermann Ney, "A deep learning approach to machine transliteration," in Statis. Machine Trans., 2009, pp. 233-241.
-
(2009)
Statis. Machine Trans.
, pp. 233-241
-
-
Deselaers, T.1
Hasan, S.2
Bender, O.3
Ney, H.4
-
30
-
-
33645712892
-
Compressed sensing
-
David L Donoho, "Compressed sensing," IEEE Transactions on Information Theory, vol. 52, no. 4, pp. 1289-1306, 2006.
-
(2006)
IEEE Transactions on Information Theory
, vol.52
, Issue.4
, pp. 1289-1306
-
-
Donoho, D.L.1
-
31
-
-
5044219639
-
Superresolution through neighbor embedding
-
IEEE
-
Hong Chang, Dit-Yan Yeung, and Yimin Xiong, "Superresolution through neighbor embedding," in Computer Vision and Pattern Recognition. IEEE, 2004, vol. 1, pp. 275-282.
-
(2004)
Computer Vision and Pattern Recognition
, vol.1
, pp. 275-282
-
-
Chang, H.1
Yeung, D.2
Xiong, Y.3
-
33
-
-
79959342724
-
Improved learning of Gaussian-Bernoulli restricted Boltzmann machines
-
Springer
-
KyungHyun Cho, Alexander Ilin, and Tapani Raiko, "Improved learning of Gaussian-Bernoulli restricted Boltzmann machines," in ICANN, pp. 10-17. Springer, 2011.
-
(2011)
ICANN
, pp. 10-17
-
-
Cho, K.1
Ilin, A.2
Raiko, T.3
-
34
-
-
85161980001
-
Sparse deep belief net model for visual area V2
-
Honglak Lee, Chaitanya Ekanadham, and Andrew Y. Ng, "Sparse deep belief net model for visual area V2," in Advances in Neural Info. Process. Systems, 2008, pp. 873-880.
-
(2008)
Advances in Neural Info. Process. Systems
, pp. 873-880
-
-
Lee, H.1
Ekanadham, C.2
Ng, A.Y.3
-
35
-
-
0025475528
-
ATR Japanese speech database as a tool of speech recognition and synthesis
-
Akira Kurematsu, Kazuya Takeda, Yoshinori Sagisaka, Shigeru Katagiri, Hisao Kuwabara, and Kiyohiro Shikano, "ATR Japanese speech database as a tool of speech recognition and synthesis," Speech Communication, vol. 9, no. 4, pp. 357-363, 1990.
-
(1990)
Speech Communication
, vol.9
, Issue.4
, pp. 357-363
-
-
Kurematsu, A.1
Takeda, K.2
Sagisaka, Y.3
Katagiri, S.4
Kuwabara, H.5
Shikano, K.6
-
36
-
-
51449108867
-
TANDEMSTRAIGHT: A temporally stable power spectral representation for periodic signals and applications to interference-free spectrum, F0, and aperiodicity estimation
-
IEEE
-
Hideki Kawahara, Masanori Morise, Toru Takahashi, Ryuichi Nisimura, Toshio Irino, and Hideki Banno, "TANDEMSTRAIGHT: A temporally stable power spectral representation for periodic signals and applications to interference-free spectrum, F0, and aperiodicity estimation," in ICASSP. IEEE, 2008, pp. 3933-3936.
-
(2008)
ICASSP
, pp. 3933-3936
-
-
Kawahara, H.1
Morise, M.2
Takahashi, T.3
Nisimura, R.4
Irino, T.5
Banno, H.6
-
37
-
-
80052359758
-
Speech reconstruction from mel-frequency cepstral coefficients using a source-filter model
-
Ben Milner and Xu Shao, "Speech reconstruction from mel-frequency cepstral coefficients using a source-filter model," in Interspeech, 2002, pp. 2421-2424.
-
(2002)
Interspeech
, pp. 2421-2424
-
-
Milner, B.1
Shao, X.2