SCOPUS Information Search Platform

Proceedings - IEEE International Conference on Multimedia and Expo

Volume 2015-August, 2015

Sparse nonlinear representation for voice conversion

(3) Nakashika, Toru (a); Takiguchi, Tetsuya (b); Ariki, Yasuo (b)

a UNIVERSITY OF ELECTRO COMMUNICATIONS (Japan)

b KOBE UNIVERSITY (Japan)

Author keywords

Joint Density; Parallel Dictionary Learning; Restricted Boltzmann Machine; Sparse Representation; Voice Conversion

Indexed keywords

FACTORIZATION; GAUSSIAN DISTRIBUTION;

DICTIONARY LEARNING; JOINT DENSITIES; RESTRICTED BOLTZMANN MACHINE; SPARSE REPRESENTATION; VOICE CONVERSION;

MATRIX ALGEBRA;

EID: 84946019814 PISSN: 19457871 EISSN: 1945788X Source Type: Conference Proceeding
DOI: 10.1109/ICME.2015.7177437 Document Type: Conference Paper

Times cited : (4)

References (37)

1
- 0031623661
- Spectral voice conversion for text-to-speech synthesis
- Alexander Kain and Michael W. Macon, "Spectral voice conversion for text-to-speech synthesis," in ICASSP, 1998, pp. 285-288.
- (1998) ICASSP , pp. 285-288
- Kain, A.¹ Macon, M.W.²

2
- 84865747520
- Intonation conversion from neutral to expressive speech
- Christophe Veaux and X. Rodet, "Intonation conversion from neutral to expressive speech," in Interspeech, 2011, pp. 2765-2768.
- (2011) Interspeech , pp. 2765-2768
- Veaux, C.¹ Rodet, X.²

3
- 80052698826
- Speaking-aid systems using gmm-based voice conversion for electrolaryngeal speech
- Keigo Nakamura, Tomoki Toda, Hiroshi Saruwatari, and Kiyohiro Shikano, "Speaking-aid systems using gmm-based voice conversion for electrolaryngeal speech," Speech Communication, vol. 54, no. 1, pp. 134-146, 2012.
- (2012) Speech Communication , vol.54 , Issue.1 , pp. 134-146
- Nakamura, K.¹ Toda, T.² Saruwatari, H.³ Shikano, K.⁴

4
- 0034855352
- High-performance robust speech recognition using stereo training data
- Li Deng, Alex Acero, Li Jiang, Jasha Droppo, and Xuedong Huang, "High-performance robust speech recognition using stereo training data," in ICASSP, 2001, pp. 301-304.
- (2001) ICASSP , pp. 301-304
- Deng, L.¹ Acero, A.² Jiang, L.³ Droppo, J.⁴ Huang, X.⁵

5
- 70450192197
- Speech generation from hand gestures based on space mapping
- Aki Kunikoshi, Yu Qiao, Nobuaki Minematsu, and Keikichi Hirose, "Speech generation from hand gestures based on space mapping," in Interspeech, 2009, pp. 308-311.
- (2009) Interspeech , pp. 308-311
- Kunikoshi, A.¹ Qiao, Y.² Minematsu, N.³ Hirose, K.⁴

6
- 84910091291
- Multimodal exemplar-based voice conversion using lip features in noisy environments
- Kenta Masaka, Ryo Aihara, Tetsuya Takiguchi, and Yasuo Ariki, "Multimodal exemplar-based voice conversion using lip features in noisy environments," in Interspeech, 2014, pp. 1159-1163.
- (2014) Interspeech , pp. 1159-1163
- Masaka, K.¹ Aihara, R.² Takiguchi, T.³ Ariki, Y.⁴

7
- 0021412027
- Vector quantization
- Robert Gray, "Vector quantization," IEEE ASSP Magazine, vol. 1, no. 2, pp. 4-29, 1984.
- (1984) IEEE ASSP Magazine , vol.1 , Issue.2 , pp. 4-29
- Gray, R.¹

8
- 0026880275
- Voice transformation using PSOLA technique
- H. Valbret, E. Moulines, and Jean-Pierre Tubach, "Voice transformation using PSOLA technique," Speech Communication, vol. 11, no. 2, pp. 175-187, 1992.
- (1992) Speech Communication , vol.11 , Issue.2 , pp. 175-187
- Valbret, H.¹ Moulines, E.² Tubach, J.³

9
- 0032026483
- Continuous probabilistic transform for voice conversion
- Yannis Stylianou, Olivier Cappé, and Eric Moulines, "Continuous probabilistic transform for voice conversion," IEEE Trans. Speech, Audio Process., vol. 6, no. 2, pp. 131-142, 1998.
- (1998) IEEE Trans. Speech, Audio Process. , vol.6 , Issue.2 , pp. 131-142
- Stylianou, Y.¹ Cappé, O.² Moulines, E.³

10
- 57749193836
- Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory
- Tomoki Toda, Alan W. Black, and Keiichi Tokuda, "Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory," IEEE Trans. Speech, Audio Process., vol. 15, no. 8, pp. 2222-2235, 2007.
- (2007) IEEE Trans. Speech, Audio Process. , vol.15 , Issue.8 , pp. 2222-2235
- Toda, T.¹ Black, A.W.² Tokuda, K.³

11
- 77953712499
- Voice conversion using partial least squares regression
- Elina Helander, Tuomas Virtanen, Jani Nurminen, and Moncef Gabbouj, "Voice conversion using partial least squares regression," IEEE Trans. Speech, Audio Process., vol. 18, no. 5, pp. 912-921, 2010.
- (2010) IEEE Trans. Speech, Audio Process. , vol.18 , Issue.5 , pp. 912-921
- Helander, E.¹ Virtanen, T.² Nurminen, J.³ Gabbouj, M.⁴

12
- 84865798483
- One-to-many voice conversion based on tensor representation of speaker space
- Daisuke Saito, Keisuke Yamamoto, Nobuaki Minematsu, and Keikichi Hirose, "One-to-many voice conversion based on tensor representation of speaker space," in Interspeech, 2011, pp. 653-656.
- (2011) Interspeech , pp. 653-656
- Saito, D.¹ Yamamoto, K.² Minematsu, N.³ Hirose, K.⁴

13
- 44949210554
- Map-based adaptation for speech conversion using adaptation data selection and nonparallel training
- Chung-Han Lee and Chung-Hsien Wu, "Map-based adaptation for speech conversion using adaptation data selection and nonparallel training," in Interspeech, 2006, pp. 2254-2257.
- (2006) Interspeech , pp. 2254-2257
- Lee, C.-H.¹ Wu, C.-H.²

14
- 84901237776
- Modeling spectral envelopes using restricted boltzmann machines and deep belief networks for statistical parametric speech synthesis
- Z-H Ling, Li Deng, and Dong Yu, "Modeling spectral envelopes using restricted boltzmann machines and deep belief networks for statistical parametric speech synthesis," IEEE Trans. Audio, Speech, Lang. Process., no. 10, pp. 2129-2139, 2013.
- (2013) IEEE Trans. Audio, Speech, Lang. Process. , Issue.10 , pp. 2129-2139
- Ling, Z.-H.¹ Deng, L.² Yu, D.³

15
- 84906276055
- Exemplar-based unit selection for voice conversion utilizing temporal information
- Zhizheng Wu, Tuomas Virtanen, Tomi Kinnunen, Eng Siong Chng, and Haizhou Li, "Exemplar-based unit selection for voice conversion utilizing temporal information," in Interspeech, 2013, pp. 3057-3061.
- (2013) Interspeech , pp. 3057-3061
- Wu, Z.¹ Virtanen, T.² Kinnunen, T.³ Chng, E.S.⁴ Li, H.⁵

16
- 84906281888
- Alleviating the over-smoothing problem in gmm-based voice conversion with discriminative training
- Hsin-Te Hwang, Yu Tsao, Hsin-Min Wang, Yih-Ru Wang, and Sin-Horng Chen, "Alleviating the over-smoothing problem in gmm-based voice conversion with discriminative training," in Interspeech, 2013, pp. 3062-3066.
- (2013) Interspeech , pp. 3062-3066
- Hwang, H.¹ Tsao, Y.² Wang, H.³ Wang, Y.⁴ Chen, S.⁵

17
- 0001093042
- Algorithms for nonnegative matrix factorization
- Daniel D Lee and H Sebastian Seung, "Algorithms for nonnegative matrix factorization," in Advances in neural information processing systems, 2000, pp. 556-562.
- (2000) Advances in Neural Information Processing Systems , pp. 556-562
- Lee, D.D.¹ Seung, H.S.²

18
- 84874248255
- Exemplar-based voice conversion in noisy environment
- Ryoichi Takashima, Tetsuya Takiguchi, and Yasuo Ariki, "Exemplar-based voice conversion in noisy environment," in SLT, 2012, pp. 313-317.
- (2012) SLT , pp. 313-317
- Takashima, R.¹ Takiguchi, T.² Ariki, Y.³

19
- 84905271796
- Noise-robust voice conversion based on spectral mapping on sparse space
- Ryoichi Takashima, Ryo Aihara, Tetsuya Takiguchi, and Yasuo Ariki, "Noise-robust voice conversion based on spectral mapping on sparse space," in SSW8, 2013, pp. 71-75.
- (2013) SSW8 , pp. 71-75
- Takashima, R.¹ Aihara, R.² Takiguchi, T.³ Ariki, Y.⁴

20
- 70349197691
- Voice conversion using artificial neural networks
- Srinivas Desai, E. Veera Raghavendra, B. Yegnanarayana, Alan W. Black, and Kishore Prahallad, "Voice conversion using artificial neural networks," in ICASSP. IEEE, 2009, pp. 3893-3896.
- (2009) ICASSP. IEEE , pp. 3893-3896
- Desai, S.¹ Raghavendra, E.V.² Yegnanarayana, B.³ Black, A.W.⁴ Prahallad, K.⁵

21
- 84906225084
- Joint spectral distribution modeling using restricted boltzmann machines for voice conversion
- Ling-Hui Chen, Zhen-Hua Ling, Yan Song, and Li-Rong Dai, "Joint spectral distribution modeling using restricted boltzmann machines for voice conversion," in Interspeech, 2013, pp. 3052-3056.
- (2013) Interspeech , pp. 3052-3056
- Chen, L.¹ Ling, Z.² Song, Y.³ Dai, L.⁴

22
- 84889579519
- Conditional restricted boltzmann machine for voice conversion
- Zhizheng Wu, Eng Siong Chng, and Haizhou Li, "Conditional restricted boltzmann machine for voice conversion," in ChinaSIP, 2013.
- (2013) ChinaSIP
- Wu, Z.¹ Chng, E.S.² Li, H.³

23
- 84864026688
- Modeling human motion using binary latent variables
- Graham W. Taylor, Geoffrey E. Hinton, and Sam T. Roweis, "Modeling human motion using binary latent variables," in Advances in neural information processing systems, 2006, pp. 1345-1352.
- (2006) Advances in Neural Information Processing Systems , pp. 1345-1352
- Taylor, G.W.¹ Hinton, G.E.² Roweis, S.T.³

24
- 84906280857
- Voice conversion in high-order eigen space using deep belief nets
- Toru Nakashika, Ryoichi Takashima, Tetsuya Takiguchi, and Yasuo Ariki, "Voice conversion in high-order eigen space using deep belief nets," in Interspeech, 2013, pp. 369-372.
- (2013) Interspeech , pp. 369-372
- Nakashika, T.¹ Takashima, R.² Takiguchi, T.³ Ariki, Y.⁴

25
- 85053885315
- Speaker-dependent conditional restricted boltzmann machine for voice conversion
- Toru Nakashika, Tetsuya Takiguchi, and Yasuo Ariki, "Speaker-dependent conditional restricted boltzmann machine for voice conversion," IEICE Technical Report SP2013-88, vol. 113, no. 366, pp. 83-88, 2013.
- (2013) IEICE Technical Report SP2013-88 , vol.113 , Issue.366 , pp. 83-88
- Nakashika, T.¹ Takiguchi, T.² Ariki, Y.³

26
- 84055211743
- Acoustic modeling using deep belief networks
- Abdel-rahman Mohamed, George E. Dahl, and Geoffrey Hinton, "Acoustic modeling using deep belief networks," IEEE Trans. Audio, Speech, Lang. Process., vol. 20, no. 1, pp. 14-22, 2012.
- (2012) IEEE Trans. Audio, Speech, Lang. Process. , vol.20 , Issue.1 , pp. 14-22
- Mohamed, A.¹ Dahl, G.E.² Hinton, G.³

27
- 33745805403
- A fast learning algorithm for deep belief nets
- Geoffrey E. Hinton, Simon Osindero, and Yee-Whye Teh, "A fast learning algorithm for deep belief nets," Neural computation, vol. 18, no. 7, pp. 1527-1554, 2006.
- (2006) Neural Computation , vol.18 , Issue.7 , pp. 1527-1554
- Hinton, G.E.¹ Osindero, S.² Teh, Y.³

28
- 78149306047
- 3D object recognition with deep belief nets
- Vinod Nair and Geoffrey E Hinton, "3D object recognition with deep belief nets.," in NIPS, 2009, pp. 1339-1347.
- (2009) NIPS , pp. 1339-1347
- Nair, V.¹ Hinton, G.E.²

29
- 84991233704
- A deep learning approach to machine transliteration
- Thomas Deselaers, Saša Hasan, Oliver Bender, and Hermann Ney, "A deep learning approach to machine transliteration," in Statis. Machine Trans., 2009, pp. 233-241.
- (2009) Statis. Machine Trans. , pp. 233-241
- Deselaers, T.¹ Hasan, S.² Bender, O.³ Ney, H.⁴

30
- 33645712892
- Compressed sensing
- David L Donoho, "Compressed sensing," IEEE Transactions on Information Theory, vol. 52, no. 4, pp. 1289-1306, 2006.
- (2006) IEEE Transactions on Information Theory , vol.52 , Issue.4 , pp. 1289-1306
- Donoho, D.L.¹

31
- 5044219639
- Superresolution through neighbor embedding
- Hong Chang, Dit-Yan Yeung, and Yimin Xiong, "Superresolution through neighbor embedding," in Computer Vision and Pattern Recognition. IEEE, 2004, vol. 1, pp. 275-282.
- (2004) Computer Vision and Pattern Recognition , vol.1 , pp. 275-282
- Chang, H.¹ Yeung, D.² Xiong, Y.³

32
- 56449085852
- Unsupervised learning of distributions of binary vectors using two layer networks
- Yoav Freund and David Haussler, Unsupervised learning of distributions of binary vectors using two layer networks, Computer Research Laboratory, 1994.
- (1994) Unsupervised Learning of Distributions of Binary Vectors Using Two Layer Networks
- Freund, Y.¹ Haussler, D.²

33
- 79959342724
- Improved learning of Gaussian-bernoulli restricted boltzmann machines
- KyungHyun Cho, Alexander Ilin, and Tapani Raiko, "Improved learning of gaussian-bernoulli restricted boltzmann machines," in ICANN, pp. 10-17. Springer, 2011.
- (2011) ICANN , pp. 10-17
- Cho, K.¹ Ilin, A.² Raiko, T.³

34
- 85161980001
- Sparse deep belief net model for visual area v2
- Honglak Lee, Chaitanya Ekanadham, and Andrew Y Ng, "Sparse deep belief net model for visual area v2," in Advances in Neural Info. Process. Systems, 2008, pp. 873-880.
- (2008) Advances in Neural Info. Process. Systems , pp. 873-880
- Lee, H.¹ Ekanadham, C.² Ng, A.Y.³

35
- 0025475528
- ATR Japanese speech database as a tool of speech recognition and synthesis
- Akira Kurematsu, Kazuya Takeda, Yoshinori Sagisaka, Shigeru Katagiri, Hisao Kuwabara, and Kiyohiro Shikano, "ATR japanese speech database as a tool of speech recognition and synthesis," Speech Communication, vol. 9, no. 4, pp. 357-363, 1990.
- (1990) Speech Communication , vol.9 , Issue.4 , pp. 357-363
- Kurematsu, A.¹ Takeda, K.² Sagisaka, Y.³ Katagiri, S.⁴ Kuwabara, H.⁵ Shikano, K.⁶

36
- 51449108867
- TANDEMSTRAIGHT: A temporally stable power spectral representation for periodic signals and applications to interference-free spectrum, F0, and aperiodicity estimation
- Hideki Kawahara, Masanori Morise, Toru Takahashi, Ryuichi Nisimura, Toshio Irino, and Hideki Banno, "TANDEMSTRAIGHT: A temporally stable power spectral representation for periodic signals and applications to interference-free spectrum, F0, and aperiodicity estimation," in ICASSP. IEEE, 2008, pp. 3933-3936.
- (2008) ICASSP , pp. 3933-3936
- Kawahara, H.¹ Morise, M.² Takahashi, T.³ Nisimura, R.⁴ Irino, T.⁵ Banno, H.⁶

37
- 80052359758
- Speech reconstruction from mel-frequency cepstral coefficients using a source-filter model
- Ben Milner and Xu Shao, "Speech reconstruction from mel-frequency cepstral coefficients using a source-filter model," in Interspeech, 2002, pp. 2421-2424.
- (2002) Interspeech , pp. 2421-2424
- Milner, B.¹ Shao, X.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.