Volume 2015-August, 2015

Sparse nonlinear representation for voice conversion

Author keywords

Joint Density; Parallel Dictionary Learning; Restricted Boltzmann Machine; Sparse Representation; Voice Conversion

Indexed keywords

FACTORIZATION; GAUSSIAN DISTRIBUTION

EID: 84946019814; PISSN: 19457871; EISSN: 1945788X; Source Type: Conference Proceeding
DOI: 10.1109/ICME.2015.7177437; Document Type: Conference Paper
Times cited: 4

References (37)
  • 1
    • 0031623661
    • Spectral voice conversion for text-to-speech synthesis
    • Alexander Kain and Michael W. Macon, "Spectral voice conversion for text-to-speech synthesis," in ICASSP, 1998, pp. 285-288.
    • (1998) ICASSP, pp. 285-288
    • Kain, A.; Macon, M.W.
  • 2
    • 84865747520
    • Intonation conversion from neutral to expressive speech
    • Christophe Veaux and Xavier Rodet, "Intonation conversion from neutral to expressive speech," in Interspeech, 2011, pp. 2765-2768.
    • (2011) Interspeech, pp. 2765-2768
    • Veaux, C.; Rodet, X.
  • 3
    • 80052698826
    • Speaking-aid systems using GMM-based voice conversion for electrolaryngeal speech
    • Keigo Nakamura, Tomoki Toda, Hiroshi Saruwatari, and Kiyohiro Shikano, "Speaking-aid systems using GMM-based voice conversion for electrolaryngeal speech," Speech Communication, vol. 54, no. 1, pp. 134-146, 2012.
    • (2012) Speech Communication, vol. 54, no. 1, pp. 134-146
    • Nakamura, K.; Toda, T.; Saruwatari, H.; Shikano, K.
  • 4
    • 0034855352
    • High-performance robust speech recognition using stereo training data
    • Li Deng, Alex Acero, Li Jiang, Jasha Droppo, and Xuedong Huang, "High-performance robust speech recognition using stereo training data," in ICASSP, 2001, pp. 301-304.
    • (2001) ICASSP, pp. 301-304
    • Deng, L.; Acero, A.; Jiang, L.; Droppo, J.; Huang, X.
  • 5
    • 70450192197
    • Speech generation from hand gestures based on space mapping
    • Aki Kunikoshi, Yu Qiao, Nobuaki Minematsu, and Keikichi Hirose, "Speech generation from hand gestures based on space mapping," in Interspeech, 2009, pp. 308-311.
    • (2009) Interspeech, pp. 308-311
    • Kunikoshi, A.; Qiao, Y.; Minematsu, N.; Hirose, K.
  • 6
    • 84910091291
    • Multimodal exemplar-based voice conversion using lip features in noisy environments
    • Kenta Masaka, Ryo Aihara, Tetsuya Takiguchi, and Yasuo Ariki, "Multimodal exemplar-based voice conversion using lip features in noisy environments," in Interspeech, 2014, pp. 1159-1163.
    • (2014) Interspeech, pp. 1159-1163
    • Masaka, K.; Aihara, R.; Takiguchi, T.; Ariki, Y.
  • 7
    • 0021412027
    • Vector quantization
    • Robert Gray, "Vector quantization," IEEE ASSP Magazine, vol. 1, no. 2, pp. 4-29, 1984.
    • (1984) IEEE ASSP Magazine, vol. 1, no. 2, pp. 4-29
    • Gray, R.
  • 8
    • 0026880275
    • Voice transformation using PSOLA technique
    • H. Valbret, E. Moulines, and Jean-Pierre Tubach, "Voice transformation using PSOLA technique," Speech Communication, vol. 11, no. 2, pp. 175-187, 1992.
    • (1992) Speech Communication, vol. 11, no. 2, pp. 175-187
    • Valbret, H.; Moulines, E.; Tubach, J.-P.
  • 9
    • 0032026483
    • Continuous probabilistic transform for voice conversion
    • Yannis Stylianou, Olivier Cappé, and Eric Moulines, "Continuous probabilistic transform for voice conversion," IEEE Trans. Speech, Audio Process., vol. 6, no. 2, pp. 131-142, 1998.
    • (1998) IEEE Trans. Speech, Audio Process., vol. 6, no. 2, pp. 131-142
    • Stylianou, Y.; Cappé, O.; Moulines, E.
  • 10
    • 57749193836
    • Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory
    • Tomoki Toda, Alan W. Black, and Keiichi Tokuda, "Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 8, pp. 2222-2235, 2007.
    • (2007) IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 8, pp. 2222-2235
    • Toda, T.; Black, A.W.; Tokuda, K.
  • 12
    • 84865798483
    • One-to-many voice conversion based on tensor representation of speaker space
    • Daisuke Saito, Keisuke Yamamoto, Nobuaki Minematsu, and Keikichi Hirose, "One-to-many voice conversion based on tensor representation of speaker space," in Interspeech, 2011, pp. 653-656.
    • (2011) Interspeech, pp. 653-656
    • Saito, D.; Yamamoto, K.; Minematsu, N.; Hirose, K.
  • 13
    • 44949210554
    • MAP-based adaptation for speech conversion using adaptation data selection and nonparallel training
    • Chung-Han Lee and Chung-Hsien Wu, "MAP-based adaptation for speech conversion using adaptation data selection and nonparallel training," in Interspeech, 2006, pp. 2254-2257.
    • (2006) Interspeech, pp. 2254-2257
    • Lee, C.-H.; Wu, C.-H.
  • 14
    • 84901237776
    • Modeling spectral envelopes using restricted Boltzmann machines and deep belief networks for statistical parametric speech synthesis
    • Z.-H. Ling, Li Deng, and Dong Yu, "Modeling spectral envelopes using restricted Boltzmann machines and deep belief networks for statistical parametric speech synthesis," IEEE Trans. Audio, Speech, Lang. Process., no. 10, pp. 2129-2139, 2013.
    • (2013) IEEE Trans. Audio, Speech, Lang. Process., no. 10, pp. 2129-2139
    • Ling, Z.-H.; Deng, L.; Yu, D.
  • 15
    • 84906276055
    • Exemplar-based unit selection for voice conversion utilizing temporal information
    • Zhizheng Wu, Tuomas Virtanen, Tomi Kinnunen, Eng Siong Chng, and Haizhou Li, "Exemplar-based unit selection for voice conversion utilizing temporal information," in Interspeech, 2013, pp. 3057-3061.
    • (2013) Interspeech, pp. 3057-3061
    • Wu, Z.; Virtanen, T.; Kinnunen, T.; Chng, E.S.; Li, H.
  • 16
    • 84906281888
    • Alleviating the over-smoothing problem in GMM-based voice conversion with discriminative training
    • Hsin-Te Hwang, Yu Tsao, Hsin-Min Wang, Yih-Ru Wang, and Sin-Horng Chen, "Alleviating the over-smoothing problem in GMM-based voice conversion with discriminative training," in Interspeech, 2013, pp. 3062-3066.
    • (2013) Interspeech, pp. 3062-3066
    • Hwang, H.-T.; Tsao, Y.; Wang, H.-M.; Wang, Y.-R.; Chen, S.-H.
  • 18
    • 84874248255
    • Exemplar-based voice conversion in noisy environment
    • Ryoichi Takashima, Tetsuya Takiguchi, and Yasuo Ariki, "Exemplar-based voice conversion in noisy environment," in SLT, 2012, pp. 313-317.
    • (2012) SLT, pp. 313-317
    • Takashima, R.; Takiguchi, T.; Ariki, Y.
  • 19
    • 84905271796
    • Noise-robust voice conversion based on spectral mapping on sparse space
    • Ryoichi Takashima, Ryo Aihara, Tetsuya Takiguchi, and Yasuo Ariki, "Noise-robust voice conversion based on spectral mapping on sparse space," in SSW8, 2013, pp. 71-75.
    • (2013) SSW8, pp. 71-75
    • Takashima, R.; Aihara, R.; Takiguchi, T.; Ariki, Y.
  • 21
    • 84906225084
    • Joint spectral distribution modeling using restricted Boltzmann machines for voice conversion
    • Ling-Hui Chen, Zhen-Hua Ling, Yan Song, and Li-Rong Dai, "Joint spectral distribution modeling using restricted Boltzmann machines for voice conversion," in Interspeech, 2013, pp. 3052-3056.
    • (2013) Interspeech, pp. 3052-3056
    • Chen, L.-H.; Ling, Z.-H.; Song, Y.; Dai, L.-R.
  • 22
    • 84889579519
    • Conditional restricted Boltzmann machine for voice conversion
    • Zhizheng Wu, Eng Siong Chng, and Haizhou Li, "Conditional restricted Boltzmann machine for voice conversion," in ChinaSIP, 2013.
    • (2013) ChinaSIP
    • Wu, Z.; Chng, E.S.; Li, H.
  • 24
    • 84906280857
    • Voice conversion in high-order eigen space using deep belief nets
    • Toru Nakashika, Ryoichi Takashima, Tetsuya Takiguchi, and Yasuo Ariki, "Voice conversion in high-order eigen space using deep belief nets," in Interspeech, 2013, pp. 369-372.
    • (2013) Interspeech, pp. 369-372
    • Nakashika, T.; Takashima, R.; Takiguchi, T.; Ariki, Y.
  • 25
    • 85053885315
    • Speaker-dependent conditional restricted Boltzmann machine for voice conversion
    • Toru Nakashika, Tetsuya Takiguchi, and Yasuo Ariki, "Speaker-dependent conditional restricted Boltzmann machine for voice conversion," IEICE Technical Report SP2013-88, vol. 113, no. 366, pp. 83-88, 2013.
    • (2013) IEICE Technical Report SP2013-88, vol. 113, no. 366, pp. 83-88
    • Nakashika, T.; Takiguchi, T.; Ariki, Y.
  • 27
    • 33745805403
    • A fast learning algorithm for deep belief nets
    • Geoffrey E. Hinton, Simon Osindero, and Yee-Whye Teh, "A fast learning algorithm for deep belief nets," Neural Computation, vol. 18, no. 7, pp. 1527-1554, 2006.
    • (2006) Neural Computation, vol. 18, no. 7, pp. 1527-1554
    • Hinton, G.E.; Osindero, S.; Teh, Y.-W.
  • 28
    • 78149306047
    • 3D object recognition with deep belief nets
    • Vinod Nair and Geoffrey E. Hinton, "3D object recognition with deep belief nets," in NIPS, 2009, pp. 1339-1347.
    • (2009) NIPS, pp. 1339-1347
    • Nair, V.; Hinton, G.E.
  • 29
    • 84991233704
    • A deep learning approach to machine transliteration
    • Thomas Deselaers, Saša Hasan, Oliver Bender, and Hermann Ney, "A deep learning approach to machine transliteration," in Statis. Machine Trans., 2009, pp. 233-241.
    • (2009) Statis. Machine Trans., pp. 233-241
    • Deselaers, T.; Hasan, S.; Bender, O.; Ney, H.
  • 33
    • 79959342724
    • Improved learning of Gaussian-Bernoulli restricted Boltzmann machines
    • Springer
    • KyungHyun Cho, Alexander Ilin, and Tapani Raiko, "Improved learning of Gaussian-Bernoulli restricted Boltzmann machines," in ICANN, pp. 10-17. Springer, 2011.
    • (2011) ICANN, pp. 10-17
    • Cho, K.; Ilin, A.; Raiko, T.
  • 35
    • 0025475528
    • ATR Japanese speech database as a tool of speech recognition and synthesis
    • Akira Kurematsu, Kazuya Takeda, Yoshinori Sagisaka, Shigeru Katagiri, Hisao Kuwabara, and Kiyohiro Shikano, "ATR Japanese speech database as a tool of speech recognition and synthesis," Speech Communication, vol. 9, no. 4, pp. 357-363, 1990.
    • (1990) Speech Communication, vol. 9, no. 4, pp. 357-363
    • Kurematsu, A.; Takeda, K.; Sagisaka, Y.; Katagiri, S.; Kuwabara, H.; Shikano, K.
  • 36
    • 51449108867
    • TANDEMSTRAIGHT: A temporally stable power spectral representation for periodic signals and applications to interference-free spectrum, F0, and aperiodicity estimation
    • IEEE
    • Hideki Kawahara, Masanori Morise, Toru Takahashi, Ryuichi Nisimura, Toshio Irino, and Hideki Banno, "TANDEMSTRAIGHT: A temporally stable power spectral representation for periodic signals and applications to interference-free spectrum, F0, and aperiodicity estimation," in ICASSP. IEEE, 2008, pp. 3933-3936.
    • (2008) ICASSP, pp. 3933-3936
    • Kawahara, H.; Morise, M.; Takahashi, T.; Nisimura, R.; Irino, T.; Banno, H.
  • 37
    • 80052359758
    • Speech reconstruction from mel-frequency cepstral coefficients using a source-filter model
    • Ben Milner and Xu Shao, "Speech reconstruction from mel-frequency cepstral coefficients using a source-filter model," in Interspeech, 2002, pp. 2421-2424.
    • (2002) Interspeech, pp. 2421-2424
    • Milner, B.; Shao, X.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.