SCOPUS Information Retrieval Platform

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Volume , Issue , 2014, Pages 7889-7893

Voice conversion in time-invariant speaker-independent space

(3) Nakashika, Toru (a); Takiguchi, Tetsuya (a); Ariki, Yasuo (a)

(a) KOBE UNIVERSITY (Japan)

Author keywords

conditional restricted Boltzmann machine; deep learning; speaker specific features; Voice conversion

Indexed keywords

SIGNAL PROCESSING;

ACOUSTIC FEATURES; CONDITIONAL RESTRICTED BOLTZMANN MACHINES; DEEP LEARNING; NEURAL NETWORK (NN); OBJECTIVE AND SUBJECTIVE EVALUATIONS; SPEAKER-SPECIFIC FEATURES; TIME INVARIANTS; VOICE CONVERSION;

SPEECH PROCESSING;

EID: 84905252390
PISSN: 15206149
EISSN: None
Source Type: Conference Proceeding
DOI: 10.1109/ICASSP.2014.6855136
Document Type: Conference Paper

Times cited : (6)

References (19)

1
- 0031623661
- Spectral voice conversion for text-to-speech synthesis
- Alexander Kain and Michael W. Macon, "Spectral voice conversion for text-to-speech synthesis," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 1998, pp. 285-288.
- (1998) IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 285-288
- Kain, A.¹ Macon, M.W.²

2
- 84865747520
- Intonation conversion from neutral to expressive speech
- Christophe Veaux and X. Rodet, "Intonation conversion from neutral to expressive speech," in Proc. Interspeech, 2011, pp. 2765-2768.
- (2011) Proc. Interspeech , pp. 2765-2768
- Veaux, C.¹ Rodet, X.²

3
- 80052698826
- Speaking-aid systems using GMM-based voice conversion for electrolaryngeal speech
- Keigo Nakamura, Tomoki Toda, Hiroshi Saruwatari, and Kiyohiro Shikano, "Speaking-aid systems using GMM-based voice conversion for electrolaryngeal speech," Speech Communication, vol. 54, no. 1, pp. 134-146, 2012.
- (2012) Speech Communication , vol.54 , Issue.1 , pp. 134-146
- Nakamura, K.¹ Toda, T.² Saruwatari, H.³ Shikano, K.⁴

4
- 0034855352
- High-performance robust speech recognition using stereo training data
- Li Deng, Alex Acero, Li Jiang, Jasha Droppo, and Xuedong Huang, "High-performance robust speech recognition using stereo training data," in IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2001, pp. 301-304.
- (2001) IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) , pp. 301-304
- Deng, L.¹ Acero, A.² Jiang, L.³ Droppo, J.⁴ Huang, X.⁵

5
- 70450192197
- Speech generation from hand gestures based on space mapping
- Aki Kunikoshi, Yu Qiao, Nobuaki Minematsu, and Keikichi Hirose, "Speech generation from hand gestures based on space mapping," in Proc. Interspeech, 2009, pp. 308-311.
- (2009) Proc. Interspeech , pp. 308-311
- Kunikoshi, A.¹ Qiao, Y.² Minematsu, N.³ Hirose, K.⁴

6
- 0021412027
- Vector quantization
- Robert Gray, "Vector quantization," IEEE ASSP Magazine, vol. 1, no. 2, pp. 4-29, 1984.
- (1984) IEEE ASSP Magazine , vol.1 , Issue.2 , pp. 4-29
- Gray, R.¹

7
- 0026880275
- Voice transformation using PSOLA technique
- H. Valbret, E. Moulines, and Jean-Pierre Tubach, "Voice transformation using PSOLA technique," Speech Communication, vol. 11, no. 2, pp. 175-187, 1992.
- (1992) Speech Communication , vol.11 , Issue.2 , pp. 175-187
- Valbret, H.¹ Moulines, E.² Tubach, J.-P.³

8
- 0032026483
- Continuous probabilistic transform for voice conversion
- Yannis Stylianou, Olivier Cappé, and Eric Moulines, "Continuous probabilistic transform for voice conversion," IEEE Transactions on Speech and Audio Processing, vol. 6, no. 2, pp. 131-142, 1998.
- (1998) IEEE Transactions on Speech and Audio Processing , vol.6 , Issue.2 , pp. 131-142
- Stylianou, Y.¹ Cappé, O.² Moulines, E.³

9
- 57749193836
- Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory
- Tomoki Toda, Alan W. Black, and Keiichi Tokuda, "Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory," IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, no. 8, pp. 2222-2235, 2007.
- (2007) IEEE Transactions on Audio, Speech, and Language Processing , vol.15 , Issue.8 , pp. 2222-2235
- Toda, T.¹ Black, A.W.² Tokuda, K.³

10
- 77953712499
- Voice conversion using partial least squares regression
- Elina Helander, Tuomas Virtanen, Jani Nurminen, and Moncef Gabbouj, "Voice conversion using partial least squares regression," IEEE Transactions on Audio, Speech, and Language Processing, vol. 18, no. 5, pp. 912-921, 2010.
- (2010) IEEE Transactions on Audio, Speech, and Language Processing , vol.18 , Issue.5 , pp. 912-921
- Helander, E.¹ Virtanen, T.² Nurminen, J.³ Gabbouj, M.⁴

11
- 70349197691
- Voice conversion using artificial neural networks
- Srinivas Desai, E. Veera Raghavendra, B. Yegnanarayana, Alan W. Black, and Kishore Prahallad, "Voice conversion using artificial neural networks," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2009, pp. 3893-3896.
- (2009) IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE , pp. 3893-3896
- Desai, S.¹ Veera Raghavendra, E.² Yegnanarayana, B.³ Black, A.W.⁴ Prahallad, K.⁵

12
- 84906280857
- Voice conversion in high-order eigen space using deep belief nets
- Toru Nakashika, Ryoichi Takashima, Tetsuya Takiguchi, and Yasuo Ariki, "Voice conversion in high-order eigen space using deep belief nets," in Proc. Interspeech, 2013, pp. 369-372.
- (2013) Proc. Interspeech , pp. 369-372
- Nakashika, T.¹ Takashima, R.² Takiguchi, T.³ Ariki, Y.⁴

13
- 33745805403
- A fast learning algorithm for deep belief nets
- Geoffrey E. Hinton, Simon Osindero, and Yee-Whye Teh, "A fast learning algorithm for deep belief nets," Neural Computation, vol. 18, no. 7, pp. 1527-1554, 2006.
- (2006) Neural Computation , vol.18 , Issue.7 , pp. 1527-1554
- Hinton, G.E.¹ Osindero, S.² Teh, Y.-W.³

14
- 84889579519
- Conditional restricted Boltzmann machine for voice conversion
- Zhizheng Wu, Eng Siong Chng, and Haizhou Li, "Conditional restricted Boltzmann machine for voice conversion," in IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP), 2013.
- (2013) IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP)
- Wu, Z.¹ Siong Chng, E.² Li, H.³

15
- 84864026688
- Modeling human motion using binary latent variables
- Graham W. Taylor, Geoffrey E. Hinton, and Sam T. Roweis, "Modeling human motion using binary latent variables," in Advances in Neural Information Processing Systems, 2006, pp. 1345-1352.
- (2006) Advances in Neural Information Processing Systems , pp. 1345-1352
- Taylor, G.W.¹ Hinton, G.E.² Roweis, S.T.³

16
- 56449085852
- Unsupervised learning of distributions of binary vectors using two layer networks
- Yoav Freund and David Haussler, Unsupervised learning of distributions of binary vectors using two layer networks, Computer Research Laboratory, 1994.
- (1994) Unsupervised Learning of Distributions of Binary Vectors Using Two Layer Networks
- Freund, Y.¹ Haussler, D.²

17
- 0025475528
- ATR Japanese speech database as a tool of speech recognition and synthesis
- Akira Kurematsu, Kazuya Takeda, Yoshinori Sagisaka, Shigeru Katagiri, Hisao Kuwabara, and Kiyohiro Shikano, "ATR Japanese speech database as a tool of speech recognition and synthesis," Speech Communication, vol. 9, no. 4, pp. 357-363, 1990.
- (1990) Speech Communication , vol.9 , Issue.4 , pp. 357-363
- Kurematsu, A.¹ Takeda, K.² Sagisaka, Y.³ Katagiri, S.⁴ Kuwabara, H.⁵ Shikano, K.⁶

18
- 51449108867
- TANDEMSTRAIGHT: A temporally stable power spectral representation for periodic signals and applications to interference-free spectrum, f0, and aperiodicity estimation
- Hideki Kawahara, Masanori Morise, Toru Takahashi, Ryuichi Nisimura, Toshio Irino, and Hideki Banno, "TANDEMSTRAIGHT: A temporally stable power spectral representation for periodic signals and applications to interference-free spectrum, f0, and aperiodicity estimation," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2008, pp. 3933-3936.
- (2008) IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE , pp. 3933-3936
- Kawahara, H.¹ Morise, M.² Takahashi, T.³ Nisimura, R.⁴ Irino, T.⁵ Banno, H.⁶

19
- 80052359758
- Speech reconstruction from mel-frequency cepstral coefficients using a source-filter model
- Ben Milner and Xu Shao, "Speech reconstruction from mel-frequency cepstral coefficients using a source-filter model," in Proc. Interspeech, 2002, pp. 2421-2424.
- (2002) Proc. Interspeech , pp. 2421-2424
- Milner, B.¹ Shao, X.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.