SCOPUS Information Search Platform

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

Volume , Issue , 2011, Pages 653-656

One-to-many voice conversion based on tensor representation of speaker space

(4) Saito, Daisukeᵃ Yamamoto, Keisukeᵃ Minematsu, Nobuakiᵃ Hirose, Keikichiᵃ

Author keywords

Eigenvoice; Gaussian mixture model; Tensor analysis; Tucker decomposition; Voice conversion

Indexed keywords

EIGENVOICES; FLEXIBLE CONTROL; GAUSSIAN COMPONENTS; GAUSSIAN MIXTURE MODEL; HIGH-DIMENSIONAL; MEAN VECTOR; SPEAKER CHARACTERISTICS; SPEAKER RECOGNITION; SUPERVECTOR; TENSOR ANALYSIS; TENSOR REPRESENTATION; TRAINING DATA SETS; VOICE CONVERSION;

SPEECH RECOGNITION; TENSORS; VECTOR SPACES;

SPEECH PROCESSING;

EID: 84865798483 PISSN: None EISSN: 19909772 Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (94)

References (22)

1
- 0023739214
- Voice conversion through vector quantization
- M. Abe, S. Nakamura, K. Shikano, and H. Kuwabara, "Voice conversion through vector quantization," Proc. ICASSP, pp. 655-658, 1988.
- (1988) Proc. ICASSP , pp. 655-658
- Abe, M.¹ Nakamura, S.² Shikano, K.³ Kuwabara, H.⁴

2
- 0031623661
- Spectral voice conversion for text-to-speech synthesis
- A. Kain and M.W. Macon, "Spectral voice conversion for text-to-speech synthesis," Proc. ICASSP, vol. 1, pp. 285-288, 1998.
- (1998) Proc. ICASSP , vol.1 , pp. 285-288
- Kain, A.¹ Macon, M.W.²

3
- 0034855352
- High-performance robust speech recognition using stereo training data
- L. Deng, A. Acero, L. Jiang, J. Droppo, and X. Huang, "High-performance robust speech recognition using stereo training data," Proc. ICASSP, pp. 301-304, 2001.
- (2001) Proc. ICASSP , pp. 301-304
- Deng, L.¹ Acero, A.² Jiang, L.³ Droppo, J.⁴ Huang, X.⁵

4
- 70450192197
- Speech generation from hand gestures based on space mapping
- A. Kunikoshi, Y. Qiao, N. Minematsu, and K. Hirose, "Speech generation from hand gestures based on space mapping," Proc. INTERSPEECH, pp. 308-311, 2009.
- (2009) Proc. INTERSPEECH , pp. 308-311
- Kunikoshi, A.¹ Qiao, Y.² Minematsu, N.³ Hirose, K.⁴

5
- 70349197691
- Voice conversion using artificial neural networks
- S. Desai, E. V. Raghavendra, B. Yegnanarayana, A.W. Black, and K. Prahallad, "Voice conversion using artificial neural networks," Proc. ICASSP, pp. 3893-3896, 2009.
- (2009) Proc. ICASSP , pp. 3893-3896
- Desai, S.¹ Raghavendra, E.V.² Yegnanarayana, B.³ Black, A.W.⁴ Prahallad, K.⁵

6
- 0032026483
- Continuous probabilistic transform for voice conversion
- Y. Stylianou, O. Cappé, and E. Moulines, "Continuous probabilistic transform for voice conversion," IEEE Trans. on Speech and Audio Processing, vol. 6, no. 2, pp. 131-142, 1998.
- (1998) IEEE Trans. on Speech and Audio Processing , vol.6 , Issue.2 , pp. 131-142
- Stylianou, Y.¹ Cappé, O.² Moulines, E.³

7
- 34047245444
- Nonparallel training for voice conversion based on a parameter adaptation approach
- A. Mouchtaris, J. Van der Spiegel, and P. Mueller, "Nonparallel training for voice conversion based on a parameter adaptation approach," IEEE Trans. on Audio, Speech, and Language Processing, vol. 14, no. 3, pp. 952-963, 2006.
- (2006) IEEE Trans. on Audio, Speech, and Language Processing , vol.14 , Issue.3 , pp. 952-963
- Mouchtaris, A.¹ Van der Spiegel, J.² Mueller, P.³

8
- 44949210554
- MAP-based adaptation for speech conversion using adaptation data selection and non-parallel training
- C. H. Lee and C. H. Wu, "MAP-based adaptation for speech conversion using adaptation data selection and non-parallel training," Proc. INTERSPEECH, pp. 2254-2257, 2006.
- (2006) Proc. INTERSPEECH , pp. 2254-2257
- Lee, C.H.¹ Wu, C.H.²

9
- 34547512822
- Eigenvoice conversion based on Gaussian mixture model
- T. Toda, Y. Ohtani, and K. Shikano, "Eigenvoice conversion based on Gaussian mixture model," Proc. INTERSPEECH, pp. 2446-2449, 2006.
- (2006) Proc. INTERSPEECH , pp. 2446-2449
- Toda, T.¹ Ohtani, Y.² Shikano, K.³

10
- 0034320005
- Rapid speaker adaptation in eigenvoice space
- R. Kuhn, J.-C. Junqua, P. Nguyen, and N. Niedzielski, "Rapid speaker adaptation in eigenvoice space," IEEE Trans. on Speech and Audio Processing, vol. 8, no. 6, pp. 695-707, 2000.
- (2000) IEEE Trans. on Speech and Audio Processing , vol.8 , Issue.6 , pp. 695-707
- Kuhn, R.¹ Junqua, J.-C.² Nguyen, P.³ Niedzielski, N.⁴

11
- 58349106697
- A study of interspeaker variability in speaker verification
- P. Kenny, P. Ouellet, N. Dehak, V. Gupta, and P. Dumouchel, "A study of interspeaker variability in speaker verification," IEEE Trans. on Audio, Speech, and Language Processing, vol. 16, no. 5, pp. 980-988, 2008.
- (2008) IEEE Trans. on Audio, Speech, and Language Processing , vol.16 , Issue.5 , pp. 980-988
- Kenny, P.¹ Ouellet, P.² Dehak, N.³ Gupta, V.⁴ Dumouchel, P.⁵

12
- 84944415516
- Multilinear analysis of image ensembles: TensorFaces
- M. A. O. Vasilescu and D. Terzopoulos, "Multilinear analysis of image ensembles: TensorFaces," Proc. ECCV, pp. 447-460, 2002.
- (2002) Proc. ECCV , pp. 447-460
- Vasilescu, M.A.O.¹ Terzopoulos, D.²

13
- 70450182468
- Speaker adaptive training for one-to-many eigenvoice conversion based on Gaussian mixture model
- Y. Ohtani, T. Toda, H. Saruwatari, and K. Shikano, "Speaker adaptive training for one-to-many eigenvoice conversion based on Gaussian mixture model," Proc. INTERSPEECH, pp. 1981-1984, 2007.
- (2007) Proc. INTERSPEECH , pp. 1981-1984
- Ohtani, Y.¹ Toda, T.² Saruwatari, H.³ Shikano, K.⁴

14
- 78049398713
- Non-parallel training for many-to-many eigenvoice conversion
- Y. Ohtani, T. Toda, H. Saruwatari, and K. Shikano, "Non-parallel training for many-to-many eigenvoice conversion," Proc. ICASSP, pp. 4822-4825, 2010.
- (2010) Proc. ICASSP , pp. 4822-4825
- Ohtani, Y.¹ Toda, T.² Saruwatari, H.³ Shikano, K.⁴

15
- 34547496175
- One-to-many and many-to-one voice conversion based on eigenvoices
- T. Toda, Y. Ohtani, and K. Shikano, "One-to-many and many-to-one voice conversion based on eigenvoices," Proc. ICASSP, vol. IV, pp. 693-696, 2007.
- (2007) Proc. ICASSP , vol.4 , pp. 693-696
- Toda, T.¹ Ohtani, Y.² Shikano, K.³

16
- 57749193836
- Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory
- T. Toda, A.W. Black, and K. Tokuda, "Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory," IEEE Trans. on Audio, Speech, and Language Processing, vol. 15, no. 8, pp. 2222-2235, 2007.
- (2007) IEEE Trans. on Audio, Speech, and Language Processing , vol.15 , Issue.8 , pp. 2222-2235
- Toda, T.¹ Black, A.W.² Tokuda, K.³

17
- 0034144758
- A multilinear singular value decomposition
- L. De Lathauwer, B. De Moor, and J. Vandewalle, "A multilinear singular value decomposition," SIAM Journal on Matrix Analysis and Applications, vol. 21, no. 4, pp. 1253-1278, 2000.
- (2000) SIAM Journal on Matrix Analysis and Applications , vol.21 , Issue.4 , pp. 1253-1278
- De Lathauwer, L.¹ De Moor, B.² Vandewalle, J.³

18
- 0013953617
- Some mathematical notes on three-mode factor analysis
- L. R. Tucker, "Some mathematical notes on three-mode factor analysis," Psychometrika, vol. 31, no. 3, pp. 279-311, 1966.
- (1966) Psychometrika , vol.31 , Issue.3 , pp. 279-311
- Tucker, L.R.¹

19
- 78049396810
- Speaker adaptation based on the multilinear decomposition of training speaker models
- Y. Jeong, "Speaker adaptation based on the multilinear decomposition of training speaker models," Proc. ICASSP, pp. 4870-4873, 2010.
- (2010) Proc. ICASSP , pp. 4870-4873
- Jeong, Y.¹

20
- 0025475528
- ATR Japanese speech database as a tool of speech recognition and synthesis
- A. Kurematsu, K. Takeda, Y. Sagisaka, S. Katagiri, H. Kuwabara, and K. Shikano, "ATR Japanese speech database as a tool of speech recognition and synthesis," Speech Communication, vol. 9, pp. 357-363, 1990.
- (1990) Speech Communication , vol.9 , pp. 357-363
- Kurematsu, A.¹ Takeda, K.² Sagisaka, Y.³ Katagiri, S.⁴ Kuwabara, H.⁵ Shikano, K.⁶

21
- 84865766423
- "JNAS: Japanese newspaper article sentences," http://www.milab.is.tsukuba.ac.jp/jnas/instruct.html

22
- 0032673049
- Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
- H. Kawahara, I. Masuda-Katsuse, and A. de Cheveigné, "Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds," Speech Communication, vol. 27, pp. 187-207, 1999.
- (1999) Speech Communication , vol.27 , pp. 187-207
- Kawahara, H.¹ Masuda-Katsuse, I.² De Cheveigné, A.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.