SCOPUS Information Search Platform

9th ISCA Speech Synthesis Workshop, SSW 2016

2016, Pages 134-139

Novel Pre-processing using Outlier Removal in Voice Conversion

(3) Rao, Sushant V.ᵃ; Shah, Nirmesh J.ᵃ; Patil, Hemant A.ᵃ

ᵃ Dhirubhai Ambani Institute of Information and Communication Technology (India)

Author keywords

Gaussian mixture model; outliers; robust principal component analysis; Voice conversion

Indexed keywords

GAUSSIAN DISTRIBUTION; PHOTOMAPPING; PRINCIPAL COMPONENT ANALYSIS; SPEECH COMMUNICATION; SPEECH PROCESSING; SPEECH SYNTHESIS; STATISTICS;

GAUSSIAN MIXTURE MODEL; OUTLIER; OUTLIER REMOVALS; PRE-PROCESSING; PRE-PROCESSING STEP; ROBUST PRINCIPAL COMPONENT ANALYSIS; SPEECH UTTERANCE; TARGET SPEAKER; VOICE CONVERSION; VOICE CONVERSION TECHNIQUES;

DEGREES OF FREEDOM (MECHANICS);

EID: 85036464413; PISSN: None; EISSN: None; Source Type: Conference Proceeding
DOI: None; Document Type: Conference Paper

Times cited : (12)

References (26)

1
- 70349197715
- Voice transformation: a survey
- Taipei, Taiwan
- Y. Stylianou, "Voice transformation: a survey," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Taipei, Taiwan, 2009, pp. 3585-3588.
- (2009) IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 3585-3588
- Stylianou, Y.¹

2
- 85009084358
- A first step towards text-independent voice conversion
- Jeju Island, South Korea
- D. Sündermann, A. Bonafonte, H. Ney, and H. Höge, "A first step towards text-independent voice conversion," in Proc. of the International Conference on Spoken Language Processing (ICSLP), Jeju Island, South Korea, 2004.
- (2004) Proc. of the International Conference on Spoken Language Processing (ICSLP)
- Sündermann, D.¹ Bonafonte, A.² Ney, H.³ Höge, H.⁴

3
- 84905248180
- Effectiveness of PLP-based phonetic segmentation for speech synthesis
- Florence, Italy: IEEE
- N. J. Shah, B. B. Vachhani, H. B. Sailor, and H. A. Patil, "Effectiveness of PLP-based phonetic segmentation for speech synthesis," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Florence, Italy: IEEE, 2014, pp. 270-274.
- (2014) IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 270-274
- Shah, N. J.¹ Vachhani, B. B.² Sailor, H. B.³ Patil, H. A.⁴

4
- 84941033901
- Effectiveness of multiscale fractal dimension-based phonetic segmentation in speech synthesis for low resource language
- Kuching, Borneo, Malaysia
- M. Zaki, N. J. Shah, and H. A. Patil, "Effectiveness of multiscale fractal dimension-based phonetic segmentation in speech synthesis for low resource language," in International Conference on Asian Language Processing (IALP), Kuching, Borneo, Malaysia, 2014, pp. 103-106.
- (2014) International Conference on Asian Language Processing (IALP) , pp. 103-106
- Zaki, M.¹ Shah, N. J.² Patil, H. A.³

5
- 84867198185
- On the impact of alignment on voice conversion performance
- Brisbane, Australia
- E. Helander, J. Schwarz, J. Nurminen, H. Silen, and M. Gabbouj, "On the impact of alignment on voice conversion performance," in INTERSPEECH, Brisbane, Australia, 2008, pp. 1-5.
- (2008) INTERSPEECH , pp. 1-5
- Helander, E.¹ Schwarz, J.² Nurminen, J.³ Silen, H.⁴ Gabbouj, M.⁵

6
- 7544223741
- A survey of outlier detection methodologies
- V. J. Hodge and J. Austin, "A survey of outlier detection methodologies," Artificial Intelligence Review, vol. 22, no. 2, pp. 85-126, 2004.
- (2004) Artificial Intelligence Review , vol.22 , Issue.2 , pp. 85-126
- Hodge, V. J.¹ Austin, J.²

7
- 13444287831
- ROBPCA: a new approach to robust principal component analysis
- M. Hubert, P. J. Rousseeuw, and K. Vanden Branden, "ROBPCA: a new approach to robust principal component analysis," Technometrics, vol. 47, no. 1, pp. 64-79, 2005.
- (2005) Technometrics , vol.47 , Issue.1 , pp. 64-79
- Hubert, M.¹ Rousseeuw, P. J.² Vanden Branden, K.³

8
- 0023739214
- Voice conversion through vector quantization
- New York, NY, USA: IEEE
- M. Abe, S. Nakamura, K. Shikano, and H. Kuwabara, "Voice conversion through vector quantization," in International Conference on Acoustics, Speech, and Signal Processing (ICASSP). New York, NY, USA: IEEE, 1988, pp. 655-658.
- (1988) International Conference on Acoustics, Speech, and Signal Processing (ICASSP) , pp. 655-658
- Abe, M.¹ Nakamura, S.² Shikano, K.³ Kuwabara, H.⁴

9
- 0031623661
- Spectral voice conversion for text-to-speech synthesis
- Seattle, WA
- A. Kain and M.W. Macon, "Spectral voice conversion for text-to-speech synthesis," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Seattle, WA, 1998, pp. 285-288.
- (1998) IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 285-288
- Kain, A.¹ Macon, M.W.²

10
- 0032026483
- Continuous probabilistic transform for voice conversion
- Y. Stylianou, O. Cappé, and E. Moulines, "Continuous probabilistic transform for voice conversion," IEEE Trans. on Speech and Audio Processing, vol. 6, no. 2, pp. 131-142, 1998.
- (1998) IEEE Trans. on Speech and Audio Processing , vol.6 , Issue.2 , pp. 131-142
- Stylianou, Y.¹ Cappé, O.² Moulines, E.³

11
- 57749193836
- Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory
- T. Toda, A.W. Black, and K. Tokuda, "Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory," IEEE Trans. on Audio, Speech and Language Processing, vol. 15, no. 8, pp. 2222-2235, 2007.
- (2007) IEEE Trans. on Audio, Speech and Language Processing , vol.15 , Issue.8 , pp. 2222-2235
- Toda, T.¹ Black, A.W.² Tokuda, K.³

12
- 77953712499
- Voice conversion using partial least squares regression
- E. Helander, T. Virtanen, J. Nurminen, and M. Gabbouj, "Voice conversion using partial least squares regression," IEEE Transactions on Audio, Speech, and Language Processing, vol. 18, no. 5, pp. 912-921, 2010.
- (2010) IEEE Transactions on Audio, Speech, and Language Processing , vol.18 , Issue.5 , pp. 912-921
- Helander, E.¹ Virtanen, T.² Nurminen, J.³ Gabbouj, M.⁴

13
- 84856141218
- Voice conversion using dynamic kernel partial least squares regression
- E. Helander, H. Silén, T. Virtanen, and M. Gabbouj, "Voice conversion using dynamic kernel partial least squares regression," IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 3, pp. 806-817, 2012.
- (2012) IEEE Transactions on Audio, Speech, and Language Processing , vol.20 , Issue.3 , pp. 806-817
- Helander, E.¹ Silén, H.² Virtanen, T.³ Gabbouj, M.⁴

14
- 84921735339
- Voice conversion using deep neural networks with layer-wise generative training
- L.-H. Chen, Z.-H. Ling, L.-J. Liu, and L.-R. Dai, "Voice conversion using deep neural networks with layer-wise generative training," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 22, no. 12, pp. 1859-1872, 2014.
- (2014) IEEE/ACM Transactions on Audio, Speech, and Language Processing , vol.22 , Issue.12 , pp. 1859-1872
- Chen, L.-H.¹ Ling, Z.-H.² Liu, L.-J.³ Dai, L.-R.⁴

15
- 84946685887
- Voice conversion using deep neural networks with speaker-independent pre-training
- Nevada, USA
- S. H. Mohammadi and A. Kain, "Voice conversion using deep neural networks with speaker-independent pre-training," in IEEE Spoken Language Technology Workshop (SLT), Nevada, USA, 2014, pp. 19-23.
- (2014) IEEE Spoken Language Technology Workshop (SLT) , pp. 19-23
- Mohammadi, S. H.¹ Kain, A.²

16
- 84959173289
- Semi-supervised training of a voice conversion mapping function using a joint-autoencoder
- Dresden, Germany
- S. H. Mohammadi and A. Kain, "Semi-supervised training of a voice conversion mapping function using a joint-autoencoder," in INTERSPEECH, Dresden, Germany, 2015, pp. 1-5.
- (2015) INTERSPEECH , pp. 1-5
- Mohammadi, S. H.¹ Kain, A.²

17
- 84901803470
- Exemplar-based voice conversion using non-negative spectrogram deconvolution
- Barcelona, Spain
- Z. Wu, T. Virtanen, T. Kinnunen, E. S. Chng, and H. Li, "Exemplar-based voice conversion using non-negative spectrogram deconvolution," in Proc. 8th ISCA Speech Synthesis Workshop, Barcelona, Spain, 2013, pp. 201-206.
- (2013) Proc. 8th ISCA Speech Synthesis Workshop , pp. 201-206
- Wu, Z.¹ Virtanen, T.² Kinnunen, T.³ Chng, E. S.⁴ Li, H.⁵

18
- 84911369131
- Exemplar-based sparse representation with residual compensation for voice conversion
- Z. Wu, T. Virtanen, E. S. Chng, and H. Li, "Exemplar-based sparse representation with residual compensation for voice conversion," IEEE/ACM Trans. on Audio, Speech, and Language Processing, vol. 22, no. 10, pp. 1506-1521, 2014.
- (2014) IEEE/ACM Trans. on Audio, Speech, and Language Processing , vol.22 , Issue.10 , pp. 1506-1521
- Wu, Z.¹ Virtanen, T.² Chng, E. S.³ Li, H.⁴

19
- 84973345217
- Semi-nonnegative matrix factorization using alternating direction method of multipliers for voice conversion
- Shanghai, China
- R. Aihara, T. Takiguchi, and Y. Ariki, "Semi-nonnegative matrix factorization using alternating direction method of multipliers for voice conversion," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Shanghai, China, 2016, pp. 5170-5174.
- (2016) IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 5170-5174
- Aihara, R.¹ Takiguchi, T.² Ariki, Y.³

20
- 0034842552
- Voice conversion algorithm based on Gaussian mixture model with dynamic frequency warping of STRAIGHT spectrum
- Salt Lake City, UT, USA
- T. Toda, H. Saruwatari, and K. Shikano, "Voice conversion algorithm based on Gaussian mixture model with dynamic frequency warping of STRAIGHT spectrum," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Salt Lake City, UT, USA, 2001, pp. 841-844.
- (2001) IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 841-844
- Toda, T.¹ Saruwatari, H.² Shikano, K.³

21
- 33749682962
- Springer
- O. Maimon and L. Rokach, Data Mining and Knowledge Discovery Handbook. Springer, 2005, vol. 2.
- (2005) Data Mining and Knowledge Discovery Handbook , vol.2
- Maimon, O.¹ Rokach, L.²

22
- 0032680362
- A fast algorithm for the minimum covariance determinant estimator
- P. J. Rousseeuw and K. Van Driessen, "A fast algorithm for the minimum covariance determinant estimator," Technometrics, vol. 41, no. 3, pp. 212-223, 1999.
- (1999) Technometrics , vol.41 , Issue.3 , pp. 212-223
- Rousseeuw, P. J.¹ Van Driessen, K.²

23
- 0002629270
- Maximum likelihood from incomplete data via the EM algorithm
- A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," Journal of the Royal Statistical Society. Series B (Methodological), vol. 39, no. 1, pp. 1-38, 1977.
- (1977) Journal of the Royal Statistical Society. Series B (Methodological) , vol.39 , Issue.1 , pp. 1-38
- Dempster, A. P.¹ Laird, N. M.² Rubin, D. B.³

24
- 85090475413
- The CMU ARCTIC speech databases
- J. Kominek and A. W. Black, "The CMU ARCTIC speech databases," in Fifth ISCA Workshop on Speech Synthesis, 2004.
- (2004) Fifth ISCA Workshop on Speech Synthesis
- Kominek, J.¹ Black, A. W.²

25
- 84865795787
- Improved HNM-based vocoder for statistical synthesizers
- Florence, Italy
- D. Erro, I. Sainz, E. Navas, and I. Hernáez, "Improved HNM-based vocoder for statistical synthesizers," in INTERSPEECH, Florence, Italy, 2011, pp. 1809-1812.
- (2011) INTERSPEECH , pp. 1809-1812
- Erro, D.¹ Sainz, I.² Navas, E.³ Hernáez, I.⁴

26
- 0012392720
- P.85: A method for subjective performance assessment of the quality of speech voice output devices
- International Telecommunication Union (ITU), Geneva. Last accessed July 26, 2016
- ITU-T Rec. P.85, "A method for subjective performance assessment of the quality of speech voice output devices," International Telecommunication Union (ITU), Geneva. Available online: https://www.itu.int/rec/T-REC-P.85-199406-I/en. Last accessed July 26, 2016.
- ITU-T¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.