Volume 23, Issue 1, 2009, Pages 42-64

Frequency warping for VTLN and speaker adaptation by linear transformation of standard MFCC

Author keywords

Automatic speech recognition; Frequency warping; Linear transformation; Speaker adaptation; Speaker normalization; VTLN

Indexed keywords

ESTIMATION; FILTER BANKS; FREQUENCY ESTIMATION; LOUDSPEAKERS; MATHEMATICAL TRANSFORMATIONS; MAXIMUM LIKELIHOOD ESTIMATION; SPEECH RECOGNITION; TREES (MATHEMATICS)

EID: 47549091998     PISSN: 08852308     EISSN: 10958363     Source Type: Journal    
DOI: 10.1016/j.csl.2008.02.003     Document Type: Article
Times cited: 29

References (27)
  • 1
    • 0030362995
    • Anastasakos, T., McDonough, J., Makhoul, J., 1996. A compact model for speaker-adaptive training. In: Proceedings of ICSLP 1996, pp. 1137-1140.
  • 3
    • 33745207149
    • Cui, X., Alwan, A., 2005. MLLR-like speaker adaptation based on linearization of VTLN with MFCC features. In: Interspeech 2005, pp. 273-276.
  • 4
    • 33746753361
    • Cui, X., Alwan, A., 2006. Adaptation of children's speech with limited data based on formant-like peak alignment. Computer Speech and Language 20 (4), 400-419.
  • 5
    • 0019053271
    • Davis, S.B., Mermelstein, P., 1980. Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Transactions on Acoustics, Speech and Signal Processing ASSP-28, 357-366.
  • 7
    • 0030263447
    • Gales, M.J.F., 1996. Mean and variance adaptation within the MLLR framework. Computer Speech and Language 10, 249-264.
  • 8
    • 0032050110
    • Gales, M.J.F., 1998. Maximum likelihood linear transformations for HMM-based speech recognition. Computer Speech and Language 12 (2), 75-98.
  • 10
    • 47549107423
    • Gouvea, E.B., Stern, R.M., 1997. Speaker normalization through formant-based warping of the frequency scale. In: Eurospeech 1997, vol. 3, pp. 1139-1142.
  • 12
    • 0029288633
    • Leggetter, C.J., Woodland, P.C., 1995. Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models. Computer Speech and Language 9, 171-185.
  • 13
    • 33947682233
    • Loof, J., Ney, H., Umesh, S., 2006. VTLN warping factor estimation using accumulation of sufficient statistics. In: Proceedings of ICASSP 2006, vol. 1, pp. 1-4.
  • 14
    • 47549084232
    • McDonough, J.W., 2000. Speaker compensation with all-pass transforms. Ph.D. dissertation, Johns Hopkins University, Baltimore, MD.
  • 15
    • 0032657747
    • McDonough, J., Byrne, W., 1999. Speaker adaptation with all-pass transforms. In: Proceedings of ICASSP, vol. 2, pp. 757-760.
  • 16
    • 4544375020
    • McDonough, J., Waibel, A., 2004. Performance comparisons of all-pass transform adaptation with maximum likelihood linear regression. In: Proceedings of ICASSP 2004.
  • 17
    • 47549093914
    • McDonough, J., Byrne, W., Luo, X., 1998. Speaker normalization with all-pass transforms. In: Proceedings of ICSLP 1998, vol. 16, pp. 2307-2310.
  • 18
    • 0347269184
    • McDonough, J., Schaaf, T., Waibel, A., 2004. Speaker adaptation with all-pass transforms. In: Speech Communication Special Issue on Adaptive Methods in Speech Recognition, January.
  • 19
    • 44949157762
    • Panchapagesan, S., 2006. Frequency warping by linear transformation of standard MFCC. In: Proceedings of Interspeech 2006, ICSLP, pp. 397-400.
  • 20
    • 33947620076
    • Panchapagesan, S., Alwan, A., 2006. Multi-parameter frequency warping for VTLN by gradient search, ICASSP 2006, I-1181.
  • 21
    • 85009174854
    • Pitz, M., Ney, H., 2003. Vocal tract normalization as linear transformation of MFCC. In: Eurospeech 2003, pp. 1445-1448.
  • 22
    • 47549104447
    • Pitz, M., Molau, S., Schlueter, R., Ney, H., 2001. Vocal tract normalization equals linear transformation in cepstral space. In: Eurospeech 2001, pp. 721-724.
  • 23
    • 0030149866
    • Sankar, A., Lee, C.-H., 1996. A maximum-likelihood approach to stochastic matching for robust speech recognition. IEEE Transactions on Speech and Audio Processing 4 (3), 190-202.
  • 24
    • 33745201218
    • Umesh, S., Zolnay, A., Ney, H., 2005. Implementing frequency-warping and VTLN through linear transformation of conventional MFCC. In: Interspeech 2005, pp. 269-272.
  • 25
    • 0029764708
    • Wegmann, S., McAllaster, D., Orloff, J., Peskin, B., 1996. Speaker normalization on conversational telephone speech. In: Proceedings of ICASSP, pp. 339-341.
  • 27
    • 47549106664
    • Zhan, P., Waibel, A., 1997. Vocal tract length normalization for large vocabulary continuous speech recognition. Technical Report, CMU-CS-97-148, May.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.