Volume 15, Issue 8, 2007, Pages 2454-2464

Speaker adaptation with limited data using regression-tree-based spectral peak alignment

Author keywords

Peak alignment; Regression tree; Speaker adaptation; Speech recognition; Vocal tract length normalization (VTLN)

Indexed keywords

AUTOMATIC SPEECH RECOGNITION SYSTEMS; CONNECTED DIGITS; DATA-DRIVEN; FREQUENCY WARPING; GAUSSIAN MIXTURES; LIMITED DATUM; PEAK ALIGNMENT; PERFORMANCE IMPROVEMENTS; RAPID SPEAKER ADAPTATIONS; REGRESSION TREE; SPEAKER ADAPTATION; SPEAKER NORMALIZATIONS; SPECTRAL MISMATCHES; SPECTRAL PEAKS; STATISTICAL ESTIMATIONS; TRAINING AND TESTING; TREE-BASED; UNSUPERVISED ADAPTATIONS; VOCAL TRACT LENGTH NORMALIZATION (VTLN);

EID: 64149115312     PISSN: 15587916     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASL.2007.906740     Document Type: Article
Times cited : (6)

References (35)
  • 1
    • 0027578837
    • On speaker-independent, speaker-dependent, and speaker-adaptive speech recognition
    • X. Huang and K. F. Lee, "On speaker-independent, speaker-dependent, and speaker-adaptive speech recognition," IEEE Trans. Speech Audio Process., vol. 1, no. 2, pp. 150-157, Apr. 1993.
    • (1993) IEEE Trans. Speech Audio Process , vol.1 , Issue.2 , pp. 150-157
    • Huang, X.1    Lee, K.F.2
  • 2
    • 0032969462
    • Acoustics of children's speech: Developmental changes of temporal and spectral parameters
    • S. Lee, A. Potamianos, and S. Narayanan, "Acoustics of children's speech: Developmental changes of temporal and spectral parameters," J. Acoust. Soc. Amer., vol. 105, no. 3, pp. 1455-1468, 1999.
    • (1999) J. Acoust. Soc. Amer , vol.105 , Issue.3 , pp. 1455-1468
    • Lee, S.1    Potamianos, A.2    Narayanan, S.3
  • 3
    • 0029747582
    • A study of speech recognition for children and elderly
    • J. Wilpon and C. Jacobsen, "A study of speech recognition for children and elderly," in Proc. ICASSP, 1996, vol. I, pp. 349-352.
    • (1996) Proc. ICASSP , vol.1 , pp. 349-352
    • Wilpon, J.1    Jacobsen, C.2
  • 4
    • 0017482612
    • Normalization of vowels by vocal tract length and its application to vowel identification
    • H. Wakita, "Normalization of vowels by vocal tract length and its application to vowel identification," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-25, no. 2, pp. 183-192, Apr. 1977.
    • (1977) IEEE Trans. Acoust., Speech, Signal Process , vol.ASSP-25 , Issue.2 , pp. 183-192
    • Wakita, H.1
  • 6
    • 0029375590
    • Speaker adaptation using constrained estimation of Gaussian mixtures
    • V. Digalakis, D. Rtischev, and L. G. Neumeyer, "Speaker adaptation using constrained estimation of Gaussian mixtures," IEEE Trans. Speech Audio Process., vol. 3, no. 5, pp. 357-366, Sep. 1995.
    • (1995) IEEE Trans. Speech Audio Process , vol.3 , Issue.5 , pp. 357-366
    • Digalakis, V.1    Rtischev, D.2    Neumeyer, L.G.3
  • 7
    • 0032050110
    • Maximum likelihood linear transformations for HMM-based speech recognition
    • M. J. F. Gales, "Maximum likelihood linear transformations for HMM-based speech recognition," Comput. Speech Lang., vol. 12, no. 2, pp. 75-98, 1998.
    • (1998) Comput. Speech Lang , vol.12 , Issue.2 , pp. 75-98
    • Gales, M.J.F.1
  • 9
    • 0029288633
    • Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
    • C. J. Leggetter and P. C. Woodland, "Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models," Comput. Speech Lang., vol. 9, pp. 171-185, 1995.
    • (1995) Comput. Speech Lang , vol.9 , pp. 171-185
    • Leggetter, C.J.1    Woodland, P.C.2
  • 10
    • 0031647824
    • A frequency warping approach to speaker normalization
    • L. Lee and R. Rose, "A frequency warping approach to speaker normalization," IEEE Trans. Speech Audio Process., vol. 6, no. 1, pp. 49-60, Jan. 1998.
    • (1998) IEEE Trans. Speech Audio Process , vol.6 , Issue.1 , pp. 49-60
    • Lee, L.1    Rose, R.2
  • 11
    • 0029764708
    • Speaker normalisation on conversational telephone speech
    • S. Wegmann, D. McAllaster, J. Orloff, and B. Peskin, "Speaker normalisation on conversational telephone speech," in Proc. ICASSP, 1996, vol. I, pp. 339-341.
    • (1996) Proc. ICASSP , vol.1 , pp. 339-341
    • Wegmann, S.1    McAllaster, D.2    Orloff, J.3    Peskin, B.4
  • 12
    • 0029725604
    • A parametric approach to vocal tract length normalization
    • E. Eide and H. Gish, "A parametric approach to vocal tract length normalization," in Proc. ICASSP, 1996, pp. 346-349.
    • (1996) Proc. ICASSP , pp. 346-349
    • Eide, E.1    Gish, H.2
  • 13
    • 85128432219
    • Speaker normalization with all-pass transforms
    • J. McDonough, W. Byrne, and X. Luo, "Speaker normalization with all-pass transforms," in Proc. ICSLP, 1998, vol. VI, pp. 2307-2310.
    • (1998) Proc. ICSLP , vol.6 , pp. 2307-2310
    • McDonough, J.1    Byrne, W.2    Luo, X.3
  • 14
    • 84937324786
    • Speaker compensation with all-pass transforms
    • J. McDonough, "Speaker compensation with all-pass transforms," Ph.D. dissertation, Johns Hopkins Univ., Baltimore, MD, 2000.
    • (2000)
    • McDonough, J.1
  • 15
    • 0028419019
    • Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains
    • J. L. Gauvain and C. H. Lee, "Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains," IEEE Trans. Speech Audio Process., vol. 2, no. 2, pp. 291-298, Apr. 1994.
    • (1994) IEEE Trans. Speech Audio Process , vol.2 , Issue.2 , pp. 291-298
    • Gauvain, J.L.1    Lee, C.H.2
  • 16
    • 85009236682
    • Implementing vocal tract length normalization in the MLLR framework
    • G. Ding, Y. Zhu, C. Li, and B. Xu, "Implementing vocal tract length normalization in the MLLR framework," in Proc. ICSLP, 2002, pp. 1389-1392.
    • (2002) Proc. ICSLP , pp. 1389-1392
    • Ding, G.1    Zhu, Y.2    Li, C.3    Xu, B.4
  • 17
    • 0032204117
    • A novel feature transformation for vocal tract length normalization in automatic speech recognition
    • T. Claes, I. Dologlou, L. Bosch, and D. V. Compernolle, "A novel feature transformation for vocal tract length normalization in automatic speech recognition," IEEE Trans. Speech Audio Process., vol. 6, no. 6, pp. 549-557, Nov. 1998.
    • (1998) IEEE Trans. Speech Audio Process , vol.6 , Issue.6 , pp. 549-557
    • Claes, T.1    Dologlou, I.2    Bosch, L.3    Compernolle, D.V.4
  • 18
    • 0032657747
    • Speaker adaptation with all-pass transforms
    • J. McDonough and W. Byrne, "Speaker adaptation with all-pass transforms," in Proc. ICASSP, 1999, pp. 757-760.
    • (1999) Proc. ICASSP , pp. 757-760
    • McDonough, J.1    Byrne, W.2
  • 19
    • 0347269184
    • Speaker adaptation with all-pass transforms
    • J. McDonough, T. Schaaf, and A. Waibel, "Speaker adaptation with all-pass transforms," Speech Commun., vol. 42, pp. 75-91, 2004.
    • (2004) Speech Commun , vol.42 , pp. 75-91
    • McDonough, J.1    Schaaf, T.2    Waibel, A.3
  • 20
    • 4544375020
    • Performance comparisons of all-pass transform adaptation with maximum likelihood linear regression
    • J. McDonough and A. Waibel, "Performance comparisons of all-pass transform adaptation with maximum likelihood linear regression," in Proc. ICASSP, 2004, pp. I313-I316.
    • (2004) Proc. ICASSP , pp. I313-I316
    • McDonough, J.1    Waibel, A.2
  • 21
    • 85009174854
    • Vocal tract normalization as linear transformation of MFCC
    • M. Pitz and H. Ney, "Vocal tract normalization as linear transformation of MFCC," in Proc. Eur. Conf. Speech Commun. Technol., 2003, pp. 1445-1448.
    • (2003) Proc. Eur. Conf. Speech Commun. Technol , pp. 1445-1448
    • Pitz, M.1    Ney, H.2
  • 22
    • 33745201218
    • Implementing frequency-warping and VTLN through linear transformation of conventional MFCC
    • S. Umesh, A. Zolnay, and H. Ney, "Implementing frequency-warping and VTLN through linear transformation of conventional MFCC," in Proc. Interspeech'05, 2005, pp. 269-272.
    • (2005) Proc. Interspeech'05 , pp. 269-272
    • Umesh, S.1    Zolnay, A.2    Ney, H.3
  • 23
    • 33745207149
    • MLLR-like speaker adaptation based on linearization of VTLN with MFCC features
    • X. Cui and A. Alwan, "MLLR-like speaker adaptation based on linearization of VTLN with MFCC features," in Proc. Interspeech'05, 2005, pp. 273-276.
    • (2005) Proc. Interspeech'05 , pp. 273-276
    • Cui, X.1    Alwan, A.2
  • 24
    • 33746753361
    • Adaptation of children's speech with limited data based on formant-like peak alignment
    • X. Cui and A. Alwan, "Adaptation of children's speech with limited data based on formant-like peak alignment," Comput. Speech Lang., vol. 20, no. 4, pp. 400-419, 2006.
    • (2006) Comput. Speech Lang , vol.20 , Issue.4 , pp. 400-419
    • Cui, X.1    Alwan, A.2
  • 25
    • 0002629270
    • Maximum likelihood from incomplete data via the EM algorithm
    • A. Dempster, N. Laird, and D. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," J. R. Statist. Soc., vol. 39, no. 1, pp. 1-38, 1977.
    • (1977) J. R. Statist. Soc , vol.39 , Issue.1 , pp. 1-38
    • Dempster, A.1    Laird, N.2    Rubin, D.3
  • 26
    • 0023776398
    • The DARPA 1000-word resource management database for continuous speech recognition
    • P. Price, W. M. Fisher, J. Bernstein, and D. S. Pallett, "The DARPA 1000-word resource management database for continuous speech recognition," in Proc. ICASSP, 1988, vol. 1, pp. 651-654.
    • (1988) Proc. ICASSP , vol.1 , pp. 651-654
    • Price, P.1    Fisher, W.M.2    Bernstein, J.3    Pallett, D.S.4
  • 27
    • 0002560960
    • A database for speaker-independent digit recognition
    • R. Leonard, "A database for speaker-independent digit recognition," in Proc. ICASSP, 1984, vol. 9, pp. 328-331.
    • (1984) Proc. ICASSP , vol.9 , pp. 328-331
    • Leonard, R.1
  • 29
    • 0030374913
    • Formant analysis using mixtures of Gaussians
    • P. Zolfaghari and T. Robinson, "Formant analysis using mixtures of Gaussians," in Proc. ICSLP, 1996, pp. 1229-1232.
    • (1996) Proc. ICSLP , pp. 1229-1232
    • Zolfaghari, P.1    Robinson, T.2
  • 30
    • 44949157762
    • Frequency warping by linear transformation of standard MFCC
    • S. Panchapagesan, "Frequency warping by linear transformation of standard MFCC," in Proc. Interspeech, 2006, pp. 397-400.
    • (2006) Proc. Interspeech , pp. 397-400
    • Panchapagesan, S.1
  • 31
    • 84874875877
    • Maximum a posterior linear regression with elliptically symmetric matrix variate priors
    • W. Chou, "Maximum a posterior linear regression with elliptically symmetric matrix variate priors," in Proc. Eurospeech, 1999, pp. 1-4.
    • (1999) Proc. Eurospeech , pp. 1-4
    • Chou, W.1
  • 32
    • 84885507503
    • Maximum a posteriori linear regression (MAPLR) variance adaptation for continuous density HMMs
    • W. Chou and X. He, "Maximum a posteriori linear regression (MAPLR) variance adaptation for continuous density HMMs," in Proc. Eurospeech, 2003, pp. 1513-1516.
    • (2003) Proc. Eurospeech , pp. 1513-1516
    • Chou, W.1    He, X.2
  • 33
    • 85135272864
    • Maximum a posteriori linear regression for hidden Markov model adaptation
    • C. Chesta, O. Siohan, and C. Lee, "Maximum a posteriori linear regression for hidden Markov model adaptation," in Proc. Eurospeech, 1999, pp. 211-214.
    • (1999) Proc. Eurospeech , pp. 211-214
    • Chesta, C.1    Siohan, O.2    Lee, C.3
  • 35
    • 0024909979
    • Some statistical issues in the comparison of speech recognition algorithms
    • L. Gillick and S. Cox, "Some statistical issues in the comparison of speech recognition algorithms," in Proc. ICASSP, 1989, pp. 532-535.
    • (1989) Proc. ICASSP , pp. 532-535
    • Gillick, L.1    Cox, S.2


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS DB.