SCOPUS Information Search Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 15, Issue 8, 2007, Pages 2454-2464

Speaker adaptation with limited data using regression-tree-based spectral peak alignment

Authors (3): Wang, Shizhen (a); Cui, Xiaodong (a, b); Alwan, Abeer (a)

a UNIVERSITY OF CALIFORNIA (United States)

b IBM T J WATSON RESEARCH CENTER (United States)

Author keywords

Peak alignment; Regression tree; Speaker adaptation; Speech recognition; Vocal tract length normalization (VTLN)

Indexed keywords

AUTOMATIC SPEECH RECOGNITION SYSTEMS; CONNECTED DIGITS; DATA-DRIVEN; FREQUENCY WARPING; GAUSSIAN MIXTURES; LIMITED DATUM; PEAK ALIGNMENT; PERFORMANCE IMPROVEMENTS; RAPID SPEAKER ADAPTATIONS; REGRESSION TREE; SPEAKER ADAPTATION; SPEAKER NORMALIZATIONS; SPECTRAL MISMATCHES; SPECTRAL PEAKS; STATISTICAL ESTIMATIONS; TRAINING AND TESTING; TREE-BASED; UNSUPERVISED ADAPTATIONS; VOCAL TRACT LENGTH NORMALIZATION (VTLN);

ALIGNMENT; FREQUENCY ESTIMATION; MAXIMUM LIKELIHOOD; REGRESSION ANALYSIS; REMELTING; SPEECH ANALYSIS; TREES (MATHEMATICS);

SPEECH RECOGNITION;

EID: 64149115312 PISSN: 15587916 EISSN: None Source Type: Journal
DOI: 10.1109/TASL.2007.906740 Document Type: Article

Times cited : (6)

References (35)

1
- 0027578837
- On speaker-independent, speaker-dependent, and speaker-adaptive speech recognition
- Apr
- X. Huang and K. F. Lee, "On speaker-independent, speaker-dependent, and speaker-adaptive speech recognition," IEEE Trans. Speech Audio Process., vol. 1, no. 2, pp. 150-157, Apr. 1993.
- (1993) IEEE Trans. Speech Audio Process , vol.1 , Issue.2 , pp. 150-157
- Huang, X.¹ Lee, K.F.²

2
- 0032969462
- Acoustics of children's speech: Developmental changes of temporal and spectral parameters
- S. Lee, A. Potamianos, and S. Narayanan, "Acoustics of children's speech: Developmental changes of temporal and spectral parameters," J. Acoust. Soc. Amer., vol. 105, no. 3, pp. 1455-1468, 1999.
- (1999) J. Acoust. Soc. Amer , vol.105 , Issue.3 , pp. 1455-1468
- Lee, S.¹ Potamianos, A.² Narayanan, S.³

3
- 0029747582
- A study of speech recognition for children and elderly
- J. Wilpon and C. Jacobsen, "A study of speech recognition for children and elderly," in Proc. ICASSP, 1996, vol. I, pp. 349-352.
- (1996) Proc. ICASSP , vol.1 , pp. 349-352
- Wilpon, J.¹ Jacobsen, C.²

4
- 0017482612
- Normalization of vowels by vocal tract length and its application to vowel identification
- Apr
- H. Wakita, "Normalization of vowels by vocal tract length and its application to vowel identification," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-25, no. 2, pp. 183-192, Apr. 1977.
- (1977) IEEE Trans. Acoust., Speech, Signal Process , vol.ASSP-25 , Issue.2 , pp. 183-192
- Wakita, H.¹

5
- 0003418124
- The Hague, The Netherlands: Mouton
- G. Fant, Acoustic Theory of Speech Production. The Hague, The Netherlands: Mouton, 1960.
- (1960) Acoustic Theory of Speech Production
- Fant, G.¹

6
- 0029375590
- Speaker adaptation using constrained estimation of Gaussian mixtures
- Sep
- V. Digalakis, D. Rtischev, and L. G. Neumeyer, "Speaker adaptation using constrained estimation of Gaussian mixtures," IEEE Trans. Speech Audio Process., vol. 3, no. 5, pp. 357-366, Sep. 1995.
- (1995) IEEE Trans. Speech Audio Process , vol.3 , Issue.5 , pp. 357-366
- Digalakis, V.¹ Rtischev, D.² Neumeyer, L.G.³

7
- 0032050110
- Maximum likelihood linear transformations for HMM-based speech recognition
- M. J. F. Gales, "Maximum likelihood linear transformations for HMM-based speech recognition," Comput. Speech Lang., vol. 12, no. 2, pp. 75-98, 1998.
- (1998) Comput. Speech Lang , vol.12 , Issue.2 , pp. 75-98
- Gales, M.J.F.¹

8
- 0032657748
- Rapid speech recognizer adaptation to new speakers
- V. Digalakis, S. Berkowitz, E. Bocchieri, C. Boulis, and W. Byrne, "Rapid speech recognizer adaptation to new speakers," in Proc. ICASSP, 1999, pp. 765-768.
- (1999) Proc. ICASSP , pp. 765-768
- Digalakis, V.¹ Berkowitz, S.² Bocchieri, E.³ Boulis, C.⁴ Byrne, W.⁵

9
- 0029288633
- Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
- C. J. Leggetter and P. C. Woodland, "Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models," Comput. Speech Lang., vol. 9, pp. 171-185, 1995.
- (1995) Comput. Speech Lang , vol.9 , pp. 171-185
- Leggetter, C.J.¹ Woodland, P.C.²

10
- 0031647824
- A frequency-warping approach to speaker normalization
- Jan
- L. Lee and R. Rose, "A frequency-warping approach to speaker normalization," IEEE Trans. Speech Audio Process., vol. 6, no. 1, pp. 49-60, Jan. 1998.
- (1998) IEEE Trans. Speech Audio Process , vol.6 , Issue.1 , pp. 49-60
- Lee, L.¹ Rose, R.²

11
- 0029764708
- Speaker normalisation on conversational telephone speech
- S. Wegmann, D. McAllaster, J. Orloff, and B. Peskin, "Speaker normalisation on conversational telephone speech," in Proc. ICASSP, 1996, vol. I, pp. 339-341.
- (1996) Proc. ICASSP , vol.1 , pp. 339-341
- Wegmann, S.¹ McAllaster, D.² Orloff, J.³ Peskin, B.⁴

12
- 0029725604
- A parametric approach to vocal tract length normalization
- E. Eide and H. Gish, "A parametric approach to vocal tract length normalization," in Proc. ICASSP, 1996, pp. 346-349.
- (1996) Proc. ICASSP , pp. 346-349
- Eide, E.¹ Gish, H.²

13
- 85128432219
- Speaker normalization with all-pass transforms
- J. McDonough, W. Byrne, and X. Luo, "Speaker normalization with all-pass transforms," in Proc. ICSLP, 1998, vol. VI, pp. 2307-2310.
- (1998) Proc. ICSLP , vol.6 , pp. 2307-2310
- McDonough, J.¹ Byrne, W.² Luo, X.³

14
- 84937324786
- Speaker compensation with all-pass transforms
- Ph.D. dissertation, Johns Hopkins Univ, Baltimore, MD
- J. McDonough, "Speaker compensation with all-pass transforms," Ph.D. dissertation, Johns Hopkins Univ., Baltimore, MD, 2000.
- (2000)
- McDonough, J.¹

15
- 0028419019
- Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains
- Apr
- J. L. Gauvain and C. H. Lee, "Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains," IEEE Trans. Speech Audio Process., vol. 2, no. 2, pp. 291-298, Apr. 1994.
- (1994) IEEE Trans. Speech Audio Process , vol.2 , Issue.2 , pp. 291-298
- Gauvain, J.L.¹ Lee, C.H.²

16
- 85009236682
- Implementing vocal tract length normalization in the MLLR framework
- G. Ding, Y. Zhu, C. Li, and B. Xu, "Implementing vocal tract length normalization in the MLLR framework," in Proc. ICSLP, 2002, pp. 1389-1392.
- (2002) Proc. ICSLP , pp. 1389-1392
- Ding, G.¹ Zhu, Y.² Li, C.³ Xu, B.⁴

17
- 0032204117
- A novel feature transformation for vocal tract length normalization in automatic speech recognition
- Nov
- T. Claes, I. Dologlou, L. Bosch, and D. V. Compernolle, "A novel feature transformation for vocal tract length normalization in automatic speech recognition," IEEE Trans. Speech Audio Process., vol. 6, no. 6, pp. 549-557, Nov. 1998.
- (1998) IEEE Trans. Speech Audio Process , vol.6 , Issue.6 , pp. 549-557
- Claes, T.¹ Dologlou, I.² Bosch, L.³ Compernolle, D.V.⁴

18
- 0032657747
- Speaker adaptation with all-pass transforms
- J. McDonough and W. Byrne, "Speaker adaptation with all-pass transforms," in Proc. ICASSP, 1999, pp. 757-760.
- (1999) Proc. ICASSP , pp. 757-760
- McDonough, J.¹ Byrne, W.²

19
- 0347269184
- Speaker adaptation with all-pass transforms
- J. McDonough, T. Schaaf, and A. Waibel, "Speaker adaptation with all-pass transforms," Speech Commun., vol. 42, pp. 75-91, 2004.
- (2004) Speech Commun , vol.42 , pp. 75-91
- McDonough, J.¹ Schaaf, T.² Waibel, A.³

20
- 4544375020
- Performance comparisons of all-pass transform adaptation with maximum likelihood linear regression
- J. McDonough and A. Waibel, "Performance comparisons of all-pass transform adaptation with maximum likelihood linear regression," in Proc. ICASSP, 2004, pp. I313-I316.
- (2004) Proc. ICASSP , pp. I313-I316
- McDonough, J.¹ Waibel, A.²

21
- 85009174854
- Vocal tract normalization as linear transformation of MFCC
- M. Pitz and H. Ney, "Vocal tract normalization as linear transformation of MFCC," in Proc. Eur. Conf. Speech Commun. Technol., 2003, pp. 1445-1448.
- (2003) Proc. Eur. Conf. Speech Commun. Technol , pp. 1445-1448
- Pitz, M.¹ Ney, H.²

22
- 33745201218
- Implementing frequency-warping and VTLN through linear transformation of conventional MFCC
- S. Umesh, A. Zolnay, and H. Ney, "Implementing frequency-warping and VTLN through linear transformation of conventional MFCC," in Proc. Interspeech'05, 2005, pp. 269-272.
- (2005) Proc. Interspeech'05 , pp. 269-272
- Umesh, S.¹ Zolnay, A.² Ney, H.³

23
- 33745207149
- MLLR-like speaker adaptation based on linearization of VTLN with MFCC features
- X. Cui and A. Alwan, "MLLR-like speaker adaptation based on linearization of VTLN with MFCC features," in Proc. Interspeech'05, 2005, pp. 273-276.
- (2005) Proc. Interspeech'05 , pp. 273-276
- Cui, X.¹ Alwan, A.²

24
- 33746753361
- Adaptation of children's speech with limited data based on formant-like peak alignment
- X. Cui and A. Alwan, "Adaptation of children's speech with limited data based on formant-like peak alignment," Comput. Speech Lang., vol. 20, no. 4, pp. 400-419, 2006.
- (2006) Comput. Speech Lang , vol.20 , Issue.4 , pp. 400-419
- Cui, X.¹ Alwan, A.²

25
- 0002629270
- Maximum likelihood from incomplete data via the EM algorithm
- A. Dempster, N. Laird, and D. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," J. R. Statist. Soc., vol. 39, no. 1, pp. 1-38, 1977.
- (1977) J. R. Statist. Soc , vol.39 , Issue.1 , pp. 1-38
- Dempster, A.¹ Laird, N.² Rubin, D.³

26
- 0023776398
- The DARPA 1000-word resource management database for continuous speech recognition
- P. Price, W. M. Fisher, J. Bernstein, and D. S. Pallett, "The DARPA 1000-word resource management database for continuous speech recognition," in Proc. ICASSP, 1988, vol. 1, pp. 651-654.
- (1988) Proc. ICASSP , vol.1 , pp. 651-654
- Price, P.¹ Fisher, W.M.² Bernstein, J.³ Pallett, D.S.⁴

27
- 0002560960
- A database for speaker-independent digit recognition
- R. Leonard, "A database for speaker-independent digit recognition," in Proc. ICASSP, 1984, vol. 9, pp. 328-331.
- (1984) Proc. ICASSP , vol.9 , pp. 328-331
- Leonard, R.¹

28
- 0030362995
- A compact model for speaker-adaptive training
- T. Anastasakos, J. McDonough, R. Schwartz, and J. Makhoul, "A compact model for speaker-adaptive training," in Proc. ICSLP, 1996, pp. 1137-1140.
- (1996) Proc. ICSLP , pp. 1137-1140
- Anastasakos, T.¹ McDonough, J.² Schwartz, R.³ Makhoul, J.⁴

29
- 0030374913
- Formant analysis using mixtures of Gaussians
- P. Zolfaghari and T. Robinson, "Formant analysis using mixtures of Gaussians," in Proc. ICSLP, 1996, pp. 1229-1232.
- (1996) Proc. ICSLP , pp. 1229-1232
- Zolfaghari, P.¹ Robinson, T.²

30
- 44949157762
- Frequency warping by linear transformation of standard MFCC
- S. Panchapagesan, "Frequency warping by linear transformation of standard MFCC," in Proc. Interspeech, 2006, pp. 397-400.
- (2006) Proc. Interspeech , pp. 397-400
- Panchapagesan, S.¹

31
- 84874875877
- Maximum a posterior linear regression with elliptically symmetric matrix variate priors
- W. Chou, "Maximum a posterior linear regression with elliptically symmetric matrix variate priors," in Proc. Eurospeech, 1999, pp. 1-4.
- (1999) Proc. Eurospeech , pp. 1-4
- Chou, W.¹

32
- 84885507503
- Maximum a posteriori linear regression (MAPLR) variance adaptation for continuous density HMMs
- W. Chou and X. He, "Maximum a posteriori linear regression (MAPLR) variance adaptation for continuous density HMMs," in Proc. Eurospeech, 2003, pp. 1513-1516.
- (2003) Proc. Eurospeech , pp. 1513-1516
- Chou, W.¹ He, X.²

33
- 85135272864
- Maximum a posteriori linear regression for hidden Markov model adaptation
- C. Chesta, O. Siohan, and C. Lee, "Maximum a posteriori linear regression for hidden Markov model adaptation," in Proc. Eurospeech, 1999, pp. 211-214.
- (1999) Proc. Eurospeech , pp. 211-214
- Chesta, C.¹ Siohan, O.² Lee, C.³

34
- 0003425258
- Englewood Cliffs, NJ: Prentice-Hall
- L. R. Rabiner and R. W. Schafer, Digital Processing of Speech Signals. Englewood Cliffs, NJ: Prentice-Hall, 1978.
- (1978) Digital Processing of Speech Signals
- Rabiner, L.R.¹ Schafer, R.W.²

35
- 0024909979
- Some statistical issues in the comparison of speech recognition algorithms
- L. Gillick and S. Cox, "Some statistical issues in the comparison of speech recognition algorithms," in Proc. ICASSP, 1989, pp. 532-535.
- (1989) Proc. ICASSP , pp. 532-535
- Gillick, L.¹ Cox, S.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.