SCOPUS Information Search Platform

Computer Speech and Language

Volume 23, Issue 2, 2009, Pages 176-199

Improving robustness of MLLR adaptation with speaker-clustered regression class trees

(3) Mandal, Arindam (a); Ostendorf, Mari (a); Stolcke, Andreas (b, c)

a University of Washington (United States)

b SRI INTERNATIONAL (United States)

c INTERNATIONAL COMPUTER SCIENCE INSTITUTE (United States)

Author keywords

Regression class trees; Speaker adaptation; Speaker clustering; Speech recognition

Indexed keywords

BOOLEAN FUNCTIONS; ERROR ANALYSIS; LEARNING SYSTEMS; MATHEMATICAL TRANSFORMATIONS; MAXIMUM LIKELIHOOD ESTIMATION; REGRESSION ANALYSIS; TREES (MATHEMATICS);

AUTOMATIC SPEECH RECOGNITION SYSTEMS; CLUSTERED REGRESSION; CLUSTERING PROCEDURES; EIGEN SPACE; LINEAR COMBINATION; MAXIMUM-LIKELIHOOD LINEAR REGRESSION; MLLR ADAPTATION; PERFORMANCE LOSSES; RECOGNITION PERFORMANCE; REGRESSION CLASS TREES; SIGNIFICANT REDUCTION; SPEAKER ADAPTATION; SPEAKER CLUSTERING; SPEAKER VARIABILITY; TREE STRUCTURES; UNSUPERVISED ADAPTATION; WORD ERROR RATES;

SPEECH RECOGNITION;

EID: 53849127143 PISSN: 08852308 EISSN: 10958363 Source Type: Journal
DOI: 10.1016/j.csl.2008.05.004 Document Type: Article

Times cited: (9)

References (38)

1
- 0030677475
- Anastasakos, T., McDonough, J., Makhoul, J., 1997. Speaker adaptive training: a maximum likelihood approach to speaker normalization. In: Proc. of ICASSP, vol. 2, pp. 1043-1046.

2
- 0032657749
- Bocchieri, E., Digalakis, V., Corduneanu, A., Boulis, C., 1999. Correlation modeling of MLLR transform biases for rapid HMM adaptation to new speakers. In: Proc. of ICASSP, vol. 2, pp. 773-776.

3
- 0035412897
- Maximum likelihood stochastic transformations adaptation for medium and small data sets
- Boulis C., Diakoloukas V., and Digalakis V. Maximum likelihood stochastic transformations adaptation for medium and small data sets. Computer Speech & Language 15 3 (2001) 257-287
- (2001) Computer Speech & Language , vol.15 , Issue.3 , pp. 257-287
- Boulis, C.¹ Diakoloukas, V.² Digalakis, V.³

4
- 85009097035
- Chen, K.T., Liau, W.W., Wang, H.M., Lee, L.S., 2000. Fast speaker adaptation using eigenspace-based maximum likelihood linear regression. In: Proc. of ICSLP, vol. III, pp. 742-745.

5
- 84959118000
- Cieri, C., Miller, D., Walker, K., 2004. The Fisher corpus: a resource for the next generations of speech-to-text. In: Fourth International Conference on Language Resources and Evaluation.

6
- 0002629270
- Maximum likelihood from incomplete data via the EM algorithm
- Dempster A., Laird N., and Rubin D. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society 39 1 (1977) 1-38
- (1977) Journal of the Royal Statistical Society , vol.39 , Issue.1 , pp. 1-38
- Dempster, A.¹ Laird, N.² Rubin, D.³

7
- 0029375590
- Speaker adaptation using constrained estimation of Gaussian mixtures
- Digalakis V., Rtischev D., and Neumeyer L. Speaker adaptation using constrained estimation of Gaussian mixtures. IEEE Transactions on Speech and Audio Processing 3 5 (1995) 357-366
- (1995) IEEE Transactions on Speech and Audio Processing , vol.3 , Issue.5 , pp. 357-366
- Digalakis, V.¹ Rtischev, D.² Neumeyer, L.³

8
- 33745193034
- Ferrer, L., Sönmez, K., Kajarekar, S., 2005. Class-based score combination for speaker recognition. In: Proc. of Eurospeech, pp. 2173-2176.

9
- 85079937920
- Gales, M., 1996. The generation and use of regression class trees for MLLR adaptation. Tech. Rep. CUED/F-INFENG/TR263, Cambridge University.

10
- 85079947237
- Gales, M., 1997. Transformation smoothing for speaker and environmental adaptation. In: Proc. of Eurospeech, vol. 4, pp. 2067-2070.

11
- 0032050110
- Maximum likelihood linear transformations for HMM-based speech recognition
- Gales M. Maximum likelihood linear transformations for HMM-based speech recognition. Computer Speech & Language 12 (1998) 75-98
- (1998) Computer Speech & Language , vol.12 , pp. 75-98
- Gales, M.¹

12
- 0034227757
- Cluster adaptive training of hidden Markov models
- Gales M. Cluster adaptive training of hidden Markov models. IEEE Transactions on Speech and Audio Processing 8 4 (2000) 417-428
- (2000) IEEE Transactions on Speech and Audio Processing , vol.8 , Issue.4 , pp. 417-428
- Gales, M.¹

13
- 0030263447
- Mean and variance compensation within the MLLR framework
- Gales M., and Woodland P. Mean and variance compensation within the MLLR framework. Computer Speech & Language 10 (1996) 249-264
- (1996) Computer Speech & Language , vol.10 , pp. 249-264
- Gales, M.¹ Woodland, P.²

14
- 0035279117
- Automatic generation of phonetic regression class trees for MLLR adaptation
- Haeb-Umbach R. Automatic generation of phonetic regression class trees for MLLR adaptation. IEEE Transactions on Speech and Audio Processing 9 3 (2001) 299-302
- (2001) IEEE Transactions on Speech and Audio Processing , vol.9 , Issue.3 , pp. 299-302
- Haeb-Umbach, R.¹

15
- 85009113198
- Huang, C., Chen, T., Li, S., Chang, E., Zhou, J., 2001. Analysis of speaker variability. In: Proc. of Eurospeech, vol. 2, pp. 1377-1380.

16
- 14644420596
- Hwang, M.-Y., Huang, X., 1998. Dynamically configurable acoustic models for speech recognition. In: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP'98, vol. 2, pp. 669-672.

17
- 34547543570
- Hwang, M.-Y., Lei, X., Wang, W., Shinozaki, T., 2006. Investigation on Mandarin broadcast news speech recognition. In: Proc. of ICSLP, pp. 1233-1236.

18
- 0026405257
- Imamura, A., 1991. Speaker adaptive HMM-based speech recognition with a stochastic speaker classifier. In: Proc. of ICASSP, vol. 2, pp. 841-844.

19
- 0028462458
- Maximum likelihood clustering of gaussians for speech recognition
- Kannan A., Ostendorf M., and Rohlicek J.R. Maximum likelihood clustering of gaussians for speech recognition. IEEE Transactions on Speech and Audio Processing 2 3 (1994) 453-455
- (1994) IEEE Transactions on Speech and Audio Processing , vol.2 , Issue.3 , pp. 453-455
- Kannan, A.¹ Ostendorf, M.² Rohlicek, J.R.³

20
- 85009078667
- Kosaka, T., Sagayama, S., 1994. Tree structured speaker clustering for fast speaker adaptation. In: Proc. of ICASSP, vol. 1, pp. 245-248.

21
- 0034320005
- Rapid speaker adaptation in eigenvoice space
- Kuhn R., Junqua J., Nguyen P., and Niedzielski N. Rapid speaker adaptation in eigenvoice space. IEEE Transactions on Speech and Audio Processing 8 6 (2000) 695-707
- (2000) IEEE Transactions on Speech and Audio Processing , vol.8 , Issue.6 , pp. 695-707
- Kuhn, R.¹ Junqua, J.² Nguyen, P.³ Niedzielski, N.⁴

22
- 85079919421
- Labov, W., 1996. The organization of dialect diversity in North America. In: Fourth International Conference on Spoken Language Processing.

23
- 85080010507
- Leggetter, C., 1995. Improved acoustic modelling for HMMs using linear transformations. Ph.D. Thesis, University of Cambridge.

24
- 0029288633
- Maximum likelihood linear regression for speaker adaptation of HMMs
- Leggetter C., and Woodland P. Maximum likelihood linear regression for speaker adaptation of HMMs. Computer Speech & Language 9 (1995) 171-185
- (1995) Computer Speech & Language , vol.9 , pp. 171-185
- Leggetter, C.¹ Woodland, P.²

25
- 33646780066
- Mak, B., Hsiao, R., 2004. Improving eigenspace-based MLLR adaptation by kernel PCA. In: Proc. of ICSLP, vol. I, pp. 13-16.

26
- 33745214663
- Mandal, A., Ostendorf, M., Stolcke, A., 2005. Leveraging speaker-dependent variation of adaptation. In: Proc. of Eurospeech, pp. 1793-1796.

27
- 44949085751
- Mandal, A., Ostendorf, M., Stolcke, A., 2006. Speaker clustered regression-class trees for MLLR adaptation. In: Proc. of ICSLP, pp. 1133-1136.

28
- 0003607151
- Academic Press
- Mardia K., Kent J., and Bibby J. Multivariate Analysis (1979), Academic Press
- (1979) Multivariate Analysis
- Mardia, K.¹ Kent, J.² Bibby, J.³

29
- 0031704151
- Speaker clustering and transformation for speaker adaptation in speech recognition systems
- Padmanabhan M., Bahl L., Nahamoo D., and Picheny M. Speaker clustering and transformation for speaker adaptation in speech recognition systems. IEEE Transactions on Speech and Audio Processing 6 1 (1998) 71-77
- (1998) IEEE Transactions on Speech and Audio Processing , vol.6 , Issue.1 , pp. 71-77
- Padmanabhan, M.¹ Bahl, L.² Nahamoo, D.³ Picheny, M.⁴

30
- 85080000755
- R Development Core Team, 2005. R: a language and environment for statistical computing. R Foundation for Statistical Computing, ISBN 3-900051-07-0.

31
- 85079984138
- Sankar, A., Beaufays, F., Digalakis, V., 1995. Training data clustering for improved speech recognition. In: Proc. of Eurospeech, vol. 1, pp. 502-505.

32
- 0029726519
- Sankar, A., Neumeyer, L., Weintraub, M., 1996. An experimental study of acoustic adaptation algorithms. In: Proc. of ICASSP, vol. 2, pp. 713-716.

33
- 85079975000
- Sankar, A., Gadde, R., Weng, F., 1999. SRI's 1998 broadcast news system - towards faster, smaller, and better speech recognition. In: DARPA Broadcast News Workshop, pp. 281-286.

34
- 33745216683
- Stolcke, A., Ferrer, L., Kajarekar, S., Shriberg, E., Venkataraman, A., 2005. MLLR transforms as features in speaker recognition. In: Proc. of Eurospeech, pp. 2425-2428.

35
- 34047270914
- Recent innovations in speech-to-text transcription at SRI-ICSI-UW
- Stolcke A., Chen B., Franco H., Gadde R., Graciarena M., Hwang M.Y., Kirchhoff K., Lei X., Mandal A., Morgan N., Ng T., Ostendorf M., Sonmez K., Venkataraman A., Vergyri D., Wang W., Zheng J., and Zhu Q. Recent innovations in speech-to-text transcription at SRI-ICSI-UW. IEEE Transactions on Audio, Speech and Language Processing 14 5 (2006) 1729-1744
- (2006) IEEE Transactions on Audio, Speech and Language Processing , vol.14 , Issue.5 , pp. 1729-1744
- Stolcke, A.¹ Chen, B.² Franco, H.³ Gadde, R.⁴ Graciarena, M.⁵ Hwang, M.Y.⁶ Kirchhoff, K.⁷ Lei, X.⁸ Mandal, A.⁹ Morgan, N.¹⁰ Ng, T.¹¹ Ostendorf, M.¹² Sonmez, K.¹³ Venkataraman, A.¹⁴ Vergyri, D.¹⁵ Wang, W.¹⁶ Zheng, J.¹⁷ Zhu, Q.¹⁸

36
- 85080012139
- Venkataraman, A., Stolcke, A., Wang, W., Vergyri, D., Gadde, V., Zheng, J., 2004. SRI's 2004 broadcast news speech to text system. In: EARS RT04 Workshop.

37
- 0029726509
- Woodland, P., Gales, M., Pye, D., 1996. Improving environmental robustness in large vocabulary speech recognition. In: Proc. of ICASSP, vol. 1, pp. 65-68.

38
- 85079920297
- Young, S., Odell, J., Woodland, P., 1994. Tree based state tying for high accuracy modelling. In: Proc. ARPA Spoken Language Technology Workshop, pp. 405-410.

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.