Volume 23, Issue 2, 2009, Pages 176-199

Improving robustness of MLLR adaptation with speaker-clustered regression class trees

Author keywords

Regression class trees; Speaker adaptation; Speaker clustering; Speech recognition

Indexed keywords

BOOLEAN FUNCTIONS; ERROR ANALYSIS; LEARNING SYSTEMS; MATHEMATICAL TRANSFORMATIONS; MAXIMUM LIKELIHOOD ESTIMATION; REGRESSION ANALYSIS; TREES (MATHEMATICS)

EID: 53849127143     PISSN: 08852308     EISSN: 10958363     Source Type: Journal
DOI: 10.1016/j.csl.2008.05.004     Document Type: Article
Times cited: 9

References (38)
  • 1
    • 0030677475
    • Anastasakos, T., McDonough, J., Makhoul, J., 1997. Speaker adaptive training: a maximum likelihood approach to speaker normalization. In: Proc. of ICASSP, vol. 2, pp. 1043-1046.
  • 2
    • 0032657749
    • Bocchieri, E., Digalakis, V., Corduneanu, A., Boulis, C., 1999. Correlation modeling of MLLR transform biases for rapid HMM adaptation to new speakers. In: Proc. of ICASSP, vol. 2, pp. 773-776.
  • 3
    • 0035412897
    • Boulis, C., Diakoloukas, V., Digalakis, V., 2001. Maximum likelihood stochastic transformations adaptation for medium and small data sets. Computer Speech & Language 15 (3), 257-287.
  • 4
    • 85009097035
    • Chen, K.T., Liau, W.W., Wang, H.M., Lee, L.S., 2000. Fast speaker adaptation using eigenspace-based maximum likelihood linear regression. In: Proc. of ICSLP, vol. III, pp. 742-745.
  • 5
    • 84959118000
    • Cieri, C., Miller, D., Walker, K., 2004. The Fisher corpus: a resource for the next generations of speech-to-text. In: Fourth International Conference on Language Resources and Evaluation.
  • 8
    • 33745193034
    • Ferrer, L., Sönmez, K., Kajarekar, S., 2005. Class-based score combination for speaker recognition. In: Proc. of Eurospeech, pp. 2173-2176.
  • 9
    • 85079937920
    • Gales, M., 1996. The generation and use of regression class trees for MLLR adaptation. Tech. Rep. CUED/F-INFENG/TR263, Cambridge University.
  • 10
    • 85079947237
    • Gales, M., 1997. Transformation smoothing for speaker and environmental adaptation. In: Proc. of Eurospeech, vol. 4, pp. 2067-2070.
  • 11
    • 0032050110
    • Gales, M., 1998. Maximum likelihood linear transformations for HMM-based speech recognition. Computer Speech & Language 12, 75-98.
  • 13
    • 0030263447
    • Gales, M., Woodland, P., 1996. Mean and variance compensation within the MLLR framework. Computer Speech & Language 10, 249-264.
  • 14
    • 0035279117
    • Haeb-Umbach, R., 2001. Automatic generation of phonetic regression class trees for MLLR adaptation. IEEE Transactions on Speech and Audio Processing 9 (3), 299-302.
  • 15
    • 85009113198
    • Huang, C., Chen, T., Li, S., Chang, E., Zhou, J., 2001. Analysis of speaker variability. In: Proc. of Eurospeech, vol. 2, pp. 1377-1380.
  • 16
    • 14644420596
    • Hwang, M.-Y., Huang, X., 1998. Dynamically configurable acoustic models for speech recognition. In: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP'98, vol. 2, pp. 669-672.
  • 17
    • 34547543570
    • Hwang, M.-Y., Lei, X., Wang, W., Shinozaki, T., 2006. Investigation on Mandarin broadcast news speech recognition. In: Proc. of ICSLP, pp. 1233-1236.
  • 18
    • 0026405257
    • Imamura, A., 1991. Speaker adaptive HMM-based speech recognition with a stochastic speaker classifier. In: Proc. of ICASSP, vol. 2, pp. 841-844.
  • 20
    • 85009078667
    • Kosaka, T., Sagayama, S., 1994. Tree structured speaker clustering for fast speaker adaptation. In: Proc. of ICASSP, vol. 1, pp. 245-248.
  • 22
    • 85079919421
    • Labov, W., 1996. The organization of dialect diversity in North America. In: Fourth International Conference on Spoken Language Processing.
  • 23
    • 85080010507
    • Leggetter, C., 1995. Improved acoustic modelling for HMMs using linear transformations. Ph.D. Thesis, University of Cambridge.
  • 24
    • 0029288633
    • Leggetter, C., Woodland, P., 1995. Maximum likelihood linear regression for speaker adaptation of HMMs. Computer Speech & Language 9, 171-185.
  • 25
    • 33646780066
    • Mak, B., Hsiao, R., 2004. Improving eigenspace-based MLLR adaptation by kernel PCA. In: Proc. of ICSLP, vol. I, pp. 13-16.
  • 26
    • 33745214663
    • Mandal, A., Ostendorf, M., Stolcke, A., 2005. Leveraging speaker-dependent variation of adaptation. In: Proc. of Eurospeech, pp. 1793-1796.
  • 27
    • 44949085751
    • Mandal, A., Ostendorf, M., Stolcke, A., 2006. Speaker clustered regression-class trees for MLLR adaptation. In: Proc. of ICSLP, pp. 1133-1136.
  • 30
    • 85080000755
    • R Development Core Team, 2005. R: a language and environment for statistical computing. R Foundation for Statistical Computing, ISBN 3-900051-07-0.
  • 31
    • 85079984138
    • Sankar, A., Beaufays, F., Digalakis, V., 1995. Training data clustering for improved speech recognition. In: Proc. of Eurospeech, vol. 1, pp. 502-505.
  • 32
    • 0029726519
    • Sankar, A., Neumeyer, L., Weintraub, M., 1996. An experimental study of acoustic adaptation algorithms. In: Proc. of ICASSP, vol. 2, pp. 713-716.
  • 33
    • 85079975000
    • Sankar, A., Gadde, R., Weng, F., 1999. SRI's 1998 broadcast news system - towards faster, smaller, and better speech recognition. In: DARPA Broadcast News Workshop, pp. 281-286.
  • 34
    • 33745216683
    • Stolcke, A., Ferrer, L., Kajarekar, S., Shriberg, E., Venkataraman, A., 2005. MLLR transforms as features in speaker recognition. In: Proc. of Eurospeech, pp. 2425-2428.
  • 36
    • 85080012139
    • Venkataraman, A., Stolcke, A., Wang, W., Vergyri, D., Gadde, V., Zheng, J., 2004. SRI's 2004 broadcast news speech to text system. In: EARS RT04 Workshop.
  • 37
    • 0029726509
    • Woodland, P., Gales, M., Pye, D., 1996. Improving environmental robustness in large vocabulary speech recognition. In: Proc. of ICASSP, vol. 1, pp. 65-68.
  • 38
    • 85079920297
    • Young, S., Odell, J., Woodland, P., 1994. Tree based state tying for high accuracy modelling. In: Proc. ARPA Spoken Language Technology Workshop, pp. 405-410.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.