Volume 51, Issue 1, 2009, Pages 42-57

Techniques in rapid unsupervised speaker adaptation based on HMM-Sufficient Statistics

Author keywords

Adaptation based on speaker selection; HMM Sufficient Statistics; Rapid adaptation; Single utterance adaptation; Speaker adaptation; Unsupervised adaptation

Indexed keywords

HIDDEN MARKOV MODELS; REGRESSION ANALYSIS; SPEECH; SPEECH ANALYSIS; SPEECH PROCESSING; STATISTICAL METHODS; STATISTICS; TARGETS

EID: 55049094528     PISSN: 01676393     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.specom.2008.05.014     Document Type: Article
Times cited: 7

References (27)
  • 1
    • 0030362995
    • Anastasakos, T., McDonough, J., Schwartz, R., Makhoul, J., 1996. A compact model for speaker-adaptive training. In: Proceedings of ICSLP, October 1996.
  • 2
    • 85009067411
    • Baba, A., Yoshizawa, S., Yamada, M., Lee, A., Shikano, K., 2001. Elderly acoustic model for large vocabulary continuous speech recognition. In: Proceedings of EUROSPEECH, pp. 1657-1660.
  • 4
    • 0028419019
    • Gauvain, J., Lee, C.H., 1994. Maximum a posteriori estimation for multivariate Gaussian mixture observation of Markov Chains. IEEE Transactions SAP 2, 291-298.
  • 5
    • 85143191399
    • Giuliani, D., Gerosa, M., 2003. Investigating recognition of children's speech. Proceedings of ICASSP 2, 137-140.
  • 6
    • 55049093772
    • Gomez, R., Lee, A., Saruwatari, H., Shikano, K., 2005. Rapid unsupervised speaker adaptation based on multi-template HMM Sufficient Statistics in noisy environments. In: Proceedings of EUROSPEECH, pp. 296-301.
  • 7
    • 55049094679
    • Gomez, R., Lee, A., Saruwatari, H., Shikano, K., 2005. Speaker-class reduction for HMM-Sufficient Statistics adaptation using multiple acoustic models. In: Proceedings of Acoustical Society of Japan, March 2005.
  • 8
    • 33645787305
    • Gomez, R., Lee, A., Toda, T., Saruwatari, H., Shikano, K., 2006. Improving rapid unsupervised speaker adaptation based on HMM-Sufficient Statistics in noisy environments using multi-template models. IEICE Special Issue on Statistical Modeling for Speech Processing E89-D (3).
  • 9
    • 33847209621
    • Gomez, R., Toda, T., Saruwatari, H., Shikano, K., 2007. Reducing computation time of the rapid unsupervised speaker adaptation based on HMM-Sufficient Statistics. IEICE Transactions on Information and Systems E90-D (2).
  • 10
    • 55049122617
    • Gomez, R., Toda, T., Saruwatari, H., Shikano, K., in press. Improving rapid unsupervised speaker adaptation based on HMM-Sufficient Statistics. In: Proceedings of ICASSP.
  • 11
    • 55049094047
    • HTK: Hidden Markov Model Toolkit. Cambridge University Engineering Department. www.htk.eng.cam.ac.uk.
  • 12
    • 85009113198
    • Huang, C., Chen, T., Li, S., Zhou, J.L., 2001. Analysis of speaker variability. Proceedings of Eurospeech 2, 1377-1380.
  • 13
    • 85009133701
    • Huang, C., Chen, T., Chan, E., 2004. Transformation and combination of hidden Markov models for speaker selection training. In: Proceedings of ICSLP.
  • 14
    • 55049121253
    • Lee, A. JULIUS: A Free Continuous Speech Recognition Software. Nagoya Institute of Technology, Japan. www.sourceforge.jp.
  • 15
    • 0033721605
    • Lee, A., Kawahara, T., Takeda, K., Shikano, K., 2000. A new phonetic tied-mixture model for efficient decoding. In: Proceedings of ICASSP, pp. 1269-1272.
  • 16
    • 0029288633
    • Leggetter, C.J., Woodland, P.C., 1995. Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models. Computer Speech and Language 9, 171-185.
  • 17
    • 55049137513
    • Matsoukas, S., Schwartz, R., Jin, H., Nguyen, L., 1997. Practical implementation of speaker adaptive training. In: Proceedings ARPA Workshop in Speech Recognition, 1997.
  • 18
    • 0030682285
    • Matsui, T., Matsuoka, T., Furui, S., 1997. Smoothed N-best based speaker adaptation for speech recognition. In: Proceedings of ICASSP, pp. 1015-1018.
  • 19
    • 55049140678
    • Minka, T., 1998. Expectation-maximization as lower bound maximization. http://www-white.media.mit.edu/tp-minka/papers/em.html.
  • 20
    • 55049103410
    • Neumeyer, L., Sankar, A., Digalakis, V., 1995. A comparative study of speaker adaptation techniques. Proceedings European Conference on Speech Communication and Technology 2, 1127-1130.
  • 21
    • 0030672082
    • Pye, D., Woodland, P.C., 1997. Experiments in speaker normalisation and adaptation for large vocabulary adaptation. Proceedings of ICASSP 2 (1), 1047-1051.
  • 22
    • 55049134824
    • Vatbhava, G., Karthik, V., Ramesh, G., 2001. Rapid adaptation with linear combinations of rank-one matrices. In: Proceedings of ICASSP.
  • 23
    • 33645764424
    • Xiang, B., Nguyen, L., Matsoukas, S., Schwartz, R., 2005. Cluster-dependent acoustic modeling. Proceedings of ICASSP 1, 677-680.
  • 24
    • 85009144926
    • Yamada, M., Baba, A., Yoshizawa, S., Lee, A., Saruwatari, H., Shikano, K., 2001. Unsupervised noisy environment adaptation algorithm using MLLR and speaker selection. In: Proceedings of 7th European Conference on Speech Communication and Technology, pp. 869-872.
  • 25
    • 55049121571
    • Yamade, S., Matsunami, K., Baba, A., Lee, A., Saruwatari, H., Shikano, K., 2000. Spectral subtraction in noisy environments applied to speaker adaptation based on HMM Sufficient Statistics. In: Proceedings of ICSLP, pp. I-1045-I-1048.
  • 26
    • 0034848875
    • Yoshizawa, S., Baba, A., Matsunami, K., Mera, Y., Yamada, M., Shikano, K., 2001. Unsupervised speaker adaptation based on sufficient HMM statistics of selected speakers. In: Proceedings of ICASSP.
  • 27
    • 55049108736
    • Zhan, P., Westphal, M., Finke, M., Waibel, A., 1997. Speaker normalization and speaker adaptation - a combination for conversational speech recognition. Proceedings of Eurospeech, 2087-2090.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.