SCOPUS Information Retrieval Platform

Volume 51, Issue 1, 2009, Pages 42-57

Techniques in rapid unsupervised speaker adaptation based on HMM-Sufficient Statistics

(4) Gomez, Randy (a); Toda, Tomoki (a); Saruwatari, Hiroshi (a); Shikano, Kiyohiro (a)

(a) NARA INSTITUTE OF SCIENCE AND TECHNOLOGY (Japan)

Author keywords

Adaptation based on speaker selection; HMM Sufficient Statistics; Rapid adaptation; Single utterance adaptation; Speaker adaptation; Unsupervised adaptation

Indexed keywords

HIDDEN MARKOV MODELS; REGRESSION ANALYSIS; SPEECH; SPEECH ANALYSIS; SPEECH PROCESSING; STATISTICAL METHODS; STATISTICS; TARGETS;

ADAPTATION BASED ON SPEAKER SELECTION; HMM-SUFFICIENT STATISTICS; RAPID ADAPTATION; SINGLE-UTTERANCE ADAPTATION; SPEAKER ADAPTATION; UNSUPERVISED ADAPTATION;

SPEECH RECOGNITION;

EID: 55049094528 PISSN: 01676393 EISSN: None Source Type: Journal
DOI: 10.1016/j.specom.2008.05.014 Document Type: Article

Times cited : (7)

References (27)

1
- 0030362995
- Anastasakos, T., McDonough, J., Schwartz, R., Makhoul, J., 1996. A compact model for speaker-adaptive training. In: Proceedings of ICSLP, October 1996.

2
- 85009067411
- Baba, A., Yoshizawa, S., Yamada, M., Lee, A., Shikano, K., 2001. Elderly acoustic model for large vocabulary continuous speech recognition. In: Proceedings of EUROSPEECH, pp. 1657-1660.

3
- 0030263447
- Mean and variance adaptation within the MLLR framework
- Gales M.J.F., and Woodland P.C. Mean and variance adaptation within the MLLR framework. Proceedings of Computer Speech and Language 10 (1996) 249-264
- (1996) Proceedings of Computer Speech and Language , vol.10 , pp. 249-264
- Gales, M.J.F.¹ Woodland, P.C.²

4
- 0028419019
- Maximum a posteriori estimation for multivariate Gaussian mixture observation of Markov Chains
- Gauvain J., and Lee C.H. Maximum a posteriori estimation for multivariate Gaussian mixture observation of Markov Chains. IEEE Transactions SAP 2 (1994) 291-298
- (1994) IEEE Transactions SAP , vol.2 , pp. 291-298
- Gauvain, J.¹ Lee, C.H.²

5
- 85143191399
- Giuliani, Gerosa, M., 2003. Investigating recognition of children's speech. Proceedings of ICASSP 2, 137-140.

6
- 55049093772
- Gomez, R., Lee, A., Saruwatari, H., Shikano, K., 2005. Rapid unsupervised speaker adaptation based on multi-template HMM Sufficient Statistics in noisy environments. In: Proceedings of EUROSPEECH, pp. 296-301.

7
- 55049094679
- Gomez, R., Lee, A., Saruwatari, H., Shikano, K., 2005. Speaker-class reduction for HMM-Sufficient Statistics adaptation using multiple acoustic models. In: Proceedings of Acoustical Society of Japan, March 2005.

8
- 33645787305
- Gomez, R., Lee, A., Toda, T., Saruwatari, H., Shikano, K., 2006. Improving rapid unsupervised speaker adaptation based on HMM-Sufficient Statistics in noisy environments using multi-template models. IEICE Special Issue on Statistical Modeling for Speech Processing E89-D (3).

9
- 33847209621
- Gomez, R., Toda, T., Saruwatari, H., Shikano, K., 2007. Reducing computation time of the rapid unsupervised speaker adaptation based on HMM-Sufficient Statistics. IEICE Transactions on Information and Systems E90-D (2).

10
- 55049122617
- Gomez, R., Toda, T., Saruwatari, H., Shikano, K., in press. Improving rapid unsupervised speaker adaptation based on HMM-Sufficient Statistics. In: Proceedings of ICASSP.

11
- 55049094047
- HTK: Hidden Markov Model Toolkit. Cambridge University Engineering Department. www.htk.eng.cam.ac.uk.

12
- 85009113198
- Huang, C., Chen, T., Li, S., Zhou, J.L., 2001. Analysis of speaker variability. Proceedings of Eurospeech 2, 1377-1380.

13
- 85009133701
- Huang, C., Chen, T., Chan, E., 2004. Transformation and combination of hidden Markov models for speaker selection training. In: Proceedings of ICSLP.

14
- 55049121253
- Lee, A. JULIUS: A Free Continuous Speech Recognition Software. Nagoya Institute of Technology, Japan. www.sourceforge.jp.

15
- 0033721605
- Lee, A., Kawahara, T., Takeda, K., Shikano, K., 2000. A new phonetic tied-mixture model for efficient decoding. In: Proceedings of ICASSP, pp. 1269-1272.

16
- 0029288633
- Leggetter, C.J., Woodland, P.C., 1995. Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models. Proceedings of Computer Speech and Language 9, 171-185.

17
- 55049137513
- Matsoukas, S., Schwartz, R., Jin, H., Nguyen, L., 1997. Practical implementation of speaker adaptive training. In: Proceedings ARPA Workshop in Speech Recognition, 1997.

18
- 0030682285
- Matsui, T., Matsuoka, T., Furui, S., 1997. Smoothed N-best based speaker adaptation for speech recognition. In: Proceedings of ICASSP, pp. 1015-1018.

19
- 55049140678
- Minka, T., 1998. Expectation-maximization as lower bound maximization. http://www-white.media.mit.edu/tp-minka/papers/em.html.

20
- 55049103410
- Neumeyer, L., Sankar, A., Digalakis, V., 1995. A comparative study of speaker adaptation techniques. Proceedings European Conference on Speech Communication and Technology 2, 1127-1130.

21
- 0030672082
- Pye, D., Woodland, P.C., 1997. Experiments in speaker normalisation and adaptation for large vocabulary adaptation. Proceedings of ICASSP 2 (1), 1047-1051.

22
- 55049134824
- Vatbhava, G., Karthik, V., Ramesh, G., 2001. Rapid adaptation with linear combinations of rank-one matrices. In: Proceedings of ICASSP.

23
- 33645764424
- Xiang, B., Nguyen, L., Matsoukas, S., Schwartz, R., 2005. Cluster-dependent acoustic modeling. Proceedings of ICASSP 1, 677-680.

24
- 85009144926
- Yamada, M., Baba, A., Yoshizawa, S., Lee, A., Saruwatari, H., Shikano, K., 2001. Unsupervised noisy environment adaptation algorithm using MLLR and speaker selection. In: Proceedings of 7th European Conference on Speech Communication and Technology, pp. 869-872, 2001.

25
- 55049121571
- Yamade, S., Matsunami, K., Baba, A., Lee, A., Saruwatari, H., Shikano, K., 2000. Spectral subtraction in noisy environments applied to speaker adaptation based on HMM Sufficient Statistics. In: Proceedings of ICSLP, pp. I-1045-I-1048.

26
- 0034848875
- Yoshizawa, S., Baba, A., Matsunami, K., Mera, Y., Yamada, M., Shikano, K., 2001. Unsupervised speaker adaptation based on sufficient HMM statistics of selected speakers. In: Proceedings of ICASSP.

27
- 55049108736
- Zhan, P., Westphal, M., Finke, M., Waibel, A., 1997. Speaker normalization and speaker adaptation - a combination for conversational speech recognition. Proceedings of Eurospeech, 2087-2090.

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.