SCOPUS Information Search Platform

Speech Communication

Volume 42, Issue 1, 2004, Pages 75-91

Speaker adaptation with all-pass transforms

Authors (3): McDonough, John (a); Schaaf, Thomas (a); Waibel, Alex (a)

(a) University of Karlsruhe (Germany)

Author keywords

Speaker adaptation; Speech recognition

Indexed keywords

ERROR COMPENSATION; MARKOV PROCESSES; MATHEMATICAL TRANSFORMATIONS; MAXIMUM LIKELIHOOD ESTIMATION; PARAMETER ESTIMATION; PROBABILITY DENSITY FUNCTION; REGRESSION ANALYSIS; SPEECH COMMUNICATION;

BILINEAR TRANSFORMS (BT); EXPECTATION MAXIMIZATION (EM) ALGORITHM; SPEAKER ADAPTATION;

SPEECH RECOGNITION;

EID: 0347269184
PISSN: 01676393
EISSN: None
Source Type: Journal
DOI: 10.1016/j.specom.2003.09.005
Document Type: Conference Paper

Times cited: 17

References (41)

1
- 0004319975
- Ph.D. thesis, Carnegie Mellon University, Pittsburgh, PA
- Acero, A., 1990. Acoustical and environmental robustness in automatic speech recognition. Ph.D. thesis, Carnegie Mellon University, Pittsburgh, PA.
- (1990) Acoustical and Environmental Robustness in Automatic Speech Recognition
- Acero, A.¹

2
- 0030362995
- A compact model for speaker-adaptive training
- Anastasakos, T., McDonough, J., Schwartz, R., Makhoul, J., 1996. A compact model for speaker-adaptive training. In: Proc. ICSLP.
- (1996) Proc. ICSLP
- Anastasakos, T.¹ McDonough, J.² Schwartz, R.³ Makhoul, J.⁴

3
- 0141629128
- Experiments in vocal tract normalization
- Andreou, A., Kamm, T., Cohen, J., 1994. Experiments in vocal tract normalization. In: Proc. CAIP Workshop: Frontiers in Speech Recognition II.
- (1994) Proc. CAIP Workshop: Frontiers in Speech Recognition II
- Andreou, A.¹ Kamm, T.² Cohen, J.³

4
- 0032657749
- Correlation modeling of MLLR transform biases for rapid HMM adaptation to new speakers
- Bocchieri, E., Digalakis, V., Corduneanu, A., Boulis, C., 1999. Correlation modeling of MLLR transform biases for rapid HMM adaptation to new speakers. In: Proc. ICASSP, Vol. II, pp. 773-776.
- (1999) Proc. ICASSP , vol.2 , pp. 773-776
- Bocchieri, E.¹ Digalakis, V.² Corduneanu, A.³ Boulis, C.⁴

5
- 0003751555
- New York: McGraw-Hill
- Churchill R.V., Brown J.W. Complex Variables and Applications. fifth ed. 1990;McGraw-Hill, New York.
- (1990) Complex Variables and Applications. Fifth Ed.
- Churchill, R.V.¹ Brown, J.W.²

6
- 0002629270
- Maximum likelihood from incomplete data via the EM algorithm
- Dempster A.P., Laird N.M., Rubin D.B. Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Statist. Soc. 39B:1977;1-38.
- (1977) J. Roy. Statist. Soc. , vol.39 B , pp. 1-38
- Dempster, A.P.¹ Laird, N.M.² Rubin, D.B.³

7
- 0029375590
- Fast speaker adaptation using constrained estimation of Gaussian mixtures
- Digalakis V., Rtischev D., Neumeyer L. Fast speaker adaptation using constrained estimation of Gaussian mixtures. IEEE Trans. Speech Audio Process. 3:1995;357-366.
- (1995) IEEE Trans. Speech Audio Process. , vol.3 , pp. 357-366
- Digalakis, V.¹ Rtischev, D.² Neumeyer, L.³

8
- 0346069161
- Rapid speech recognizer adaptation to new speakers
- Digalakis, V., Berkowitz, S., Bocchieri, E., Boulis, C., Byrne, W., Collier, H., Corduneanu, A., Kannan, A., Khudanpur, S., Sankar, A., 1996. Rapid speech recognizer adaptation to new speakers. In: Proc. ICASSP, Vol. I, pp. 339-341.
- (1996) Proc. ICASSP , vol.1 , pp. 339-341
- Digalakis, V.¹ Berkowitz, S.² Bocchieri, E.³ Boulis, C.⁴ Byrne, W.⁵ Collier, H.⁶ Corduneanu, A.⁷ Kannan, A.⁸ Khudanpur, S.⁹ Sankar, A.¹⁰

9
- 85009236682
- Implementing vocal tract length normalization in the MLLR framework
- Ding, G.-H., Zhu, Y.-F., Li, C., Xu, B., 2002. Implementing vocal tract length normalization in the MLLR framework. In: ICSLP, pp. 1389-1392.
- (2002) ICSLP , pp. 1389-1392
- Ding, G.-H.¹ Zhu, Y.-F.² Li, C.³ Xu, B.⁴

10
- 0029725604
- A parametric approach to vocal tract length normalization
- Eide, E., Gish, H., 1996. A parametric approach to vocal tract length normalization. In: Proc. ICASSP, Vol. I, pp. 346-348.
- (1996) Proc. ICASSP , vol.1 , pp. 346-348
- Eide, E.¹ Gish, H.²

11
- 0032050110
- Maximum likelihood linear transformations for HMM-based speech recognition
- Gales M.J.F. Maximum likelihood linear transformations for HMM-based speech recognition. Computer Speech Language. 12:1998;75-98.
- (1998) Computer Speech Language , vol.12 , pp. 75-98
- Gales, M.J.F.¹

12
- 0032638856
- Semi-tied covariance matrices for hidden Markov models
- Gales M.J.F. Semi-tied covariance matrices for hidden Markov models. IEEE Trans. Speech Audio Process. 7:1999;272-281.
- (1999) IEEE Trans. Speech Audio Process. , vol.7 , pp. 272-281
- Gales, M.J.F.¹

13
- 0030263447
- Mean and variance adaptation within the MLLR framework
- Gales M.J.F., Woodland P.C. Mean and variance adaptation within the MLLR framework. Computer Speech Language. 10:1996;249-264.
- (1996) Computer Speech Language , vol.10 , pp. 249-264
- Gales, M.J.F.¹ Woodland, P.C.²

14
- 0004240547
- London: Academic Press
- Gill P.E., Murray W., Wright M.H. Practical Optimization. 1981;Academic Press, London.
- (1981) Practical Optimization
- Gill, P.E.¹ Murray, W.² Wright, M.H.³

15
- 0033708818
- Robust estimation for rapid speaker adaptation using discounted likelihood techniques
- 5-9 June 2000
- Gunawardana, A., Byrne, W., 2000. Robust estimation for rapid speaker adaptation using discounted likelihood techniques, IEEE ICASSP, 5-9 June 2000, Vol. 2, pp. II985-II988.
- (2000) IEEE ICASSP , vol.2
- Gunawardana, A.¹ Byrne, W.²

16
- 0025041264
- Perceptual linear predictive (PLP) analysis of speech
- Hermansky H. Perceptual linear predictive (PLP) analysis of speech. J. Acoust. Soc. Am. 87(4):1990;1738-1752.
- (1990) J. Acoust. Soc. Am. , vol.87 , Issue.4 , pp. 1738-1752
- Hermansky, H.¹

17
- 0032653576
- Tree-structured models of parameter dependence for rapid adaptation in large vocabulary conversational speech recognition
- Kannan, A., Khudanpur, S., 1996. Tree-structured models of parameter dependence for rapid adaptation in large vocabulary conversational speech recognition. In: Proc. ICASSP, Vol. II, pp. 769-772.
- (1996) Proc. ICASSP , vol.2 , pp. 769-772
- Kannan, A.¹ Khudanpur, S.²

18
- 0034320005
- Speaker adaptation in eigenvoice space
- Kuhn R., Junqua J.-C., Nguyen P., Niedzielski N. Speaker adaptation in eigenvoice space. IEEE Trans. Speech Audio Process. 8(6):2000;695-707.
- (2000) IEEE Trans. Speech Audio Process. , vol.8 , Issue.6 , pp. 695-707
- Kuhn, R.¹ Junqua, J.-C.² Nguyen, P.³ Niedzielski, N.⁴

19
- 0032289099
- Heteroscedastic discriminant analysis and reduced rank HMMs for improved speech recognition
- Kumar N., Andreou A.G. Heteroscedastic discriminant analysis and reduced rank HMMs for improved speech recognition. Speech Commun. 26:1998;238-297.
- (1998) Speech Commun. , vol.26 , pp. 238-297
- Kumar, N.¹ Andreou, A.G.²

20
- 0029747183
- Speaker normalization using efficient frequency warping procedures
- Lee, L., Rose, R.C., 1996. Speaker normalization using efficient frequency warping procedures. In: Proc. ICASSP, Vol. I, pp. 353-356.
- (1996) Proc. ICASSP , vol.1 , pp. 353-356
- Lee, L.¹ Rose, R.C.²

21
- 0029288633
- Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
- Leggetter C.J., Woodland P.C. Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models. Computer Speech Language. 9:1995;171-185.
- (1995) Computer Speech Language , vol.9 , pp. 171-185
- Leggetter, C.J.¹ Woodland, P.C.²

22
- 0003488911
- New York: Addison-Wesley
- Luenberger D.G. Linear and Nonlinear Programming. second ed. 1984;Addison-Wesley, New York.
- (1984) Linear and Nonlinear Programming. Second Ed.
- Luenberger, D.G.¹

23
- 0347330345
- Bases in Hilbert space related to the representation of stationary operators
- Masry E., Stieglitz K., Liu B. Bases in Hilbert space related to the representation of stationary operators. SIAM J. Appl. Math. 16:1968;552-562.
- (1968) SIAM J. Appl. Math. , vol.16 , pp. 552-562
- Masry, E.¹ Stieglitz, K.² Liu, B.³

24
- 0347960638
- On the estimation of optimal regression classes for speaker adaptation
- Center for Language and Speech Processing, The Johns Hopkins University
- McDonough, J.W., 1998. On the estimation of optimal regression classes for speaker adaptation. Tech. Rep. 36, Center for Language and Speech Processing, The Johns Hopkins University.
- (1998) Tech. Rep. , vol.36
- McDonough, J.W.¹

25
- 85017327824
- The Homewood extensions
- Center for Language and Speech Processing, The Johns Hopkins University, Baltimore, MD
- McDonough, J.W., 1999. The Homewood extensions. Tech. Rep. 39, Center for Language and Speech Processing, The Johns Hopkins University, Baltimore, MD.
- (1999) Tech. Rep. , vol.39
- McDonough, J.W.¹

26
- 84937324786
- Ph.D. thesis, The Johns Hopkins University, Baltimore, MD
- McDonough, J.W., 2000. Speaker compensation with all-pass transforms. Ph.D. thesis, The Johns Hopkins University, Baltimore, MD.
- (2000) Speaker Compensation with All-pass Transforms
- McDonough, J.W.¹

27
- 0346699960
- On maximum mutual information speaker-adapted training
- Universität Karlsruhe
- McDonough, J.W., 2001. On maximum mutual information speaker-adapted training. Tech. Rep. 103, Universität Karlsruhe.
- (2001) Tech. Rep. , vol.103
- McDonough, J.W.¹

28
- 0346699959
- Performance comparisons of all-pass transform adaptation with maximum likelihood linear regression
- Universität Karlsruhe
- McDonough, J., Waibel, A., 2003. Performance comparisons of all-pass transform adaptation with maximum likelihood linear regression. Tech. Rep. 102, Universität Karlsruhe.
- (2003) Tech. Rep. , vol.102
- McDonough, J.¹ Waibel, A.²

29
- 84902047630
- Single-pass adapted training with all-pass transforms
- McDonough, J.W., Byrne, W., 1999. Single-pass adapted training with all-pass transforms. In: Proc. Eurospeech.
- (1999) Proc. Eurospeech
- McDonough, J.W.¹ Byrne, W.²

30
- 85128432219
- Speaker normalization with all-pass transforms
- McDonough, J., Byrne, W., Luo, X., 1998. Speaker normalization with all-pass transforms. In: Proc. ICSLP.
- (1998) Proc. ICSLP
- McDonough, J.¹ Byrne, W.² Luo, X.³

31
- 0001052406
- Discrete-time representation of signals
- Oppenheim A.V., Johnson D.H. Discrete-time representation of signals. Proc. IEEE. 60(6):1972;681-691.
- (1972) Proc. IEEE , vol.60 , Issue.6 , pp. 681-691
- Oppenheim, A.V.¹ Johnson, D.H.²

32
- 0003513556
- Englewood Cliffs, NJ: Prentice-Hall
- Oppenheim A.V., Schafer R.W. Discrete-Time Signal Processing. 1989;Prentice-Hall, Englewood Cliffs, NJ.
- (1989) Discrete-time Signal Processing
- Oppenheim, A.V.¹ Schafer, R.W.²

33
- 0347960630
- Vocal tract normalization equals linear transformation in cepstral space
- Pitz, M., Molau, S., Schlüter, R., Ney, H., 2001. Vocal tract normalization equals linear transformation in cepstral space. In: Eurospeech, pp. 721-724.
- (2001) Eurospeech , pp. 721-724
- Pitz, M.¹ Molau, S.² Schlüter, R.³ Ney, H.⁴

34
- 0030672082
- Experiments in speaker normalisation and adaptation for large vocabulary speech recognition
- Pye, D., Woodland, P.C., 1997. Experiments in speaker normalisation and adaptation for large vocabulary speech recognition. In: Proc. ICASSP, Vol. II, pp. 1047-1050.
- (1997) Proc. ICASSP , vol.2 , pp. 1047-1050
- Pye, D.¹ Woodland, P.C.²

35
- 0030149866
- A maximum-likelihood approach to stochastic matching for robust speech recognition
- Sankar A., Lee C.-H. A maximum-likelihood approach to stochastic matching for robust speech recognition. IEEE Trans. Speech Audio Process. 4(3):1996;190-201.
- (1996) IEEE Trans. Speech Audio Process. , vol.4 , Issue.3 , pp. 190-201
- Sankar, A.¹ Lee, C.-H.²

36
- 0033677121
- Maximum likelihood discriminant feature spaces
- Saon, G., Zweig, G., Padmanabhan, M., 2000. Maximum likelihood discriminant feature spaces. In: Proc. ICASSP.
- (2000) Proc. ICASSP
- Saon, G.¹ Zweig, G.² Padmanabhan, M.³

37
- 0003459982
- Evaluation of LPC spectral matching measures for phonetic unit recognition
- Computer Science Department, Carnegie Mellon University, Pittsburgh, PA
- Shikano, K., 1986. Evaluation of LPC spectral matching measures for phonetic unit recognition. Tech. Rep., Computer Science Department, Carnegie Mellon University, Pittsburgh, PA.
- (1986) Tech. Rep.
- Shikano, K.¹

38
- 0029764708
- Speaker normalization on conversational telephone speech
- Wegmann, S., McAllaster, D., Orloff, J., Peskin, B., 1996. Speaker normalization on conversational telephone speech. In: Proc. ICASSP, Vol. I, pp. 339-341.
- (1996) Proc. ICASSP , vol.1 , pp. 339-341
- Wegmann, S.¹ McAllaster, D.² Orloff, J.³ Peskin, B.⁴

39
- 0002867698
- Large scale discriminative training for speech recognition
- Woodland, P., Povey, D., 2000. Large scale discriminative training for speech recognition. In: ISCA ITRW Automatic Speech Recognition: Challenges for the Millenium, pp. 7-16.
- (2000) ISCA ITRW Automatic Speech Recognition: Challenges for the Millenium , pp. 7-16
- Woodland, P.¹ Povey, D.²

40
- 0003822743
- Cambridge: Entropic Software
- Young S., Odell J., Ollason D., Valtchev V., Woodland P. The HTK Book. 1999;Entropic Software, Cambridge.
- (1999) The HTK Book
- Young, S.¹ Odell, J.² Ollason, D.³ Valtchev, V.⁴ Woodland, P.⁵

41
- 0347960637
- Translation of divers' speech using digital frequency warping
- Res. Lab. Electron., Massachusetts Institute of Technology, Cambridge, MA
- Zue, V., 1971. Translation of divers' speech using digital frequency warping. Tech. Rep. 101, Res. Lab. Electron., Massachusetts Institute of Technology, Cambridge, MA.
- (1971) Tech. Rep. , vol.101
- Zue, V.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.