SCOPUS Information Retrieval Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 14, Issue 6, 2006, Pages 2134-2146

Tree-based covariance modeling of hidden Markov models

Authors (4): Tian, Ye (a); Zhou, Jian Lai (b); Lin, Hui (c); Jiang, Hui (d)

a MICROSOFT (United States)

b MICROSOFT RESEARCH ASIA (China)

c TSINGHUA UNIVERSITY (China)

d YORK UNIVERSITY (Canada)

Author keywords

Automatic speech recognition; Covariance modeling; Gaussian mixture models; Tree modeling

Indexed keywords

AUTOMATIC SPEECH RECOGNITION; COVARIANCE MATRICES; COVARIANCE MODELING; DATA SPARSENESS PROBLEMS; E-M ALGORITHMS; FULL COVARIANCE MODELING; GAUSSIAN; GAUSSIAN MIXTURE MODELS; HETEROSCEDASTIC; HIDDEN MARKOV MODELING; HIERARCHICAL STRUCTURES; INTERPOLATION COEFFICIENTS; INVERSE COVARIANCES; KULLBACK-LEIBLER DIVERGENCES; LINEAR DISCRIMINANT ANALYSIS; MATRIXES; MULTI LAYERS; MULTI-LAYERED; PARAMETRIC FORMS; RESOURCE MANAGEMENTS; ROOT NODES; TREE MODELING; TREE-BASED;

BLIND SOURCE SEPARATION; COMMUNICATION CHANNELS (INFORMATION THEORY); DISCRIMINANT ANALYSIS; HIDDEN MARKOV MODELS; INTERPOLATION; MAXIMUM LIKELIHOOD ESTIMATION; MIXTURES; OBJECT RECOGNITION; RESOURCE ALLOCATION; SPEECH ANALYSIS; SPEECH RECOGNITION; TREES (MATHEMATICS); TRELLIS CODES;

COVARIANCE MATRIX;

EID: 44449103265 PISSN: 15587916 EISSN: None Source Type: Journal
DOI: 10.1109/TSA.2005.863210 Document Type: Article

Times cited : (8)

References (31)

1
- 0028466072
- The importance of cepstral parameter correlation in speech recognition
- A. Ljolje, "The importance of cepstral parameter correlation in speech recognition," Comput. Speech Lang., vol. 8, pp. 223-232, 1994.
- (1994) Comput. Speech Lang , vol.8 , pp. 223-232
- Ljolje, A.¹

2
- 0019053271
- Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
- Aug
- S. B. Davis and P. Mermelstein, "Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-28, no. 4, p. 357, Aug. 1980.
- (1980) IEEE Trans. Acoust., Speech, Signal Process , vol.ASSP-28 , Issue.4 , pp. 357
- Davis, S.B.¹ Mermelstein, P.²

3
- 0032097263
- New York: Academic
- K. Fukunaga, Introduction to Statistical Pattern Recognition. New York: Academic, 1972.
- (1972) Introduction to Statistical Pattern Recognition
- Fukunaga, K.¹

4
- 85017287487
- Linear discriminant analysis for improved large vocabulary continuous speech recognition
- R. Haeb-Umbach and H. Ney, "Linear discriminant analysis for improved large vocabulary continuous speech recognition," in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing, 1992, vol. 1, pp. 13-16.
- (1992) Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing , vol.1 , pp. 13-16
- Haeb-Umbach, R.¹ Ney, H.²

5
- 17344383223
- Continuous mixture densities and linear discriminant analysis for improved context-dependent acoustic models
- X. Aubert, R. Haeb-Umbach, and H. Ney, "Continuous mixture densities and linear discriminant analysis for improved context-dependent acoustic models," in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing, 1993, vol. 2, pp. 27-30.
- (1993) Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing , vol.2 , pp. 27-30
- Aubert, X.¹ Haeb-Umbach, R.² Ney, H.³

6
- 0033677121
- Maximum likelihood discriminant feature spaces
- G. Saon, M. Padmanabhan, R. Gopinath, and S. Chen, "Maximum likelihood discriminant feature spaces," in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing, 2000, vol. 2, pp. 1129-1132.
- (2000) Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing , vol.2 , pp. 1129-1132
- Saon, G.¹ Padmanabhan, M.² Gopinath, R.³ Chen, S.⁴

7
- 84892187452
- Maximum likelihood modeling with Gaussian distributions for classification
- R. A. Gopinath, "Maximum likelihood modeling with Gaussian distributions for classification," in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing, 1998, vol. 2, pp. 661-664.
- (1998) Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing , vol.2 , pp. 661-664
- Gopinath, R.A.¹

8
- 0003871508
- Investigation of silicon-auditory models and generalization of linear discriminant analysis for improved speech recognition
- Ph.D. dissertation, Johns Hopkins Univ, Baltimore, MD
- N. Kumar, "Investigation of silicon-auditory models and generalization of linear discriminant analysis for improved speech recognition," Ph.D. dissertation, Johns Hopkins Univ., Baltimore, MD, 1997.
- (1997)
- Kumar, N.¹

9
- 0141703325
- Automatic complexity control for HLDA systems
- X. Liu, M. F. J. Gales, and P. C. Woodland, "Automatic complexity control for HLDA systems," in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing, 2003, vol. 1, pp. 132-135.
- (2003) Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing , vol.1 , pp. 132-135
- Liu, X.¹ Gales, M.F.J.² Woodland, P.C.³

10
- 0033677061
- Full covariance modeling and adaptation in sub-bands
- B. Doherty, S. Vaseghi, and P. McCourt, "Full covariance modeling and adaptation in sub-bands," in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing, 2000, vol. 2, pp. 969-972.
- (2000) Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing , vol.2 , pp. 969-972
- Doherty, B.¹ Vaseghi, S.² McCourt, P.³

11
- 0033677172
- Factored sparse inverse covariance matrices
- J. A. Bilmes, "Factored sparse inverse covariance matrices," in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing, 2000, vol. 2, pp. 1009-1012.
- (2000) Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing , vol.2 , pp. 1009-1012
- Bilmes, J.A.¹

12
- 0032638856
- Semi-tied covariance matrices for hidden Markov models
- May
- M. J. F. Gales, "Semi-tied covariance matrices for hidden Markov models," IEEE Trans. Speech Audio Process., vol. 7, no. 3, pp. 272-281, May 1999.
- (1999) IEEE Trans. Speech Audio Process , vol.7 , Issue.3 , pp. 272-281
- Gales, M.J.F.¹

13
- 0036475982
- Maximum likelihood multiple subspace projections for hidden Markov models
- Feb
- M. J. F. Gales, "Maximum likelihood multiple subspace projections for hidden Markov models," IEEE Trans. Speech Audio Process., vol. 10, no. 2, pp. 37-47, Feb. 2002.
- (2002) IEEE Trans. Speech Audio Process , vol.10 , Issue.2 , pp. 37-47
- Gales, M.J.F.¹

14
- 2442457791
- Mixtures of inverse covariance
- May
- V. Vanhoucke and A. Sankar, "Mixtures of inverse covariance," IEEE Trans. Speech Audio Process., vol. 12, no. 3, pp. 250-264, May 2004.
- (2004) IEEE Trans. Speech Audio Process , vol.12 , Issue.3 , pp. 250-264
- Vanhoucke, V.¹ Sankar, A.²

15
- 0742272654
- Modeling inverse covariance matrices by basis expansion
- Jan
- P. A. Olsen and R. A. Gopinath, "Modeling inverse covariance matrices by basis expansion," IEEE Trans. Speech Audio Process., vol. 12, no. 1, pp. 37-46, Jan. 2004.
- (2004) IEEE Trans. Speech Audio Process , vol.12 , Issue.1 , pp. 37-46
- Olsen, P.A.¹ Gopinath, R.A.²

16
- 85009289957
- Modeling with a subspace constraint on inverse covariance matrices
- S. Axelrod, R. Gopinath, and P. Olsen, "Modeling with a subspace constraint on inverse covariance matrices," in Proc. Int. Conf. Spoken Language Processing, 2001, vol. 9, pp. 2177-2180.
- (2001) Proc. Int. Conf. Spoken Language Processing , vol.9 , pp. 2177-2180
- Axelrod, S.¹ Gopinath, R.² Olsen, P.³

17
- 0024610919
- A tutorial on hidden Markov models and selected applications in speech recognition
- Feb
- L. R. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proc. IEEE, vol. 77, no. 2, pp. 257-286, Feb. 1989.
- (1989) Proc. IEEE , vol.77 , Issue.2 , pp. 257-286
- Rabiner, L.R.¹

18
- 0035279111
- A structural Bayes approach to speaker adaptation
- Mar
- K. Shinoda and C. H. Lee, "A structural Bayes approach to speaker adaptation," IEEE Trans. Speech Audio Process., vol. 9, no. 3, pp. 276-287, Mar. 2001.
- (2001) IEEE Trans. Speech Audio Process , vol.9 , Issue.3 , pp. 276-287
- Shinoda, K.¹ Lee, C.H.²

19
- 85009064348
- Constrained maximum likelihood linear regression for speaker adaptation
- M. Afify and O. Siohan, "Constrained maximum likelihood linear regression for speaker adaptation," in Proc. Int. Conf. Spoken Language Processing, 2000, pp. 861-864.
- (2000) Proc. Int. Conf. Spoken Language Processing , pp. 861-864
- Afify, M.¹ Siohan, O.²

20
- 0029288633
- Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
- C. J. Leggetter and P. C. Woodland, "Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models," Comput. Speech Lang., vol. 9, pp. 171-185, 1995.
- (1995) Comput. Speech Lang , vol.9 , pp. 171-185
- Leggetter, C.J.¹ Woodland, P.C.²

21
- 0000133385
- Speech recognition using tree-structured probability density function
- T. Watanabe, K. Shinoda, K. Takagi, and E. Yamada, "Speech recognition using tree-structured probability density function," in Proc. Int. Conf. Spoken Language Processing, 1994, pp. 223-226.
- (1994) Proc. Int. Conf. Spoken Language Processing , pp. 223-226
- Watanabe, T.¹ Shinoda, K.² Takagi, K.³ Yamada, E.⁴

22
- 0000120766
- Estimating the dimension of a model
- G. Schwarz, "Estimating the dimension of a model," Ann. Statist., vol. 6, pp. 461-464, 1978.
- (1978) Ann. Statist , vol.6 , pp. 461-464
- Schwarz, G.¹

23
- 0004161838
- Cambridge, U.K, Cambridge University Press
- W. H. Press, Numerical Recipes in C: The Art of Scientific Computing. Cambridge, U.K.: Cambridge University Press, 1986.
- (1986) Numerical Recipes in C: The Art of Scientific Computing
- Press, W.H.¹

24
- 0023776398
- A database for continuous speech recognition in a 1000-word domain
- P. Price, W. Fisher, J. Bernstein, and D. Pallett, "A database for continuous speech recognition in a 1000-word domain," in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing, 1988, vol. 2, pp. 651-654.
- (1988) Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing , vol.2 , pp. 651-654
- Price, P.¹ Fisher, W.² Bernstein, J.³ Pallett, D.⁴

25
- 64549152628
- The HTK Book (for HTK Version 3.0)
- S. Young, D. Kershaw, J. Odell, D. Ollason, V. Valtchev, and P. Woodland, The HTK Book (for HTK Version 3.0), 2000. [Online]. Available: http://htk.eng.cam.ac.uk/

26
- 64549145995
- Speech lab in a box: A Mandarin speech toolbox to jump start speech related research toolbox
- E. Chang, Y. Shi, J. Zhou, and C. Huang, "Speech lab in a box: a Mandarin speech toolbox to jump start speech related research toolbox," in Proc. Eur. Conf. Speech Communication and Technology, 2001, pp. 2782-2799.
- (2001) Proc. Eur. Conf. Speech Communication and Technology , pp. 2782-2799
- Chang, E.¹ Shi, Y.² Zhou, J.³ Huang, C.⁴

27
- 85009126501
- Large vocabulary Mandarin speech recognition with different approaches in modeling tones
- E. Chang, J. Zhou, C. Huang, and K. F. Lee, "Large vocabulary Mandarin speech recognition with different approaches in modeling tones," in Proc. Int. Conf. Spoken Language Processing, 2000, pp. 983-986.
- (2000) Proc. Int. Conf. Spoken Language Processing , pp. 983-986
- Chang, E.¹ Zhou, J.² Huang, C.³ Lee, K.F.⁴

28
- 64549160400
- Cambridge, U.K, Entropic Cambridge Research Laboratory
- R. Morton, D. Whitehouse, and D. Ollason, The HAPI Book. Cambridge, U.K.: Entropic Cambridge Research Laboratory.
- The HAPI Book
- Morton, R.¹ Whitehouse, D.² Ollason, D.³

29
- 1642377925
- Factor analyzed hidden Markov models for speech recognition
- A.-V. I. Rosti and M. J. F. Gales, "Factor analyzed hidden Markov models for speech recognition," Comput. Speech Lang., vol. 18, no. 2, pp. 181-200, 2003.
- (2003) Comput. Speech Lang , vol.18 , Issue.2 , pp. 181-200
- Rosti, A.-V.I.¹ Gales, M.J.F.²

30
- 33646798740
- The IBM 2004 conversational telephony system for rich transcription
- H. Soltau, B. Kingsbury, L. Mangu, D. Povey, G. Saon, and G. Zweig, "The IBM 2004 conversational telephony system for rich transcription," in Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing, 2005, vol. 1, pp. 205-208.
- (2005) Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing , vol.1 , pp. 205-208
- Soltau, H.¹ Kingsbury, B.² Mangu, L.³ Povey, D.⁴ Saon, G.⁵ Zweig, G.⁶

31
- 85009192356
- An architecture for rapid decoding of large vocabulary conversational speech
- G. Saon, G. Zweig, B. Kingsbury, L. Mangu, and U. Chaudhari, "An architecture for rapid decoding of large vocabulary conversational speech," in Proc. Eur. Conf. Speech Communication and Technology, 2003, pp. 1977-1980.
- (2003) Proc. Eur. Conf. Speech Communication and Technology , pp. 1977-1980
- Saon, G.¹ Zweig, G.² Kingsbury, B.³ Mangu, L.⁴ Chaudhari, U.⁵

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.