SCOPUS Information Retrieval Platform

Speech Communication

Volume 53, Issue 6, 2011, Pages 830-841

Contextual invariant-integration features for improved speaker-independent speech recognition

(2) Müller, Florian ᵃ; Mertins, Alfred ᵃ

ᵃ UNIVERSITY OF LÜBECK (Germany)

Author keywords

Invariant integration; Speaker independency; Speech recognition

Indexed keywords

ADAPTATION METHODS; CEPSTRAL COEFFICIENTS; EXTRACTION METHOD; FEATURE TYPES; INVARIANT-INTEGRATION; SPEAKER ADAPTATION; SPEAKER-INDEPENDENCY; SPEAKER-INDEPENDENT SPEECH RECOGNITION; TEST CONDITION; TESTING CONDITIONS; THEORY OF INVARIANTS; TIME-PERIODS; VERY LOW COMPLEXITY; VOCAL TRACT LENGTH NORMALIZATION;

FEATURE EXTRACTION; FILTER BANKS; INTEGRATION;

SPEECH RECOGNITION;

EID: 79955539267 PISSN: 01676393 EISSN: None Source Type: Journal
DOI: 10.1016/j.specom.2011.02.002 Document Type: Article

Times cited : (27)

References (50)

1
- 34547941599
- Automatic speech recognition and speech variability: A review
- DOI 10.1016/j.specom.2007.02.006, PII S0167639307000404
- M. Benzeghiba, R.D. Mori, O. Deroo, S. Dupont, T. Erbes, D. Jouvet, L. Fissore, P. Laface, A. Mertins, C. Ris, R. Rose, V. Tyagi, and C. Wellekens Automatic speech recognition and speech variability: a review Speech Comm. 49 10-11 2007 763 786 (Pubitemid 47268571)
- (2007) Speech Communication , vol.49 , Issue.10-11 , pp. 763-786
- Benzeghiba, M.¹ De Mori, R.² Deroo, O.³ Dupont, S.⁴ Erbes, T.⁵ Jouvet, D.⁶ Fissore, L.⁷ Laface, P.⁸ Mertins, A.⁹ Ris, C.¹⁰ Rose, R.¹¹ Tyagi, V.¹² Wellekens, C.¹³

2
- 79955537195
- Skull and vocal tract growth from newborn to adult
- Ubatuba, Brazil
- Boë, L.-J., Granat, J., Badin, P., Autesserre, D., Pochic, D., Zga, N., Henrich, N., Ménard, L., 2006. Skull and vocal tract growth from newborn to adult. In: Proc. 7th Int. Seminar on Speech Production (ISSP7). Ubatuba, Brazil, pp. 75-82.
- (2006) Proc. 7th Int. Seminar on Speech Production (ISSP7) , pp. 75-82
- Boë, L.-J.¹

3
- 0019075787
- On invariant sets of a certain class of fast translation-invariant transforms
- H. Burkhardt, and X. Müller On invariant sets of a certain class of fast translation-invariant transforms IEEE Trans. Acoust. Speech Signal Process. 28 5 1980 517 523 (Pubitemid 11471084)
- (1980) IEEE Transactions on Acoustics, Speech, and Signal Processing , vol.ASSP-28 , Issue.5 , pp. 517-523
- Burkhardt Hans¹ Mueller Xaver²

4
- 0002163712
- Invariant features in pattern recognition - Fundamentals and applications
- H. Burkhardt, and S. Siggelkow Invariant features in pattern recognition - fundamentals and applications C. Kotropoulos, I. Pitas, Nonlinear Model-Based Image/Video Processing and Analysis 2001 John Wiley & Sons 269 307
- (2001) Nonlinear Model-Based Image/Video Processing and Analysis , pp. 269-307
- Burkhardt, H.¹ Siggelkow, S.²

5
- 0027815284
- The scale representation
- L. Cohen The scale representation IEEE Trans. Signal Process. 41 12 1993 3275 3292
- (1993) IEEE Trans. Signal Process. , vol.41 , Issue.12 , pp. 3275-3292
- Cohen, L.¹

6
- 0019053271
- Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
- S. Davis, and P. Mermelstein Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences IEEE Trans. Acoust. Speech Signal Process. 28 4 1980 357 366 (Pubitemid 11464930)
- (1980) IEEE Transactions on Acoustics, Speech, and Signal Processing , vol.ASSP-28 , Issue.4 , pp. 357-366
- Davis Steven, B.¹ Mermelstein Paul²

7
- 0003424145
- Macmillan New York
- J.R. Deller, J.G. Proakis, and J.H.L. Hansen Discrete-Time Processing of Speech Signals 1993 Macmillan New York
- (1993) Discrete-Time Processing of Speech Signals
- Deller, J.R.¹ Proakis, J.G.² Hansen, J.H.L.³

8
- 78651237606
- web resource
- Ellis, D.P.W., 2009. Gammatone-like spectrograms. web resource: .
- (2009) Gammatone-like Spectrograms
- Ellis, D.P.W.¹

9
- 84975559454
- Modified rapid transform
- M. Fang, and G. Häusler Modified rapid transform Appl. Opt. 28 6 1989 1257 1262
- (1989) Appl. Opt. , vol.28 , Issue.6 , pp. 1257-1262
- Fang, M.¹ Häusler, G.²

10
- 0032050110
- Maximum likelihood linear transformations for HMM-based speech recognition
- M.J.F. Gales Maximum likelihood linear transformations for HMM-based speech recognition Comput. Speech Lang. 12 2 1998 75 98 (Pubitemid 128383747)
- (1998) Computer Speech and Language , vol.12 , Issue.2 , pp. 75-98
- Gales, M.J.F.¹

11
- 3042518464
- DARPA TIMIT acoustic phonetic speech corpus
- Philadelphia
- Garofolo, J.S., Lamel, L.F., Fisher, W.M., Fiscus, J.G., Pallett, D.S., Dahlgren, N.L., 1993. DARPA TIMIT acoustic phonetic speech corpus. Linguistic Data Consortium, Philadelphia.
- (1993) Linguistic Data Consortium
- Garofolo, J.S.¹ Lamel, L.F.² Fisher, W.M.³ Fiscus, J.G.⁴ Pallett, D.S.⁵ Dahlgren, N.L.⁶

12
- 0026292099
- Word recognition with the feature finding neural network (FFNN)
- Princeton, NJ, USA
- Gramss, T., 1991. Word recognition with the feature finding neural network (FFNN). In: Proc. IEEE Workshop Neural Networks for Signal Processing, Princeton, NJ, USA, pp. 289-298.
- (1991) Proc. IEEE Workshop Neural Networks for Signal Processing , pp. 289-298
- Gramss, T.¹

13
- 85017287487
- Linear discriminant analysis for improved large vocabulary continuous speech recognition
- San Francisco, CA, USA
- Haeb-Umbach, R., Ney, H., 1992. Linear discriminant analysis for improved large vocabulary continuous speech recognition. In: Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing, vol. 1, San Francisco, CA, USA, pp. 13-16.
- (1992) Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing , vol.1 , pp. 13-16
- Haeb-Umbach, R.¹ Ney, H.²

14
- 0003877861
- Ph.D. Thesis, Massachusetts Institute of Technology
- Halberstadt, A.K., 1998. Heterogeneous acoustic measurements and multiple classifiers for speech recognition. Ph.D. Thesis, Massachusetts Institute of Technology.
- (1998) Heterogeneous Acoustic Measurements and Multiple Classifiers for Speech Recognition
- Halberstadt, A.K.¹

15
- 0004056285
- Prentice Hall PTR Upper Saddle River, New York, USA
- X. Huang, A. Acero, and H. Hon Spoken Language Processing: A Guide to Theory, Algorithm, and System Development 2001 Prentice Hall PTR Upper Saddle River, New York, USA
- (2001) Spoken Language Processing: A Guide to Theory, Algorithm, and System Development
- Huang, X.¹ Acero, A.² Hon, H.³

16
- 0002111759
- Mathematisch-physikalische Klasse
- Hurwitz, A., 1897. Ueber die Erzeugung der Invarianten durch Integration. Nachrichten von der Königl. Gesellschaft der Wissenschaften zu Göttingen, Mathematisch-physikalische Klasse, pp. 71-90.
- (1897) Ueber Die Erzeugung der Invarianten Durch Integration. Nachrichten von der Königl. Gesellschaft der Wissenschaften zu Göttingen , pp. 71-90
- Hurwitz, A.¹

17
- 0036497684
- Segregating information about the size and shape of the vocal tract using a time-domain auditory model: The stabilised wavelet-Mellin transform
- DOI 10.1016/S0167-6393(00)00085-6, PII S0167639300000856
- T. Irino, and R. Patterson Segregating information about the size and the shape of the vocal tract using a time-domain auditory model: The stabilised wavelet-Mellin transform Speech Commun. 36 3 2002 181 203 (Pubitemid 34040942)
- (2002) Speech Communication , vol.36 , Issue.3-4 , pp. 181-203
- Irino, T.¹ Patterson, R.D.²

18
- 33745738849
- Speech feature extraction method using subband-based periodicity and nonperiodicity decomposition
- DOI 10.1121/1.2205131
- K. Ishizuka, T. Nakatani, Y. Minami, and N. Miyazaki Speech feature extraction method using subband-based periodicity and nonperiodicity decomposition J. Acoust. Soc. Amer. 120 1 2006 443 452 (Pubitemid 44014182)
- (2006) Journal of the Acoustical Society of America , vol.120 , Issue.1 , pp. 443-452
- Ishizuka, K.¹ Nakatani, T.² Minami, Y.³ Miyazaki, N.⁴

19
- 12344336755
- Ph.D. Thesis, Universität Oldenburg
- Kleinschmidt, M., 2002. Robust speech recognition based on spectro-temporal processing. Ph.D. Thesis, Universität Oldenburg.
- (2002) Robust Speech Recognition Based on Spectro-temporal Processing
- Kleinschmidt, M.¹

20
- 85009233038
- Improving word accuracy with Gabor feature extraction
- Kleinschmidt, M., Gelbart, D., 2002. Improving word accuracy with Gabor feature extraction. In: Proc. Int. Conf. Spoken Language Processing. pp. 25-28.
- (2002) Proc. Int. Conf. Spoken Language Processing , pp. 25-28
- Kleinschmidt, M.¹ Gelbart, D.²

21
- 0024768209
- Speaker-independent phone recognition using hidden Markov models
- K.F. Lee, and H.W. Hon Speaker-independent phone recognition using hidden Markov models IEEE Trans. Acoust. Speech Signal Process. 37 11 1989 1641 1648
- (1989) IEEE Trans. Acoust. Speech Signal Process. , vol.37 , Issue.11 , pp. 1641-1648
- Lee, K.F.¹ Hon, H.W.²

22
- 0031647824
- A frequency warping approach to speaker normalization
- PII S1063667698000960
- L. Lee, and R.C. Rose A frequency warping approach to speaker normalization IEEE Trans. Speech Audio Process. 6 1 1998 49 60 (Pubitemid 128720631)
- (1998) IEEE Transactions on Speech and Audio Processing , vol.6 , Issue.1 , pp. 49-60
- Lee, L.¹ Rose, R.²

23
- 0029288633
- Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
- C. Leggetter, and P. Woodland Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models Comput. Speech Lang. 9 2 1995 171 185
- (1995) Comput. Speech Lang. , vol.9 , Issue.2 , pp. 171-185
- Leggetter, C.¹ Woodland, P.²

24
- 0005908591
- Linguistic Data Consortium, Philadelphia
- Leonard, R.G., Doddington, G., 1993. TIDIGITS. Linguistic Data Consortium, Philadelphia.
- (1993) TIDIGITS
- Leonard, R.G.¹ Doddington, G.²

25
- 10444269014
- Algorithms for hardware-based pattern recognition
- V. Lohweg, C. Diederichs, and D. Müller Algorithms for hardware-based pattern recognition EURASIP J. Appl. Signal Process. 2004 2004 1912 1920
- (2004) EURASIP J. Appl. Signal Process. , vol.2004 , pp. 1912-1920
- Lohweg, V.¹ Diederichs, C.² Müller, D.³

26
- 33846210086
- Vocal tract length invariant features for automatic speech recognition
- San Juan, Puerto Rico
- Mertins, A., Rademacher, J., 2005. Vocal tract length invariant features for automatic speech recognition. In: Proc. 2005 IEEE Automatic Speech Recognition and Understanding Workshop, San Juan, Puerto Rico, pp. 308-312.
- (2005) Proc. 2005 IEEE Automatic Speech Recognition and Understanding Workshop , pp. 308-312
- Mertins, A.¹ Rademacher, J.²

27
- 33947666117
- Frequency-warping invariant features for automatic speech recognition
- Toulouse, France
- Mertins, A., Rademacher, J., 2006. Frequency-warping invariant features for automatic speech recognition. In: Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing, vol. V, Toulouse, France, pp. 1025-1028.
- (2006) Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing , vol.5 , pp. 1025-1028
- Mertins, A.¹ Rademacher, J.²

28
- 70450166695
- Low-dimensional, auditory feature vectors that improve vocal-tract-length normalization in automatic speech recognition
- J.J. Monaghan, C. Feldbauer, T.C. Walters, and R.D. Patterson Low-dimensional, auditory feature vectors that improve vocal-tract-length normalization in automatic speech recognition J. Acoust. Soc. Amer. 123 5 2008 3066
- (2008) J. Acoust. Soc. Amer. , vol.123 , Issue.5 , pp. 3066
- Monaghan, J.J.¹ Feldbauer, C.² Walters, T.C.³ Patterson, R.D.⁴

29
- 0030101058
- A revision of Zwicker's loudness model
- B.C.J. Moore, and B.R. Glasberg A revision of Zwicker's loudness model Acta Acustica united with Acustica 82 11 1996 245 335
- (1996) Acta Acustica United with Acustica , vol.82 , Issue.11 , pp. 245-335
- Moore, B.C.J.¹ Glasberg, B.R.²

30
- 77949470277
- Generalized cyclic transformations in speaker-independent speech recognition
- Merano, Italy
- Müller, F., Belilovsky, E., Mertins, A., 2009. Generalized cyclic transformations in speaker-independent speech recognition. In: Proc. IEEE Automatic Speech Recognition and Understanding Workshop, Merano, Italy, pp. 211-215.
- (2009) Proc. IEEE Automatic Speech Recognition and Understanding Workshop , pp. 211-215
- Müller, F.¹

31
- 70450220336
- Invariant-integration method for robust feature extraction in speaker-independent speech recognition
- Brighton, UK
- Müller, F., Mertins, A., 2009. Invariant-integration method for robust feature extraction in speaker-independent speech recognition. In: Proc. Int. Conf. Spoken Language Processing (Interspeech 2009-ICSLP), Brighton, UK, pp. 2975-2978.
- (2009) Proc. Int. Conf. Spoken Language Processing (Interspeech 2009-ICSLP) , pp. 2975-2978
- Müller, F.¹

32
- 77951480608
- Nonlinear translation-invariant transformations for speaker-independent speech recognition
- Advances in Nonlinear Speech Processing
- F. Müller, and A. Mertins Nonlinear translation-invariant transformations for speaker-independent speech recognition J. Sole-Casals, V. Zaiats, Advances in Nonlinear Speech Processing LNAI vol. 5933 2010 Springer Heidelberg, Germany 111 119
- (2010) LNAI , vol.5933 , pp. 111-119
- Müller, F.¹ Mertins, A.²

33
- 34250957819
- Der Endlichkeitssatz der Invarianten endlicher Gruppen
- E. Noether Der Endlichkeitssatz der Invarianten endlicher Gruppen Mathematische Annalen 77 1 1915 89 92
- (1915) Mathematische Annalen , vol.77 , Issue.1 , pp. 89-92
- Noether, E.¹

34
- 0034227088
- Auditory images: How complex sounds are represented in the auditory system
- R.D. Patterson Auditory images: How complex sounds are represented in the auditory system J. Acoust. Soc. Japan (E) 21 4 2000 183 190
- (2000) J. Acoust. Soc. Japan (E) , vol.21 , Issue.4 , pp. 183-190
- Patterson, R.D.¹

35
- 0000460671
- Complex sounds and auditory images
- Cazals, Y., Demany, L., Horner, K. (Eds.) Pergamon, Oxford
- Patterson, R.D., Robinson, K., Holdsworth, J., McKeown, D., Zhang, C., Allerhand, M., 1992. Complex sounds and auditory images. In: Cazals, Y., Demany, L., Horner, K. (Eds.), Auditory Physiology and Perception. Advanced Bioscience, vol. 83, Pergamon, Oxford, pp. 429-446.
- (1992) Auditory Physiology and Perception. Advanced Bioscience , vol.83 , pp. 429-446
- Patterson, R.D.¹ Robinson, K.² Holdsworth, J.³ McKeown, D.⁴ Zhang, C.⁵ Allerhand, M.⁶

36
- 27644522706
- Vocal tract normalization equals linear transformation in cepstral space
- DOI 10.1109/TSA.2005.848881
- M. Pitz, and H. Ney Vocal tract normalization equals linear transformation in cepstral space IEEE Trans. Speech Audio Process. 13 5 2005 930 944 (Pubitemid 41558907)
- (2005) IEEE Transactions on Speech and Audio Processing , vol.13 , Issue.5 , pp. 930-944
- Pitz, M.¹ Ney, H.²

37
- 44949218505
- Improved warping-invariant features for automatic speech recognition
- Pittsburgh, PA, USA
- Rademacher, J., Wächter, M., Mertins, A., 2006. Improved warping-invariant features for automatic speech recognition. In: Proc. Int. Conf. Spoken Language Processing (Interspeech 2006 - ICSLP), Pittsburgh, PA, USA, pp. 1499-1502.
- (2006) Proc. Int. Conf. Spoken Language Processing (Interspeech 2006 - ICSLP) , pp. 1499-1502
- Rademacher, J.¹

38
- 0014551188
- A transformation with invariance under cyclic permutation for applications in pattern recognition
- H. Reitboeck, and T.P. Brody A transformation with invariance under cyclic permutation for applications in pattern recognition Inform. Control 15 2 1969 130 154
- (1969) Inform. Control , vol.15 , Issue.2 , pp. 130-154
- Reitboeck, H.¹ Brody, T.P.²

39
- 0033677121
- Maximum likelihood discriminant feature spaces
- Saon, G., Padmanabhan, M., Gopinath, R., Chen, S., 2000. Maximum likelihood discriminant feature spaces. In: Proc. Int. Conf. Audio Speech and Signal Processing. pp. 1129-1132.
- (2000) Proc. Int. Conf. Audio Speech and Signal Processing , pp. 1129-1132
- Saon, G.¹ Padmanabhan, M.² Gopinath, R.³ Chen, S.⁴

40
- 34547521738
- Feature combination using linear discriminant analysis and its pitfalls
- Pittsburgh, USA
- Schlüter, R., Zolnay, A., Ney, H., 2006. Feature combination using linear discriminant analysis and its pitfalls. In: Proc. Int. Conf. Spoken Language Processing (ICSLP/Interspeech). Pittsburgh, USA, pp. 345-348.
- (2006) Proc. Int. Conf. Spoken Language Processing (ICSLP/Interspeech) , pp. 345-348
- Schlüter, R.¹

41
- 35048839956
- On the existence of complete invariant feature spaces in pattern recognition
- Hague, Netherlands
- Schulz-Mirbach, H., 1992. On the existence of complete invariant feature spaces in pattern recognition. In: Proc. Int. Conf. Pattern Recognition, vol. 2, Hague, Netherlands, pp. 178-182.
- (1992) Proc. Int. Conf. Pattern Recognition , vol.2 , pp. 178-182
- Schulz-Mirbach, H.¹

42
- 1642281820
- Tr-402-95-018, Universitaet Hamburg, Hamburg, Germany
- Schulz-Mirbach, H., 1995a. Anwendung von Invarianzprinzipien zur Merkmalgewinnung in der Mustererkennung. Tr-402-95-018, Universitaet Hamburg, Hamburg, Germany.
- (1995) Anwendung von Invarianzprinzipien Zur Merkmalgewinnung in der Mustererkennung
- Schulz-Mirbach, H.¹

43
- 0001797247
- Invariant features for gray scale images
- 1995 DAGM-Symposium. Springer, London, UK
- Schulz-Mirbach, H., 1995b. Invariant features for gray scale images. In: Mustererkennung 1995, 17. DAGM-Symposium. Springer, London, UK, pp. 1-14.
- (1995) Mustererkennung , vol.17 , pp. 1-14
- Schulz-Mirbach, H.¹

44
- 84905172706
- A study on using the Mellin transform for vowel recognition
- Italy
- Sena, A.D., Rocchesso, D., 2005. A study on using the Mellin transform for vowel recognition. In: Proc. Sound and Music Conf. Salerno, Italy.
- (2005) Proc. Sound and Music Conf. Salerno
- Sena, A.D.¹ Rocchesso, D.²

45
- 10044294210
- Ph.D. thesis, Fakultät für Angewandte Wissenschaften, Albert-Ludwigs-Universität Freiburg, Breisgau, Germany
- Siggelkow, S., 2002. Feature histograms for content-based image retrieval. Ph.D. thesis, Fakultät für Angewandte Wissenschaften, Albert-Ludwigs-Universität Freiburg, Breisgau, Germany.
- (2002) Feature Histograms for Content-based Image Retrieval
- Siggelkow, S.¹

46
- 0036293694
- Non-uniform scaling based speaker normalization
- Orlando, USA
- Sinha, R., Umesh, S., 2002. Non-uniform scaling based speaker normalization. In: Proc. IEEE Int. Conf. Acoustics, Speech and Signal Processing (ICASSP'02), vol. 1. Orlando, USA, pp. I-589-I-592.
- (2002) Proc. IEEE Int. Conf. Acoustics, Speech and Signal Processing (ICASSP'02) , vol.1
- Sinha, R.¹ Umesh, S.²

47
- 0032761999
- Scale transform in speech analysis
- S. Umesh, L. Cohen, N. Marinovic, and D.J. Nelson Scale transform in speech analysis IEEE Trans. Speech Audio Process. 7 1 1999 40 45
- (1999) IEEE Trans. Speech Audio Process. , vol.7 , Issue.1 , pp. 40-45
- Umesh, S.¹ Cohen, L.² Marinovic, N.³ Nelson, D.J.⁴

48
- 0036299175
- A simple approach to non-uniform vowel normalization
- Orlando, USA
- Umesh, S., Kumar, S.V.B., Vinay, M.K., Sharma, R., Sinha, R., 2002. A simple approach to non-uniform vowel normalization. In: Proc. Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP'02), vol. 1, Orlando, USA, pp. I-517-I-520.
- (2002) Proc. Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP'02) , vol.1
- Umesh, S.¹ Kumar, S.V.B.² Vinay, M.K.³ Sharma, R.⁴ Sinha, R.⁵

49
- 0036753897
- Speaker adaptive modeling by vocal tract normalization
- DOI 10.1109/TSA.2002.803435
- L. Welling, H. Ney, and S. Kanthak Speaker adaptive modeling by vocal tract normalization IEEE Trans. Speech Audio Process. 10 6 2002 415 426 (Pubitemid 35311935)
- (2002) IEEE Transactions on Speech and Audio Processing , vol.10 , Issue.6 , pp. 415-426
- Welling, L.¹ Ney, H.² Kanthak, S.³

50
- 0003571976
- Cambridge University Engineering Department Cambridge, UK
- S. Young, G. Evermann, M. Gales, T. Hain, D. Kershaw, X.A. Liu, G. Moore, J. Odell, D. Ollason, D. Povey, V. Valtchev, and P. Woodland The HTK Book (for HTK Version 3.4.1) 2009 Cambridge University Engineering Department Cambridge, UK
- (2009) The HTK Book (For HTK Version 3.4.1)
- Young, S.¹ Evermann, G.² Gales, M.³ Hain, T.⁴ Kershaw, D.⁵ Liu, X.A.⁶ Moore, G.⁷ Odell, J.⁸ Ollason, D.⁹ Povey, D.¹⁰ Valtchev, V.¹¹ Woodland, P.¹²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.