Volume 53, Issue 6, 2011, Pages 830-841

Contextual invariant-integration features for improved speaker-independent speech recognition

Author keywords

Invariant integration; Speaker independency; Speech recognition

Indexed keywords

ADAPTATION METHODS; CEPSTRAL COEFFICIENTS; EXTRACTION METHOD; FEATURE TYPES; INVARIANT-INTEGRATION; SPEAKER ADAPTATION; SPEAKER-INDEPENDENCY; SPEAKER-INDEPENDENT SPEECH RECOGNITION; TEST CONDITION; TESTING CONDITIONS; THEORY OF INVARIANTS; TIME-PERIODS; VERY LOW COMPLEXITY; VOCAL TRACT LENGTH NORMALIZATION

EID: 79955539267; PISSN: 01676393; EISSN: None; Source Type: Journal
DOI: 10.1016/j.specom.2011.02.002; Document Type: Article
Times cited: 27

References (50)
  • 2
    • 79955537195
    • Skull and vocal tract growth from newborn to adult
    • Ubatuba, Brazil
    • Boë, L.-J., Granat, J., Badin, P., Autesserre, D., Pochic, D., Zga, N., Henrich, N., Ménard, L., 2006. Skull and vocal tract growth from newborn to adult. In: Proc. 7th Int. Seminar on Speech Production (ISSP7). Ubatuba, Brazil, pp. 75-82.
    • (2006) Proc. 7th Int. Seminar on Speech Production (ISSP7) , pp. 75-82
    • Boë, L.-J.1
  • 5
    • 0027815284
    • The scale representation
    • L. Cohen The scale representation IEEE Trans. Signal Process. 41 12 1993 3275 3292
    • (1993) IEEE Trans. Signal Process. , vol.41 , Issue.12 , pp. 3275-3292
    • Cohen, L.1
  • 6
    • 0019053271
    • Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
    • S. Davis, and P. Mermelstein Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences IEEE Trans. Acoust. Speech Signal Process. 28 4 1980 357 366 (Pubitemid 11464930)
    • (1980) IEEE Transactions on Acoustics, Speech, and Signal Processing , vol.ASSP-28 , Issue.4 , pp. 357-366
    • Davis Steven, B.1    Mermelstein Paul2
  • 9
    • 84975559454
    • Modified rapid transform
    • M. Fang, and G. Häusler Modified rapid transform Appl. Opt. 28 6 1989 1257 1262
    • (1989) Appl. Opt. , vol.28 , Issue.6 , pp. 1257-1262
    • Fang, M.1    Häusler, G.2
  • 10
    • 0032050110
    • Maximum likelihood linear transformations for HMM-based speech recognition
    • M.J.F. Gales Maximum likelihood linear transformations for HMM-based speech recognition Comput. Speech Lang. 12 2 1998 75 98 (Pubitemid 128383747)
    • (1998) Computer Speech and Language , vol.12 , Issue.2 , pp. 75-98
    • Gales, M.J.F.1
  • 12
    • 0026292099
    • Word recognition with the feature finding neural network (FFNN)
    • Princeton, NJ, USA
    • Gramss, T., 1991. Word recognition with the feature finding neural network (FFNN). In: Proc. IEEE Workshop Neural Networks for Signal Processing, Princeton, NJ, USA, pp. 289-298.
    • (1991) Proc. IEEE Workshop Neural Networks for Signal Processing , pp. 289-298
    • Gramss, T.1
  • 13
    • 85017287487
    • Linear discriminant analysis for improved large vocabulary continuous speech recognition
    • San Francisco, CA, USA
    • Haeb-Umbach, R., Ney, H., 1992. Linear discriminant analysis for improved large vocabulary continuous speech recognition. In: Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing, vol. 1, San Francisco, CA, USA, pp. 13-16.
    • (1992) Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing , vol.1 , pp. 13-16
    • Haeb-Umbach, R.1    Ney, H.2
  • 17
    • 0036497684
    • Segregating information about the size and shape of the vocal tract using a time-domain auditory model: The stabilised wavelet-Mellin transform
    • DOI 10.1016/S0167-6393(00)00085-6, PII S0167639300000856
    • T. Irino, and R. Patterson Segregating information about the size and the shape of the vocal tract using a time-domain auditory model: The stabilised wavelet-Mellin transform Speech Commun. 36 3 2002 181 203 (Pubitemid 34040942)
    • (2002) Speech Communication , vol.36 , Issue.3-4 , pp. 181-203
    • Irino, T.1    Patterson, R.D.2
  • 18
    • 33745738849
    • Speech feature extraction method using subband-based periodicity and nonperiodicity decomposition
    • DOI 10.1121/1.2205131
    • K. Ishizuka, T. Nakatani, Y. Minami, and N. Miyazaki Speech feature extraction method using subband-based periodicity and nonperiodicity decomposition J. Acoust. Soc. Amer. 120 1 2006 443 452 (Pubitemid 44014182)
    • (2006) Journal of the Acoustical Society of America , vol.120 , Issue.1 , pp. 443-452
    • Ishizuka, K.1    Nakatani, T.2    Minami, Y.3    Miyazaki, N.4
  • 21
    • 0024768209
    • Speaker-independent phone recognition using hidden Markov models
    • K.F. Lee, and H.W. Hon Speaker-independent phone recognition using hidden Markov models IEEE Trans. Acoust. Speech Signal Process. 37 11 1989 1641 1648
    • (1989) IEEE Trans. Acoust. Speech Signal Process. , vol.37 , Issue.11 , pp. 1641-1648
    • Lee, K.F.1    Hon, H.W.2
  • 22
    • 0031647824
    • A frequency warping approach to speaker normalization
    • PII S1063667698000960
    • L. Lee, and R.C. Rose A frequency warping approach to speaker normalization IEEE Trans. Speech Audio Process. 6 1 1998 49 60 (Pubitemid 128720631)
    • (1998) IEEE Transactions on Speech and Audio Processing , vol.6 , Issue.1 , pp. 49-60
    • Lee, L.1    Rose, R.2
  • 23
    • 0029288633
    • Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
    • C. Leggetter, and P. Woodland Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models Comput. Speech Lang. 9 2 1995 171 185
    • (1995) Comput. Speech Lang. , vol.9 , Issue.2 , pp. 171-185
    • Leggetter, C.1    Woodland, P.2
  • 24
    • 0005908591
    • Linguistic Data Consortium, Philadelphia
    • Leonard, R.G., Doddington, G., 1993. TIDIGITS. Linguistic Data Consortium, Philadelphia.
    • (1993) TIDIGITS
    • Leonard, R.G.1    Doddington, G.2
  • 28
    • 70450166695
    • Low-dimensional, auditory feature vectors that improve vocal-tract-length normalization in automatic speech recognition
    • J.J. Monaghan, C. Feldbauer, T.C. Walters, and R.D. Patterson Low-dimensional, auditory feature vectors that improve vocal-tract-length normalization in automatic speech recognition J. Acoust. Soc. Amer. 123 5 2008 3066
    • (2008) J. Acoust. Soc. Amer. , vol.123 , Issue.5 , pp. 3066
    • Monaghan, J.J.1    Feldbauer, C.2    Walters, T.C.3    Patterson, R.D.4
  • 30
    • 77949470277
    • Generalized cyclic transformations in speaker-independent speech recognition
    • Merano, Italy
    • Müller, F., Belilovsky, E., Mertins, A., 2009. Generalized cyclic transformations in speaker-independent speech recognition. In: Proc. IEEE Automatic Speech Recognition and Understanding Workshop, Merano, Italy, pp. 211-215.
    • (2009) Proc. IEEE Automatic Speech Recognition and Understanding Workshop , pp. 211-215
    • Müller, F.1
  • 31
    • 70450220336
    • Invariant-integration method for robust feature extraction in speaker-independent speech recognition
    • Brighton, UK
    • Müller, F., Mertins, A., 2009. Invariant-integration method for robust feature extraction in speaker-independent speech recognition. In: Proc. Int. Conf. Spoken Language Processing (Interspeech 2009-ICSLP), Brighton, UK, pp. 2975-2978.
    • (2009) Proc. Int. Conf. Spoken Language Processing (Interspeech 2009-ICSLP) , pp. 2975-2978
    • Müller, F.1
  • 32
    • 77951480608
    • Nonlinear translation-invariant transformations for speaker-independent speech recognition
    • Advances in Nonlinear Speech Processing
    • F. Müller, and A. Mertins Nonlinear translation-invariant transformations for speaker-independent speech recognition J. Sole-Casals, V. Zaiats, Advances in Nonlinear Speech Processing LNAI vol. 5933 2010 Springer Heidelberg, Germany 111 119
    • (2010) LNAI , vol.5933 , pp. 111-119
    • Müller, F.1    Mertins, A.2
  • 33
    • 34250957819
    • Der Endlichkeitssatz der Invarianten endlicher Gruppen
    • E. Noether Der Endlichkeitssatz der Invarianten endlicher Gruppen Mathematische Annalen 77 1 1915 89 92
    • (1915) Mathematische Annalen , vol.77 , Issue.1 , pp. 89-92
    • Noether, E.1
  • 34
    • 0034227088
    • Auditory images: How complex sounds are represented in the auditory system
    • R.D. Patterson Auditory images: How complex sounds are represented in the auditory system J. Acoust. Soc. Japan (E) 21 4 2000 183 190
    • (2000) J. Acoust. Soc. Japan (E) , vol.21 , Issue.4 , pp. 183-190
    • Patterson, R.D.1
  • 36
    • 27644522706
    • Vocal tract normalization equals linear transformation in cepstral space
    • DOI 10.1109/TSA.2005.848881
    • M. Pitz, and H. Ney Vocal tract normalization equals linear transformation in cepstral space IEEE Trans. Speech Audio Process. 13 5 2005 930 944 (Pubitemid 41558907)
    • (2005) IEEE Transactions on Speech and Audio Processing , vol.13 , Issue.5 , pp. 930-944
    • Pitz, M.1    Ney, H.2
  • 37
    • 44949218505
    • Improved warping-invariant features for automatic speech recognition
    • Pittsburgh, PA, USA
    • Rademacher, J., Wächter, M., Mertins, A., 2006. Improved warping-invariant features for automatic speech recognition. In: Proc. Int. Conf. Spoken Language Processing (Interspeech 2006 - ICSLP), Pittsburgh, PA, USA, pp. 1499-1502.
    • (2006) Proc. Int. Conf. Spoken Language Processing (Interspeech 2006 - ICSLP) , pp. 1499-1502
    • Rademacher, J.1
  • 38
    • 0014551188
    • A transformation with invariance under cyclic permutation for applications in pattern recognition
    • H. Reitboeck, and T.P. Brody A transformation with invariance under cyclic permutation for applications in pattern recognition Inform. Control 15 2 1969 130 154
    • (1969) Inform. Control , vol.15 , Issue.2 , pp. 130-154
    • Reitboeck, H.1    Brody, T.P.2
  • 40
    • 34547521738
    • Feature combination using linear discriminant analysis and its pitfalls
    • Pittsburgh, USA
    • Schlüter, R., Zolnay, A., Ney, H., 2006. Feature combination using linear discriminant analysis and its pitfalls. In: Proc. Int. Conf. Spoken Language Processing (ICSLP/Interspeech). Pittsburgh, USA, pp. 345-348.
    • (2006) Proc. Int. Conf. Spoken Language Processing (ICSLP/Interspeech) , pp. 345-348
    • Schlüter, R.1
  • 41
    • 35048839956
    • On the existence of complete invariant feature spaces in pattern recognition
    • Hague, Netherlands
    • Schulz-Mirbach, H., 1992. On the existence of complete invariant feature spaces in pattern recognition. In: Proc. Int. Conf. Pattern Recognition, vol. 2, Hague, Netherlands, pp. 178-182.
    • (1992) Proc. Int. Conf. Pattern Recognition , vol.2 , pp. 178-182
    • Schulz-Mirbach, H.1
  • 43
    • 0001797247
    • Invariant features for gray scale images
    • 1995 DAGM-Symposium. Springer, London, UK
    • Schulz-Mirbach, H., 1995b. Invariant features for gray scale images. In: Mustererkennung 1995, 17. DAGM-Symposium. Springer, London, UK, pp. 1-14.
    • (1995) Mustererkennung , vol.17 , pp. 1-14
    • Schulz-Mirbach, H.1
  • 45
    • 10044294210
    • Ph.D. thesis, Fakultät für Angewandte Wissenschaften, Albert-Ludwigs-Universität Freiburg, Breisgau, Germany
    • Siggelkow, S., 2002. Feature histograms for content-based image retrieval. Ph.D. thesis, Fakultät für Angewandte Wissenschaften, Albert-Ludwigs-Universität Freiburg, Breisgau, Germany.
    • (2002) Feature Histograms for Content-based Image Retrieval
    • Siggelkow, S.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.