Volume 51, Issue 11, 2009, Pages 1124-1138

Unsupervised learning of time-frequency patches as a noise-robust representation of speech

Author keywords

Acoustic signal analysis; Automatic speech recognition; Language acquisition; Matrix factorization; Noise robustness

Indexed keywords

ACOUSTIC SIGNAL ANALYSIS; AUTOMATIC SPEECH RECOGNITION; LANGUAGE ACQUISITION; MATRIX FACTORIZATION; NOISE ROBUSTNESS;

EID: 67651030071     PISSN: 01676393     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.specom.2009.05.003     Document Type: Article
Times cited: 19

References (36)
  • 1
    • 0029306339
    • Improving the readability of time-frequency and time-scale representations by the reassignment method
    • Auger F., and Flandrin P. Improving the readability of time-frequency and time-scale representations by the reassignment method. IEEE Trans. Signal Process. 43 5 (1995) 1068-1089
    • (1995) IEEE Trans. Signal Process. , vol.43 , Issue.5 , pp. 1068-1089
    • Auger, F.1    Flandrin, P.2
  • 4
    • 0027646354
    • Automatic segmentation and labeling of speech based on hidden markov models
    • Brugnara F., Falavigna D., and Omologo M. Automatic segmentation and labeling of speech based on hidden markov models. Speech Commun. 12 4 (1993) 357-370
    • (1993) Speech Commun. , vol.12 , Issue.4 , pp. 357-370
    • Brugnara, F.1    Falavigna, D.2    Omologo, M.3
  • 6
    • 33749568310
    • Convex and semi-nonnegative matrix factorizations for clustering and low-dimension representation
    • Tech. Rep. LBNL-60428, Lawrence Berkeley National Laboratory, US
    • Ding, C., Li, T., Jordan, M., 2006. Convex and semi-nonnegative matrix factorizations for clustering and low-dimension representation. Tech. Rep. LBNL-60428, Lawrence Berkeley National Laboratory, US.
    • (2006)
    • Ding, C.1    Li, T.2    Jordan, M.3
  • 7
    • 67651031775
    • ETSI Standard Document, 2000a. Distributed Speech Recognition; Front End Feature Extraction Algorithm; Compression Algorithm. ETSI ES 201 108 v1.1.2, April.
  • 8
    • 67651031776
    • ETSI Standard Document, 2000b. Speech Processing, Transmission and Quality Aspects (STQ); Distributed Speech Recognition; Advanced Front-end Feature Extraction Algorithm; Compression Algorithm. ETSI ES 202 050 v1.1.1 (2002-10), April.
  • 10
    • 67651022525
    • Hainsworth, S., Macleod, M., 2003. Time-frequency reassignment: a review and analysis. Tech. Rep. CUED/FINFENG/TR.459, Cambridge University Engineering Department.
  • 14
    • 0038669544
    • The AURORA experimental framework for the performance evaluation of speech recognition systems under noisy conditions
    • Hirsch, H.G., Pearce, D., 2000. The AURORA experimental framework for the performance evaluation of speech recognition systems under noisy conditions. In: Proc. ISCA ITRW ASR2000 Workshop, Paris, France, September, pp. 18-20.
    • (2000) Proc. ISCA ITRW ASR2000 , pp. 18-20
    • Hirsch, H.G.1    Pearce, D.2
  • 15
    • 84900510076
    • Non-negative matrix factorization with sparseness constraints
    • Hoyer P. Non-negative matrix factorization with sparseness constraints. J. Machine Learn. Res. 5 (2004) 1457-1469
    • (2004) J. Machine Learn. Res. , vol.5 , pp. 1457-1469
    • Hoyer, P.1
  • 16
    • 0032136330
    • Robust speech recognition using the modulation spectrogram
    • Kingsbury B., Morgan N., and Greenberg S. Robust speech recognition using the modulation spectrogram. Speech Commun. 25 (1998) 117-132
    • (1998) Speech Commun. , vol.25 , pp. 117-132
    • Kingsbury, B.1    Morgan, N.2    Greenberg, S.3
  • 17
    • 85009227802
    • Localized spectro-temporal features for automatic speech recognition
    • Kleinschmidt, M., 2003. Localized spectro-temporal features for automatic speech recognition. In: Proc. Eurospeech, Geneva, Switzerland, September, pp. 2573-2576.
    • (2003) Proc. Eurospeech , pp. 2573-2576
    • Kleinschmidt, M.1
  • 19
    • 84898964201
    • Algorithms for non-negative matrix factorization
    • Lee D., and Seung H. Algorithms for non-negative matrix factorization. Adv. Neural Inform. Process. Systems 13 (2001) 556-562
    • (2001) Adv. Neural Inform. Process. Systems , vol.13 , pp. 556-562
    • Lee, D.1    Seung, H.2
  • 21
    • 85046873967
    • The DET curve in assessment of detection task performance
    • Martin, A., Doddington, G., Kamm, T., Ordowski, M., Przybocki, M., 1997. The DET curve in assessment of detection task performance. In: Proc. Eurospeech, Rhodes, Greece, September, pp. 1895-1898.
    • (1997) Proc. Eurospeech , pp. 1895-1898
    • Martin, A.1    Doddington, G.2    Kamm, T.3    Ordowski, M.4    Przybocki, M.5
  • 23
    • 55949112804
    • Discovering speech phones using convolutive non-negative matrix factorisation with a sparseness constraint
    • O'Grady P.D., and Pearlmutter B.A. Discovering speech phones using convolutive non-negative matrix factorisation with a sparseness constraint. Neurocomputing 72 (2008) 88-101
    • (2008) Neurocomputing , vol.72 , pp. 88-101
    • O'Grady, P.D.1    Pearlmutter, B.A.2
  • 24
    • 33846193289
    • Towards unsupervised pattern discovery in speech
    • Park, A., Glass, J., 2005. Towards unsupervised pattern discovery in speech. In: Proc. ASRU, San Juan, Puerto Rico, December, pp. 53-58.
    • (2005) Proc. ASRU , pp. 53-58
    • Park, A.1    Glass, J.2
  • 25
    • 0032069218
    • Improvement of speech spectrogram accuracy by the method of reassignment
    • Plante F., Meyer G., and Ainsworth W. Improvement of speech spectrogram accuracy by the method of reassignment. IEEE Trans. Speech Audio Process. 6 3 (1998) 282-286
    • (1998) IEEE Trans. Speech Audio Process. , vol.6 , Issue.3 , pp. 282-286
    • Plante, F.1    Meyer, G.2    Ainsworth, W.3
  • 26
    • 67651028652
    • Unsupervised Phoneme Segmentation Using Transformed Cepstrum Features
    • Qiao, Y., Shimomura, N., Minematsu, N., 2008. Unsupervised Phoneme Segmentation Using Transformed Cepstrum Features. In: Proc. Spring Meeting of Acoust. Soc. Japan, 287-2901-11-20.
    • (2008) Proc. Spring Meeting of Acoust. Soc. Japan
    • Qiao, Y.1    Shimomura, N.2    Minematsu, N.3
  • 29
    • 85009186256
    • Unlimited vocabulary speech recognition based on morphs discovered in an unsupervised manner
    • Siivola, V., Hirsimäki, T., Creutz, M., Kurimo, M., 2003. Unlimited vocabulary speech recognition based on morphs discovered in an unsupervised manner. In: Proc. Eurospeech, pp. 2293-2296.
    • (2003) Proc. Eurospeech , pp. 2293-2296
    • Siivola, V.1    Hirsimäki, T.2    Creutz, M.3    Kurimo, M.4
  • 30
    • 38049021850
    • Convolutive speech bases and their application to speech separation
    • Smaragdis P. Convolutive speech bases and their application to speech separation. IEEE Trans. Speech Audio Process. 15 1 (2007) 1-12
    • (2007) IEEE Trans. Speech Audio Process. , vol.15 , Issue.1 , pp. 1-12
    • Smaragdis, P.1
  • 31
    • 67650137748
    • Discovering phone patterns in spoken utterances by nonnegative matrix factorisation
    • Stouten V., Demuynck K., and Van hamme H. Discovering phone patterns in spoken utterances by nonnegative matrix factorisation. IEEE Signal Process. Lett. 15 (2008) 131-134
    • (2008) IEEE Signal Process. Lett. , vol.15 , pp. 131-134
    • Stouten, V.1    Demuynck, K.2    Van hamme, H.3
  • 32
    • 84946750132
    • Mel-cepstrum modulation spectrum (MCMS) features for robust ASR
    • Tyagi, V., McCowan, I., Misra, H., Bourlard, H., 2003. Mel-cepstrum modulation spectrum (MCMS) features for robust ASR. In: Proc. ASRU 2003 Workshop, St. Thomas, Virgin Islands, December, pp. 399-404.
    • (2003) Proc. ASRU 2003 Workshop, St. Thomas , pp. 399-404
    • Tyagi, V.1    McCowan, I.2    Misra, H.3    Bourlard, H.4
  • 33
    • 84867218914
    • HAC-models: A Novel Approach to Continuous Speech Recognition
    • Van hamme, H., 2008a. HAC-models: A Novel Approach to Continuous Speech Recognition. In: Proc. International Conference on Spoken Language Processing, Brisbane, Australia, pp. 2554-2557.
    • (2008) Proc. International Conference on Spoken Language Processing , pp. 2554-2557
    • Van hamme, H.1
  • 35
    • 50249152311
    • Monaural sound source separation by nonnegative matrix factorization with temporal continuity and sparseness criteria
    • Virtanen T. Monaural sound source separation by nonnegative matrix factorization with temporal continuity and sparseness criteria. IEEE Trans. Audio Speech Language Process. 15 3 (2007) 1066-1074
    • (2007) IEEE Trans. Audio Speech Language Process. , vol.15 , Issue.3 , pp. 1066-1074
    • Virtanen, T.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.