Volume 21, Issue 5, 2013, Pages 1023-1034

Computing MMSE estimates and residual uncertainty directly in the feature domain of ASR using STFT domain speech distortion models

Author keywords

MMSE; uncertainty decoding; uncertainty propagation; Wiener filter

Indexed keywords

AUTOMATIC SPEECH RECOGNITION; CEPSTRAL COEFFICIENTS; DISTORTION MODEL; DYNAMIC COMPENSATION; FEATURE DOMAIN; FEATURE EXTRACTION METHODS; MINIMUM MEAN SQUARE ERRORS (MMSE); MMSE; MMSE ESTIMATORS; OBSERVATION UNCERTAINTIES; POSTERIOR DISTRIBUTIONS; ROBUST ASR; SHORT TIME FOURIER TRANSFORMS; SPEECH DISTORTION; UNCERTAINTY DECODING; UNCERTAINTY PROPAGATION; WIENER FILTERS;

EID: 84873901811     PISSN: 15587916     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASL.2013.2244085     Document Type: Article
Times cited: 23

References (49)
  • 6
    • 85009275141
    • Exploiting variances in robust feature extraction based on a parametric model of speech distortion
    • L. Deng, J. Droppo, and A. Acero, "Exploiting variances in robust feature extraction based on a parametric model of speech distortion," in Proc. Int. Conf. Spoken Lang. Process. (ICSLP), 2002.
    • (2002) Proc. Int. Conf. Spoken Lang. Process. (ICSLP)
    • Deng, L.1    Droppo, J.2    Acero, A.3
  • 7
    • 33749058582
    • Separation and robust recognition of noisy, convolutive speech mixtures using time-frequency masking and missing data techniques
    • D. Kolossa, A. Klimas, and R. Orglmeister, "Separation and robust recognition of noisy, convolutive speech mixtures using time-frequency masking and missing data techniques," in Proc. Workshop Applicat. Signal Process. Audio Acoust. (WASPAA), Oct. 2005, pp. 82-85.
    • (2005) Proc. Workshop Applicat. Signal Process. Audio Acoust. (WASPAA) , pp. 82-85
    • Kolossa, D.1    Klimas, A.2    Orglmeister, R.3
  • 8
    • 40249103761
    • Issues with uncertainty decoding for noise robust automatic speech recognition
    • H. Liao and M. Gales, "Issues with uncertainty decoding for noise robust automatic speech recognition," Speech Commun., vol. 50, no. 4, pp. 265-277, 2008.
    • (2008) Speech Commun. , vol.50 , Issue.4 , pp. 265-277
    • Liao, H.1    Gales, M.2
  • 11
    • 70450180986
    • Model based feature enhancement for automatic speech recognition in reverberant environments
    • A. Krueger and R. Haeb-Umbach, "Model based feature enhancement for automatic speech recognition in reverberant environments," in Proc. Interspeech, 2009, pp. 1231-1234.
    • (2009) Proc. Interspeech , pp. 1231-1234
    • Krueger, A.1    Haeb-Umbach, R.2
  • 12
    • 0036508276
    • Speaker verification in noise using a stochastic version of the weighted Viterbi algorithm
    • N. Yoma and M. Villar, "Speaker verification in noise using a stochastic version of the weighted Viterbi algorithm," IEEE Trans. Speech Audio Process., vol. 10, no. 3, pp. 158-166, Mar. 2002.
    • (2002) IEEE Trans. Speech Audio Process. , vol.10 , Issue.3 , pp. 158-166
    • Yoma, N.1    Villar, M.2
  • 14
    • 56249136428
    • Transforming binary uncertainties for robust speech recognition
    • S. Srinivasan and D. Wang, "Transforming binary uncertainties for robust speech recognition," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 7, pp. 2130-2140, Sep. 2007.
    • (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , Issue.7 , pp. 2130-2140
    • Srinivasan, S.1    Wang, D.2
  • 15
    • 84867602659
    • Integration of beamforming and automatic speech recognition through propagation of the Wiener posterior
    • R. F. Astudillo, A. Abad, and J. P. Neto, "Integration of beamforming and automatic speech recognition through propagation of the Wiener posterior," in Proc. ICASSP, Apr. 2012, pp. 4909-4912.
    • (2012) Proc. ICASSP , pp. 4909-4912
    • Astudillo, R.F.1    Abad, A.2    Neto, J.P.3
  • 16
  • 17
    • 77956717352
    • An uncertainty propagation approach to robust ASR using the ETSI advanced front-end
    • R. F. Astudillo, D. Kolossa, P. Mandelartz, and R. Orglmeister, "An uncertainty propagation approach to robust ASR using the ETSI advanced front-end," IEEE J. Sel. Topics Signal Process., vol. 4, no. 5, pp. 824-833, Oct. 2010.
    • (2010) IEEE J. Sel. Topics Signal Process. , vol.4 , Issue.5 , pp. 824-833
    • Astudillo, R.F.1    Kolossa, D.2    Mandelartz, P.3    Orglmeister, R.4
  • 18
    • 79959836811
    • A MMSE estimator in mel-cepstral domain for robust large vocabulary automatic speech recognition using uncertainty propagation
    • R. F. Astudillo and R. Orglmeister, "A MMSE estimator in mel-cepstral domain for robust large vocabulary automatic speech recognition using uncertainty propagation," in Proc. Interspeech, 2010.
    • (2010) Proc. Interspeech
    • Astudillo, R.F.1    Orglmeister, R.2
  • 19
    • 66149101303
    • Robust speech recognition using a cepstral minimum-mean-square-error-motivated noise suppressor
    • D. Yu, L. Deng, J. Droppo, J. Wu, Y. Gong, and A. Acero, "Robust speech recognition using a cepstral minimum-mean-square-error-motivated noise suppressor," IEEE Trans. Audio, Speech, Lang. Process., vol. 16, no. 5, pp. 1061-1070, Jul. 2008.
    • (2008) IEEE Trans. Audio, Speech, Lang. Process. , vol.16 , Issue.5 , pp. 1061-1070
    • Yu, D.1    Deng, L.2    Droppo, J.3    Wu, J.4    Gong, Y.5    Acero, A.6
  • 20
    • 79551496435
    • MMSE estimation of log-filterbank energies for robust speech recognition
    • A. Stark and K. Paliwal, "MMSE estimation of log-filterbank energies for robust speech recognition," Speech Commun., vol. 53, no. 3, pp. 403-416, 2011.
    • (2011) Speech Commun. , vol.53 , Issue.3 , pp. 403-416
    • Stark, A.1    Paliwal, K.2
  • 21
    • 44949190747
    • Improved source modeling and predictive classification for channel robust speech recognition
    • V. Ion and R. Haeb-Umbach, "Improved source modeling and predictive classification for channel robust speech recognition," in Proc. Interspeech, 2006.
    • (2006) Proc. Interspeech
    • Ion, V.1    Haeb-Umbach, R.2
  • 22
    • 0019009880
    • Speech enhancement using a soft-decision noise suppression filter
    • R. McAulay and M. Malpass, "Speech enhancement using a soft-decision noise suppression filter," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-28, no. 2, pp. 137-145, Apr. 1980.
    • (1980) IEEE Trans. Acoust., Speech, Signal Process. , vol.ASSP-28 , Issue.2 , pp. 137-145
    • McAulay, R.1    Malpass, M.2
  • 23
    • 0021645331
    • Speech enhancement using a minimum-mean square error short-time spectral amplitude estimator
    • Y. Ephraim and D. Malah, "Speech enhancement using a minimum-mean square error short-time spectral amplitude estimator," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-32, no. 6, pp. 1109-1121, Dec. 1984.
    • (1984) IEEE Trans. Acoust., Speech, Signal Process. , vol.ASSP-32 , Issue.6 , pp. 1109-1121
    • Ephraim, Y.1    Malah, D.2
  • 24
    • 0021892216
    • Speech enhancement using a minimum mean square error log-spectral amplitude estimator
    • Y. Ephraim and D. Malah, "Speech enhancement using a minimum mean square error log-spectral amplitude estimator," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-33, no. 2, pp. 443-445, Apr. 1985.
    • (1985) IEEE Trans. Acoust., Speech, Signal Process. , vol.ASSP-33 , Issue.2 , pp. 443-445
    • Ephraim, Y.1    Malah, D.2
  • 26
    • 0023773764
    • A microphone array with adaptive post-filtering for noise reduction in reverberant rooms
    • R. Zelinski, "A microphone array with adaptive post-filtering for noise reduction in reverberant rooms," in Proc. Int. Conf. Acoust., Speech, Signal Process., Apr. 1988, vol. 5, pp. 2578-2581.
    • (1988) Proc. Int. Conf. Acoust., Speech, Signal Process , vol.5 , pp. 2578-2581
    • Zelinski, R.1
  • 27
    • 7544226792
    • Speech enhancement based on the general transfer function GSC and postfiltering
    • S. Gannot and I. Cohen, "Speech enhancement based on the general transfer function GSC and postfiltering," IEEE Trans. Speech Audio Process., vol. 12, no. 6, pp. 561-571, Nov. 2004.
    • (2004) IEEE Trans. Speech Audio Process. , vol.12 , Issue.6 , pp. 561-571
    • Gannot, S.1    Cohen, I.2
  • 29
    • 0041360463
    • Noise spectrum estimation in adverse environments: Improved minima controlled recursive averaging
    • I. Cohen, "Noise spectrum estimation in adverse environments: Improved minima controlled recursive averaging," IEEE Trans. Speech Audio Process., vol. 11, no. 5, pp. 466-475, Sept. 2003.
    • (2003) IEEE Trans. Speech Audio Process. , vol.11 , Issue.5 , pp. 466-475
    • Cohen, I.1
  • 31
    • 70450180510
    • Accounting for the uncertainty of speech estimates in the complex domain for minimum mean square error speech enhancement
    • R. F. Astudillo, D. Kolossa, and R. Orglmeister, "Accounting for the uncertainty of speech estimates in the complex domain for minimum mean square error speech enhancement," in Proc. Interspeech, 2009.
    • (2009) Proc. Interspeech
    • Astudillo, R.F.1    Kolossa, D.2    Orglmeister, R.3
  • 34
    • 51449104842
    • Minimum mean-square error estimation of discrete Fourier coefficients with generalized gamma priors
    • J. Erkelens, R. Hendriks, R. Heusdens, and J. Jensen, "Minimum mean-square error estimation of discrete Fourier coefficients with generalized gamma priors," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 6, pp. 1741-1752, Aug. 2007.
    • (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , Issue.6 , pp. 1741-1752
    • Erkelens, J.1    Hendriks, R.2    Heusdens, R.3    Jensen, J.4
  • 35
    • 0141957802
    • Efficient alternatives to the Ephraim and Malah suppression rule for audio signal enhancement
    • P. J. Wolfe and S. J. Godsill, "Efficient alternatives to the Ephraim and Malah suppression rule for audio signal enhancement," EURASIP J. Appl. Signal Process., vol. 10, pp. 1043-1051, 2003.
    • (2003) EURASIP J. Appl. Signal Process. , vol.10 , pp. 1043-1051
    • Wolfe, P.J.1    Godsill, S.J.2
  • 37
    • 51749122132
    • Propagation of statistical information through non-linear feature extractions for robust speech recognition
    • R. F. Astudillo, D. Kolossa, and R. Orglmeister, "Propagation of statistical information through non-linear feature extractions for robust speech recognition," in Proc. MaxEnt2007, 2007.
    • (2007) Proc. MaxEnt2007
    • Astudillo, R.F.1    Kolossa, D.2    Orglmeister, R.3
  • 38
    • 84872036128
    • Uncertainty propagation for speech recognition using RASTA features in highly nonstationary noisy environments
    • R. F. Astudillo, D. Kolossa, and R. Orglmeister, "Uncertainty propagation for speech recognition using RASTA features in highly nonstationary noisy environments," in Proc. ITG Workshop for Speech Commun., 2008.
    • (2008) Proc. ITG Workshop for Speech Commun.
    • Astudillo, R.F.1    Kolossa, D.2    Orglmeister, R.3
  • 39
    • 84865725710
    • Propagation of uncertainty through multilayer perceptrons for robust automatic speech recognition
    • R. F. Astudillo and J. P. Neto, "Propagation of uncertainty through multilayer perceptrons for robust automatic speech recognition," in Proc. Interspeech, 2011, pp. 461-464.
    • (2011) Proc. Interspeech , pp. 461-464
    • Astudillo, R.F.1    Neto, J.P.2
  • 40
    • 84873928960
    • Some applications of Dirac's delta function in statistics for more than one random variable
    • S. Chakraborty, "Some applications of Dirac's delta function in statistics for more than one random variable," Applicat. Appl. Math., vol. 3, no. 1, pp. 42-54, 2008.
    • (2008) Applicat. Appl. Math. , vol.3 , Issue.1 , pp. 42-54
    • Chakraborty, S.1
  • 42
    • 84887135566
    • The sum of log-normal probability distributions in scattered transmission systems
    • L. Fenton, "The sum of log-normal probability distributions in scattered transmission systems," IRE Trans. Commun. Syst., vol. 8, pp. 57-67, 1960.
    • (1960) IRE Trans. Commun. Syst. , vol.8 , pp. 57-67
    • Fenton, L.1
  • 43
    • 85009074657
    • Iterating Laplace's method to remove multiple types of acoustic distortion for robust speech recognition
    • B. Frey, L. Deng, A. Acero, and T. T. Kristjansson, "Iterating Laplace's method to remove multiple types of acoustic distortion for robust speech recognition," in Proc. Eurospeech, Aalborg, Denmark, Sep. 2001.
    • (2001) Proc. Eurospeech, Aalborg, Denmark
    • Frey, B.1    Deng, L.2    Acero, A.3    Kristjansson, T.T.4
  • 44
    • 33745202806
    • Joint uncertainty decoding for noise robust speech recognition
    • H. Liao and M. J. F. Gales, "Joint uncertainty decoding for noise robust speech recognition," in Proc. Interspeech, 2005, pp. 3129-3132.
    • (2005) Proc. Interspeech , pp. 3129-3132
    • Liao, H.1    Gales, M.J.F.2
  • 45
    • 85032752225
    • Missing-feature approaches in speech recognition
    • B. Raj and R. Stern, "Missing-feature approaches in speech recognition," IEEE Signal Process. Mag., vol. 22, no. 5, pp. 101-116, Sep. 2005.
    • (2005) IEEE Signal Process. Mag. , vol.22 , Issue.5 , pp. 101-116
    • Raj, B.1    Stern, R.2
  • 47
    • 0003571976
    • The HTK Book (for HTK Version 3.4)
    • S. Young, The HTK Book (for HTK Version 3.4). Cambridge, U.K.: Cambridge Univ. Engineering Department, 2006.
    • (2006) The HTK Book (For HTK Version 3.4)
    • Young, S.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.