Volume 26, Issue 2, 2009, Pages 137-148

Temporal and spectral processing methods for processing of degraded speech: A review

Author keywords

Multi speaker speech; Noisy speech; Reverberant speech; Speech enhancement; Temporal processing and spectral processing

Indexed keywords

BACKGROUND NOISE; FREQUENCY DOMAINS; MULTI-SPEAKER SPEECH; NOISY SPEECH; REVERBERANT SPEECH; TEMPORAL PROCESSING AND SPECTRAL PROCESSING; TIME DOMAINS;

EID: 64649095787     PISSN: 02564602     EISSN: None     Source Type: Journal    
DOI: 10.4103/0256-4602.49103     Document Type: Review
Times cited : (17)

References (60)
  • 1
    • 0018455310
    • Suppression of acoustic noise in speech using spectral subtraction
    • Apr
    • S. Boll, "Suppression of acoustic noise in speech using spectral subtraction," IEEE Trans. Acoust., Speech, Signal process., vol. ASSP-27, pp. 113-20, Apr. 1979.
    • (1979) IEEE Trans. Acoust., Speech, Signal process , vol.ASSP-27 , pp. 113-120
    • Boll, S.1
  • 4
    • 0026882842
    • Experiments with a nonlinear spectral subtractor (NSS), Hidden Markov Models and the projection, for robust speech recognition in cars
    • P. Lockwood, and J. Boudy, "Experiments with a nonlinear spectral subtractor (NSS), Hidden Markov Models and the projection, for robust speech recognition in cars," Speech Communication, vol. 11, no. 2-3, pp. 215-28, 1992.
    • (1992) Speech Communication , vol.11 , Issue.2-3 , pp. 215-228
    • Lockwood, P.1    Boudy, J.2
  • 5
    • 0036293748
    • A multi-band spectral subtraction method for enhancing speech corrupted by colored noise
    • Orlando, USA, May
    • S. Kamath, and P. Loizou, "A multi-band spectral subtraction method for enhancing speech corrupted by colored noise," in Proc. IEEE Int. Conf. Acoust., Speech, Signal process., Orlando, USA, May 2002.
    • (2002) Proc. IEEE Int. Conf. Acoust., Speech, Signal process
    • Kamath, S.1    Loizou, P.2
  • 7
    • 0033097443
    • Single channel speech enhancement based on masking properties of the human auditory system
    • Mar
    • N. Virag, "Single channel speech enhancement based on masking properties of the human auditory system," IEEE Trans. Speech Audio process., vol. 7, pp. 126-37, Mar. 1999.
    • (1999) IEEE Trans. Speech Audio process , vol.7 , pp. 126-137
    • Virag, N.1
  • 8
    • 0021645331
    • Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator
    • Dec
    • Y. Ephraim, and D. Malah, "Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator," IEEE Trans. Acoust., Speech, Signal process., vol. ASSP-32, pp. 1109-21, Dec. 1984.
    • (1984) IEEE Trans. Acoust., Speech, Signal process , vol.ASSP-32 , pp. 1109-1121
    • Ephraim, Y.1    Malah, D.2
  • 9
    • 0021892216
    • Speech enhancement using a minimum mean square error log-spectral amplitude estimator
    • Apr
    • Y. Ephraim, and D. Malah, "Speech enhancement using a minimum mean square error log-spectral amplitude estimator," IEEE Trans. Acoust., Speech, Signal process., vol. ASSP-33, pp. 443-5, Apr. 1985.
    • (1985) IEEE Trans. Acoust., Speech, Signal process , vol.ASSP-33 , pp. 443-445
    • Ephraim, Y.1    Malah, D.2
  • 10
    • 64649104394
    • Speech recognition and enhancement using autocorrelation domain processing
    • Ph.D. dissertation, School of engineering, Griffith University, Brisbane, Australia, Aug
    • B. J. Shannon, "Speech recognition and enhancement using autocorrelation domain processing," Ph.D. dissertation, School of engineering, Griffith University, Brisbane, Australia, Aug. 2006.
    • (2006)
    • Shannon, B.J.1
  • 11
    • 0036476655
    • Speech pause detection for noise spectrum estimation by tracking power envelope dynamics
    • Feb
    • M. Marzinzik, and B. Kollmeier, "Speech pause detection for noise spectrum estimation by tracking power envelope dynamics," IEEE Trans. Speech Audio process., vol. 10, pp. 109-18, Feb. 2002.
    • (2002) IEEE Trans. Speech Audio process , vol.10 , pp. 109-118
    • Marzinzik, M.1    Kollmeier, B.2
  • 12
    • 33846907750
    • A Laplacian-based MMSE estimator for speech enhancement
    • Feb
    • B. Chen, and P. C. Loizou, "A Laplacian-based MMSE estimator for speech enhancement," Speech Communication, vol. 49, pp. 134-43, Feb. 2007.
    • (2007) Speech Communication , vol.49 , pp. 134-143
    • Chen, B.1    Loizou, P.C.2
  • 13
    • 0028413241
    • Elimination of the musical noise phenomenon with the Ephraim and Malah noise suppressor
    • Apr
    • O. Cappe, "Elimination of the musical noise phenomenon with the Ephraim and Malah noise suppressor," IEEE Trans. Speech Audio process., vol. 2, pp. 345-9, Apr. 1994.
    • (1994) IEEE Trans. Speech Audio process , vol.2 , pp. 345-349
    • Cappe, O.1
  • 14
    • 26444569329
    • Speech enhancement using super-Gaussian speech models and noncausal a priori SNR estimation
    • Nov
    • I. Cohen, "Speech enhancement using super-Gaussian speech models and noncausal a priori SNR estimation," Speech Communication, vol. 47, pp. 336-50, Nov. 2005.
    • (2005) Speech Communication , vol.47 , pp. 336-350
    • Cohen, I.1
  • 15
    • 47049131634
    • Speech enhancement based on undecimated wavelet packet-perceptual filterbanks and MMSE-STSA estimation in various noise environments
    • Sep
    • H. Tasmaz, and E. Ercelebi, "Speech enhancement based on undecimated wavelet packet-perceptual filterbanks and MMSE-STSA estimation in various noise environments," Digital Signal process., vol. 18, no. 5, pp. 797-812, Sep. 2008.
    • (2008) Digital Signal process , vol.18 , Issue.5 , pp. 797-812
    • Tasmaz, H.1    Ercelebi, E.2
  • 16
    • 0029307534
    • De-noising by soft-thresholding
    • May
    • D. Donoho, "De-noising by soft-thresholding," IEEE Trans. Information Theory, vol. 41, no. 3, pp. 613-27, May 1995.
    • (1995) IEEE Trans. Information Theory , vol.41 , Issue.3 , pp. 613-627
    • Donoho, D.1
  • 17
    • 0344494081
    • Reducing signal-bias from MAD estimated noise level for DCT speech enhancement
    • M. K. Hasan, S. Salahuddin, and M. R. Khan, "Reducing signal-bias from MAD estimated noise level for DCT speech enhancement," Signal Process., vol. 84, no. 1, pp. 151-62, 2004.
    • (2004) Signal Process , vol.84 , Issue.1 , pp. 151-162
    • Hasan, M.K.1    Salahuddin, S.2    Khan, M.R.3
  • 18
    • 33750529732
    • Multiple statistical models for soft decision in noisy speech enhancement
    • Mar
    • J.-H. Chang, S. Gazor, N. S. Kim, and S. K. Mitra, "Multiple statistical models for soft decision in noisy speech enhancement," Pattern Recognition, vol. 40, pp. 1123-34, Mar. 2007.
    • (2007) Pattern Recognition , vol.40 , pp. 1123-1134
    • Chang, J.-H.1    Gazor, S.2    Kim, N.S.3    Mitra, S.K.4
  • 20
    • 33748460996
    • Speech enhancement by residual domain constrained optimization
    • Oct
    • W. Jin, and M. S. Scordilis, "Speech enhancement by residual domain constrained optimization," Speech Communication, vol. 48, pp. 1349-64, Oct. 2006.
    • (2006) Speech Communication , vol.48 , pp. 1349-1364
    • Jin, W.1    Scordilis, M.S.2
  • 22
    • 0018494073
    • Invertibility of a room impulse response
    • S. Neely, and J. Allen, "Invertibility of a room impulse response," J. Acoust. Soc. Am., vol. 66, pp. 165-9, 1979.
    • (1979) J. Acoust. Soc. Am , vol.66 , pp. 165-169
    • Neely, S.1    Allen, J.2
  • 23
    • 51449084820
    • Single- and multi-microphone speech dereverberation using spectral enhancement
    • Ph.D. dissertation, Technische Universiteit Eindhoven, The Netherlands, Jun
    • E. Habets, "Single- and multi-microphone speech dereverberation using spectral enhancement," Ph.D. dissertation, Technische Universiteit Eindhoven, The Netherlands, Jun. 2007, http://alexandria.tue.nl/extra2/200710970.pdf.
    • (2007)
    • Habets, E.1
  • 24
    • 0014793955
    • Signal processing to reduce multipath distortion in small rooms
    • J. Flanagan, and R. Lummis, "Signal processing to reduce multipath distortion in small rooms," J. Acoust. Soc. Am., vol. 47, pp. 1475-81, 1970.
    • (1970) J. Acoust. Soc. Am , vol.47 , pp. 1475-1481
    • Flanagan, J.1    Lummis, R.2
  • 25
    • 0017659025
    • Multimicrophone signal-processing technique to remove room reverberation from speech signals
    • J. Allen, D. Berkley, and J. Blauert, "Multimicrophone signal-processing technique to remove room reverberation from speech signals," J. Acoust. Soc. Am., vol. 62, pp. 912-5, 1977.
    • (1977) J. Acoust. Soc. Am , vol.62 , pp. 912-915
    • Allen, J.1    Berkley, D.2    Blauert, J.3
  • 26
    • 0141830958
    • Blind dereverberation of single channel speech signal based on harmonic structure
    • Hong Kong, China PR, Apr
    • T. Nakatani, and M. Miyoshi, "Blind dereverberation of single channel speech signal based on harmonic structure," in Proc. IEEE Int. Conf. Acoust., Speech, Signal process., vol. 1, Hong Kong, China PR, Apr. 2003, pp. 92-5.
    • (2003) Proc. IEEE Int. Conf. Acoust., Speech, Signal process , vol.1 , pp. 92-95
    • Nakatani, T.1    Miyoshi, M.2
  • 27
    • 14344274593
    • A new method based on spectral subtraction for speech dereverberation
    • K. Lebart, and J. Boucher, "A new method based on spectral subtraction for speech dereverberation," Acta Acoustica, vol. 87, pp. 359-66, 2001.
    • (2001) Acta Acoustica , vol.87 , pp. 359-366
    • Lebart, K.1    Boucher, J.2
  • 29
    • 0034857681
    • Speech dereverberation via maximum-kurtosis subband adaptive filtering
    • Salt Lake City, USA
    • B. Gillespie, H. Malvar, and D. Florencio, "Speech dereverberation via maximum-kurtosis subband adaptive filtering," in Proc. IEEE Int. Conf. Acoust., Speech, Signal process., vol. 6, Salt Lake City, USA, 2001, pp. 3701-4.
    • (2001) Proc. IEEE Int. Conf. Acoust., Speech, Signal process , vol.6 , pp. 3701-3704
    • Gillespie, B.1    Malvar, H.2    Florencio, D.3
  • 30
    • 0001628038
    • Nonlinear filtering of multiplied and convolved signals
    • Aug
    • A. Oppenheim, R. Schafer, and J. T.G. Stockham, "Nonlinear filtering of multiplied and convolved signals," Proc. IEEE, vol. 56, pp. 1264-91, Aug. 1968.
    • (1968) Proc. IEEE , vol.56 , pp. 1264-1291
    • Oppenheim, A.1    Schafer, R.2    Stockham, J.T.G.3
  • 31
    • 0027252268
    • Source waveform recovery in a reverberant space by cepstrum dereverberation
    • Minneapolis, USA, Apr
    • M. Tohyama, R. Lyon, and T. Koike, "Source waveform recovery in a reverberant space by cepstrum dereverberation," in Proc. IEEE Int. Conf. Acoust., Speech, Signal process., vol. 1, Minneapolis, USA, Apr. 1993, pp. 157-60.
    • (1993) Proc. IEEE Int. Conf. Acoust., Speech, Signal process , vol.1 , pp. 157-160
    • Tohyama, M.1    Lyon, R.2    Koike, T.3
  • 32
    • 0030362988
    • Study on the dereverberation of speech based on temporal envelope filtering
    • Oct
    • C. Avendano, and H. Hermansky, "Study on the dereverberation of speech based on temporal envelope filtering," in Proc. Fourth Int. Conf. Spoken Language, vol. 2, Oct. 1996, pp. 889-92.
    • (1996) Proc. Fourth Int. Conf. Spoken Language , vol.2 , pp. 889-892
    • Avendano, C.1    Hermansky, H.2
  • 34
    • 4344685385
    • An improved method based on the MTF concept for restoring the power envelope from a reverberant signal
    • Masashi Unoki, Masakazu Furukawa, Keigo Sakata, and Masato Akagi, "An improved method based on the MTF concept for restoring the power envelope from a reverberant signal," J. Acoustical Science and Technology, vol. 25, no. 4, pp. 232-42, 2004.
  • 39
    • 33845358691
    • Statistical analysis of the autoregressive modeling of reverberant speech
    • Dec
    • N. D. Gaubitch, D. B. Ward, and P. A. Naylor, "Statistical analysis of the autoregressive modeling of reverberant speech," J. Acoust. Soc. Am., vol. 120, pp. 4031-9, Dec. 2006.
    • (2006) J. Acoust. Soc. Am , vol.120 , pp. 4031-4039
    • Gaubitch, N.D.1    Ward, D.B.2    Naylor, P.A.3
  • 40
    • 33745761716
    • A two-stage algorithm for one-microphone reverberant speech enhancement
    • May
    • M. Wu, and D. Wang, "A two-stage algorithm for one-microphone reverberant speech enhancement," IEEE Trans. Audio, Speech, Language process., vol. 14, pp. 774-84, May 2006.
    • (2006) IEEE Trans. Audio, Speech, Language process , vol.14 , pp. 774-784
    • Wu, M.1    Wang, D.2
  • 41
    • 50449100228
    • Robust speech dereverberation using multichannel blind deconvolution with spectral subtraction
    • Jul
    • K. Furuya, and A. Kataoka, "Robust speech dereverberation using multichannel blind deconvolution with spectral subtraction," IEEE Trans. Audio, Speech, Language process., vol. 15, pp. 1579-91, Jul. 2007.
    • (2007) IEEE Trans. Audio, Speech, Language process , vol.15 , pp. 1579-1591
    • Furuya, K.1    Kataoka, A.2
  • 42
    • 0017004953
    • Separation of speech from interfering speech by means of harmonic selection
    • Oct
    • T. Parsons, "Separation of speech from interfering speech by means of harmonic selection," J. Acoust. Soc. Am., vol. 60, pp. 911-8, Oct. 1976.
    • (1976) J. Acoust. Soc. Am , vol.60 , pp. 911-918
    • Parsons, T.1
  • 43
    • 0031237388
    • Cochannel speaker separation by harmonic enhancement and suppression
    • Sep
    • D. Morgan, E. George, L. Lee, and S. Kay, "Cochannel speaker separation by harmonic enhancement and suppression," IEEE Trans. Speech Audio process., vol. 5, pp. 407-24, Sep. 1997.
    • (1997) IEEE Trans. Speech Audio process , vol.5 , pp. 407-424
    • Morgan, D.1    George, E.2    Lee, L.3    Kay, S.4
  • 44
    • 0025256257
    • An approach to co-channel talker interference suppression using a sinusoidal model for speech
    • Jan
    • T. Quatieri, and R. Danisewicz, "An approach to co-channel talker interference suppression using a sinusoidal model for speech," IEEE Trans. Acoust., Speech, Signal process., vol. ASSP-38, pp. 56-69, Jan. 1990.
    • (1990) IEEE Trans. Acoust., Speech, Signal process , vol.ASSP-38 , pp. 56-69
    • Quatieri, T.1    Danisewicz, R.2
  • 47
    • 64649106897
    • An auditory scene analysis approach to monaural speech segregation
    • G. Hu, and D. Wang, "An auditory scene analysis approach to monaural speech segregation," in Topics in Acoustic Echo and Noise Control, I. H. E. and S. G, Eds. Springer, Heidelberg, 2006, pp. 485-515.
  • 48
    • 4644265990
    • Monaural speech segregation based on pitch tracking and amplitude modulation
    • Sep
    • G. Hu, and D. Wang, "Monaural speech segregation based on pitch tracking and amplitude modulation," IEEE Trans. Neural Networks, vol. 15, no. 5, pp. 1135-50, Sep. 2004.
    • (2004) IEEE Trans. Neural Networks , vol.15 , Issue.5 , pp. 1135-1150
    • Hu, G.1    Wang, D.2
  • 49
    • 33845940172
    • A maximum likelihood estimation of vocal-tract-related filter characteristics for single channel speech separation
    • Article ID 84186, 15, doi:10.1155/2007/84186
    • M. H. Radfar, R. M. Dansereau, and A. Sayadiyan, "A maximum likelihood estimation of vocal-tract-related filter characteristics for single channel speech separation," EURASIP J. Audio Speech Music process., vol. 2007, Article ID 84186, 15 pages, 2007. doi:10.1155/2007/84186.
    • (2007) EURASIP J. Audio Speech Music process , vol.2007
    • Radfar, M.H.1    Dansereau, R.M.2    Sayadiyan, A.3
  • 51
    • 0025343521
    • Algorithms for separating the speech of interfering talkers: Evaluations with voiced sentences, and normal-hearing and hearing-impaired listeners
    • R. J. Stubbs, and Q. Summerfield, "Algorithms for separating the speech of interfering talkers: Evaluations with voiced sentences, and normal-hearing and hearing-impaired listeners," J. Acoust. Soc. Am., vol. 87, no. 1, pp. 359-72, 1990.
    • (1990) J. Acoust. Soc. Am , vol.87 , Issue.1 , pp. 359-372
    • Stubbs, R.J.1    Summerfield, Q.2
  • 52
    • 0028416938
    • Independent component analysis, a new concept?
    • P. Comon, "Independent component analysis, a new concept?" Signal process., vol. 36, no. 3, pp. 287-314, 1994.
    • (1994) Signal process , vol.36 , Issue.3 , pp. 287-314
    • Comon, P.1
  • 53
    • 0032187518
    • Blind signal separation: Statistical principles
    • J. Cardoso, "Blind signal separation: Statistical principles," Proc. IEEE, vol. 86, pp. 2009-25, 1998.
    • (1998) Proc. IEEE , vol.86 , pp. 2009-2025
    • Cardoso, J.1
  • 54
    • 0025642041
    • Eigen-structure of the fourth-order cumulant tensor with application to the blind source separation problem
    • Apr
    • J.-F. Cardoso, "Eigen-structure of the fourth-order cumulant tensor with application to the blind source separation problem," in Proc. IEEE Int. Conf. Acoust., Speech, Signal process., vol. 5, Apr. 1990, pp. 2655-8.
    • (1990) Proc. IEEE Int. Conf. Acoust., Speech, Signal process , vol.5 , pp. 2655-2658
    • Cardoso, J.-F.1
  • 55
    • 0026191274
    • Blind separation of sources, part 1: An adaptive algorithm based on neuromimetic architecture
    • C. Jutten, and J. Herault, "Blind separation of sources, part 1: An adaptive algorithm based on neuromimetic architecture," Signal process., vol. 24, no. 1, pp. 1-10, 1991.
    • (1991) Signal process , vol.24 , Issue.1 , pp. 1-10
    • Jutten, C.1    Herault, J.2
  • 56
    • 0038782373
    • Combined approach of array processing and independent component analysis for blind separation of acoustic signals
    • May
    • F. Asano, S. Ikeda, M. Ogawa, H. Asoh, and N. Kitawaki, "Combined approach of array processing and independent component analysis for blind separation of acoustic signals," IEEE Trans. Speech Audio process., vol. 11, no. 3, pp. 204-15, May 2003.
    • (2003) IEEE Trans. Speech Audio process , vol.11 , Issue.3 , pp. 204-215
    • Asano, F.1    Ikeda, S.2    Ogawa, M.3    Asoh, H.4    Kitawaki, N.5
  • 57
    • 64649084900
    • Time-domain blind audio source separation using advanced ICA methods
    • Antwerp, Belgium, Aug
    • Z. Koldovsky, and P. Tichavsky, "Time-domain blind audio source separation using advanced ICA methods," in Proc. INTERSPEECH 2007, Antwerp, Belgium, Aug. 2007, pp. 27-31.
    • (2007) Proc. INTERSPEECH 2007 , pp. 27-31
    • Koldovsky, Z.1    Tichavsky, P.2
  • 58
    • 64649084773
    • ICA methods for blind source separation of instantaneous mixtures: A case study
    • Nov
    • N. Das, A. Routray, and P. K. Dash, "ICA methods for blind source separation of instantaneous mixtures: A case study," Neural Information process. Letters and Reviews, vol. 11, no. 11, pp. 225-46, Nov. 2007.
    • (2007) Neural Information process. Letters and Reviews , vol.11 , Issue.11 , pp. 225-246
    • Das, N.1    Routray, A.2    Dash, P.K.3
  • 59
    • 0042826822
    • Independent component analysis: Algorithms and applications
    • A. Hyvärinen, and E. Oja, "Independent component analysis: Algorithms and applications," Neural Networks, vol. 13, no. 4-5, pp. 411-30, 2000.
    • (2000) Neural Networks , vol.13 , Issue.4-5 , pp. 411-430
    • Hyvärinen, A.1    Oja, E.2
  • 60
    • 0037367812
    • The fundamental limitation of frequency domain blind source separation for convolutive mixtures of speech
    • Mar
    • S. Araki, R. Mukai, S. Makino, T. Nishikawa, and H. Saruwatari, "The fundamental limitation of frequency domain blind source separation for convolutive mixtures of speech," IEEE Trans. Speech Audio process., vol. 11, no. 2, pp. 109-16, Mar. 2003.
    • (2003) IEEE Trans. Speech Audio process , vol.11 , Issue.2 , pp. 109-116
    • Araki, S.1    Mukai, R.2    Makino, S.3    Nishikawa, T.4    Saruwatari, H.5


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.