Volume 17, Issue 4, 2009, Pages 625-638

A supervised learning approach to monaural segregation of reverberant speech

Author keywords

Computational auditory scene analysis (CASA); Monaural segregation; Room reverberation; Speech separation; Supervised learning

Indexed keywords

COMPUTATIONAL AUDITORY SCENE ANALYSIS (CASA); HARMONICITY; INVERSE FILTERING; LEARNING PROCESS; MONAURAL SEGREGATION; OBJECTIVE FUNCTIONS; POSTERIOR PROBABILITIES; REAL ENVIRONMENTS; REVERBERANT CONDITIONS; REVERBERANT ENVIRONMENTS; ROOM REVERBERATION; SEGMENTATION AND GROUPINGS; SIGNAL DEGRADATIONS; SIGNAL-TO-NOISE RATIOS; SPEECH SEGREGATIONS; SPEECH SEPARATION; SYSTEMATIC EVALUATIONS; TIME FREQUENCIES; VOICED SPEECH;

EID: 65249103478     PISSN: 15587916     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASL.2008.2010633     Document Type: Article
Times cited: 93

References (61)
  • 2
    • 0018455820
    • Image method for efficiently simulating small-room acoustics
    • J. B. Allen and D. A. Berkley, "Image method for efficiently simulating small-room acoustics," J. Acoust. Soc. Amer., vol. 65, pp. 943-950, 1979.
    • (1979) J. Acoust. Soc. Amer , vol.65 , pp. 943-950
    • Allen, J.B.1    Berkley, D.A.2
  • 3
    • 33748523481
    • Determination of the potential benefit of time-frequency gain manipulation
    • M. C. Anzalone, L. Calandruccio, K. A. Doherty, and L. H. Carney, "Determination of the potential benefit of time-frequency gain manipulation," Ear Hear., vol. 27, pp. 480-492, 2006.
    • (2006) Ear Hear , vol.27 , pp. 480-492
    • Anzalone, M.C.1    Calandruccio, L.2    Doherty, K.A.3    Carney, L.H.4
  • 4
    • 33749051687
    • Blind one-microphone speech separation: A spectral learning approach
    • F. Bach and M. Jordan, "Blind one-microphone speech separation: A spectral learning approach," in Proc. NIPS, 2004, pp. 65-72.
    • (2004) Proc. NIPS , pp. 65-72
    • Bach, F.1    Jordan, M.2
  • 7
    • 65249105669
    • P. Boersma and D. Weenink, "Praat: Doing phonetics by computer (Version 4.3.14)." 2005 [Online]. Available: http://www.fon.hum.uva.nl/praat
  • 8
    • 0018455310
    • Suppression of acoustic noise in speech using spectral subtraction
    • Apr
    • S. F. Boll, "Suppression of acoustic noise in speech using spectral subtraction," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-27, no. 2, pp. 113-120, Apr. 1979.
    • (1979) IEEE Trans. Acoust., Speech, Signal Process , vol.ASSP-27 , Issue.2 , pp. 113-120
    • Boll, S.F.1
  • 9
    • 0031640099
    • On the use of explicit speech modeling in microphone array applications
    • M. S. Brandstein, "On the use of explicit speech modeling in microphone array applications," in Proc. IEEE ICASSP, 1998, pp. 3613-3616.
    • (1998) Proc. IEEE ICASSP , pp. 3613-3616
    • Brandstein, M.S.1
  • 11
    • 0028531926
    • Computational auditory scene analysis
    • G. Brown and M. Cooke, "Computational auditory scene analysis," Comput. Speech Lang., pp. 297-336, 1994.
    • (1994) Comput. Speech Lang , pp. 297-336
    • Brown, G.1    Cooke, M.2
  • 13
    • 33845354768
    • Isolating the energetic component of speech-on-speech masking with ideal time-frequency segregation
    • D. S. Brungart, P. S. Chang, B. D. Simpson, and D. L. Wang, "Isolating the energetic component of speech-on-speech masking with ideal time-frequency segregation," J. Acoust. Soc. Amer., vol. 120, pp. 4007-4018, 2006.
    • (2006) J. Acoust. Soc. Amer , vol.120 , pp. 4007-4018
    • Brungart, D.S.1    Chang, P.S.2    Simpson, B.D.3    Wang, D.L.4
  • 17
    • 0242440783
    • Effects of reverberation on perceptual segregation of competing voices
    • J. F. Culling, K. I. Hodder, and C. Y. Toh, "Effects of reverberation on perceptual segregation of competing voices," J. Acoust. Soc. Amer., vol. 114, pp. 2871-2876, 2003.
    • (2003) J. Acoust. Soc. Amer , vol.114 , pp. 2871-2876
    • Culling, J.F.1    Hodder, K.I.2    Toh, C.Y.3
  • 19
    • 0021645331
    • Speech enhancement using a minimum mean square error short-time spectral amplitude estimator
    • Y. Ephraim and D. Malah, "Speech enhancement using a minimum mean square error short-time spectral amplitude estimator," IEEE Trans. Acoust., Speech, Signal Process., vol. 32, pp. 1109-1121, 1984.
    • (1984) IEEE Trans. Acoust., Speech, Signal Process , vol.32 , pp. 1109-1121
    • Ephraim, Y.1    Malah, D.2
  • 20
    • 0029345417
    • A signal subspace approach for speech enhancement
    • Jul
    • Y. Ephraim and H. L. Van Trees, "A signal subspace approach for speech enhancement," IEEE Trans. Speech Audio Process., vol. 3, no. 4, pp. 251-266, Jul. 1995.
    • (1995) IEEE Trans. Speech Audio Process , vol.3 , Issue.4 , pp. 251-266
    • Ephraim, Y.1    Van Trees, H.L.2
  • 22
    • 0034857681
    • Speech dereverberation via maximum-kurtosis subband adaptive filtering
    • B. W. Gillespie, H. S. Malvar, and D. A. F. Florencio, "Speech dereverberation via maximum-kurtosis subband adaptive filtering," in Proc. IEEE ICASSP, 2001, pp. 3701-3704.
    • (2001) Proc. IEEE ICASSP , pp. 3701-3704
    • Gillespie, B.W.1    Malvar, H.S.2    Florencio, D.A.F.3
  • 24
    • 85045165251
    • Monaural speech organization and segregation
    • Ph.D. dissertation, Biophys. Program, The Ohio State Univ, Columbus
    • G. Hu, "Monaural speech organization and segregation," Ph.D. dissertation, Biophys. Program, The Ohio State Univ., Columbus, 2006.
    • (2006)
    • Hu, G.1
  • 25
    • 4644265990
    • Monaural speech segregation based on pitch tracking and amplitude modulation
    • Sep
    • G. Hu and D. L. Wang, "Monaural speech segregation based on pitch tracking and amplitude modulation," IEEE Trans. Neural Netw., vol. 15, no. 5, pp. 1135-1150, Sep. 2004.
    • (2004) IEEE Trans. Neural Netw , vol.15 , Issue.5 , pp. 1135-1150
    • Hu, G.1    Wang, D.L.2
  • 26
    • 46049084696
    • An auditory scene analysis approach to monaural speech segregation
    • E. Hansler and G. Schmidt, Eds. New York: Springer
    • G. Hu and D. L. Wang, "An auditory scene analysis approach to monaural speech segregation," in Topics in Acoustic Echo and Noise Control, E. Hansler and G. Schmidt, Eds. New York: Springer, 2006, pp. 485-515.
    • (2006) Topics in Acoustic Echo and Noise Control , pp. 485-515
    • Hu, G.1    Wang, D.L.2
  • 27
    • 38849102154
    • Auditory segmentation based on onset and offset analysis
    • Feb
    • G. Hu and D. L. Wang, "Auditory segmentation based on onset and offset analysis," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 2, pp. 396-405, Feb. 2007.
    • (2007) IEEE Trans. Audio, Speech, Lang. Process , vol.15 , Issue.2 , pp. 396-405
    • Hu, G.1    Wang, D.L.2
  • 28
    • 49249107353
    • Segregation of unvoiced speech from nonspeech interference
    • G. Hu and D. L. Wang, "Segregation of unvoiced speech from nonspeech interference," J. Acoust. Soc. Amer., vol. 124, pp. 1306-1319, 2008.
    • (2008) J. Acoust. Soc. Amer , vol.124 , pp. 1306-1319
    • Hu, G.1    Wang, D.L.2
  • 30
    • 0018469156
    • Critical distance measurement of rooms from the sound energy spectral response
    • J. J. Jetzt, "Critical distance measurement of rooms from the sound energy spectral response," J. Acoust. Soc. Amer., vol. 65, pp. 1204-1211, 1979.
    • (1979) J. Acoust. Soc. Amer , vol.65 , pp. 1204-1211
    • Jetzt, J.J.1
  • 31
    • 34547545458
    • A supervised learning approach to monaural segregation of reverberant speech
    • Z. Jin and D. L. Wang, "A supervised learning approach to monaural segregation of reverberant speech," in Proc. IEEE ICASSP, 2007, pp. 921-924.
    • (2007) Proc. IEEE ICASSP , pp. 921-924
    • Jin, Z.1    Wang, D.L.2
  • 35
    • 40749125179
    • Factors influencing intelligibility of ideal binary-masked speech: Implications for noise reduction
    • N. Li and P. C. Loizou, "Factors influencing intelligibility of ideal binary-masked speech: Implications for noise reduction," J. Acoust. Soc. Amer., vol. 123, pp. 1673-1682, 2008.
    • (2008) J. Acoust. Soc. Amer , vol.123 , pp. 1673-1682
    • Li, N.1    Loizou, P.C.2
  • 38
    • 65249107816
    • Proc. Text, Speech, Dialogue-Second Int. Workshop, TSD'99
    • V. Matousek, P. Mautner, J. Ocelíkova, and P. Sojka, Eds, Plzen, Czech Republic, September, Springer
    • V. Matousek, P. Mautner, J. Ocelíkova, and P. Sojka, Eds., in Proc. Text, Speech, Dialogue-Second Int. Workshop, TSD'99, Plzen, Czech Republic, September 1999, 1999, vol. 1692, Lecture Notes in Computer Science, Springer.
    • (1999) Lecture Notes in Computer Science , vol.1692
  • 39
    • 0023944462
    • Simulation of auditory-neural transduction: Further studies
    • R. Meddis, "Simulation of auditory-neural transduction: Further studies," J. Acoust. Soc. Amer., vol. 83, pp. 1056-1063, 1988.
    • (1988) J. Acoust. Soc. Amer , vol.83 , pp. 1056-1063
    • Meddis, R.1
  • 40
    • 0022130231
    • On the variation and invertibility of room impulse response functions
    • J. Mourjopoulos, "On the variation and invertibility of room impulse response functions," J. Sound Vibr., vol. 102, pp. 217-228, 1985.
    • (1985) J. Sound Vibr , vol.102 , pp. 217-228
    • Mourjopoulos, J.1
  • 41
    • 0029252699
    • On the probabilistic interpretation of neural network classifiers and discriminative training criteria
    • Feb
    • H. Ney, "On the probabilistic interpretation of neural network classifiers and discriminative training criteria," IEEE Trans. Pattern Anal. Mach. Intell., vol. 17, no. 2, pp. 107-119, Feb. 1995.
    • (1995) IEEE Trans. Pattern Anal. Mach. Intell , vol.17 , Issue.2 , pp. 107-119
    • Ney, H.1
  • 42
    • 4644304197
    • A binaural processor for missing data speech recognition in the presence of noise and small-room reverberation
    • K. J. Palomaki, G. J. Brown, and D. L. Wang, "A binaural processor for missing data speech recognition in the presence of noise and small-room reverberation," Speech Commun., vol. 43, pp. 361-378, 2004.
    • (2004) Speech Commun , vol.43 , pp. 361-378
    • Palomaki, K.J.1    Brown, G.J.2    Wang, D.L.3
  • 43
    • 65249137843
    • R. D. Patterson, I. Nimmo-Smith, J. Holdsworth, and P. Rice, An Efficient Auditory Filterbank Based on the Gammatone Function, Appl. Psychol. Unit, Cambridge, U.K., APU Rep. 2341, 1988.
    • R. D. Patterson, I. Nimmo-Smith, J. Holdsworth, and P. Rice, "An Efficient Auditory Filterbank Based on the Gammatone Function," Appl. Psychol. Unit, Cambridge, U.K., APU Rep. 2341, 1988.
  • 44
    • 48849091396
    • Single-channel speech separation using soft masking filtering
    • Nov
    • M. H. Radfar and R. M. Dansereau, "Single-channel speech separation using soft masking filtering," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 8, pp. 2299-2310, Nov. 2007.
    • (2007) IEEE Trans. Audio, Speech, Lang. Process , vol.15 , Issue.8 , pp. 2299-2310
    • Radfar, M.H.1    Dansereau, R.M.2
  • 45
    • 0000423738
    • Equalization in an acoustic reverberant environment: Robustness results
    • May
    • B. D. Radlovic, R. C. Williamson, and R. A. Kennedy, "Equalization in an acoustic reverberant environment: Robustness results," IEEE Trans. Speech Audio Process., vol. 8, no. 3, pp. 311-319, May 2000.
    • (2000) IEEE Trans. Speech Audio Process , vol.8 , Issue.3 , pp. 311-319
    • Radlovic, B.D.1    Williamson, R.C.2    Kennedy, R.A.3
  • 46
    • 0001595997
    • Neural network classifiers estimate Bayesian a posteriori probabilities
    • M. D. Richard and R. P. Lippmann, "Neural network classifiers estimate Bayesian a posteriori probabilities," Neural Comput., vol. 3, pp. 461-483, 1991.
    • (1991) Neural Comput , vol.3 , pp. 461-483
    • Richard, M.D.1    Lippmann, R.P.2
  • 47
    • 33845361885
    • Binaural segregation in multisource reverberant environments
    • N. Roman, S. Srinivasan, and D. L. Wang, "Binaural segregation in multisource reverberant environments," J. Acoust. Soc. Amer., vol. 120, pp. 4040-4051, 2006.
    • (2006) J. Acoust. Soc. Amer , vol.120 , pp. 4040-4051
    • Roman, N.1    Srinivasan, S.2    Wang, D.L.3
  • 48
    • 33745761651
    • Pitch-based monaural segregation of reverberant speech
    • N. Roman and D. L. Wang, "Pitch-based monaural segregation of reverberant speech," J. Acoust. Soc. Amer., vol. 120, pp. 458-469, 2006.
    • (2006) J. Acoust. Soc. Amer , vol.120 , pp. 458-469
    • Roman, N.1    Wang, D.L.2
  • 49
    • 0038705102
    • One microphone source separation
    • S. T. Roweis, "One microphone source separation," in Proc. NIPS, 2000, pp. 793-799.
    • (2000) Proc. NIPS , pp. 793-799
    • Roweis, S.T.1
  • 50
    • 0036134369
    • Adjusting a classifier for new a priori probabilities: A simple procedure
    • M. Saerens, P. Latinne, and C. Decaestecker, "Adjusting a classifier for new a priori probabilities: A simple procedure," Neural Comput., vol. 14, pp. 21-41, 2002.
    • (2002) Neural Comput , vol.14 , pp. 21-41
    • Saerens, M.1    Latinne, P.2    Decaestecker, C.3
  • 51
    • 65249097950
    • The effect of reverberation on the temporal representation of the f0 of frequency swept harmonic complexes in the ventral cochlear nucleus
    • B. Kollmeier, G. Klump, V. Hohmann, U. Langemann, M. Mauermann, S. Uppenkamp, and J. Verhey, Eds. Berlin, Germany: Springer
    • M. Sayles, B. Schouten, N. J. Ingham, and I. M. Winter, "The effect of reverberation on the temporal representation of the f0 of frequency swept harmonic complexes in the ventral cochlear nucleus," in Hearing: From Sensory Processing to Perception, B. Kollmeier, G. Klump, V. Hohmann, U. Langemann, M. Mauermann, S. Uppenkamp, and J. Verhey, Eds. Berlin, Germany: Springer, 2007, pp. 35-42.
    • (2007) Hearing: From Sensory Processing to Perception , pp. 35-42
    • Sayles, M.1    Schouten, B.2    Ingham, N.J.3    Winter, I.M.4
  • 52
    • 0347379706
    • Multiresolution estimates of classification complexity
    • Dec
    • S. Singh, "Multiresolution estimates of classification complexity," IEEE Trans. Pattern Anal. Mach. Intell., vol. 25, no. 12, pp. 1534-1539, Dec. 2003.
    • (2003) IEEE Trans. Pattern Anal. Mach. Intell , vol.25 , Issue.12 , pp. 1534-1539
    • Singh, S.1
  • 53
    • 33846500112
    • Distances between data sets based on summary statistics
    • N. Tatti, "Distances between data sets based on summary statistics," J. Mach. Learn. Res., vol. 8, pp. 131-154, 2007.
    • (2007) J. Mach. Learn. Res , vol.8 , pp. 131-154
    • Tatti, N.1
  • 55
    • 84892233308
    • On ideal binary mask as the computational goal of auditory scene analysis
    • P. Divenyi, Ed. Norwell, MA: Kluwer
    • D. L. Wang, "On ideal binary mask as the computational goal of auditory scene analysis," in Speech Separation by Humans and Machines, P. Divenyi, Ed. Norwell, MA: Kluwer, 2005, pp. 181-197.
    • (2005) Speech Separation by Humans and Machines , pp. 181-197
    • Wang, D.L.1
  • 56
    • 0032682770
    • Separation of speech from interfering sounds based on oscillatory correlation
    • May
    • D. L. Wang and G. J. Brown, "Separation of speech from interfering sounds based on oscillatory correlation," IEEE Trans. Neural Netw., vol. 10, no. 3, pp. 684-697, May 1999.
    • (1999) IEEE Trans. Neural Netw , vol.10 , Issue.3 , pp. 684-697
    • Wang, D.L.1    Brown, G.J.2
  • 58
    • 0003982501
    • A theory and computational model of auditory monaural sound separation
    • Ph.D. dissertation, Dept. of Elect. Eng, Stanford Univ, Stanford, CA
    • M. Weintraub, "A theory and computational model of auditory monaural sound separation," Ph.D. dissertation, Dept. of Elect. Eng., Stanford Univ., Stanford, CA, 1985.
    • (1985)
    • Weintraub, M.1
  • 59
    • 50249086925
    • Monaural speech separation using source-adapted models
    • R. Weiss and D. Ellis, "Monaural speech separation using source-adapted models," in Proc. IEEE WASPAA, 2007, pp. 114-117.
    • (2007) Proc. IEEE WASPAA , pp. 114-117
    • Weiss, R.1    Ellis, D.2
  • 60
    • 33646750815
    • A pitch-based method for the estimation of short reverberation time
    • M. Wu and D. L. Wang, "A pitch-based method for the estimation of short reverberation time," Acta Acustica United With Acustica, vol. 92, pp. 337-339, 2006.
    • (2006) Acta Acustica United With Acustica , vol.92 , pp. 337-339
    • Wu, M.1    Wang, D.L.2
  • 61
    • 33745761716
    • A two-stage algorithm for one-microphone reverberant speech enhancement
    • May
    • M. Wu and D. L. Wang, "A two-stage algorithm for one-microphone reverberant speech enhancement," IEEE Trans. Audio, Speech, Lang. Process., vol. 14, no. 3, pp. 774-784, May 2006.
    • (2006) IEEE Trans. Audio, Speech, Lang. Process , vol.14 , Issue.3 , pp. 774-784
    • Wu, M.1    Wang, D.L.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.