



Volume 2015-January, 2015, Pages 1508-1512

Multi-objective learning and mask-based post-processing for deep neural network based speech enhancement

Author keywords

Binary mask; Deep neural network; Minimum mean square error; Multi objective learning; Speech enhancement

Indexed keywords

BINS; MEAN SQUARE ERROR; SPEECH; SPEECH ENHANCEMENT; SPEECH RECOGNITION

EID: 84959100788     PISSN: 2308457X     EISSN: 19909772     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 69

References (40)
  • 1
    • 0018455310
    • Suppression of acoustic noise in speech using spectral subtraction
    • S. Boll, "Suppression of acoustic noise in speech using spectral subtraction," IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 27, no. 2, pp. 113-120, 1979.
    • (1979) IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 27, issue 2, pp. 113-120
    • Boll, S.
  • 2
    • 0021645331
    • Speech enhancement using a minimum-mean square error short-time spectral amplitude estimator
    • Y. Ephraim and D. Malah, "Speech enhancement using a minimum-mean square error short-time spectral amplitude estimator," IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 32, no. 6, pp. 1109-1121, 1984.
    • (1984) IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 32, issue 6, pp. 1109-1121
    • Ephraim, Y.; Malah, D.
  • 3
  • 4
    • 0035500783
    • Speech enhancement for non-stationary noise environments
    • I. Cohen and B. Berdugo, "Speech enhancement for non-stationary noise environments," Signal Processing, vol. 81, no. 11, pp. 2403-2418, 2001.
    • (2001) Signal Processing, vol. 81, issue 11, pp. 2403-2418
    • Cohen, I.; Berdugo, B.
  • 5
    • 0041360463
    • Noise spectrum estimation in adverse environments: Improved minima controlled recursive averaging
    • I. Cohen, "Noise spectrum estimation in adverse environments: Improved minima controlled recursive averaging," IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 11, no. 5, pp. 466-475, 2003.
    • (2003) IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 11, issue 5, pp. 466-475
    • Cohen, I.
  • 6
    • 0027623210
    • Assessment for automatic speech recognition: II. NOISEX-92: A database and an experiment to study the effect of additive noise on speech recognition systems
    • A. Varga and H. J. Steeneken, "Assessment for automatic speech recognition: II. NOISEX-92: A database and an experiment to study the effect of additive noise on speech recognition systems," Speech Communication, vol. 12, no. 3, pp. 247-251, 1993.
    • (1993) Speech Communication, vol. 12, issue 3, pp. 247-251
    • Varga, A.; Steeneken, H.J.
  • 8
    • 84867198451
    • Regularized non-negative matrix factorization with temporal dependencies for speech denoising
    • K. W. Wilson, B. Raj, and P. Smaragdis, "Regularized non-negative matrix factorization with temporal dependencies for speech denoising," in INTERSPEECH, 2008, pp. 411-414.
    • (2008) INTERSPEECH, pp. 411-414
    • Wilson, K.W.; Raj, B.; Smaragdis, P.
  • 10
    • 84055222005
    • Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition
    • G. E. Dahl, D. Yu, L. Deng, and A. Acero, "Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition," IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 1, pp. 30-42, 2012.
    • (2012) IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, issue 1, pp. 30-42
    • Dahl, G.E.; Yu, D.; Deng, L.; Acero, A.
  • 11
    • 84889263385
    • Denoising deep neural networks based voice activity detection
    • X.-L. Zhang and J. Wu, "Denoising deep neural networks based voice activity detection," in ICASSP, 2013, pp. 853-857.
    • (2013) ICASSP, pp. 853-857
    • Zhang, X.-L.; Wu, J.
  • 13
    • 84889257121
    • An experimental study on speech enhancement based on deep neural networks
    • Y. Xu, J. Du, L.-R. Dai, and C.-H. Lee, "An experimental study on speech enhancement based on deep neural networks," IEEE Signal Processing Letters, vol. 21, no. 1, pp. 65-68, 2014.
    • (2014) IEEE Signal Processing Letters, vol. 21, issue 1, pp. 65-68
    • Xu, Y.; Du, J.; Dai, L.-R.; Lee, C.-H.
  • 14
    • 84910038203
    • Dynamic noise aware training for speech enhancement based on deep neural networks
    • Y. Xu, J. Du, L.-R. Dai, and C.-H. Lee, "Dynamic noise aware training for speech enhancement based on deep neural networks," in INTERSPEECH, 2014, pp. 2670-2674.
    • (2014) INTERSPEECH, pp. 2670-2674
    • Xu, Y.; Du, J.; Dai, L.-R.; Lee, C.-H.
  • 15
    • 84867202951
    • A speech enhancement approach using piece-wise linear approximation of an explicit model of environmental distortions
    • J. Du and Q. Huo, "A speech enhancement approach using piece-wise linear approximation of an explicit model of environmental distortions," in INTERSPEECH, 2008, pp. 569-572.
    • (2008) INTERSPEECH, pp. 569-572
    • Du, J.; Huo, Q.
  • 16
    • 84906262433
    • Speech enhancement based on deep denoising autoencoder
    • X. Lu, Y. Tsao, S. Matsuda, and C. Hori, "Speech enhancement based on deep denoising autoencoder," in INTERSPEECH, 2013, pp. 436-440.
    • (2013) INTERSPEECH, pp. 436-440
    • Lu, X.; Tsao, Y.; Matsuda, S.; Hori, C.
  • 17
    • 84896537574
    • Wiener filtering based speech enhancement with weighted denoising auto-encoder and noise classification
    • B. Xia and C. Bao, "Wiener filtering based speech enhancement with weighted denoising auto-encoder and noise classification," Speech Communication, vol. 60, pp. 13-29, 2014.
    • (2014) Speech Communication, vol. 60, pp. 13-29
    • Xia, B.; Bao, C.
  • 19
    • 84910049527
    • Experiments on deep learning for speech denoising
    • D. Liu, P. Smaragdis, and M. Kim, "Experiments on deep learning for speech denoising," in INTERSPEECH, 2014, pp. 2685-2689.
    • (2014) INTERSPEECH, pp. 2685-2689
    • Liu, D.; Smaragdis, P.; Kim, M.
  • 25
    • 84890493989
    • Ideal ratio mask estimation using deep neural networks for robust speech recognition
    • A. Narayanan and D. L. Wang, "Ideal ratio mask estimation using deep neural networks for robust speech recognition," in ICASSP, 2013, pp. 7092-7096.
    • (2013) ICASSP, pp. 7092-7096
    • Narayanan, A.; Wang, D.L.
  • 26
    • 0031189914
    • Multitask learning: A knowledge-based source of inductive bias
    • R. Caruana, "Multitask learning: A knowledge-based source of inductive bias," in ICML, 1993, pp. 41-48.
    • (1993) ICML, pp. 41-48
    • Caruana, R.
  • 27
    • 84890545600
    • Multi-task learning in deep neural networks for improved phoneme recognition
    • M. L. Seltzer and J. Droppo, "Multi-task learning in deep neural networks for improved phoneme recognition," in ICASSP, 2013, pp. 6965-6969.
    • (2013) ICASSP, pp. 6965-6969
    • Seltzer, M.L.; Droppo, J.
  • 29
    • 0032595188
    • Generalized mel frequency cepstral coefficients for large-vocabulary speaker-independent continuous-speech recognition
    • R. Vergin, D. O'Shaughnessy, and A. Farhat, "Generalized mel frequency cepstral coefficients for large-vocabulary speaker-independent continuous-speech recognition," IEEE Transactions on Speech and Audio Processing, vol. 7, no. 5, pp. 525-532, 1999.
    • (1999) IEEE Transactions on Speech and Audio Processing, vol. 7, issue 5, pp. 525-532
    • Vergin, R.; O'Shaughnessy, D.; Farhat, A.
  • 30
    • 30444446629
    • Combining evidence from residual phase and MFCC features for speaker recognition
    • K. S. R. Murty and B. Yegnanarayana, "Combining evidence from residual phase and MFCC features for speaker recognition," IEEE Signal Processing Letters, vol. 13, no. 1, pp. 52-55, 2006.
    • (2006) IEEE Signal Processing Letters, vol. 13, issue 1, pp. 52-55
    • Murty, K.S.R.; Yegnanarayana, B.
  • 31
    • 0009985115
    • Mel frequency cepstral coefficients for music modeling
    • B. Logan et al., "Mel frequency cepstral coefficients for music modeling," in ISMIR, 2000.
    • (2000) ISMIR
    • Logan, B.
  • 35
    • 0008861179
    • Getting started with the DARPA TIMIT CD-ROM: An acoustic phonetic continuous speech database
    • Gaithersburg, MD
    • J. S. Garofolo et al., "Getting started with the DARPA TIMIT CD-ROM: An acoustic phonetic continuous speech database," National Institute of Standards and Technology (NIST), Gaithersburg, MD, vol. 107, 1988.
    • (1988) National Institute of Standards and Technology (NIST), vol. 107
    • Garofolo, J.S.
  • 36
    • 0034847662
    • Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs
    • A. W. Rix, J. G. Beerends, M. P. Hollier, and A. P. Hekstra, "Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs," in ICASSP, 2001, pp. 749-752.
    • (2001) ICASSP, pp. 749-752
    • Rix, A.W.; Beerends, J.G.; Hollier, M.P.; Hekstra, A.P.
  • 38
    • 84890527827
    • Improving deep neural networks for LVCSR using rectified linear units and dropout
    • G. E. Dahl, T. N. Sainath, and G. E. Hinton, "Improving deep neural networks for LVCSR using rectified linear units and dropout," in ICASSP, 2013, pp. 8609-8613.
    • (2013) ICASSP, pp. 8609-8613
    • Dahl, G.E.; Sainath, T.N.; Hinton, G.E.
  • 40
    • 84890492030
    • An investigation of deep neural networks for noise robust speech recognition
    • M. L. Seltzer, D. Yu, and Y. Wang, "An investigation of deep neural networks for noise robust speech recognition," in ICASSP, 2013, pp. 7398-7402.
    • (2013) ICASSP, pp. 7398-7402
    • Seltzer, M.L.; Yu, D.; Wang, Y.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.