



Volume 136, Issue 2, 2014, Pages 892-902

Reconstruction techniques for improving the perceptual quality of binary masked speech

Author keywords

[No Author keywords available]

Indexed keywords

BAYESIAN NETWORKS; SEPARATION; SPEECH;

EID: 84905693981     PISSN: 00014966     EISSN: None     Source Type: Journal    
DOI: 10.1121/1.4884759     Document Type: Article
Times cited: 33

References (55)
  • 1
    • 33748523481
    • Determination of the potential benefit of time-frequency gain manipulation
    • 10.1097/01.aud.0000233891.86809.df
    • Anzalone, M. C., Calandruccio, L., Doherty, K. A., and Carney, L. H. (2006). " Determination of the potential benefit of time-frequency gain manipulation," Ear Hear. 27, 480-492. 10.1097/01.aud.0000233891.86809.df
    • (2006) Ear Hear. , vol.27 , pp. 480-492
    • Anzalone, M.C.1    Calandruccio, L.2    Doherty, K.A.3    Carney, L.H.4
  • 2
    • 33646759922
    • Reducing musical noise by a fine-shift overlap-add method applied to source separation using a time-frequency mask
    • Araki, S., Makino, S., Sawada, H., and Mukai, R. (2005). " Reducing musical noise by a fine-shift overlap-add method applied to source separation using a time-frequency mask," in Proceedings of ICASSP, Vol. 3, pp. 81-84.
    • (2005) Proceedings of ICASSP , vol.3 , pp. 81-84
    • Araki, S.1    Makino, S.2    Sawada, H.3    Mukai, R.4
  • 3
    • 38149032997
    • Compressed sensing and source separation
    • edited by M. E. Davies, C. J. James, S. Abdallah, and M. D. Plumbley (Springer Verlag, New York)
    • Blumensath, T., and Davies, M. E. (2007). " Compressed sensing and source separation," in Independent Component Analysis and Blind Source Separation, edited by M. E. Davies, C. J. James, S. Abdallah, and M. D. Plumbley (Springer Verlag, New York), pp. 341-348.
    • (2007) Independent Component Analysis and Blind Source Separation , pp. 341-348
    • Blumensath, T.1    Davies, M.E.2
  • 5
    • 33845354768
    • Isolating the energetic component of speech-on-speech masking with ideal time-frequency segregation
    • 10.1121/1.2363929
    • Brungart, D., Chang, P., Simpson, B., and Wang, D. (2006). " Isolating the energetic component of speech-on-speech masking with ideal time-frequency segregation," J. Acoust. Soc. Am. 120, 4007-4018. 10.1121/1.2363929
    • (2006) J. Acoust. Soc. Am. , vol.120 , pp. 4007-4018
    • Brungart, D.1    Chang, P.2    Simpson, B.3    Wang, D.4
  • 6
    • 33745604236
    • Stable signal recovery from incomplete and inaccurate measurements
    • 10.1002/cpa.20124
    • Candes, E. J., Romberg, J., and Tao, T. (2006). " Stable signal recovery from incomplete and inaccurate measurements," Commun. Pure Appl. Math. 59, 1207-1223. 10.1002/cpa.20124
    • (2006) Commun. Pure Appl. Math. , vol.59 , pp. 1207-1223
    • Candes, E.J.1    Romberg, J.2    Tao, T.3
  • 7
    • 79954508213
    • Improvement of intelligibility of ideal binary-masked noisy speech by adding background noise
    • 10.1121/1.3559707
    • Cao, S., Li, L., and Wu, X. (2011). " Improvement of intelligibility of ideal binary-masked noisy speech by adding background noise," J. Acoust. Soc. Am. 129, 2227-2236. 10.1121/1.3559707
    • (2011) J. Acoust. Soc. Am. , vol.129 , pp. 2227-2236
    • Cao, S.1    Li, L.2    Wu, X.3
  • 9
    • 56349098310
    • Algorithms for orthogonal nonnegative matrix factorization
    • Choi, S. (2008). " Algorithms for orthogonal nonnegative matrix factorization," in Proceedings IJCNN, pp. 1828-1832.
    • (2008) Proceedings IJCNN , pp. 1828-1832
    • Choi, S.1
  • 11
    • 33645712892
    • Compressed sensing
    • 10.1109/TIT.2006.871582
    • Donoho, D. L. (2006). " Compressed sensing," IEEE Trans. Inf. Theory 52, 1289-1306. 10.1109/TIT.2006.871582
    • (2006) IEEE Trans. Inf. Theory , vol.52 , pp. 1289-1306
    • Donoho, D.L.1
  • 14
    • 33751379736
    • Image denoising via sparse and redundant representations over learned dictionaries
    • 10.1109/TIP.2006.881969
    • Elad, M., and Aharon, M. (2006b). " Image denoising via sparse and redundant representations over learned dictionaries," IEEE Trans. Image Proc. 15, 3736-3745. 10.1109/TIP.2006.881969
    • (2006) IEEE Trans. Image Proc. , vol.15 , pp. 3736-3745
    • Elad, M.1    Aharon, M.2
  • 15
    • 70349196731
    • Using sparse representations for missing data imputation in noise robust speech recognition
    • Gemmeke, J., and Cranen, B. (2008). " Using sparse representations for missing data imputation in noise robust speech recognition," in Proceedings of EUSIPCO, pp. 1-5.
    • (2008) Proceedings of EUSIPCO , pp. 1-5
    • Gemmeke, J.1    Cranen, B.2
  • 16
    • 77949695902
    • Compressive sensing for missing data imputation in noise robust speech recognition
    • 10.1109/JSTSP.2009.2039171
    • Gemmeke, J., Van Hamme, H., Cranen, B., and Boves, L. (2010). " Compressive sensing for missing data imputation in noise robust speech recognition," IEEE J. Sel. Top. Signal Process. 4, 272-287. 10.1109/JSTSP.2009.2039171
    • (2010) IEEE J. Sel. Top. Signal Process. , vol.4 , pp. 272-287
    • Gemmeke, J.1    Van Hamme, H.2    Cranen, B.3    Boves, L.4
  • 18
    • 84863733079
    • Using sparse representations for exemplar based continuous digit recognition
    • Gemmeke, J. F., ten Bosch, L., Boves, L., and Cranen, B. (2009). " Using sparse representations for exemplar based continuous digit recognition," in Proceedings of EUSIPCO, pp. 1755-1759.
    • (2009) Proceedings of EUSIPCO , pp. 1755-1759
    • Gemmeke, J.F.1    Ten Bosch, L.2    Boves, L.3    Cranen, B.4
  • 19
    • 79960657803
    • Exemplar-based sparse representations for noise robust automatic speech recognition
    • 10.1109/TASL.2011.2112350
    • Gemmeke, J. F., Virtanen, T., and Hurmalainen, A. (2011). " Exemplar-based sparse representations for noise robust automatic speech recognition," IEEE Trans. Audio, Speech, Lang. Process. 19, 2067-2080. 10.1109/TASL.2011.2112350
    • (2011) IEEE Trans. Audio, Speech, Lang. Process. , vol.19 , pp. 2067-2080
    • Gemmeke, J.F.1    Virtanen, T.2    Hurmalainen, A.3
  • 20
    • 84905701689
    • Last viewed 5/30/13
    • Grindlay, G. (2010). " NMFLib," Available: http://code.google.com/p/nmflib/ (Last viewed 5/30/13).
    • (2010) NMFLib
    • Grindlay, G.1
  • 21
    • 84885412715
    • An algorithm to improve speech recognition in noise for hearing-impaired listeners
    • 10.1121/1.4820893
    • Healy, E. W., Yoho, S. E., Wang, Y., and Wang, D. L. (2013). " An algorithm to improve speech recognition in noise for hearing-impaired listeners," J. Acoust. Soc. Am. 134, 3029-3038. 10.1121/1.4820893
    • (2013) J. Acoust. Soc. Am. , vol.134 , pp. 3029-3038
    • Healy, E.W.1    Yoho, S.E.2    Wang, Y.3    Wang, D.L.4
  • 23
    • 70349093614
    • An algorithm that improves speech intelligibility in noise for normal-hearing listeners
    • 10.1121/1.3184603
    • Kim, G., Lu, Y., Hu, Y., and Loizou, P. (2009). " An algorithm that improves speech intelligibility in noise for normal-hearing listeners," J. Acoust. Soc. Am. 126, 1486-1494. 10.1121/1.3184603
    • (2009) J. Acoust. Soc. Am. , vol.126 , pp. 1486-1494
    • Kim, G.1    Lu, Y.2    Hu, Y.3    Loizou, P.4
  • 24
    • 0033592606
    • Learning the parts of objects by non-negative matrix factorization
    • 10.1038/44565
    • Lee, D., and Seung, H. S. (1999). " Learning the parts of objects by non-negative matrix factorization," Nature 401, 788-791. 10.1038/44565
    • (1999) Nature , vol.401 , pp. 788-791
    • Lee, D.1    Seung, H.S.2
  • 25
    • 40749125179
    • Factors influencing intelligibility of ideal binary-masked speech: Implications for noise reduction
    • 10.1121/1.2832617
    • Li, N., and Loizou, P. (2008). " Factors influencing intelligibility of ideal binary-masked speech: Implications for noise reduction," J. Acoust. Soc. Am. 123, 1673-1682. 10.1121/1.2832617
    • (2008) J. Acoust. Soc. Am. , vol.123 , pp. 1673-1682
    • Li, N.1    Loizou, P.2
  • 26
    • 0018918171
    • An algorithm for vector quantizer design
    • 10.1109/TCOM.1980.1094577
    • Linde, Y., Buzo, A., and Gray, R. M. (1980). " An algorithm for vector quantizer design," IEEE Trans. Commun. 28, 84-95. 10.1109/TCOM.1980.1094577
    • (1980) IEEE Trans. Commun. , vol.28 , pp. 84-95
    • Linde, Y.1    Buzo, A.2    Gray, R.M.3
  • 27
    • 51449112795
    • Temporal smoothing of spectral masks in the cepstral domain for speech separation
    • Madhu, N., Breithaupt, C., and Martin, R. (2008). " Temporal smoothing of spectral masks in the cepstral domain for speech separation," in Proceedings of ICASSP, pp. 45-48.
    • (2008) Proceedings of ICASSP , pp. 45-48
    • Madhu, N.1    Breithaupt, C.2    Martin, R.3
  • 29
    • 76749107542
    • Online learning for matrix factorization and sparse coding
    • Mairal, J., Bach, F., Ponce, J., and Sapiro, G. (2010). " Online learning for matrix factorization and sparse coding," J. Mach. Learn. Res. 11, 19-60.
    • (2010) J. Mach. Learn. Res. , vol.11 , pp. 19-60
    • Mairal, J.1    Bach, F.2    Ponce, J.3    Sapiro, G.4
  • 30
    • 39149089704
    • Sparse representation for color image restoration
    • 10.1109/TIP.2007.911828
    • Mairal, J., Elad, M., and Sapiro, G. (2008). " Sparse representation for color image restoration," IEEE Trans. Image Process. 17, 53-69. 10.1109/TIP.2007.911828
    • (2008) IEEE Trans. Image Process. , vol.17 , pp. 53-69
    • Mairal, J.1    Elad, M.2    Sapiro, G.3
  • 34
    • 34250023466
    • Monaural speech segregation based on fusion of source-driven with model-driven techniques
    • 10.1016/j.specom.2007.04.007
    • Radfar, M. H., Dansereau, R. M., and Sayadiyan, A. (2007). " Monaural speech segregation based on fusion of source-driven with model-driven techniques," Speech Commun. 49, 464-476. 10.1016/j.specom.2007.04.007
    • (2007) Speech Commun. , vol.49 , pp. 464-476
    • Radfar, M.H.1    Dansereau, R.M.2    Sayadiyan, A.3
  • 35
    • 4644336054
    • Reconstruction of missing features for robust speech recognition
    • 10.1016/j.specom.2004.03.007
    • Raj, B., Seltzer, M. L., and Stern, R. M. (2004). " Reconstruction of missing features for robust speech recognition," Speech Commun. 43, 275-296. 10.1016/j.specom.2004.03.007
    • (2004) Speech Commun. , vol.43 , pp. 275-296
    • Raj, B.1    Seltzer, M.L.2    Stern, R.M.3
  • 36
    • 79959818117
    • Non-negative matrix factorization based compensation of music for automatic speech recognition
    • Raj, B., Virtanen, T., Chaudhuri, S., and Singh, R. (2010). " Non-negative matrix factorization based compensation of music for automatic speech recognition," in Proceedings of Interspeech, pp. 717-720.
    • (2010) Proceedings of Interspeech , pp. 717-720
    • Raj, B.1    Virtanen, T.2    Chaudhuri, S.3    Singh, R.4
  • 41
    • 84898964201
    • Algorithms for non-negative matrix factorization
    • Seung, H. S., and Lee, D. (2001). " Algorithms for non-negative matrix factorization," Adv. Neural Inf. Process. Syst. 13, 556-562.
    • (2001) Adv. Neural Inf. Process. Syst. , vol.13 , pp. 556-562
    • Seung, H.S.1    Lee, D.2
  • 42
    • 34547511508
    • Sparse overcomplete decomposition for single channel speaker separation
    • Shashanka, M. V. S., Raj, B., and Smaragdis, P. (2007). " Sparse overcomplete decomposition for single channel speaker separation," in Proceedings of ICASSP, pp. 641-644.
    • (2007) Proceedings of ICASSP , pp. 641-644
    • Shashanka, M.V.S.1    Raj, B.2    Smaragdis, P.3
  • 43
    • 35048843291
    • Non negative matrix factor deconvolution: Extraction of multiple sound sources from monophonic inputs
    • Smaragdis, P. (2004). " Non negative matrix factor deconvolution: extraction of multiple sound sources from monophonic inputs," Independent Component Analysis and Blind Signal Separation, pp. 494-499.
    • (2004) Independent Component Analysis and Blind Signal Separation , pp. 494-499
    • Smaragdis, P.1
  • 44
    • 38049021850
    • Convolutive speech bases and their application to supervised speech separation
    • 10.1109/TASL.2006.876726
    • Smaragdis, P. (2007). " Convolutive speech bases and their application to supervised speech separation," IEEE Trans. Audio, Speech, Lang. Process. 15, 1-12. 10.1109/TASL.2006.876726
    • (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , pp. 1-12
    • Smaragdis, P.1
  • 45
    • 33750311718
    • Binary and ratio time-frequency masks for robust speech recognition
    • 10.1016/j.specom.2006.09.003
    • Srinivasan, S., Roman, N., and Wang, D. L. (2006). " Binary and ratio time-frequency masks for robust speech recognition," Speech Commun. 48, 1486-1501. 10.1016/j.specom.2006.09.003
    • (2006) Speech Commun. , vol.48 , pp. 1486-1501
    • Srinivasan, S.1    Roman, N.2    Wang, D.L.3
  • 46
    • 79960916745
    • An algorithm for intelligibility prediction of time frequency weighted noisy speech
    • 10.1109/TASL.2011.2114881
    • Taal, C. H., Hendriks, R. C., Heusdens, R., and Jensen, J. (2011). " An algorithm for intelligibility prediction of time frequency weighted noisy speech," IEEE Trans. Audio, Speech, Lang. Process. 19, 2125-2136. 10.1109/TASL.2011.2114881
    • (2011) IEEE Trans. Audio, Speech, Lang. Process. , vol.19 , pp. 2125-2136
    • Taal, C.H.1    Hendriks, R.C.2    Heusdens, R.3    Jensen, J.4
  • 47
    • 50249152311
    • Monaural sound source separation by nonnegative matrix factorization with temporal continuity and sparseness criteria
    • 10.1109/TASL.2006.885253
    • Virtanen, T. (2007). " Monaural sound source separation by nonnegative matrix factorization with temporal continuity and sparseness criteria," IEEE Trans. Audio, Speech, Lang. Process. 15, 1066-1074. 10.1109/TASL.2006.885253
    • (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , pp. 1066-1074
    • Virtanen, T.1
  • 48
    • 84892233308
    • On ideal binary mask as the computational goal of auditory scene analysis
    • edited by P. Divenyi (Kluwer Academic, Norwell, MA)
    • Wang, D. L. (2005). " On ideal binary mask as the computational goal of auditory scene analysis," in Speech Separation by Humans and Machines, edited by P. Divenyi (Kluwer Academic, Norwell, MA), pp. 181-197.
    • (2005) Speech Separation by Humans and Machines , pp. 181-197
    • Wang, D.L.1
  • 49
    • 56249144201
    • Time-frequency masking for speech separation and its potential for hearing aid design
    • 10.1177/1084713808326455
    • Wang, D. L. (2008). " Time-frequency masking for speech separation and its potential for hearing aid design," Trends Amplif. 12, 332-353. 10.1177/1084713808326455
    • (2008) Trends Amplif. , vol.12 , pp. 332-353
    • Wang, D.L.1
  • 51
    • 64649103540
    • Speech intelligibility in background noise with ideal binary time-frequency masking
    • 10.1121/1.3083233
    • Wang, D. L., Kjems, U., Pedersen, M. S., Boldt, J. B., and Lunner, T. (2009). " Speech intelligibility in background noise with ideal binary time-frequency masking," J. Acoust. Soc. Am. 125, 2336-2347. 10.1121/1.3083233
    • (2009) J. Acoust. Soc. Am. , vol.125 , pp. 2336-2347
    • Wang, D.L.1    Kjems, U.2    Pedersen, M.S.3    Boldt, J.B.4    Lunner, T.5
  • 52
    • 84870477511
    • Exploring monaural features for classification-based speech segregation
    • 10.1109/TASL.2012.2221459
    • Wang, Y., Han, K., and Wang, D. L. (2013). " Exploring monaural features for classification-based speech segregation," IEEE Trans. Audio, Speech, Lang. Process. 21, 270-279. 10.1109/TASL.2012.2221459
    • (2013) IEEE Trans. Audio, Speech, Lang. Process. , vol.21 , pp. 270-279
    • Wang, Y.1    Han, K.2    Wang, D.L.3
  • 53
    • 84875678689
    • Towards scaling up classification-based speech separation
    • 10.1109/TASL.2013.2250961
    • Wang, Y., and Wang, D. L. (2013). " Towards scaling up classification-based speech separation," IEEE Trans. Audio, Speech, Lang. Process. 21, 1381-1390. 10.1109/TASL.2013.2250961
    • (2013) IEEE Trans. Audio, Speech, Lang. Process. , vol.21 , pp. 1381-1390
    • Wang, Y.1    Wang, D.L.2
  • 54
    • 51449092704
    • Speech denoising using nonnegative matrix factorization with priors
    • Wilson, K., Raj, B., Smaragdis, P., and Divakaran, A. (2008). " Speech denoising using nonnegative matrix factorization with priors," in Proceedings of ICASSP, pp. 4029-4032.
    • (2008) Proceedings of ICASSP , pp. 4029-4032
    • Wilson, K.1    Raj, B.2    Smaragdis, P.3    Divakaran, A.4
  • 55
    • 84859024513
    • CASA-based robust speaker identification
    • 10.1109/TASL.2012.2186803
    • Zhao, X., Shao, Y., and Wang, D. L. (2012). " CASA-based robust speaker identification," IEEE Trans. Audio, Speech, Lang. Process. 20, 1608-1616. 10.1109/TASL.2012.2186803
    • (2012) IEEE Trans. Audio, Speech, Lang. Process. , vol.20 , pp. 1608-1616
    • Zhao, X.1    Shao, Y.2    Wang, D.L.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.