Volume 138, Issue 3, 2015, Pages 1399-1407

Estimating nonnegative matrix model activations with deep neural networks to increase perceptual speech quality

Author keywords

[No Author keywords available]

Indexed keywords

CHEMICAL ACTIVATION; FACTORIZATION; MATRIX ALGEBRA; SEPARATION; SIGNAL TO NOISE RATIO; SOUND REPRODUCTION;

EID: 84941336645     PISSN: 00014966     EISSN: None     Source Type: Journal    
DOI: 10.1121/1.4928612     Document Type: Article
Times cited: 22

References (28)
  • 1
    • 34547645591
    • Effects of noise and distortion on speech quality judgments in normal-hearing and hearing-impaired listeners
    • Arehart, K. H., Kates, J. M., Anderson, M. C., and Harvey, L. O., Jr. (2007). "Effects of noise and distortion on speech quality judgments in normal-hearing and hearing-impaired listeners," J. Acoust. Soc. Am. 122, 1150-1164. 10.1121/1.2754061
    • (2007) J. Acoust. Soc. Am. , vol.122 , pp. 1150-1164
    • Arehart, K.H.1    Kates, J.M.2    Anderson, M.C.3    Harvey, L.O.4
  • 2
    • 80052250414
    • Adaptive subgradient methods for online learning and stochastic optimization
    • Duchi, J., Hazan, E., and Singer, Y. (2010). "Adaptive subgradient methods for online learning and stochastic optimization," J. Mach. Learn. Res. 12, 2121-2159.
    • (2010) J. Mach. Learn. Res. , vol.12 , pp. 2121-2159
    • Duchi, J.1    Hazan, E.2    Singer, Y.3
  • 3
    • 10944227316
    • Sparse coding and NMF
    • Eggert, J., and Korner, E. (2004). "Sparse coding and NMF," IEEE Conf. Neural Netw. 4, 2529-2533. 10.1109/IJCNN.2004.1381036
    • (2004) IEEE Conf. Neural Netw. , vol.4 , pp. 2529-2533
    • Eggert, J.1    Korner, E.2
  • 4
    • 63249085556
    • Nonnegative matrix factorization with the Itakura-Saito divergence: With application to music analysis
    • Févotte, C., Bertin, N., and Durrieu, J-L. (2009). "Nonnegative matrix factorization with the Itakura-Saito divergence: With application to music analysis," Neural Comput. 21, 793-830. 10.1162/neco.2008.04-08-771
    • (2009) Neural Comput. , vol.21 , pp. 793-830
    • Févotte, C.1    Bertin, N.2    Durrieu, J.-L.3
  • 5
    • 84905268759
    • Learning spectral mapping for speech dereverberation
    • Han, K., Wang, Y., and Wang, D. L. (2014). "Learning spectral mapping for speech dereverberation," in Proceedings of ICASSP, pp. 4661-4665.
    • (2014) Proceedings of ICASSP , pp. 4661-4665
    • Han, K.1    Wang, Y.2    Wang, D.L.3
  • 6
    • 84885412715
    • An algorithm to improve speech recognition in noise for hearing-impaired listeners
    • Healy, E. W., Yoho, S. E., Wang, Y., and Wang, D. L. (2013). "An algorithm to improve speech recognition in noise for hearing-impaired listeners," J. Acoust. Soc. Am. 134, 3029-3038. 10.1121/1.4820893
    • (2013) J. Acoust. Soc. Am. , vol.134 , pp. 3029-3038
    • Healy, E.W.1    Yoho, S.E.2    Wang, Y.3    Wang, D.L.4
  • 7
    • 0014568991
    • IEEE recommended practice for speech quality measurements
    • IEEE
    • IEEE (1969). "IEEE recommended practice for speech quality measurements," IEEE Trans. Audio Electroacoust. 17, 225-246. 10.1109/TAU.1969.1162058
    • (1969) IEEE Trans. Audio Electroacoust. , vol.17 , pp. 225-246
  • 9
    • 70349093614
    • An algorithm that improves speech intelligibility in noise for normal-hearing listeners
    • Kim, G., Lu, Y., Hu, Y., and Loizou, P. (2009). "An algorithm that improves speech intelligibility in noise for normal-hearing listeners," J. Acoust. Soc. Am. 126, 1486-1494. 10.1121/1.3184603
    • (2009) J. Acoust. Soc. Am. , vol.126 , pp. 1486-1494
    • Kim, G.1    Lu, Y.2    Hu, Y.3    Loizou, P.4
  • 10
    • 84919905473
    • Ideal time-frequency masking algorithms lead to different speech intelligibility and quality in normal-hearing and cochlear implant listeners
    • Koning, R., Madhu, N., and Wouters, J. (2015). "Ideal time-frequency masking algorithms lead to different speech intelligibility and quality in normal-hearing and cochlear implant listeners," IEEE Trans. Biomed. Eng. 62, 331-341. 10.1109/TBME.2014.2351854
    • (2015) IEEE Trans. Biomed. Eng. , vol.62 , pp. 331-341
    • Koning, R.1    Madhu, N.2    Wouters, J.3
  • 11
    • 0033592606
    • Learning the parts of objects by non-negative matrix factorization
    • Lee, D., and Seung, H. S. (1999). "Learning the parts of objects by non-negative matrix factorization," Nature 401, 788-791. 10.1038/44565
    • (1999) Nature , vol.401 , pp. 788-791
    • Lee, D.1    Seung, H.S.2
  • 12
    • 80051625972
    • A non-negative approach to semi-supervised separation of speech from noise with the use of temporal dynamics
    • Mysore, G. J., and Smaragdis, P. (2011). "A non-negative approach to semi-supervised separation of speech from noise with the use of temporal dynamics," in Proceedings of ICASSP, pp. 17-20.
    • (2011) Proceedings of ICASSP , pp. 17-20
    • Mysore, G.J.1    Smaragdis, P.2
  • 13
    • 84878576230
    • Non-negative hidden Markov modeling of audio with application to source separation
    • Mysore, G. J., Smaragdis, P., and Raj, B. (2010). "Non-negative hidden Markov modeling of audio with application to source separation," in Proceedings of LVA/ICA, pp. 1-8.
    • (2010) Proceedings of LVA/ICA , pp. 1-8
    • Mysore, G.J.1    Smaragdis, P.2    Raj, B.3
  • 15
    • 84890493989
    • Ideal ratio mask estimation using deep neural networks for robust speech recognition
    • Narayanan, A., and Wang, D. L. (2013). "Ideal ratio mask estimation using deep neural networks for robust speech recognition," in Proceedings of ICASSP, pp. 7092-7096.
    • (2013) Proceedings of ICASSP , pp. 7092-7096
    • Narayanan, A.1    Wang, D.L.2
  • 16
    • 84898964201
    • Algorithms for non-negative matrix factorization
    • Seung, H. S., and Lee, D. (2001). "Algorithms for non-negative matrix factorization," Adv. Neural Inf. Process. Syst. 13, 556-562.
    • (2001) Adv. Neural Inf. Process. Syst. , vol.13 , pp. 556-562
    • Seung, H.S.1    Lee, D.2
  • 17
    • 38049021850
    • Convolutive speech bases and their application to supervised speech separation
    • Smaragdis, P. (2007). "Convolutive speech bases and their application to supervised speech separation," IEEE Trans. Audio Speech Lang. Process. 15, 1-12. 10.1109/TASL.2006.876726
    • (2007) IEEE Trans. Audio Speech Lang. Process. , vol.15 , pp. 1-12
    • Smaragdis, P.1
  • 18
    • 79960916745
    • An algorithm for intelligibility prediction of time frequency weighted noisy speech
    • Taal, C. H., Hendriks, R. C., Heusdens, R., and Jensen, J. (2011). "An algorithm for intelligibility prediction of time frequency weighted noisy speech," IEEE Trans. Audio Speech Lang. Process. 19, 2125-2136. 10.1109/TASL.2011.2114881
    • (2011) IEEE Trans. Audio Speech Lang. Process. , vol.19 , pp. 2125-2136
    • Taal, C.H.1    Hendriks, R.C.2    Heusdens, R.3    Jensen, J.4
  • 19
    • 50249152311
    • Monaural sound source separation by nonnegative matrix factorization with temporal continuity and sparseness criteria
    • Virtanen, T. (2007). "Monaural sound source separation by nonnegative matrix factorization with temporal continuity and sparseness criteria," IEEE Trans. Audio Speech Lang. Process. 15, 1066-1074. 10.1109/TASL.2006.885253
    • (2007) IEEE Trans. Audio Speech Lang. Process. , vol.15 , pp. 1066-1074
    • Virtanen, T.1
  • 20
    • 84870477511
    • Exploring monaural features for classification-based speech segregation
    • Wang, Y., Han, K., and Wang, D. L. (2013). "Exploring monaural features for classification-based speech segregation," IEEE Trans. Audio Speech Lang. Process. 21, 270-279. 10.1109/TASL.2012.2221459
    • (2013) IEEE Trans. Audio Speech Lang. Process. , vol.21 , pp. 270-279
    • Wang, Y.1    Han, K.2    Wang, D.L.3
  • 21
    • 84875678689
    • Towards scaling up classification-based speech separation
    • Wang, Y., and Wang, D. L. (2013). "Towards scaling up classification-based speech separation," IEEE Trans. Audio Speech Lang. Process. 21, 1381-1390. 10.1109/TASL.2013.2250961
    • (2013) IEEE Trans. Audio Speech Lang. Process. , vol.21 , pp. 1381-1390
    • Wang, Y.1    Wang, D.L.2
  • 22
    • 84921740463
    • On training targets for supervised speech separation
    • Wang, Y., Narayanan, A., and Wang, D. L. (2014). "On training targets for supervised speech separation," IEEE Trans. Audio Speech Lang. Process. 22, 1849-1858. 10.1109/TASLP.2014.2352935
    • (2014) IEEE Trans. Audio Speech Lang. Process. , vol.22 , pp. 1849-1858
    • Wang, Y.1    Narayanan, A.2    Wang, D.L.3
  • 23
    • 84905280958
    • A two-stage approach for improving the perceptual quality of separated speech
    • Williamson, D. S., Wang, Y., and Wang, D. L. (2014a). "A two-stage approach for improving the perceptual quality of separated speech," in Proceedings of ICASSP, pp. 7084-7088.
    • (2014) Proceedings of ICASSP , pp. 7084-7088
    • Williamson, D.S.1    Wang, Y.2    Wang, D.L.3
  • 24
    • 84905693981
    • Reconstruction techniques for improving the perceptual quality of binary masked speech
    • Williamson, D. S., Wang, Y., and Wang, D. L. (2014b). "Reconstruction techniques for improving the perceptual quality of binary masked speech," J. Acoust. Soc. Am. 136, 892-902. 10.1121/1.4884759
    • (2014) J. Acoust. Soc. Am. , vol.136 , pp. 892-902
    • Williamson, D.S.1    Wang, Y.2    Wang, D.L.3
  • 25
    • 84941346133
    • Deep neural networks for estimating speech model activations
    • Williamson, D. S., Wang, Y., and Wang, D. L. (2015). "Deep neural networks for estimating speech model activations," in Proceedings of ICASSP, pp. 5113-5117.
    • (2015) Proceedings of ICASSP , pp. 5113-5117
    • Williamson, D.S.1    Wang, Y.2    Wang, D.L.3
  • 26
    • 51449092704
    • Speech denoising using nonnegative matrix factorization with priors
    • Wilson, K., Raj, B., Smaragdis, P., and Divakaran, A. (2008). "Speech denoising using nonnegative matrix factorization with priors," in Proceedings of ICASSP, pp. 4029-4032.
    • (2008) Proceedings of ICASSP , pp. 4029-4032
    • Wilson, K.1    Raj, B.2    Smaragdis, P.3    Divakaran, A.4
  • 27
    • 84889257121
    • An experimental study on speech enhancement based on deep neural networks
    • Xu, Y., Du, J., Dai, L., and Lee, C. (2014). "An experimental study on speech enhancement based on deep neural networks," IEEE Sign. Process. Lett. 21, 65-68. 10.1109/LSP.2013.2291240
    • (2014) IEEE Sign. Process. Lett. , vol.21 , pp. 65-68
    • Xu, Y.1    Du, J.2    Dai, L.3    Lee, C.4
  • 28
    • 84910097441
    • Boosted deep neural networks and multi-resolution cochleagram features for voice activity detection
    • Zhang, X.-L., and Wang, D. L. (2014). "Boosted deep neural networks and multi-resolution cochleagram features for voice activity detection," in Proceedings of INTERSPEECH, pp. 1534-1538.
    • (2014) Proceedings of INTERSPEECH , pp. 1534-1538
    • Zhang, X.-L.1    Wang, D.L.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.