SCOPUS Information Search Platform

Journal of the Acoustical Society of America

Volume 138, Issue 3, 2015, Pages 1399-1407

Estimating nonnegative matrix model activations with deep neural networks to increase perceptual speech quality

(3) Williamson, Donald S. (a); Wang, Yuxuan (a); Wang, Deliang (a)

(a) The Ohio State University (United States)

Author keywords

[No Author keywords available]

Indexed keywords

CHEMICAL ACTIVATION; FACTORIZATION; MATRIX ALGEBRA; SEPARATION; SIGNAL TO NOISE RATIO; SOUND REPRODUCTION;

DEEP NEURAL NETWORKS; LOW SIGNAL-TO-NOISE RATIO; NON-NEGATIVE MATRIX; NONNEGATIVE MATRIX FACTORIZATION; PERCEPTUAL EVALUATION OF SPEECH QUALITIES; PERCEPTUAL QUALITY; TIME-FREQUENCY MASKING; TIME-FREQUENCY REPRESENTATIONS;

SPEECH;

ADULT; ALGORITHM; BIOLOGICAL MODEL; FEMALE; HUMAN; MALE; NERVE CELL NETWORK; NOISE; PERCEPTION; PHYSIOLOGY; PROCEDURES; SIGNAL NOISE RATIO; SOUND DETECTION; SPEECH; SPEECH PERCEPTION; TRAFFIC NOISE; YOUNG ADULT;

ADULT; ALGORITHMS; FEMALE; HUMANS; MALE; MODELS, BIOLOGICAL; NERVE NET; NOISE; NOISE, TRANSPORTATION; PERCEPTUAL MASKING; SIGNAL-TO-NOISE RATIO; SOUND SPECTROGRAPHY; SPEECH ACOUSTICS; SPEECH PERCEPTION; YOUNG ADULT;

EID: 84941336645 PISSN: 00014966 EISSN: None Source Type: Journal
DOI: 10.1121/1.4928612 Document Type: Article

Times cited: 22

References (28)

1
- 34547645591
- Effects of noise and distortion on speech quality judgments in normal-hearing and hearing-impaired listeners
- Arehart, K. H., Kates, J. M., Anderson, M. C., and Harvey, L. O., Jr. (2007). "Effects of noise and distortion on speech quality judgments in normal-hearing and hearing-impaired listeners," J. Acoust. Soc. Am. 122, 1150-1164. 10.1121/1.2754061
- (2007) J. Acoust. Soc. Am. , vol.122 , pp. 1150-1164
- Arehart, K.H.¹ Kates, J.M.² Anderson, M.C.³ Harvey, L.O.⁴

2
- 80052250414
- Adaptive subgradient methods for online learning and stochastic optimization
- Duchi, J., Hazan, E., and Singer, Y. (2010). "Adaptive subgradient methods for online learning and stochastic optimization," J. Mach. Learn. Res. 12, 2121-2159.
- (2010) J. Mach. Learn. Res. , vol.12 , pp. 2121-2159
- Duchi, J.¹ Hazan, E.² Singer, Y.³

3
- 10944227316
- Sparse coding and NMF
- Eggert, J., and Korner, E. (2004). "Sparse coding and NMF," IEEE Conf. Neural Netw. 4, 2529-2533. 10.1109/IJCNN.2004.1381036
- (2004) IEEE Conf. Neural Netw. , vol.4 , pp. 2529-2533
- Eggert, J.¹ Korner, E.²

4
- 63249085556
- Nonnegative matrix factorization with the Itakura-Saito divergence: With application to music analysis
- Févotte, C., Bertin, N., and Durrieu, J-L. (2009). "Nonnegative matrix factorization with the Itakura-Saito divergence: With application to music analysis," Neural Comput. 21, 793-830. 10.1162/neco.2008.04-08-771
- (2009) Neural Comput. , vol.21 , pp. 793-830
- Févotte, C.¹ Bertin, N.² Durrieu, J.-L.³

5
- 84905268759
- Learning spectral mapping for speech dereverberation
- Han, K., Wang, Y., and Wang, D. L. (2014). "Learning spectral mapping for speech dereverberation," in Proceedings of ICASSP, pp. 4661-4665.
- (2014) Proceedings of ICASSP , pp. 4661-4665
- Han, K.¹ Wang, Y.² Wang, D.L.³

6
- 84885412715
- An algorithm to improve speech recognition in noise for hearing-impaired listeners
- Healy, E. W., Yoho, S. E., Wang, Y., and Wang, D. L. (2013). "An algorithm to improve speech recognition in noise for hearing-impaired listeners," J. Acoust. Soc. Am. 134, 3029-3038. 10.1121/1.4820893
- (2013) J. Acoust. Soc. Am. , vol.134 , pp. 3029-3038
- Healy, E.W.¹ Yoho, S.E.² Wang, Y.³ Wang, D.L.⁴

7
- 0014568991
- IEEE recommended practice for speech quality measurements
- IEEE
- IEEE (1969). "IEEE recommended practice for speech quality measurements," IEEE Trans. Audio Electroacoust. 17, 225-246. 10.1109/TAU.1969.1162058
- (1969) IEEE Trans. Audio Electroacoust. , vol.17 , pp. 225-246

8
- 0003639435
- ITU-R
- ITU-R (2001). "Perceptual evaluation of speech quality (PESQ), an objective method for end-to-end speech quality assessment of narrowband telephone networks and speech codecs," p. 862.
- (2001) Perceptual evaluation of speech quality (PESQ), an objective method for end-to-end speech quality assessment of narrowband telephone networks and speech codecs , pp. 862

9
- 70349093614
- An algorithm that improves speech intelligibility in noise for normal-hearing listeners
- Kim, G., Lu, Y., Hu, Y., and Loizou, P. (2009). "An algorithm that improves speech intelligibility in noise for normal-hearing listeners," J. Acoust. Soc. Am. 126, 1486-1494. 10.1121/1.3184603
- (2009) J. Acoust. Soc. Am. , vol.126 , pp. 1486-1494
- Kim, G.¹ Lu, Y.² Hu, Y.³ Loizou, P.⁴

10
- 84919905473
- Ideal time-frequency masking algorithms lead to different speech intelligibility and quality in normal-hearing and cochlear implant listeners
- Koning, R., Madhu, N., and Wouters, J. (2015). "Ideal time-frequency masking algorithms lead to different speech intelligibility and quality in normal-hearing and cochlear implant listeners," IEEE Trans. Biomed. Eng. 62, 331-341. 10.1109/TBME.2014.2351854
- (2015) IEEE Trans. Biomed. Eng. , vol.62 , pp. 331-341
- Koning, R.¹ Madhu, N.² Wouters, J.³

11
- 0033592606
- Learning the parts of objects by non-negative matrix factorization
- Lee, D., and Seung, H. S. (1999). "Learning the parts of objects by non-negative matrix factorization," Nature 401, 788-791. 10.1038/44565
- (1999) Nature , vol.401 , pp. 788-791
- Lee, D.¹ Seung, H.S.²

12
- 80051625972
- A non-negative approach to semi-supervised separation of speech from noise with the use of temporal dynamics
- Mysore, G. J., and Smaragdis, P. (2011). "A non-negative approach to semi-supervised separation of speech from noise with the use of temporal dynamics," in Proceedings of ICASSP, pp. 17-20.
- (2011) Proceedings of ICASSP , pp. 17-20
- Mysore, G.J.¹ Smaragdis, P.²

13
- 84878576230
- Non-negative hidden Markov modeling of audio with application to source separation
- Mysore, G. J., Smaragdis, P., and Raj, B. (2010). "Non-negative hidden Markov modeling of audio with application to source separation," in Proceedings of LVA/ICA, pp. 1-8.
- (2010) Proceedings of LVA/ICA , pp. 1-8
- Mysore, G.J.¹ Smaragdis, P.² Raj, B.³

14
- 77956509090
- Rectified linear units improve restricted Boltzmann machines
- Nair, V., and Hinton, G. E. (2010). "Rectified linear units improve restricted Boltzmann machines," in Proceedings of International Conference on Machine Learning, pp. 807-814.
- (2010) Proceedings of International Conference on Machine Learning , pp. 807-814
- Nair, V.¹ Hinton, G.E.²

15
- 84890493989
- Ideal ratio mask estimation using deep neural networks for robust speech recognition
- Narayanan, A., and Wang, D. L. (2013). "Ideal ratio mask estimation using deep neural networks for robust speech recognition," in Proceedings of ICASSP, pp. 7092-7096.
- (2013) Proceedings of ICASSP , pp. 7092-7096
- Narayanan, A.¹ Wang, D.L.²

16
- 84898964201
- Algorithms for non-negative matrix factorization
- Seung, H. S., and Lee, D. (2001). "Algorithms for non-negative matrix factorization," Adv. Neural Inf. Process. Syst. 13, 556-562.
- (2001) Adv. Neural Inf. Process. Syst. , vol.13 , pp. 556-562
- Seung, H.S.¹ Lee, D.²

17
- 38049021850
- Convolutive speech bases and their application to supervised speech separation
- Smaragdis, P. (2007). "Convolutive speech bases and their application to supervised speech separation," IEEE Trans. Audio Speech Lang. Process. 15, 1-12. 10.1109/TASL.2006.876726
- (2007) IEEE Trans. Audio Speech Lang. Process. , vol.15 , pp. 1-12
- Smaragdis, P.¹

18
- 79960916745
- An algorithm for intelligibility prediction of time frequency weighted noisy speech
- Taal, C. H., Hendriks, R. C., Heusdens, R., and Jensen, J. (2011). "An algorithm for intelligibility prediction of time frequency weighted noisy speech," IEEE Trans. Audio Speech Lang. Process. 19, 2125-2136. 10.1109/TASL.2011.2114881
- (2011) IEEE Trans. Audio Speech Lang. Process. , vol.19 , pp. 2125-2136
- Taal, C.H.¹ Hendriks, R.C.² Heusdens, R.³ Jensen, J.⁴

19
- 50249152311
- Monaural sound source separation by nonnegative matrix factorization with temporal continuity and sparseness criteria
- Virtanen, T. (2007). "Monaural sound source separation by nonnegative matrix factorization with temporal continuity and sparseness criteria," IEEE Trans. Audio Speech Lang. Process. 15, 1066-1074. 10.1109/TASL.2006.885253
- (2007) IEEE Trans. Audio Speech Lang. Process. , vol.15 , pp. 1066-1074
- Virtanen, T.¹

20
- 84870477511
- Exploring monaural features for classification-based speech segregation
- Wang, Y., Han, K., and Wang, D. L. (2013). "Exploring monaural features for classification-based speech segregation," IEEE Trans. Audio Speech Lang. Process. 21, 270-279. 10.1109/TASL.2012.2221459
- (2013) IEEE Trans. Audio Speech Lang. Process. , vol.21 , pp. 270-279
- Wang, Y.¹ Han, K.² Wang, D.L.³

21
- 84875678689
- Towards scaling up classification-based speech separation
- Wang, Y., and Wang, D. L. (2013). "Towards scaling up classification-based speech separation," IEEE Trans. Audio Speech Lang. Process. 21, 1381-1390. 10.1109/TASL.2013.2250961
- (2013) IEEE Trans. Audio Speech Lang. Process. , vol.21 , pp. 1381-1390
- Wang, Y.¹ Wang, D.L.²

22
- 84921740463
- On training targets for supervised speech separation
- Wang, Y., Narayanan, A., and Wang, D. L. (2014). "On training targets for supervised speech separation," IEEE Trans. Audio Speech Lang. Process. 22, 1849-1858. 10.1109/TASLP.2014.2352935
- (2014) IEEE Trans. Audio Speech Lang. Process. , vol.22 , pp. 1849-1858
- Wang, Y.¹ Narayanan, A.² Wang, D.L.³

23
- 84905280958
- A two-stage approach for improving the perceptual quality of separated speech
- Williamson, D. S., Wang, Y., and Wang, D. L. (2014a). "A two-stage approach for improving the perceptual quality of separated speech," in Proceedings of ICASSP, pp. 7084-7088.
- (2014) Proceedings of ICASSP , pp. 7084-7088
- Williamson, D.S.¹ Wang, Y.² Wang, D.L.³

24
- 84905693981
- Reconstruction techniques for improving the perceptual quality of binary masked speech
- Williamson, D. S., Wang, Y., and Wang, D. L. (2014b). "Reconstruction techniques for improving the perceptual quality of binary masked speech," J. Acoust. Soc. Am. 136, 892-902. 10.1121/1.4884759
- (2014) J. Acoust. Soc. Am. , vol.136 , pp. 892-902
- Williamson, D.S.¹ Wang, Y.² Wang, D.L.³

25
- 84941346133
- Deep neural networks for estimating speech model activations
- Williamson, D. S., Wang, Y., and Wang, D. L. (2015). "Deep neural networks for estimating speech model activations," in Proceedings of ICASSP, pp. 5113-5117.
- (2015) Proceedings of ICASSP , pp. 5113-5117
- Williamson, D.S.¹ Wang, Y.² Wang, D.L.³

26
- 51449092704
- Speech denoising using nonnegative matrix factorization with priors
- Wilson, K., Raj, B., Smaragdis, P., and Divakaran, A. (2008). "Speech denoising using nonnegative matrix factorization with priors," in Proceedings of ICASSP, pp. 4029-4032.
- (2008) Proceedings of ICASSP , pp. 4029-4032
- Wilson, K.¹ Raj, B.² Smaragdis, P.³ Divakaran, A.⁴

27
- 84889257121
- An experimental study on speech enhancement based on deep neural networks
- Xu, Y., Du, J., Dai, L., and Lee, C. (2014). "An experimental study on speech enhancement based on deep neural networks," IEEE Sign. Process. Lett. 21, 65-68. 10.1109/LSP.2013.2291240
- (2014) IEEE Sign. Process. Lett. , vol.21 , pp. 65-68
- Xu, Y.¹ Du, J.² Dai, L.³ Lee, C.⁴

28
- 84910097441
- Boosted deep neural networks and multi-resolution cochleagram features for voice activity detection
- Zhang, X.-L., and Wang, D. L. (2014). "Boosted deep neural networks and multi-resolution cochleagram features for voice activity detection," in Proceedings of INTERSPEECH, pp. 1534-1538.
- (2014) Proceedings of INTERSPEECH , pp. 1534-1538
- Zhang, X.-L.¹ Wang, D.L.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.