SCOPUS Information Retrieval Platform

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Volume , Issue , 2014, Pages 1764-1768

Synthesized stereo mapping via deep neural networks for noisy speech recognition

(3) Du, Jun (a); Dai, Li Rong (a); Huo, Qiang (b)

(a) UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA (China)

(b) MICROSOFT RESEARCH (United States)

Author keywords

deep neural network; HMM based speech synthesis; joint Gaussian mixture model; noisy speech recognition

Indexed keywords

SIGNAL PROCESSING; SPEECH RECOGNITION; SPEECH SYNTHESIS; STOCHASTIC SYSTEMS;

DEEP NEURAL NETWORKS; EUROPEAN LANGUAGES; GAUSSIAN MIXTURE MODEL; HMM-BASED SPEECH SYNTHESIS; NOISY SPEECH RECOGNITION; REAL APPLICATIONS; STEREO MAPPING; STEREO-BASED STOCHASTIC MAPPINGS;

MAPPING;

EID: 84905284245 PISSN: 15206149 EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ICASSP.2014.6853901 Document Type: Conference Paper

Times cited : (12)

References (33)

1
- 0004319970
- Kluwer Academic Publishers
- A. Acero, Acoustic and Environment Robustness in Automatic Speech Recognition, Kluwer Academic Publishers, 1993.
- (1993) Acoustic and Environment Robustness in Automatic Speech Recognition
- Acero, A.¹

2
- 34547550766
- Stereo-based stochastic mapping for robust speech recognition
- M. Afify, X. Cui, and Y. Gao, "Stereo-based stochastic mapping for robust speech recognition," Proc. ICASSP, 2007, pp.377-380.
- (2007) Proc. ICASSP , pp. 377-380
- Afify, M.¹ Cui, X.² Gao, Y.³

3
- 68549125183
- Stereo-based stochastic mapping for robust speech recognition
- M. Afify, X. Cui, and Y. Gao, "Stereo-based stochastic mapping for robust speech recognition," IEEE Trans. on Audio, Speech and Language Processing, Vol. 17, No. 7, pp.1325-1334, 2009.
- (2009) IEEE Trans. on Audio, Speech and Language Processing , vol.17 , Issue.7 , pp. 1325-1334
- Afify, M.¹ Cui, X.² Gao, Y.³

4
- 84905267603
- Availability of Finnish SpeechDat-Car database for ETSI STQ WI008 front-end standardisation
- Aurora document AU/217/99, Nov
- Aurora document AU/217/99, "Availability of Finnish SpeechDat-Car database for ETSI STQ WI008 front-end standardisation," Nokia, Nov. 1999.
- (1999) Nokia

5
- 84905249653
- Spanish SDC-Aurora database for ETSI STQ Aurora WI008 advanced DSR front-end evaluation: Description and baseline results
- Aurora document AU/271/00, Nov
- Aurora document AU/271/00, "Spanish SDC-Aurora database for ETSI STQ Aurora WI008 advanced DSR front-end evaluation: description and baseline results," UPC, Nov. 2000.
- (2000) UPC

6
- 84905267604
- Description and baseline results for the subset of the SpeechDat-Car German database used for ETSI STQ Aurora WI008 advanced DSR front-end evaluation
- Aurora document AU/273/00, Dec
- Aurora document AU/273/00, "Description and baseline results for the subset of the SpeechDat-Car German database used for ETSI STQ Aurora WI008 advanced DSR front-end evaluation," Texas Instruments, Dec. 2001.
- (2001) Texas Instruments

7
- 11244303154
- Aurora document AU/378/01, Aalborg University, Jan
- Aurora document AU/378/01, "Danish SpeechDat-Car digits database for ETSI STQ-Aurora advanced DSR," Aalborg University, Jan. 2001.
- (2001) Danish SpeechDat-Car Digits Database for ETSI STQ-Aurora Advanced DSR

8
- 4544344467
- Multienvironment models based linear normalization for robust speech recognition in car conditions
- L. Buera, E. Lleida, A. Miguel, and A. Ortega, "Multienvironment models based linear normalization for robust speech recognition in car conditions," Proc. ICASSP, 2004, pp.1013-1016.
- (2004) Proc. ICASSP , pp. 1013-1016
- Buera, L.¹ Lleida, E.² Miguel, A.³ Ortega, A.⁴

9
- 33947644350
- Evaluation of the SPACE denoising algorithm on Aurora2
- C. Cerisara and K. Daoudi, "Evaluation of the SPACE denoising algorithm on Aurora2," Proc. ICASSP, 2006, pp.I-521-I-524.
- (2006) Proc. ICASSP , pp. I-521-I-524
- Cerisara, C.¹ Daoudi, K.²

10
- 84055222005
- Context-dependent pre-trained deep neural networks for large vocabulary speech recognition
- G. Dahl, D. Yu, L. Deng, and A. Acero, "Context-dependent pre-trained deep neural networks for large vocabulary speech recognition," IEEE Trans. on Audio, Speech and Language Processing, Vol. 20, No. 1, pp.30-42, 2012.
- (2012) IEEE Trans. on Audio, Speech and Language Processing , vol.20 , Issue.1 , pp. 30-42
- Dahl, G.¹ Yu, D.² Deng, L.³ Acero, A.⁴

11
- 20444414457
- Analysis and comparison of two speech feature extraction/compensation algorithms
- L. Deng, J. Wu, J. Droppo, and A. Acero, "Analysis and comparison of two speech feature extraction/compensation algorithms," IEEE Signal Process. Lett., Vol. 12, No. 6, pp.477-480, 2005.
- (2005) IEEE Signal Process. Lett. , vol.12 , Issue.6 , pp. 477-480
- Deng, L.¹ Wu, J.² Droppo, J.³ Acero, A.⁴

12
- 85006734596
- Evaluation of the SPLICE algorithm on the Aurora2 database
- J. Droppo, L. Deng, and A. Acero, "Evaluation of the SPLICE algorithm on the Aurora2 database," Proc. EuroSpeech, 2001, pp.217-220.
- (2001) Proc. EuroSpeech , pp. 217-220
- Droppo, J.¹ Deng, L.² Acero, A.³

13
- 33745216251
- Maximum mutual information SPLICE transform for seen and unseen conditions
- J. Droppo and A. Acero, "Maximum mutual information SPLICE transform for seen and unseen conditions," Proc. EuroSpeech, 2005, pp.989-992.
- (2005) Proc. EuroSpeech , pp. 989-992
- Droppo, J.¹ Acero, A.²

14
- 78049390326
- HMM-based pseudo-clean speech synthesis for SPLICE algorithm
- J. Du, Y. Hu, L.-R. Dai, and R.-H. Wang, "HMM-based pseudo-clean speech synthesis for SPLICE algorithm," Proc. ICASSP, 2010, pp.4570-4573.
- (2010) Proc. ICASSP , pp. 4570-4573
- Du, J.¹ Hu, Y.² Dai, L.-R.³ Wang, R.-H.⁴

15
- 84878378712
- IVN-based joint training of GMM and HMMs using an improved VTS-based feature compensation for noisy speech recognition
- J. Du and Q. Huo, "IVN-based joint training of GMM and HMMs using an improved VTS-based feature compensation for noisy speech recognition," Proc. INTERSPEECH, 2012.
- (2012) Proc. INTERSPEECH
- Du, J.¹ Huo, Q.²

16
- 84874472370
- Synthesized stereo-based stochastic mapping with data selection for robust speech recognition
- J. Du and Q. Huo, "Synthesized stereo-based stochastic mapping with data selection for robust speech recognition," Proc. ISCSLP, 2012, pp.122-125.
- (2012) Proc. ISCSLP , pp. 122-125
- Du, J.¹ Huo, Q.²

17
- 0029288202
- Speech recognition in noisy environments: A survey
- Y. Gong, "Speech recognition in noisy environments: A survey," Speech Communication, Vol. 16, No. 3, pp.261-291, 1995.
- (1995) Speech Communication , vol.16 , Issue.3 , pp. 261-291
- Gong, Y.¹

18
- 33745805403
- A fast learning algorithm for deep belief nets
- G. Hinton, S. Osindero, and Y. Teh, "A fast learning algorithm for deep belief nets," Neural Computation, Vol. 18, pp.1527-1554, 2006.
- (2006) Neural Computation , vol.18 , pp. 1527-1554
- Hinton, G.¹ Osindero, S.² Teh, Y.³

19
- 33746600649
- Reducing the dimensionality of data with neural networks
- G. Hinton and R. Salakhutdinov, "Reducing the dimensionality of data with neural networks," Science, Vol. 313, No. 5786, pp.504-507, 2006.
- (2006) Science , vol.313 , Issue.5786 , pp. 504-507
- Hinton, G.¹ Salakhutdinov, R.²

20
- 78650474133
- UTML TR 2010-003, University of Toronto
- G. Hinton, "A practical guide to training restricted Boltzmann machines," UTML TR 2010-003, University of Toronto, 2010.
- (2010) A Practical Guide to Training Restricted Boltzmann Machines
- Hinton, G.¹

21
- 85032751458
- Deep neural networks for acoustic modeling in speech recognition
- G. Hinton, L. Deng, D. Yu, G. Dahl, A. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. Sainath, and B. Kingsbury, "Deep neural networks for acoustic modeling in speech recognition," IEEE Signal Processing Magazine, Vol. 29, No. 6, pp.82-97, 2012.
- (2012) IEEE Signal Processing Magazine , vol.29 , Issue.6 , pp. 82-97
- Hinton, G.¹ Deng, L.² Yu, D.³ Dahl, G.⁴ Mohamed, A.⁵ Jaitly, N.⁶ Senior, A.⁷ Vanhoucke, V.⁸ Nguyen, P.⁹ Sainath, T.¹⁰ Kingsbury, B.¹¹

22
- 34547526633
- A maximum likelihood training approach to irrelevant variability compensation based on piecewise linear transformations
- Q. Huo and D.-L. Zhu, "A maximum likelihood training approach to irrelevant variability compensation based on piecewise linear transformations," Proc. ICSLP, 2006, pp.1129-1132.
- (2006) Proc. ICSLP , pp. 1129-1132
- Huo, Q.¹ Zhu, D.-L.²

23
- 84878409063
- Recurrent neural networks for noise reduction in robust ASR
- A. L. Maas, Q. V. Le, T. M. O'Neil, O. Vinyals, P. Nguyen, and A. Y. Ng, "Recurrent neural networks for noise reduction in robust ASR," Proc. INTERSPEECH, 2012.
- (2012) Proc. INTERSPEECH
- Maas, A.L.¹ Le, Q.V.² O'Neil, T.M.³ Vinyals, O.⁴ Nguyen, P.⁵ Ng, A.Y.⁶

24
- 65549153550
- Ph.D. thesis, Carnegie Mellon University
- P. J. Moreno, Speech recognition in noisy environments, Ph.D. thesis, Carnegie Mellon University, 1996.
- (1996) Speech Recognition in Noisy Environments
- Moreno, P.J.¹

25
- 33646788786
- fMPE: Discriminatively trained features for speech recognition
- D. Povey, B. Kingsbury, L. Mangu, G. Saon, H. Soltau, and G. Zweig, "fMPE: Discriminatively trained features for speech recognition," Proc. ICASSP, 2005, pp.961-964.
- (2005) Proc. ICASSP , pp. 961-964
- Povey, D.¹ Kingsbury, B.² Mangu, L.³ Saon, G.⁴ Soltau, H.⁵ Zweig, G.⁶

26
- 84865801985
- Conversational speech transcription using context-dependent deep neural networks
- F. Seide, Gang Li, and Dong Yu, "Conversational speech transcription using context-dependent deep neural networks," Proc. INTERSPEECH, 2011, pp.437-440.
- (2011) Proc. INTERSPEECH , pp. 437-440
- Seide, F.¹ Li, G.² Yu, D.³

27
- 84890492030
- An investigation of deep neural networks for noise robust speech recognition
- M. Seltzer, D. Yu, and Y.-Q. Wang, "An investigation of deep neural networks for noise robust speech recognition," Proc. ICASSP, 2013, pp.7398-7402.
- (2013) Proc. ICASSP , pp. 7398-7402
- Seltzer, M.¹ Yu, D.² Wang, Y.-Q.³

28
- 0033708106
- Speech parameter generation algorithms for HMM-based speech synthesis
- K. Tokuda, T. Yoshimura, T. Masuko, T. Kobayashi, and T. Kitamura, "Speech parameter generation algorithms for HMM-based speech synthesis," Proc. ICASSP, 2000, pp.1315-1318.
- (2000) Proc. ICASSP , pp. 1315-1318
- Tokuda, K.¹ Yoshimura, T.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

29
- 44849090158
- An environment-compensated minimum classification error training approach based on stochastic vector mapping
- J. Wu and Q. Huo, "An environment-compensated minimum classification error training approach based on stochastic vector mapping," IEEE Trans. on Audio, Speech and Language Processing, Vol. 14, No. 6, pp.2147-2155, 2006.
- (2006) IEEE Trans. on Audio, Speech and Language Processing , vol.14 , Issue.6 , pp. 2147-2155
- Wu, J.¹ Huo, Q.²

30
- 34547546133
- Word graph based feature enhancement for noisy speech recognition
- Z.-J. Yan, F. K. Soong, and R.-H. Wang, "Word graph based feature enhancement for noisy speech recognition," Proc. ICASSP, 2007, pp.373-376.
- (2007) Proc. ICASSP , pp. 373-376
- Yan, Z.-J.¹ Soong, F.K.² Wang, R.-H.³

31
- 84905285644
- An experimental study on speech enhancement based on deep neural networks
- Y. Xu, J. Du, L.-R. Dai, and C.-H. Lee, "An experimental study on speech enhancement based on deep neural networks," accepted by Signal Processing Letters.
- Signal Processing Letters
- Xu, Y.¹ Du, J.² Dai, L.-R.³ Lee, C.-H.⁴

32
- 84905285642
- S. Young et al., The HTK Book (for HTK v3.4), 2006.
- (2006) The HTK Book (for HTK v3.4)
- Young, S.¹

33
- 85133720638
- The HMM-based speech synthesis system (HTS) version 2.0
- H. Zen, T. Nose, J. Yamagishi, S. Sako, T. Masuko, A. W. Black, and K. Tokuda, "The HMM-based speech synthesis system (HTS) version 2.0," ISCA Workshop on Speech Synthesis, 2007, pp.294-299.
- (2007) ISCA Workshop on Speech Synthesis , pp. 294-299
- Zen, H.¹ Nose, T.² Yamagishi, J.³ Sako, S.⁴ Masuko, T.⁵ Black, A.W.⁶ Tokuda, K.⁷

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.