Volume 2015, Issue 1, 2015, Pages 1-14

Noisy training for deep neural networks in speech recognition

Author keywords

Deep neural network; Noise injection; Speech recognition

Indexed keywords

COMPLEX NETWORKS; SPEECH;

EID: 84922326458     PISSN: 16874714     EISSN: 16874722     Source Type: Journal    
DOI: 10.1186/s13636-014-0047-0     Document Type: Article
Times cited : (122)

References (37)
  • 1
    • 84903724014
    • Deep learning: methods and applications
    • L Deng, D Yu, Deep learning: methods and applications. Foundations Trends Signal Process. 7, 197–387 (2014).
    • (2014) Foundations Trends Signal Process , vol.7 , pp. 197-387
    • Deng, L.1    Yu, D.2
  • 3
    • 0033709098
    • Tandem connectionist feature extraction for conventional HMM systems
    • H Hermansky, DPW Ellis, S Sharma, in Proc. of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Tandem connectionist feature extraction for conventional HMM systems (Istanbul, Turkey, 9 June 2000), pp. 1635–1638.
    • (2000) Proc. of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) , pp. 1635-1638
  • 4
    • 80051616844
    • Large vocabulary continuous speech recognition with context-dependent DBN-HMMs
    • GE Dahl, D Yu, L Deng, A Acero, in Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Large vocabulary continuous speech recognition with context-dependent DBN-HMMs (Prague, Czech Republic, 22 May 2011), pp. 4688–4691.
    • (2011) Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 4688-4691
  • 6
    • 84922303373
    • Deep belief networks for phone recognition
    • A Mohamed, G Dahl, G Hinton, in Proc. of Neural Information Processing Systems (NIPS) Workshop Deep Learning for Speech Recognition and Related Applications, Deep belief networks for phone recognition (Vancouver, BC, Canada, 7 December 2009).
    • (2009) Proc. of Neural Information Processing Systems (NIPS) Workshop Deep Learning for Speech Recognition and Related Applications
    • Mohamed, A.1    Dahl, G.2    Hinton, G.3
  • 7
    • 84055222005
    • Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition
    • GE Dahl, D Yu, L Deng, A Acero, Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition. IEEE Trans. Audio Speech Lang. Process. 20(1), 30–42 (2012).
    • (2012) IEEE Trans. Audio Speech Lang. Process , vol.20 , Issue.1 , pp. 30-42
    • Dahl, G.E.1    Yu, D.2    Deng, L.3    Acero, A.4
  • 8
    • 84922303372
    • Roles of pre-training and fine-tuning in context-dependent DBN-HMMs for real-world speech recognition
    • D Yu, L Deng, G Dahl, in Proc. of NIPS Workshop on Deep Learning and Unsupervised Feature Learning, Roles of pre-training and fine-tuning in context-dependent DBN-HMMs for real-world speech recognition (Vancouver, BC, Canada, 6 December 2010).
    • (2010) Proc. of NIPS Workshop on Deep Learning and Unsupervised Feature Learning
  • 9
    • 84922332279
    • Application of pretrained deep neural networks to large vocabulary speech recognition
    • N Jaitly, P Nguyen, AW Senior, V Vanhoucke, in Proc. of Interspeech, Application of pretrained deep neural networks to large vocabulary speech recognition (Portland, Oregon, USA, 9–13 September 2012), pp. 2578–2581.
    • (2012) Proc. of Interspeech , pp. 2578-2581
    • Jaitly, N.1    Nguyen, P.2    Senior, A.W.3    Vanhoucke, V.4
  • 10
    • 84858972572
    • Making deep belief networks effective for large vocabulary continuous speech recognition
    • TN Sainath, B Kingsbury, B Ramabhadran, P Fousek, P Novak, A-r Mohamed, in Proc. of IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), Making deep belief networks effective for large vocabulary continuous speech recognition (Hawaii, USA, 11 December 2011), pp. 30–35.
    • (2011) Proc. of IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU) , pp. 30-35
    • Sainath, T.N.1    Kingsbury, B.2    Ramabhadran, B.3    Fousek, P.4    Novak, P.5    Mohamed, A.-r.6
  • 11
    • 84886829539
    • Optimization techniques to improve training speed of deep belief networks for large speech tasks
    • TN Sainath, B Kingsbury, H Soltau, B Ramabhadran, Optimization techniques to improve training speed of deep belief networks for large speech tasks. IEEE Trans. Audio Speech Lang. Process. 21(1), 2267–2276 (2013).
    • (2013) IEEE Trans. Audio Speech Lang. Process , vol.21 , Issue.1 , pp. 2267-2276
    • Sainath, T.N.1    Kingsbury, B.2    Soltau, H.3    Ramabhadran, B.4
  • 12
    • 84865801985
    • Conversational speech transcription using context-dependent deep neural networks
    • F Seide, G Li, D Yu, in Proc. of Interspeech, Conversational speech transcription using context-dependent deep neural networks (Florence, Italy, 15 August 2011), pp. 437–440.
    • (2011) Proc. of Interspeech , pp. 437-440
    • Seide, F.1    Li, G.2    Yu, D.3
  • 13
    • 84858976070
    • Feature engineering in context-dependent deep neural networks for conversational speech transcription
    • F Seide, G Li, X Chen, D Yu, in Proc. of IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), Feature engineering in context-dependent deep neural networks for conversational speech transcription (Waikoloa, HI, USA, 11 December 2011), pp. 24–29.
    • (2011) Proc. of IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU) , pp. 24-29
    • Seide, F.1    Li, G.2    Chen, X.3    Yu, D.4
  • 14
    • 80051644173
    • Comparing multilayer perceptron to deep belief network tandem features for robust ASR
    • O Vinyals, SV Ravuri, in Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Comparing multilayer perceptron to deep belief network tandem features for robust ASR (Prague, Czech Republic, 22 May 2011), pp. 4596–4599.
    • (2011) Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 4596-4599
  • 15
    • 84865785753
    • Improved bottleneck features using pretrained deep neural networks
    • D Yu, ML Seltzer, in Proc. of Interspeech, Improved bottleneck features using pretrained deep neural networks (Florence, Italy, 15 August 2011), pp. 237–240.
    • (2011) Proc. of Interspeech , pp. 237-240
    • Yu, D.1    Seltzer, M.L.2
  • 16
    • 84890537527
    • Multi-level adaptive networks in tandem and hybrid ASR systems
    • P Bell, P Swietojanski, S Renals, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Multi-level adaptive networks in tandem and hybrid ASR systems (Vancouver, BC, Canada, 26 May 2013), pp. 6975–6979.
    • (2013) Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 6975-6979
  • 17
    • 51449103447
    • Optimizing bottle-neck features for LVCSR
    • F Grezl, P Fousek, in Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Optimizing bottle-neck features for LVCSR (Las Vegas, USA, 4 April 2008), pp. 4729–4732.
    • (2008) Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 4729-4732
  • 18
    • 84887376692
    • Cross-lingual automatic speech recognition using tandem features
    • P Lal, S King, Cross-lingual automatic speech recognition using tandem features. IEEE Trans. Audio Speech Lang. Process. 21(12), 2506–2515 (2011).
    • (2011) IEEE Trans. Audio Speech Lang. Process , vol.21 , Issue.12 , pp. 2506-2515
    • Lal, P.1    King, S.2
  • 19
    • 79959844505
    • Hierarchical bottle neck features for LVCSR
    • C Plahl, R Schlüter, H Ney, in Proc. of Interspeech, Hierarchical bottle neck features for LVCSR (Makuhari, Japan, 26 September 2010), pp. 1197–1200.
    • (2010) Proc. of Interspeech , pp. 1197-1200
    • Plahl, C.1    Schlüter, R.2    Ney, H.3
  • 20
    • 84867593213
    • Auto-encoder bottleneck features using deep belief networks
    • TN Sainath, B Kingsbury, B Ramabhadran, in Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Auto-encoder bottleneck features using deep belief networks (Kyoto, Japan, 25 March 2012), pp. 4153–4156.
    • (2012) Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 4153-4156
  • 21
    • 84922329720
    • Context-dependent MLPs for LVCSR: tandem, hybrid or both?
    • Z Tüske, R Schlüter, H Ney, M Sundermeyer, in Proc. of Interspeech, Context-dependent MLPs for LVCSR: tandem, hybrid or both? (Portland, Oregon, USA, 9 September 2012), pp. 18–21.
    • (2012) Proc. of Interspeech , pp. 18-21
    • Tüske, Z.1    Schlüter, R.2    Ney, H.3    Sundermeyer, M.4
  • 22
    • 84893708321
    • Impact of deep MLP architecture on different acoustic modeling techniques for under-resourced speech recognition
    • D Imseng, P Motlicek, PN Garner, H Bourlard, in Proc. of IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), Impact of deep MLP architecture on different acoustic modeling techniques for under-resourced speech recognition (Olomouc, Czech Republic, 8 December 2013), pp. 332–337.
    • (2013) Proc. of IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU) , pp. 332-337
    • Imseng, D.1    Motlicek, P.2    Garner, P.N.3    Bourlard, H.4
  • 23
    • 84906236884
    • Bottleneck features based on gammatone frequency cepstral coefficients
    • J Qi, D Wang, J Xu, J Tejedor, in Proc. of Interspeech, Bottleneck features based on gammatone frequency cepstral coefficients (Lyon, France, 25 August 2013), pp. 1751–1755.
    • (2013) Proc. of Interspeech , pp. 1751-1755
    • Qi, J.1    Wang, D.2    Xu, J.3    Tejedor, J.4
  • 24
    • 84922303367
    • Feature learning in deep neural networks - a study on speech recognition tasks
    • D Yu, ML Seltzer, J Li, J-T Huang, F Seide, in Proc. of International Conference on Learning Representations (ICLR), Feature learning in deep neural networks - a study on speech recognition tasks (Scottsdale, Arizona, USA, 2 May 2013).
    • (2013) Proc. of International Conference on Learning Representations (ICLR)
  • 25
    • 84890532503
    • B Li, KC Sim, in Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Noise adaptive front-end normalization based on vector Taylor series for deep neural networks in robust speech recognition (Vancouver, BC, Canada, 6 May 2013), pp. 7408–7412.
    • (2013) Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 7408-7412
  • 26
    • 84906272122
    • An investigation of spectral restoration algorithms for deep neural networks based noise robust speech recognition
    • B Li, Y Tsao, KC Sim, in Proc. of Interspeech, An investigation of spectral restoration algorithms for deep neural networks based noise robust speech recognition (Lyon, France, 25 August 2013), pp. 3002–3006.
    • (2013) Proc. of Interspeech , pp. 3002-3006
    • Li, B.1    Tsao, Y.2    Sim, K.C.3
  • 27
    • 84890492030
    • ML Seltzer, D Yu, Y Wang, in Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), An investigation of deep neural networks for noise robust speech recognition (Vancouver, BC, Canada, 6 May 2013), pp. 7398–7402.
    • (2013) Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 7398-7402
  • 28
    • 56449089103
    • Extracting and composing robust features with denoising autoencoders
    • P Vincent, H Larochelle, Y Bengio, P-A Manzagol, in Proc. of the 25th International Conference on Machine Learning (ICML), Extracting and composing robust features with denoising autoencoders (Helsinki, Finland, 5 July 2008), pp. 1096–1103.
    • (2008) Proc. of the 25th International Conference on Machine Learning (ICML) , pp. 1096-1103
    • Vincent, P.1    Larochelle, H.2    Bengio, Y.3    Manzagol, P.-A.4
  • 29
    • 84878409063
    • Recurrent neural networks for noise reduction in robust ASR
    • AL Maas, QV Le, TM O’Neil, O Vinyals, P Nguyen, AY Ng, in Proc. of Interspeech, Recurrent neural networks for noise reduction in robust ASR (Portland, Oregon, USA, 9 September 2012), pp. 22–25.
    • (2012) Proc. of Interspeech , pp. 22-25
    • Maas, A.L.1    Le, Q.V.2    O’Neil, T.M.3    Vinyals, O.4    Nguyen, P.5    Ng, A.Y.6
  • 30
    • 85118318684
    • Noisy training for deep neural networks
    • X Meng, C Liu, Z Zhang, D Wang, in Proc. of ChinaSIP 2014, Noisy training for deep neural networks (Xi'an, China, 7 July 2014), pp. 16–20.
    • (2014) Proc. of ChinaSIP 2014 , pp. 16-20
    • Meng, X.1    Liu, C.2    Zhang, Z.3    Wang, D.4
  • 31
    • 2342565172
    • The effects of adding noise during backpropagation training on a generalization performance
    • G An, The effects of adding noise during backpropagation training on a generalization performance. Neural Comput. 8(3), 643–674 (1996).
    • (1996) Neural Comput , vol.8 , Issue.3 , pp. 643-674
    • An, G.1
  • 32
    • 0029289838
    • Comments on ‘noise injection into inputs in back propagation learning’
    • Y Grandvalet, S Canu, Comments on ‘noise injection into inputs in back propagation learning’. IEEE Trans. Syst. Man Cybernet. 25(4), 678–681 (1995).
    • (1995) IEEE Trans. Syst. Man Cybernet , vol.25 , Issue.4 , pp. 678-681
    • Grandvalet, Y.1    Canu, S.2
  • 33
    • 0001740650
    • Training with noise is equivalent to Tikhonov regularization
    • CM Bishop, Training with noise is equivalent to Tikhonov regularization. Neural Comput. 7(1), 108–116 (1995).
    • (1995) Neural Comput , vol.7 , Issue.1 , pp. 108-116
    • Bishop, C.M.1
  • 34
    • 0013230715
    • Noise injection: theoretical prospects
    • Y Grandvalet, S Canu, S Boucheron, Noise injection: theoretical prospects. Neural Comput. 9(5), 1093–1108 (1997).
    • (1997) Neural Comput , vol.9 , Issue.5 , pp. 1093-1108
    • Grandvalet, Y.1    Canu, S.2    Boucheron, S.3
  • 35
    • 0024124323
    • J Sietsma, RJF Dow, in Proc. of IEEE International Conference on Neural Networks, Neural net pruning-why and how (San Diego, California, USA, 24 July 1988), pp. 325–333.
    • (1988) Proc. of IEEE International Conference on Neural Networks , pp. 325-333
  • 36
    • 0026858102
    • Noise injection into inputs in back-propagation learning
    • K Matsuoka, Noise injection into inputs in back-propagation learning. IEEE Trans. Syst. Man Cybernet. 22(3), 436–440 (1992).
    • (1992) IEEE Trans. Syst. Man Cybernet , vol.22 , Issue.3 , pp. 436-440
    • Matsuoka, K.1
  • 37
    • 0029306953
    • Similarities of error regularization, sigmoid gain scaling, target smoothing, and training with jitter
    • R Reed, RJ Marks, Seho Oh, Similarities of error regularization, sigmoid gain scaling, target smoothing, and training with jitter. IEEE Trans. Neural Netw. 6(3), 529–538 (1995).
    • (1995) IEEE Trans. Neural Netw , vol.6 , Issue.3 , pp. 529-538
    • Reed, R.1    Marks, R.J.2    Oh, S.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.