SCOPUS Information Search Platform

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Volume 2015-August, 2015, Pages 5818-5822

Unsupervised neural network based feature extraction using weak top-down constraints

(4) Kamper, Herman (a); Elsner, Micha (b); Jansen, Aren (c); Goldwater, Sharon (a)

a UNIVERSITY OF EDINBURGH (United Kingdom)

b OHIO STATE UNIVERSITY (United States)

c JOHNS HOPKINS UNIVERSITY (United States)

Author keywords

deep neural networks; top down constraints; Unsupervised feature extraction; zero resource speech processing

Indexed keywords

AUDIO SIGNAL PROCESSING; DYNAMIC PROGRAMMING; EXTRACTION; FEATURE EXTRACTION; SPEECH COMMUNICATION; SPEECH PROCESSING;

ACOUSTIC MODELLING; DISCRIMINATION TASKS; FEATURE EXTRACTOR; STANDARD COMPONENTS; STATE-OF-THE-ART SYSTEM; SUPERVISED SYSTEMS; TOPDOWN; UNSUPERVISED NEURAL NETWORKS;

DEEP NEURAL NETWORKS;

EID: 84946101387 PISSN: 15206149 EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ICASSP.2015.7179087 Document Type: Conference Paper

Times cited : (142)

References (26)

1
- 84055222005
- Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition
- G. E. Dahl, D. Yu, L. Deng, and A. Acero, "Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition," IEEE Trans. Audio, Speech, Language Process., vol. 20, no. 1, pp. 30-42, 2012.
- (2012) IEEE Trans. Audio, Speech, Language Process , vol.20 , Issue.1 , pp. 30-42
- Dahl, G.E.¹ Yu, D.² Deng, L.³ Acero, A.⁴

2
- 85032751458
- Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups
- G. Hinton, L. Deng, D. Yu, G. E. Dahl, A.-R. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. N. Sainath, and B. Kingsbury, "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups," IEEE Signal Process. Mag., vol. 29, no. 6, pp. 82-97, 2012.
- (2012) IEEE Signal Process. Mag. , vol.29 , Issue.6 , pp. 82-97
- Hinton, G.¹ Deng, L.² Yu, D.³ Dahl, G.E.⁴ Mohamed, A.-R.⁵ Jaitly, N.⁶ Senior, A.⁷ Vanhoucke, V.⁸ Nguyen, P.⁹ Sainath, T.N.¹⁰ Kingsbury, B.¹¹

3
- 84901502980
- Feature learning in deep neural networks-studies on speech recognition
- D. Yu, M. Seltzer, J. Li, J.-T. Huang, and F. Seide, "Feature learning in deep neural networks-studies on speech recognition," in Proc. ICLR, 2013.
- (2013) Proc. ICLR
- Yu, D.¹ Seltzer, M.² Li, J.³ Huang, J.-T.⁴ Seide, F.⁵

4
- 79959851706
- Towards spoken term discovery at scale with zero resources
- A. Jansen, K. Church, and H. Hermansky, "Towards spoken term discovery at scale with zero resources," in Proc. Interspeech, 2010.
- (2010) Proc. Interspeech
- Jansen, A.¹ Church, K.² Hermansky, H.³

5
- 84867809023
- A nonparametric Bayesian approach to acoustic model discovery
- C. Lee and J. R. Glass, "A nonparametric Bayesian approach to acoustic model discovery," in Proc. ACL, 2012.
- (2012) Proc. ACL
- Lee, C.¹ Glass, J.R.²

6
- 70450158585
- Unsupervised training of an HMM-based speech recognizer for topic classification
- H. Gish, M.-H. Siu, A. Chan, and B. Belfield, "Unsupervised training of an HMM-based speech recognizer for topic classification," in Proc. Interspeech, 2009.
- (2009) Proc. Interspeech
- Gish, H.¹ Siu, M.-H.² Chan, A.³ Belfield, B.⁴

7
- 77949473673
- Unsupervised spoken keyword spotting via segmental DTW on Gaussian posteriorgrams
- Y. Zhang and J. R. Glass, "Unsupervised spoken keyword spotting via segmental DTW on Gaussian posteriorgrams," in Proc. ASRU, 2009.
- (2009) Proc. ASRU
- Zhang, Y.¹ Glass, J.R.²

8
- 84890478910
- The spoken web search task at MediaEval 2012
- F. Metze, X. Anguera, E. Barnard, M. Davel, and G. Gravier, "The spoken web search task at MediaEval 2012," in Proc. ICASSP, 2013.
- (2013) Proc. ICASSP
- Metze, F.¹ Anguera, X.² Barnard, E.³ Davel, M.⁴ Gravier, G.⁵

9
- 84890471125
- On rectified linear units for speech processing
- M. D. Zeiler, M. Ranzato, R. Monga, M. Mao, K. Yang, Q. V. Le, P. Nguyen, A. Senior, V. Vanhoucke, J. Dean, and G. E. Hinton, "On rectified linear units for speech processing," in Proc. ICASSP, 2013.
- (2013) Proc. ICASSP
- Zeiler, M.D.¹ Ranzato, M.² Monga, R.³ Mao, M.⁴ Yang, K.⁵ Le, Q.V.⁶ Nguyen, P.⁷ Senior, A.⁸ Vanhoucke, V.⁹ Dean, J.¹⁰ Hinton, G.E.¹¹

10
- 84905227103
- An autoencoder based approach to unsupervised learning of subword units
- L. Badino, C. Canevari, L. Fadiga, and G. Metta, "An autoencoder based approach to unsupervised learning of subword units," in Proc. ICASSP, 2014.
- (2014) Proc. ICASSP
- Badino, L.¹ Canevari, C.² Fadiga, L.³ Metta, G.⁴

11
- 33947643715
- Unsupervised word acquisition from speech using pattern discovery
- A. Park and J. R. Glass, "Unsupervised word acquisition from speech using pattern discovery," in Proc. ICASSP, 2006.
- (2006) Proc. ICASSP
- Park, A.¹ Glass, J.R.²

12
- 84858987768
- Efficient spoken term discovery using randomized algorithms
- A. Jansen and B. Van Durme, "Efficient spoken term discovery using randomized algorithms," in Proc. ASRU, 2011.
- (2011) Proc. ASRU
- Jansen, A.¹ Van Durme, B.²

13
- 84893673786
- A hierarchical system for word discovery exploiting DTW-based initialization
- O. Walter, T. Korthals, R. Haeb-Umbach, and B. Raj, "A hierarchical system for word discovery exploiting DTW-based initialization," in Proc. ASRU, 2013.
- (2013) Proc. ASRU
- Walter, O.¹ Korthals, T.² Haeb-Umbach, R.³ Raj, B.⁴

14
- 84865770260
- Towards unsupervised training of speaker independent acoustic models
- A. Jansen and K. Church, "Towards unsupervised training of speaker independent acoustic models," in Proc. Interspeech, 2011.
- (2011) Proc. Interspeech
- Jansen, A.¹ Church, K.²

15
- 84890467020
- Weak top-down constraints for unsupervised acoustic model training
- A. Jansen, S. Thomas, and H. Hermansky, "Weak top-down constraints for unsupervised acoustic model training," in Proc. ICASSP, 2013.
- (2013) Proc. ICASSP
- Jansen, A.¹ Thomas, S.² Hermansky, H.³

16
- 0026400245
- An investigation of PLP and IMELDA acoustic representations and of their potential for combination
- M. Hunt, S. M. Richardson, D. C. Bateman, and A. Piau, "An investigation of PLP and IMELDA acoustic representations and of their potential for combination," in Proc. ICASSP, 1991.
- (1991) Proc. ICASSP
- Hunt, M.¹ Richardson, S.M.² Bateman, D.C.³ Piau, A.⁴

17
- 84946685733
- Phonetics embedding learning with side information
- G. Synnaeve, T. Schatz, and E. Dupoux, "Phonetics embedding learning with side information," in Proc. SLT, 2014.
- (2014) Proc. SLT
- Synnaeve, G.¹ Schatz, T.² Dupoux, E.³

18
- 69349090197
- Learning deep architectures for AI
- Y. Bengio, "Learning deep architectures for AI," Found. Trends Mach. Learning, vol. 2, no. 1, pp. 1-127, 2009.
- (2009) Found. Trends Mach. Learning , vol.2 , Issue.1 , pp. 1-127
- Bengio, Y.¹

19
- 33746600649
- Reducing the dimensionality of data with neural networks
- G. E. Hinton and R. R. Salakhutdinov, "Reducing the dimensionality of data with neural networks," Science, vol. 313, no. 5786, pp. 504-507, 2006.
- (2006) Science , vol.313 , Issue.5786 , pp. 504-507
- Hinton, G.E.¹ Salakhutdinov, R.R.²

20
- 84864073449
- Greedy layer-wise training of deep networks
- Y. Bengio, P. Lamblin, D. Popovici, and H. Larochelle, "Greedy layer-wise training of deep networks," in Proc. NIPS, 2007.
- (2007) Proc. NIPS
- Bengio, Y.¹ Lamblin, P.² Popovici, D.³ Larochelle, H.⁴

21
- 0017930815
- Dynamic programming algorithm optimization for spoken word recognition
- H. Sakoe and S. Chiba, "Dynamic programming algorithm optimization for spoken word recognition," IEEE Trans. Acoust., Speech, Signal Process., vol. 26, no. 1, pp. 43-49, 1978.
- (1978) IEEE Trans. Acoust., Speech, Signal Process , vol.26 , Issue.1 , pp. 43-49
- Sakoe, H.¹ Chiba, S.²

22
- 56449089103
- Extracting and composing robust features with denoising autoencoders
- P. Vincent, H. Larochelle, Y. Bengio, and P.-A. Manzagol, "Extracting and composing robust features with denoising autoencoders," in Proc. ICML, 2008.
- (2008) Proc. ICML
- Vincent, P.¹ Larochelle, H.² Bengio, Y.³ Manzagol, P.-A.⁴

23
- 0003571976
- The HTK Book (for HTK Version 3.4)
- S. J. Young, G. Evermann, M. J. F. Gales, T. Hain, D. Kershaw, X. Liu, G. L. Moore, J. J. Odell, D. Ollason, D. Povey, V. Valtchev, and P. C. Woodland, The HTK Book (for HTK Version 3.4), Cambridge University Engineering Department, 2009.
- (2009) Cambridge University Engineering Department
- Young, S.J.¹ Evermann, G.² Gales, M.J.F.³ Hain, T.⁴ Kershaw, D.⁵ Liu, X.⁶ Moore, G.L.⁷ Odell, J.J.⁸ Ollason, D.⁹ Povey, D.¹⁰ Valtchev, V.¹¹ Woodland, P.C.¹²

24
- 84893401626
- Pylearn2: a machine learning research library
- I. J. Goodfellow, D. Warde-Farley, P. Lamblin, V. Dumoulin, M. Mirza, R. Pascanu, J. Bergstra, F. Bastien, and Y. Bengio, "Pylearn2: a machine learning research library," arXiv:1308.4214, 2013.
- (2013) arXiv:1308.4214
- Goodfellow, I.J.¹ Warde-Farley, D.² Lamblin, P.³ Dumoulin, V.⁴ Mirza, M.⁵ Pascanu, R.⁶ Bergstra, J.⁷ Bastien, F.⁸ Bengio, Y.⁹

25
- 84905240834
- Recurrent deep neural networks for robust speech recognition
- C. Weng, D. Yu, S. Watanabe, and B.-H. Juang, "Recurrent deep neural networks for robust speech recognition," in Proc. ICASSP, 2014.
- (2014) Proc. ICASSP
- Weng, C.¹ Yu, D.² Watanabe, S.³ Juang, B.-H.⁴

26
- 84865767134
- Rapid evaluation of speech representations for spoken term discovery
- M. A. Carlin, S. Thomas, A. Jansen, and H. Hermansky, "Rapid evaluation of speech representations for spoken term discovery," in Proc. Interspeech, 2011.
- (2011) Proc. Interspeech
- Carlin, M.A.¹ Thomas, S.² Jansen, A.³ Hermansky, H.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.