SCOPUS Information Search Platform

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

Volume , Issue , 2013, Pages 2222-2226

Multilingual hierarchical MRASTA features for ASR

(3) Tüske, Zoltán (a); Schlüter, Ralf (a); Ney, Hermann (a, b)

(a) RWTH Aachen University (Germany)

(b) UFR 919 Laboratoire d'Informatique Pour la Mécanique et les Sciences de l'Ingénieur (France)

Author keywords

Bottleneck; Deep MLP; Hierarchical; LVCSR; MRASTA; Multilingual

Indexed keywords

FEATURE EXTRACTION; SPEECH RECOGNITION;

BOTTLENECK; DEEP MLP; HIERARCHICAL; LVCSR; MRASTA; MULTILINGUAL;

HIERARCHICAL SYSTEMS;

EID: 84906215094 PISSN: 2308457X EISSN: 19909772 Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (27)

References (32)

1
- 33947619591
- Cross-domain and cross-language portability of acoustic features estimated by multilayer perceptrons
- A. Stolcke, F. Grezl, M.-Y. Hwang, X. Lei, N. Morgan, and D. Vergyri, "Cross-domain and cross-language portability of acoustic features estimated by multilayer perceptrons," in Proc. of IEEE Int. Conf. on Acoustics, Speech and Signal Processing, 2006, pp. 321-324.
- (2006) Proc. of IEEE Int. Conf. on Acoustics, Speech and Signal Processing , pp. 321-324
- Stolcke, A.¹ Grezl, F.² Hwang, M.-Y.³ Lei, X.⁴ Morgan, N.⁵ Vergyri, D.⁶

2
- 44849132075
- Monolingual and cross lingual comparison of tandem features derived from articulatory and phone MLPs
- O. Cetin, M. Magimai-Doss, K. Livescu, A. Kantor, S. King, C. Bartels, and J. Frankel, "Monolingual and cross lingual comparison of tandem features derived from articulatory and phone MLPs," in Proc. of IEEE Automatic Speech Recognition and Understanding Workshop, 2007, pp. 36-41.
- (2007) Proc. of IEEE Automatic Speech Recognition and Understanding Workshop , pp. 36-41
- Cetin, O.¹ Magimai-Doss, M.² Livescu, K.³ Kantor, A.⁴ King, S.⁵ Bartels, C.⁶ Frankel, J.⁷

3
- 84858976609
- Cross-lingual portability of Chinese and English neural network features for French and German LVCSR
- C. Plahl, R. Schluter, and H. Ney, "Cross-lingual Portability of Chinese and English Neural Network Features for French and German LVCSR," in Proc. of IEEE Automatic Speech Recognition and Understanding Workshop, 2011, pp. 371-376.
- (2011) Proc. of IEEE Automatic Speech Recognition and Understanding Workshop , pp. 371-376
- Plahl, C.¹ Schluter, R.² Ney, H.³

4
- 84858985238
- Cross-lingual portability of MLP-based tandem features - a case study for English and Hungarian
- L. Toth, J. Frankel, G. Gosztolya, and S. King, "Cross-lingual Portability of MLP-Based Tandem Features - A Case Study for English and Hungarian," in Proc. of Interspeech, 2008, pp. 2695-2698.
- (2008) Proc. of Interspeech , pp. 2695-2698
- Toth, L.¹ Frankel, J.² Gosztolya, G.³ King, S.⁴

5
- 84858955616
- Study of probabilistic and bottle-neck features in multilingual environment
- F. Grezl, M. Karafiat, and M. Janda, "Study of probabilistic and bottle-neck features in multilingual environment," in Proc. of IEEE Automatic Speech Recognition and Understanding Workshop, 2011, pp. 359-364.
- (2011) Proc. of IEEE Automatic Speech Recognition and Understanding Workshop , pp. 359-364
- Grezl, F.¹ Karafiat, M.² Janda, M.³

6
- 84890474441
- Investigation on cross- and multilingual MLP features under matched and mismatched acoustical conditions
- accepted for publication
- Z. Tuske, J. Pinto, D. Willett, and R. Schluter, "Investigation on cross- and multilingual MLP features under matched and mismatched acoustical conditions," in Proc. of IEEE Int. Conf. on Acoustics, Speech and Signal Processing, 2013, accepted for publication.
- (2013) Proc. of IEEE Int. Conf. on Acoustics, Speech and Signal Processing
- Tuske, Z.¹ Pinto, J.² Willett, D.³ Schluter, R.⁴

7
- 84878559540
- An investigation on initialization schemes for multilayer perceptron training using multilingual data and their effect on ASR performance
- N. T. Vu, W. Breiter, F. Metze, and T. Schultz, "An Investigation on Initialization Schemes for Multilayer Perceptron Training Using Multilingual Data and Their Effect on ASR Performance," in Proc. of Interspeech, 2012.
- (2012) Proc. of Interspeech
- Vu, N.T.¹ Breiter, W.² Metze, F.³ Schultz, T.⁴

8
- 84867606552
- Multilingual MLP features for low-resource LVCSR systems
- S. Thomas, S. Ganapathy, and H. Hermansky, "Multilingual MLP features for low-resource LVCSR systems," in Proc. of IEEE Int. Conf. on Acoustics, Speech and Signal Processing, 2012, pp. 4269-4272.
- (2012) Proc. of IEEE Int. Conf. on Acoustics, Speech and Signal Processing , pp. 4269-4272
- Thomas, S.¹ Ganapathy, S.² Hermansky, H.³

9
- 0033709098
- Tandem connectionist feature extraction for conventional HMM systems
- H. Hermansky, D. P. Ellis, and S. Sharma, "Tandem connectionist feature extraction for conventional HMM systems," in Proc. of IEEE Int. Conf. on Acoustics, Speech and Signal Processing, vol. 3, 2000, pp. 1635-1638.
- (2000) Proc. of IEEE Int. Conf. on Acoustics, Speech and Signal Processing , vol.3 , pp. 1635-1638
- Hermansky, H.¹ Ellis, D.P.² Sharma, S.³

10
- 34547548235
- Probabilistic and bottle-neck features for LVCSR of meetings
- F. Grezl, M. Karafiat, S. Kontar, and J. Cernocky, "Probabilistic and bottle-neck features for LVCSR of meetings," in Proc. of IEEE Int. Conf. on Acoustics, Speech and Signal Processing, 2007, pp. 757-760.
- (2007) Proc. of IEEE Int. Conf. on Acoustics, Speech and Signal Processing , pp. 757-760
- Grezl, F.¹ Karafiat, M.² Kontar, S.³ Cernocky, J.⁴

11
- 85135166225
- Fast bootstrapping of LVCSR systems with multilingual phoneme sets
- T. Schultz and A. Waibel, "Fast Bootstrapping Of LVCSR Systems With Multilingual Phoneme Sets," in Proc. of Eurospeech, 1997.
- (1997) Proc. of Eurospeech
- Schultz, T.¹ Waibel, A.²

12
- 70349220094
- A study on multilingual acoustic modeling for large vocabulary ASR
- H. Lin, L. Deng, D. Yu, Y. Gong, A. Acero, and C.-H. Lee, "A study on multilingual acoustic modeling for large vocabulary ASR," in Proc. of IEEE Int. Conf. on Acoustics, Speech and Signal Processing, 2009, pp. 4333-4336.
- (2009) Proc. of IEEE Int. Conf. on Acoustics, Speech and Signal Processing , pp. 4333-4336
- Lin, H.¹ Deng, L.² Yu, D.³ Gong, Y.⁴ Acero, A.⁵ Lee, C.-H.⁶

13
- 79959816770
- Towards mixed language speech recognition systems
- D. Imseng, H. Bourlard, and M. Magimai-Doss, "Towards mixed language speech recognition systems," in Proc. of Interspeech, 2010, pp. 278-281.
- (2010) Proc. of Interspeech , pp. 278-281
- Imseng, D.¹ Bourlard, H.² Magimai-Doss, M.³

14
- 78649265439
- Development of multilingual acoustic models in the global phone project
- T. Schultz and A. Waibel, "Development of Multilingual Acoustic Models in the Global Phone Project," in Proc. of Workshop on Text, Speech, and Dialogue, 1998.
- (1998) Proc. of Workshop on Text, Speech, and Dialogue
- Schultz, T.¹ Waibel, A.²

15
- 0033690885
- Towards language independent acoustic modeling
- W. Byrne, P. Beyerlein, J. M. Huerta, S. Khudanpur, B. Marthi, J. Morgan, N. Peterek, J. Picone, D. Vergyri, and W. Wang, "Towards language independent acoustic modeling," in Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 2, 2000, pp. 1029-1032.
- (2000) Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing , vol.2 , pp. 1029-1032
- Byrne, W.¹ Beyerlein, P.² Huerta, J.M.³ Khudanpur, S.⁴ Marthi, B.⁵ Morgan, J.⁶ Peterek, N.⁷ Picone, J.⁸ Vergyri, D.⁹ Wang, W.¹⁰

16
- 79959819891
- Cross-lingual and multi stream posterior features for low resource LVCSR systems
- S. Thomas, S. Ganapathy, and H. Hermansky, "Cross-lingual and multi stream posterior features for low resource LVCSR systems," in Proc. of Interspeech, 2010, pp. 877-880.
- (2010) Proc. of Interspeech , pp. 877-880
- Thomas, S.¹ Ganapathy, S.² Hermansky, H.³

17
- 84867224965
- On the use of a multilingual neural network front-end
- S. Scanzio, P. Laface, L. Fissore, R. Gemello, and F. Mana, "On the Use of a Multilingual Neural Network Front-End," in Proc. of Interspeech, 2008, pp. 2711-2714.
- (2008) Proc. of Interspeech , pp. 2711-2714
- Scanzio, S.¹ Laface, P.² Fissore, L.³ Gemello, R.⁴ Mana, F.⁵

18
- 78049394188
- Multilingual acoustic modeling for speech recognition based on subspace Gaussian mixture models
- L. Burget, P. Schwarz, M. Agarwal, P. Akyazi, K. Feng, A. Ghoshal, O. Glembek, N. Goel, M. Karafiat, D. Povey, A. Rastrow, R. C. Rose, and S. Thomas, "Multilingual acoustic modeling for speech recognition based on subspace Gaussian mixture models," in Proc. of IEEE Int. Conf. on Acoustics, Speech and Signal Processing, 2010, pp. 4334-4337.
- (2010) Proc. of IEEE Int. Conf. on Acoustics, Speech and Signal Processing , pp. 4334-4337
- Burget, L.¹ Schwarz, P.² Agarwal, M.³ Akyazi, P.⁴ Feng, K.⁵ Ghoshal, A.⁶ Glembek, O.⁷ Goel, N.⁸ Karafiat, M.⁹ Povey, D.¹⁰ Rastrow, A.¹¹ Rose, R.C.¹² Thomas, S.¹³

19
- 84874226274
- The language-independent bottleneck features
- K. Vesely, M. Karafiat, F. Grezl, M. Janda, and E. Egorova, "The language-independent bottleneck features," in Proc. of IEEE Workshop on Spoken Language Technology, 2012, pp. 336-341.
- (2012) Proc. of IEEE Workshop on Spoken Language Technology , pp. 336-341
- Vesely, K.¹ Karafiat, M.² Grezl, F.³ Janda, M.⁴ Egorova, E.⁵

20
- 33745213373
- Multi-resolution RASTA filtering for TANDEM-based ASR
- H. Hermansky and P. Fousek, "Multi-resolution RASTA filtering for TANDEM-based ASR," in Proc. of Interspeech, 2005, pp. 361-364.
- (2005) Proc. of Interspeech , pp. 361-364
- Hermansky, H.¹ Fousek, P.²

21
- 79959844505
- Hierarchical bottle neck features for LVCSR
- C. Plahl, R. Schluter, and H. Ney, "Hierarchical Bottle Neck Features for LVCSR," in Proc. of Interspeech, 2010, pp. 1197-1200.
- (2010) Proc. of Interspeech , pp. 1197-1200
- Plahl, C.¹ Schluter, R.² Ney, H.³

22
- 84865801985
- Conversational speech transcription using context-dependent deep neural networks
- F. Seide, G. Li, and D. Yu, "Conversational Speech Transcription Using Context-Dependent Deep Neural Networks," in Proc. of Interspeech, 2011, pp. 437-440.
- (2011) Proc. of Interspeech , pp. 437-440
- Seide, F.¹ Li, G.² Yu, D.³

23
- 84867593213
- Auto-encoder bottleneck features using deep belief networks
- T. N. Sainath, B. Kingsbury, and B. Ramabhadran, "Auto-encoder bottleneck features using deep belief networks," in Proc. of IEEE Int. Conf. on Acoustics, Speech and Signal Processing, 2012, pp. 4153-4156.
- (2012) Proc. of IEEE Int. Conf. on Acoustics, Speech and Signal Processing , pp. 4153-4156
- Sainath, T.N.¹ Kingsbury, B.² Ramabhadran, B.³

24
- 84890543571
- Deep hierarchical bottleneck MRASTA features for LVCSR
- accepted for publication
- Z. Tuske, R. Schluter, and H. Ney, "Deep hierarchical bottleneck MRASTA features for LVCSR," in Proc. of IEEE Int. Conf. on Acoustics, Speech and Signal Processing, 2013, accepted for publication.
- (2013) Proc. of IEEE Int. Conf. on Acoustics, Speech and Signal Processing
- Tuske, Z.¹ Schluter, R.² Ney, H.³

25
- 79959839253
- The RWTH 2009 QUAERO ASR evaluation system for English and German
- M. Nußbaum-Thom, S. Wiesler, M. Sundermeyer, C. Plahl, S. Hahn, R. Schluter, and H. Ney, "The RWTH 2009 QUAERO ASR evaluation system for English and German," in Proc. of Interspeech, 2010, pp. 1517-1520.
- (2010) Proc. of Interspeech , pp. 1517-1520
- Nußbaum-Thom, M.¹ Wiesler, S.² Sundermeyer, M.³ Plahl, C.⁴ Hahn, S.⁵ Schluter, R.⁶ Ney, H.⁷

26
- 80051609102
- The RWTH 2010 QUAERO ASR evaluation system for English, French, and German
- M. Sundermeyer, M. Nußbaum-Thom, S. Wiesler, C. Plahl, A.-D. Mousa, S. Hahn, D. Nolden, R. Schluter, and H. Ney, "The RWTH 2010 QUAERO ASR Evaluation System for English, French, and German," in Proc. of IEEE Int. Conf. on Acoustics, Speech and Signal Processing, 2011, pp. 2212-2215.
- (2011) Proc. of IEEE Int. Conf. on Acoustics, Speech and Signal Processing , pp. 2212-2215
- Sundermeyer, M.¹ Nußbaum-Thom, M.² Wiesler, S.³ Plahl, C.⁴ Mousa, A.-D.⁵ Hahn, S.⁶ Nolden, D.⁷ Schluter, R.⁸ Ney, H.⁹

27
- 51449120604
- Hierarchical and parallel processing of modulation spectrum for ASR applications
- F. Valente and H. Hermansky, "Hierarchical and parallel processing of modulation spectrum for ASR applications," in Proc. of IEEE Int. Conf. on Acoustics, Speech and Signal Processing, 2008, pp. 4165-4168.
- (2008) Proc. of IEEE Int. Conf. on Acoustics, Speech and Signal Processing , pp. 4165-4168
- Valente, F.¹ Hermansky, H.²

28
- 84878410921
- RASR - The RWTH Aachen University open source speech recognition toolkit
- D. Rybach, S. Hahn, P. Lehnen, D. Nolden, M. Sundermeyer, Z. Tuske, S. Wiesler, R. Schluter, and H. Ney, "RASR - The RWTH Aachen University Open Source Speech Recognition Toolkit," in Proc. of IEEE Automatic Speech Recognition and Understanding Workshop, 2011.
- (2011) Proc. of IEEE Automatic Speech Recognition and Understanding Workshop
- Rybach, D.¹ Hahn, S.² Lehnen, P.³ Nolden, D.⁴ Sundermeyer, M.⁵ Tuske, Z.⁶ Wiesler, S.⁷ Schluter, R.⁸ Ney, H.⁹

29
- 0032050110
- Maximum likelihood linear transformations for HMM-based speech recognition
- M. J. F. Gales, "Maximum likelihood linear transformations for HMM-based speech recognition," Computer Speech and Language, vol. 12, pp. 75-98, 1998.
- (1998) Computer Speech and Language , vol.12 , pp. 75-98
- Gales, M.J.F.¹

30
- 33646759965
- Adaptive training using simple target models
- G. Stemmer, F. Brugnara, and D. Giuliani, "Adaptive training using simple target models," in Proc. of IEEE Int. Conf. on Acoustics, Speech and Signal Processing, vol. 1, 2005, pp. 997-1000.
- (2005) Proc. of IEEE Int. Conf. on Acoustics, Speech and Signal Processing , vol.1 , pp. 997-1000
- Stemmer, G.¹ Brugnara, F.² Giuliani, D.³

31
- 84865756657
- Hybrid language models using mixed types of sub-lexical units for open vocabulary German LVCSR
- M. A. B. Shaik, A. E.-D. Mousa, R. Schluter, and H. Ney, "Hybrid language models using mixed types of sub-lexical units for open vocabulary German LVCSR," in Proc. of Interspeech, 2011, pp. 1441-1444.
- (2011) Proc. of Interspeech , pp. 1441-1444
- Shaik, M.A.B.¹ Mousa, A.E.-D.² Schluter, R.³ Ney, H.⁴

32
- 80051609913
- Using morpheme and syllable based sub-words for Polish LVCSR
- M. A. B. Shaik, A. E.-D. Mousa, R. Schluter, and H. Ney, "Using morpheme and syllable based sub-words for Polish LVCSR," in Proc. of IEEE Int. Conf. on Acoustics, Speech and Signal Processing, 2011, pp. 4680-4683.
- (2011) Proc. of IEEE Int. Conf. on Acoustics, Speech and Signal Processing , pp. 4680-4683
- Shaik, M.A.B.¹ Mousa, A.E.-D.² Schluter, R.³ Ney, H.⁴

* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.