SCOPUS Information Search Platform

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

Volume 2015-January, 2015, Pages 3660-3664

Joint decoding of tandem and hybrid systems for improved keyword spotting on low resource languages

(6) Wang, Haipeng a; Ragni, Anton a; Gales, Mark J F a; Knill, Kate M a; Woodland, Philip C a; Zhang, Chao a

a UNIVERSITY OF CAMBRIDGE (United Kingdom)

Author keywords

Deep neural network; Hybrid; Joint decoding; Keyword spotting; Tandem

Indexed keywords

COMPUTATIONAL LINGUISTICS; DECODING; HYBRID SYSTEMS; SEARCH ENGINES; SPEECH COMMUNICATION; DEEP NEURAL NETWORKS; HYBRID; JOINT DECODING; KEYWORD SPOTTING; TANDEM; SPEECH RECOGNITION

EID: 84959166110 PISSN: 2308457X EISSN: 19909772 Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (42)

References (43)

1
- 84890542302
- Exploiting diversity for spoken term detection
- L. Mangu, H. Soltau, H.-K. Kuo, B. Kingsbury, and G. Saon, "Exploiting diversity for spoken term detection," in Proc. ICASSP, 2013, pp. 8282-8286.
- (2013) Proc. ICASSP , pp. 8282-8286
- Mangu, L.¹ Soltau, H.² Kuo, H.-K.³ Kingsbury, B.⁴ Saon, G.⁵

2
- 84893692703
- Score normalization and system combination for improved keyword spotting
- D. Karakos, R. Schwartz, S. Tsakalidis, L. Zhang, S. Ranjan, T. Ng, and R. Hsiao, "Score normalization and system combination for improved keyword spotting," in Proc. ASRU, 2013, pp. 210-215.
- (2013) Proc. ASRU , pp. 210-215
- Karakos, D.¹ Schwartz, R.² Tsakalidis, S.³ Zhang, L.⁴ Ranjan, S.⁵ Ng, T.⁶ Hsiao, R.⁷

3
- 84910031125
- Data augmentation for low resource languages
- A. Ragni, K. Knill, S. Rath, and M. Gales, "Data augmentation for low resource languages," in Proc. Interspeech, 2014, pp. 810-814.
- (2014) Proc. Interspeech , pp. 810-814
- Ragni, A.¹ Knill, K.² Rath, S.³ Gales, M.⁴

4
- 84910067354
- Language independent and unsupervised acoustic models for speech recognition and keyword spotting
- K. Knill, M. Gales, A. Ragni, and S. Rath, "Language independent and unsupervised acoustic models for speech recognition and keyword spotting," in Proc. INTERSPEECH, 2014, pp. 20-26.
- (2014) Proc. INTERSPEECH , pp. 20-26
- Knill, K.¹ Gales, M.² Ragni, A.³ Rath, S.⁴

5
- 84890486944
- M. Harper, "IARPA Solicitation IARPA-BAA-11-02," 2011, http://www.iarpa.gov/solicitations_babel.html.
- (2011) IARPA Solicitation IARPA-BAA-11-02
- Harper, M.¹

6
- 0030657238
- Analyses of multiple evidence combination
- J. Lee, "Analyses of multiple evidence combination," in ACM SIGIR, 1997, pp. 267-276.
- (1997) ACM SIGIR , pp. 267-276
- Lee, J.¹

7
- 84890489531
- System combination and score normalization for spoken term detection
- J. Mamou, J. Cui, X. Cui, M. Gales, B. Kingsbury, K. Knill, L. Mangu, D. Nolden, M. Picheny, B. Ramabhadran et al., "System combination and score normalization for spoken term detection," in Proc. ICASSP, 2013, pp. 8272-8276.
- (2013) Proc. ICASSP , pp. 8272-8276
- Mamou, J.¹ Cui, J.² Cui, X.³ Gales, M.⁴ Kingsbury, B.⁵ Knill, K.⁶ Mangu, L.⁷ Nolden, D.⁸ Picheny, M.⁹ Ramabhadran, B.¹⁰

8
- 84946036768
- Low-resource keyword search strategies for Tamil
- N. Chen et al., "Low-resource keyword search strategies for Tamil," in Proc. ICASSP, 2015, pp. 5366-5370.
- (2015) Proc. ICASSP , pp. 5366-5370
- Chen, N.¹

9
- 0030638031
- A Post-processing System to Yield Reduced Word Error Rates: Recogniser Output Voting Error Reduction (ROVER)
- J. G. Fiscus, "A Post-processing System to Yield Reduced Word Error Rates: Recogniser Output Voting Error Reduction (ROVER)," in Proc. ASRU, 1997, pp. 347-354.
- (1997) Proc. ASRU , pp. 347-354
- Fiscus, J.G.¹

10
- 4544253834
- Posterior probability decoding, confidence estimation and system combination
- G. Evermann and P. Woodland, "Posterior Probability Decoding, Confidence Estimation and System Combination," in Proc. Speech Transcription Workshop, vol. 27, 2000.
- (2000) Proc. Speech Transcription Workshop , vol.27
- Evermann, G.¹ Woodland, P.²

11
- 56149113962
- Rapid and accurate spoken term detection
- D. Miller, M. Kleber, C.-L. Kao, O. Kimball, T. Colthurst, S. Lowe, R. Schwartz, and H. Gish, "Rapid and accurate spoken term detection," in Proc. Interspeech, 2007.
- (2007) Proc. Interspeech
- Miller, D.¹ Kleber, M.² Kao, C.-L.³ Kimball, O.⁴ Colthurst, T.⁵ Lowe, S.⁶ Schwartz, R.⁷ Gish, H.⁸

12
- 43849107771
- The SRI/OGI 2006 spoken term detection system
- D. Vergyri, I. Shafran, A. Stolcke, V. Gadde, M. Akbacak et al., "The SRI/OGI 2006 spoken term detection system," in Proc. Interspeech, 2007, pp. 2393-2396.
- (2007) Proc. Interspeech , pp. 2393-2396
- Vergyri, D.¹ Shafran, I.² Stolcke, A.³ Gadde, V.⁴ Akbacak, M.⁵

13
- 67649518727
- Sub-word modeling of out of vocabulary words in spoken term detection
- I. Szoke, L. Burget, J. Cernocky, and M. Fapso, "Sub-word modeling of out of vocabulary words in spoken term detection," Proc. SLT, 2008, pp. 273-276.
- (2008) Proc. SLT , pp. 273-276
- Szoke, I.¹ Burget, L.² Cernocky, J.³ Fapso, M.⁴

14
- 84890537373
- A high-performance Cantonese keyword search system
- B. Kingsbury et al., "A high-performance Cantonese keyword search system," in Proc. ICASSP, 2013, pp. 8277-8281.
- (2013) Proc. ICASSP , pp. 8277-8281
- Kingsbury, B.¹

15
- 84910068314
- Combining tandem and hybrid systems for improved speech recognition and keyword spotting on low resource languages
- S. Rath, K. Knill, A. Ragni, and M. Gales, "Combining tandem and hybrid systems for improved speech recognition and keyword spotting on low resource languages," in Proc. Interspeech, 2014, pp. 835-839.
- (2014) Proc. Interspeech , pp. 835-839
- Rath, S.¹ Knill, K.² Ragni, A.³ Gales, M.⁴

16
- 79251574977
- The efficient incorporation of MLP features into automatic speech recognition systems
- J. Park, F. Diehl, M. Gales, M. Tomalin, and P. C. Woodland, "The efficient incorporation of MLP features into automatic speech recognition systems," Computer Speech and Language, vol. 25, no. 3, pp. 519-534, 2010.
- (2010) Computer Speech and Language , vol.25 , Issue.3 , pp. 519-534
- Park, J.¹ Diehl, F.² Gales, M.³ Tomalin, M.⁴ Woodland, P.C.⁵

17
- 84055211743
- Acoustic modeling using deep belief networks
- A. Mohamed, G. Dahl, and G. Hinton, "Acoustic modeling using deep belief networks," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 20, no. 1, pp. 14-22, 2012.
- (2012) Audio, Speech, and Language Processing, IEEE Transactions on , vol.20 , Issue.1 , pp. 14-22
- Mohamed, A.¹ Dahl, G.² Hinton, G.³

18
- 85032751458
- Deep neural networks for acoustic modeling in speech recognition
- Nov 2012
- G. Hinton, L. Deng, D. Yu, G. Dahl, A. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. Sainath et al., "Deep neural networks for acoustic modeling in speech recognition," IEEE Signal Processing Magazine, vol. 29, no. 6, pp. 82-97, Nov 2012.
- IEEE Signal Processing Magazine , vol.29 , Issue.6 , pp. 82-97
- Hinton, G.¹ Deng, L.² Yu, D.³ Dahl, G.⁴ Mohamed, A.⁵ Jaitly, N.⁶ Senior, A.⁷ Vanhoucke, V.⁸ Nguyen, P.⁹ Sainath, T.¹⁰

19
- 0003573244
- Springer Science & Business Media
- H. Bourlard and N. Morgan, Connectionist Speech Recognition: A Hybrid Approach. Springer Science & Business Media, 1994, vol. 247.
- (1994) Connectionist Speech Recognition: A Hybrid Approach. , vol.247
- Bourlard, H.¹ Morgan, N.²

20
- 0034825241
- Multi-stream adaptive evidence combination for noise robust ASR
- A. Morris, A. Hagen, H. Glotin, and H. Bourlard, "Multi-stream adaptive evidence combination for noise robust ASR," Speech Communication, vol. 34, no. 1, pp. 25-40, 2001.
- (2001) Speech Communication , vol.34 , Issue.1 , pp. 25-40
- Morris, A.¹ Hagen, A.² Glotin, H.³ Bourlard, H.⁴

21
- 0141676589
- New entropy based combination rules in HMM/ANN multi-stream ASR
- H. Misra, H. Bourlard, and V. Tyagi, "New entropy based combination rules in HMM/ANN multi-stream ASR," in Proc. ICASSP, 2003, pp. 738-741.
- (2003) Proc. ICASSP , pp. 738-741
- Misra, H.¹ Bourlard, H.² Tyagi, V.³

22
- 0028194709
- Connectionist probability estimators in HMM speech recognition
- S. Renals, N. Morgan, H. Bourlard, M. Cohen, and H. Franco, "Connectionist probability estimators in HMM speech recognition," IEEE Trans. Speech and Audio Processing, vol. 2, no. 1, pp. 161-174, 1994.
- (1994) IEEE Trans. Speech and Audio Processing , vol.2 , Issue.1 , pp. 161-174
- Renals, S.¹ Morgan, N.² Bourlard, H.³ Cohen, M.⁴ Franco, H.⁵

23
- 0028204660
- Combining TDNN and HMM in a hybrid system for improved continuous-speech recognition
- C. Dugast, L. Devillers, and X. Aubert, "Combining TDNN and HMM in a hybrid system for improved continuous-speech recognition," IEEE Trans. Speech and Audio Processing, vol. 2, no. 1, pp. 217-223, 1994.
- (1994) IEEE Trans. Speech and Audio Processing , vol.2 , Issue.1 , pp. 217-223
- Dugast, C.¹ Devillers, L.² Aubert, X.³

24
- 84890492591
- Revisiting hybrid and GMM-HMM system combination techniques
- P. Swietojanski, A. Ghoshal, and S. Renals, "Revisiting hybrid and GMM-HMM system combination techniques," in Proc. ICASSP, 2013, pp. 6744-6748.
- (2013) Proc. ICASSP , pp. 6744-6748
- Swietojanski, P.¹ Ghoshal, A.² Renals, S.³

25
- 80053417853
- Joint optimization for machine translation system combination
- X. He and K. Toutanova, "Joint optimization for machine translation system combination," in Proc. EMNLP, 2009, pp. 1202-1211.
- (2009) Proc. EMNLP , pp. 1202-1211
- He, X.¹ Toutanova, K.²

26
- 84905265980
- Joint training of convolutional and non-convolutional neural networks
- H. Soltau, G. Saon, and T. Sainath, "Joint training of convolutional and non-convolutional neural networks," Proc. ICASSP, 2014.
- (2014) Proc. ICASSP
- Soltau, H.¹ Saon, G.² Sainath, T.³

27
- 84976253431
- Results of the 2006 spoken term detection evaluation
- J. Fiscus, J. Ajot, J. Garofolo, and G. Doddington, "Results of the 2006 Spoken Term Detection Evaluation," in Proc. SIGIR, 2007, pp. 51-57.
- (2007) Proc. SIGIR , pp. 51-57
- Fiscus, J.¹ Ajot, J.² Garofolo, J.³ Doddington, G.⁴

28
- 0003571976
- S. Young, G. Evermann, M. Gales, T. Hain, D. Kershaw, X. Liu, G. Moore, J. Odell, D. Ollason, D. Povey et al., The HTK Book (for HTK version 3.4.1). http://htk.eng.cam.ac.uk: Cambridge University, 2009.
- (2009) The HTK Book (for HTK Version 3.4.1)
- Young, S.¹ Evermann, G.² Gales, M.³ Hain, T.⁴ Kershaw, D.⁵ Liu, X.⁶ Moore, G.⁷ Odell, J.⁸ Ollason, D.⁹ Povey, D.¹⁰

29
- 84959142742
- A general artificial neural network extension for HTK
- C. Zhang and P. Woodland, "A general artificial neural network extension for HTK," in Submission to InterSpeech, 2015.
- (2015) Submission to InterSpeech
- Zhang, C.¹ Woodland, P.²

30
- 84946055405
- Unicode-based graphemic systems for limited resource languages
- M. Gales, K. Knill, and A. Ragni, "Unicode-based graphemic systems for limited resource languages," in Proc. ICASSP, 2015.
- (2015) Proc. ICASSP
- Gales, M.¹ Knill, K.² Ragni, A.³

31
- 0036460908
- Lightly supervised and unsupervised acoustic model training
- L. Lamel and J.-L. Gauvain, "Lightly supervised and unsupervised acoustic model training," Computer speech and language, vol. 16, pp. 115-129, 2013.
- (2013) Computer Speech and Language , vol.16 , pp. 115-129
- Lamel, L.¹ Gauvain, J.-L.²

32
- 84890474716
- Deep neural network features and semi-supervised training for low resource speech recognition
- S. Thomas, M. L. Seltzer, K. Church, and H. Hermansky, "Deep neural network features and semi-supervised training for low resource speech recognition," in Proc. ICASSP, 2013, pp. 6704-6708.
- (2013) Proc. ICASSP , pp. 6704-6708
- Thomas, S.¹ Seltzer, M.L.² Church, K.³ Hermansky, H.⁴

33
- 84893705111
- Discriminative semi-supervised training for keyword search in low resource languages
- R. Hsiao, T. Ng, F. Grézl, D. Karakos, S. Tsakalidis, L. Nguyen, and R. Schwartz, "Discriminative semi-supervised training for keyword search in low resource languages," in Proc. ASRU, 2013, pp. 440-445.
- (2013) Proc. ASRU , pp. 440-445
- Hsiao, R.¹ Ng, T.² Grézl, F.³ Karakos, D.⁴ Tsakalidis, S.⁵ Nguyen, L.⁶ Schwartz, R.⁷

34
- 84890474441
- Investigation on cross- and multilingual MLP features under matched and mismatched acoustical conditions
- Z. Tuske, J. Pinto, D. Willett, and R. Schluter, "Investigation on cross- and multilingual MLP features under matched and mismatched acoustical conditions," in Proc. ICASSP, 2013, pp. 6970-6974.
- (2013) Proc. ICASSP , pp. 6970-6974
- Tuske, Z.¹ Pinto, J.² Willett, D.³ Schluter, R.⁴

35
- 84905215475
- Multilingual MRASTA features for low-resource keyword search and speech recognition systems
- Z. Tuske, D. Nolden, R. Schluter, and H. Ney, "Multilingual MRASTA features for low-resource keyword search and speech recognition systems," in Proc. ICASSP, 2014, pp. 7854-7858.
- (2014) Proc. ICASSP , pp. 7854-7858
- Tuske, Z.¹ Nolden, D.² Schluter, R.³ Ney, H.⁴

36
- 84858953642
- The Kaldi speech recognition toolkit
- D. Povey et al., "The Kaldi speech recognition toolkit," in Proc. ASRU, 2011.
- (2011) Proc. ASRU
- Povey, D.¹

37
- 0032638856
- Semi-tied covariance matrices for hidden Markov models
- M. Gales, "Semi-tied covariance matrices for hidden Markov models," Speech and Audio Processing, IEEE Transactions on, vol. 7, no. 3, pp. 272-281, 1999.
- (1999) Speech and Audio Processing, IEEE Transactions on , vol.7 , Issue.3 , pp. 272-281
- Gales, M.¹

38
- 33646773785
- Feature space gaussianization
- G. Saon, S. Dharanipragada, and D. Povey, "Feature space gaussianization," in Proc. ICASSP, 2004, p. 326329.
- (2004) Proc. ICASSP , pp. 326329
- Saon, G.¹ Dharanipragada, S.² Povey, D.³

39
- 0036296863
- Minimum Phone Error and I-smoothing for improved discriminative training
- D. Povey and P. C. Woodland, "Minimum Phone Error and I-smoothing for improved discriminative training," in Proc. ICASSP, 2002, pp. 101-105.
- (2002) Proc. ICASSP , pp. 101-105
- Povey, D.¹ Woodland, P.C.²

40
- 0030362995
- A compact model for speaker adaptive training
- T. Anastasakos, J. McDonough, R. Schwartz, and J. Makhoul, "A compact model for speaker adaptive training," in Proc. ICSLP, 1996, pp. 1137-1140.
- (1996) Proc. ICSLP , pp. 1137-1140
- Anastasakos, T.¹ McDonough, J.² Schwartz, R.³ Makhoul, J.⁴

41
- 0032050110
- Maximum likelihood linear transformations for HMM-based speech recognition
- M. Gales, "Maximum likelihood linear transformations for HMM-based speech recognition," Computer speech & language, vol. 12, no. 2, pp. 75-98, 1998.
- (1998) Computer Speech & Language , vol.12 , Issue.2 , pp. 75-98
- Gales, M.¹

42
- 84906274730
- Sequence-discriminative training of deep neural networks
- K. Vesely, A. Ghoshal, L. Burget, and D. Povey, "Sequence-discriminative training of deep neural networks," in Proc. Interspeech, 2013, pp. 2345-2349.
- (2013) Proc. Interspeech , pp. 2345-2349
- Vesely, K.¹ Ghoshal, A.² Burget, L.³ Povey, D.⁴

43
- 33745219793
- General indexation of weighted automata-application to spoken utterance retrieval
- M. Mohri, C. Allauzen, and M. Saraclar, "General indexation of weighted automata-application to spoken utterance retrieval," Proc. HLT/NAACL, 2004, pp. 33-40.
- (2004) Proc. HLT/NAACL , pp. 33-40
- Mohri, M.¹ Allauzen, C.² Saraclar, M.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.