SCOPUS Information Search Platform

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

Volume , Issue , 2014, Pages 835-839

Combining tandem and hybrid systems for improved speech recognition and keyword spotting on low resource languages

(4) Rath, Shakti P. (a); Knill, Kate M. (a); Ragni, Anton (a); Gales, Mark J. F. (a)

(a) University of Cambridge (United Kingdom)

Author keywords

Deep neural network; Hybrid; Keyword spotting; Tandem

Indexed keywords

DECISION TREES; HYBRID SYSTEMS; SPEECH COMMUNICATION;

AUTOMATIC SPEECH RECOGNITION; CONSISTENT PERFORMANCE; DEEP NEURAL NETWORKS; HYBRID; KEYWORD SPOTTING; LOW RESOURCE LANGUAGES; PERFORMANCE GAIN; TANDEM;

SPEECH RECOGNITION;

EID: 84910068314 PISSN: 2308457X EISSN: 19909772 Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited: (21)

References (32)

1
- 84893706598
- Mary Harper, "IARPA Babel Program, " http://www.iarpa.gov/Programs/ia/Babel/babel.html.
- IARPA Babel Program
- Harper, M.¹

2
- 85032751458
- Deep neural networks for acoustic modeling in speech recognition
- G. Hinton, L. Deng, et al., "Deep Neural Networks for Acoustic Modeling in Speech Recognition, " Signal Processing Magazine, IEEE, vol. 29, no. 6, pp. 82-97, Nov 2012.
- (2012) Signal Processing Magazine , vol.29 , Issue.6 , pp. 82-97
- Hinton, G.¹ Deng, L.²

3
- 84858976070
- Feature engineering in context-dependent deep neural networks for conversational speech transcription
- F. Seide, G. Li, X. Chen, and D. Yu, "Feature engineering in context-dependent deep neural networks for conversational speech transcription, " in Proc. of ASRU, Dec 2011.
- (2011) Proc. of ASRU
- Seide, F.¹ Li, G.² Chen, X.³ Yu, D.⁴

4
- 0033709098
- Tandem connectionist feature extraction for conventional HMM systems
- H. Hermansky, D. Ellis, and S. Sharma, "Tandem Connectionist Feature Extraction for Conventional HMM Systems, " in Proc. of ICASSP, 2000.
- (2000) Proc. of ICASSP
- Hermansky, H.¹ Ellis, D.² Sharma, S.³

5
- 84858955616
- Study of probabilistic and bottle-neck features in multilingual environment
- Frantisek Grezl, Martin Karafiat, and Milos Janda, "Study of probabilistic and bottle-neck features in multilingual environment, " in Proc. of ASRU, 2011.
- (2011) Proc. of ASRU
- Grezl, F.¹ Karafiat, M.² Janda, M.³

6
- 0003573244
- Kluwer Academic Publishers, Norwell, MA, USA
- H. A. Bourlard and N. Morgan, Connectionist Speech Recognition: A Hybrid Approach, Kluwer Academic Publishers, Norwell, MA, USA, 1993.
- (1993) Connectionist Speech Recognition: A Hybrid Approach
- Bourlard, H.A.¹ Morgan, N.²

7
- 0001860529
- A post-processing system to yield reduced word error rates: Recogniser output voting error reduction (ROVER)
- J. G. Fiscus, "A post-processing system to yield reduced word error rates: Recogniser Output Voting Error Reduction (ROVER), " in Proc. of ASRU, 1997.
- (1997) Proc. of ASRU
- Fiscus, J.G.¹

8
- 0033676943
- Large vocabulary decoding and confidence estimation using word posterior probabilities
- G. Evermann and P. C. Woodland, "Large vocabulary decoding and confidence estimation using word posterior probabilities, " in Proc. of ICASSP 2000.
- (2000) Proc. of ICASSP
- Evermann, G.¹ Woodland, P.C.²

9
- 43849104109
- Rapid and accurate spoken term detection
- D. R. H. Miller, M. Kleber, et al., "Rapid and accurate spoken term detection, " in Proc. of Interspeech, 2007.
- (2007) Proc. of Interspeech
- Miller, D.R.H.¹ Kleber, M.²

10
- 43849107771
- The SRI/OGI 2006 spoken term detection system
- D. Vergyri, I. Shafran, et al., "The SRI/OGI 2006 spoken term detection system, " in Proc. of Interspeech, 2007.
- (2007) Proc. of Interspeech
- Vergyri, D.¹ Shafran, I.²

11
- 67649518727
- Subword modeling of out of vocabulary words in spoken term detection
- I. Szoke, L. Burget, J. Cernocky, and M. Fapso, "Subword modeling of out of vocabulary words in spoken term detection, " in Proc. of SLT 2008.
- (2008) Proc. of SLT
- Szoke, I.¹ Burget, L.² Cernocky, J.³ Fapso, M.⁴

12
- 84890489531
- System combination and score normalization for spoken term detection
- J. Mamou et al., "System combination and score normalization for spoken term detection, " in Proc. of ICASSP, 2013.
- (2013) Proc. of ICASSP
- Mamou, J.¹

13
- 84890542302
- Exploiting diversity for spoken term detection
- L. Mangu, H. Soltau, H.-K. Kuo, B. Kingsbury, and G. Saon, "Exploiting diversity for spoken term detection, " in Proc. of ICASSP, 2013.
- (2013) Proc. of ICASSP
- Mangu, L.¹ Soltau, H.² Kuo, H.-K.³ Kingsbury, B.⁴ Saon, G.⁵

14
- 84890537373
- A high-performance Cantonese keyword search system
- B. Kingsbury et al., "A high-performance Cantonese keyword search system, " in Proc. of ICASSP, 2013.
- (2013) Proc. of ICASSP
- Kingsbury, B.¹

15
- 84893692703
- Score normalization and system combination for improved keyword spotting
- D. Karakos, R. Schwartz, S. Tsakalidis, L. Zhang, et al., "Score normalization and system combination for improved keyword spotting, " in Proc. of ASRU 2013.
- (2013) Proc. of ASRU
- Karakos, D.¹ Schwartz, R.² Tsakalidis, S.³ Zhang, L.⁴

16
- 79951634009
- Results of the 2006 spoken term detection evaluation
- J. G. Fiscus et al., "Results of the 2006 Spoken Term Detection Evaluation, " in Proc. ACM SIGIR Workshop on Searching Spontaneous Conversational Speech, 2007.
- (2007) Proc. ACM SIGIR Workshop on Searching Spontaneous Conversational Speech
- Fiscus, J.G.¹

17
- 0003571976
- Cambridge University
- S. J. Young, G. Evermann, M. J. F. Gales, T. Hain, D. Kershaw, X. Liu, G. Moore, J. Odell, D. Ollason, D. Povey, V. Valtchev, and P. C. Woodland, The HTK Book (for HTK version 3.4.1), Cambridge University, http://htk.eng.cam.ac.uk 2009.
- (2009) The HTK Book (For HTK Version 3.4.1)
- Young, S.J.¹ Evermann, G.² Gales, M.J.F.³ Hain, T.⁴ Kershaw, D.⁵ Liu, X.⁶ Moore, G.⁷ Odell, J.⁸ Ollason, D.⁹ Povey, D.¹⁰ Valtchev, V.¹¹ Woodland, P.C.¹²

18
- 84893712779
- David Johnson et al., "QuickNet, " http://www1.icsi.berkeley.edu/Speech/qn.html.
- QuickNet
- Johnson, D.¹

19
- 85009139544
- Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis
- T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis, " in Proc. of Eurospeech, 1999.
- (1999) Proc. of Eurospeech
- Yoshimura, T.¹ Tokuda, K.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

20
- 0002144369
- Tree-based state tying for high accuracy acoustic modelling
- S. J. Young, J. J. Odell, and P. C. Woodland, "Tree-based state tying for high accuracy acoustic modelling, " in Proceedings ARPA Workshop on Human Language Technology, 1994, pp. 307-312.
- (1994) Proceedings ARPA Workshop on Human Language Technology , pp. 307-312
- Young, S.J.¹ Odell, J.J.² Woodland, P.C.³

21
- 85023776577
- Flexible decision trees for grapheme based speech recognition
- Cottbus, Germany
- Borislava Mimer, Sebastian Stüker, and Tanja Schultz, "Flexible decision trees for grapheme based speech recognition, " in Proc. 15th Conference Elektronische Sprachsignalverarbeitung (ESSV), Cottbus, Germany, 2004.
- (2004) Proc. 15th Conference Elektronische Sprachsignalverarbeitung (ESSV)
- Mimer, B.¹ Stüker, S.² Schultz, T.³

22
- 79251574977
- The efficient incorporation of MLP features into automatic speech recognition systems
- J. Park et al., "The Efficient Incorporation of MLP Features into Automatic Speech Recognition Systems, " Computer Speech and Language, vol. 25, pp. 519-534, 2010.
- (2010) Computer Speech and Language , vol.25 , pp. 519-534
- Park, J.¹

23
- 0032638856
- Semi-tied covariance matrices for hidden Markov models
- M. J. F. Gales, "Semi-tied covariance matrices for hidden Markov models, " IEEE Transactions on Speech and Audio Processing, vol. 7, no. 3, pp. 272-281, May 1999.
- (1999) IEEE Transactions on Speech and Audio Processing , vol.7 , Issue.3 , pp. 272-281
- Gales, M.J.F.¹

24
- 0032050110
- Maximum likelihood linear transformations for HMM-Based speech recognition
- M. J. F. Gales, "Maximum Likelihood Linear Transformations for HMM-Based Speech Recognition, " Computer Speech and Language, vol. 12, no. 2, pp. 75-98, 1998.
- (1998) Computer Speech and Language , vol.12 , Issue.2 , pp. 75-98
- Gales, M.J.F.¹

25
- 0036296863
- Minimum Phone Error and I-smoothing for improved discriminative training
- D. Povey and P. C. Woodland, "Minimum Phone Error and I-smoothing for improved discriminative training, " in Proc. of ICASSP, 2002.
- (2002) Proc. of ICASSP
- Povey, D.¹ Woodland, P.C.²

26
- 33646788786
- fMPE: Discriminatively trained features for speech recognition
- D. Povey et al., "fMPE: Discriminatively trained features for speech recognition, " in Proc. of ICASSP, 2005.
- (2005) Proc. of ICASSP
- Povey, D.¹

27
- 84893681011
- Vocal tract length perturbation (VTLP) improves speech recognition
- N. Jaitly and G. E. Hinton, "Vocal tract length perturbation (VTLP) improves speech recognition, " in Proc. of ICML, 2013.
- (2013) Proc. of ICML
- Jaitly, N.¹ Hinton, G.E.²

28
- 84905247925
- Data augmentation for deep neural network acoustic modeling
- X. Cui, V. Goel, and B. Kingsbury, "Data augmentation for deep neural network acoustic modeling, " in Proc. of ICASSP, 2014.
- (2014) Proc. of ICASSP
- Cui, X.¹ Goel, V.² Kingsbury, B.³

29
- 0002162027
- Utilizing untranscribed training data to improve performance
- G. Zavaliagkos and T. Colthurst, "Utilizing untranscribed training data to improve performance, " in Proc. of DARPA Broadcast news transcription and understanding workshop, 1998.
- (1998) Proc. of DARPA Broadcast News Transcription and Understanding Workshop
- Zavaliagkos, G.¹ Colthurst, T.²

30
- 0036460908
- Lightly supervised and unsupervised acoustic model training
- L. Lamel and J.-L. Gauvain, "Lightly supervised and unsupervised acoustic model training, " Computer Speech and Language, vol. 16, pp. 115-129, 2002.
- (2002) Computer Speech and Language , vol.16 , pp. 115-129
- Lamel, L.¹ Gauvain, J.-L.²

31
- 34047266379
- Progress in the CUHTK broadcast news transcription system
- M. J. F. Gales, D. Y. Kim, P. C. Woodland, H. Y. Chan, D. Mrva, R. Sinha, and S. E. Tranter, "Progress in the CUHTK broadcast news transcription system, " IEEE Trans. ASLP, vol. 14, no. 5, pp. 1513-1525, 2006.
- (2006) IEEE Trans. ASLP , vol.14 , Issue.5 , pp. 1513-1525
- Gales, M.J.F.¹ Kim, D.Y.² Woodland, P.C.³ Chan, H.Y.⁴ Mrva, D.⁵ Sinha, R.⁶ Tranter, S.E.⁷

32
- 84906932692
- Unsupervised morphology-based vocabulary expansion
- M. S. Rasooli, N. Habash, O. Rambow, and T. Lippincott, "Unsupervised morphology-based vocabulary expansion, " in The 52nd Annual Meeting of the Association for Computational Linguistics, 2014.
- (2014) The 52nd Annual Meeting of the Association for Computational Linguistics
- Rasooli, M.S.¹ Habash, N.² Rambow, O.³ Lippincott, T.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.