Volume 51, Issue 11, 2009, Pages 1139-1153

A study on integrating acoustic-phonetic information into lattice rescoring for automatic speech recognition

Author keywords

Continuous phone recognition; Knowledge based system; Large vocabulary continuous speech recognition; Lattice rescoring

Indexed keywords

ARTIFICIAL NEURAL NETWORK; AUTOMATIC SPEECH RECOGNITION; CONFIDENCE LEVELS; CONFIDENCE SCORE; CONNECTED DIGITS; CONTINUOUS PHONE RECOGNITION; DATA-DRIVEN; GENERIC FRAMEWORKS; LANGUAGE MODEL; LARGE VOCABULARY CONTINUOUS SPEECH RECOGNITION; LATTICE RESCORING; LOG LIKELIHOOD; LOG LIKELIHOOD RATIO; PHONE RECOGNITION; PHONETIC INFORMATION; PHONETIC PROPERTIES; SPEECH EVENTS;

EID: 67650999674     PISSN: 01676393     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.specom.2009.05.004     Document Type: Article
Times cited : (62)

References (59)
  • 1
    • 0002837464
    • Maximum mutual information estimation of HMM parameters for speech recognition
    • Tokyo, Japan
    • Bahl, L.R., Brown, P.F., deSouza, P.V., Mercer, R.L., 1986. Maximum mutual information estimation of HMM parameters for speech recognition. In: Proc. of ICASSP, Tokyo, Japan.
    • (1986) Proc. of ICASSP
    • Bahl, L.R.1    Brown, P.F.2    deSouza, P.V.3    Mercer, R.L.4
  • 2
    • 85009138487
    • Modeling out-of-vocabulary words for robust speech recognition
    • Beijing, China
    • Bazzi, I., Glass, J., 2000. Modeling out-of-vocabulary words for robust speech recognition. In: Proc. of ICSLP, Beijing, China, pp. 401-404.
    • (2000) Proc. of ICSLP , pp. 401-404
    • Bazzi, I.1    Glass, J.2
  • 3
    • 0029725523
    • Knowledge-based parameters for HMM speech recognition
    • Atlanta, USA
    • Bitar, N.N., Espy-Wilson, C.Y., 1996. Knowledge-based parameters for HMM speech recognition. In: Proc. of ICASSP, Atlanta, USA, pp. 29-32.
    • (1996) Proc. of ICASSP , pp. 29-32
    • Bitar, N.N.1    Espy-Wilson, C.Y.2
  • 5
    • 44949118857
    • A new framework for system combination based on integrated hypothesis space
    • Pittsburgh, USA
    • Chen, I.-F., Lee, L.-S., 2006. A new framework for system combination based on integrated hypothesis space. In: Proc. of InterSpeech, Pittsburgh, USA, pp. 533-536.
    • (2006) Proc. of InterSpeech , pp. 533-536
    • Chen, I.-F.1    Lee, L.-S.2
  • 6
    • 85009110188
    • Learning long-term temporal features in LVCSR using neural networks
    • Jeju Island, Korea
    • Chen, B., Zhu, Q., Morgan, N., 2004. Learning long-term temporal features in LVCSR using neural networks. In: Proc. of InterSpeech, Jeju Island, Korea, pp. 925-928.
    • (2004) Proc. of InterSpeech , pp. 925-928
    • Chen, B.1    Zhu, Q.2    Morgan, N.3
  • 7
    • 0019053271
    • Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
    • Davis S., and Mermelstein P. Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Trans. Acoust. Speech Signal Process. 28 4 (1980) 357-366
    • (1980) IEEE Trans. Acoust. Speech Signal Process. , vol.28 , Issue.4 , pp. 357-366
    • Davis, S.1    Mermelstein, P.2
  • 8
    • 84871614151
    • Distinctive features for use in an automatic speech recognition system
    • Aalborg, Denmark
    • Eide, E., 2001. Distinctive features for use in an automatic speech recognition system. In: Proc. of EuroSpeech, Aalborg, Denmark, pp. 1613-1616.
    • (2001) Proc. of EuroSpeech , pp. 1613-1616
    • Eide, E.1
  • 9
    • 0033676943
    • Large vocabulary decoding and confidence estimation using word posterior probabilities
    • Istanbul, Turkey
    • Evermann, G., Woodland, P.C., 2000. Large vocabulary decoding and confidence estimation using word posterior probabilities. In: Proc. of ICASSP, Istanbul, Turkey, pp. 1655-1658.
    • (2000) Proc. of ICASSP , pp. 1655-1658
    • Evermann, G.1    Woodland, P.C.2
  • 11
    • 0030638031
    • A post-processing system to yield reduced word error rates: Recogniser output voting error reduction (ROVER)
    • Santa Barbara, USA
    • Fiscus, J.G., 1997. A post-processing system to yield reduced word error rates: recogniser output voting error reduction (ROVER). In: Proc. of IEEE ASRU Workshop, Santa Barbara, USA, pp. 347-352.
    • (1997) Proc. of IEEE ASRU Workshop , pp. 347-352
    • Fiscus, J.G.1
  • 12
    • 34249932867
    • Articulatory feature recognition using dynamic Bayesian networks
    • Frankel J., Wester M., and King S. Articulatory feature recognition using dynamic Bayesian networks. Comput. Speech Lang. 21 4 (2007) 620-640
    • (2007) Comput. Speech Lang. , vol.21 , Issue.4 , pp. 620-640
    • Frankel, J.1    Wester, M.2    King, S.3
  • 13
    • 34547528169
    • Generalization of minimum classification error (MCE) training based on maximizing generalized posterior probability (GPP)
    • Pittsburgh, USA
    • Fu, Q., Moreno, A.D., Juang, B.J., Zhou, L., Soong, F.K., 2006. Generalization of minimum classification error (MCE) training based on maximizing generalized posterior probability (GPP). In: Proc. of Interspeech, Pittsburgh, USA, pp. 681-684.
    • (2006) Proc. of Interspeech , pp. 681-684
    • Fu, Q.1    Moreno, A.D.2    Juang, B.J.3    Zhou, L.4    Soong, F.K.5
  • 15
    • 0001492251
    • Minimum Bayes-risk automatic speech recognition
    • Goel V., and Byrne W. Minimum Bayes-risk automatic speech recognition. Comput. Speech Lang. 14 2 (2000) 115-135
    • (2000) Comput. Speech Lang. , vol.14 , Issue.2 , pp. 115-135
    • Goel, V.1    Byrne, W.2
  • 16
    • 4544339430
    • Parsing speech into articulatory events
    • Montreal, Canada
    • Hacioglu, K., Pellom, B., Ward, W., 2004. Parsing speech into articulatory events. In: Proc. of ICASSP, Montreal, Canada, pp. 925-928.
    • (2004) Proc. of ICASSP , pp. 925-928
    • Hacioglu, K.1    Pellom, B.2    Ward, W.3
  • 17
    • 67650784457
    • Hasegawa, M. et al., 2005. Landmark-Based Speech Recognition: Report of the 2004 Johns Hopkins Summer Workshop. In: Proc. of ICASSP, Philadelphia, USA.
  • 20
    • 33745213373
    • Multi-resolution RASTA filtering for TANDEM-based ASR
    • Lisbon, Portugal
    • Hermansky, H., Fousek, P., 2005. Multi-resolution RASTA filtering for TANDEM-based ASR. In: Proc. of InterSpeech, Lisbon, Portugal, pp. 361-364.
    • (2005) Proc. of InterSpeech , pp. 361-364
    • Hermansky, H.1    Fousek, P.2
  • 21
    • 84898982939
    • Exploiting generative models in discriminative classifiers
    • Solla S., and Cohn D.A. (Eds), MIT Press
    • Jaakkola T., and Haussler D. Exploiting generative models in discriminative classifiers. In: Solla S., and Cohn D.A. (Eds). Advances in Neural Information Processing Systems (1999), MIT Press 487-493
    • (1999) Advances in Neural Information Processing Systems , pp. 487-493
    • Jaakkola, T.1    Haussler, D.2
  • 22
    • 34047115134
    • Large margin hidden Markov models for speech recognition
    • Jiang H., Li X., and Liu C. Large margin hidden Markov models for speech recognition. IEEE Trans. Audio Speech Lang. Process. 14 5 (2006) 1584-1595
    • (2006) IEEE Trans. Audio Speech Lang. Process. , vol.14 , Issue.5 , pp. 1584-1595
    • Jiang, H.1    Li, X.2    Liu, C.3
  • 23
    • 0022691022
    • Maximum likelihood estimation for multivariate mixture observation of Markov chains
    • Juang B.-H., Levinson S.E., and Sondhi M.M. Maximum likelihood estimation for multivariate mixture observation of Markov chains. IEEE Trans. Inform. Theory IT-32 2 (1986) 307-309
    • (1986) IEEE Trans. Inform. Theory , vol.IT-32 , Issue.2 , pp. 307-309
    • Juang, B.-H.1    Levinson, S.E.2    Sondhi, M.M.3
  • 24
    • 0032205629
    • Flexible speech understanding based on combined key-phrase detection and verification
    • Kawahara T., Lee C.-H., and Juang B.-J. Flexible speech understanding based on combined key-phrase detection and verification. IEEE Trans. Speech Audio Process. 6 6 (1998) 558-568
    • (1998) IEEE Trans. Speech Audio Process. , vol.6 , Issue.6 , pp. 558-568
    • Kawahara, T.1    Lee, C.-H.2    Juang, B.-J.3
  • 26
    • 0038193561
    • Combining articulatory and acoustic information for speech recognition in noisy and reverberant environments
    • Sydney, Australia
    • Kirchhoff, K., 1998. Combining articulatory and acoustic information for speech recognition in noisy and reverberant environments. In: Proc. of ICSLP, Sydney, Australia, pp. 891-894.
    • (1998) Proc. of ICSLP , pp. 891-894
    • Kirchhoff, K.1
  • 28
    • 0035509488
    • Speech recognition and utterance verification based on a generalized confidence score
    • Koo M.-W., Lee C.-H., and Juang B.-J. Speech recognition and utterance verification based on a generalized confidence score. IEEE Trans. Speech Audio Process. 9 8 (2001) 821-831
    • (2001) IEEE Trans. Speech Audio Process. , vol.9 , Issue.8 , pp. 821-831
    • Koo, M.-W.1    Lee, C.-H.2    Juang, B.-J.3
  • 29
    • 0038676761
    • Towards knowledge-based features for HMM based large vocabulary automatic speech recognition
    • Orlando, USA
    • Launay, B., Siohan, O., Surendran, A.C., Lee, C.-H., 2002. Towards knowledge-based features for HMM based large vocabulary automatic speech recognition. In: Proc. of ICASSP, Orlando, USA, pp. 817-820.
    • (2002) Proc. of ICASSP , pp. 817-820
    • Launay, B.1    Siohan, O.2    Surendran, A.C.3    Lee, C.-H.4
  • 30
    • 0024768209
    • Speaker-independent phone recognition using hidden Markov models
    • Lee K., and Hon H. Speaker-independent phone recognition using hidden Markov models. IEEE Trans. Acoust. Speech Signal Process. 37 11 (1989) 1641-1648
    • (1989) IEEE Trans. Acoust. Speech Signal Process. , vol.37 , Issue.11 , pp. 1641-1648
    • Lee, K.1    Hon, H.2
  • 31
    • 0008520151
    • A unified statistical hypothesis testing approach to speaker verification and verbal information verification
    • Rhodes, Greece
    • Lee, C.-H., 1997. A unified statistical hypothesis testing approach to speaker verification and verbal information verification. In: Proc. of COST Workshop Speech Technology Public Telephone Network: Where Are We Today? Rhodes, Greece, pp. 62-73.
    • (1997) Proc. of COST Workshop Speech Technology Public Telephone Network: Where Are We Today , pp. 62-73
    • Lee, C.-H.1
  • 32
    • 33744917190
    • From knowledge-ignorant to knowledge-rich modeling: A new speech research paradigm for next generation automatic speech recognition
    • Jeju Island, Korea
    • Lee, C.-H., 2004. From knowledge-ignorant to knowledge-rich modeling: a new speech research paradigm for next generation automatic speech recognition. In: Proc. of InterSpeech, Jeju Island, Korea, pp. 109-112.
    • (2004) Proc. of InterSpeech , pp. 109-112
    • Lee, C.-H.1
  • 33
    • 0002560960
    • A database for speaker independent digit recognition
    • San Diego, USA
    • Leonard, R.G., 1984. A database for speaker independent digit recognition. In: Proc. of ICASSP, San Diego, USA.
    • (1984) Proc. of ICASSP
    • Leonard, R.G.1
  • 34
    • 33646781303
    • A study on knowledge source integration for rescoring in automatic speech recognition
    • Philadelphia, USA
    • Li, J., Lee, C.-H., 2005. A study on knowledge source integration for rescoring in automatic speech recognition. In: Proc. of ICASSP, Philadelphia, USA, pp. 837-840.
    • (2005) Proc. of ICASSP , pp. 837-840
    • Li, J.1    Lee, C.-H.2
  • 35
    • 33745220723
    • On designing and evaluating speech event detectors
    • Lisbon, Portugal
    • Li, J., Tsao, Y., Lee, C.-H., 2005. On designing and evaluating speech event detectors. In: Proc. of InterSpeech, Lisbon, Portugal, pp. 3365-3368.
    • (2005) Proc. of InterSpeech , pp. 3365-3368
    • Li, J.1    Tsao, Y.2    Lee, C.-H.3
  • 36
    • 34547506259
    • Soft margin estimation of hidden Markov model parameters
    • Pittsburgh, USA
    • Li, J., Yuan, M., and Lee, C.-H., 2006. Soft margin estimation of hidden Markov model parameters. In: Proc. of InterSpeech, Pittsburgh, USA, pp. 2422-2425.
    • (2006) Proc. of InterSpeech , pp. 2422-2425
    • Li, J.1    Yuan, M.2    Lee, C.-H.3
  • 37
    • 64149098818
    • Approximate test risk bound minimization through soft margin estimation
    • Li J., Yuan M., and Lee C.H. Approximate test risk bound minimization through soft margin estimation. IEEE Trans. Audio Speech Lang. Process. 15 8 (2007) 2393-2404
    • (2007) IEEE Trans. Audio Speech Lang. Process. , vol.15 , Issue.8 , pp. 2393-2404
    • Li, J.1    Yuan, M.2    Lee, C.H.3
  • 38
    • 0022149626
    • Structural methods in automatic speech recognition
    • Levinson S.E. Structural methods in automatic speech recognition. Proc. IEEE 73 (1985) 1625-1650
    • (1985) Proc. IEEE , vol.73 , pp. 1625-1650
    • Levinson, S.E.1
  • 39
    • 33745208000
    • Macherey, W., Haferkamp, L., Schlüter, R., Ney, H., 2005. Investigations on error minimizing training criteria for discriminative training in automatic speech recognition. In: Proc. of Interspeech, Lisboa, Portugal, pp. 2133-2136.
  • 40
    • 33947681802
    • Improving reference speaker weighting adaptation by the use of maximum-likelihood reference speakers
    • Toulouse, France
    • Mak, B., Lai, T.C., Hsiao, R., 2006. Improving reference speaker weighting adaptation by the use of maximum-likelihood reference speakers. In: Proc. of ICASSP, Toulouse, France, pp. 222-232.
    • (2006) Proc. of ICASSP , pp. 222-232
    • Mak, B.1    Lai, T.C.2    Hsiao, R.3
  • 42
    • 0141812372
    • A flexible stream architecture for ASR using articulatory features
    • Denver, USA
    • Metze, F., Waibel, A., 2002. A flexible stream architecture for ASR using articulatory features. In: Proc. of ICSLP, Denver, USA, pp. 16-20.
    • (2002) Proc. of ICSLP , pp. 16-20
    • Metze, F.1    Waibel, A.2
  • 44
    • 44949115420
    • Combining phonetic attributes using conditional random fields
    • Pittsburgh, Pennsylvania
    • Morris, J., Fosler-Lussier, E., 2006. Combining phonetic attributes using conditional random fields. In: Proc. of Interspeech, Pittsburgh, Pennsylvania, pp. 597-600.
    • (2006) Proc. of Interspeech , pp. 597-600
    • Morris, J.1    Fosler-Lussier, E.2
  • 45
    • 0036165331
    • Detecting stop consonants in continuous speech
    • Niyogi P., and Sondhi M.M. Detecting stop consonants in continuous speech. J. Acoust. Soc. Amer. 111 2 (2002) 1063-1076
    • (2002) J. Acoust. Soc. Amer. , vol.111 , Issue.2 , pp. 1063-1076
    • Niyogi, P.1    Sondhi, M.M.2
  • 46
    • 0024610919
    • A tutorial on hidden Markov models and selected applications in speech recognition
    • Rabiner L.R. A tutorial on hidden Markov models and selected applications in speech recognition. Proc. IEEE 77 2 (1989) 257-286
    • (1989) Proc. IEEE , vol.77 , Issue.2 , pp. 257-286
    • Rabiner, L.R.1
  • 47
    • 0037697284
    • Hidden articulator Markov models for speech recognition
    • Richardson M., Bilmes J., and Diorio C. Hidden articulator Markov models for speech recognition. Speech Comm. 41 2 (2003) 511-529
    • (2003) Speech Comm. , vol.41 , Issue.2 , pp. 511-529
    • Richardson, M.1    Bilmes, J.2    Diorio, C.3
  • 48
    • 85132936468
    • An HMM acoustic model incorporating various additional knowledge sources
    • Antwerp, Belgium
    • Sakti, S., Markov, K., Nakamura, S., 2007. An HMM acoustic model incorporating various additional knowledge sources. In: Proc. of InterSpeech, Antwerp, Belgium, pp. 2117-2120.
    • (2007) Proc. of InterSpeech , pp. 2117-2120
    • Sakti, S.1    Markov, K.2    Nakamura, S.3
  • 49
    • 0002788850
    • Multiple-pass search strategies
    • Lee C.-H., Soong F.K., and Paliwal K.K. (Eds), Kluwer Academic Publishers, Norwell, MA, USA
    • Schwartz R., Nguyen L., and Makhoul J. Multiple-pass search strategies. In: Lee C.-H., Soong F.K., and Paliwal K.K. (Eds). Automatic Speech and Speaker Recognition (1996), Kluwer Academic Publishers, Norwell, MA, USA 29-456
    • (1996) Automatic Speech and Speaker Recognition , pp. 29-456
    • Schwartz, R.1    Nguyen, L.2    Makhoul, J.3
  • 50
    • 33947620115
    • Hierarchical structures of neural networks for phoneme recognition
    • Toulouse, France
    • Schwarz, P., Matějka, P., Černocký, J., 2006. Hierarchical structures of neural networks for phoneme recognition. In: Proc. of ICASSP06, Toulouse, France, pp. 325-328.
    • (2006) Proc. of ICASSP06 , pp. 325-328
    • Schwarz, P.1    Matějka, P.2    Černocký, J.3
  • 51
    • 34547531127
    • A study on lattice rescoring with knowledge scores for automatic speech recognition
    • Pittsburgh, USA
    • Siniscalchi, S.M., Li, J., Lee, C.-H., 2006. A study on lattice rescoring with knowledge scores for automatic speech recognition. In: Proc. of InterSpeech, Pittsburgh, USA, pp. 517-520.
    • (2006) Proc. of InterSpeech , pp. 517-520
    • Siniscalchi, S.M.1    Li, J.2    Lee, C.-H.3
  • 52
    • 85061808589
    • Explicit word error minimization in N-best list rescoring
    • Rhodes, Greece
    • Stolcke, A., Konig, Y., Weintraub, M., 1997. Explicit word error minimization in N-best list rescoring. In: Proc. of Eurospeech, Rhodes, Greece, pp. 163-165.
    • (1997) Proc. of Eurospeech , pp. 163-165
    • Stolcke, A.1    Konig, Y.2    Weintraub, M.3
  • 53
    • 0141591550
    • Multilingual articulatory features
    • Hong Kong, China
    • Stüker, S., Schultz, T., Metze, F., Waibel, A., 2003. Multilingual articulatory features. In: Proc. of ICASSP, Hong Kong, China, pp. 144-147.
    • (2003) Proc. of ICASSP , pp. 144-147
    • Stüker, S.1    Schultz, T.2    Metze, F.3    Waibel, A.4
  • 54
    • 0008724693
    • A two pass classifier for utterance rejection in keyword spotting
    • Minneapolis, USA
    • Sukkar, R.A., Wilpon, J.G., 1993. A two pass classifier for utterance rejection in keyword spotting. In: Proc. of ICASSP, Minneapolis, USA, pp. 451-454.
    • (1993) Proc. of ICASSP , pp. 451-454
    • Sukkar, R.A.1    Wilpon, J.G.2
  • 55
    • 85009204121
    • Modeling linguistic features in speech recognition
    • Geneva, Switzerland
    • Tang, M., Seneff, S., Zue, V.W., 2003. Modeling linguistic features in speech recognition. In: Proc. of Eurospeech, Geneva, Switzerland, pp. 2585-2588.
    • (2003) Proc. of Eurospeech , pp. 2585-2588
    • Tang, M.1    Seneff, S.2    Zue, V.W.3
  • 56
    • 33745199175
    • A study on separation between acoustic models and its applications
    • Lisbon, Portugal
    • Tsao, Y., Li, J., Lee, C.-H., 2005. A study on separation between acoustic models and its applications. In: Proc. of InterSpeech, Lisbon, Portugal, pp. 1109-1112.
    • (2005) Proc. of InterSpeech , pp. 1109-1112
    • Tsao, Y.1    Li, J.2    Lee, C.-H.3
  • 58
    • 84926060821
    • Large vocabulary continuous speech recognition using HTK
    • Adelaide, Australia
    • Woodland, P.C., Odell, J., Valtchev, V., Young, S., 1994. Large vocabulary continuous speech recognition using HTK. In: Proc. of ICASSP, Adelaide, Australia, pp. 125-128.
    • (1994) Proc. of ICASSP , pp. 125-128
    • Woodland, P.C.1    Odell, J.2    Valtchev, V.3    Young, S.4


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.