Volume 14, Issue 5, 2006, Pages 1513-1525

Progress in the CU-HTK broadcast news transcription system

Author keywords

Automatic speech recognition; Broadcast news (BN) transcription; Diarization

Indexed keywords

ACOUSTIC TRAINING DATA; AUTOMATIC SPEECH RECOGNITION; BROADCAST NEWS (BN) TRANSCRIPTION; DIARIZATION;

EID: 34047266379     PISSN: 15587916     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASL.2006.878264     Document Type: Article
Times cited: 74

References (47)
  • 2
    • 0036296863
    • Minimum phone error and I-smoothing for improved discriminative training
    • Orlando, FL, May
    • D. Povey and P. C. Woodland, "Minimum phone error and I-smoothing for improved discriminative training," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., Orlando, FL, May 2002, pp. 105-108.
    • (2002) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process , pp. 105-108
    • Povey, D.1    Woodland, P.C.2
  • 5
    • 4544253838
    • Improving broadcast news transcription by lightly supervised discriminative training
    • Montreal, QC, Canada, Mar
    • H. Y. Chan and P. C. Woodland, "Improving broadcast news transcription by lightly supervised discriminative training," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., Montreal, QC, Canada, Mar. 2004, pp. 737-740.
    • (2004) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process , pp. 737-740
    • Chan, H.Y.1    Woodland, P.C.2
  • 7
  • 8
    • 34047261805
    • An overview of automatic speaker diarization systems
    • Sep
    • S. E. Tranter and D. A. Reynolds, "An overview of automatic speaker diarization systems," IEEE Trans. Audio, Speech, Lang. Process., vol. 14, no. 5, pp. 1555-1563, Sep. 2006.
    • (2006) IEEE Trans. Audio, Speech, Lang. Process , vol.14 , Issue.5 , pp. 1555-1563
    • Tranter, S.E.1    Reynolds, D.A.2
  • 9
    • 33646357306
    • The Cambridge University March 2005 speaker diarization system
    • Lisbon, Portugal, Sep
    • R. Sinha, S. E. Tranter, M. J. F. Gales, and P. C. Woodland, "The Cambridge University March 2005 speaker diarization system," in Proc. InterSpeech, Lisbon, Portugal, Sep. 2005, pp. 2347-2350.
    • (2005) Proc. InterSpeech , pp. 2347-2350
    • Sinha, R.1    Tranter, S.E.2    Gales, M.J.F.3    Woodland, P.C.4
  • 11
    • 0036567851
    • The LIMSI broadcast news transcription system
    • J.-L. Gauvain, L. Lamel, and G. Adda, "The LIMSI broadcast news transcription system," Comput. Speech Lang., pp. 89-108, 2002.
    • (2002) Comput. Speech Lang , pp. 89-108
    • Gauvain, J.-L.1    Lamel, L.2    Adda, G.3
  • 12
    • 34047259784
    • CTS decoding improvements at IBM
    • St. Thomas, U.S. Virgin Islands, Dec
    • G. Saon, D. Povey, and G. Zweig, "CTS decoding improvements at IBM," presented at the Proc. EARS STT Workshop, St. Thomas, U.S. Virgin Islands, Dec. 2003, p. XXX.
    • (2003) Proc. EARS STT Workshop
    • Saon, G.1    Povey, D.2    Zweig, G.3
  • 13
    • 0036460908
    • Lightly supervised and unsupervised acoustic model training
    • L. Lamel and J.-L. Gauvain, "Lightly supervised and unsupervised acoustic model training," Comput. Speech Lang., vol. 16, pp. 115-129, 2002.
    • (2002) Comput. Speech Lang , vol.16 , pp. 115-129
    • Lamel, L.1    Gauvain, J.-L.2
  • 15
    • 34047246426
    • Lightly supervised discriminative training for LVCSR
    • Master's thesis, Cambridge Univ, Cambridge, U.K
    • H. Y. Chan, "Lightly supervised discriminative training for LVCSR," Master's thesis, Cambridge Univ., Cambridge, U.K., 2004.
    • (2004)
    • Chan, H.Y.1
  • 16
    • 0030638031
    • A post-processing system to yield reduced word error rates: Recognizer Output Voting Error Reduction (ROVER)
    • J. G. Fiscus, "A post-processing system to yield reduced word error rates: Recognizer Output Voting Error Reduction (ROVER)," in Proc. IEEE ASRU Workshop, 1997, pp. 347-352.
    • (1997) Proc. IEEE ASRU Workshop , pp. 347-352
    • Fiscus, J.G.1
  • 17
    • 84946728861
    • Design of fast LVCSR systems
    • St. Thomas, U.S. Virgin Islands, Nov
    • G. Evermann and P. C. Woodland, "Design of fast LVCSR systems," in Proc. IEEE ASRU Workshop, St. Thomas, U.S. Virgin Islands, Nov. 2003, pp. 7-12.
    • (2003) Proc. IEEE ASRU Workshop , pp. 7-12
    • Evermann, G.1    Woodland, P.C.2
  • 20
    • 33745185104
    • Combining speaker identification and BIC for speaker diarization
    • Lisbon, Portugal, Sep
    • X. Zhu, C. Barras, S. Meignier, and J.-L. Gauvain, "Combining speaker identification and BIC for speaker diarization," in Proc. InterSpeech, Lisbon, Portugal, Sep. 2005, pp. 2441-2444.
    • (2005) Proc. InterSpeech , pp. 2441-2444
    • Zhu, X.1    Barras, C.2    Meignier, S.3    Gauvain, J.-L.4
  • 21
    • 34047257943
    • An investigation into the interactions between speaker diarization systems and automatic speech transcription
    • S. E. Tranter, K. Yu, D. A. Reynolds, G. Evermann, D. Y. Kim, and P. C. Woodland, "An investigation into the interactions between speaker diarization systems and automatic speech transcription," Cambridge Univ. Eng. Dept., Tech. Rep. CUED/F-INFENG/TR-464, 2003.
  • 22
    • 85128356454
    • Partitioning and transcription of broadcast news data
    • Sydney, Australia, Dec
    • J.-L. Gauvain, L. Lamel, and G. Adda, "Partitioning and transcription of broadcast news data," in Proc. Int. Conf. Spoken Lang. Process., vol. 4, Sydney, Australia, Dec. 1998, pp. 1335-1338.
    • (1998) Proc. Int. Conf. Spoken Lang. Process , vol.4 , pp. 1335-1338
    • Gauvain, J.-L.1    Lamel, L.2    Adda, G.3
  • 24
    • 0036567794
    • The development of the HTK broadcast news transcription system: An overview
    • P. C. Woodland, "The development of the HTK broadcast news transcription system: An overview," Speech Commun., vol. 37, pp. 47-67, 2002.
    • (2002) Speech Commun , vol.37 , pp. 47-67
    • Woodland, P.C.1
  • 26
    • 0003871508
    • Investigation of silicon-auditory models and generalization of linear discriminant analysis for improved speech recognition
    • Ph.D. dissertation, Johns Hopkins Univ, Baltimore, MD
    • N. Kumar, "Investigation of silicon-auditory models and generalization of linear discriminant analysis for improved speech recognition," Ph.D. dissertation, Johns Hopkins Univ., Baltimore, MD, 1997.
    • (1997)
    • Kumar, N.1
  • 27
    • 0032638856
    • Semi-tied covariance matrices for hidden Markov models
    • May
    • M. J. F. Gales, "Semi-tied covariance matrices for hidden Markov models," IEEE Trans. Speech Audio Process., vol. 7, no. 3, pp. 272-281, May 1999.
    • (1999) IEEE Trans. Speech Audio Process , vol.7 , Issue.3 , pp. 272-281
    • Gales, M.J.F.1
  • 28
    • 34047247141
    • The HTK Book, version 3.3
    • S. J. Young, G. Evermann, M. J. F. Gales, T. Hain, D. Kershaw, G. Moore, J. Odell, D. Ollason, D. Povey, V. Valtchev, and P. C. Woodland, The HTK Book, version 3.3. Cambridge, U.K.: Cambridge Univ. Eng. Dept., 2005.
  • 30
    • 0036461035
    • Large scale discriminative training of hidden Markov models for speech recognition
    • P. C. Woodland and D. Povey, "Large scale discriminative training of hidden Markov models for speech recognition," Comput. Speech Lang., vol. 16, pp. 25-47, 2002.
    • (2002) Comput. Speech Lang , vol.16 , pp. 25-47
    • Woodland, P.C.1    Povey, D.2
  • 31
    • 0028419019
    • Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains
    • Apr
    • J.-L. Gauvain and C.-H. Lee, "Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains," IEEE Trans. Speech Audio Process., vol. 2, no. 2, pp. 291-298, Apr. 1994.
    • (1994) IEEE Trans. Speech Audio Process , vol.2 , Issue.2 , pp. 291-298
    • Gauvain, J.-L.1    Lee, C.-H.2
  • 32
    • 84891308106
    • SRILM - an extensible language modeling toolkit
    • Denver, CO, Sep
    • A. Stolcke, "SRILM - an extensible language modeling toolkit," in Proc. Int. Conf. Spoken Lang. Process., Denver, CO, Sep. 2002, pp. 901-904.
    • (2002) Proc. Int. Conf. Spoken Lang. Process , pp. 901-904
    • Stolcke, A.1
  • 34
    • 0030366664
    • Iterative unsupervised adaptation using maximum likelihood linear regression
    • Philadelphia, PA
    • P. Woodland, D. Pye, and M. Gales, "Iterative unsupervised adaptation using maximum likelihood linear regression," in Proc. Int. Conf. Spoken Lang. Process., Philadelphia, PA, 1996, pp. 1133-1136.
    • (1996) Proc. Int. Conf. Spoken Lang. Process , pp. 1133-1136
    • Woodland, P.1    Pye, D.2    Gales, M.3
  • 35
    • 33745225187
    • The 2004 BBN 1 × RT recognition systems for English broadcast news and conversational telephone speech
    • Lisbon, Portugal, Sep
    • S. Matsoukas, R. Prasad, S. Laxminarayan, B. Xiang, L. Nguyen, and R. Schwartz, "The 2004 BBN 1 × RT recognition systems for English broadcast news and conversational telephone speech," in Proc. InterSpeech, Lisbon, Portugal, Sep. 2005, pp. 1641-1644.
    • (2005) Proc. InterSpeech , pp. 1641-1644
    • Matsoukas, S.1    Prasad, R.2    Laxminarayan, S.3    Xiang, B.4    Nguyen, L.5    Schwartz, R.6
  • 36
    • 85009192356
    • An architecture for rapid decoding of large vocabulary conversational speech
    • Geneva, Switzerland, Sep
    • G. Saon, G. Zweig, B. Kingsbury, L. Mangu, and U. Chaudhari, "An architecture for rapid decoding of large vocabulary conversational speech," in Proc. Eur. Conf. Speech Commun. Technol., Geneva, Switzerland, Sep. 2003, pp. 1977-1980.
    • (2003) Proc. Eur. Conf. Speech Commun. Technol , pp. 1977-1980
    • Saon, G.1    Zweig, G.2    Kingsbury, B.3    Mangu, L.4    Chaudhari, U.5
  • 37
    • 4544253834
    • Posterior probability decoding, confidence estimation, and system combination
    • College Park, MD, May
    • G. Evermann and P. C. Woodland, "Posterior probability decoding, confidence estimation, and system combination," in Proc. Speech Transcription Workshop, College Park, MD, May 2000.
    • (2000) Proc. Speech Transcription Workshop
    • Evermann, G.1    Woodland, P.C.2
  • 38
    • 0029288633
    • Maximum likelihood linear regression for speaker adaptation of continuous density HMMs
    • C. J. Leggetter and P. C. Woodland, "Maximum likelihood linear regression for speaker adaptation of continuous density HMMs," Comput. Speech Lang., vol. 9, pp. 171-186, 1995.
    • (1995) Comput. Speech Lang , vol.9 , pp. 171-186
    • Leggetter, C.J.1    Woodland, P.C.2
  • 39
    • 0032050110
    • Maximum likelihood linear transformations for HMM-based speech recognition
    • M. J. F. Gales, "Maximum likelihood linear transformations for HMM-based speech recognition," Comput. Speech Lang., vol. 12, pp. 75-98, 1998.
    • (1998) Comput. Speech Lang , vol.12 , pp. 75-98
    • Gales, M.J.F.1
  • 41
    • 4544324761
    • Implicit pronunciation modeling in ASR
    • T. Hain, "Implicit pronunciation modeling in ASR," in Proc. ISCA ITRW PMLA, 2002.
    • (2002) Proc. ISCA ITRW PMLA
    • Hain, T.1
  • 42
    • 4544373872
    • Basis superposition precision matrix modeling for large vocabulary continuous speech recognition
    • K. C. Sim and M. J. F. Gales, "Basis superposition precision matrix modeling for large vocabulary continuous speech recognition," in Proc. ICASSP, 2004, pp. 801-804.
    • (2004) Proc. ICASSP , pp. 801-804
    • Sim, K.C.1    Gales, M.J.F.2
  • 45
    • 85135271674
    • Finding consensus among words: Lattice-based word error minimization
    • L. Mangu, E. Brill, and A. Stolcke, "Finding consensus among words: Lattice-based word error minimization," in Proc. Eur. Conf. Speech Commun. Technol., 1999, pp. 495-498.
    • (1999) Proc. Eur. Conf. Speech Commun. Technol , pp. 495-498
    • Mangu, L.1    Brill, E.2    Stolcke, A.3
  • 46
    • 33947657269
    • Error analysis of the BN and CTS results
    • St. Thomas, U.S. Virgin Islands, Dec
    • N. Duta and R. Schwartz, "Error analysis of the BN and CTS results," presented at the Proc. EARS STT Workshop, St. Thomas, U.S. Virgin Islands, Dec. 2003.
    • (2003) Proc. EARS STT Workshop
    • Duta, N.1    Schwartz, R.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.