



Volume 14, Issue 5, 2006, Pages 1505-1512

Multistage speaker diarization of broadcast news

Author keywords

Bayesian information criterion (BIC) clustering; Speaker diarization; Speaker identification (SID); Speaker segmentation and clustering

Indexed keywords

BAYESIAN INFORMATION CRITERION (BIC) CLUSTERING; SPEAKER DIARIZATION; SPEAKER IDENTIFICATION (SID); SPEAKER SEGMENTATION AND CLUSTERING

EID: 34047266609     PISSN: 15587916     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASL.2006.878261     Document Type: Article
Times cited: 169

References (33)
  • 1
    • NIST. (2004, Aug.) Fall 2004 Rich Transcription (RT-04F) evaluation plan, Gaithersburg, MD. [Online] Available: http://www.nist.gov/speech/tests/rt/rt2004/fall/docs/rt04f-eval-plan-v14.pdf
    • EID: 34047245732
  • 2
    • S. E. Tranter, K. Yu, D. A. Reynolds, G. Evermann, D. Y. Kim, and P. C. Woodland, "An investigation into the interactions between speaker diarization systems and automatic speech transcription," Eng. Dept., Cambridge Univ., Cambridge, U.K., Tech. Rep. CUED/F-INFENG/TR-464, Oct. 2003.
    • EID: 34047244321
  • 3
    • NIST. (2003, Feb.) The Rich Transcription Spring 2003 (RT-03S) evaluation plan, Gaithersburg, MD. [Online] Available: http://www.nist.gov/speech/tests/rt/rt2003/spring/docs/rt03-spring-eval-plan-v4.pdf
    • EID: 34047248320
  • 7
    • Y. Moh, P. Nguyen, and J.-C. Junqua, "Toward domain independent speaker clustering," in Proc. Int. Conf. Acoust., Speech, Signal Process., China, Apr. 2003, pp. II-85-II-88.
    • EID: 85143190120
  • 8
    • M. Ben, M. Betser, F. Bimbot, and G. Gravier, "Speaker diarization using bottom-up clustering based on a parameter-derived distance between GMMs," in Proc. Int. Conf. Spoken Language Process., Jeju, Korea, Oct. 2004, pp. 1125-1128.
    • EID: 77951283289
  • 9
    • M. Siegler, U. Jain, B. Raj, and R. Stern, "Automatic segmentation and clustering of broadcast news audio," in Proc. DARPA Speech Recognition Workshop, Chantilly, VA, Feb. 1997, pp. 97-99.
    • EID: 0002782496
  • 10
    • S. Chen and P. Gopalakrishnan, "Speaker, environment, and channel change detection and clustering via the Bayesian information criterion," in Proc. DARPA Broadcast News Transcription and Understanding Workshop, Lansdowne, VA, Feb. 1998, pp. 127-132.
    • EID: 0002595416
  • 11
    • T. Hain and P. C. Woodland, "Segmentation and classification of broadcast news audio," in Proc. Int. Conf. Spoken Language Processing, Sydney, Australia, Nov. 1998, pp. 2727-2730.
    • EID: 85071069033
  • 12
    • J.-L. Gauvain, L. Lamel, and G. Adda, "Partitioning and transcription of broadcast news data," in Proc. Int. Conf. Spoken Language Processing, Sydney, Australia, Dec. 1998, pp. 1335-1338.
    • EID: 85128356454
  • 16
    • J.-L. Gauvain and C. H. Lee, "Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains," IEEE Trans. Speech Audio Process., vol. 2, no. 2, pp. 291-298, Apr. 1994.
    • EID: 0028419019
  • 17
    • J.-L. Gauvain, L. Lamel, and G. Adda, "Audio partitioning and transcription for broadcast data indexation," Multimedia Tools Applicat., vol. 14, pp. 187-200, 2001.
    • EID: 0035367375
  • 18
    • J.-L. Gauvain, L. Lamel, and G. Adda, "The LIMSI broadcast news transcription system," Speech Commun., vol. 37, no. 1-2, pp. 89-108, 2002.
    • EID: 0036567851
  • 19
    • H. Hermansky, "Perceptual linear predictive (PLP) analysis of speech," J. Acoust. Soc. Amer., vol. 87, no. 4, pp. 1738-1752, 1990.
    • EID: 0025041264
  • 20
    • P. Delacourt and C. Wellekens, "DISTBIC: A speaker-based segmentation for audio data indexing," Speech Commun., vol. 32, pp. 111-126, 2000.
    • EID: 0034273195
  • 21
    • M. Cettolo, "Segmentation, classification, and clustering of an Italian broadcast news corpus," in Proc. Content-Based Multimedia Inf. Access Conf., Paris, France, Apr. 2000, pp. 372-381.
    • EID: 34047271391
  • 22
    • M. Cettolo, M. Vescovi, and R. Rizzi, "Evaluation of BIC-based algorithms for audio segmentation," Comput. Speech Lang., vol. 19, pp. 147-170, 2005.
    • EID: 10844275417
  • 25
    • C. Barras and J.-L. Gauvain, "Feature and score normalization for speaker verification of cellular data," in Proc. Int. Conf. Acoust., Speech, Signal Process., China, 2003, pp. II-49-II-52.
    • EID: 85143189567
  • 27
    • D. A. Reynolds, T. F. Quatieri, and R. B. Dunn, "Speaker verification using adapted Gaussian mixture models," Digital Signal Processing (DSP), a Review Journal - Special Issue on NIST 1999 Speaker Recognition Workshop, vol. 10, pp. 19-41, 2000.
    • EID: 0033884858


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.