SCOPUS Information Retrieval Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 14, Issue 5, 2006, Pages 1505-1512

Multistage speaker diarization of broadcast news

(4) Barras, Claude b; Zhu, Xuan b; Meignier, Sylvain b,c; Gauvain, Jean-Luc a,b

a IEEE (France)

b UFR 919 Laboratoire d'Informatique Pour la Mécanique et les Sciences de l'Ingénieur (France)

c UNIVERSITÉ DU MAINE (France)

Author keywords

Bayesian information criterion (BIC) clustering; Speaker diarization; Speaker identification (SID); Speaker segmentation and clustering

Indexed keywords

BAYESIAN INFORMATION CRITERION (BIC) CLUSTERING; SPEAKER DIARIZATION; SPEAKER IDENTIFICATION (SID); SPEAKER SEGMENTATION AND CLUSTERING; BROADCASTING; CLUSTER ANALYSIS; DATA REDUCTION; INFORMATION ANALYSIS; ITERATIVE METHODS; MATHEMATICAL MODELS; SPEECH PROCESSING

EID: 34047266609 PISSN: 15587916 EISSN: None Source Type: Journal
DOI: 10.1109/TASL.2006.878261 Document Type: Article

Times cited : (169)

References (33)

1
- 34047245732
- NIST. (2004, Aug.) Fall 2004 Rich Transcription (RT-04F) evaluation plan, Gaithersburg, MD. [Online] Available: http://www.nist.gov/speech/tests/rt/rt2004/fall/docs/rt04f-eval-plan-v14.pdf
- (2004, Aug.) Fall 2004 Rich Transcription (RT-04F) evaluation plan

2
- 34047244321
- S. E. Tranter, K. Yu, D. A. Reynolds, G. Evermann, D. Y. Kim, and P. C. Woodland, "An investigation into the interactions between speaker diarization systems and automatic speech transcription," Eng. Dept., Cambridge Univ., Cambridge, U.K., Tech. Rep. CUED/F-INFENG/TR-464, Oct. 2003.

3
- 34047248320
- NIST. (2003, Feb.) The Rich Transcription Spring 2003 (RT-03S) evaluation plan, Gaithersburg, MD. [Online] Available: http://www.nist.gov/speech/tests/rt/rt2003/spring/docs/rt03-spring-eval-plan-v4.pdf
- (2003, Feb.) The Rich Transcription Spring 2003 (RT-03S) evaluation plan

4
- 34047254816
- NIST. (2004, Feb.) Spring 2004 (RT-04S) Rich Transcription meeting recognition evaluation plan, Gaithersburg, MD. [Online] Available: http://www.nist.gov/speech/tests/rt/rt2004/spring/documents/rt04s-meeting-eval-plan-v1.pdf
- (2004, Feb.) Spring 2004 (RT-04S) Rich Transcription meeting recognition evaluation plan

5
- 84862162991
- The ESTER evaluation campaign of rich transcription of French broadcast news
- Lisbon, Portugal, May
- G. Gravier, J.-F. Bonastre, S. Galliano, E. Geoffrois, K. Mc Tait, and K. Choukri, "The ESTER evaluation campaign of rich transcription of French broadcast news," in Proc. Lang. Evaluation Resources Conf. (LREC 2004), Lisbon, Portugal, May 2004, pp. 885-888.
- (2004) Proc. Lang. Evaluation Resources Conf. (LREC 2004) , pp. 885-888
- Gravier, G.¹ Bonastre, J.-F.² Galliano, S.³ Geoffrois, E.⁴ Mc Tait, K.⁵ Choukri, K.⁶

6
- 33745224977
- The ESTER phase II evaluation campaign for the rich transcription of French broadcast news
- Lisbon, Portugal, Sep
- S. Galliano, E. Geoffrois, D. Mostefa, K. Choukri, J.-F. Bonastre, and G. Gravier, "The ESTER phase II evaluation campaign for the rich transcription of French broadcast news," in Proc. 9th Eur. Conf. Speech Communication and Technology (ISCA Interspeech), Lisbon, Portugal, Sep. 2005, pp. 1149-1152.
- (2005) Proc. 9th Eur. Conf. Speech Communication and Technology (ISCA Interspeech) , pp. 1149-1152
- Galliano, S.¹ Geoffrois, E.² Mostefa, D.³ Choukri, K.⁴ Bonastre, J.-F.⁵ Gravier, G.⁶

7
- 85143190120
- Y. Moh, P. Nguyen, and J.-C. Junqua, "Toward domain independent speaker clustering," in Proc. Int. Conf. Acoust., Speech, Signal Process., China, Apr. 2003, pp. II-85-II-88.

8
- 77951283289
- Speaker diarization using bottom-up clustering based on a parameter-derived distance between GMMs
- Jeju, Korea, Oct
- M. Ben, M. Betser, F. Bimbot, and G. Gravier, "Speaker diarization using bottom-up clustering based on a parameter-derived distance between GMMs," in Proc. Int. Conf. Spoken Language Process., Jeju, Korea, Oct. 2004, pp. 1125-1128.
- (2004) Proc. Int. Conf. Spoken Language Process , pp. 1125-1128
- Ben, M.¹ Betser, M.² Bimbot, F.³ Gravier, G.⁴

9
- 0002782496
- Automatic segmentation and clustering of broadcast news audio
- Chantilly, VA, Feb
- M. Siegler, U. Jain, B. Raj, and R. Stern, "Automatic segmentation and clustering of broadcast news audio," in Proc. DARPA Speech Recognition Workshop, Chantilly, VA, Feb. 1997, pp. 97-99.
- (1997) Proc. DARPA Speech Recognition Workshop , pp. 97-99
- Siegler, M.¹ Jain, U.² Raj, B.³ Stern, R.⁴

10
- 0002595416
- Speaker, environment, and channel change detection and clustering via the Bayesian information criterion
- Lansdowne, VA, Feb
- S. Chen and P. Gopalakrishnan, "Speaker, environment, and channel change detection and clustering via the Bayesian information criterion," in Proc. DARPA Broadcast News Transcription and Understanding Workshop, Lansdowne, VA, Feb. 1998, pp. 127-132.
- (1998) Proc. DARPA Broadcast News Transcription and Understanding Workshop , pp. 127-132
- Chen, S.¹ Gopalakrishnan, P.²

11
- 85071069033
- Segmentation and classification of broadcast news audio
- Sydney, Australia, Nov
- T. Hain and P. C. Woodland, "Segmentation and classification of broadcast news audio," in Proc. Int. Conf. Spoken Language Processing, Sydney, Australia, Nov. 1998, pp. 2727-2730.
- (1998) Proc. Int. Conf. Spoken Language Processing , pp. 2727-2730
- Hain, T.¹ Woodland, P.C.²

12
- 85128356454
- Partitioning and transcription of broadcast news data
- Sydney, Australia, Dec
- J.-L. Gauvain, L. Lamel, and G. Adda, "Partitioning and transcription of broadcast news data," in Proc. Int. Conf. Spoken Language Processing, Sydney, Australia, Dec. 1998, pp. 1335-1338.
- (1998) Proc. Int. Conf. Spoken Language Processing , pp. 1335-1338
- Gauvain, J.-L.¹ Lamel, L.² Adda, G.³

13
- 0141809272
- E-HMM approach for learning and adapting sound models for speaker indexing
- Chania, Crete, Jun
- S. Meignier, J.-F. Bonastre, and S. Igounet, "E-HMM approach for learning and adapting sound models for speaker indexing," in 2001: A Speaker Odyssey. Proc. Speaker Recognition Workshop (ISCA, Odyssey 2001), Chania, Crete, Jun. 2001, pp. 175-180.
- (2001) 2001: A Speaker Odyssey. Proc. Speaker Recognition Workshop (ISCA, Odyssey 2001) , pp. 175-180
- Meignier, S.¹ Bonastre, J.-F.² Igounet, S.³

14
- 84946742526
- A robust speaker clustering algorithm
- St. Thomas, U.S. Virgin Islands, Nov
- J. Ajmera and C. Wooters, "A robust speaker clustering algorithm," in Proc. Automatic Speech Recognition and Understanding, St. Thomas, U.S. Virgin Islands, Nov. 2003, pp. 411-416.
- (2003) Proc. Automatic Speech Recognition and Understanding , pp. 411-416
- Ajmera, J.¹ Wooters, C.²

15
- 33646779383
- Speaker diarization for broadcast news
- Toledo, Spain, May
- S. E. Tranter and D. A. Reynolds, "Speaker diarization for broadcast news," in Proc. 2004: A Speaker Odyssey. The Speaker Recognition Workshop, Toledo, Spain, May 2004, pp. 337-344.
- (2004) Proc. 2004: A Speaker Odyssey. The Speaker Recognition Workshop , pp. 337-344
- Tranter, S.E.¹ Reynolds, D.A.²

16
- 0028419019
- Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains
- Apr
- J.-L. Gauvain and C. H. Lee, "Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains," IEEE Trans. Speech Audio Process., vol. 2, no. 2, pp. 291-298, Apr. 1994.
- (1994) IEEE Trans. Speech Audio Process , vol.2 , Issue.2 , pp. 291-298
- Gauvain, J.-L.¹ Lee, C.H.²

17
- 0035367375
- Audio partitioning and transcription for broadcast data indexation
- J.-L. Gauvain, L. Lamel, and G. Adda, "Audio partitioning and transcription for broadcast data indexation," Multimedia Tools Applicat., vol. 14, pp. 187-200, 2001.
- (2001) Multimedia Tools Applicat , vol.14 , pp. 187-200
- Gauvain, J.-L.¹ Lamel, L.² Adda, G.³

18
- 0036567851
- The LIMSI broadcast news transcription system
- J.-L. Gauvain, L. Lamel, and G. Adda, "The LIMSI broadcast news transcription system," Speech Commun., vol. 37, no. 1-2, pp. 89-108, 2002.
- (2002) Speech Commun , vol.37 , Issue.1-2 , pp. 89-108
- Gauvain, J.-L.¹ Lamel, L.² Adda, G.³

19
- 0025041264
- Perceptual linear predictive (PLP) analysis of speech
- H. Hermansky, "Perceptual linear predictive (PLP) analysis of speech," J. Acoust. Soc. Amer., vol. 87, no. 4, pp. 1738-1752, 1990.
- (1990) J. Acoust. Soc. Amer , vol.87 , Issue.4 , pp. 1738-1752
- Hermansky, H.¹

20
- 0034273195
- DISTBIC: A speaker-based segmentation for audio data indexing
- P. Delacourt and C. Wellekens, "DISTBIC: A speaker-based segmentation for audio data indexing," Speech Commun., vol. 32, pp. 111-126, 2000.
- (2000) Speech Commun , vol.32 , pp. 111-126
- Delacourt, P.¹ Wellekens, C.²

21
- 34047271391
- Segmentation, classification, and clustering of an Italian broadcast news corpus
- Paris, France, Apr
- M. Cettolo, "Segmentation, classification, and clustering of an Italian broadcast news corpus," in Proc. Content-Based Multimedia Inf. Access Conf., Paris, France, Apr. 2000, pp. 372-381.
- (2000) Proc. Content-Based Multimedia Inf. Access Conf , pp. 372-381
- Cettolo, M.¹

22
- 10844275417
- Evaluation of BIC-based algorithms for audio segmentation
- M. Cettolo, M. Vescovi, and R. Rizzi, "Evaluation of BIC-based algorithms for audio segmentation," Comput. Speech Lang., vol. 19, pp. 147-170, 2005.
- (2005) Comput. Speech Lang , vol.19 , pp. 147-170
- Cettolo, M.¹ Vescovi, M.² Rizzi, R.³

23
- 0033692969
- Strategies for automatic segmentation of audio data
- Istanbul, Turkey, Nov
- T. Kemp, M. Schmidt, M. Westphal, and A. Waibel, "Strategies for automatic segmentation of audio data," in Proc. Int. Conf. Acoust., Speech, Signal Process., Istanbul, Turkey, Nov. 2000, pp. 1423-1426.
- (2000) Proc. Int. Conf. Acoust., Speech, Signal Process , pp. 1423-1426
- Kemp, T.¹ Schmidt, M.² Westphal, M.³ Waibel, A.⁴

24
- 34047258103
- J. Schroeder and J. Campbell, Eds., Digital Signal Processing (DSP), a Review Journal - Special Issue on NIST 1999 Speaker Recognition Workshop. Norwell, MA: Academic, 2000.
- (2000) Digital Signal Processing (DSP), a Review Journal - Special Issue on NIST 1999 Speaker Recognition Workshop

25
- 85143189567
- C. Barras and J.-L. Gauvain, "Feature and score normalization for speaker verification of cellular data," in Proc. Int. Conf. Acoust., Speech, Signal Process., China, 2003, pp. II-49-II-52.

26
- 85073258179
- Feature warping for robust speaker verification
- Chania, Crete, Jun
- J. Pelecanos and S. Sridharan, "Feature warping for robust speaker verification," in Proc. 2001: A Speaker Odyssey. The Speaker Recognition Workshop, Chania, Crete, Jun. 2001, pp. 213-218.
- (2001) Proc. 2001: A Speaker Odyssey. The Speaker Recognition Workshop , pp. 213-218
- Pelecanos, J.¹ Sridharan, S.²

27
- 0033884858
- Speaker verification using adapted Gaussian mixture models
- D. A. Reynolds, T. F. Quatieri, and R. B. Dunn, "Speaker verification using adapted Gaussian mixture models," Digital Signal Processing (DSP), a Review Journal - Special Issue on NIST 1999 Speaker Recognition Workshop, vol. 10, pp. 19-41, 2000.
- (2000) Digital Signal Processing (DSP), a Review Journal - Special Issue on NIST 1999 Speaker Recognition Workshop , vol.10 , pp. 19-41
- Reynolds, D.A.¹ Quatieri, T.F.² Dunn, R.B.³

28
- 85128386923
- Blind clustering of speech utterances based on speaker and language characteristics
- Sydney, Australia, Nov
- D. A. Reynolds, E. Singer, B. A. Carlson, G. C. O'Leary, J. J. McLaughlin, and M. A. Zissman, "Blind clustering of speech utterances based on speaker and language characteristics," in Proc. Int. Conf. Spoken Language Process., Sydney, Australia, Nov. 1998, pp. 3193-3196.
- (1998) Proc. Int. Conf. Spoken Language Process , pp. 3193-3196
- Reynolds, D.A.¹ Singer, E.² Carlson, B.A.³ O'Leary, G.C.⁴ McLaughlin, J.J.⁵ Zissman, M.A.⁶

29
- 38149054330
- The 2004 BBN/LIMSI 10×RT English broadcast news transcription system
- Palisades, NY, Nov
- L. Nguyen, S. Abdou, M. Afify, J. Makhoul, S. Matsoukas, R. Schwartz, B. Xiang, L. Lamel, J.-L. Gauvain, G. Adda, H. Schwenk, and F. Lefevre, "The 2004 BBN/LIMSI 10×RT English broadcast news transcription system," in Proc. DARPA RT04'S, Palisades, NY, Nov. 2004, pp. 33-1-33-7.
- (2004) Proc. DARPA RT04'S
- Nguyen, L.¹ Abdou, S.² Afify, M.³ Makhoul, J.⁴ Matsoukas, S.⁵ Schwartz, R.⁶ Xiang, B.⁷ Lamel, L.⁸ Gauvain, J.-L.⁹ Adda, G.¹⁰ Schwenk, H.¹¹ Lefevre, F.¹²

30
- 33745185104
- Combining speaker identification and BIC for speaker diarization
- Lisbon, Portugal, Sep
- X. Zhu, C. Barras, S. Meignier, and J.-L. Gauvain, "Combining speaker identification and BIC for speaker diarization," in Proc. 9th Eur. Conf. Speech Commun. Technol., Lisbon, Portugal, Sep. 2005, pp. 2441-2444.
- (2005) Proc. 9th Eur. Conf. Speech Commun. Technol , pp. 2441-2444
- Zhu, X.¹ Barras, C.² Meignier, S.³ Gauvain, J.-L.⁴

31
- 33646380923
- Approaches and applications of audio diarization
- Philadelphia, PA, Mar
- D. Reynolds and P. Torres-Carrasquillo, "Approaches and applications of audio diarization," in Proc. Int. Conf. Acoust., Speech, Signal Process., Philadelphia, PA, Mar. 2005, pp. 953-956.
- (2005) Proc. Int. Conf. Acoust., Speech, Signal Process , pp. 953-956
- Reynolds, D.¹ Torres-Carrasquillo, P.²

32
- 33745200276
- The Cambridge University March 2005 speaker diarization system
- Lisbon, Portugal, Sep
- R. Sinha, S. Tranter, M. Gales, and P. Woodland, "The Cambridge University March 2005 speaker diarization system," in Proc. 9th Eur. Conf. Speech Commun. Technol., Lisbon, Portugal, Sep. 2005, pp. 2437-2440.
- (2005) Proc. 9th Eur. Conf. Speech Commun. Technol , pp. 2437-2440
- Sinha, R.¹ Tranter, S.² Gales, M.³ Woodland, P.⁴

33
- 84865749221
- Speaker diarization from speech transcripts
- Jeju Island, Korea, Oct
- L. Canseco-Rodriguez, L. Lamel, and J.-L. Gauvain, "Speaker diarization from speech transcripts," in Proc. Int. Conf. Spoken Language Process., Jeju Island, Korea, Oct. 2004, pp. 1272-1275.
- (2004) Proc. Int. Conf. Spoken Language Process , pp. 1272-1275
- Canseco-Rodriguez, L.¹ Lamel, L.² Gauvain, J.-L.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.