Volume 18, Issue 8, 2010, Pages 2134-2144

Speaker diarization exploiting the eigengap criterion and cluster ensembles

Author keywords

Broadcasts; Cluster ensembles; Eigengap criterion; Movie scene analysis; Speaker clustering; Speaker diarization; Two person dialogues

Indexed keywords

BROADCASTS; CLUSTER ENSEMBLES; EIGENGAP CRITERION; MOVIE SCENES; SPEAKER CLUSTERING; SPEAKER DIARIZATION; TWO-PERSON DIALOGUES;

EID: 77956519059     PISSN: 15587916     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASL.2010.2042121     Document Type: Article
Times cited : (20)

References (52)
  • 4
    • 4544361760
    • Comparison of MPEG-7 audio spectrum projection features and MFCC applied to speaker recognition, sound classification and audio segmentation
    • Montreal, QC, Canada May
    • H. G. Kim and T. Sikora, "Comparison of MPEG-7 audio spectrum projection features and MFCC applied to speaker recognition, sound classification and audio segmentation," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., Montreal, QC, Canada, May 2004, vol. 5, pp. 925-928.
    • (2004) Proc.IEEE Int. Conf. Acoust., Speech, Signal Process. , vol.5 , pp. 925-928
    • Kim, H.G.1    Sikora, T.2
  • 5
    • 84979955147
    • Audio spectrum projection based on several basis decomposition algorithms applied to general sound recognition and audio segmentation
    • Vienna, Austria Sep.
    • H. G. Kim and T. Sikora, "Audio spectrum projection based on several basis decomposition algorithms applied to general sound recognition and audio segmentation," in Proc. 12th Eur. Signal Process. Conf., Vienna, Austria, Sep. 2004, pp. 1047-1050.
    • (2004) Proc. 12th Eur. Signal Process. Conf. , pp. 1047-1050
    • Kim, H.G.1    Sikora, T.2
  • 6
    • 34547324377
    • Automatic speaker change detection with the Bayesian information criterion using MPEG-7 features and a fusion scheme
    • Kos, Greece May
    • M. Kotti, E. Benetos, and C. Kotropoulos, "Automatic speaker change detection with the Bayesian information criterion using MPEG-7 features and a fusion scheme," in Proc. IEEE Int. Symp. Circuits Syst., Kos, Greece, May 2006.
    • (2006) Proc. IEEE Int. Symp. Circuits Syst.
    • Kotti, M.1    Benetos, E.2    Kotropoulos, C.3
  • 7
    • 34247559206
    • Automatic speaker segmentation using multiple features and distance measures: A comparison of three approaches
    • Toronto, ON, Canada, Jul.
    • M. Kotti, L. G. P. M. Martins, E. Benetos, J. S. Cardoso, and C. Kotropoulos, "Automatic speaker segmentation using multiple features and distance measures: A comparison of three approaches," in Proc. IEEE Int. Conf. Multimedia Expo, Toronto, ON, Canada, Jul. 2006, pp. 1101-1104.
    • (2006) Proc. IEEE Int. Conf. Multimedia Expo , pp. 1101-1104
    • Kotti, M.1    Martins, L.G.P.M.2    Benetos, E.3    Cardoso, J.S.4    Kotropoulos, C.5
  • 8
    • 34047261805
    • An overview of automatic speaker diarization systems
    • Sep.
    • S. E. Tranter and D. A. Reynolds, "An overview of automatic speaker diarization systems," IEEE Trans. Audio, Speech, Lang. Process., vol. 14, no. 5, pp. 1557-1565, Sep. 2006.
    • (2006) IEEE Trans. Audio, Speech, Lang. Process. , vol.14 , Issue.5 , pp. 1557-1565
    • Tranter, S.E.1    Reynolds, D.A.2
  • 10
    • 64149092838
    • Speaker clustering of speech utterances using a voice characteristic reference space
    • Jeju Island, Korea Oct.
    • W. H. Tsai, S. S. Cheng, and H. M. Wang, "Speaker clustering of speech utterances using a voice characteristic reference space," in Proc. 8th Int. Conf. Spoken Lang. Process., Jeju Island, Korea, Oct. 2004.
    • (2004) Proc. 8th Int. Conf. Spoken Lang. Process.
    • Tsai, W.H.1    Cheng, S.S.2    Wang, H.M.3
  • 12
    • 0141809272
    • E-HMM approach for learning and adapting sound models for speaker indexing
    • Crete, Greece Jun.
    • S. Meignier, J. F. Bonastre, and S. Igounet, "E-HMM approach for learning and adapting sound models for speaker indexing," in Proc. Odyssey Speaker Lang. Recognition Workshop, Crete, Greece, Jun. 2001, pp. 175-180.
    • (2001) Proc. Odyssey Speaker Lang. Recognition Workshop , pp. 175-180
    • Meignier, S.1    Bonastre, J.F.2    Igounet, S.3
  • 14
    • 29044442235
    • Step-by-step and integrated approaches in broadcast news speaker diarization
    • Apr.-July
    • S. Meignier, D. Moraru, C. Fredouille, J. F. Bonastre, and L. Besacier, "Step-by-step and integrated approaches in broadcast news speaker diarization," Comput. Speech Lang., vol. 20, no. 2-3, pp. 303-330, Apr.-July 2006.
    • (2006) Comput. Speech Lang. , vol.20 , Issue.2-3 , pp. 303-330
    • Meignier, S.1    Moraru, D.2    Fredouille, C.3    Bonastre, J.F.4    Besacier, L.5
  • 17
    • 77956497615
    • [Online]. Available:
    • [Online]. Available: http://www.itl.nist.gov/iad/mig/tests/rt/
  • 19
    • 47749119617
    • The ICSI RT07S speaker diarization system
    • Berlin, Germany: Springer, vol. LNCS 4625
    • C. Wooters and M. Huijbregts, "The ICSI RT07S speaker diarization system," in Multimodal Technologies for Perception of Humans. Berlin, Germany: Springer, 2009, vol. LNCS 4625, pp. 509-519.
    • (2009) Multimodal Technologies for Perception of Humans , pp. 509-519
    • Wooters, C.1    Huijbregts, M.2
  • 20
    • 0033884858
    • Speaker verification using adapted Gaussian mixture models
    • Oct.
    • D. A. Reynolds, T. F. Quatieri, and R. B. Dunn, "Speaker verification using adapted Gaussian mixture models," Digital Signal Process., vol. 10, pp. 19-41, Oct. 2000.
    • (2000) Digital Signal Process. , vol.10 , pp. 19-41
    • Reynolds, D.A.1    Quatieri, T.F.2    Dunn, R.B.3
  • 21
    • 0031233424
    • Speaker recognition: A tutorial
    • Sep.
    • J. P. Campbell, "Speaker recognition: A tutorial," Proc. IEEE, vol. 85, no. 9, pp. 1437-1462, Sep. 1997.
    • (1997) Proc. IEEE , vol.85 , Issue.9 , pp. 1437-1462
    • Campbell, J.P.1
  • 22
    • 38949122754
    • Speaker segmentation and clustering
    • May
    • M. Kotti, V. Moschou, and C. Kotropoulos, "Speaker segmentation and clustering," Signal Process., vol. 88, no. 5, pp. 1091-1124, May 2008.
    • (2008) Signal Process. , vol.88 , Issue.5 , pp. 1091-1124
    • Kotti, M.1    Moschou, V.2    Kotropoulos, C.3
  • 24
    • 0036650810
    • Unsupervised speaker recognition based on competition between self-organizing maps
    • Jul.
    • I. Lapidot, H. Guterman, and A. Cohen, "Unsupervised speaker recognition based on competition between self-organizing maps," IEEE Trans. Neural Netw., vol. 13, no. 4, pp. 877-887, Jul. 2002.
    • (2002) IEEE Trans. Neural Netw. , vol.13 , Issue.4 , pp. 877-887
    • Lapidot, I.1    Guterman, H.2    Cohen, A.3
  • 25
  • 27
    • 0026400244
    • Segregation of speakers for speech recognition and speaker identification
    • Toronto, ON, Canada, Apr.
    • H. Gish, M. H. Siu, and R. Rohlicek, "Segregation of speakers for speech recognition and speaker identification," in Proc. 1991 IEEE Int. Conf. Acoust., Speech, Signal Process., Toronto, ON, Canada, Apr. 1991, pp. 873-876.
    • (1991) Proc. 1991 IEEE Int. Conf. Acoust., Speech, Signal Process. , pp. 873-876
    • Gish, H.1    Siu, M.H.2    Rohlicek, R.3
  • 28
    • 17444365032
    • Unsupervised speaker segmentation and tracking in real-time audio content analysis
    • Apr.
    • L. Lu and H. Zhang, "Unsupervised speaker segmentation and tracking in real-time audio content analysis," Multimedia Syst., vol. 10, no. 4, pp. 332-343, Apr. 2005.
    • (2005) Multimedia Syst. , vol.10 , Issue.4 , pp. 332-343
    • Lu, L.1    Zhang, H.2
  • 29
    • 85128356454
    • Partitioning and transcription of broadcast news data
    • Sydney, Australia Dec.
    • J.-L. Gauvain, L. Lamel, and G. Adda, "Partitioning and transcription of broadcast news data," in Proc. 5th Int. Conf. Spoken Lang. Process., Sydney, Australia, Dec. 1998, pp. 1335-1338.
    • (1998) Proc. 5th Int. Conf. Spoken Lang. Process. , pp. 1335-1338
    • Gauvain, J.-L.1    Lamel, L.2    Adda, G.3
  • 31
    • 84875953283
    • Clustering via the Bayesian information criterion with applications in speech recognition
    • Seattle, WA May
    • S. S. Chen and P. S. Gopalakrishnan, "Clustering via the Bayesian information criterion with applications in speech recognition," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., Seattle, WA, May 1998, vol. 2, pp. 645-648.
    • (1998) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. , vol.2 , pp. 645-648
    • Chen, S.S.1    Gopalakrishnan, P.S.2
  • 32
    • 78650540904
    • Improved speaker segmentation and segments clustering using the Bayesian information criterion
    • Sep.
    • A. Tritschler and R. Gopinath, "Improved speaker segmentation and segments clustering using the Bayesian information criterion," in Proc. 6th Eur. Conf. Speech Commun. Technol., Sep. 1999, pp. 679-682.
    • (1999) Proc. 6th Eur. Conf. Speech Commun. Technol. , pp. 679-682
    • Tritschler, A.1    Gopinath, R.2
  • 33
    • 0034273195
    • DISTBIC: A speaker-based segmentation for audio data indexing
    • Sep.
    • P. Delacourt and C. J. Wellekens, "DISTBIC: A speaker-based segmentation for audio data indexing," Speech Commun., vol. 32, pp. 111-126, Sep. 2000.
    • (2000) Speech Commun. , vol.32 , pp. 111-126
    • Delacourt, P.1    Wellekens, C.J.2
  • 35
    • 77956526830
    • [Online]. Available:
    • [Online]. Available: http://www.praat.org
  • 36
    • 0001835850
    • Accurate short-term analysis of the fundamental frequency and the harmonics-to-noise ratio of a sampled sound
    • P. Boersma, "Accurate short-term analysis of the fundamental frequency and the harmonics-to-noise ratio of a sampled sound," in Proc. Inst. Phon. Sci., 1993, vol. 17, pp. 97-110.
    • (1993) Proc. Inst. Phon. Sci. , vol.17 , pp. 97-110
    • Boersma, P.1
  • 37
    • 66149116378
    • Computationally efficient and robust BIC-based speaker segmentation
    • Jul.
    • M. Kotti, E. Benetos, and C. Kotropoulos, "Computationally efficient and robust BIC-based speaker segmentation," IEEE Trans. Audio, Speech, Lang. Process., vol. 16, no. 5, pp. 920-933, Jul. 2008.
    • (2008) IEEE Trans. Audio, Speech, Lang. Process. , vol.16 , Issue.5 , pp. 920-933
    • Kotti, M.1    Benetos, E.2    Kotropoulos, C.3
  • 38
    • 70350451584
    • Robust detection of phone segments in continuous speech using model selection criteria with few observations
    • Feb.
    • G. Almpanidis, M. Kotti, and C. Kotropoulos, "Robust detection of phone segments in continuous speech using model selection criteria with few observations," IEEE Trans. Audio, Speech, Lang. Process., vol. 17, no. 2, pp. 287-298, Feb. 2009.
    • (2009) IEEE Trans. Audio, Speech, Lang. Process. , vol.17 , Issue.2 , pp. 287-298
    • Almpanidis, G.1    Kotti, M.2    Kotropoulos, C.3
  • 40
    • 21244468777
    • Combining multiple clusterings using evidence accumulation
    • Jun.
    • L. N. Fred and A. K. Jain, "Combining multiple clusterings using evidence accumulation," IEEE Trans. Pattern Anal. Mach. Intell., vol. 27, no. 6, pp. 835-850, Jun. 2005.
    • (2005) IEEE Trans. Pattern Anal. Mach. Intell. , vol.27 , Issue.6 , pp. 835-850
    • Fred, L.N.1    Jain, A.K.2
  • 41
    • 27544491443
    • A CLUE for CLUster ensembles
    • Sep.
    • K. Hornik, "A CLUE for CLUster ensembles," J. Statist. Software, vol. 14, no. 12, Sep. 2005.
    • (2005) J. Statist. Software , vol.14 , Issue.12
    • Hornik, K.1
  • 42
    • 33745772326
    • Cluster ensemble and its applications in gene expression analysis
    • X. Hu and I. Yoo, "Cluster ensemble and its applications in gene expression analysis," in ACM Int. Conf. Proc. Series, 2004, vol. 55, pp. 297-302.
    • (2004) ACM Int. Conf. Proc. Series , vol.55 , pp. 297-302
    • Hu, X.1    Yoo, I.2
  • 43
    • 84957012677
    • Finding consistent clusters in data partitions
    • New York: Springer, LNCS 2096
    • A. Fred, "Finding consistent clusters in data partitions," in Multiple Classifier Systems. New York: Springer, 2001, vol. LNCS 2096, pp. 309-318.
    • (2001) Multiple Classifier Systems , pp. 309-318
    • Fred, A.1
  • 44
    • 0038391443
    • Bagging to improve the accuracy of a clustering procedure
    • S. Dudoit and J. Fridlyand, "Bagging to improve the accuracy of a clustering procedure," BioInformatics, vol. 19, no. 9, pp. 1090-1099, 2003.
    • (2003) BioInformatics , vol.19 , Issue.9 , pp. 1090-1099
    • Dudoit, S.1    Fridlyand, J.2
  • 47
    • 0003882879
    • Providence, RI: American Mathematical Society
    • F. R. K. Chung, Spectral Graph Theory. Providence, RI: American Mathematical Society, 1997.
    • (1997) Spectral Graph Theory
    • Chung, F.R.K.1
  • 48
    • 34548583274
    • A tutorial on spectral clustering
    • U. von Luxburg, "A tutorial on spectral clustering," Statist. Comput., vol. 17, no. 4, pp. 395-416, 2007.
    • (2007) Statist. Comput. , vol.17 , Issue.4 , pp. 395-416
    • Von Luxburg, U.1
  • 50
    • 77956548258
    • 2004 RT-03 MDE training data speech
    • Philadelphia, PA [Online]. Available:
    • S. Strassel, C. Walker, and H. Lee, "2004 RT-03 MDE training data speech," in Linguist. Data Consortium, Philadelphia, PA [Online]. Available: http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2004S08
    • Linguist. Data Consortium
    • Strassel, S.1    Walker, C.2    Lee, H.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.