



Volume 16, Issue 5, 2008, Pages 920-933

Computationally efficient and robust BIC-based speaker segmentation

Author keywords

Automatic speaker segmentation; Bayesian information criterion (BIC); Inverse Gaussian distribution; Simultaneous diagonalization; Speaker utterance duration distribution; Speech analysis

Indexed keywords

AUTOMATIC SPEAKER SEGMENTATION; BAYESIAN INFORMATION CRITERION (BIC); INVERSE GAUSSIAN DISTRIBUTION; SIMULTANEOUS DIAGONALIZATION; SPEAKER UTTERANCE DURATION DISTRIBUTION

EID: 66149116378     PISSN: 15587916     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASL.2008.925152     Document Type: Article
Times cited : (32)

References (45)
  • 1
    • 4544361760
    • "Comparison of MPEG-7 audio spectrum projection features and MFCC applied to speaker recognition, sound classification and audio segmentation,"
    • Montreal, QC, Canada, May
    • H. G. Kim and T. Sikora, "Comparison of MPEG-7 audio spectrum projection features and MFCC applied to speaker recognition, sound classification and audio segmentation," in Proc. 2004 IEEE Int. Conf. Acoust., Speech, Signal Process., Montreal, QC, Canada, May 2004, vol. 5, pp. 925-928.
    • (2004) In Proc. 2004 IEEE Int. Conf. Acoust., Speech, Signal Process. , vol.5 , pp. 925-928
    • Kim, H.G.1    Sikora, T.2
  • 5
    • 27644599375
    • "Unsupervised speaker indexing using generic models,"
    • Sep.
    • S. Kwon and S. Narayanan, "Unsupervised speaker indexing using generic models," IEEE Trans. Speech Audio Process., vol. 13, no. 5, pp. 1004-1013, Sep. 2005.
    • (2005) IEEE Trans. Speech Audio Process. , vol.13 , Issue.5 , pp. 1004-1013
    • Kwon, S.1    Narayanan, S.2
  • 8
    • 33646789869
    • Hybrid speaker-based segmentation system using model-level clustering
    • DOI 10.1109/ICASSP.2005.1415221, 1415221, 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP '05 - Proceedings - Speech Processing
    • H. Kim, D. Ertelt, and T. Sikora, "Hybrid speaker-based segmentation system using model-level clustering," in Proc. 2005 IEEE Int. Conf. Acoust., Speech, Signal Process., Philadelphia, PA, Mar. 2005, vol. I, pp. 745-748. (Pubitemid 43761260)
    • (2005) ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings , vol.I
    • Kim, H.-G.1    Ertelt, D.2    Sikora, T.3
  • 9
    • 29044442235
    • "Step-by-step and integrated approaches in broadcast news speaker diarization,"
    • Apr.-Jul.
    • S. Meignier, D. Moraru, C. Fredouille, J. F. Bonastre, and L. Besacier, "Step-by-step and integrated approaches in broadcast news speaker diarization," Comput. Speech Lang., vol. 20, no. 2-3, pp. 303-330, Apr.-Jul. 2006.
    • (2006) Comput. Speech Lang. , vol.20 , Issue.2-3 , pp. 303-330
    • Meignier, S.1    Moraru, D.2    Fredouille, C.3    Bonastre, J.F.4    Besacier, L.5
  • 11
    • 85009282223
    • "Speaker change detection using a new weighted distance measure,"
    • Sep.
    • S. Kwon and S. Narayanan, "Speaker change detection using a new weighted distance measure," in Proc. Int. Conf. Spoken Lang., Sep. 2002, vol. 4, pp. 2537-2540.
    • (2002) In Proc. Int. Conf. Spoken Lang. , vol.4 , pp. 2537-2540
    • Kwon, S.1    Narayanan, S.2
  • 12
    • 0034273195
    • "DISTBIC: A speaker-based segmentation for audio data indexing,"
    • Sep.
    • P. Delacourt and C. J. Wellekens, "DISTBIC: A speaker-based segmentation for audio data indexing," Speech Commun., vol. 32, pp. 111-126, Sep. 2000.
    • (2000) Speech Commun. , vol.32 , pp. 111-126
    • Delacourt, P.1    Wellekens, C.J.2
  • 13
    • 17444365032
    • "Unsupervised speaker segmentation and tracking in real-time audio content analysis,"
    • Apr.
    • L. Lu and H. Zhang, "Unsupervised speaker segmentation and tracking in real-time audio content analysis," Multimedia Syst., vol. 10, no. 4, pp. 332-343, Apr. 2005.
    • (2005) Multimedia Syst. , vol.10 , Issue.4 , pp. 332-343
    • Lu, L.1    Zhang, H.2
  • 14
    • 33644539859
    • "Audio-based description and structuring of videos,"
    • Feb.
    • H. Harb and L. Chen, "Audio-based description and structuring of videos," Int. J. Digital Libraries, vol. 6, no. 1, pp. 70-81, Feb. 2006.
    • (2006) Int. J. Digital Libraries , vol.6 , Issue.1 , pp. 70-81
    • Harb, H.1    Chen, L.2
  • 15
    • 22544475615
    • "Efficient audio stream segmentation via the combined T² statistic and the Bayesian information criterion,"
    • Jul.
    • B. Zhou and J. H. L. Hansen, "Efficient audio stream segmentation via the combined T² statistic and the Bayesian information criterion," IEEE Trans. Audio, Speech, Lang. Process., vol. 13, no. 4, pp. 467-474, Jul. 2005.
    • (2005) IEEE Trans. Audio, Speech, Lang. Process. , vol.13 , Issue.4 , pp. 467-474
    • Zhou, B.1    Hansen, J.H.L.2
  • 18
  • 20
    • 33745000055
    • "Automatic segmentation and identification of mixed-language speech using delta-BIC and LSA-based GMMs,"
    • Jan.
    • C. H. Wu, Y. H. Chiu, C. J. Shia, and C. Y. Lin, "Automatic segmentation and identification of mixed-language speech using delta-BIC and LSA-based GMMs," IEEE Trans. Audio, Speech, Lang. Process., vol. 14, no. 1, pp. 266-276, Jan. 2006.
    • (2006) IEEE Trans. Audio, Speech, Lang. Process. , vol.14 , Issue.1 , pp. 266-276
    • Wu, C.H.1    Chiu, Y.H.2    Shia, C.J.3    Lin, C.Y.4
  • 21
  • 22
    • 33947127409
    • Multiple change-point audio segmentation and classification using an MDL-based Gaussian model
    • DOI 10.1109/TSA.2005.852988
    • C. H. Wu and C. H. Hsieh, "Multiple change-point audio segmentation and classification using an MDL-based Gaussian model," IEEE Trans. Audio, Speech, Lang. Process., vol. 14, no. 2, pp. 647-657, Mar. 2006. (Pubitemid 46405361)
    • (2006) IEEE Transactions on Audio, Speech and Language Processing , vol.14 , Issue.2 , pp. 647-657
    • Wu, C.-H.1    Hsieh, C.-H.2
  • 23
    • 85009128756
    • "Metric SEQDAC: A hybrid approach for audio segmentation,"
    • Jeju, Korea, Oct.
    • S. Cheng and H. Wang, "Metric SEQDAC: A hybrid approach for audio segmentation," in Proc. 8th Int. Conf. Spoken Lang. Process., Jeju, Korea, Oct. 2004, pp. 1617-1620.
    • (2004) In Proc. 8th Int. Conf. Spoken Lang. Process. , pp. 1617-1620
    • Cheng, S.1    Wang, H.2
  • 25
    • 0001011286
    • "Robust procedures in multivariate analysis I: Robust covariance estimation,"
    • N. A. Campbell, "Robust procedures in multivariate analysis I: Robust covariance estimation," Appl. Statist., vol. 29, no. 3, pp. 231-237, 1980.
    • (1980) Appl. Statist. , vol.29 , Issue.3 , pp. 231-237
    • Campbell, N.A.1
  • 27
    • 0032633354
    • "Covariance estimation with limited training samples,"
    • Jul.
    • S. Tadjudin and D. A. Landgrebe, "Covariance estimation with limited training samples," IEEE Trans. Geosci. Remote Sen., vol. 37, no. 4, pp. 2113-2118, Jul. 1999.
    • (1999) IEEE Trans. Geosci. Remote Sen. , vol.37 , Issue.4 , pp. 2113-2118
    • Tadjudin, S.1    Landgrebe, D.A.2
  • 28
    • 3042518464
    • "DARPA TIMIT Acoustic-phonetic continuous speech corpus,"
    • Philadelphia, PA
    • J. S. Garofolo, "DARPA TIMIT Acoustic-phonetic continuous speech corpus," in Linguistic Data Consortium, Philadelphia, PA, 1993.
    • (1993) In Linguistic Data Consortium
    • Garofolo, J.S.1
  • 30
    • 35648941166
    • A neural network approach to audio-assisted movie dialogue detection
    • DOI 10.1016/j.neucom.2007.08.006, PII S0925231207002275, Dedicated Hardware Architectures for Intelligent Systems
    • M. Kotti, E. Benetos, C. Kotropoulos, and I. Pitas, "A neural network approach to audio-assisted movie dialogue detection," Neurocomput., Special Iss.: Adv. Neural Netw. for Speech Audio Process., vol. 71, no. 1-3, pp. 157-166, Dec. 2007. (Pubitemid 350028667)
    • (2007) Neurocomputing , vol.71 , Issue.1-3 , pp. 157-166
    • Kotti, M.1    Benetos, E.2    Kotropoulos, C.3    Pitas, I.4
  • 31
    • 0000995459
    • "The inverse Gaussian distribution and its statistical application-A review,"
    • J. L. Folks and R. S. Chhikara, "The inverse Gaussian distribution and its statistical application-A review," J. R. Statist. Soc. B, vol. 40, pp. 263-289, 1978.
    • (1978) J. R. Statist. Soc. B , vol.40 , pp. 263-289
    • Folks, J.L.1    Chhikara, R.S.2
  • 33
    • 0036920132
    • "Perpetual American processes under Lévy processes,"
    • Jun.
    • S. I. Boyarchenko and S. Z. Levendorskii, "Perpetual American processes under Lévy processes," SIAM J. Control Optim., vol. 40, no. 6, pp. 1514-1516, Jun. 2001.
    • (2001) SIAM J. Control Optim. , vol.40 , Issue.6 , pp. 1514-1516
    • Boyarchenko, S.I.1    Levendorskii, S.Z.2
  • 34
    • 0001302407
    • "Statistical properties of inverse Gaussian distributions I,"
    • Jun.
    • M. C. K. Tweedie, "Statistical properties of inverse Gaussian distributions I," Ann. Math. Statist., vol. 28, no. 2, pp. 362-377, Jun. 1957.
    • (1957) Ann. Math. Statist. , vol.28 , Issue.2 , pp. 362-377
    • Tweedie, M.C.K.1
  • 36
    • 0031381525
    • Wrappers for feature subset selection
    • PII S000437029700043X
    • R. Kohavi and G. H. John, "Wrappers for feature subset selection," Artif. Intell., vol. 97, no. 1-2, pp. 273-324, Dec. 1997. (Pubitemid 127401107)
    • (1997) Artificial Intelligence , vol.97 , Issue.1-2 , pp. 273-324
    • Kohavi, R.1    John, G.H.2
  • 37
    • 35348882681
    • Phonemic segmentation using the generalised Gamma distribution and small sample Bayesian information criterion
    • DOI 10.1016/j.specom.2007.06.005, PII S0167639307001197
    • G. Almpanidis and C. Kotropoulos, "Phonemic segmentation using the generalised Gamma distribution and small sample Bayesian information criterion," Speech Commun., vol. 50, no. 1, pp. 38-55, Jan. 2008. (Pubitemid 47576260)
    • (2008) Speech Communication , vol.50 , Issue.1 , pp. 38-55
    • Almpanidis, G.1    Kotropoulos, C.2
  • 38
    • 38949122754
    • "Speaker segmentation and clustering,"
    • May
    • M. Kotti, V. Moschou, and C. Kotropoulos, "Speaker segmentation and clustering," Signal Process., vol. 88, no. 5, pp. 1091-1124, May 2008.
    • (2008) Signal Process. , vol.88 , Issue.5 , pp. 1091-1124
    • Kotti, M.1    Moschou, V.2    Kotropoulos, C.3
  • 39
    • 84900334310
    • "The Tukey honestly significant difference procedure and its control of the type I error-rate,"
    • New Orleans, LA, CD-ROM
    • J. J. Barnette and J. E. McLean, "The Tukey honestly significant difference procedure and its control of the type I error-rate," in Proc. Annu. Meeting Mid-South Educ. Res. Assoc., New Orleans, LA, 1998, CD-ROM.
    • (1998) In Proc. Annu. Meeting Mid-South Educ. Res. Assoc.
    • Barnette, J.J.1    McLean, J.E.2
  • 41
    • 66149089715
    • "1997 English broadcast news speech (HUB4),"
    • Philadelphia, PA
    • J. Fiscus, "1997 English broadcast news speech (HUB4)," Linguistic Data Consortium, Philadelphia, PA, 1998.
    • (1998) Linguistic Data Consortium
    • Fiscus, J.1
  • 42
    • 66149132423
    • "1997 HUB4 English evaluation speech and transcripts,"
    • Philadelphia, PA
    • D. Graff, J. Fiscus, and J. Garofolo, "1997 HUB4 English evaluation speech and transcripts," Linguistic Data Consortium, Philadelphia, PA, 2002.
    • (2002) Linguistic Data Consortium
    • Graff, D.1    Fiscus, J.2    Garofolo, J.3
  • 45


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.