



Volume 17, Issue 5, 2009, Pages 985-993

Prosodic and other long-term features for speaker diarization

Author keywords

Long term features; Prosody; Speaker diarization

Indexed keywords

AUDIO TRACK; DATA SETS; DISCRIMINABILITY; ERROR RATE; LONG-TERM FEATURES; PRIOR KNOWLEDGE; PROSODY; SPEAKER DIARIZATION

EID: 67651165389     PISSN: 15587916     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASL.2009.2015089     Document Type: Article
Times cited : (56)

References (30)
  • 1
    • 33646380923
    • D. Reynolds and P. Torres-Carrasquillo, "Approaches and applications of audio diarization," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP'05), Mar. 2005, vol. 5, pp. 953-956.
  • 2
    • 36248960119
    • E. Shriberg, "Higher-Level Features in Speaker Recognition," in Speaker Classification I, ser. Lecture Notes in Artificial Intelligence, C. Müller, Ed. Heidelberg, Germany: Springer, 2007, vol. 4343.
  • 7
    • 0141744710
    • D. Reynolds et al., "The SuperSID project: Exploiting high-level information for high-accuracy speaker recognition," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP'03), 2003, vol. 4, pp. 784-787.
  • 8
    • 21844454996
    • E. Shriberg, L. Ferrer, S. Kajarekar, A. Venkataraman, and A. Stolcke, "Modeling prosodic feature sequences for speaker recognition," Speech Commun., vol. 46, no. 3-4, pp. 455-472, Jul. 2005.
  • 10
    • 0002595416
    • S. Chen and P. Gopalakrishnan, "Speaker, environment and channel change detection and clustering via the Bayesian information criterion," in Proc. DARPA Speech Recognition Workshop, 1998.
  • 11
    • 44949264065
    • H. Ning, M. Liu, H. Tang, and T. Huang, "A spectral clustering approach to speaker diarization," in Proc. Interspeech, 2006, ISCA, article ID: 1607-ThuA10.1.
  • 12
    • 0022018101
    • B. H. Juang and L. R. Rabiner, "A probabilistic distance measure for hidden Markov models," AT&T Tech. J., vol. 64, no. 2, pp. 391-408, 1985.
  • 13
    • 33745560829
    • X. Anguera, C. Wooters, B. Peskin, and M. Aguilo, "Robust speaker segmentation for meetings: The ICSI-SRI spring 2005 diarization system," in Proc. NIST MLMI Meeting Recognition Workshop, Edinburgh, U.K., 2005, pp. 402-414, Springer.
  • 14
    • 0034273195
    • P. Delacourt and C. Wellekens, "Distbic: A speaker-based segmentation for audio data indexing," Speech Commun.: Special Iss. Access. Inf. Spoken Audio, vol. 32, no. 1-2, pp. 111-126, 2000.
  • 15
    • 0028516097
    • H. Gish and M. Schmidt, "Text-independent speaker identification," IEEE Signal Process. Mag., vol. 11, no. 4, pp. 18-32, Oct. 1994.
  • 16
    • 0031233424
    • J. Campbell, "Speaker recognition: A tutorial," Proc. IEEE, vol. 85, no. 9, pp. 1437-1462, Oct. 1997.
  • 18
    • 44949173184
    • A. Gallardo-Antolin, X. Anguera, and C. Wooters, "Multi-stream speaker diarization systems for the meetings domain," in Proc. Interspeech, 2006, no. 1620-Thu1A1O.3.
  • 19
    • 34548351229
    • J. Pardo, X. Anguera, and C. Wooters, "Speaker diarization for multiple distant microphone meetings: Mixing acoustic features and inter-channel time differences," in Proc. Interspeech, 2006, no. 1337-Thu1A1O.5.
  • 23
    • 0001835850
    • P. Boersma, "Accurate short-term analysis of the fundamental frequency and the harmonics-to-noise ratio of a sampled sound," in Proc. Dutch Inst. Phon. Sci. (IFA), Amsterdam, The Netherlands, 1993, pp. 97-110.
    • Boersma, P.1
  • 24
    • 36249015937
    • V. Dellwo, M. Huckvale, and M. Ashby, "How is individuality expressed in voice? An introduction to speech production & description for speaker classification," in Speaker Classification, ser. Lecture Notes in Computer Science/Artificial Intelligence, C. Müller, Ed. New York: Springer, 2007, vol. 4343.
  • 25
    • 0036985308
    • C. Ferrand, "Harmonics-to-noise ratio: An index of vocal aging," J. Voice, vol. 16, no. 4, pp. 480-487, 2002.
  • 26
    • 4444257069
    • P. Boersma, "PRAAT, a system for doing phonetics by computer," Glot Int., vol. 9, no. 5, pp. 341-345, 2001.
  • 30
    • 34548310397
    • A. Gallardo-Antolin, X. Anguera, and C. Wooters, "Speaker diarization for multiple-distant-microphone meetings using several sources of information," IEEE Trans. Comput., vol. 56, no. 9, pp. 1212-1224, Sep. 2007.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.