SCOPUS Information Retrieval Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 17, Issue 5, 2009, Pages 985-993

Prosodic and other long-term features for speaker diarization

(4) Friedland, Gerald (a); Vinyals, Oriol (a); Huang, Yan (a,b); Müller, Christian (a,c)

a INTERNATIONAL COMPUTER SCIENCE INSTITUTE (United States)

b Li Creative Technologies Inc (United States)

c GERMAN RESEARCH CENTER FOR ARTIFICIAL INTELLIGENCE DFKI (Germany)

Author keywords

Long term features; Prosody; Speaker diarization

Indexed keywords

AUDIO TRACK; DATA SETS; DISCRIMINABILITY; ERROR RATE; LONG-TERM FEATURES; PRIOR KNOWLEDGE; PROSODY; SPEAKER DIARIZATION

EID: 67651165389 PISSN: 15587916 EISSN: None Source Type: Journal
DOI: 10.1109/TASL.2009.2015089 Document Type: Article

Times cited: (56)

References (30)

1
- 33646380923
- Approaches and applications of audio diarization
- D. Reynolds and P. Torres-Carrasquillo, "Approaches and applications of audio diarization," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP'05), Mar. 2005, vol. 5, pp. 953-956.

2
- 36248960119
- Higher-Level Features in Speaker Recognition
- E. Shriberg, "Higher-Level Features in Speaker Recognition," in Speaker Classification I, ser. Lecture Notes in Artificial Intelligence, C. Müller, Ed. Heidelberg, Germany: Springer, 2007, vol. 4343.

3
- 85128436986
- Modeling dynamic prosodic variation for speaker verification
- K. Soenmez, E. Shriberg, L. Heck, and M. Weintraub, "Modeling dynamic prosodic variation for speaker verification," in Proc. Int. Conf. Spoken Lang. Process. 1998, 1998, no. 0920.
- (1998) Proc. Int. Conf. Spoken Lang. Process , vol.1998 , Issue.920
- Soenmez, K.¹ Shriberg, E.² Heck, L.³ Weintraub, M.⁴

4
- 0141814662
- The ICSI meeting corpus
- A. Janin et al., "The ICSI meeting corpus," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP'03), 2003, vol. 1, pp. 364-367.
- (2003) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP'03) , vol.1 , pp. 364-367
- Janin, A.¹

5
- 85168075201
- Modeling NERFs for speaker recognition
- S. Kajarekar, L. Ferrer, K. Soenmez, J. Zheng, E. Shriberg, and A. Stolcke, "Modeling NERFs for speaker recognition," in Proc. Speaker Odyssey, 2004, pp. 51-56.
- (2004) Proc. Speaker Odyssey , pp. 51-56
- Kajarekar, S.¹ Ferrer, L.² Soenmez, K.³ Zheng, J.⁴ Shriberg, E.⁵ Stolcke, A.⁶

6
- 0141856298
- Using prosodic and conversational features for high-performance speaker recognition: Report from JHU WS'02
- B. Peskin, J. Navratil, J. Abramson, D. Jones, D. Klusacek, and D. Reynolds, "Using prosodic and conversational features for high-performance speaker recognition: Report from JHU WS'02," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP'03), 2003, vol. 4, pp. 792-795.
- (2003) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP'03) , vol.4 , pp. 792-795
- Peskin, B.¹ Navratil, J.² Abramson, J.³ Jones, D.⁴ Klusacek, D.⁵ Reynolds, D.⁶

7
- 0141744710
- The SuperSID project: Exploiting high-level information for high-accuracy speaker recognition
- D. Reynolds et al., "The SuperSID project: Exploiting high-level information for high-accuracy speaker recognition," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP'03), 2003, vol. 4, pp. 784-787.
- (2003) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP'03) , vol.4 , pp. 784-787
- Reynolds, D.¹

8
- 21844454996
- Modeling prosodic feature sequences for speaker recognition
- Jul
- E. Shriberg, L. Ferrer, S. Kajarekar, A. Venkataraman, and A. Stolcke, "Modeling prosodic feature sequences for speaker recognition," in Speech Commun., Jul. 2005, vol. 46, no. 3-4, pp. 455-472.
- (2005) Speech Commun , vol.46 , Issue.3-4 , pp. 455-472
- Shriberg, E.¹ Ferrer, L.² Kajarekar, S.³ Venkataraman, A.⁴ Stolcke, A.⁵

9
- 34547515912
- Parameterization of prosodic feature distributions for SVM modeling in speaker recognition
- L. Ferrer, E. Shriberg, S. Kajarekar, and K. Sonmez, "Parameterization of prosodic feature distributions for SVM modeling in speaker recognition," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., (ICASSP'07), 2007, vol. 4, pp. 233-236.
- (2007) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., (ICASSP'07) , vol.4 , pp. 233-236
- Ferrer, L.¹ Shriberg, E.² Kajarekar, S.³ Sonmez, K.⁴

10
- 0002595416
- Speaker, environment and channel change detection and clustering via the Bayesian information criterion
- S. Chen and P. Gopalakrishnan, "Speaker, environment and channel change detection and clustering via the Bayesian information criterion," in Proc. DARPA Speech Recognition Workshop, 1998.
- (1998) Proc. DARPA Speech Recognition Workshop
- Chen, S.¹ Gopalakrishnan, P.²

11
- 44949264065
- A spectral clustering approach to speaker diarization
- H. Ning, M. Liu, H. Tang, and T. Huang, "A spectral clustering approach to speaker diarization," in Proc. Interspeech, 2006, ISCA, article ID: 1607-ThuA10.1.

12
- 0022018101
- A probabilistic distance measure for hidden Markov models
- B. H. Juang and L. R. Rabiner, "A probabilistic distance measure for hidden Markov models," AT&T Tech. J., vol. 64, no. 2, pp. 391-408, 1985.
- (1985) AT&T Tech. J , vol.64 , Issue.2 , pp. 391-408
- Juang, B.H.¹ Rabiner, L.R.²

13
- 33745560829
- Robust speaker segmentation for meetings: The ICSI-SRI spring 2005 diarization system
- Edinburgh, U.K., Springer
- X. Anguera, C. Wooters, B. Peskin, and M. Aguilo, "Robust speaker segmentation for meetings: The ICSI-SRI spring 2005 diarization system," in Proc. NIST MLMI Meeting Recognition Workshop, Edinburgh, U.K., 2005, pp. 402-414, Springer.
- (2005) Proc. NIST MLMI Meeting Recognition Workshop , pp. 402-414
- Anguera, X.¹ Wooters, C.² Peskin, B.³ Aguilo, M.⁴

14
- 0034273195
- Distbic: A speaker-based segmentation for audio data indexing
- P. Delacourt and C. Wellekens, "Distbic: A speaker-based segmentation for audio data indexing," Speech Commun.: Special Iss. Access. Inf. Spoken Audio, vol. 32, no. 1-2, pp. 111-126, 2000.

15
- 0028516097
- Text-independent speaker identification
- Oct
- H. Gish and M. Schmidt, "Text-independent speaker identification," IEEE Signal Process. Mag., vol. 11, no. 4, pp. 18-32, Oct. 1994.
- (1994) IEEE Signal Process. Mag , vol.11 , Issue.4 , pp. 18-32
- Gish, H.¹ Schmidt, M.²

16
- 0031233424
- Speaker recognition: A tutorial
- Oct
- J. Campbell, "Speaker recognition: A tutorial," Proc. IEEE, vol. 85, no. 9, pp. 1437-1462, Oct. 1997.
- (1997) Proc. IEEE , vol.85 , Issue.9 , pp. 1437-1462
- Campbell, J.¹

17
- 33646789869
- Hybrid speaker-based segmentation system using model-level clustering
- H. Kim, D. Ertelt, and T. Sikora, "Hybrid speaker-based segmentation system using model-level clustering," in IEEE Int. Conf. Acoust., Speech, Signal Process., (ICASSP'05), 2005, vol. 1, pp. 745-748.
- (2005) IEEE Int. Conf. Acoust., Speech, Signal Process., (ICASSP'05) , vol.1 , pp. 745-748
- Kim, H.¹ Ertelt, D.² Sikora, T.³

18
- 44949173184
- Multi-stream speaker diarization systems for the meetings domain
- A. Gallardo-Antolin, X. Anguera, and C. Wooters, "Multi-stream speaker diarization systems for the meetings domain," in Proc. Interspeech, 2006, no. 1620-Thu1A1O.3.

19
- 34548351229
- Speaker diarization for multiple distant microphone meetings: Mixing acoustic features and inter-channel time differences
- J. Pardo, X. Anguera, and C. Wooters, "Speaker diarization for multiple distant microphone meetings: Mixing acoustic features and inter-channel time differences," in Proc. Interspeech, 2006, no. 1337-Thu1A1O.5.

20
- 47749119617
- The ICSI RT07s speaker diarization system
- Springer
- C. Wooters and M. Huijbregts, "The ICSI RT07s speaker diarization system," in Proc. NIST RT07 Meeting Recognition Evaluation Workshop, 2007, pp. 509-519, Springer.
- (2007) Proc. NIST RT07 Meeting Recognition Evaluation Workshop , pp. 509-519
- Wooters, C.¹ Huijbregts, M.²

21
- 84946742526
- A robust speaker clustering algorithm
- J. Ajmera and C. Wooters, "A robust speaker clustering algorithm," in 2003 IEEE Workshop Autom. Speech Recognition Understanding ASRU'03, 2003, pp. 411-416.
- (2003) 2003 IEEE Workshop Autom. Speech Recognition Understanding ASRU'03 , pp. 411-416
- Ajmera, J.¹ Wooters, C.²

22
- 44849088089
- A fast-match approach for robust, faster than real-time speaker diarization
- Y. Huang, O. Vinyals, G. Friedland, C. Müller, N. Mirghafori, and C. Wooters, "A fast-match approach for robust, faster than real-time speaker diarization," in Proc. IEEE Autom. Speech Recognition Understanding Workshop, 2007, pp. 693-698.
- (2007) Proc. IEEE Autom. Speech Recognition Understanding Workshop , pp. 693-698
- Huang, Y.¹ Vinyals, O.² Friedland, G.³ Müller, C.⁴ Mirghafori, N.⁵ Wooters, C.⁶

23
- 0001835850
- Accurate short-term analysis of the fundamental frequency and the harmonics-to-noise ratio of a sampled sound
- Amsterdam, The Netherlands
- P. Boersma, "Accurate short-term analysis of the fundamental frequency and the harmonics-to-noise ratio of a sampled sound," in Proc. Dutch Inst. Phon. Sci. (IFA), Amsterdam, The Netherlands, 1993, pp. 97-110.
- (1993) Proc. Dutch Inst. Phon. Sci. (IFA) , pp. 97-110
- Boersma, P.¹

24
- 36249015937
- How is individuality expressed in voice? An introduction to speech production & description for speaker classification
- V. Dellwo, M. Huckvale, and M. Ashby, "How is individuality expressed in voice? An introduction to speech production & description for speaker classification," in Speaker Classification, ser. Lecture Notes in Computer Science/Artificial Intelligence, C. Müller, Ed. New York: Springer, 2007, vol. 4343.

25
- 0036985308
- Harmonics-to-noise ratio: An index of vocal aging
- C. Ferrand, "Harmonics-to-noise ratio: An index of vocal aging," J. Voice, vol. 16, no. 4, pp. 480-487, 2002.
- (2002) J. Voice , vol.16 , Issue.4 , pp. 480-487
- Ferrand, C.¹

26
- 4444257069
- PRAAT, a system for doing phonetics by computer
- P. Boersma, "PRAAT, a system for doing phonetics by computer," Glot Int., vol. 9, no. 5, pp. 341-345, 2001.
- (2001) Glot Int , vol.9 , Issue.5 , pp. 341-345
- Boersma, P.¹

27
- 0003922190
- 2nd ed. New York: Wiley
- R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2nd ed. New York: Wiley, 2001.
- (2001) Pattern Classification
- Duda, R.O.¹ Hart, P.E.² Stork, D.G.³

28
- 0003548585
- National Inst. Standards Technol., Gaithersburg, MD, Tech. Rep. NISTIR 4930
- J. S. Garofolo, L. F. Lamel, W. M. Fisher, J. G. Fiscus, D. S. Pallet, and N. L. Dahlgren, "The DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus CDROM," National Inst. Standards Technol., Gaithersburg, MD, Tech. Rep. NISTIR 4930, 1993.
- (1993) The DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus CDROM
- Garofolo, J.S.¹ Lamel, L.F.² Fisher, W.M.³ Fiscus, J.G.⁴ Pallet, D.S.⁵ Dahlgren, N.L.⁶

29
- 67651194790
- Sprecherklassifikation nach Alter und Geschlecht
- C. Mueller, Sprecherklassifikation nach Alter und Geschlecht. Heidelberg, Germany: Akademische Verlagsgesellschaft Aka GmbH, 2006.
- (2006) Heidelberg, Germany: Akademische Verlagsgesellschaft Aka GmbH
- Mueller, C.¹

30
- 34548310397
- Speaker diarization for multiple-distant-microphone meetings using several sources of information
- Sep
- A. Gallardo-Antolin, X. Anguera, and C. Wooters, "Speaker diarization for multiple-distant-microphone meetings using several sources of information," IEEE Trans. Comput., vol. 56, no. 9, pp. 1212-1224, Sep. 2007.
- (2007) IEEE Trans. Comput , vol.56 , Issue.9 , pp. 1212-1224
- Gallardo-Antolin, A.¹ Anguera, X.² Wooters, C.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.