SCOPUS Information Search Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 20, Issue 2, 2012, Pages 356-370

Speaker Diarization: A Review of Recent Research

(6) Miro, Xavier Anguera a Bozonnet, Simon b Evans, Nicholas b Fredouille, Corinne c Friedland, Gerald d Vinyals, Oriol d

a TELEFONICA RESEARCH (Spain)

b EURECOM (France)

c UNIVERSITY OF AVIGNON (France)

d INTERNATIONAL COMPUTER SCIENCE INSTITUTE (United States)

Author keywords

[No Author keywords available]

Indexed keywords

EID: 85008530405 PISSN: 15587916 EISSN: 15587924 Source Type: Journal
DOI: 10.1109/TASL.2011.2125954 Document Type: Article

Times cited : (702)

References (130)

1
- 85008554815
- The NIST Rich Transcription 2009 (RT′09) Evaluation
- [Online]. Available: http://www.itl.nist.gov/iad/mig/tests/rt/2009/docs/rt09-meeting-eval-plan-v2.pdf
- “The NIST Rich Transcription 2009 (RT′09) Evaluation,” NIST, 2009 [Online]. Available: http://www.itl.nist.gov/iad/mig/tests/rt/2009/docs/rt09-meeting-eval-plan-v2.pdf.
- (2009) NIST

2
- 34047261805
- An overview of automatic speaker diarization systems
- Sep.
- S. Tranter and D. Reynolds, “An overview of automatic speaker diarization systems,” IEEE Trans. Audio, Speech, Lang. Process., vol. 14, no. 5, pp. 1557–1565, Sep. 2006.
- (2006) IEEE Trans. Audio, Speech, Lang. Process. , vol.14 , Issue.5 , pp. 1557-1565
- Tranter, S.¹ Reynolds, D.²

3
- 33947685454
- Nuts and flakes: A study of data characteristics in speaker diarization
- N. Mirghafori and C. Wooters, “Nuts and flakes: A study of data characteristics in speaker diarization,” in Proc. ICASSP, 2006.
- (2006) Proc. ICASSP
- Mirghafori, N.¹ Wooters, C.²

4
- 44849123928
- Robust speaker diarization for meetings
- Ph. D. dissertation, Univ. Politecnica de Catalunya, Barcelona, Spain
- X. Anguera, “Robust speaker diarization for meetings,” Ph. D. dissertation, Univ. Politecnica de Catalunya, Barcelona, Spain, 2006.
- (2006)
- Anguera, X.¹

5
- 66149116378
- Computationally efficient and robust BIC-based speaker segmentation
- Jul.
- M. Kotti, E. Benetos, and C. Kotropoulos, “Computationally efficient and robust BIC-based speaker segmentation,” IEEE Trans. Audio, Speech, Lang. Process., vol. 16, no. 5, pp. 920–933, Jul. 2008.
- (2008) IEEE Trans. Audio, Speech, Lang. Process. , vol.16 , Issue.5 , pp. 920-933
- Kotti, M.¹ Benetos, E.² Kotropoulos, C.³

6
- 47749123507
- Multi-stage speaker diarization for conference and lecture meetings
- Baltimore, MD, May 8–11, 2007, Revised Selected Papers, Berlin, Heidelberg: Springer-Verlag
- X. Zhu, C. Barras, L. Lamel, and J. -L. Gauvain, “Multi-stage speaker diarization for conference and lecture meetings,” in Proc. Multimodal Technol. Perception of Humans: Int. Eval. Workshops CLEAR 2007 and RT 2007, Baltimore, MD, May 8–11, 2007, Revised Selected Papers, Berlin, Heidelberg: Springer-Verlag, 2008, pp. 533–542.
- (2008) Proc. Multimodal Technol. Perception of Humans: Int. Eval. Workshops CLEAR 2007 and RT 2007 , pp. 533-542
- Zhu, X.¹ Barras, C.² Lamel, L.³ Gauvain, J.-L.⁴

7
- 67349120575
- Speaker diarization using autoassociative neural networks
- S. Jothilakshmi, V. Ramalingam, and S. Palanivel, “Speaker diarization using autoassociative neural networks,” Eng. Applicat. Artif. Intell., vol. 22, no. 4-5, pp. 667–675, 2009.
- (2009) Eng. Applicat. Artif. Intell. , vol.22 , Issue.4-5 , pp. 667-675
- Jothilakshmi, S.¹ Ramalingam, V.² Palanivel, S.³

8
- 44949197897
- Robust speaker diarization for meetings: ICSI RT06s evaluation system
- Sep.
- X. Anguera, C. Wooters, and J. Hernando, “Robust speaker diarization for meetings: ICSI RT06s evaluation system,” in Proc. ICSLP, Pittsburgh, PA, Sep. 2006.
- (2006) Proc. ICSLP, Pittsburgh, PA
- Anguera, X.¹ Wooters, C.² Hernando, J.³

9
- 47749119617
- The ICSI RT07s speaker diarization system
- Baltimore, MD, USA, May 8–11, 2007, Revised Selected Papers, Berlin, Heidelberg: Springer-Verlag
- C. Wooters and M. Huijbregts, “The ICSI RT07s speaker diarization system,” in Multimodal Technologies for Perception of Humans: International Evaluation Workshops CLEAR 2007 and RT 2007, Baltimore, MD, USA, May 8–11, 2007, Revised Selected Papers, Berlin, Heidelberg: Springer-Verlag, 2008, pp. 509–519.
- (2008) Multimodal Technologies for Perception of Humans: International Evaluation Workshops CLEAR 2007 and RT 2007 , pp. 509-519
- Wooters, C.¹ Huijbregts, M.²

10
- 33947630340
- Fast incremental clustering of Gaussian mixture speaker models for scaling up retrieval in on-line broadcast
- May
- J. Rougui, M. Rziza, D. Aboutajdine, M. Gelgon, and J. Martinez, “Fast incremental clustering of Gaussian mixture speaker models for scaling up retrieval in on-line broadcast,” in Proc. ICASSP, May 2006, vol. 5, pp. 521–524.
- (2006) Proc. ICASSP , vol.5 , pp. 521-524
- Rougui, J.¹ Rziza, M.² Aboutajdine, D.³ Gelgon, M.⁴ Martinez, J.⁵

11
- 85008571982
- W. Tsai, S. Cheng, and H. Wang, in Proc. ICSLP, 2004.
- (2004) Proc. ICSLP
- Tsai, W.¹ Cheng, S.² Wang, H.³

12
- 84867205879
- T-test distance and clustering criterion for speaker diarization
- T. H. Nguyen, E. S. Chng, and H. Li, “T-test distance and clustering criterion for speaker diarization,” in Proc. Interspeech, Brisbane, Australia, 2008.
- (2008) Proc. Interspeech, Brisbane, Australia
- Nguyen, T.H.¹ Chng, E.S.² Li, H.³

13
- 79959827767
- The IIR-NTU speaker diarization systems for RT 2009
- Melbourne, FL
- T. Nguyen, et al., “The IIR-NTU speaker diarization systems for RT 2009,” in Proc. RT′09, NIST Rich Transcription Workshop, Melbourne, FL, 2009.
- (2009) Proc. RT′09, NIST Rich Transcription Workshop
- Nguyen, T.¹

14
- 0141809272
- E-HMM approach for learning and adapting sound models for speaker indexing
- Jun.
- S. Meignier, J. -F. Bonastre, and S. Igounet, “E-HMM approach for learning and adapting sound models for speaker indexing,” in Proc. Odyssey Speaker and Lang. Recognition Workshop, Chania, Crete, Jun. 2001, pp. 175–180.
- (2001) Proc. Odyssey Speaker and Lang. Recognition Workshop, Chania, Crete , pp. 175-180
- Meignier, S.¹ Bonastre, J.-F.² Igounet, S.³

15
- 47749096771
- The LIA RT′07 speaker diarization system
- Baltimore, MD, USA, May 8–11, 2007, Revised Selected Papers, Berlin, Heidelberg: Springer-Verlag
- C. Fredouille and N. Evans, “The LIA RT′07 speaker diarization system,” in Proc. Multimodal Technol. for Perception of Humans: Int. Eval. Workshops CLEAR 2007 and RT 2007, Baltimore, MD, USA, May 8–11, 2007, Revised Selected Papers, Berlin, Heidelberg: Springer-Verlag, 2008, pp. 520–532.
- (2008) Proc. Multimodal Technol. for Perception of Humans: Int. Eval. Workshops CLEAR 2007 and RT 2007 , pp. 520-532
- Fredouille, C.¹ Evans, N.²

16
- 79956279915
- The LIA-EURECOM RT′09 speaker diarization system
- C. Fredouille, S. Bozonnet, and N. W. D. Evans, “The LIA-EURECOM RT′09 speaker diarization system,” in Proc. RT′09, NIST Rich Transcription Workshop, Melbourne, FL, 2009.
- (2009) Proc. RT′09, NIST Rich Transcription Workshop, Melbourne, FL
- Fredouille, C.¹ Bozonnet, S.² Evans, N.W.D.³

17
- 78049378635
- The LIA-EURECOM RT′09 speaker diarization system: Enhancements in speaker modelling and cluster purification
- Mar. 14-19
- S. Bozonnet, N. W. D. Evans, and C. Fredouille, “The LIA-EURECOM RT′09 speaker diarization system: Enhancements in speaker modelling and cluster purification,” in Proc. ICASSP, Dallas, TX, Mar. 14-19, 2010, pp. 4958–4961.
- (2010) Proc. ICASSP, Dallas, TX , pp. 4958-4961
- Bozonnet, S.¹ Evans, N.W.D.² Fredouille, C.³

18
- 44849112917
- Agglomerative information bottleneck for speaker diarization of meetings data
- Dec.
- D. Vijayasenan, F. Valente, and H. Bourlard, “Agglomerative information bottleneck for speaker diarization of meetings data,” in Proc. ASRU, Dec. 2007, pp. 250–255.
- (2007) Proc. ASRU , pp. 250-255
- Vijayasenan, D.¹ Valente, F.² Bourlard, H.³

19
- 68649087212
- An information theoretic approach to speaker diarization of meeting data
- Sep.
- D. Vijayasenan, F. Valente, and H. Bourlard, “An information theoretic approach to speaker diarization of meeting data,” IEEE Trans. Audio, Speech, Lang. Process., vol. 17, no. 7, pp. 1382–1393, Sep. 2009.
- (2009) IEEE Trans. Audio, Speech, Lang. Process. , vol.17 , Issue.7 , pp. 1382-1393
- Vijayasenan, D.¹ Valente, F.² Bourlard, H.³

20
- 84972808999
- Estimating normal means with a conjugate style Dirichlet process prior
- S. McEachern, “Estimating normal means with a conjugate style Dirichlet process prior,” in Proc. Commun. Statist.: Simul. Comput., 1994, vol. 23, pp. 727–741.
- (1994) Proc. Commun. Statist.: Simul. Comput. , vol.23 , pp. 727-741
- McEachern, S.¹

21
- 0027803368
- Keeping the neural networks simple by minimizing the description length of the weights
- COLT ′93
- G. E. Hinton and D. van Camp, “Keeping the neural networks simple by minimizing the description length of the weights,” in Proc. 6th Annu. Conf. Comput. Learn. Theory, New York, 1993, COLT ′93, pp. 5–13.
- (1993) Proc. 6th Annu. Conf. Comput. Learn. Theory, New York , pp. 5-13
- Hinton, G.E.¹ van Camp, D.²

22
- 33749249137
- Variational inference in graphical models: The view from the marginal polytope
- M. J. Wainwright and M. I. Jordan, “Variational inference in graphical models: The view from the marginal polytope,” in Proc. 41st Annu. Allerton Conf. Commun., Control, Comput., Urbana-Champaign, IL, 2003.
- (2003) Proc. 41st Annu. Allerton Conf. Commun., Control, Comput., Urbana-Champaign, IL
- Wainwright, M.J.¹ Jordan, M.I.²

23
- 70450171620
- Variational Bayesian methods for audio indexing
- Ph. D. dissertation, Eurecom Inst., Sophia-Antipolis, France
- F. Valente, “Variational Bayesian methods for audio indexing,” Ph. D. dissertation, Eurecom Inst., Sophia-Antipolis, France, 2005.
- (2005)
- Valente, F.¹

24
- 70450194565
- A study of new approaches to speaker diarization
- D. Reynolds, P. Kenny, and F. Castaldo, “A study of new approaches to speaker diarization,” in Proc. Interspeech, 2009.
- (2009) Proc. Interspeech
- Reynolds, D.¹ Kenny, P.² Castaldo, F.³

25
- 70450151829
- Bayesian Analysis of Speaker Diarization with Eigenvoice Priors
- Technical Report. Montreal, QC, Canada: CRIM
- P. Kenny, “Bayesian Analysis of Speaker Diarization with Eigenvoice Priors,” Technical Report. Montreal, QC, Canada: CRIM, 2008.
- (2008)
- Kenny, P.¹

26
- 79959818340
- A novel speaker binary key derived from anchor models
- X. Anguera and J. -F. Bonastre, “A novel speaker binary key derived from anchor models,” in Proc. Interspeech, 2010.
- (2010) Proc. Interspeech
- Anguera, X.¹ Bonastre, J.-F.²

27
- 80051641843
- Fast speaker diarization based on binary keys
- X. Anguera and J. -F. Bonastre, “Fast speaker diarization based on binary keys,” in Proc. ICASSP, 2011.
- (2011) Proc. ICASSP
- Anguera, X.¹ Bonastre, J.-F.²

28
- 44849088089
- A fast-match approach for robust, faster than real-time speaker diarization
- Dec.
- Y. Huang, O. Vinyals, G. Friedland, C. Muller, N. Mirghafori, and C. Wooters, “A fast-match approach for robust, faster than real-time speaker diarization,” in Proc. IEEE Workshop Autom. Speech Recognition Understanding, Kyoto, Japan, Dec. 2007, pp. 693–698.
- (2007) Proc. IEEE Workshop Autom. Speech Recognition Understanding, Kyoto, Japan , pp. 693-698
- Huang, Y.¹ Vinyals, O.² Friedland, G.³ Muller, C.⁴ Mirghafori, N.⁵ Wooters, C.⁶

29
- 79951759204
- Parallelizing speaker-attributed speech recognition for meeting browsing
- Dec.
- G. Friedland, J. Ching, and A. Janin, “Parallelizing speaker-attributed speech recognition for meeting browsing,” in Proc. IEEE Int. Symp. Multimedia, Taichung, Taiwan, Dec. 2010, pp. 121–128.
- (2010) Proc. IEEE Int. Symp. Multimedia, Taichung, Taiwan , pp. 121-128
- Friedland, G.¹ Ching, J.² Janin, A.³

30
- 34548352841
- Friends and enemies: A novel initialization for speaker diarization
- Sep.
- X. Anguera, C. Wooters, and J. Hernando, “Friends and enemies: A novel initialization for speaker diarization,” in Proc. ICSLP, Pittsburgh, PA, Sep. 2006.
- (2006) Proc. ICSLP, Pittsburgh, PA
- Anguera, X.¹ Wooters, C.² Hernando, J.³

31
- 84946742526
- A robust speaker clustering algorithm
- J. Ajmera, “A robust speaker clustering algorithm,” in Proc. ASRU, 2003, pp. 411–416.
- (2003) Proc. ASRU , pp. 411-416
- Ajmera, J.¹

32
- 33947706786
- Purity algorithms for speaker diarization of meetings data
- May
- X. Anguera, C. Wooters, and J. Hernando, “Purity algorithms for speaker diarization of meetings data,” in Proc. ICASSP, Toulouse, France, May 2006, pp. 1025–1028.
- (2006) Proc. ICASSP, Toulouse, France , pp. 1025-1028
- Anguera, X.¹ Wooters, C.² Hernando, J.³

33
- 0002595416
- Speaker, environment and channel change detection and clustering via the Bayesian information criterion
- Feb.
- S. S. Chen and P. S. Gopalakrishnan, “Speaker, environment and channel change detection and clustering via the Bayesian information criterion,” in Proc. DARPA Broadcast News Transcription and Understanding Workshop, Lansdowne, VA, Feb. 1998, pp. 127–132.
- (1998) Proc. DARPA Broadcast News Transcription and Understanding Workshop, Lansdowne, VA , pp. 127-132
- Chen, S.S.¹ Gopalakrishnan, P.S.²

34
- 0028516097
- Text independent speaker identification
- Oct.
- H. Gish and M. Schmidt, “Text independent speaker identification,” IEEE Signal Process. Mag., vol. 11, no. 4, pp. 18–32, Oct. 1994.
- (1994) IEEE Signal Process. Mag. , vol.11 , Issue.4 , pp. 18-32
- Gish, H.¹ Schmidt, M.²

35
- 84857722303
- The ICSI meeting project: Resources and research
- A. Janin, J. Ang, S. Bhagat, R. Dhillon, J. Edwards, J. Macias-Guarasa, N. Morgan, B. Peskin, E. Shriberg, A. Stolcke, C. Wooters, and B. Wrede, “The ICSI meeting project: Resources and research,” in Proc. ICASSP Meeting Recognition Workshop, 2004.
- (2004) Proc. ICASSP Meeting Recognition Workshop
- Janin, A.¹ Ang, J.² Bhagat, S.³ Dhillon, R.⁴ Edwards, J.⁵ Macias-Guarasa, J.⁶ Morgan, N.⁷ Peskin, B.⁸ Shriberg, E.⁹ Stolcke, A.¹⁰ Wooters, C.¹¹ Wrede, B.¹²

36
- 70350419245
- The AMI meeting corpus
- I. McCowan, J. Carletta, W. Kraaij, S. Ashby, S. Bourban, M. Flynn, M. Guillemot, T. Hain, J. Kadlec, V. Karaiskos, M. Kronenthal, G. Lathoud, M. Lincoln, A. Lisowska, W. Post, D. Reidsma, and P. Wellner, “The AMI meeting corpus,” in Proc. Meas. Behavior, 2005.
- (2005) Proc. Meas. Behavior
- McCowan, I.¹ Carletta, J.² Kraaij, W.³ Ashby, S.⁴ Bourban, S.⁵ Flynn, M.⁶ Guillemot, M.⁷ Hain, T.⁸ Kadlec, J.⁹ Karaiskos, V.¹⁰ Kronenthal, M.¹¹ Lathoud, G.¹² Lincoln, M.¹³ Lisowska, A.¹⁴ Post, W.¹⁵ Reidsma, D.¹⁶ Wellner, P.¹⁷

37
- 41349114281
- The CHIL audiovisual corpus for lecture and meeting analysis inside smart rooms
- Dec.
- D. Mostefa, N. Moreau, K. Choukri, G. Potamianos, S. M. Chu, A. Tyagi, J. R. Casas, J. Turmo, L. Cristoforetti, F. Tobia, A. Pnevmatikakis, V. Mylonakis, F. Talantzis, S. Burger, R. Stiefelhagen, K. Bernardin, and C. Rochet, “The CHIL audiovisual corpus for lecture and meeting analysis inside smart rooms,” Lang. Resources Eval., vol. 41, Dec. 2007.
- (2007) Lang. Resources Eval. , vol.41
- Mostefa, D.¹ Moreau, N.² Choukri, K.³ Potamianos, G.⁴ Chu, S.M.⁵ Tyagi, A.⁶ Casas, J.R.⁷ Turmo, J.⁸ Cristoforetti, L.⁹ Tobia, F.¹⁰ Pnevmatikakis, A.¹¹ Mylonakis, V.¹² Talantzis, F.¹³ Burger, S.¹⁴ Stiefelhagen, R.¹⁵ Bernardin, K.¹⁶ Rochet, C.¹⁷

38
- 33846209880
- The NIST 2004 spring rich transcription evaluation: Two-axis merging strategy in the context of multiple distant microphone based meeting speaker segmentation
- Montreal, QC, Canada
- C. Fredouille, D. Moraru, S. Meignier, L. Besacier, and J. -F. Bonastre, “The NIST 2004 spring rich transcription evaluation: Two-axis merging strategy in the context of multiple distant microphone based meeting speaker segmentation,” in Proc. NIST 2004 Spring Rich Transcript. Eval. Workshop, Montreal, QC, Canada, 2004.
- (2004) Proc. NIST 2004 Spring Rich Transcript. Eval. Workshop
- Fredouille, C.¹ Moraru, D.² Meignier, S.³ Besacier, L.⁴ Bonastre, J.-F.⁵

39
- 85009080849
- Speaker segmentation and clustering in meetings
- Sep.
- Q. Jin, K. Laskowski, T. Schultz, and A. Waibel, “Speaker segmentation and clustering in meetings,” in Proc. ICSLP, Jeju, Korea, Sep. 2004.
- (2004) Proc. ICSLP, Jeju, Korea
- Jin, Q.¹ Laskowski, K.² Schultz, T.³ Waibel, A.⁴

40
- 33947655775
- NIST RT05S evaluation: Pre-processing techniques and speaker diarization on multiple microphone meetings
- Edinburgh, U. K., Jul.
- D. Istrate, C. Fredouille, S. Meignier, L. Besacier, and J. -F. Bonastre, “NIST RT05S evaluation: Pre-processing techniques and speaker diarization on multiple microphone meetings,” in Proc. NIST 2005 Spring Rich Transcript. Eval. Workshop, Edinburgh, U. K., Jul. 2005.
- (2005) Proc. NIST 2005 Spring Rich Transcript. Eval. Workshop
- Istrate, D.¹ Fredouille, C.² Meignier, S.³ Besacier, L.⁴ Bonastre, J.-F.⁵

41
- 34548361194
- Robust speaker segmentation for meetings: The ICSI-SRI spring 2005 diarization system
- Edinburgh, U. K.
- X. Anguera, C. Wooters, B. Peskin, and M. Aguilo, “Robust speaker segmentation for meetings: The ICSI-SRI spring 2005 diarization system,” in Proc. NIST MLMI Meeting Recognition Workshop, Edinburgh, U. K., 2005.
- (2005) Proc. NIST MLMI Meeting Recognition Workshop
- Anguera, X.¹ Wooters, C.² Peskin, B.³ Aguilo, M.⁴

42
- 50449086237
- Acoustic beamforming for speaker diarization of meetings
- Sep.
- X. Anguera, C. Wooters, and J. Hernando, “Acoustic beamforming for speaker diarization of meetings,” IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 7, pp. 2011–2023, Sep. 2007.
- (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , Issue.7 , pp. 2011-2023
- Anguera, X.¹ Wooters, C.² Hernando, J.³

43
- 44849143841
- [Online]. Available: http://www.xavieranguera.com/beamformit/
- X. Anguera, BeamformIt (The Fast and Robust Acoustic Beamformer) [Online]. Available: http://www.xavieranguera.com/beamformit/.
- BeamformIt (The Fast and Robust Acoustic Beamformer)
- Anguera, X.¹

44
- 84965539501
- New York: Wiley
- N. Wiener, Extrapolation, Interpolation, and Smoothing of Stationary Time Series. New York: Wiley, 1949.
- (1949) Extrapolation, Interpolation, and Smoothing of Stationary Time Series
- Wiener, N.¹

45
- 85009231870
- Qualcomm-ICSI-OGI features for ASR
- A. Adami, L. Burget, S. Dupont, H. Garudadri, F. Grezl, H. Hermansky, P. Jain, S. Kajarekar, N. Morgan, and S. Sivadas, “Qualcomm-ICSI-OGI features for ASR,” in Proc. ICSLP, 2002, vol. 1, pp. 4–7.
- (2002) Proc. ICSLP , vol.1 , pp. 4-7
- Adami, A.¹ Burget, L.² Dupont, S.³ Garudadri, H.⁴ Grezl, F.⁵ Hermansky, H.⁶ Jain, P.⁷ Kajarekar, S.⁸ Morgan, N.⁹ Sivadas, S.¹⁰

46
- 4344607755
- Likelihood maximizing beamforming for robust hands-free speech recognition
- Sep.
- M. L. Seltzer, B. Raj, and R. M. Stern, “Likelihood maximizing beamforming for robust hands-free speech recognition,” IEEE Trans. Speech Audio Process., vol. 12, no. 5, pp. 489–498, Sep. 2004.
- (2004) IEEE Trans. Speech Audio Process. , vol.12 , Issue.5 , pp. 489-498
- Seltzer, M.L.¹ Raj, B.² Stern, R.M.³

47
- 0019928857
- An alternative approach to linearly constrained adaptive beamforming
- Jan.
- L. J. Griffiths and C. W. Jim, “An alternative approach to linearly constrained adaptive beamforming,” IEEE Trans. Antennas Propagat., vol. AP-30, no. 1, pp. 27–34, Jan. 1982.
- (1982) IEEE Trans. Antennas Propagat. , vol.AP-30 , Issue.1 , pp. 27-34
- Griffiths, L.J.¹ Jim, C.W.²

48
- 50449083999
- New York: Wiley
- M. Woelfel and J. McDonough, Distant Speech Recognition. New York: Wiley, 2009.
- (2009) Distant Speech Recognition
- Woelfel, M.¹ McDonough, J.²

49
- 34047268275
- Towards robust speaker segmentation: The ICSI-SRI fall 2004 diarization system
- Palisades, NY, Nov.
- C. Wooters, J. Fung, B. Peskin, and X. Anguera, “Towards robust speaker segmentation: The ICSI-SRI fall 2004 diarization system,” in Proc. Fall 2004 Rich Transcript. Workshop (RT04), Palisades, NY, Nov. 2004.
- (2004) Proc. Fall 2004 Rich Transcript. Workshop (RT04)
- Wooters, C.¹ Fung, J.² Peskin, B.³ Anguera, X.⁴

50
- 70450144960
- Voice activity detection. Fundamentals and speech recognition system robustness
- Jun.
- J. Ramirez, J. M. Gorriz, and J. C. Segura, M. Grimm and K. Kroschel, Eds., “Voice activity detection. Fundamentals and speech recognition system robustness,” in Proc. Robust Speech Recognit. Understand., Vienna, Austria, Jun. 2007, p. 460.
- (2007) Proc. Robust Speech Recognit. Understand., Vienna, Austria , pp. 460
- Ramirez, J.¹ Gorriz, J.M.² Segura, J.C.³ Grimm, M.⁴ Kroschel, K.⁵

51
- 77249126512
- Technical improvements of the E-HMM based speaker diarization system for meeting records
- C. Fredouille and G. Senay, “Technical improvements of the E-HMM based speaker diarization system for meeting records,” in Proc. MLMI Third Int. Workshop, Bethesda, MD, USA, Revised Selected Paper, Berlin, Heidelberg: Springer-Verlag, 2006, pp. 359–370.
- (2006) Proc. MLMI Third Int. Workshop, Bethesda, MD, USA, Revised Selected Paper, Berlin, Heidelberg: Springer-Verlag , pp. 359-370
- Fredouille, C.¹ Senay, G.²

52
- 47749103773
- Progress in the AMIDA speaker diarization system for meeting data
- Baltimore, MD, May 8-11, 2007, Revised Selected Papers, Berlin, Heidelberg: Springer-Verlag
- D. A. V. Leeuwen and M. Konečný, “Progress in the AMIDA speaker diarization system for meeting data,” in Proc. Multimodal Technol. for Percept. of Humans: Int. Eval Workshops CLEAR 2007 and RT 2007, Baltimore, MD, May 8-11, 2007, Revised Selected Papers, Berlin, Heidelberg: Springer-Verlag, 2008, pp. 475–483.
- (2008) Proc. Multimodal Technol. for Percept. of Humans: Int. Eval Workshops CLEAR 2007 and RT 2007 , pp. 475-483
- Leeuwen, D.A.V.¹ Konečný, M.²

53
- 77249109902
- The 2006 Athens information technology speech activity detection and speaker diarization systems
- Bethesda, MD, Revised Selected Paper, Berlin, Heidelberg: Springer-Verlag
- A. Rentzeperis, A. Stergious, C. Boukis, A. Pnevmatikakis, and L. Polymenakos, “The 2006 Athens information technology speech activity detection and speaker diarization systems,” in Proc. Mach. Learn. Multimodal Interaction: 3rd Int. Workshop, MLMI 2006, Bethesda, MD, Revised Selected Paper, Berlin, Heidelberg: Springer-Verlag, 2006, pp. 385–395.
- (2006) Proc. Mach. Learn. Multimodal Interaction: 3rd Int. Workshop, MLMI 2006 , pp. 385-395
- Rentzeperis, A.¹ Stergious, A.² Boukis, C.³ Pnevmatikakis, A.⁴ Polymenakos, L.⁵

54
- 34547526911
- Enhanced SVM training for robust speech activity detection
- A. Temko, D. Macho, and C. Nadeu, “Enhanced SVM training for robust speech activity detection,” in Proc. ICASSP, Honolulu, HI, 2007, pp. 1025–1028.
- (2007) Proc. ICASSP, Honolulu, HI , pp. 1025-1028
- Temko, A.¹ Macho, D.² Nadeu, C.³

55
- 84890517976
- Hybrid speech/non-speech detector applied to speaker diarization of meetings
- Jun.
- X. Anguera, C. Wooters, M. Aguilo, and C. Nadeu, “Hybrid speech/non-speech detector applied to speaker diarization of meetings,” in Proc. Speaker Odyssey Workshop, Puerto Rico, Jun. 2006.
- (2006) Proc. Speaker Odyssey Workshop, Puerto Rico
- Anguera, X.¹ Wooters, C.² Aguilo, M.³ Nadeu, C.⁴

56
- 70450161112
- Speaker diarization for meeting room audio
- Sep.
- H. Sun, T. L. Nwe, B. Ma, and H. Li, “Speaker diarization for meeting room audio,” in Proc. Interspeech′09, Sep. 2009.
- (2009) Proc. Interspeech′09
- Sun, H.¹ Nwe, T.L.² Ma, B.³ Li, H.⁴

57
- 70349218123
- Speaker diarization in meeting audio
- T. L. Nwe, H. Sun, H. Li, and S. Rahardja, “Speaker diarization in meeting audio,” in Proc. ICASSP, Taipei, Taiwan, 2009, pp. 4073–4076.
- (2009) Proc. ICASSP, Taipei, Taiwan , pp. 4073-4076
- Nwe, T.L.¹ Sun, H.² Li, H.³ Rahardja, S.⁴

58
- 70349197676
- Improved speaker diarization system for meetings
- E. El-Khoury, C. Senac, and J. Pinquier, “Improved speaker diarization system for meetings,” in Proc. ICASSP, Taipei, Taiwan, 2009, pp. 4097–4100.
- (2009) Proc. ICASSP, Taipei, Taiwan , pp. 4097-4100
- El-Khoury, E.¹ Senac, C.² Pinquier, J.³

59
- 0036816475
- Content analysis for audio classification and segmentation
- Oct.
- L. Lu, H. -J. Zhang, and H. Jiang, “Content analysis for audio classification and segmentation,” IEEE Trans. Speech Audio Process., vol. 10, no. 7, pp. 504–516, Oct. 2002.
- (2002) IEEE Trans. Speech Audio Process. , vol.10 , Issue.7 , pp. 504-516
- Lu, L.¹ Zhang, H.-J.² Jiang, H.³

60
- 85008578854
- Improving speaker segmentation via speaker identification and text segmentation
- Sep.
- R. Li, Q. Jin, and T. Schultz, “Improving speaker segmentation via speaker identification and text segmentation,” in Proc. Interspeech, Sep. 2009, pp. 3073–3076.
- (2009) Proc. Interspeech , pp. 3073-3076
- Li, R.¹ Jin, Q.² Schultz, T.³

61
- 77951283289
- Speaker diarization using bottom-up clustering based on a parameter-derived distance between adapted GMMs
- M. Ben, M. Betser, F. Bimbot, and G. Gravier, “Speaker diarization using bottom-up clustering based on a parameter-derived distance between adapted GMMs,” in Proc. ICSLP, Jeju Island, Korea, 2004.
- (2004) Proc. ICSLP, Jeju Island, Korea
- Ben, M.¹ Betser, M.² Bimbot, F.³ Gravier, G.⁴

62
- 77249176190
- The AMI speaker diarization system for NIST RT06s meeting data
- Berlin, Germany: Springer-Verlag, Lecture Notes in Computer Science
- D. Van Leeuwen and M. Huijbregts, “The AMI speaker diarization system for NIST RT06s meeting data,” in Machine Learning for Multimodal Interaction. Berlin, Germany: Springer-Verlag, 2007, vol. 4299, Lecture Notes in Computer Science, pp. 371–384.
- (2007) Machine Learning for Multimodal Interaction , vol.4299 , pp. 371-384
- Van Leeuwen, D.¹ Huijbregts, M.²

63
- 85008561506
- The COST278 pan-European broadcast news database
- A. Vandecatseye, J. -P. Martens, J. Neto, H. Meinedo, C. Garcia-Mateo, J. Dieguez, F. Mihelic, J. Zibert, J. Nouza, P. David, M. Pleva, A. Cizmar, H. Papageorgiou, and C. Alexandris, “The COST278 pan-European broadcast news database,” in Proc. LREC, Lisbon, Portugal, 5, 2004, vol. 4, pp. 873–876.
- (2004) Proc. LREC, Lisbon, Portugal , vol.4 , Issue.5 , pp. 873-876
- Vandecatseye, A.¹ Martens, J.-P.² Neto, J.³ Meinedo, H.⁴ Garcia-Mateo, C.⁵ Dieguez, J.⁶ Mihelic, F.⁷ Zibert, J.⁸ Nouza, J.⁹ David, P.¹⁰ Pleva, M.¹¹ Cizmar, A.¹² Papageorgiou, H.¹³ Alexandris, C.¹⁴

64
- 0034857759
- Speaker change detection and speaker clustering using VQ distortion for broadcast news speech recognition
- K. Mori and S. Nakagawa, “Speaker change detection and speaker clustering using VQ distortion for broadcast news speech recognition,” in Proc. ICASSP, 2001, pp. 413–416.
- (2001) Proc. ICASSP , pp. 413-416
- Mori, K.¹ Nakagawa, S.²

65
- 3543144948
- Robust speaker change detection
- J. Ajmera and I. McCowan, “Robust speaker change detection,” IEEE Signal Process. Lett., vol. 11, pp. 649–651, 2004.
- (2004) IEEE Signal Process. Lett. , vol.11 , pp. 649-651
- Ajmera, J.¹ McCowan, I.²

66
- 33645326073
- Real-time unsupervised speaker change detection
- L. Lu and H. -J. Zhang, “Real-time unsupervised speaker change detection,” in 16th Int. Conf. Pattern Recognit., 2002, vol. 2, pp. 358–361.
- (2002) 16th Int. Conf. Pattern Recognit. , vol.2 , pp. 358-361
- Lu, L.¹ Zhang, H.-J.²

67
- 85008578884
- Evolutive speaker segmentation using a repository system
- X. Anguera and J. Hernando, “Evolutive speaker segmentation using a repository system,” in Proc. Interspeech, 2004.
- (2004) Proc. Interspeech
- Anguera, X.¹ Hernando, J.²

68
- 33846242627
- Speaker diarization for multi-party meetings using acoustic fusion
- Nov.
- X. Anguera, C. Wooters, and J. Hernando, “Speaker diarization for multi-party meetings using acoustic fusion,” in Proc. ASRU, Nov. 2005, pp. 426–431.
- (2005) Proc. ASRU , pp. 426-431
- Anguera, X.¹ Wooters, C.² Hernando, J.³

69
- 33746354301
- Unsupervised speaker change detection using probabilistic pattern matching
- Aug.
- A. Malegaonkar, A. Ariyaeeinia, P. Sivakumaran, and J. Fortuna, “Unsupervised speaker change detection using probabilistic pattern matching,” IEEE Signal Process. Lett., vol. 13, no. 8, pp. 509–512, Aug. 2006.
- (2006) IEEE Signal Process. Lett. , vol.13 , Issue.8 , pp. 509-512
- Malegaonkar, A.¹ Ariyaeeinia, A.² Sivakumaran, P.³ Fortuna, J.⁴

70
- 0026400244
- Segregation of speakers for speech recognition and speaker identification
- M. -H. Siu, G. Yu, and H. Gish, “Segregation of speakers for speech recognition and speaker identification,” in Proc. ICASSP′91, 1991, pp. 873–876.
- (1991) Proc. ICASSP′91 , pp. 873-876
- Siu, M.-H.¹ Yu, G.² Gish, H.³

71
- 0034273195
- DISTBIC: A speaker-based segmentation for audio data indexing
- P. Delacourt and C. Wellekens, “DISTBIC: A speaker-based segmentation for audio data indexing,” Speech Commun., pp. 111–126, 2000.
- (2000) Speech Commun. , pp. 111-126
- Delacourt, P.¹ Wellekens, C.²

72
- 84867210169
- Agglomerative hierarchical speaker clustering using incremental Gaussian mixture cluster modeling
- S. S. Han and K. J. Narayanan, “Agglomerative hierarchical speaker clustering using incremental Gaussian mixture cluster modeling,” in Proc. Interspeech′08, Brisbane, Australia, 2008, pp. 20–23.
- (2008) Proc. Interspeech′08, Brisbane, Australia , pp. 20-23
- Han, S.S.¹ Narayanan, K.J.²

73
- 85009142161
- A novel method for two speaker segmentation
- Sep.
- R. Gangadharaiah, B. Narayanaswamy, and N. Balakrishnan, “A novel method for two speaker segmentation,” in Proc. ICSLP, Jeju, Korea, Sep. 2004.
- (2004) Proc. ICSLP, Jeju, Korea
- Gangadharaiah, R.¹ Narayanaswamy, B.² Balakrishnan, N.³

74
- 85119434191
- Fast speaker change detection for broadcast news transcription and indexing
- Sep.
- D. Liu and F. Kubala, “Fast speaker change detection for broadcast news transcription and indexing,” in Proc. Eurospeech′99, Sep. 1999, pp. 1031–1034.
- (1999) Proc. Eurospeech′99 , pp. 1031-1034
- Liu, D.¹ Kubala, F.²

75
- 0002782496
- Automatic segmentation, classification and clustering of broadcast news audio
- M. A. Siegler, U. Jain, B. Raj, and R. M. Stern, “Automatic segmentation, classification and clustering of broadcast news audio,” in Proc. DARPA Speech Recognit. Workshop, 1997, pp. 97–99.
- (1997) Proc. DARPA Speech Recognit. Workshop , pp. 97-99
- Siegler, M.A.¹ Jain, U.² Raj, B.³ Stern, R.M.⁴

76
- 33745200950
- Modified DISTBIC algorithm for speaker change detection
- P. Zochova and V. Radova, “Modified DISTBIC algorithm for speaker change detection,” in Proc. 9th Eur. Conf. Speech Commun. Technol., Bonn, Germany, 2005, pp. 3073–3076.
- (2005) Proc. 9th Eur. Conf. Speech Commun. Technol., Bonn, Germany , pp. 3073-3076
- Zochova, P.¹ Radova, V.²

77
- 70350349017
- Speaker diarization: From broadcast news to lectures
- X. Zhu, C. Barras, L. Lamel, and J. -L. Gauvain, “Speaker diarization: From broadcast news to lectures,” in Proc. MLMI, 2006, pp. 396–406.
- (2006) Proc. MLMI , pp. 396-406
- Zhu, X.¹ Barras, C.² Lamel, L.³ Gauvain, J.-L.⁴

78
- 51449100003
- Novel inter-cluster distance measure combining GLR and ICR for improved agglomerative hierarchical speaker clustering
- Apr.
- K. Han and S. Narayanan, “Novel inter-cluster distance measure combining GLR and ICR for improved agglomerative hierarchical speaker clustering,” in Proc. ICASSP, Apr. 2008, pp. 4373–4376.
- (2008) Proc. ICASSP , pp. 4373-4376
- Han, K.¹ Narayanan, S.²

79
- 77956276107
- Experiments on speaker tracking and segmentation in radio broadcast news
- D. Moraru, M. Ben, and G. Gravier, “Experiments on speaker tracking and segmentation in radio broadcast news,” in Proc. ICSLP, 2005.
- (2005) Proc. ICSLP
- Moraru, D.¹ Ben, M.² Gravier, G.³

80
- 33745197522
- Improving speaker diarization
- C. Barras, X. Zhu, S. Meignier, and J. -L. Gauvain, “Improving speaker diarization,” in Proc. DARPA RT04, 2004.
- (2004) Proc. DARPA RT04
- Barras, C.¹ Zhu, X.² Meignier, S.³ Gauvain, J.-L.⁴

81
- 64249126167
- Trainable speaker diarization
- Aug.
- H. Aronowitz, “Trainable speaker diarization,” in Proc. Interspeech, Aug. 2007, pp. 1861–1864.
- (2007) Proc. Interspeech , pp. 1861-1864
- Aronowitz, H.¹

82
- 84897697234
- Towards audio-visual on-line diarization of participants in group meetings
- H. Hung and G. Friedland, “Towards audio-visual on-line diarization of participants in group meetings,” in Proc. Workshop Multi-Camera and Multi-Modal Sensor Fusion Algorithms Applicat. -M2SFA2, Marseille, France, 2008.
- (2008) Proc. Workshop Multi-Camera and Multi-Modal Sensor Fusion Algorithms Applicat. -M2SFA2, Marseille, France
- Hung, H.¹ Friedland, G.²

83
- 70350635616
- Live speaker identification in conversations
- G. Friedland and O. Vinyals, “Live speaker identification in conversations,” in Proc. MM′08: Proc. 16th ACM Int. Conf. Multimedia, New York, 2008, pp. 1017–1018.
- (2008) Proc. MM′08: Proc. 16th ACM Int. Conf. Multimedia, New York , pp. 1017-1018
- Friedland, G.¹ Vinyals, O.²

84
- 67651165389
- Prosodic and other long-term features for speaker diarization
- Jul.
- G. Friedland, O. Vinyals, Y. Huang, and C. Muller, “Prosodic and other long-term features for speaker diarization,” IEEE Trans. Audio, Speech, Lang. Process., vol. 17, no. 5, pp. 985–993, Jul. 2009.
- (2009) IEEE Trans. Audio, Speech, Lang. Process. , vol.17 , Issue.5 , pp. 985-993
- Friedland, G.¹ Vinyals, O.² Huang, Y.³ Muller, C.⁴

85
- 47749127366
- Speaker diarization for conference room: The UPC RT07s evaluation system
- Baltimore, MD, May 8-11, 2007, Revised Selected Papers, Berlin, Heidelberg: Springer-Verlag
- J. Luque, X. Anguera, A. Temko, and J. Hernando, “Speaker diarization for conference room: The UPC RT07s evaluation system,” in Proc. Multimodal Technol. Perception of Humans: Int. Eval. Workshops CLEAR 2007 and RT 2007, Baltimore, MD, May 8-11, 2007, Revised Selected Papers, Berlin, Heidelberg: Springer-Verlag, 2008, pp. 543–553.
- (2008) Proc. Multimodal Technol. Perception of Humans: Int. Eval. Workshops CLEAR 2007 and RT 2007 , pp. 543-553
- Luque, J.¹ Anguera, X.² Temko, A.³ Hernando, J.⁴

86
- 34548351229
- Speaker diarization for multiple distant microphone meetings: Mixing acoustic features and interchannel time differences
- J. Pardo, X. Anguera, and C. Wooters, “Speaker diarization for multiple distant microphone meetings: Mixing acoustic features and interchannel time differences,” in Proc. Interspeech, 2006.
- (2006) Proc. Interspeech
- Pardo, J.¹ Anguera, X.² Wooters, C.³

87
- 0141591540
- Location based speaker segmentation
- G. Lathoud and I. McCowan, “Location based speaker segmentation,” in Proc. ICASSP, 2003, vol. 1, pp. 176–179.
- (2003) Proc. ICASSP , vol.1 , pp. 176-179
- Lathoud, G.¹ McCowan, I.²

88
- 33746619064
- Speaker turn detection based on between-channels differences
- D. Ellis and J. C. Liu, “Speaker turn detection based on between-channels differences,” in Proc. ICASSP, 2004.
- (2004) Proc. ICASSP
- Ellis, D.¹ Liu, J.C.²

89
- 4544339441
- Clustering and segmenting speakers and their locations in meetings
- J. Ajmera, G. Lathoud, and I. McCowan, “Clustering and segmenting speakers and their locations in meetings,” in Proc. ICASSP, 2004, vol. 1, pp. 605–608.
- (2004) Proc. ICASSP , vol.1 , pp. 605-608
- Ajmera, J.¹ Lathoud, G.² McCowan, I.³

90
- 34548351229
- Speaker diarization for multiple distant microphone meetings: Mixing acoustic features and inter-channel time differences
- J. M. Pardo, X. Anguera, and C. Wooters, “Speaker diarization for multiple distant microphone meetings: Mixing acoustic features and inter-channel time differences,” in Proc. Interspeech, 2006.
- (2006) Proc. Interspeech
- Pardo, J.M.¹ Anguera, X.² Wooters, C.³

91
- 34548310397
- Speaker diarization for multiple-distant-microphone meetings using several sources of information
- Sep.
- J. Pardo, X. Anguera, and C. Wooters, “Speaker diarization for multiple-distant-microphone meetings using several sources of information,” IEEE Trans. Comput., vol. 56, no. 9, pp. 1212–1224, Sep. 2007.
- (2007) IEEE Trans. Comput. , vol.56 , Issue.9 , pp. 1212-1224
- Pardo, J.¹ Anguera, X.² Wooters, C.³

92
- 70349220969
- Speaker diarization using unsupervised discriminant analysis of inter-channel delay features
- Apr.
- N. W. D. Evans, C. Fredouille, and J.-F. Bonastre, “Speaker diarization using unsupervised discriminant analysis of inter-channel delay features,” in Proc. ICASSP, Apr. 2009, pp. 4061–4064.
- (2009) Proc. ICASSP , pp. 4061-4064
- Evans, N.W.D.¹ Fredouille, C.² Bonastre, J.-F.³

93
- 70450179598
- Speaker identification using warped MVDR cepstral features
- M. Wölfel, Q. Yang, Q. Jin, and T. Schultz, “Speaker identification using warped MVDR cepstral features,” in Proc. Interspeech, 2009.
- (2009) Proc. Interspeech
- Wölfel, M.¹ Yang, Q.² Jin, Q.³ Schultz, T.⁴

94
- 63649094710
- Higher-level features in speaker recognition
- C. Müller, Ed. Berlin, Heidelberg, Germany: Springer, Lecture Notes in Artificial Intelligence
- E. Shriberg, “Higher-level features in speaker recognition,” in Speaker Classification I, C. Müller, Ed. Berlin, Heidelberg, Germany: Springer, 2007, vol. 4343, Lecture Notes in Artificial Intelligence.
- (2007) Speaker Classification I , vol.4343
- Shriberg, E.¹

95
- 77956543877
- Tuning-robust initialization methods for speaker diarization
- Nov.
- D. Imseng and G. Friedland, “Tuning-robust initialization methods for speaker diarization,” IEEE Trans. Audio, Speech, Lang. Process., vol. 18, no. 8, pp. 2028–2037, Nov. 2010.
- (2010) IEEE Trans. Audio, Speech, Lang. Process. , vol.18 , Issue.8 , pp. 2028-2037
- Imseng, D.¹ Friedland, G.²

96
- 77949400775
- Robust speaker diarization for short speech recordings
- Dec.
- D. Imseng and G. Friedland, “Robust speaker diarization for short speech recordings,” in Proc. IEEE Workshop Autom. Speech Recognit. Understand., Dec. 2009, pp. 432–437.
- (2009) Proc. IEEE Workshop Autom. Speech Recognit. Understand. , pp. 432-437
- Imseng, D.¹ Friedland, G.²

97
- 85009145345
- Observations on overlap: Findings and implications for automatic processing of multi-party conversations
- E. Shriberg, A. Stolcke, and D. Baron, “Observations on overlap: Findings and implications for automatic processing of multi-party conversations,” in Proc. Eurospeech′01, Aalborg, Denmark, 2001, pp. 1359–1362.
- (2001) Proc. Eurospeech′01, Aalborg, Denmark , pp. 1359-1362
- Shriberg, E.¹ Stolcke, A.² Baron, D.³

98
- 33947640630
- Speaker overlaps and ASR errors in meetings: Effects before, during, and after the overlap
- O. Çetin and E. Shriberg, “Speaker overlaps and ASR errors in meetings: Effects before, during, and after the overlap,” in Proc. ICASSP, Toulouse, France, 2006, pp. 357–360.
- (2006) Proc. ICASSP, Toulouse, France , pp. 357-360
- Çetin, O.¹ Shriberg, E.²

99
- 51449111990
- Overlapped speech detection for improved speaker diarization in multiparty meetings
- K. Boakye, B. Trueba-Hornero, O. Vinyals, and G. Friedland, “Overlapped speech detection for improved speaker diarization in multiparty meetings,” in Proc. ICASSP, 2008, pp. 4353–4356.
- (2008) Proc. ICASSP , pp. 4353-4356
- Boakye, K.¹ Trueba-Hornero, B.² Vinyals, O.³ Friedland, G.⁴

100
- 78649290108
- Handling overlapped speech in speaker diarization
- M. S. thesis, Univ. Politecnica de Catalunya, Barcelona, Spain
- B. Trueba-Hornero, “Handling overlapped speech in speaker diarization,” M. S. thesis, Univ. Politecnica de Catalunya, Barcelona, Spain, 2008.
- (2008)
- Trueba-Hornero, B.¹

101
- 84890528960
- Audio segmentation for meetings speech processing
- Ph. D. dissertation, Univ. of California, Berkeley
- K. Boakye, “Audio segmentation for meetings speech processing,” Ph. D. dissertation, Univ. of California, Berkeley, 2008.
- (2008)
- Boakye, K.¹

102
- 44849101173
- Efficient use of overlap information in speaker diarization
- S. Otterson and M. Ostendorf, “Efficient use of overlap information in speaker diarization,” in Proc. ASRU, Kyoto, Japan, 2007, pp. 686–686.
- (2007) Proc. ASRU, Kyoto, Japan , pp. 686
- Otterson, S.¹ Ostendorf, M.²

103
- 0032136330
- Robust speech recognition using the modulation spectrogram
- B. E. D. Kingsbury, N. Morgan, and S. Greenberg, “Robust speech recognition using the modulation spectrogram,” Speech Commun., vol. 25, no. 1-3, pp. 117–132, 1998.
- (1998) Speech Commun. , vol.25 , Issue.1-3 , pp. 117-132
- Kingsbury, B.E.D.¹ Morgan, N.² Greenberg, S.³

104
- 35248827017
- Speaker localization using audiovisual synchrony: An empirical study
- H. J. Nock, G. Iyengar, and C. Neti, “Speaker localization using audiovisual synchrony: An empirical study,” Lecture Notes in Comput. Sci., vol. 2728, pp. 565–570, 2003.
- (2003) Lecture Notes in Comput. Sci. , vol.2728 , pp. 565-570
- Nock, H.J.¹ Iyengar, G.² Neti, C.³

105
- 34250758546
- Boosting-based multimodal speaker detection for distributed meetings
- C. Zhang, P. Yin, Y. Rui, R. Cutler, and P. Viola, “Boosting-based multimodal speaker detection for distributed meetings,” in Proc. IEEE Int. Workshop Multimedia Signal Process. (MMSP), 2006, pp. 86–91.
- (2006) Proc. IEEE Int. Workshop Multimedia Signal Process. (MMSP) , pp. 86-91
- Zhang, C.¹ Yin, P.² Rui, Y.³ Cutler, R.⁴ Viola, P.⁵

106
- 57649176425
- On-line multi-modal speaker diarization
- A. Noulas and B. J. A. Krose, “On-line multi-modal speaker diarization,” in Proc. 9th Int. Conf. Multimodal Interfaces ICMI ′07, New York, 2007, pp. 350–357.
- (2007) Proc. 9th Int. Conf. Multimodal Interfaces ICMI ′07, New York , pp. 350-357
- Noulas, A.¹ Krose, B.J.A.²

107
- 0031268341
- Factorial hidden Markov models
- Nov.
- Z. Ghahramani and M. I. Jordan, “Factorial hidden Markov models,” Mach. Learn., vol. 29, pp. 245–273, Nov. 1997.
- (1997) Mach. Learn. , vol.29 , pp. 245-273
- Ghahramani, Z.¹ Jordan, M.I.²

108
- 84857749860
- Multimodal speaker diarization
- preprint, to be published
- A. K. Noulas, G. Englebienne, and B. J. A. Krose, “Multimodal speaker diarization,” IEEE Trans. Pattern Anal. Mach. Intell., 2011, preprint, to be published.
- (2011) IEEE Trans. Pattern Anal. Mach. Intell.
- Noulas, A.K.¹ Englebienne, G.² Krose, B.J.A.³

109
- 1542572925
- Multi-modal speech recognition using optical-flow analysis for lip images
- S. Tamura, K. Iwano, and S. Furui, “Multi-modal speech recognition using optical-flow analysis for lip images,” Real World Speech Process., vol. 36, no. 2-3, pp. 117–124, 2004.
- (2004) Real World Speech Process. , vol.36 , Issue.2-3 , pp. 117-124
- Tamura, S.¹ Iwano, K.² Furui, S.³

110
- 0029746565
- Cross-modal prediction in audio-visual communication
- T. Chen and R. Rao, “Cross-modal prediction in audio-visual communication,” in Proc. ICASSP, 1996, vol. 4, pp. 2056–2059.
- (1996) Proc. ICASSP , vol.4 , pp. 2056-2059
- Chen, T.¹ Rao, R.²

111
- 0009622481
- Learning joint statistical models for audio-visual fusion and segregation
- J. W. Fisher, T. Darrell, W. T. Freeman, and P. A. Viola, “Learning joint statistical models for audio-visual fusion and segregation,” in Proc. NIPS, 2000, pp. 772–778.
- (2000) Proc. NIPS , pp. 772-778
- Fisher, J.W.¹ Darrell, T.² Freeman, W.T.³ Viola, P.A.⁴

112
- 2642562769
- Speaker association with signal-level audiovisual fusion
- Jun.
- J. W. Fisher and T. Darrell, “Speaker association with signal-level audiovisual fusion,” IEEE Trans. Multimedia, vol. 6, no. 3, pp. 406–413, Jun. 2004.
- (2004) IEEE Trans. Multimedia , vol.6 , Issue.3 , pp. 406-413
- Fisher, J.W.¹ Darrell, T.²

113
- 41549121431
- Exploiting audio-visual correlation in coding of talking head sequences
- Mar.
- R. Rao and T. Chen, “Exploiting audio-visual correlation in coding of talking head sequences,” in Proc. Int. Picture Coding Symp., Mar. 1996.
- (1996) Proc. Int. Picture Coding Symp.
- Rao, R.¹ Chen, T.²

114
- 34547527871
- Dynamic dependency tests for audio-visual speaker association
- Apr.
- M. Siracusa and J. Fisher, “Dynamic dependency tests for audio-visual speaker association,” in Proc. ICASSP, Apr. 2007, pp. 457–460.
- (2007) Proc. ICASSP , pp. 457-460
- Siracusa, M.¹ Fisher, J.²

115
- 0036299249
- CUAVE: A new audio-visual database for multimodal human-computer interface research
- E. K. Patterson, S. Gurbuz, Z. Tufekci, and J. N. Gowdy, “CUAVE: A new audio-visual database for multimodal human-computer interface research,” in Proc. ICASSP, 2002, pp. 2017–2020.
- (2002) Proc. ICASSP , pp. 2017-2020
- Patterson, E.K.¹ Gurbuz, S.² Tufekci, Z.³ Gowdy, J.N.⁴

116
- 33846320482
- New York: Cambridge Univ. Press
- D. McNeill, Language and Gesture. New York: Cambridge Univ. Press, 2000.
- (2000) Language and Gesture
- McNeill, D.¹

117
- 34047223614
- Audio segmentation and speaker localization in meeting videos
- H. Vajaria, T. Islam, S. Sarkar, R. Sankar, and R. Kasturi, “Audio segmentation and speaker localization in meeting videos,” in Proc. 18th Int. Conf. Pattern Recognit. (ICPR′06), 2006, vol. 2, pp. 1150–1153.
- (2006) Proc. 18th Int. Conf. Pattern Recognit. (ICPR′06) , vol.2 , pp. 1150-1153
- Vajaria, H.¹ Islam, T.² Sarkar, S.³ Sankar, R.⁴ Kasturi, R.⁵

118
- 51849100406
- Associating audiovisual activity cues in a dominance estimation framework
- H. Hung, Y. Huang, C. Yeo, and D. Gatica-Perez, “Associating audiovisual activity cues in a dominance estimation framework,” in Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognition (CVPR) Workshop Human Communicative Behavior, Anchorage, AK, 2008, pp. 1–6.
- (2008) Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognition (CVPR) Workshop Human Communicative Behavior, Anchorage, AK , pp. 1-6
- Hung, H.¹ Huang, Y.² Yeo, C.³ Gatica-Perez, D.⁴

119
- 72449135653
- Working with very sparse data to detect speaker and listener participation in a meetings corpus
- May
- N. Campbell and N. Suzuki, “Working with very sparse data to detect speaker and listener participation in a meetings corpus,” in Proc. Workshop Programme, May 2006, vol. 10.
- (2006) Proc. Workshop Programme , vol.10
- Campbell, N.¹ Suzuki, N.²

120
- 70349214881
- Multimodal speaker diarization of real-world meetings using compressed-domain video features
- Apr.
- G. Friedland, H. Hung, and C. Yeo, “Multimodal speaker diarization of real-world meetings using compressed-domain video features,” in Proc. ICASSP, Apr. 2009, pp. 4069–4072.
- (2009) Proc. ICASSP , pp. 4069-4072
- Friedland, G.¹ Hung, H.² Yeo, C.³

121
- 72449147255
- Visual speaker localization aided by acoustic models
- G. Friedland, C. Yeo, and H. Hung, “Visual speaker localization aided by acoustic models,” in Proc. 17th ACM Int. Conf. Multimedia MM′09, New York, 2009, pp. 195–202.
- (2009) Proc. 17th ACM Int. Conf. Multimedia MM′09, New York , pp. 195-202
- Friedland, G.¹ Yeo, C.² Hung, H.³

122
- 29044442235
- Step-by-step and integrated approaches in broadcast news speaker diarization
- S. Meignier, D. Moraru, C. Fredouille, J.-F. Bonastre, and L. Besacier, “Step-by-step and integrated approaches in broadcast news speaker diarization,” in Proc. CSL, Sel. Papers from Speaker Lang. Recognit. Workshop (Odyssey′04), 2006, pp. 303–330.
- (2006) Proc. CSL, Sel. Papers from Speaker Lang. Recognit. Workshop (Odyssey′04) , pp. 303-330
- Meignier, S.¹ Moraru, D.² Fredouille, C.³ Bonastre, J.-F.⁴ Besacier, L.⁵

123
- 51449095036
- Combination of agglomerative and sequential clustering for speaker diarization
- D. Vijayasenan, F. Valente, and H. Bourlard, “Combination of agglomerative and sequential clustering for speaker diarization,” in Proc. ICASSP, Las Vegas, NV, 2008, pp. 4361–4364.
- (2008) Proc. ICASSP, Las Vegas, NV , pp. 4361-4364
- Vijayasenan, D.¹ Valente, F.² Bourlard, H.³

124
- 79959849996
- Speaker diarization: Combination of the LIUM and IRIT systems
- E. El-Khoury, C. Senac, and S. Meignier, “Speaker diarization: Combination of the LIUM and IRIT systems,” in Internal Report, 2008.
- (2008) Internal Report
- El-Khoury, E.¹ Senac, C.² Meignier, S.³

125
- 36749008026
- Combining Gaussianized/non-Gaussianized features to improve speaker diarization of telephone conversations
- Dec.
- V. Gupta, P. Kenny, P. Ouellet, G. Boulianne, and P. Dumouchel, “Combining Gaussianized/non-Gaussianized features to improve speaker diarization of telephone conversations,” in IEEE Signal Process. Lett., Dec. 2007, vol. 14, no. 12, pp. 1040–1043.
- (2007) IEEE Signal Process. Lett. , vol.14 , Issue.12 , pp. 1040-1043
- Gupta, V.¹ Kenny, P.² Ouellet, P.³ Boulianne, G.⁴ Dumouchel, P.⁵

126
- 0001120413
- A Bayesian analysis of some nonparametric problems
- T. S. Ferguson, “A Bayesian analysis of some nonparametric problems,” Ann. Statist., vol. 1, no. 2, pp. 209–230, 1973.
- (1973) Ann. Statist. , vol.1 , Issue.2 , pp. 209-230
- Ferguson, T.S.¹

127
- 44949158124
- Infinite models for speaker clustering
- iDIAP-RR 06-19
- F. Valente, “Infinite models for speaker clustering,” in Proc. Int. Conf. Spoken Lang. Process., 2006, iDIAP-RR 06-19.
- (2006) Proc. Int. Conf. Spoken Lang. Process.
- Valente, F.¹

128
- 33749249312
- Hierarchical Dirichlet processes
- Y. W. Teh, M. I. Jordan, M. J. Beal, and D. M. Blei, “Hierarchical Dirichlet processes,” J. Amer. Statist. Assoc., vol. 101, no. 476, pp. 1566–1581, 2006.
- (2006) J. Amer. Statist. Assoc. , vol.101 , Issue.476 , pp. 1566-1581
- Teh, Y.W.¹ Jordan, M.I.² Beal, M.J.³ Blei, D.M.⁴

129
- 56449084167
- An HDP-HMM for systems with state persistence
- Jul.
- E. B. Fox, E. B. Sudderth, M. I. Jordan, and A. S. Willsky, “An HDP-HMM for systems with state persistence,” in Proc. ICML, Jul. 2008.
- (2008) Proc. ICML
- Fox, E.B.¹ Sudderth, E.B.² Jordan, M.I.³ Willsky, A.S.⁴

130
- 84869766113
- The blame game: Performance analysis of speaker diarization system components
- Aug.
- M. Huijbregts and C. Wooters, “The blame game: Performance analysis of speaker diarization system components,” in Proc. Interspeech, Aug. 2007, pp. 1857–1860.
- (2007) Proc. Interspeech , pp. 1857-1860
- Huijbregts, M.¹ Wooters, C.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.