SCOPUS Information Search Platform

Multimedia Tools and Applications

Volume 68, Issue 3, 2014, Pages 747-775

Audiovisual diarization of people in video content

(3) El Khoury, Elie a,b Sénac, Christine c Joly, Philippe c

a IDIAP RESEARCH INSTITUTE (Switzerland)

b UNIVERSITÉ DU MAINE (France)

c IRIT (France)

Author keywords

Audiovisual fusion; People diarization; Segmentation; Unsupervised clustering; Video indexing

Indexed keywords

EID: 84895063162 PISSN: 13807501 EISSN: 15737721 Source Type: Journal
DOI: 10.1007/s11042-012-1080-6 Document Type: Article

Times cited : (30)

References (68)

1
- 44949197897
- Robust speaker diarization for meetings: ICSI RT06 evaluation system
- Anguera X, Wooters C, Hernando J (2006) Robust speaker diarization for meetings: ICSI RT06 evaluation system. In: International conference on spoken language processing
- (2006) International Conference on Spoken Language Processing
- Anguera, X.¹ Wooters, C.² Hernando, J.³

2
- 51949100471
- People-tracking-by-detection and people-detection-by-tracking
- Andriluka M, Roth S, Schiele B (2008) People-tracking-by-detection and people-detection-by-tracking. In: IEEE conference on computer vision and pattern recognition
- (2008) IEEE Conference on Computer Vision and Pattern Recognition
- Andriluka, M.¹ Roth, S.² Schiele, B.³

3
- 33745117631
- Automatic face recognition for film character retrieval in feature-length films
- Arandjelovic O, Zisserman A (2005) Automatic face recognition for film character retrieval in feature-length films. In: IEEE conference on computer vision and pattern recognition
- (2005) IEEE Conference on Computer Vision and Pattern Recognition
- Arandjelovic, O.¹ Zisserman, A.²

4
- 0027608896
- Visually controlled graphics
- 10.1109/34.216730
- Azarbayejani A, Starner T, Horowitz B, Pentland A (1993) Visually controlled graphics. IEEE Trans Pattern Anal Mach Intell 15:602-605
- (1993) IEEE Trans Pattern Anal Mach Intell , vol.15 , pp. 602-605
- Azarbayejani, A.¹ Starner, T.² Horowitz, B.³ Pentland, A.⁴

5
- 33845515689
- On the use of sift features for face authentication
- Bicego M, Lagorio A, Grosso E, Tistarelli M (2006) On the use of sift features for face authentication. In: Computer vision and pattern recognition workshop
- (2006) Computer Vision and Pattern Recognition Workshop
- Bicego, M.¹ Lagorio, A.² Grosso, E.³ Tistarelli, M.⁴

6
- 77956606383
- Exploiting speaker segmentations for automatic role detection. An application to broadcast news documents
- Bigot B, Ferrané I, Pinquier J (2010) Exploiting speaker segmentations for automatic role detection. An application to broadcast news documents. In: International workshop on content-based multimedia indexing
- (2010) International Workshop on Content-based Multimedia Indexing
- Bigot, B.¹ Ferrané, I.² Pinquier, J.³

7
- 78049378635
- The LIA-EURECOM RT09 Speaker diarization system: Enhancements in speaker modelling and cluster purification
- Bozonnet S, Evans N, Fredouille C (2010) The LIA-EURECOM RT09 Speaker diarization system: enhancements in speaker modelling and cluster purification. In: IEEE international conference on acoustics, speech, and signal processing
- (2010) IEEE International Conference on Acoustics, Speech, and Signal Processing
- Bozonnet, S.¹ Evans, N.² Fredouille, C.³

8
- 0141519364
- Efficient audio segmentation algorithms based on the bic
- Cettolo M, Vescovi M (2003) Efficient audio segmentation algorithms based on the bic. In: IEEE international conference on acoustics, speech, and signal processing
- (2003) IEEE International Conference on Acoustics, Speech, and Signal Processing
- Cettolo, M.¹ Vescovi, M.²

9
- 84905180243
- Columbia University/VIREO-CityU/IRIT TRECVID2008 high-level feature extraction and interactive video search
- Chang SF, He J, Jiang YG, El Khoury E, Ngo CW, Yanagawa A, Zavesky E (2008) Columbia University/VIREO-CityU/IRIT TRECVID2008 high-level feature extraction and interactive video search. In: TREC video retrieval workshop, NIST
- (2008) TREC Video Retrieval Workshop, NIST
- Chang, S.F.¹ He, J.² Jiang, Y.G.³ El Khoury, E.⁴ Ngo, C.W.⁵ Yanagawa, A.⁶ Zavesky, E.⁷

10
- 84895064713
- Audio-visual speaker recognition using time-varying stream
- Chaudhari UV, Ramaswamy GN, Potamianos G, Neti C (2003) Audio-visual speaker recognition using time-varying stream. In: IEEE international conference on acoustics, speech and signal processing
- (2003) IEEE International Conference on Acoustics, Speech and Signal Processing
- Chaudhari, U.V.¹ Ramaswamy, G.N.² Potamianos, G.³ Neti, C.⁴

11
- 26844468363
- Information fusion and decision cascading for audio-visual speaker recognition based on time-varying stream reliability prediction
- Chaudhari UV, Ramaswamy GN, Potamianos G, Neti C (2003) Information fusion and decision cascading for audio-visual speaker recognition based on time-varying stream reliability prediction. In: IEEE international conference on multimedia and expo
- (2003) IEEE International Conference on Multimedia and Expo
- Chaudhari, U.V.¹ Ramaswamy, G.N.² Potamianos, G.³ Neti, C.⁴

12
- 84875953283
- Clustering via the bayesian information criterion with applications in speech recognition
- Chen SS, Gopalakrishnan PS (1998) Clustering via the bayesian information criterion with applications in speech recognition. In: IEEE international conference on acoustics, speech and signal processing
- (1998) IEEE International Conference on Acoustics, Speech and Signal Processing
- Chen, S.S.¹ Gopalakrishnan, P.S.²

13
- 72449181349
- Visual language model for face clustering in consumer photos
- Chu WT, Lee YL, Yu JY (2009) Visual language model for face clustering in consumer photos. In: ACM international conference on multimedia
- (2009) ACM International Conference on Multimedia
- Chu, W.T.¹ Lee, Y.L.² Yu, J.Y.³

14
- 84856657342
- Unsupervised metric learning for face identification in TV video
- Cinbis G, Verbeek J, Schmid C (2011) Unsupervised metric learning for face identification in TV video. In: IEEE international conference on computer vision
- (2011) IEEE International Conference on Computer Vision
- Cinbis, G.¹ Verbeek, J.² Schmid, C.³

15
- 33751265201
- Face detection and clustering for video indexing applications
- Czirjek C, Marlow S, Murphy N (2003) Face detection and clustering for video indexing applications. In: Advanced concepts for intelligent vision systems
- (2003) Advanced Concepts for Intelligent Vision Systems
- Czirjek, C.¹ Marlow, S.² Murphy, N.³

16
- 78650915265
- Unsupervised detection of multimodal clusters in edited recordings
- Dielmann A (2010) Unsupervised detection of multimodal clusters in edited recordings. In: IEEE international workshop on Multimedia Signal Processing (MMSP)
- (2010) IEEE International Workshop on Multimedia Signal Processing (MMSP)
- Dielmann, A.¹

17
- 79955869423
- Appearance-based person re-identification in camera networks: Problem overview and current approaches
- 10.1007/s12652-010-0034-y
- Doretto G, Sebastian T, Tu P, Rittscher J (2011) Appearance-based person re-identification in camera networks: Problem overview and current approaches. Journal of Ambient Intelligence and Humanized Computing 2(2):127-151
- (2011) Journal of Ambient Intelligence and Humanized Computing , vol.2 , Issue.2 , pp. 127-151
- Doretto, G.¹ Sebastian, T.² Tu, P.³ Rittscher, J.⁴

18
- 84898027861
- Hello! My name is buffy - Automatic naming of characters in TV video
- Everingham M, Sivic J, Zisserman A (2006) Hello! my name is buffy - automatic naming of characters in TV video. In: British Machine Vision Conference, BMVC06
- (2006) British Machine Vision Conference, BMVC06
- Everingham, M.¹ Sivic, J.² Zisserman, A.³

19
- 62949172236
- Taking the bite out of automated naming of characters in TV video
- 10.1016/j.imavis.2008.04.018
- Everingham M, Sivic J, Zisserman A (2009) Taking the bite out of automated naming of characters in TV video. Image Vision Comput 27(5):545-559
- (2009) Image Vision Comput , vol.27 , Issue.5 , pp. 545-559
- Everingham, M.¹ Sivic, J.² Zisserman, A.³

20
- 0041374436
- On affine invariant clustering and automatic cast listing in movies
- Fitzgibbon AW, Zisserman A (2002) On affine invariant clustering and automatic cast listing in movies. In: ECCV '02: European Conference on Computer Vision
- (2002) ECCV '02: European Conference on Computer Vision
- Fitzgibbon, A.W.¹ Zisserman, A.²

21
- 79956279915
- The LIA-EURECOM RT09 speaker diarization system
- Fredouille C, Bozonnet S, Evans N (2009) The LIA-EURECOM RT09 speaker diarization system. In: NIST Rich transcription workshop
- (2009) NIST Rich Transcription Workshop
- Fredouille, C.¹ Bozonnet, S.² Evans, N.³

22
- 70349214881
- Multi-modal speaker diarization of real-world meetings using compressed-domain video features
- Friedland G, Hung H, Yeo C (2009) Multi-modal speaker diarization of real-world meetings using compressed-domain video features. In: IEEE international conference on acoustics, speech and signal processing
- (2009) IEEE International Conference on Acoustics, Speech and Signal Processing
- Friedland, G.¹ Hung, H.² Yeo, C.³

23
- 78649623318
- Dialocalisation: Acoustic speaker diarization and visual localization as joint optimization problem
- Friedland G, Yeo C, Hung H (2010) Dialocalisation: acoustic speaker diarization and visual localization as joint optimization problem. ACM Trans Multimedia Comput Commun Appl, TOMCCAP 6(4):27
- (2010) ACM Trans Multimedia Comput Commun Appl, TOMCCAP , vol.6 , Issue.4 , pp. 27
- Friedland, G.¹ Yeo, C.² Hung, H.³

24
- 33745224977
- The ESTER phase II evaluation campaign for the rich transcription of the French broadcast news
- Galliano S, Geoffrois E, Mostefa D, Bonastre JF, Gravier G (2005) The ESTER phase II evaluation campaign for the rich transcription of the French broadcast news. In: European conference on speech communication and technology
- (2005) European Conference on Speech Communication and Technology
- Galliano, S.¹ Geoffrois, E.² Mostefa, D.³ Bonastre, J.F.⁴ Gravier, G.⁵

25
- 70450180496
- The ester 2 evaluation campaign for the rich transcription of French radio broadcasts
- Galliano S, Gravier G, Chaubard L (2009) The ester 2 evaluation campaign for the rich transcription of French radio broadcasts. In: INTERSPEECH
- (2009) Interspeech
- Galliano, S.¹ Gravier, G.² Chaubard, L.³

26
- 0026400244
- Segregation of speakers for speech recognition and speaker identification
- Gish H, Siu MH, Rohlicek R (1991) Segregation of speakers for speech recognition and speaker identification. In: International conference on acoustics, speech, and signal processing
- (1991) International Conference on Acoustics, Speech, and Signal Processing
- Gish, H.¹ Siu, M.H.² Rohlicek, R.³

27
- 77953178820
- Is that you? Metric learning approaches for face identification
- Guillaumin M, Verbeek J, Schmid C (2009) Is that you? Metric learning approaches for face identification. ICCV
- (2009) ICCV
- Guillaumin, M.¹ Verbeek, J.² Schmid, C.³

28
- 84862136695
- Tracking and retexturing cloth for real-time virtual clothing applications
- Hilsmann A, Eisert P (2009) Tracking and retexturing cloth for real-time virtual clothing applications. In: International conference on computer vision/computer graphics collaboration techniques
- (2009) International Conference on Computer Vision/computer Graphics Collaboration Techniques
- Hilsmann, A.¹ Eisert, P.²

29
- 84897697234
- Towards audio-visual on-line diarization of participants in group meetings
- Hung H, Friedland G (2008) Towards audio-visual on-line diarization of participants in group meetings. In: Workshop on multi-camera and multi-modal sensor fusion
- (2008) Workshop on Multi-camera and Multi-modal Sensor Fusion
- Hung, H.¹ Friedland, G.²

30
- 0034866703
- Human tracking with mixtures of trees
- Ioffe S, Forsyth DA (2001) Human tracking with mixtures of trees. ICCV01
- (2001) ICCV01
- Ioffe, S.¹ Forsyth, D.A.²

31
- 33646149888
- Costume: A new feature for automatic video content indexing
- Jaffré G, Joly P (2004) Costume: a new feature for automatic video content indexing. RIAO
- (2004) RIAO
- Jaffré, G.¹ Joly, P.²

32
- 34547516256
- Speaker Diarization: Towards a more robust and portable system
- El Khoury E, Senac C, André-Obrecht R (2007) Speaker Diarization: Towards a more robust and portable system. In: IEEE international conference on acoustics, speech, and signal processing
- (2007) IEEE International Conference on Acoustics, Speech, and Signal Processing
- El Khoury, E.¹ Senac, C.² André-Obrecht, R.³

33
- 70349197676
- Improved speaker diarization system for meetings
- El-Khoury E, Senac C, Pinquier J (2009) Improved speaker diarization system for meetings. In: IEEE international conference on acoustics, speech, and signal processing
- (2009) IEEE International Conference on Acoustics, Speech, and Signal Processing
- El-Khoury, E.¹ Senac, C.² Pinquier, J.³

34
- 77955290453
- Unsupervised segmentation methods of TV contents
- 10.1155/2010/539796
- El Khoury E, Senac C, Joly P (2010) Unsupervised segmentation methods of TV contents. Int J Digital Multimedia Broadcast. doi: 10.1155/2010/539796
- (2010) Int J Digital Multimedia Broadcast
- El-Khoury, E.¹ Senac, C.² Joly, P.³

35
- 77952326998
- Face-and-clothing based people clustering in video content
- El Khoury E, Senac C, Joly P (2010) Face-and-clothing based people clustering in video content. In: ACM International conference on multimedia information retrieval
- (2010) ACM International Conference on Multimedia Information Retrieval
- El Khoury, E.¹ Senac, C.² Joly, P.³

36
- 84895060455
- Progress in the AMIDA speaker diarization system for meeting data
- Leeuwen DAV, Konecný M (2008) Progress in the AMIDA speaker diarization system for meeting data. In: Multimodal technologies for perception of humans: international evaluation workshops CLEAR 2007 and RT 2007
- (2008) Multimodal Technologies for Perception of Humans: International Evaluation Workshops CLEAR 2007 and RT 2007
- Leeuwen, D.A.V.¹ Konecný, M.²

37
- 17444395970
- Tracking multiple people with recovery from partial and total occlusion
- 10.1016/j.patcog.2004.11.022
- Lerdsudwichai C, Abdel-Mottaleb M, Ansari AN (2005) Tracking multiple people with recovery from partial and total occlusion. Pattern Recogn 38(7):1059-1070
- (2005) Pattern Recogn , vol.38 , Issue.7 , pp. 1059-1070
- Lerdsudwichai, C.¹ Abdel-Mottaleb, M.² Ansari, A.N.³

38
- 46449107363
- A fast, comprehensive shot boundary determination system
- Liu Z, Gibbon D, Zavesky E, Shahraray B, Haffner P (2007) A fast, comprehensive shot boundary determination system. In: IEEE international conference on multimedia and expo
- (2007) IEEE International Conference on Multimedia and Expo
- Liu, Z.¹ Gibbon, D.² Zavesky, E.³ Shahraray, B.⁴ Haffner, P.⁵

39
- 0034841928
- Major cast detection in video using both audio and visual information
- Liu Z, Wang Y (2001) Major cast detection in video using both audio and visual information. In: IEEE international conference on acoustics, speech, and signal processing
- (2001) IEEE International Conference on Acoustics, Speech, and Signal Processing
- Liu, Z.¹ Wang, Y.²

40
- 33846216333
- Major cast detection in video using both speaker and face information
- 10.1109/TMM.2006.886360
- Liu Z, Wang Y (2007) Major cast detection in video using both speaker and face information. IEEE Transactions on Multimedia 9(1):89-101
- (2007) IEEE Transactions on Multimedia , vol.9 , Issue.1 , pp. 89-101
- Liu, Z.¹ Wang, Y.²

41
- 3042535216
- Distinctive image features from scale-invariant keypoints
- 10.1023/B:VISI.0000029664.99615.94
- Lowe DG (2004) Distinctive image features from scale-invariant keypoints. Int J Comput Vision 60(2):91-110
- (2004) Int J Comput Vision , vol.60 , Issue.2 , pp. 91-110
- Lowe, D.G.¹

42
- 0030213052
- Texture features for browsing and retrieval of image data
- 10.1109/34.531803
- Manjunath BS, Ma WY (1996) Texture features for browsing and retrieval of image data. IEEE Trans Pattern Anal Mach Intell 18(8):837-842
- (1996) IEEE Trans Pattern Anal Mach Intell , vol.18 , Issue.8 , pp. 837-842
- Manjunath, B.S.¹ Ma, W.Y.²

43
- 79959827767
- The IIR-NTU speaker diarization systems for RT 2009
- Nguyen TH, Sun H, Zhao S, Khine SZ, Tran HD, Ma TL, Ma B, Chng ES, Li H (2009) The IIR-NTU speaker diarization systems for RT 2009. In: NIST rich transcription workshop
- (2009) NIST Rich Transcription Workshop
- Nguyen, T.H.¹ Sun, H.² Zhao, S.³ Khine, S.Z.⁴ Tran, H.D.⁵ Ma, T.L.⁶ Ma, B.⁷ Chng, E.S.⁸ Li, H.⁹

44
- 20444478554
- Speaker localisation using audio-visual synchrony: An empirical study
- Nock HJ, Iyengar G, Neti C (2003) Speaker localisation using audio-visual synchrony: an empirical study. In: CIVR: ACM international conference on image and video retrieval
- (2003) CIVR: ACM International Conference on Image and Video Retrieval
- Nock, H.J.¹ Iyengar, G.² Neti, C.³

45
- 52049099192
- Automatic classification video for person indexing
- IEEE Computer Society Washington, DC, USA 10.1109/CISP.2008.405 978-0-7695-3119-9
- Peng J, Lin QX (2008) Automatic classification video for person indexing. In: Proceedings of the 2008 congress on image and signal processing, CISP '08, vol 2. IEEE Computer Society, Washington, DC, USA, pp 475-479. ISBN 978-0-7695-3119-9
- (2008) Proceedings of the 2008 Congress on Image and Signal Processing, CISP '08, Vol 2 , pp. 475-479
- Peng, J.¹ Lin, Q.X.²

46
- 77954098766
- Intervenant classification in an audiovisual document
- Philippeau J, Pinquier J, Joly P (2006) Intervenant classification in an audiovisual document. In: International conference on signal processing and multimedia applications
- (2006) International Conference on Signal Processing and Multimedia Applications
- Philippeau, J.¹ Pinquier, J.² Joly, P.³

47
- 0141590391
- A fusion study in speech/music classification
- Pinquier J, Rouas JL, André-Obrecht R (2003) A fusion study in speech/music classification. In: IEEE international conference on acoustics, speech and signal processing
- (2003) IEEE International Conference on Acoustics, Speech and Signal Processing
- Pinquier, J.¹ Rouas, J.L.² André-Obrecht, R.³

48
- 0006354851
- Karl Pearson and the chi-squared test
- 10.2307/1402731 0501.62001 703306
- Plackett RL (1983) Karl Pearson and the chi-squared test. Int Stat Rev 51(1):59-72
- (1983) Int Stat Rev , vol.51 , Issue.1 , pp. 59-72
- Plackett, R.L.¹

49
- 70450144960
- Voice activity detection
- Grimm M, Kroschel K (eds)
- Ramirez J, Górriz JM, Segura JC (2007) Voice activity detection. In: Grimm M, Kroschel K (eds) Fundamentals and speech recognition system robustness. Robust Speech Recognition and Understanding
- (2007) Fundamentals and Speech Recognition System Robustness. Robust Speech Recognition and Understanding
- Ramirez, J.¹ Górriz, J.M.² Segura, J.C.³

50
- 84867505436
- Tracking clothed people
- Rosenhahn B, Kersting U, Powell K, Brox T, Seidel HP (2007) Tracking clothed people. In: Human motion - understanding, modeling, capture, and animation. Springer
- (2007) Human Motion - Understanding, Modeling, Capture, and Animation. Springer
- Rosenhahn, B.¹ Kersting, U.² Powell, K.³ Brox, T.⁴ Seidel, H.P.⁵

51
- 0030648077
- Construction and evaluation of a robust multifeature speech/music discriminator
- Scheirer E, Slaney M (1997) Construction and evaluation of a robust multifeature speech/music discriminator. In: IEEE international conference on acoustics, speech, and signal processing
- (1997) IEEE International Conference on Acoustics, Speech, and Signal Processing
- Scheirer, E.¹ Slaney, M.²

52
- 77956739906
- Online Diarization of Streaming Audio-Visual Data for Smart Environments
- 10.1109/JSTSP.2010.2050519
- Schmalenstroeer J, Haeb-Umbach R (2010) Online Diarization of Streaming Audio-Visual Data for Smart Environments. J Sel Topics Signal Processing 4(5):845-856
- (2010) J Sel Topics Signal Processing , vol.4 , Issue.5 , pp. 845-856
- Schmalenstroeer, J.¹ Haeb-Umbach, R.²

53
- 0002782496
- Automatic segmentation, classification and clustering of broadcast news audio
- Siegler MA, Jain U, Raj B, Stern RM (1997) Automatic segmentation, classification and clustering of broadcast news audio. In: DARPA Speech Recognition Workshop
- (1997) DARPA Speech Recognition Workshop
- Siegler, M.A.¹ Jain, U.² Raj, B.³ Stern, R.M.⁴

54
- 85009153345
- On the use of the bayesian information criterion in multiple speaker detection
- Sivakumaran P, Fortuna J, Ariyaeeinia AM (2001) On the use of the bayesian information criterion in multiple speaker detection. In: The 7th European conference on speech communication and technology (Eurospeech'01)
- (2001) The 7th European Conference on Speech Communication and Technology (Eurospeech'01)
- Sivakumaran, P.¹ Fortuna, J.² Ariyaeeinia, A.M.³

55
- 77249161746
- Video shot boundary detection: Seven years of trecvid activity
- 10.1016/j.cviu.2009.03.011
- Smeaton AF, Over P, Doherty AR (2010) Video shot boundary detection: seven years of trecvid activity. Comput Vis Image Und 114(4):411-418
- (2010) Comput Vis Image Und , vol.114 , Issue.4 , pp. 411-418
- Smeaton, A.F.¹ Over, P.² Doherty, A.R.³

56
- 47749150338
- Multimodal technologies for perception of humans: International evaluation workshops CLEAR 2007 and RT 2007
- Springer
- Stiefelhagen R, Bowers R, Fiscus J (2008) Multimodal technologies for perception of humans: international evaluation workshops CLEAR 2007 and RT 2007. ser. Lecture Notes in Computer Science. Springer
- (2008) Ser. Lecture Notes in Computer Science
- Stiefelhagen, R.¹ Bowers, R.² Fiscus, J.³

57
- 51849166611
- Pose robust face tracking by combining active appearance models and cylinder head models
- 10.1007/s11263-007-0125-1
- Sung JW, Kanade T, Kim DJ (2008) Pose robust face tracking by combining active appearance models and cylinder head models. Int J Comput Vis 80(2):260-274
- (2008) Int J Comput Vis , vol.80 , Issue.2 , pp. 260-274
- Sung, J.W.¹ Kanade, T.² Kim, D.J.³

58
- 1542572925
- Multi-modal speech recognition using optical-flow analysis for lip images
- Tamura S, Iwano K, Furui S (2004) Multi-modal speech recognition using optical-flow analysis for lip images. J VLSI Signal Process Syst 36(2/3):117-124
- (2004) J VLSI Signal Process Syst , vol.36 , Issue.2/3 , pp. 117-124
- Tamura, S.¹ Iwano, K.² Furui, S.³

59
- 0027609968
- Analysis and synthesis of facial image sequences using physical and anatomical models
- 10.1109/34.216726
- Terzopoulos D, Waters K (1993) Analysis and synthesis of facial image sequences using physical and anatomical models. IEEE Trans Pattern Anal Mach Intell 15:569-579
- (1993) IEEE Trans Pattern Anal Mach Intell , vol.15 , pp. 569-579
- Terzopoulos, D.¹ Waters, K.²

60
- 0034442267
- New enhancements to cut, fade, and dissolve detection processes in video segmentation
- Truong BT, Dorai C, Venkatesh S (2000) New enhancements to cut, fade, and dissolve detection processes in video segmentation. In: ACM international conference on Multimedia
- (2000) ACM International Conference on Multimedia
- Truong, B.T.¹ Dorai, C.² Venkatesh, S.³

61
- 33646796027
- Clustering speech utterances by speaker using eigenvoice-motivated vector space model
- Tsai WH, Cheng SS, Chao YH, Wang HM (2005) Clustering speech utterances by speaker using eigenvoice-motivated vector space model. In: IEEE international conference on acoustics, speech, and signal processing
- (2005) IEEE International Conference on Acoustics, Speech, and Signal Processing
- Tsai, W.H.¹ Cheng, S.S.² Chao, Y.H.³ Wang, H.M.⁴

62
- 34047223614
- Audio segmentation and speaker localization in meeting videos
- Vajaria H, Islam T, Sarkar S, Sankar R, Kasturi R (2006) Audio segmentation and speaker localization in meeting videos. In: ICPR'06: international conference on pattern recognition
- (2006) ICPR'06: International Conference on Pattern Recognition
- Vajaria, H.¹ Islam, T.² Sarkar, S.³ Sankar, R.⁴ Kasturi, R.⁵

63
- 10444227648
- A survey on pixel-based skin color detection techniques
- Vezhnevets V, Sazonov V, Andreeva A (2003) A survey on pixel-based skin color detection techniques. In: Proc. Graphicon
- (2003) Proc. Graphicon
- Vezhnevets, V.¹ Sazonov, V.² Andreeva, A.³

64
- 0344983340
- Detecting pedestrians using patterns of motion and appearance
- Viola P, Jones MJ, Snow D (2003) Detecting pedestrians using patterns of motion and appearance. In: ICCV '03: IEEE international conference on computer vision
- (2003) ICCV '03: IEEE International Conference on Computer Vision
- Viola, P.¹ Jones, M.J.² Snow, D.³

65
- 2142812371
- Robust real-time face detection
- 10.1023/B:VISI.0000013087.49260.fb
- Viola P, Jones MJ (2004) Robust real-time face detection. Int J Comput Vis 57(2):137-154
- (2004) Int J Comput Vis , vol.57 , Issue.2 , pp. 137-154
- Viola, P.¹ Jones, M.J.²

66
- 84895062382
- Face detection
- Yang MH (2009) Face detection. In: Encyclopedia of biometrics. Springer
- (2009) Encyclopedia of Biometrics. Springer
- Yang, M.H.¹

67
- 22544475615
- Efficient audio stream segmentation via the combined T2 statistic and the bayesian information criterion
- 10.1109/TSA.2005.845790
- Zhou B, Hansen JHL (2005) Efficient audio stream segmentation via the combined T2 statistic and the bayesian information criterion. IEEE Trans Speech Audio Processing 13(4):467-474
- (2005) IEEE Trans Speech Audio Processing , vol.13 , Issue.4 , pp. 467-474
- Zhou, B.¹ Hansen, J.H.L.²

68
- 84895058370
- Multi-stage speaker diarization for conference and lecture meetings
- Zhu X, Barras C, Lamel L, Gauvain JL (2008) Multi-stage speaker diarization for conference and lecture meetings. In: Multimodal technologies for perception of humans. Springer
- (2008) Multimodal Technologies for Perception of Humans. Springer
- Zhu, X.¹ Barras, C.² Lamel, L.³ Gauvain, J.L.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.