SCOPUS Information Retrieval Platform

MM'09 - Proceedings of the 2009 ACM Multimedia Conference, with Co-located Workshops and Symposiums

2009, Pages 5-14

Short-term audio-visual atoms for generic video concept classification

Authors (5): Jiang, Wei (a); Cotton, Courtenay (a); Chang, Shih Fu (a); Ellis, Dan (a); Loui, Alexander (b)

a Columbia University ^* (United States)

b EASTMAN KODAK COMPANY (United States)

Author keywords

Audio visual codebook; Joint audio visual analysis; Semantic concept detection; Short term Audio Visual Atom

Indexed keywords

AUDIO-VISUAL; AUDIOVISUAL ANALYSIS; CODEBOOKS; SEMANTIC CONCEPT DETECTION;

ATOMS; METHOD OF MOMENTS; MULTIMEDIA SYSTEMS; TECHNICAL PRESENTATIONS;

SEMANTICS;

EID: 72549099611 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1145/1631272.1631277 Document Type: Conference Paper

Times cited : (49)

References (38)

1
- 72549102414
- Biologically motivated audio-visual cue integration for object categorization
- J. Anemueller et al. Biologically motivated audio-visual cue integration for object categorization. In CogSys, 2008.
- (2008) CogSys
- Anemueller, J.¹ et al.

2
- 34948829598
- Harmony in motion
- Z. Barzelay and Y. Schechner. Harmony in motion. In Proc. CVPR, pages 1-8, 2007.
- (2007) Proc. CVPR , pp. 1-8
- Barzelay, Z.¹ Schechner, Y.²

3
- 0042349407
- A graphical model for audiovisual object tracking
- M.J. Beal et al. A graphical model for audiovisual object tracking. IEEE Trans. PAMI, 25(7):828-836, 2003.
- (2003) IEEE Trans. PAMI , vol.25 , Issue.7 , pp. 828-836
- Beal, M.J.¹ et al.

4
- 0004331630
- S. Birchfield. KLT: An Implementation of the Kanade-Lucas-Tomasi Feature Tracker. http://vision.stanford.edu/~birch
- KLT: An Implementation of the Kanade-Lucas-Tomasi Feature Tracker
- Birchfield, S.¹

5
- 84905193419
- Columbia University TRECVID-2005 video search and high-level feature extraction
- Gaithersburg, MD
- S.F. Chang et al. Columbia University TRECVID-2005 video search and high-level feature extraction. In NIST TRECVID workshop, Gaithersburg, MD, 2005.
- (2005) NIST TRECVID workshop
- Chang, S.F.¹ et al.

6
- 72549095204
- Large-scale multimodal semantic concept detection for consumer video
- S.F. Chang et al. Large-scale multimodal semantic concept detection for consumer video. In ACM MIR, 2007.
- (2007) ACM MIR
- Chang, S.F.¹ et al.

7
- 84863161940
- Image categorization by learning and reasoning with regions
- Y.X. Chen et al. Image categorization by learning and reasoning with regions. In JMLR, 5:913-939, 2004.
- (2004) JMLR , vol.5 , pp. 913-939
- Chen, Y.X.¹ et al.

8
- 33846623313
- Audio-visual event recognition in surveillance video sequences
- M. Cristani et al. Audio-visual event recognition in surveillance video sequences. In IEEE Trans. Multimedia, 9(2):257-267, 2007.
- (2007) IEEE Trans. Multimedia , vol.9 , Issue.2 , pp. 257-267
- Cristani, M.¹ et al.

9
- 51449105193
- Environmental sound recognition using MP-based features
- S. Chu et al. Environmental sound recognition using MP-based features. In Proc. ICASSP, pages 1-4, 2008.
- (2008) Proc. ICASSP , pp. 1-4
- Chu, S.¹ et al.

10
- 33645146449
- Histograms of oriented gradients for human detection
- N. Dalal and B. Triggs. Histograms of oriented gradients for human detection. In Proc. CVPR, pages 886-893, 2005.
- (2005) Proc. CVPR , pp. 886-893
- Dalal, N.¹ Triggs, B.²

11
- 2342460956
- Video retrieval using spatial-temporal descriptors
- D. Dementhon and D. Doermann. Video retrieval using spatial-temporal descriptors. In ACM Multimedia, 2003.
- (2003) ACM Multimedia
- Dementhon, D.¹ Doermann, D.²

12
- 0035423154
- Unsupervised segmentation of color-texture regions in images and video
- Y. Deng and B.S. Manjunath. Unsupervised segmentation of color-texture regions in images and video. In IEEE Trans. PAMI, 23(8):800-810, 2001.
- (2001) IEEE Trans. PAMI , vol.23 , Issue.8 , pp. 800-810
- Deng, Y.¹ Manjunath, B.S.²

13
- 0034164230
- Additive logistic regression: A statistical view of boosting
- J. Friedman et al. Additive logistic regression: a statistical view of boosting. Ann. of Sta., 28(2):337-407, 2000.
- (2000) Ann. of Sta , vol.28 , Issue.2 , pp. 337-407
- Friedman, J.¹ et al.

14
- 33745855044
- The pyramid match kernel: Discriminative classification with sets of image features
- K. Grauman and T. Darrell. The pyramid match kernel: Discriminative classification with sets of image features. In Proc. ICCV, 2:1458-1465, 2005.
- (2005) Proc. ICCV , vol.2 , pp. 1458-1465
- Grauman, K.¹ Darrell, T.²

15
- 5044226887
- Incremental density approximation and kernel-based bayesian filtering for object tracking
- B. Han et al. Incremental density approximation and kernel-based bayesian filtering for object tracking. In Proc. CVPR, pages 638-644, 2004.
- (2004) Proc. CVPR , pp. 638-644
- Han, B.¹ et al.

16
- 0009622482
- Audio-vision: Using audio-visual synchrony to locate sounds
- J. Hershey and J. Movellan. Audio-vision: Using audio-visual synchrony to locate sounds. In NIPS, 1999.
- (1999) NIPS
- Hershey, J.¹ Movellan, J.²

17
- 34247257857
- Audio-visual speech recognition using lip information extracted from side-face images
- K. Iwano et al. Audio-visual speech recognition using lip information extracted from side-face images. In EURASIP JASMP, 2007(1):4-4, 2007.
- (2007) EURASIP JASMP , vol.2007 , Issue.1 , pp. 4-4
- Iwano, K.¹ et al.

18
- 0142134976
- Robust online appearance models for visual tracking
- A. Jepson et al. Robust online appearance models for visual tracking. IEEE Trans. PAMI, 25(10):1296-1311, 2003.
- (2003) IEEE Trans. PAMI , vol.25 , Issue.10 , pp. 1296-1311
- Jepson, A.¹ et al.

19
- 0001008498
- Real-time lip tracking for audio-visual speech recognition applications
- R. Kaucic, B. Dalton, and A. Blake. Real-time lip tracking for audio-visual speech recognition applications. In Proc. ECCV, vol.2, pages 376-387, 1996.
- (1996) Proc. ECCV , vol.2 , pp. 376-387
- Kaucic, R.¹ Dalton, B.² Blake, A.³

20
- 48949111673
- R. Gribonval and S. Krstulovic. MPTK, the matching pursuit toolkit. http://mptk.irisa.fr/
- MPTK, the matching pursuit toolkit
- Gribonval, R.¹ Krstulovic, S.²

21
- 37849015208
- Kodak's consumer video benchmark data set: Concept definition and annotation
- A. Loui et al. Kodak's consumer video benchmark data set: concept definition and annotation. In ACM SIGMM Int'l Workshop on MIR, pages 245-254, 2007.
- (2007) ACM SIGMM Int'l Workshop on MIR , pp. 245-254
- Loui, A.¹ et al.

22
- 3042535216
- Distinctive image features from scale-invariant keypoints
- D. Lowe. Distinctive image features from scale-invariant keypoints. In IJCV, 60(2):91-110, 2004.
- (2004) IJCV , vol.60 , Issue.2 , pp. 91-110
- Lowe, D.¹

23
- 0002836012
- An iterative image registration technique with an application to stereo vision
- B.D. Lucas and T. Kanade. An iterative image registration technique with an application to stereo vision. In Proc. Image Understanding Workshop, pages 121-130, 1981.
- (1981) Proc. Image Understanding Workshop , pp. 121-130
- Lucas, B.D.¹ Kanade, T.²

24
- 0027842081
- Matching pursuits with time-frequency dictionaries
- S. Mallat and Z. Zhang. Matching pursuits with time-frequency dictionaries. In IEEE Trans. Signal Processing, 41(12):3397-3415, 1993.
- (1993) IEEE Trans. Signal Processing , vol.41 , Issue.12 , pp. 3397-3415
- Mallat, S.¹ Zhang, Z.²

25
- 84898935332
- A framework for multiple-instance learning
- O. Maron et al. A framework for multiple-instance learning. In NIPS, 1998.
- (1998) NIPS
- Maron, O.¹ et al.

26
- 57149147931
- Extracting moving people from internet videos
- J.C. Niebles et al. Extracting moving people from internet videos. In Proc. ECCV, pages 527-540, 2008.
- (2008) Proc. ECCV , pp. 527-540
- Niebles, J.C.¹ et al.

27
- 84870466817
- NIST. TREC Video Retrieval Evaluation (TRECVID). 2001 - 2008. http://www-nlpir.nist.gov/projects/trecvid/
- (2001) TREC Video Retrieval Evaluation (TRECVID)

28
- 34547532522
- Fingerprinting to identify repeated sound events in long-duration personal audio recordings
- J. Ogle and D. Ellis. Fingerprinting to identify repeated sound events in long-duration personal audio recordings. In Proc. ICASSP, pages I-233-236, 2007.
- (2007) Proc. ICASSP
- Ogle, J.¹ Ellis, D.²

29
- 72549090392
- F. Petitcolas. MPEG for MATLAB. http://www.petitcolas.net/fabien/software/mpeg
- MPEG for MATLAB
- Petitcolas, F.¹

30
- 0028112849
- Good features to track
- J. Shi and C. Tomasi. Good features to track. In Proc. CVPR, pages 593-600, 1994.
- (1994) Proc. CVPR , pp. 593-600
- Shi, J.¹ Tomasi, C.²

31
- 0034244889
- Learning patterns of activity using real-time tracking
- C. Stauffer and W.E.L. Grimson. Learning patterns of activity using real-time tracking. In IEEE Trans. PAMI, 22(8):747-757, 2000.
- (2000) IEEE Trans. PAMI , vol.22 , Issue.8 , pp. 747-757
- Stauffer, C.¹ Grimson, W.E.L.²

32
- 72549084034
- Boosting image retrieval
- K. Tieu and P. Viola. Boosting image retrieval. In IJCV, 56(1-2):228-235, 2000.
- (2000) IJCV , vol.56 , Issue.1-2 , pp. 228-235
- Tieu, K.¹ Viola, P.²

33
- 0003991806
- Wiley-Interscience, New York
- V. Vapnik. Statistical learning theory. Wiley-Interscience, New York, 1998.
- (1998) Statistical learning theory
- Vapnik, V.¹

34
- 33745844069
- Learning Semantic Scene Models by Trajectory Analysis
- X.G. Wang et al. Learning Semantic Scene Models by Trajectory Analysis. In Proc. ECCV, pages 110-123, 2006.
- (2006) Proc. ECCV , pp. 110-123
- Wang, X.G.¹ et al.

35
- 20444437959
- Multimodal information fusion for video concept detection
- Y. Wu et al. Multimodal information fusion for video concept detection. In Proc. ICIP, pages 2391-2394, 2004.
- (2004) Proc. ICIP , pp. 2391-2394
- Wu, Y.¹ et al.

36
- 33845562302
- Region-based image annotation using asymmetrical support vector machine-based multiple-instance learning
- C. Yang et al. Region-based image annotation using asymmetrical support vector machine-based multiple-instance learning. In Proc. CVPR, pages 2057-2063, 2006.
- (2006) Proc. CVPR , pp. 2057-2063
- Yang, C.¹ et al.

37
- 37849051635
- Large head movement tracking using SIFT-based registration
- G.Q. Zhao et al. Large head movement tracking using SIFT-based registration. In ACM Multimedia, 2007.
- (2007) ACM Multimedia
- Zhao, G.Q.¹ et al.

38
- 59349094120
- Object tracking using sift features and mean shift
- H. Zhou et al. Object tracking using sift features and mean shift. Com. Vis. & Ima. Und., 113(3):345-352, 2009.
- (2009) Com. Vis. & Ima. Und , vol.113 , Issue.3 , pp. 345-352
- Zhou, H.¹ et al.

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.