-
1
-
-
79955649703
-
Human activity analysis: a review
-
Aggarwal JK, Ryoo MS (2011) Human activity analysis: a review. ACM Comput Surv 43(3):1–16
-
(2011)
ACM Comput Surv
, vol.43
, Issue.3
, pp. 1-16
-
-
Aggarwal, J.K.1
Ryoo, M.S.2
-
2
-
-
73849126715
-
Human action recognition in videos using kinematic features and multiple instance learning
-
Ali S, Shah M (2010) Human action recognition in videos using kinematic features and multiple instance learning. IEEE Trans Pattern Anal Mach Intell 32(2):288–303
-
(2010)
IEEE Trans Pattern Anal Mach Intell
, vol.32
, Issue.2
, pp. 288-303
-
-
Ali, S.1
Shah, M.2
-
3
-
-
0020849266
-
Maintaining knowledge about temporal intervals
-
Allen JF (1983) Maintaining knowledge about temporal intervals. Commun ACM 26(11):832–843
-
(1983)
Commun ACM
, vol.26
, Issue.11
, pp. 832-843
-
-
Allen, J.F.1
-
4
-
-
0022115986
-
Kinematic features of unrestrained vertical arm movements
-
Atkeson CG, Hollerbach JM (1985) Kinematic features of unrestrained vertical arm movements. J Neurosci 5(9):2318–2330
-
(1985)
J Neurosci
, vol.5
, Issue.9
, pp. 2318-2330
-
-
Atkeson, C.G.1
Hollerbach, J.M.2
-
5
-
-
34547645414
-
The bag-of-frames approach to audio pattern recognition: a sufficient model for urban soundscapes but not for polyphonic music
-
Aucouturier JJ, Defreville B, Pachet F (2007) The bag-of-frames approach to audio pattern recognition: a sufficient model for urban soundscapes but not for polyphonic music. J Acoust Soc Am 122(2):881–891
-
(2007)
J Acoust Soc Am
, vol.122
, Issue.2
, pp. 881-891
-
-
Aucouturier, J.J.1
Defreville, B.2
Pachet, F.3
-
7
-
-
84870398559
-
Audio-based event detection for sports video
-
Proceedings of international conference on image and video retrieval, Urbana-Champaign, IL
-
Baillie M, Jose JM (2003) Audio-based event detection for sports video. In: Proceedings of international conference on image and video retrieval, Urbana-Champaign, IL
-
(2003)
In
-
-
Baillie, M.1
Jose, J.M.2
-
8
-
-
78651388935
-
Event detection and recognition for semantic annotation of video
-
Ballan L, Bertini M, Bimbo AD, Seidenari L, Serra G (2011) Event detection and recognition for semantic annotation of video. Multimedia Tools Appl 51(1):279–302
-
(2011)
Multimedia Tools Appl
, vol.51
, Issue.1
, pp. 279-302
-
-
Ballan, L.1
Bertini, M.2
Bimbo, A.D.3
Seidenari, L.4
Serra, G.5
-
9
-
-
34848878272
-
Headline generation based on statistical translation
-
Proceedings of the annual meeting of the association for computational linguistics, Hong Kong
-
Banko M, Mittal VO, Witbrock MJ (2000) Headline generation based on statistical translation. In: Proceedings of the annual meeting of the association for computational linguistics, Hong Kong
-
(2000)
In
-
-
Banko, M.1
Mittal, V.O.2
Witbrock, M.J.3
-
10
-
-
84905241486
-
-
Proceedings of NIST TRECVID, Workshop, Gaithersburg, MD, USA
-
Bao L, Yu SI, Lan ZZ, Overwijk A, Jin Q, Langner B, Garbus M, Burger S, Metze F, Hauptmann A (2011) Informedia @ TRECVID 2011. In: Proceedings of NIST TRECVID, Workshop, Gaithersburg, MD, USA
-
(2011)
Informedia @ TRECVID 2011
-
-
Bao, L.1
Yu, S.I.2
Lan, Z.Z.3
Overwijk, A.4
Jin, Q.5
Langner, B.6
Garbus, M.7
Burger, S.8
Metze, F.9
Hauptmann, A.10
-
11
-
-
84881100367
-
-
arXiv:1204.3616v1
-
Barbu A, Bridge A, Coroian D, Dickinson S, Mussman S, Narayanaswamy S, Salvi D, Schmidt L, Shangguan J, Siskind JM, Waggoner J, Wang S, Wei J, Yin Y, Zhang Z (2012) Large-scale automatic labeling of video events with verbs based on event-participant interaction. arXiv:1204.3616v1
-
(2012)
Large-scale automatic labeling of video events with verbs based on event-participant interaction
-
-
Barbu, A.1
Bridge, A.2
Coroian, D.3
Dickinson, S.4
Mussman, S.5
Narayanaswamy, S.6
Salvi, D.7
Schmidt, L.8
Shangguan, J.9
Siskind, J.M.10
Waggoner, J.11
Wang, S.12
Wei, J.13
Yin, Y.14
Zhang, Z.15
-
12
-
-
43049174575
-
SURF: speeded up robust features
-
Bay H, Ess A, Tuytelaars T, van Gool L (2008) SURF: speeded up robust features. Comput Vision Image Underst 110(3):346–359
-
(2008)
Comput Vision Image Underst
, vol.110
, Issue.3
, pp. 346-359
-
-
Bay, H.1
Ess, A.2
Tuytelaars, T.3
van Gool, L.4
-
15
-
-
0031590139
-
Movement, activity, and action: the role of knowledge in the perception of motion
-
Bobick AF (1997) Movement, activity, and action: the role of knowledge in the perception of motion. Philos Trans Royal Soc London 352:1257–1265
-
(1997)
Philos Trans Royal Soc London
, vol.352
, pp. 1257-1265
-
-
Bobick, A.F.1
-
17
-
-
43449110431
-
Automatic video classification: a survey of the literature
-
Brezeale D, Cook D (2008) Automatic video classification: a survey of the literature. IEEE Trans Syst Man Cybernet Part C 38(3):416–430
-
(2008)
IEEE Trans Syst Man Cybernet Part C
, vol.38
, Issue.3
, pp. 416-430
-
-
Brezeale, D.1
Cook, D.2
-
18
-
-
79955857786
-
Efficient structure learning of bayesian networks using constraints
-
de Campos C, Ji Q (2011) Efficient structure learning of bayesian networks using constraints. J Mach Learn Res 12(3):663–689
-
(2011)
J Mach Learn Res
, vol.12
, Issue.3
, pp. 663-689
-
-
de Campos, C.1
Ji, Q.2
-
19
-
-
77954608206
-
MCG-WEBV: a benchmark dataset for web video analysis. Tech. rep
-
Institute of Computing Technology, Chinese Academy of Sciences
-
Cao J, Zhang YD, Song YC, Chen ZN, Zhang X, Li JT (2009) MCG-WEBV: a benchmark dataset for web video analysis. Tech. rep., ICT-MCG-09-001, Institute of Computing Technology, Chinese Academy of Sciences
-
(2009)
ICT-MCG-09-001
-
-
Cao, J.1
Zhang, Y.D.2
Song, Y.C.3
Chen, Z.N.4
Zhang, X.5
Li, J.T.6
-
20
-
-
4944266418
-
What is going on? a high level interpretation of sequences of images
-
Springer-Verlag, London, UK
-
Castel C, Chaudron L, Tessier C (1996) What is going on? a high level interpretation of sequences of images. In: Proceedings of European conference on computer vision, Springer-Verlag, London, UK
-
(1996)
In: Proceedings of European conference on computer vision
-
-
Castel, C.1
Chaudron, L.2
Tessier, C.3
-
21
-
-
84905180243
-
Columbia University/VIREO-CityU/IRIT TRECVID2008 high-level feature extraction and interactive video search
-
Workshop, Gaithersburg
-
Chang SF, He J, Jiang YG, El Khoury E, Ngo CW, Yanagawa A, Zavesky E (2008) Columbia University/VIREO-CityU/IRIT TRECVID2008 high-level feature extraction and interactive video search. In: Proceedings of NIST TRECVID, Workshop, Gaithersburg
-
(2008)
In: Proceedings of NIST TRECVID
-
-
Chang, S.F.1
He, J.2
Jiang, Y.G.3
El Khoury, E.4
Ngo, C.W.5
Yanagawa, A.6
Zavesky, E.7
-
22
-
-
0029716457
-
Integrated image and speech analysis for content-based video indexing
-
Proceedings of IEEE international conference on multimedia computing and systems, Washington, DC
-
Chang YL, Zeng W, Kamel I, Alonso R (1996) Integrated image and speech analysis for content-based video indexing. In: Proceedings of IEEE international conference on multimedia computing and systems, Washington, DC
-
(1996)
In
-
-
Chang, Y.L.1
Zeng, W.2
Kamel, I.3
Alonso, R.4
-
24
-
-
84905251864
-
Team SRI-Sarnoff’s AURORA System @ TRECVID 2011
-
Proceedings of NIST TRECVID, Workshop
-
Cheng H et al (2011) Team SRI-Sarnoff’s AURORA System @ TRECVID 2011. In: Proceedings of NIST TRECVID, Workshop
-
(2011)
In
-
-
Cheng, H.1
-
26
-
-
80051610520
-
Soundtrack classification by transient events
-
Cotton CV, Ellis DPW, Loui AC (2011) Soundtrack classification by transient events. In: Proceedings of IEEE international conference acoustics, speech, signal processing, pp 473–476
-
(2011)
In: Proceedings of IEEE international conference acoustics, speech, signal processing
, pp. 473-476
-
-
Cotton, C.V.1
Ellis, D.P.W.2
Loui, A.C.3
-
28
-
-
80052888136
-
-
Proceedings of IEEE conference on computer vision and pattern recognition
-
Deng J, Dong W, Socher R, Li LJ, Li K, Fei-Fei L (2009) Imagenet: a large-scale hierarchical image database. In: Proceedings of IEEE conference on computer vision and pattern recognition
-
(2009)
Imagenet: a large-scale hierarchical image database
-
-
Deng, J.1
Dong, W.2
Socher, R.3
Li, L.J.4
Li, K.5
Fei-Fei, L.6
-
31
-
-
77956003629
-
-
Proceedings of IEEE conference on computer vision and pattern recognition
-
Duan L, Xu D, Tsang IW, Luo J (2010) Visual event recognition in videos by learning from web data. In: Proceedings of IEEE conference on computer vision and pattern recognition
-
(2010)
Visual event recognition in videos by learning from web data
-
-
Duan, L.1
Xu, D.2
Tsang, I.W.3
Luo, J.4
-
33
-
-
33744968612
-
Audio-based context recognition
-
Eronen A, Peltonen V, Tuomi J, Klapuri A, Fagerlund S, Sorsa T, Lorho G, Huopaniemi J (2006) Audio-based context recognition. IEEE Trans Audio Speech Lang Process 14(1):321–329
-
(2006)
IEEE Trans Audio Speech Lang Process
, vol.14
, Issue.1
, pp. 321-329
-
-
Eronen, A.1
Peltonen, V.2
Tuomi, J.3
Klapuri, A.4
Fagerlund, S.5
Sorsa, T.6
Lorho, G.7
Huopaniemi, J.8
-
35
-
-
77955422240
-
Object detection with discriminatively trained part based models
-
Felzenszwalb P, Girshick R, McAllester D, Ramanan D (2010) Object detection with discriminatively trained part based models. IEEE Trans Pattern Anal Mach Intell 32(9):1530–1535
-
(2010)
IEEE Trans Pattern Anal Mach Intell
, vol.32
, Issue.9
, pp. 1530-1535
-
-
Felzenszwalb, P.1
Girshick, R.2
McAllester, D.3
Ramanan, D.4
-
37
-
-
0002635287
-
The case for case
-
Universals in Linguistic Theory, New York
-
Fillmore CJ (1968) The case for case. In: Bach E, Harms R (eds), Universals in Linguistic Theory, New York, pp 1–88
-
(1968)
Bach E
, pp. 1-88
-
-
Fillmore, C.J.1
Harms, R.2
-
39
-
-
28344457205
-
Verl: an ontology framework for representing and annotating video events
-
Francois ARJ, Nevatia R, Hobbs J, Bolles RC (2005) Verl: an ontology framework for representing and annotating video events. IEEE Multimedia Magazine 12(4):76–86
-
(2005)
IEEE Multimedia Magazine
, vol.12
, Issue.4
, pp. 76-86
-
-
Francois, A.R.J.1
Nevatia, R.2
Hobbs, J.3
Bolles, R.C.4
-
40
-
-
25844482570
-
A comparison of algorithms for inference and learning in probabilistic graphical models
-
Frey BJ, Jojic N (2005) A comparison of algorithms for inference and learning in probabilistic graphical models. IEEE Trans Pattern Anal Mach Intell 27(9):1392–1416
-
(2005)
IEEE Trans Pattern Anal Mach Intell
, vol.27
, Issue.9
, pp. 1392-1416
-
-
Frey, B.J.1
Jojic, N.2
-
41
-
-
77952671498
-
Visual word ambiguity
-
van Gemert JC, Veenman CJ, Smeulders AWM, Geusebroek JM (2010) Visual word ambiguity. IEEE Trans Pattern Anal Mach Intell 32(7):1271–1283
-
(2010)
IEEE Trans Pattern Anal Mach Intell
, vol.32
, Issue.7
, pp. 1271-1283
-
-
van Gemert, J.C.1
Veenman, C.J.2
Smeulders, A.W.M.3
Geusebroek, J.M.4
-
43
-
-
0000351727
-
Investigating causal relations by econometric models and cross-spectral methods
-
Granger C (1969) Investigating causal relations by econometric models and cross-spectral methods. Econometrica 37(3):424–438
-
(1969)
Econometrica
, vol.37
, Issue.3
, pp. 424-438
-
-
Granger, C.1
-
46
-
-
77953194241
-
Action detection in complex scenes with spatial and temporal ambiguities
-
Hu Y, Cao L, Lv F, Yan S, Gong Y, Huang TS (2009) Action detection in complex scenes with spatial and temporal ambiguities. In: Proceedings of IEEE international conference on computer vision
-
(2009)
In: Proceedings of IEEE international conference on computer vision
-
-
Hu, Y.1
Cao, L.2
Lv, F.3
Yan, S.4
Gong, Y.5
Huang, T.S.6
-
47
-
-
33746649771
-
Semantic analysis of soccer video using dynamic bayesian network
-
Huang CL, Shih HC, Chao CY (2006) Semantic analysis of soccer video using dynamic bayesian network. IEEE Trans Multimedia 8(4):749–760
-
(2006)
IEEE Trans Multimedia
, vol.8
, Issue.4
, pp. 749-760
-
-
Huang, C.L.1
Shih, H.C.2
Chao, C.Y.3
-
48
-
-
84905233993
-
-
Proceedings of NIST TRECVID Workshop
-
Inoue N, Kamishima Y, Wada T, Shinoda K, Sato S (2011) TokyoTech+Canon at TRECVID 2011. In: Proceedings of NIST TRECVID Workshop
-
(2011)
TokyoTech+Canon at TRECVID 2011
-
-
Inoue, N.1
Kamishima, Y.2
Wada, T.3
Shinoda, K.4
Sato, S.5
-
50
-
-
0034245366
-
Recognition of visual activities and interactions by stochastic parsing
-
Ivanov YA, Bobick AF (2000) Recognition of visual activities and interactions by stochastic parsing. IEEE Trans Pattern Anal Mach Intell 22(8):852–872
-
(2000)
IEEE Trans Pattern Anal Mach Intell
, vol.22
, Issue.8
, pp. 852-872
-
-
Ivanov, Y.A.1
Bobick, A.F.2
-
56
-
-
72949121298
-
Representations of keypoint-based semantic concept detection: a comprehensive study
-
Jiang YG, Yang J, Ngo CW, Hauptmann AG (2010) Representations of keypoint-based semantic concept detection: a comprehensive study. IEEE Trans Multimedia 12(1):42–53
-
(2010)
IEEE Trans Multimedia
, vol.12
, Issue.1
, pp. 42-53
-
-
Jiang, Y.G.1
Yang, J.2
Ngo, C.W.3
Hauptmann, A.G.4
-
58
-
-
84905161670
-
Columbia-UCF TRECVID2010 multimedia event detection: Combining multiple modalities, contextual concepts, and temporal matching
-
Proceedings of NIST TRECVID, Workshop
-
Jiang YG, Zeng X, Ye G, Bhattacharya S, Ellis D, Shah M, Chang SF (2010) Columbia-UCF TRECVID2010 multimedia event detection: Combining multiple modalities, contextual concepts, and temporal matching. In: Proceedings of NIST TRECVID, Workshop
-
(2010)
In
-
-
Jiang, Y.G.1
Zeng, X.2
Ye, G.3
Bhattacharya, S.4
Ellis, D.5
Shah, M.6
Chang, S.F.7
-
59
-
-
33845524029
-
Attribute grammar-based event recognition and anomaly detection
-
Proceedings of IEEE conference on computer vision and pattern recognition, Workshop
-
Joo SW, Chellappa R (2006) Attribute grammar-based event recognition and anomaly detection. In: Proceedings of IEEE conference on computer vision and pattern recognition, Workshop
-
(2006)
In
-
-
Joo, S.W.1
Chellappa, R.2
-
63
-
-
0036843382
-
Natural language description of human activities from video images based on concept hierarchy of actions
-
Kojima A, Tamura T, Fukunaga K (2002) Natural language description of human activities from video images based on concept hierarchy of actions. Int J Comput Vision 50(2):171–184
-
(2002)
Int J Comput Vision
, vol.50
, Issue.2
, pp. 171-184
-
-
Kojima, A.1
Tamura, T.2
Fukunaga, K.3
-
65
-
-
24944451092
-
On space-time interest points
-
Laptev I (2005) On space-time interest points. Int J Comput Vision 64:107–123
-
(2005)
Int J Comput Vision
, vol.64
, pp. 107-123
-
-
Laptev, I.1
-
67
-
-
69549119986
-
Understanding video events: a survey of methods for automatic interpretation of semantic occurrences in videos
-
Lavee G, Rivlin E, Rudzsky M (2009) Understanding video events: a survey of methods for automatic interpretation of semantic occurrences in videos. IEEE Trans Syst Man Cybernet Part C 39(5):489–504
-
(2009)
IEEE Trans Syst Man Cybernet Part C
, vol.39
, Issue.5
, pp. 489-504
-
-
Lavee, G.1
Rivlin, E.2
Rudzsky, M.3
-
69
-
-
80052874098
-
-
Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis. In: Proceedings of IEEE conference on computer vision and pattern recognition
-
Le QV, Zou WY, Yeung SY, Ng AY (2011) Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis. In: Proceedings of IEEE conference on computer vision and pattern recognition
-
(2011)
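Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis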
-
-
Le, Q.V.1
Zou, W.Y.2
Yeung, S.Y.3
Ng, A.Y.4
-
70
-
-
77955746721
-
Audio-based semantic concept classification for consumer video
-
Lee K, Ellis DPW (2010) Audio-based semantic concept classification for consumer video. IEEE Trans Audio Speech Lang Process 18(6):1406–1416
-
(2010)
IEEE Trans Audio Speech Lang Process
, vol.18
, Issue.6
, pp. 1406-1416
-
-
Lee, K.1
Ellis, D.P.W.2
-
71
-
-
55149112799
-
Expandable data-driven graphical modeling of human actions based on salient postures
-
Li W, Zhang Z, Liu Z (2008) Expandable data-driven graphical modeling of human actions based on salient postures. IEEE Trans Circ Syst Video Technol 18(11):1499–1510
-
(2008)
IEEE Trans Circ Syst Video Technol
, vol.18
, Issue.11
, pp. 1499-1510
-
-
Li, W.1
Zhang, Z.2
Liu, Z.3
-
72
-
-
0032209062
-
Feature detection with automatic scale selection
-
Lindeberg T (1998) Feature detection with automatic scale selection. Int J Comput Vision 30:79–116
-
(1998)
Int J Comput Vision
, vol.30
, pp. 79-116
-
-
Lindeberg, T.1
-
76
-
-
37849015208
-
-
In: Proceedings of ACM international workshop on multimedia information retrieval
-
Loui AC, Luo J, Chang SF, Ellis D, Jiang W, Kennedy L, Lee K, Yanagawa A (2007) Kodak’s consumer video benchmark data set: concept definition and annotation. In: Proceedings of ACM international workshop on multimedia information retrieval
-
(2007)
Kodak’s consumer video benchmark data set: concept definition and annotation
-
-
Loui, A.C.1
Luo, J.2
Chang, S.F.3
Ellis, D.4
Jiang, W.5
Kennedy, L.6
Lee, K.7
Yanagawa, A.8
-
77
-
-
3042535216
-
Distinctive image features from scale-invariant keypoints
-
Lowe D (2004) Distinctive image features from scale-invariant keypoints. Int J Comput Vision 60:91–110
-
(2004)
Int J Comput Vision
, vol.60
, pp. 91-110
-
-
Lowe, D.1
-
78
-
-
85008010045
-
Audio keywords discovery for text-like audio content analysis and retrieval
-
Lu L, Hanjalic A (2008) Audio keywords discovery for text-like audio content analysis and retrieval. IEEE Trans Multimedia 10(1):74–85
-
(2008)
IEEE Trans Multimedia
, vol.10
, Issue.1
, pp. 74-85
-
-
Lu, L.1
Hanjalic, A.2
-
80
-
-
78149304826
-
Sound retrieval and ranking using sparse auditory representations
-
Lyon RF, Rehn M, Bengio S, Walters TC, Chechik G (2010) Sound retrieval and ranking using sparse auditory representations. Neural Comput 22(9):2390–2416
-
(2010)
Neural Comput
, vol.22
, Issue.9
, pp. 2390-2416
-
-
Lyon, R.F.1
Rehn, M.2
Bengio, S.3
Walters, T.C.4
Chechik, G.5
-
83
-
-
0030213052
-
Texture features for browsing and retrieval of image data
-
Manjunath BS, Ma WY (1996) Texture features for browsing and retrieval of image data. IEEE Trans Pattern Anal Mach Intell 18(8):837–842
-
(1996)
IEEE Trans Pattern Anal Mach Intell
, vol.18
, Issue.8
, pp. 837-842
-
-
Manjunath, B.S.1
Ma, W.Y.2
-
84
-
-
85046873967
-
The det curve in assessment of detection task performance
-
Martin A, Doddington G, Kamm T, Ordowski M, Przybocki M (1997) The det curve in assessment of detection task performance. In: Proceedings of European conference on speech communication and technology, pp 1895–1898
-
(1997)
In: Proceedings of European conference on speech communication and technology
, pp. 1895-1898
-
-
Martin, A.1
Doddington, G.2
Kamm, T.3
Ordowski, M.4
Przybocki, M.5
-
85
-
-
0041416425
-
Robust wide baseline stereo from maximally stable extremal regions
-
Matas J, Chum O, Urban M, Pajdla T (2002) Robust wide baseline stereo from maximally stable extremal regions. In: Proceedings of British machine vision conference, vol 1, pp 384–393
-
(2002)
Proceedings of British machine vision conference
, vol.1
, pp. 384-393
-
-
Matas, J.1
Chum, O.2
Urban, M.3
Pajdla, T.4
-
88
-
-
9644260534
-
Scale and affine invariant interest point detectors
-
Mikolajczyk K, Schmid C (2004) Scale and affine invariant interest point detectors. Int J Comput Vision 60:63–86
-
(2004)
Int J Comput Vision
, vol.60
, pp. 63-86
-
-
Mikolajczyk, K.1
Schmid, C.2
-
90
-
-
33244468369
-
A comparison of affine region detectors
-
Mikolajczyk K, Tuytelaars T, Schmid C, Zisserman A, Matas J et al (2005) A comparison of affine region detectors. Int J Comput Vision 65(1/2):43–72
-
(2005)
Int J Comput Vision
, vol.65
, Issue.1-2
, pp. 43-72
-
-
Mikolajczyk, K.1
Tuytelaars, T.2
Schmid, C.3
Zisserman, A.4
Matas, J.5
-
92
-
-
55449128654
-
Recognizing multitasked activities using stochastic context-free grammar
-
Moore D, Essa I (2001) Recognizing multitasked activities using stochastic context-free grammar. In: Proceedings of AAAI conference
-
(2001)
In: Proceedings of AAAI conference
-
-
Moore, D.1
Essa, I.2
-
94
-
-
77951750177
-
Youtube scale, large vocabulary video annotation, Chapter 14 in video search and mining. Springer-Verlag series on studies in computational intelligence
-
Morsillo N, Mann G, Pal C (2010) Youtube scale, large vocabulary video annotation, Chapter 14 in video search and mining. Springer-Verlag series on studies in computational intelligence. Springer, Berlin, pp 357–386
-
(2010)
Springer, Berlin
, pp. 357-386
-
-
Morsillo, N.1
Mann, G.2
Pal, C.3
-
95
-
-
33747626730
-
Large-scale concept ontology for multimedia
-
Naphade M, Smith J, Tesic J, Chang SF, Hsu W, Kennedy L, Hauptmann A, Curtis J (2006) Large-scale concept ontology for multimedia. IEEE Multimedia Magazine 13(3):86–91
-
(2006)
IEEE Multimedia Magazine
, vol.13
, Issue.3
, pp. 86-91
-
-
Naphade, M.1
Smith, J.2
Tesic, J.3
Chang, S.F.4
Hsu, W.5
Kennedy, L.6
Hauptmann, A.7
Curtis, J.8
-
98
-
-
84905189035
-
-
Proceedings of NIST TRECVID, Workshop
-
Natsev A, Smith JR, Hill M, Hua G, Huang B, Merler M, Xie L, Ouyang H, Zhou M (2010) IBM Research TRECVID-2010 video copy detection and multimedia event detection system. In: Proceedings of NIST TRECVID, Workshop
-
(2010)
IBM Research TRECVID-2010 video copy detection and multimedia event detection system
-
-
Natsev, A.1
Smith, J.R.2
Hill, M.3
Hua, G.4
Huang, B.5
Merler, M.6
Xie, L.7
Ouyang, H.8
Zhou, M.9
-
102
-
-
79952952363
-
Spatiotemporal localization and categorization of human actions in unsegmented image sequences
-
Oikonomopoulos A, Patras I, Pantic M (2011) Spatiotemporal localization and categorization of human actions in unsegmented image sequences. IEEE Trans Image Process 20(4):1126–1140
-
(2011)
IEEE Trans Image Process
, vol.20
, Issue.4
, pp. 1126-1140
-
-
Oikonomopoulos, A.1
Patras, I.2
Pantic, M.3
-
103
-
-
0036647193
-
Multiresolution gray-scale and rotation invariant texture classification with local binary patterns
-
Ojala T, Pietikainen M, Maenpaa T (2002) Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans Pattern Anal Mach Intell 24(7):971–987
-
(2002)
IEEE Trans Pattern Anal Mach Intell
, vol.24
, Issue.7
, pp. 971-987
-
-
Ojala, T.1
Pietikainen, M.2
Maenpaa, T.3
-
104
-
-
0035328421
-
Modeling the shape of the scene: a holistic representation of the spatial envelope
-
Oliva A, Torralba A (2001) Modeling the shape of the scene: a holistic representation of the spatial envelope. Int J Comput Vision 42:145–175
-
(2001)
Int J Comput Vision
, vol.42
, pp. 145-175
-
-
Oliva, A.1
Torralba, A.2
-
107
-
-
0000460671
-
Complex sounds and auditory images
-
Patterson RD, Robinson K, Holdsworth J, McKeown D, Zhang C, Allerhand M (1992) Complex sounds and auditory images. In: Proceedings of international symposium on hearing, pp 429–446
-
(1992)
In: Proceedings of international symposium on hearing
, pp. 429-446
-
-
Patterson, R.D.1
Robinson, K.2
Holdsworth, J.3
McKeown, D.4
Zhang, C.5
Allerhand, M.6
-
111
-
-
77949275097
-
Survey on vision-based human action recognition
-
Poppe R (2010) Survey on vision-based human action recognition. Image Vision Comput 28(6):976–990
-
(2010)
Image Vision Comput
, vol.28
, Issue.6
, pp. 976-990
-
-
Poppe, R.1
-
115
-
-
0034313871
-
The earth mover’s distance as a metric for image retrieval
-
Rubner Y, Tomasi C, Guibas LJ (2000) The earth mover’s distance as a metric for image retrieval. Int J Comput Vision 40(2):99–121
-
(2000)
Int J Comput Vision
, vol.40
, Issue.2
, pp. 99-121
-
-
Rubner, Y.1
Tomasi, C.2
Guibas, L.J.3
-
116
-
-
39749186006
-
LabelMe: a database and web-based tool for image annotation
-
Russell B, Torralba A, Murphy K, Freeman WT (2008) LabelMe: a database and web-based tool for image annotation. Int J Comput Vision 77(1–3):157–173
-
(2008)
Int J Comput Vision
, vol.77
, Issue.1-3
, pp. 157-173
-
-
Russell, B.1
Torralba, A.2
Murphy, K.3
Freeman, W.T.4
-
118
-
-
27844565238
-
Event detection in field sports video using audio-visual features and a support vector machine
-
Sadlier DA, O’Connor NE (2005) Event detection in field sports video using audio-visual features and a support vector machine. IEEE Trans Circ Syst Video Technol 15(10):1225–1233
-
(2005)
IEEE Trans Circ Syst Video Technol
, vol.15
, Issue.10
, pp. 1225-1233
-
-
Sadlier, D.A.1
O’Connor, N.E.2
-
129
-
-
0034498523
-
Content based image retrieval at the end of the early years
-
Smeulders AWM, Worring M, Santini S, Gupta A, Jain R (2000) Content based image retrieval at the end of the early years. IEEE Trans Pattern Anal Mach Intell 22(12):1349–1380
-
(2000)
IEEE Trans Pattern Anal Mach Intell
, vol.22
, Issue.12
, pp. 1349-1380
-
-
Smeulders, A.W.M.1
Worring, M.2
Santini, S.3
Gupta, A.4
Jain, R.5
-
131
-
-
0003459124
-
Visual recognition of american sign language using hidden markov models
-
Starner TE (1995) Visual recognition of american sign language using hidden markov models. Ph.D. thesis
-
(1995)
Ph.D. thesis
-
-
Starner, T.E.1
-
132
-
-
70450214829
-
Hierarchical spatio-temporal context modeling for action recognition
-
Sun J, Wu X, Yan S, Cheong LF, Chua TS, Li J (2009) Hierarchical spatio-temporal context modeling for action recognition. In: Proceedings of IEEE conference on computer vision and pattern recognition
-
(2009)
In: Proceedings of IEEE conference on computer vision and pattern recognition
-
-
Sun, J.1
Wu, X.2
Yan, S.3
Cheong, L.F.4
Chua, T.S.5
Li, J.6
-
133
-
-
80155180597
-
-
Automatic annotation of web videos. In: Proceedings of IEEE international conference on multimedia and expo
-
Sun SW, Wang YCF, Hung YL, Chang CL, Chen KC, Cheng SS, Wang HM, Liao HYM (2011) Automatic annotation of web videos. In: Proceedings of IEEE international conference on multimedia and expo
-
(2011)
Automatic annotation of web videos
-
-
Sun, S.W.1
Wang, Y.C.F.2
Hung, Y.L.3
Chang, C.L.4
Chen, K.C.5
Cheng, S.S.6
Wang, H.M.7
Liao, H.Y.M.8
-
139
-
-
55149089260
-
Machine recognition of human activities: a survey
-
Turaga P, Chellappa R, Subrahmanian VS, Udrea O (2008) Machine recognition of human activities: a survey. IEEE Trans Circ Syst Video Technol 18(11):1473–1488
-
(2008)
IEEE Trans Circ Syst Video Technol
, vol.18
, Issue.11
, pp. 1473-1488
-
-
Turaga, P.1
Chellappa, R.2
Subrahmanian, V.S.3
Udrea, O.4
-
147
-
-
79551480483
-
Stacked denoising autoencoders: learning useful representations in a deep network with a local denoising criterion
-
Vincent P, Larochelle H, Lajoie I, Bengio Y, Manzagol PA (2010) Stacked denoising autoencoders: learning useful representations in a deep network with a local denoising criterion. J Mach Learn Res 11(12):3371–3408
-
(2010)
J Mach Learn Res
, vol.11
, Issue.12
, pp. 3371-3408
-
-
Vincent, P.1
Larochelle, H.2
Lajoie, I.3
Bengio, Y.4
Manzagol, P.A.5
-
150
-
-
80052877143
-
-
Action recognition by dense trajectories. In: Proceedings of IEEE conference on computer vision and pattern recognition
-
Wang H, Klaser A, Schmid C, Liu CL (2011) Action recognition by dense trajectories. In: Proceedings of IEEE conference on computer vision and pattern recognition
-
(2011)
Action recognition by dense trajectories
-
-
Wang, H.1
Klaser, A.2
Schmid, C.3
Liu, C.L.4
-
155
-
-
33750025833
-
Free viewpoint action recognition using motion history volumes
-
Weinland D, Ronfard R, Boyer E (2006) Free viewpoint action recognition using motion history volumes. Comput Vision Image Underst 104(2):249–257
-
(2006)
Comput Vision Image Underst
, vol.104
, Issue.2
, pp. 249-257
-
-
Weinland, D.1
Ronfard, R.2
Boyer, E.3
-
160
-
-
2142771243
-
Structure analysis of soccer video with domain knowledge and hidden markov models
-
Xie L, Xu P, Chang SF, Divakaran A, Sun H (2004) Structure analysis of soccer video with domain knowledge and hidden markov models. Pattern Recognit Lett 25(7):767–775
-
(2004)
Pattern Recognit Lett
, vol.25
, Issue.7
, pp. 767-775
-
-
Xie, L.1
Xu, P.2
Chang, S.F.3
Divakaran, A.4
Sun, H.5
-
161
-
-
41549084805
-
A novel framework for semantic annotation and personalized retrieval of sports video
-
Xu C, Wang J, Lu H, Zhang Y (2008) A novel framework for semantic annotation and personalized retrieval of sports video. IEEE Trans Multimedia 10(3):421–436
-
(2008)
IEEE Trans Multimedia
, vol.10
, Issue.3
, pp. 421-436
-
-
Xu, C.1
Wang, J.2
Lu, H.3
Zhang, Y.4
-
162
-
-
54749131961
-
Video event recognition using Kernel methods with multilevel temporal alignment
-
Xu D, Chang SF (2008) Video event recognition using Kernel methods with multilevel temporal alignment. IEEE Trans Pattern Anal Mach Intell 30(11):1985–1997
-
(2008)
IEEE Trans Pattern Anal Mach Intell
, vol.30
, Issue.11
, pp. 1985-1997
-
-
Xu, D.1
Chang, S.F.2
-
167
-
-
77954862144
-
I2T: Image parsing to text description
-
Yao B, Yang X, Lin L, Lee M, Zhu S (2010) I2T: Image parsing to text description. Proc IEEE 98(8):1485–1508
-
(2010)
Proc IEEE
, vol.98
, Issue.8
, pp. 1485-1508
-
-
Yao, B.1
Yang, X.2
Lin, L.3
Lee, M.4
Zhu, S.5
-
174
-
-
33846580425
-
Local features and kernels for classification of texture and object categories: a comprehensive study
-
Zhang J, Marszalek M, Lazebnik S, Schmid C (2007) Local features and kernels for classification of texture and object categories: a comprehensive study. Int J Comput Vision 73(2):213–238
-
(2007)
Int J Comput Vision
, vol.73
, Issue.2
, pp. 213-238
-
-
Zhang, J.1
Marszalek, M.2
Lazebnik, S.3
Schmid, C.4