Volume 2, Issue 2, 2013, Pages 73-101

High-level event recognition in unconstrained videos

Author keywords

Fusion; Multimedia event detection; Multimodal features; Recognition; Unconstrained videos; Video events

EID: 84986185450     PISSN: 21926611     EISSN: 2192662X     Source Type: Journal    
DOI: 10.1007/s13735-012-0024-2     Document Type: Article
Times cited: 149

References (174)
  • 1
    • 79955649703
    • Human activity analysis: a review
    • Aggarwal JK, Ryoo MS (2011) Human activity analysis: a review. ACM Comput Surv 43(3):1–16
    • (2011) ACM Comput Surv , vol.43 , Issue.3 , pp. 1-16
    • Aggarwal, J.K.1    Ryoo, M.S.2
  • 2
    • 73849126715
    • Human action recognition in videos using kinematic features and multiple instance learning
    • Ali S, Shah M (2010) Human action recognition in videos using kinematic features and multiple instance learning. IEEE Trans Pattern Anal Mach Intell 32(2):288–303
    • (2010) IEEE Trans Pattern Anal Mach Intell , vol.32 , Issue.2 , pp. 288-303
    • Ali, S.1    Shah, M.2
  • 3
    • 0020849266
    • Maintaining knowledge about temporal intervals
    • Allen JF (1983) Maintaining knowledge about temporal intervals. Commun ACM 26(11):832–843
    • (1983) Commun ACM , vol.26 , Issue.11 , pp. 832-843
    • Allen, J.F.1
  • 4
    • 0022115986
    • Kinematic features of unrestrained vertical arm movements
    • Atkeson CG, Hollerbach JM (1985) Kinematic features of unrestrained vertical arm movements. J Neurosci 5(9):2318–2330
    • (1985) J Neurosci , vol.5 , Issue.9 , pp. 2318-2330
    • Atkeson, C.G.1    Hollerbach, J.M.2
  • 5
    • 34547645414
    • The bag-of-frames approach to audio pattern recognition: a sufficient model for urban soundscapes but not for polyphonic music
    • Aucouturier JJ, Defreville B, Pachet F (2007) The bag-of-frames approach to audio pattern recognition: a sufficient model for urban soundscapes but not for polyphonic music. J Acoust Soc Am 122(2):881–891
    • (2007) J Acoust Soc Am , vol.122 , Issue.2 , pp. 881-891
    • Aucouturier, J.J.1    Defreville, B.2    Pachet, F.3
  • 7
    • 84870398559
    • Audio-based event detection for sports video
    • Proceedings of international conference on image and video retrieval, Urbana-Champaign, IL
    • Baillie M, Jose JM (2003) Audio-based event detection for sports video. In: Proceedings of international conference on image and video retrieval, Urbana-Champaign, IL
    • (2003) Proceedings of international conference on image and video retrieval
    • Baillie, M.1    Jose, J.M.2
  • 9
    • 34848878272
    • Headline generation based on statistical translation
    • Proceedings of the annual meeting of the association for computational linguistics, Hong Kong
    • Banko M, Mittal VO, Witbrock MJ (2000) Headline generation based on statistical translation. In: Proceedings of the annual meeting of the association for computational linguistics, Hong Kong
    • (2000) Proceedings of the annual meeting of the association for computational linguistics
    • Banko, M.1    Mittal, V.O.2    Witbrock, M.J.3
  • 15
    • 0031590139
    • Movement, activity, and action: the role of knowledge in the perception of motion
    • Bobick AF (1997) Movement, activity, and action: the role of knowledge in the perception of motion. Philos Trans Royal Soc London 352:1257–1265
    • (1997) Philos Trans Royal Soc London , vol.352 , pp. 1257-1265
    • Bobick, A.F.1
  • 17
    • 43449110431
    • Automatic video classification: a survey of the literature
    • Brezeale D, Cook D (2008) Automatic video classification: a survey of the literature. IEEE Trans Syst Man Cybernet Part C 38(3):416–430
    • (2008) IEEE Trans Syst Man Cybernet Part C , vol.38 , Issue.3 , pp. 416-430
    • Brezeale, D.1    Cook, D.2
  • 18
    • 79955857786
    • Efficient structure learning of Bayesian networks using constraints
    • de Campos C, Ji Q (2011) Efficient structure learning of Bayesian networks using constraints. J Mach Learn Res 12(3):663–689
    • (2011) J Mach Learn Res , vol.12 , Issue.3 , pp. 663-689
    • de Campos, C.1    Ji, Q.2
  • 19
    • 77954608206
    • MCG-WEBV: a benchmark dataset for web video analysis
    • Tech. rep. ICT-MCG-09-001, Institute of Computing Technology, Chinese Academy of Sciences
    • Cao J, Zhang YD, Song YC, Chen ZN, Zhang X, Li JT (2009) MCG-WEBV: a benchmark dataset for web video analysis. Tech. rep., ICT-MCG-09-001, Institute of Computing Technology, Chinese Academy of Sciences
    • (2009) Tech. rep. ICT-MCG-09-001
    • Cao, J.1    Zhang, Y.D.2    Song, Y.C.3    Chen, Z.N.4    Zhang, X.5    Li, J.T.6
  • 22
    • 0029716457
    • Integrated image and speech analysis for content-based video indexing
    • Proceedings of IEEE international conference on multimedia computing and systems, Washington, DC
    • Chang YL, Zeng W, Kamel I, Alonso R (1996) Integrated image and speech analysis for content-based video indexing. In: Proceedings of IEEE international conference on multimedia computing and systems, Washington, DC
    • (1996) Proceedings of IEEE international conference on multimedia computing and systems
    • Chang, Y.L.1    Zeng, W.2    Kamel, I.3    Alonso, R.4
  • 24
    • 84905251864
    • Team SRI-Sarnoff’s AURORA System @ TRECVID 2011
    • Proceedings of NIST TRECVID Workshop
    • Cheng H et al (2011) Team SRI-Sarnoff’s AURORA System @ TRECVID 2011. In: Proceedings of NIST TRECVID Workshop
    • (2011) Proceedings of NIST TRECVID Workshop
    • Cheng, H.1
  • 37
    • 0002635287
    • The case for case
    • Universals in Linguistic Theory, New York
    • Fillmore CJ (1968) The case for case. In: Bach E, Harms R (eds), Universals in Linguistic Theory, New York, pp 1–88
    • (1968) Universals in Linguistic Theory , pp. 1-88
    • Fillmore, C.J.1
  • 39
    • 28344457205
    • Verl: an ontology framework for representing and annotating video events
    • Francois ARJ, Nevatia R, Hobbs J, Bolles RC (2005) Verl: an ontology framework for representing and annotating video events. IEEE Multimedia Magazine 12(4):76–86
    • (2005) IEEE Multimedia Magazine , vol.12 , Issue.4 , pp. 76-86
    • Francois, A.R.J.1    Nevatia, R.2    Hobbs, J.3    Bolles, R.C.4
  • 40
    • 25844482570
    • A comparison of algorithms for inference and learning in probabilistic graphical models
    • Frey BJ, Jojic N (2005) A comparison of algorithms for inference and learning in probabilistic graphical models. IEEE Trans Pattern Anal Mach Intell 27(9):1392–1416
    • (2005) IEEE Trans Pattern Anal Mach Intell , vol.27 , Issue.9 , pp. 1392-1416
    • Frey, B.J.1    Jojic, N.2
  • 43
    • 0000351727
    • Investigating causal relations by econometric models and cross-spectral methods
    • Granger C (1969) Investigating causal relations by econometric models and cross-spectral methods. Econometrica 37(3):424–438
    • (1969) Econometrica , vol.37 , Issue.3 , pp. 424-438
    • Granger, C.1
  • 47
    • 33746649771
    • Semantic analysis of soccer video using dynamic Bayesian network
    • Huang CL, Shih HC, Chao CY (2006) Semantic analysis of soccer video using dynamic Bayesian network. IEEE Trans Multimedia 8(4):749–760
    • (2006) IEEE Trans Multimedia , vol.8 , Issue.4 , pp. 749-760
    • Huang, C.L.1    Shih, H.C.2    Chao, C.Y.3
  • 50
    • 0034245366
    • Recognition of visual activities and interactions by stochastic parsing
    • Ivanov YA, Bobick AF (2000) Recognition of visual activities and interactions by stochastic parsing. IEEE Trans Pattern Anal Mach Intell 22(8):852–872
    • (2000) IEEE Trans Pattern Anal Mach Intell , vol.22 , Issue.8 , pp. 852-872
    • Ivanov, Y.A.1    Bobick, A.F.2
  • 56
    • 72949121298
    • Representations of keypoint-based semantic concept detection: a comprehensive study
    • Jiang YG, Yang J, Ngo CW, Hauptmann AG (2010) Representations of keypoint-based semantic concept detection: a comprehensive study. IEEE Trans Multimedia 12(1):42–53
    • (2010) IEEE Trans Multimedia , vol.12 , Issue.1 , pp. 42-53
    • Jiang, Y.G.1    Yang, J.2    Ngo, C.W.3    Hauptmann, A.G.4
  • 58
    • 84905161670
    • Columbia-UCF TRECVID2010 multimedia event detection: Combining multiple modalities, contextual concepts, and temporal matching
    • Proceedings of NIST TRECVID Workshop
    • Jiang YG, Zeng X, Ye G, Bhattacharya S, Ellis D, Shah M, Chang SF (2010) Columbia-UCF TRECVID2010 multimedia event detection: Combining multiple modalities, contextual concepts, and temporal matching. In: Proceedings of NIST TRECVID Workshop
    • (2010) Proceedings of NIST TRECVID Workshop
    • Jiang, Y.G.1    Zeng, X.2    Ye, G.3    Bhattacharya, S.4    Ellis, D.5    Shah, M.6    Chang, S.F.7
  • 59
    • 33845524029
    • Attribute grammar-based event recognition and anomaly detection
    • Proceedings of IEEE conference on computer vision and pattern recognition, Workshop
    • Joo SW, Chellappa R (2006) Attribute grammar-based event recognition and anomaly detection. In: Proceedings of IEEE conference on computer vision and pattern recognition, Workshop
    • (2006) Proceedings of IEEE conference on computer vision and pattern recognition, Workshop
    • Joo, S.W.1    Chellappa, R.2
  • 63
    • 0036843382
    • Natural language description of human activities from video images based on concept hierarchy of actions
    • Kojima A, Tamura T, Fukunaga K (2002) Natural language description of human activities from video images based on concept hierarchy of actions. Int J Comput Vision 50(2):171–184
    • (2002) Int J Comput Vision , vol.50 , Issue.2 , pp. 171-184
    • Kojima, A.1    Tamura, T.2    Fukunaga, K.3
  • 65
    • 24944451092
    • On space-time interest points
    • Laptev I (2005) On space-time interest points. Int J Comput Vision 64:107–123
    • (2005) Int J Comput Vision , vol.64 , pp. 107-123
    • Laptev, I.1
  • 67
    • 69549119986
    • Understanding video events: a survey of methods for automatic interpretation of semantic occurrences in videos
    • Lavee G, Rivlin E, Rudzsky M (2009) Understanding video events: a survey of methods for automatic interpretation of semantic occurrences in videos. IEEE Trans Syst Man Cybernet Part C 39(5):489–504
    • (2009) IEEE Trans Syst Man Cybernet Part C , vol.39 , Issue.5 , pp. 489-504
    • Lavee, G.1    Rivlin, E.2    Rudzsky, M.3
  • 69
    • 80052874098
    • Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis
    • Proceedings of IEEE conference on computer vision and pattern recognition
    • Le QV, Zou WY, Yeung SY, Ng AY (2011) Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis. In: Proceedings of IEEE conference on computer vision and pattern recognition
    • (2011) Proceedings of IEEE conference on computer vision and pattern recognition
    • Le, Q.V.1    Zou, W.Y.2    Yeung, S.Y.3    Ng, A.Y.4
  • 70
    • 77955746721
    • Audio-based semantic concept classification for consumer video
    • Lee K, Ellis DPW (2010) Audio-based semantic concept classification for consumer video. IEEE Trans Audio Speech Lang Process 18(6):1406–1416
    • (2010) IEEE Trans Audio Speech Lang Process , vol.18 , Issue.6 , pp. 1406-1416
    • Lee, K.1    Ellis, D.P.W.2
  • 71
    • 55149112799
    • Expandable data-driven graphical modeling of human actions based on salient postures
    • Li W, Zhang Z, Liu Z (2008) Expandable data-driven graphical modeling of human actions based on salient postures. IEEE Trans Circ Syst Video Technol 18(11):1499–1510
    • (2008) IEEE Trans Circ Syst Video Technol , vol.18 , Issue.11 , pp. 1499-1510
    • Li, W.1    Zhang, Z.2    Liu, Z.3
  • 72
    • 0032209062
    • Feature detection with automatic scale selection
    • Lindeberg T (1998) Feature detection with automatic scale selection. Int J Comput Vision 30:79–116
    • (1998) Int J Comput Vision , vol.30 , pp. 79-116
    • Lindeberg, T.1
  • 77
    • 3042535216
    • Distinctive image features from scale-invariant keypoints
    • Lowe D (2004) Distinctive image features from scale-invariant keypoints. Int J Comput Vision 60:91–110
    • (2004) Int J Comput Vision , vol.60 , pp. 91-110
    • Lowe, D.1
  • 78
    • 85008010045
    • Audio keywords discovery for text-like audio content analysis and retrieval
    • Lu L, Hanjalic A (2008) Audio keywords discovery for text-like audio content analysis and retrieval. IEEE Trans Multimedia 10(1):74–85
    • (2008) IEEE Trans Multimedia , vol.10 , Issue.1 , pp. 74-85
    • Lu, L.1    Hanjalic, A.2
  • 80
    • 78149304826
    • Sound retrieval and ranking using sparse auditory representations
    • Lyon RF, Rehn M, Bengio S, Walters TC, Chechik G (2010) Sound retrieval and ranking using sparse auditory representations. Neural Comput 22(9):2390–2416
    • (2010) Neural Comput , vol.22 , Issue.9 , pp. 2390-2416
    • Lyon, R.F.1    Rehn, M.2    Bengio, S.3    Walters, T.C.4    Chechik, G.5
  • 83
    • 0030213052
    • Texture features for browsing and retrieval of image data
    • Manjunath BS, Ma WY (1996) Texture features for browsing and retrieval of image data. IEEE Trans Pattern Anal Mach Intell 18(8):837–842
    • (1996) IEEE Trans Pattern Anal Mach Intell , vol.18 , Issue.8 , pp. 837-842
    • Manjunath, B.S.1    Ma, W.Y.2
  • 88
    • 9644260534
    • Scale and affine invariant interest point detectors
    • Mikolajczyk K, Schmid C (2004) Scale and affine invariant interest point detectors. Int J Comput Vision 60:63–86
    • (2004) Int J Comput Vision , vol.60 , pp. 63-86
    • Mikolajczyk, K.1    Schmid, C.2
  • 92
    • 55449128654
    • Recognizing multitasked activities using stochastic context-free grammar
    • Moore D, Essa I (2001) Recognizing multitasked activities using stochastic context-free grammar. In: Proceedings of AAAI conference
    • (2001) Proceedings of AAAI conference
    • Moore, D.1    Essa, I.2
  • 94
    • 77951750177
    • YouTube scale, large vocabulary video annotation
    • Video search and mining, Chapter 14. Springer-Verlag series on studies in computational intelligence
    • Morsillo N, Mann G, Pal C (2010) YouTube scale, large vocabulary video annotation, Chapter 14 in video search and mining. Springer-Verlag series on studies in computational intelligence. Springer, Berlin, pp 357–386
    • (2010) Video search and mining , pp. 357-386
    • Morsillo, N.1    Mann, G.2    Pal, C.3
  • 102
    • 79952952363
    • Spatiotemporal localization and categorization of human actions in unsegmented image sequences
    • Oikonomopoulos A, Patras I, Pantic M (2011) Spatiotemporal localization and categorization of human actions in unsegmented image sequences. IEEE Trans Image Process 20(4):1126–1140
    • (2011) IEEE Trans Image Process , vol.20 , Issue.4 , pp. 1126-1140
    • Oikonomopoulos, A.1    Patras, I.2    Pantic, M.3
  • 103
    • 0036647193
    • Multiresolution gray-scale and rotation invariant texture classification with local binary patterns
    • Ojala T, Pietikainen M, Maenpaa T (2002) Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans Pattern Anal Mach Intell 24(7):971–987
    • (2002) IEEE Trans Pattern Anal Mach Intell , vol.24 , Issue.7 , pp. 971-987
    • Ojala, T.1    Pietikainen, M.2    Maenpaa, T.3
  • 104
    • 0035328421
    • Modeling the shape of the scene: a holistic representation of the spatial envelope
    • Oliva A, Torralba A (2001) Modeling the shape of the scene: a holistic representation of the spatial envelope. Int J Comput Vision 42:145–175
    • (2001) Int J Comput Vision , vol.42 , pp. 145-175
    • Oliva, A.1    Torralba, A.2
  • 111
    • 77949275097
    • Survey on vision-based human action recognition
    • Poppe R (2010) Survey on vision-based human action recognition. Image Vision Comput 28(6):976–990
    • (2010) Image Vision Comput , vol.28 , Issue.6 , pp. 976-990
    • Poppe, R.1
  • 115
    • 0034313871
    • The earth mover’s distance as a metric for image retrieval
    • Rubner Y, Tomasi C, Guibas LJ (2000) The earth mover’s distance as a metric for image retrieval. Int J Comput Vision 40(2):99–121
    • (2000) Int J Comput Vision , vol.40 , Issue.2 , pp. 99-121
    • Rubner, Y.1    Tomasi, C.2    Guibas, L.J.3
  • 116
    • 39749186006
    • LabelMe: a database and web-based tool for image annotation
    • Russell B, Torralba A, Murphy K, Freeman WT (2008) LabelMe: a database and web-based tool for image annotation. Int J Comput Vision 77(1–3):157–173
    • (2008) Int J Comput Vision , vol.77 , Issue.1-3 , pp. 157-173
    • Russell, B.1    Torralba, A.2    Murphy, K.3    Freeman, W.T.4
  • 118
    • 27844565238
    • Event detection in field sports video using audio-visual features and a support vector machine
    • Sadlier DA, O’Connor NE (2005) Event detection in field sports video using audio-visual features and a support vector machine. IEEE Trans Circ Syst Video Technol 15(10):1225–1233
    • (2005) IEEE Trans Circ Syst Video Technol , vol.15 , Issue.10 , pp. 1225-1233
    • Sadlier, D.A.1    O’Connor, N.E.2
  • 128
    • 34547401486
    • Evaluation campaigns and TRECVid
    • Proceedings of ACM international workshop on multimedia information retrieval
    • Smeaton AF, Over P, Kraaij W (2006) Evaluation campaigns and TRECVid. In: Proceedings of ACM international workshop on multimedia information retrieval
    • (2006) Proceedings of ACM international workshop on multimedia information retrieval
    • Smeaton, A.F.1    Over, P.2    Kraaij, W.3
  • 131
    • 0003459124
    • Visual recognition of American Sign Language using hidden Markov models
    • Starner TE (1995) Visual recognition of American Sign Language using hidden Markov models. Ph.D. thesis
    • (1995) Ph.D. thesis
    • Starner, T.E.1
  • 133
  • 147
    • 79551480483
    • Stacked denoising autoencoders: learning useful representations in a deep network with a local denoising criterion
    • Vincent P, Larochelle H, Lajoie I, Bengio Y, Manzagol PA (2010) Stacked denoising autoencoders: learning useful representations in a deep network with a local denoising criterion. J Mach Learn Res 11(12):3371–3408
    • (2010) J Mach Learn Res , vol.11 , Issue.12 , pp. 3371-3408
    • Vincent, P.1    Larochelle, H.2    Lajoie, I.3    Bengio, Y.4    Manzagol, P.A.5
  • 150
    • 80052877143
    • Action recognition by dense trajectories
    • Proceedings of IEEE conference on computer vision and pattern recognition
    • Wang H, Klaser A, Schmid C, Liu CL (2011) Action recognition by dense trajectories. In: Proceedings of IEEE conference on computer vision and pattern recognition
    • (2011) Proceedings of IEEE conference on computer vision and pattern recognition
    • Wang, H.1    Klaser, A.2    Schmid, C.3    Liu, C.L.4
  • 155
    • 33750025833
    • Free viewpoint action recognition using motion history volumes
    • Weinland D, Ronfard R, Boyer E (2006) Free viewpoint action recognition using motion history volumes. Comput Vision Image Underst 104(2):249–257
    • (2006) Comput Vision Image Underst , vol.104 , Issue.2 , pp. 249-257
    • Weinland, D.1    Ronfard, R.2    Boyer, E.3
  • 160
    • 2142771243
    • Structure analysis of soccer video with domain knowledge and hidden Markov models
    • Xie L, Xu P, Chang SF, Divakaran A, Sun H (2004) Structure analysis of soccer video with domain knowledge and hidden Markov models. Pattern Recognit Lett 25(7):767–775
    • (2004) Pattern Recognit Lett , vol.25 , Issue.7 , pp. 767-775
    • Xie, L.1    Xu, P.2    Chang, S.F.3    Divakaran, A.4    Sun, H.5
  • 161
    • 41549084805
    • A novel framework for semantic annotation and personalized retrieval of sports video
    • Xu C, Wang J, Lu H, Zhang Y (2008) A novel framework for semantic annotation and personalized retrieval of sports video. IEEE Trans Multimedia 10(3):421–436
    • (2008) IEEE Trans Multimedia , vol.10 , Issue.3 , pp. 421-436
    • Xu, C.1    Wang, J.2    Lu, H.3    Zhang, Y.4
  • 162
    • 54749131961
    • Video event recognition using kernel methods with multilevel temporal alignment
    • Xu D, Chang SF (2008) Video event recognition using kernel methods with multilevel temporal alignment. IEEE Trans Pattern Anal Mach Intell 30(11):1985–1997
    • (2008) IEEE Trans Pattern Anal Mach Intell , vol.30 , Issue.11 , pp. 1985-1997
    • Xu, D.1    Chang, S.F.2
  • 167
    • 77954862144
    • I2T: Image parsing to text description
    • Yao B, Yang X, Lin L, Lee M, Zhu S (2010) I2T: Image parsing to text description. Proc IEEE 98(8):1485–1508
    • (2010) Proc IEEE , vol.98 , Issue.8 , pp. 1485-1508
    • Yao, B.1    Yang, X.2    Lin, L.3    Lee, M.4    Zhu, S.5
  • 174
    • 33846580425
    • Local features and kernels for classification of texture and object categories: a comprehensive study
    • Zhang J, Marszalek M, Lazebnik S, Schmid C (2007) Local features and kernels for classification of texture and object categories: a comprehensive study. Int J Comput Vision 73(2):213–238
    • (2007) Int J Comput Vision , vol.73 , Issue.2 , pp. 213-238
    • Zhang, J.1    Marszalek, M.2    Lazebnik, S.3    Schmid, C.4


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.