SCOPUS Information Retrieval Platform

MM 2013 - Proceedings of the 2013 ACM Multimedia Conference

2013, Pages 263-272

Learning latent spatio-temporal compositional model for human action recognition

(3) Liang, Xiaodan (a); Lin, Liang (a); Cao, Liangliang (b)

(a) Sun Yat-sen University (China)

(b) IBM Research (United States)

Author keywords

Action recognition; And-Or graph; Structural learning; Video understanding

Indexed keywords

ACTION RECOGNITION; AND-OR GRAPH; COMPOSITIONAL MODELING; HUMAN-ACTION RECOGNITION; SPATIO-TEMPORAL STRUCTURES; STRUCTURAL CONFIGURATIONS; STRUCTURAL LEARNING; VIDEO UNDERSTANDING; GESTURE RECOGNITION; ITERATIVE METHODS; MOTION ESTIMATION

EID: 84887476984
PISSN: None
EISSN: None
Source Type: Conference Proceeding
DOI: 10.1145/2502081.2502089
Document Type: Conference Paper

Times cited: 32

References (47)

1
- 57849120833
- A constrained probabilistic petri net framework for human activity detection in video
- M. Albanese, R. Chellappa, N. P. Cuntoor, V. Moscato, A. Picariello, V. S. Subrahmanian, and O. Udrea. A constrained probabilistic petri net framework for human activity detection in video. IEEE Transactions on Multimedia, 10(6):982-996, 2008.
- (2008) IEEE Transactions on Multimedia , vol.10 , Issue.6 , pp. 982-996
- Albanese, M.¹ Chellappa, R.² Cuntoor, N.P.³ Moscato, V.⁴ Picariello, A.⁵ Subrahmanian, V.S.⁶ Udrea, O.⁷

2
- 77957182145
- Actions and events in interval temporal logic
- J. F. Allen and G. Ferguson. Actions and events in interval temporal logic. J. Log. Comput., 4(5):531-579, 1994.
- (1994) J. Log. Comput. , vol.4 , Issue.5 , pp. 531-579
- Allen, J.F.¹ Ferguson, G.²

3
- 84867859826
- Cost-sensitive top-down/bottom-up inference for multiscale activity recognition
- M. R. Amer, D. Xie, M. Zhao, S. Todorovic, and S. C. Zhu. Cost-sensitive top-down/bottom-up inference for multiscale activity recognition. In ECCV (4), pages 187-200, 2012.
- (2012) ECCV , Issue.4 , pp. 187-200
- Amer, M.R.¹ Xie, D.² Zhao, M.³ Todorovic, S.⁴ Zhu, S.C.⁵

4
- 84856661125
- Learning spatiotemporal graphs of human activities
- W. Brendel and S. Todorovic. Learning spatiotemporal graphs of human activities. In ICCV, pages 778-785, 2011.
- (2011) ICCV , pp. 778-785
- Brendel, W.¹ Todorovic, S.²

5
- 77955989314
- Cross-dataset action detection
- L. Cao, Z. Liu, and T. S. Huang. Cross-dataset action detection. In CVPR, pages 1998-2005, 2010.
- (2010) CVPR , pp. 1998-2005
- Cao, L.¹ Liu, Z.² Huang, T.S.³

6
- 80052874112
- Learning context for collective activity recognition
- W. Choi, K. Shahid, and S. Savarese. Learning context for collective activity recognition. In CVPR, 2011.
- (2011) CVPR
- Choi, W.¹ Shahid, K.² Savarese, S.³

7
- 33846622081
- Behavior recognition via sparse spatio-temporal features
- P. Dollár, V. Rabaud, G. Cottrell, and S. Belongie. Behavior recognition via sparse spatio-temporal features. In VS-PETS, October 2005.
- (2005) VS-PETS
- Dollár, P.¹ Rabaud, V.² Cottrell, G.³ Belongie, S.⁴

8
- 84871399506
- Discovering video shot categories by unsupervised stochastic graph partition
- X. Duan, L. Lin, and H. Chao. Discovering video shot categories by unsupervised stochastic graph partition. IEEE Transactions on Multimedia, 15(1):167-180, 2013.
- (2013) IEEE Transactions on Multimedia , vol.15 , Issue.1 , pp. 167-180
- Duan, X.¹ Lin, L.² Chao, H.³

9
- 33745164643
- Activity recognition and abnormality detection with the switching hidden semi-markov model
- T. V. Duong, H. H. Bui, D. Q. Phung, and S. Venkatesh. Activity recognition and abnormality detection with the switching hidden semi-markov model. In CVPR (1), pages 838-845, 2005.
- (2005) CVPR , Issue.1 , pp. 838-845
- Duong, T.V.¹ Bui, H.H.² Phung, D.Q.³ Venkatesh, S.⁴

10
- 77955422240
- Object detection with discriminatively trained part-based models
- P. F. Felzenszwalb, R. B. Girshick, D. A. McAllester, and D. Ramanan. Object detection with discriminatively trained part-based models. IEEE Trans. Pattern Anal. Mach. Intell., 32(9):1627-1645, 2010.
- (2010) IEEE Trans. Pattern Anal. Mach. Intell. , vol.32 , Issue.9 , pp. 1627-1645
- Felzenszwalb, P.F.¹ Girshick, R.B.² McAllester, D.A.³ Ramanan, D.⁴

11
- 70450202741
- Understanding videos, constructing plots: Learning a visually grounded storyline model from annotated videos
- A. Gupta, P. Srinivasan, J. Shi, and L. S. Davis. Understanding videos, constructing plots: Learning a visually grounded storyline model from annotated videos. In CVPR, pages 2012-2019, 2009.
- (2009) CVPR , pp. 2012-2019
- Gupta, A.¹ Srinivasan, P.² Shi, J.³ Davis, L.S.⁴

12
- 78149348487
- Object, scene and actions: Combining multiple features for human action recognition
- N. Ikizler-Cinbis and S. Sclaroff. Object, scene and actions: Combining multiple features for human action recognition. In ECCV (1), pages 494-507, 2010.
- (2010) ECCV , Issue.1 , pp. 494-507
- Ikizler-Cinbis, N.¹ Sclaroff, S.²

13
- 84871359352
- Leveraging high-level and low-level features for multimedia event detection
- L. Jiang, A. G. Hauptmann, and G. Xiang. Leveraging high-level and low-level features for multimedia event detection. In ACM Multimedia, pages 449-458, 2012.
- (2012) ACM Multimedia , pp. 449-458
- Jiang, L.¹ Hauptmann, A.G.² Xiang, G.³

14
- 84867849524
- Trajectory-based modeling of human actions with motion reference points
- Y.-G. Jiang, Q. Dai, X. Xue, W. Liu, and C.-W. Ngo. Trajectory-based modeling of human actions with motion reference points. In ECCV (5), 2012.
- (2012) ECCV , Issue.5
- Jiang, Y.-G.¹ Dai, Q.² Xue, X.³ Liu, W.⁴ Ngo, C.-W.⁵

15
- 84898426452
- A spatio-temporal descriptor based on 3d-gradients
- A. Kläser, M. Marszalek, and C. Schmid. A spatio-temporal descriptor based on 3d-gradients. In BMVC, 2008.
- (2008) BMVC
- Kläser, A.¹ Marszalek, M.² Schmid, C.³

16
- 24944451092
- On space-time interest points
- I. Laptev. On space-time interest points. International Journal of Computer Vision, 64(2-3):107-123, 2005.
- (2005) International Journal of Computer Vision , vol.64 , Issue.2-3 , pp. 107-123
- Laptev, I.¹

17
- 51949083365
- Learning realistic human actions from movies
- I. Laptev, M. Marszalek, C. Schmid, and B. Rozenfeld. Learning realistic human actions from movies. In CVPR, 2008.
- (2008) CVPR
- Laptev, I.¹ Marszalek, M.² Schmid, C.³ Rozenfeld, B.⁴

18
- 70350660670
- Real-time human action recognition by luminance field trajectory analysis
- Z. Li, Y. Fu, T. S. Huang, and S. Yan. Real-time human action recognition by luminance field trajectory analysis. In ACM Multimedia, pages 671-676, 2008.
- (2008) ACM Multimedia , pp. 671-676
- Li, Z.¹ Fu, Y.² Huang, T.S.³ Yan, S.⁴

19
- 56049121516
- Semantic event representation and recognition using syntactic attribute graph grammar
- L. Lin, H. Gong, L. Li, and L. Wang. Semantic event representation and recognition using syntactic attribute graph grammar. Pattern Recognition Letters, 30(2):180-186, 2009.
- (2009) Pattern Recognition Letters , vol.30 , Issue.2 , pp. 180-186
- Lin, L.¹ Gong, H.² Li, L.³ Wang, L.⁴

20
- 84866698552
- Learning contour-fragment-based shape model with and-or tree representation
- L. Lin, X. Wang, W. Yang, and J. Lai. Learning contour-fragment-based shape model with and-or tree representation. In CVPR, pages 135-142, 2012.
- (2012) CVPR , pp. 135-142
- Lin, L.¹ Wang, X.² Yang, W.³ Lai, J.⁴

21
- 62349137210
- A stochastic graph grammar for compositional object representation and recognition
- L. Lin, T. Wu, J. Porway, and Z. Xu. A stochastic graph grammar for compositional object representation and recognition. Pattern Recognition, 42(7):1297-1307, 2009.
- (2009) Pattern Recognition , vol.42 , Issue.7 , pp. 1297-1307
- Lin, L.¹ Wu, T.² Porway, J.³ Xu, Z.⁴

22
- 70450203660
- Recognizing realistic actions from videos
- J. Liu, J. Luo, and M. Shah. Recognizing realistic actions from videos. In CVPR, pages 1996-2003, 2009.
- (2009) CVPR , pp. 1996-2003
- Liu, J.¹ Luo, J.² Shah, M.³

23
- 84871363788
- Knowledge adaptation for ad hoc multimedia event detection with few exemplars
- Z. Ma, Y. Yang, Y. Cai, N. Sebe, and A. G. Hauptmann. Knowledge adaptation for ad hoc multimedia event detection with few exemplars. In ACM Multimedia, pages 469-478, 2012.
- (2012) ACM Multimedia , pp. 469-478
- Ma, Z.¹ Yang, Y.² Cai, Y.³ Sebe, N.⁴ Hauptmann, A.G.⁵

24
- 70450177757
- Actions in context
- M. Marszalek, I. Laptev, and C. Schmid. Actions in context. In CVPR, 2009.
- (2009) CVPR
- Marszalek, M.¹ Laptev, I.² Schmid, C.³

25
- 78149353400
- Modeling temporal structure of decomposable motion segments for activity classification
- J. C. Niebles, C.-W. Chen, and F.-F. Li. Modeling temporal structure of decomposable motion segments for activity classification. In ECCV (2), pages 392-405, 2010.
- (2010) ECCV , Issue.2 , pp. 392-405
- Niebles, J.C.¹ Chen, C.-W.² Li, F.-F.³

26
- 84866661728
- Discovering discriminative action parts from mid-level video representations
- M. Raptis, I. Kokkinos, and S. Soatto. Discovering discriminative action parts from mid-level video representations. In CVPR, pages 1242-1249, 2012.
- (2012) CVPR , pp. 1242-1249
- Raptis, M.¹ Kokkinos, I.² Soatto, S.³

27
- 77953187842
- Spatio-temporal relationship match: Video structure comparison for recognition of complex human activities
- M. S. Ryoo and J. K. Aggarwal. Spatio-temporal relationship match: Video structure comparison for recognition of complex human activities. In ICCV, pages 1593-1600, 2009.
- (2009) ICCV , pp. 1593-1600
- Ryoo, M.S.¹ Aggarwal, J.K.²

28
- 84866718894
- Action bank: A high-level representation of activity in video
- S. Sadanand and J. J. Corso. Action bank: A high-level representation of activity in video. In CVPR, pages 1234-1241, 2012.
- (2012) CVPR , pp. 1234-1241
- Sadanand, S.¹ Corso, J.J.²

29
- 78149294036
- Modeling the temporal extent of actions
- S. Satkin and M. Hebert. Modeling the temporal extent of actions. In ECCV (1), pages 536-548, 2010.
- (2010) ECCV , Issue.1 , pp. 536-548
- Satkin, S.¹ Hebert, M.²

30
- 37849037402
- A 3-dimensional sift descriptor and its application to action recognition
- P. Scovanner, S. Ali, and M. Shah. A 3-dimensional sift descriptor and its application to action recognition. In ACM Multimedia, pages 357-360, 2007.
- (2007) ACM Multimedia , pp. 357-360
- Scovanner, P.¹ Ali, S.² Shah, M.³

31
- 84856636962
- Unsupervised learning of event and-or grammar and semantics from video
- Z. Si, M. Pei, B. Yao, and S.-C. Zhu. Unsupervised learning of event and-or grammar and semantics from video. In ICCV, pages 41-48, 2011.
- (2011) ICCV , pp. 41-48
- Si, Z.¹ Pei, M.² Yao, B.³ Zhu, S.-C.⁴

32
- 84866720929
- Multi-view latent variable discriminative models for action recognition
- Y. Song, L.-P. Morency, and R. Davis. Multi-view latent variable discriminative models for action recognition. In CVPR, pages 2120-2127, 2012.
- (2012) CVPR , pp. 2120-2127
- Song, Y.¹ Morency, L.-P.² Davis, R.³

33
- 84887413550
- Exploring probabilistic localized video representation for human action recognition
- Y. Song, S. Tang, Y.-T. Zheng, T.-S. Chua, Y. Zhang, and S. Lin. Exploring probabilistic localized video representation for human action recognition. Multimedia Tools Appl., 58(3):663-685, 2012.
- (2012) Multimedia Tools Appl , vol.58 , Issue.3 , pp. 663-685
- Song, Y.¹ Tang, S.² Zheng, Y.-T.³ Chua, T.-S.⁴ Zhang, Y.⁵ Lin, S.⁶

34
- 70450214829
- Hierarchical spatio-temporal context modeling for action recognition
- J. Sun, X. Wu, S. Yan, L. F. Cheong, T.-S. Chua, and J. Li. Hierarchical spatio-temporal context modeling for action recognition. In CVPR, pages 2004-2011, 2009.
- (2009) CVPR , pp. 2004-2011
- Sun, J.¹ Wu, X.² Yan, S.³ Cheong, L.F.⁴ Chua, T.-S.⁵ Li, J.⁶

35
- 84866658784
- Learning latent temporal structure for complex event detection
- K. Tang, F.-F. Li, and D. Koller. Learning latent temporal structure for complex event detection. In CVPR, pages 1250-1257, 2012.
- (2012) CVPR , pp. 1250-1257
- Tang, K.¹ Li, F.-F.² Koller, D.³

36
- 80052877143
- Action recognition by dense trajectories
- H. Wang, A. Kläser, C. Schmid, and C.-L. Liu. Action recognition by dense trajectories. In CVPR, pages 3169-3176, 2011.
- (2011) CVPR , pp. 3169-3176
- Wang, H.¹ Kläser, A.² Schmid, C.³ Liu, C.-L.⁴

37
- 84866674455
- Action recognition by exploring data distribution and feature correlation
- S. Wang, Y. Yang, Z. Ma, X. Li, C. Pang, and A. G. Hauptmann. Action recognition by exploring data distribution and feature correlation. In CVPR, pages 1370-1377, 2012.
- (2012) CVPR , pp. 1370-1377
- Wang, S.¹ Yang, Y.² Ma, Z.³ Li, X.⁴ Pang, C.⁵ Hauptmann, A.G.⁶

38
- 84877770379
- Dynamical and-or graph learning for object shape modeling and detection
- X. Wang and L. Lin. Dynamical and-or graph learning for object shape modeling and detection. In NIPS, pages 242-250, 2012.
- (2012) NIPS , pp. 242-250
- Wang, X.¹ Lin, L.²

39
- 79957467077
- Hidden part models for human action recognition: Probabilistic versus max margin
- Y. Wang and G. Mori. Hidden part models for human action recognition: Probabilistic versus max margin. IEEE Trans. Pattern Anal. Mach. Intell., 33(7):1310-1323, 2011.
- (2011) IEEE Trans. Pattern Anal. Mach. Intell. , vol.33 , Issue.7 , pp. 1310-1323
- Wang, Y.¹ Mori, G.²

40
- 77953218032
- Learning deformable action templates from cluttered videos
- B. Yao and S. C. Zhu. Learning deformable action templates from cluttered videos. In ICCV, pages 1507-1514, 2009.
- (2009) ICCV , pp. 1507-1514
- Yao, B.¹ Zhu, S.C.²

41
- 84455206064
- Real-time human action search using random forest based hough voting
- G. Yu, J. Yuan, and Z. Liu. Real-time human action search using random forest based hough voting. In ACM Multimedia, pages 1149-1152, 2011.
- (2011) ACM Multimedia , pp. 1149-1152
- Yu, G.¹ Yuan, J.² Liu, Z.³

42
- 80051863221
- Discriminative video pattern search for efficient action detection
- J. Yuan, Z. Liu, and Y. Wu. Discriminative video pattern search for efficient action detection. IEEE Trans. Pattern Anal. Mach. Intell., 33(9):1728-1743, 2011.
- (2011) IEEE Trans. Pattern Anal. Mach. Intell. , vol.33 , Issue.9 , pp. 1728-1743
- Yuan, J.¹ Liu, Z.² Wu, Y.³

43
- 0037686659
- The concave-convex procedure
- A. L. Yuille and A. Rangarajan. The concave-convex procedure. Neural Computation, 15(4):915-936, 2003.
- (2003) Neural Computation , vol.15 , Issue.4 , pp. 915-936
- Yuille, A.L.¹ Rangarajan, A.²

44
- 84867850268
- Spatio-temporal phrases for activity recognition
- Y. Zhang, X. Liu, M.-C. Chang, W. Ge, and T. Chen. Spatio-temporal phrases for activity recognition. In ECCV (3), pages 707-721, 2012.
- (2012) ECCV , Issue.3 , pp. 707-721
- Zhang, Y.¹ Liu, X.² Chang, M.-C.³ Ge, W.⁴ Chen, T.⁵

45
- 70350676914
- Sift-bag kernel for video event analysis
- X. Zhou, X. Zhuang, S. Yan, S.-F. Chang, M. Hasegawa-Johnson, and T. S. Huang. Sift-bag kernel for video event analysis. In ACM Multimedia, pages 229-238, 2008.
- (2008) ACM Multimedia , pp. 229-238
- Zhou, X.¹ Zhuang, X.² Yan, S.³ Chang, S.-F.⁴ Hasegawa-Johnson, M.⁵ Huang, T.S.⁶

46
- 72449171990
- Detecting video events based on action recognition in complex scenes using spatio-temporal descriptor
- G. Zhu, M. Yang, K. Yu, W. Xu, and Y. Gong. Detecting video events based on action recognition in complex scenes using spatio-temporal descriptor. In ACM Multimedia, pages 165-174, 2009.
- (2009) ACM Multimedia , pp. 165-174
- Zhu, G.¹ Yang, M.² Yu, K.³ Xu, W.⁴ Gong, Y.⁵

47
- 34548726226
- A stochastic grammar of images
- S. C. Zhu and D. Mumford. A stochastic grammar of images. Foundations and Trends in Computer Graphics and Vision, 2(4):259-362, 2006.
- (2006) Foundations and Trends in Computer Graphics and Vision , vol.2 , Issue.4 , pp. 259-362
- Zhu, S.C.¹ Mumford, D.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.