SCOPUS Information Retrieval Platform

International Journal of Computer Vision

Volume 118, Issue 2, 2016, Pages 256-273

A Deep Structured Model with Radius–Margin Bound for 3D Human Activity Recognition

(6) Lin, Liang a,b Wang, Keze a,b Zuo, Wangmeng c Wang, Meng d Luo, Jiebo e Zhang, Lei b

a SUN YAT SEN UNIVERSITY (China)

b HONG KONG POLYTECHNIC UNIVERSITY (Hong Kong)

c HARBIN INSTITUTE OF TECHNOLOGY (China)

d HEFEI UNIVERSITY OF TECHNOLOGY (China)

e University of Rochester ^* (United States)

Author keywords

Deep learning; Human action and activity; RGB depth analysis; Structured model

Indexed keywords

ITERATIVE METHODS; LEARNING ALGORITHMS; NETWORK LAYERS; NEURAL NETWORKS; PATTERN RECOGNITION;

CONVOLUTIONAL NEURAL NETWORK; DEPTH ANALYSIS; GENERALIZATION PERFORMANCE; HUMAN ACTIONS; HUMAN ACTIVITY RECOGNITION; REGULARIZATION TERMS; STATE-OF-THE-ART APPROACH; STRUCTURED MODEL;

DEEP LEARNING;

EID: 84952023029 PISSN: 09205691 EISSN: 15731405 Source Type: Journal
DOI: 10.1007/s11263-015-0876-z Document Type: Article

Times cited : (91)

References (57)

1
- 84866674206
- Sum-product networks for modeling activities with stochastic structure
- Amer, M. R., & Todorovic, S. (2012). Sum-product networks for modeling activities with stochastic structure. In CVPR, pp 1314–1321
- (2012) In CVPR , pp. 1314-1321
- Amer, M.R.¹ Todorovic, S.²

2
- 85083953811
- Bayer, J., Osendorfer, C., Korhammer, D., Chen, N., Urban, S., & van der Smagt, P. In Proc. ICLR
- Bayer, J., Osendorfer, C., Korhammer, D., Chen, N., Urban, S., & van der Smagt, P. (2014). On fast dropout and its applicability to recurrent networks. In Proc. ICLR
- (2014) On fast dropout and its applicability to recurrent networks.

3
- 84856661125
- Learning spatiotemporal graphs of human activities. In: ICCV
- Brendel, W., & Todorovic, S. (2011). Learning spatiotemporal graphs of human activities. In: ICCV, pp 778–785
- (2011) pp 778–785
- Brendel, W.¹ Todorovic, S.²

4
- 0036161011
- Choosing multiple parameters for support vector machines
- Chapelle, O., Vapnik, V., Bousquet, O., & Mukherjee, S. (2002). Choosing multiple parameters for support vector machines. Machine Learning, 46(1–3), 131–159.
- (2002) Machine Learning , vol.46 , Issue.1-3 , pp. 131-159
- Chapelle, O.¹ Vapnik, V.² Bousquet, O.³ Mukherjee, S.⁴

5
- 84875494948
- A survey of video datasets for human action and activity recognition
- Chaquet, J. M., Carmona, E. J., & Fernandez-Caballero, A. (2013). A survey of video datasets for human action and activity recognition. Computer Vision and Image Understanding, 117(6), 633–659.
- (2013) Computer Vision and Image Understanding , vol.117 , Issue.6 , pp. 633-659
- Chaquet, J.M.¹ Carmona, E.J.² Fernandez-Caballero, A.³

6
- 84455205109
- Human group activity analysis with fusion of motion and appearance information
- Cheng, Z., Qin, L., Huang, Q., Jiang, S., Yan, S., & Tian, Q. (2011). Human group activity analysis with fusion of motion and appearance information. In ACM Multimedia, pp 1401–1404
- (2011) In ACM Multimedia , pp. 1401-1404
- Cheng, Z.¹ Qin, L.² Huang, Q.³ Jiang, S.⁴ Yan, S.⁵ Tian, Q.⁶

7
- 0141430928
- Radius margin bounds for support vector machines with the rbf kernel
- Chung, K. M., Kao, W. C., Sun, C. L., Wang, L. L., & Lin, C. J. (2003). Radius margin bounds for support vector machines with the rbf kernel. Neural Computation, 15(11), 2643–2681.
- (2003) Neural Computation , vol.15 , Issue.11 , pp. 2643-2681
- Chung, K.M.¹ Kao, W.C.² Sun, C.L.³ Wang, L.L.⁴ Lin, C.J.⁵

8
- 84897556574
- Convex formulations of radius-margin based support vector machines
- Do, H., & Kalousis, A. (2013). Convex formulations of radius-margin based support vector machines. In: ICML
- (2013) In: ICML
- Do, H.¹ Kalousis, A.²

9
- 70350633050
- Do, H., Kalousis, A., & Hilario, M. (Vol. 5781, pp. 315–329)., Lecture Notes in Computer Science Berlin Heidelberg: Springer
- Do, H., Kalousis, A., & Hilario, M. (2009). Feature weighting using margin and radius based error bound optimization in svms. Machine Learning and Knowledge Discovery in Databases (Vol. 5781, pp. 315–329)., Lecture Notes in Computer Science Berlin Heidelberg: Springer.
- (2009) Feature weighting using margin and radius based error bound optimization in svms. Machine Learning and Knowledge Discovery in Databases

10
- 84959236502
- Donahue, J., Hendricks, L., Guadarrama, S., Rohrbach, M., Venugopalan, S., Saenko, K., Darrell, T. In CVPR
- Donahue, J., Hendricks, L., Guadarrama, S., Rohrbach, M., Venugopalan, S., Saenko, K., Darrell, T. (2015). Long-term recurrent convolutional networks for visual recognition and description. In CVPR
- (2015) Long-term recurrent convolutional networks for visual recognition and description.

11
- 77955422240
- Object detection with discriminatively trained part based models
- Felzenszwalb, P. F., Girshick, R. B., McAllester, D., & Ramanan, D. (2010). Object detection with discriminatively trained part based models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(9), 1627–1645.
- (2010) IEEE Transactions on Pattern Analysis and Machine Intelligence , vol.32 , Issue.9 , pp. 1627-1645
- Felzenszwalb, P.F.¹ Girshick, R.B.² McAllester, D.³ Ramanan, D.⁴

12
- 85119025686
- Girshick, R., Donahue, J., Darrell, T., & Malik, J. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- Girshick, R., Donahue, J., Darrell, T., & Malik, J. (2014). Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- (2014) Rich feature hierarchies for accurate object detection and semantic segmentation.

13
- 84887418625
- Human activities recognition using depth images
- Gupta, R., Chia, A. Y., Rajan, D., Ng E. S., & Lung, E. H. (2013). Human activities recognition using depth images. In ACM Multimedia pp 283–292
- (2013) In ACM Multimedia , pp. 283-292
- Gupta, R.¹ Chia, A.Y.² Rajan, D.³ Ng, E.S.⁴ Lung, E.H.⁵

14
- 33746600649
- Reducing the dimensionality of data with neural networks
- Hinton, G. E., & Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786), 504–507.
- (2006) Science , vol.313 , Issue.5786 , pp. 504-507
- Hinton, G.E.¹ Salakhutdinov, R.R.²

15
- 33845597145
- Large-scale learning with svm and convolutional nets for generic object categorization
- Huang, F. J., & LeCun, Y. (2006). Large-scale learning with svm and convolutional nets for generic object categorization. In CVPR, pp 284–291
- (2006) In CVPR , pp. 284-291
- Huang, F.J.¹ LeCun, Y.²

16
- 84870183903
- 3d convolutional neural networks for human action recognition
- Ji, S., Xu, W., Yang, M., & Yu, K. (2013). 3d convolutional neural networks for human action recognition. IEEE Trans Pattern Anal Mach Intell, 35(1), 221–231.
- (2013) IEEE Trans Pattern Anal Mach Intell , vol.35 , Issue.1 , pp. 221-231
- Ji, S.¹ Xu, W.² Yang, M.³ Yu, K.⁴

17
- 84957580317
- Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., & Fei-Fei, L. In CVPR
- Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., & Fei-Fei, L. (2014). Large-scale video classification with convolutional neural networks. In CVPR
- (2014) Large-scale video classification with convolutional neural networks.

18
- 84880311243
- Learning human activities and object affordances from rgb-d videos
- Koppula, H. S., Gupta, R., & Saxena, A. (2013). Learning human activities and object affordances from rgb-d videos. International Journal of Robotics Research (IJRR), 32(8), 951–970.
- (2013) International Journal of Robotics Research (IJRR) , vol.32 , Issue.8 , pp. 951-970
- Koppula, H.S.¹ Gupta, R.² Saxena, A.³

19
- 84897508000
- Learning spatio-temporal structure from rgb-d videos for human activity detection and anticipation
- Koppula, H. S., & Saxena, A. (2013). Learning spatio-temporal structure from rgb-d videos for human activity detection and anticipation. In ICML pp 792–800
- (2013) In ICML , pp. 792-800
- Koppula, H.S.¹ Saxena, A.²

20
- 84876231242
- Imagenet classification with deep convolutional neural networks
- Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems, pp 1097–1105
- (2012) In Advances in neural information processing systems , pp. 1097-1105
- Krizhevsky, A.¹ Sutskever, I.² Hinton, G.E.³

21
- 84869785889
- LeCun, Y., Boser, B., Denker, J., Henderson, D., Howard, R., Hubbard, W., & Jackel, L. D. In Advances in neural information processing systems
- LeCun, Y., Boser, B., Denker, J., Henderson, D., Howard, R., Hubbard, W., & Jackel, L. D. (1990). Handwritten digit recognition with a back-propagation network. In Advances in neural information processing systems
- (1990) Handwritten digit recognition with a back-propagation network.

22
- 84887476984
- Learning latent spatio-temporal compositional model for human action recognition
- Liang, X., Lin, L., & Cao, L. (2013). Learning latent spatio-temporal compositional model for human action recognition. In ACM Multimedia, pp 263–272
- (2013) In ACM Multimedia , pp. 263-272
- Liang, X.¹ Lin, L.² Cao, L.³

23
- 84926429586
- Discriminatively trained and-or graph models for object shape detection
- Lin, L., Wang, X., Yang, W., & Lai, J. H. (2015). Discriminatively trained and-or graph models for object shape detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37(5), 959–972.
- (2015) IEEE Transactions on Pattern Analysis and Machine Intelligence , vol.37 , Issue.5 , pp. 959-972
- Lin, L.¹ Wang, X.² Yang, W.³ Lai, J.H.⁴

24
- 62349137210
- A stochastic graph grammar for compositional object representation and recognition
- Lin, L., Wu, T., Porway, J., & Xu, Z. (2009). A stochastic graph grammar for compositional object representation and recognition. Pattern Recognition, 42(7), 1297–1307.
- (2009) Pattern Recognition , vol.42 , Issue.7 , pp. 1297-1307
- Lin, L.¹ Wu, T.² Porway, J.³ Xu, Z.⁴

25
- 85119024205
- Luo, P., Tian, Y., Wang, X., & Tang, X. In CVPR
- Luo, P., Tian, Y., Wang, X., & Tang, X. (2014). Switchable deep network for pedestrian detection. In CVPR
- (2014) Switchable deep network for pedestrian detection.

26
- 84898796864
- A deep sum-product architecture for robust facial attributes analysis
- Luo, P., Wang, X., & Tang, X. (2013a). A deep sum-product architecture for robust facial attributes analysis. In ICCV, pp 2864–2871
- (2013) In ICCV , pp. 2864-2871
- Luo, P.¹ Wang, X.² Tang, X.³

27
- 84898770979
- Pedestrian parsing via deep decompositional neural network
- Luo, P., Wang, X., & Tang, X. (2013b). Pedestrian parsing via deep decompositional neural network. In ICCV, pp 2648–2655
- (2013) In ICCV , pp. 2648-2655
- Luo, P.¹ Wang, X.² Tang, X.³

28
- 84977441029
- Rgbd-hudaact: A color-depth video database for human daily activity recognition. Consumer Depth Cameras for Computer Vision, Lecture Notes in Computer Science (pp. 193–208)
- Ni, B., Wang, G., & Moulin, P. (2013a). Rgbd-hudaact: A color-depth video database for human daily activity recognition. Consumer Depth Cameras for Computer Vision, Lecture Notes in Computer Science (pp. 193–208). Springer.
- (2013) Springer
- Ni, B.¹ Wang, G.² Moulin, P.³

29
- 84881515103
- Integrating multi-stage depth-induced contextual information for human action recognition and localization
- Ni, B., YPei, Z.L., Lin, L., & Moulin, P. (2013b). Integrating multi-stage depth-induced contextual information for human action recognition and localization. In International Conference and Workshops on Automatic Face and Gesture Recognition, pp 1–8
- (2013) In International Conference and Workshops on Automatic Face and Gesture Recognition , pp. 1-8
- Ni, B.¹ YPei, Z.L.² Lin, L.³ Moulin, P.⁴

30
- 84887375927
- Hon4d: Histogram of oriented 4d normals for activity recognition from depth sequences
- Oreifej, O., & Liu, Z. (2013). Hon4d: Histogram of oriented 4d normals for activity recognition from depth sequences. In CVPR, pp 716–723
- (2013) In CVPR , pp. 716-723
- Oreifej, O.¹ Liu, Z.²

31
- 84866717619
- A combined pose, object, and feature model for action understanding
- Packer, B., Saenko, K., & Koller, D. (2012). A combined pose, object, and feature model for action understanding. In CVPR, pp 1378–1385
- (2012) In CVPR , pp. 1378-1385
- Packer, B.¹ Saenko, K.² Koller, D.³

32
- 84856646751
- Parsing video events with goal inference and intent prediction
- Pei, M., Jia, Y., & Zhu, S. (2011). Parsing video events with goal inference and intent prediction. In ICCV, pp 487–494
- (2011) In ICCV , pp. 487-494
- Pei, M.¹ Jia, Y.² Zhu, S.³

33
- 84866718894
- Action bank: A high-level representation of activity in video
- Sadanand, S., & Corso, J. J. (2012). Action bank: A high-level representation of activity in video. In CVPR, pp 1234–1241
- (2012) In CVPR , pp. 1234-1241
- Sadanand, S.¹ Corso, J.J.²

34
- 37849037402
- A 3-dimensional sift descriptor and its application to action recognition
- Scovanner, P., Ali, S., & Shah, M. (2007) A 3-dimensional sift descriptor and its application to action recognition. In ACM Multimedia, pp 357–360
- (2007) In ACM Multimedia , pp. 357-360
- Scovanner, P.¹ Ali, S.² Shah, M.³

35
- 84887328988
- Sermanet, P., Kavukcuoglu, K., Chintala, S., & LeCun, Y
- Sermanet, P., Kavukcuoglu, K., Chintala, S., & LeCun, Y. (2013). Pedestrian detection with unsupervised multi-stage feature learning. In CVPR
- (2013) Pedestrian detection with unsupervised multi-stage feature learning. In CVPR

36
- 84904163933
- Dropout: A simple way to prevent neural networks from overfitting
- Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. (2014). Dropout: A simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, 15(1), 1929–1958.
- (2014) The Journal of Machine Learning Research , vol.15 , Issue.1 , pp. 1929-1958
- Srivastava, N.¹ Hinton, G.² Krizhevsky, A.³ Sutskever, I.⁴ Salakhutdinov, R.⁵

37
- 84864487638
- Unstructured human activity detection from rgbd images
- Sung, J., Ponce, C., Selman, B., & Saxena, A. (2012) Unstructured human activity detection from rgbd images. In ICRA, pp 842–849
- (2012) In ICRA , pp. 842-849
- Sung, J.¹ Ponce, C.² Selman, B.³ Saxena, A.⁴

38
- 84866658784
- Learning latent temporal structure for complex event detection
- Tang, K., Fei-Fei, L., & Koller, D. (2012). Learning latent temporal structure for complex event detection. In CVPR, pp 1250–1257
- (2012) In CVPR , pp. 1250-1257
- Tang, K.¹ Fei-Fei, L.² Koller, D.³

39
- 84901405262
- Joint video and text parsing for understanding events and answering queries
- Tu, K., Meng, M., Lee, M. W., Choi, T., & Zhu, S. (2014). Joint video and text parsing for understanding events and answering queries. IEEE Transactions on Multimedia, 21(2), 42–70.
- (2014) IEEE Transactions on Multimedia , vol.21 , Issue.2 , pp. 42-70
- Tu, K.¹ Meng, M.² Lee, M.W.³ Choi, T.⁴ Zhu, S.⁵

40
- 0003991806
- New York: John Wiley and Sons
- Vapnik, V. (1998). Statistical learning theory. New York: John Wiley and Sons.
- (1998) Statistical learning theory
- Vapnik, V.¹

41
- 84944069490
- Venugopalan, S., Xu, H., Donahue, J., Rohrbach, M., Mooney, R., & Saenko, K. In North American Chapter of the Association for Computational Linguistics
- Venugopalan, S., Xu, H., Donahue, J., Rohrbach, M., Mooney, R., & Saenko, K. (2015). Translating videos to natural language using deep recurrent neural networks. In North American Chapter of the Association for Computational Linguistics
- (2015) Translating videos to natural language using deep recurrent neural networks.

42
- 84866672692
- Mining actionlet ensemble for action recognition with depth cameras
- Wang, J., Liu, Z., Wu, Y., & Yuan, J. (2012). Mining actionlet ensemble for action recognition with depth cameras. In CVPR, pp 1290–1297
- (2012) In CVPR , pp. 1290-1297
- Wang, J.¹ Liu, Z.² Wu, Y.³ Yuan, J.⁴

43
- 79957467077
- Hidden part models for human action recognition: Probabilistic vs. max-margin
- Wang, Y., & Mori, G. (2011). Hidden part models for human action recognition: Probabilistic vs. max-margin. IEEE Trans Pattern Anal Mach Intell, 33(7), 1310–1323.
- (2011) IEEE Trans Pattern Anal Mach Intell , vol.33 , Issue.7 , pp. 1310-1323
- Wang, Y.¹ Mori, G.²

44
- 84977456312
- In ACM MM
- Wang, K., Wang, X., Lin, L. (2014). 3d human activity recognition with reconfigurable convolutional neural networks. In ACM MM
- (2014) 3d human activity recognition with reconfigurable convolutional neural networks
- Wang, K.¹ Wang, X.² Lin, L.³

45
- 84887346790
- An approach to pose-based action recognition
- Wang, C., Wang, Y., & Yuille, A. L. (2013). An approach to pose-based action recognition. In CVPR, pp 915–922
- (2013) In CVPR , pp. 915-922
- Wang, C.¹ Wang, Y.² Yuille, A.L.³

46
- 84898794902
- Learning maximum margin temporal warping for action recognition
- Wang, J., & Wu, Y. (2013) Learning maximum margin temporal warping for action recognition. In ICCV, pp 2688–2695
- (2013) In ICCV , pp. 2688-2695
- Wang, J.¹ Wu, Y.²

47
- 0002210265
- On the convergence properties of the em algorithm
- Wu, C. F. J. (1983). On the convergence properties of the em algorithm. Annals of Statistics, 11(1), 95–103.
- (1983) Annals of Statistics , vol.11 , Issue.1 , pp. 95-103
- Wu, C.F.J.¹

48
- 84887419657
- Online multimodal deep similarity learning with application to image retrieval
- Wu, P., Hoi, S., Xia, H., Zhao, P., Wang, D., & Miao, C. (2013). Online multimodal deep similarity learning with application to image retrieval. In ACM Multimedia, pp 153–162
- (2013) In ACM Multimedia , pp. 153-162
- Wu, P.¹ Hoi, S.² Xia, H.³ Zhao, P.⁴ Wang, D.⁵ Miao, C.⁶

49
- 84887324355
- Spatio-temporal depth cuboid similarity feature for activity recognition using depth camera
- Xia, L., & Aggarwal, J. (2013) Spatio-temporal depth cuboid similarity feature for activity recognition using depth camera. In CVPR, pp 2834–2841
- (2013) In CVPR , pp. 2834-2841
- Xia, L.¹ Aggarwal, J.²

50
- 84865033379
- View invariant human action recognition using histograms of 3d joints. In Computer Vision and Pattern Recognition Workshops (CVPRW), 2012 IEEE Computer Society Conference on
- Xia, L., Chen, C., & Aggarwal, J. (2012a). View invariant human action recognition using histograms of 3d joints. In Computer Vision and Pattern Recognition Workshops (CVPRW), 2012 IEEE Computer Society Conference on, IEEE, pp 20–27
- (2012) IEEE , pp. 20-27
- Xia, L.¹ Chen, C.² Aggarwal, J.³

51
- 84865033379
- View invariant human action recognition using histograms of 3d joints
- Xia, L., Chen, C., & Aggarwal, J. K. (2012b). View invariant human action recognition using histograms of 3d joints. In CVPRW, pp 20–27
- (2012) In CVPRW , pp. 20-27
- Xia, L.¹ Chen, C.² Aggarwal, J.K.³

52
- 84871394796
- Recognizing actions using depth motion maps-based histograms of oriented gradients
- Yang, X., Zhang, C., & Tian, Y. (2012). Recognizing actions using depth motion maps-based histograms of oriented gradients. In ACM Multimedia, pp 1057–1060
- (2012) In ACM Multimedia , pp. 1057-1060
- Yang, X.¹ Zhang, C.² Tian, Y.³

53
- 80052889296
- Learning image representations from the pixel level via hierarchical sparse coding
- Yu, K., Lin, Y., & Lafferty, J. (2011). Learning image representations from the pixel level via hierarchical sparse coding. In CVPR, pp 1713–1720
- (2011) In CVPR , pp. 1713-1720
- Yu, K.¹ Lin, Y.² Lafferty, J.³

54
- 84865015840
- Yun, K., Honorio, J., Chattopadhyay, D., Berg, T. L., & Samaras, D. In Computer Vision and Pattern Recognition Workshops (CVPRW), 2012 IEEE Computer Society Conference on, IEEE
- Yun, K., Honorio, J., Chattopadhyay, D., Berg, T. L., & Samaras, D. (2012) Two-person interaction detection using body-pose features and multiple instance learning. In Computer Vision and Pattern Recognition Workshops (CVPRW), 2012 IEEE Computer Society Conference on, IEEE
- (2012) Two-person interaction detection using body-pose features and multiple instance learning.

55
- 84887474318
- Exploring discriminative pose sub-patterns for effective action classification. In: ACM Multimedia
- Zhao, X., Liu, Y., & Fu, Y. (2013). Exploring discriminative pose sub-patterns for effective action classification. In: ACM Multimedia, pp 273–282
- (2013) pp 273–282
- Zhao, X.¹ Liu, Y.² Fu, Y.³

56
- 70350676914
- Sift-bag kernel for video event analysis
- Zhou, X., Zhuang, X., Yan, S., Chang, S. F., Johnson, M. H., & Huang, T. S. (2009) Sift-bag kernel for video event analysis. In ACM Multimedia, pp 229–238
- (2009) In ACM Multimedia , pp. 229-238
- Zhou, X.¹ Zhuang, X.² Yan, S.³ Chang, S.F.⁴ Johnson, M.H.⁵ Huang, T.S.⁶

57
- 34548726226
- A stochastic grammar of images
- Zhu, S., & Mumford, D. (2007). A stochastic grammar of images. Foundations and Trends in Computer Graphics and Vision, 2(4), 259–362.
- (2007) Foundations and Trends in Computer Graphics and Vision , vol.2 , Issue.4 , pp. 259-362
- Zhu, S.¹ Mumford, D.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.