SCOPUS Information Retrieval Platform

Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition

Volume 2016-December, 2016, Pages 2658-2667

Actions ~ Transformations

(3) Wang, Xiaolong (a); Farhadi, Ali (b, c); Gupta, Abhinav (a, c)

a CARNEGIE MELLON UNIVERSITY (United States)

b UNIVERSITY OF WASHINGTON (United States)

c Allen Institute for Artificial Intelligence (United States)

Author keywords

[No Author keywords available]

Indexed keywords

COMPUTER VISION;

DEEP LEARNING; HIGH-LEVEL FEATURES; STANDARD ACTION; VIDEO REPRESENTATIONS;

PATTERN RECOGNITION;

EID: 84986268683 PISSN: 10636919 EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/CVPR.2016.291 Document Type: Conference Paper

Times cited : (241)

References (60)

1
- 24644437539
- Signature verification using a siamese time delay neural network
- J. Bromley, I. Guyon, Y. LeCun, E. Sackinger, and R. Shah. Signature verification using a siamese time delay neural network. NIPS, 1993.
- (1993) NIPS
- Bromley, J.¹ Guyon, I.² LeCun, Y.³ Sackinger, E.⁴ Shah, R.⁵

2
- 24644436425
- Learning a similarity metric discriminatively, with application to face verification
- S. Chopra, R. Hadsell, and Y. LeCun. Learning a similarity metric discriminatively, with application to face verification. CVPR, 2005.
- (2005) CVPR
- Chopra, S.¹ Hadsell, R.² LeCun, Y.³

3
- 34948855444
- Human detection using oriented histograms of flow and appearance
- N. Dalal, B. Triggs, and C. Schmid. Human detection using oriented histograms of flow and appearance. In ECCV, 2006.
- (2006) ECCV
- Dalal, N.¹ Triggs, B.² Schmid, C.³

4
- 84959236502
- Long-term recurrent convolutional networks for visual recognition and description
- J. Donahue, L. A. Hendricks, S. Guadarrama, M. Rohrbach, S. Venugopalan, K. Saenko, and T. Darrell. Long-term recurrent convolutional networks for visual recognition and description. In CVPR, 2015.
- (2015) CVPR
- Donahue, J.¹ Hendricks, L.A.² Guadarrama, S.³ Rohrbach, M.⁴ Venugopalan, S.⁵ Saenko, K.⁶ Darrell, T.⁷

5
- 84887336656
- Modeling actions through state changes
- A. Fathi and J. M. Rehg. Modeling actions through state changes. In ICCV, 2013.
- (2013) ICCV
- Fathi, A.¹ Rehg, J.M.²

6
- 84959223985
- Modeling video evolution for action recognition
- B. Fernando, E. Gavves, J. O. M., A. Ghodrati, and T. Tuytelaars. Modeling video evolution for action recognition. In CVPR, 2015.
- (2015) CVPR
- Fernando, B.¹ Gavves, E.² Ghodrati, A.³ Tuytelaars, T.⁴

7
- 84959230113
- Devnet: A deep event network for multimedia event detection and evidence recounting
- C. Gan, N. Wang, Y. Yang, D.-Y. Yeung, and A. G. Hauptmann. Devnet: A deep event network for multimedia event detection and evidence recounting. CVPR, 2015.
- (2015) CVPR
- Gan, C.¹ Wang, N.² Yang, Y.³ Yeung, D.-Y.⁴ Hauptmann, A.G.⁵

8
- 84959196122
- Finding action tubes
- G. Gkioxari and J. Malik. Finding action tubes. In CVPR, 2015.
- (2015) CVPR
- Gkioxari, G.¹ Malik, J.²

9
- 33845594569
- Dimensionality reduction by learning an invariant mapping
- R. Hadsell, S. Chopra, and Y. LeCun. Dimensionality reduction by learning an invariant mapping. CVPR, 2006.
- (2006) CVPR
- Hadsell, R.¹ Chopra, S.² LeCun, Y.³

10
- 84959216468
- Activitynet: A large-scale video benchmark for human activity understanding
- F. C. Heilbron, V. Escorcia, B. Ghanem, and J. C. Niebles. Activitynet: A large-scale video benchmark for human activity understanding. CVPR, 2015.
- (2015) CVPR
- Heilbron, F.C.¹ Escorcia, V.² Ghanem, B.³ Niebles, J.C.⁴

11
- 85009910964
- Deep metric learning using triplet network
- /abs/1412.6622
- E. Hoffer and N. Ailon. Deep metric learning using triplet network. CoRR, /abs/1412.6622, 2014.
- (2014) CoRR
- Hoffer, E.¹ Ailon, N.²

12
- 84911459575
- Discriminative deep metric learning for face verification in the wild
- June
- J. Hu, J. Lu, and Y.-P. Tan. Discriminative deep metric learning for face verification in the wild. In CVPR, June 2014.
- (2014) CVPR
- Hu, J.¹ Lu, J.² Tan, Y.-P.³

13
- 84887479105
- Recognizing complex events using large margin joint low-level event model
- H. Izadinia and M. Shah. Recognizing complex events using large margin joint low-level event model. ECCV, 2012.
- (2012) ECCV
- Izadinia, H.¹ Shah, M.²

14
- 84887337772
- Representing videos using mid-level discriminative patches
- A. Jain, A. Gupta, M. Rodriguez, and L. S. Davis. Representing videos using mid-level discriminative patches. In CVPR, 2013.
- (2013) CVPR
- Jain, A.¹ Gupta, A.² Rodriguez, M.³ Davis, L.S.⁴

15
- 84887398298
- Better exploiting motion for better action recognition
- M. Jain, H. Jegou, and P. Bouthemy. Better exploiting motion for better action recognition. CVPR, 2013.
- (2013) CVPR
- Jain, M.¹ Jegou, H.² Bouthemy, P.³

16
- 84973897623
- Learning image representations tied to ego-motion
- D. Jayaraman and K. Grauman. Learning image representations tied to ego-motion. In ICCV, 2015.
- (2015) ICCV
- Jayaraman, D.¹ Grauman, K.²

17
- 84870183903
- 3d convolutional neural networks for human action recognition
- S. Ji, W. Xu, M. Yang, and K. Yu. 3d convolutional neural networks for human action recognition. TPAMI, 2013.
- (2013) TPAMI
- Ji, S.¹ Xu, W.² Yang, M.³ Yu, K.⁴

18
- 84877645596
- Trajectory-based modeling of human actions with motion reference points
- Y.-G. Jiang, Q. Dai, X. Xue, W. Liu, and C.-W. Ngo. Trajectory-based modeling of human actions with motion reference points. In ECCV, 2012.
- (2012) ECCV
- Jiang, Y.-G.¹ Dai, Q.² Xue, X.³ Liu, W.⁴ Ngo, C.-W.⁵

19
- 84905052261
- Y.-G. Jiang, J. Liu, A. R. Zamir, G. Toderici, I. Laptev, M. Shah, and R. Sukthankar. Thumos challenge: Action recognition with a large number of classes. http://crcv.ucf.edu/THUMOS14/, 2014.
- (2014) Thumos Challenge: Action Recognition with A Large Number of Classes
- Jiang, Y.-G.¹ Liu, J.² Zamir, A.R.³ Toderici, G.⁴ Laptev, I.⁵ Shah, M.⁶ Sukthankar, R.⁷

20
- 79959766559
- Consumer video understanding: A benchmark database and an evaluation of human and machine performance
- Y.-G. Jiang, G. Ye, S.-F. Chang, D. Ellis, and A. C. Loui. Consumer video understanding: A benchmark database and an evaluation of human and machine performance. In ICMR, 2011.
- (2011) ICMR
- Jiang, Y.-G.¹ Ye, G.² Chang, S.-F.³ Ellis, D.⁴ Loui, A.C.⁵

21
- 84911364368
- Large-scale video classification with convolutional neural networks
- A. Karpathy, G. Toderici, S. Shetty, T. Leung, R. Sukthankar, and L. Fei-Fei. Large-scale video classification with convolutional neural networks. In CVPR, 2014.
- (2014) CVPR
- Karpathy, A.¹ Toderici, G.² Shetty, S.³ Leung, T.⁴ Sukthankar, R.⁵ Fei-Fei, L.⁶

22
- 84898426452
- A spatiotemporal descriptor based on 3d-gradients
- A. Klaser, M. Marszalek, and C. Schmid. A spatiotemporal descriptor based on 3d-gradients. In BMVC, 2008.
- (2008) BMVC
- Klaser, A.¹ Marszalek, M.² Schmid, C.³

23
- 84856682691
- Hmdb: A large video database for human motion recognition
- H. Kuehne, H. Jhuang, E. Garrote, T. Poggio, and T. Serre. Hmdb: A large video database for human motion recognition. In ICCV, 2011.
- (2011) ICCV
- Kuehne, H.¹ Jhuang, H.² Garrote, E.³ Poggio, T.⁴ Serre, T.⁵

24
- 84973931670
- Action recognition by hierarchical mid-level action elements
- T. Lan, Y. Zhu, A. R. Zamir, and S. Savarese. Action recognition by hierarchical mid-level action elements. In ICCV, 2015.
- (2015) ICCV
- Lan, T.¹ Zhu, Y.² Zamir, A.R.³ Savarese, S.⁴

25
- 84959241532
- Beyond Gaussian pyramid: Multi-skip feature stacking for action recognition
- Z. Lan, M. Lin, X. Li, A. G. Hauptmann, and B. Raj. Beyond Gaussian pyramid: Multi-skip feature stacking for action recognition. In CVPR, 2015.
- (2015) CVPR
- Lan, Z.¹ Lin, M.² Li, X.³ Hauptmann, A.G.⁴ Raj, B.⁵

26
- 24944451092
- On space-time interest points
- I. Laptev. On space-time interest points. IJCV, 64, 2005.
- (2005) IJCV , vol.64
- Laptev, I.¹

27
- 51949083365
- Learning realistic human actions from movies
- I. Laptev, M. Marszalek, C. Schmid, and B. Rozenfeld. Learning realistic human actions from movies. In CVPR, 2008.
- (2008) CVPR
- Laptev, I.¹ Marszalek, M.² Schmid, C.³ Rozenfeld, B.⁴

28
- 80052874098
- Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis
- Q. V. Le, W. Y. Zou, S. Y. Yeung, and A. Y. Ng. Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis. In CVPR, 2011.
- (2011) CVPR
- Le, Q.V.¹ Zou, W.Y.² Yeung, S.Y.³ Ng, A.Y.⁴

29
- 77953178862
- Trajectons: Action recognition through the motion analysis of tracked features
- P. Matikainen, M. Hebert, and R. Sukthankar. Trajectons: Action recognition through the motion analysis of tracked features. In ICCV Workshops, 2009.
- (2009) ICCV Workshops
- Matikainen, P.¹ Hebert, M.² Sukthankar, R.³

30
- 84959228762
- Beyond short snippets: Deep networks for video classification
- J. Y.-H. Ng, M. Hausknecht, S. Vijayanarasimhan, O. Vinyals, R. Monga, and G. Toderici. Beyond short snippets: Deep networks for video classification. In CVPR, 2015.
- (2015) CVPR
- Ng, J.Y.-H.¹ Hausknecht, M.² Vijayanarasimhan, S.³ Vinyals, O.⁴ Monga, R.⁵ Toderici, G.⁶

31
- 84906500926
- Bag of visual words and fusion methods for action recognition: Comprehensive study and good practice
- abs/1405.4506
- X. Peng, L. Wang, X. Wang, and Y. Qiao. Bag of visual words and fusion methods for action recognition: Comprehensive study and good practice. CoRR, /abs/1405.4506, 2014.
- (2014) CoRR
- Peng, X.¹ Wang, L.² Wang, X.³ Qiao, Y.⁴

32
- 84947130265
- Action recognition with stacked fisher vectors
- X. Peng, C. Zou, Y. Qiao, and Q. Peng. Action recognition with stacked fisher vectors. In ECCV, 2014.
- (2014) ECCV
- Peng, X.¹ Zou, C.² Qiao, Y.³ Peng, Q.⁴

33
- 77949275097
- A survey on vision-based human action recognition
- R. Poppe. A survey on vision-based human action recognition. Image and vision computing, 28(6):976-990, 2010.
- (2010) Image and Vision Computing , vol.28 , Issue.6 , pp. 976-990
- Poppe, R.¹

34
- 84887351648
- Script data for attribute-based recognition of composite activities
- M. Rohrbach, M. Regneri, M. Andriluka, S. Amin, M. Pinkal, and B. Schiele. Script data for attribute-based recognition of composite activities. ECCV, 2012.
- (2012) ECCV
- Rohrbach, M.¹ Regneri, M.² Andriluka, M.³ Amin, S.⁴ Pinkal, M.⁵ Schiele, B.⁶

35
- 84947041871
- Imagenet large scale visual recognition challenge
- O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, and L. Fei-Fei. ImageNet Large Scale Visual Recognition Challenge. IJCV, 115(3):211-252, 2015.
- (2015) IJCV , vol.115 , Issue.3 , pp. 211-252
- Russakovsky, O.¹ Deng, J.² Su, H.³ Krause, J.⁴ Satheesh, S.⁵ Ma, S.⁶ Huang, Z.⁷ Karpathy, A.⁸ Khosla, A.⁹ Bernstein, M.¹⁰ Berg, A.C.¹¹ Fei-Fei, L.¹²

36
- 84866718894
- Action bank: A high-level representation of activity in video
- S. Sadanand and J. J. Corso. Action bank: A high-level representation of activity in video. In CVPR, 2012.
- (2012) CVPR
- Sadanand, S.¹ Corso, J.J.²

37
- 84938235221
- Fracking deep convolutional image descriptors
- abs/1412.6537
- E. Simo-Serra, E. Trulls, L. Ferraz, I. Kokkinos, and F. Moreno-Noguer. Fracking deep convolutional image descriptors. CoRR, /abs/1412.6537, 2014.
- (2014) CoRR
- Simo-Serra, E.¹ Trulls, E.² Ferraz, L.³ Kokkinos, I.⁴ Moreno-Noguer, F.⁵

38
- 84938239875
- Deep inside convolutional networks: Visualising image classification models and saliency maps
- abs/1312.6034
- K. Simonyan, A. Vedaldi, and A. Zisserman. Deep inside convolutional networks: Visualising image classification models and saliency maps. CoRR, /abs/1312.6034, 2013.
- (2013) CoRR
- Simonyan, K.¹ Vedaldi, A.² Zisserman, A.³

39
- 84937862424
- Two-stream convolutional networks for action recognition in videos
- K. Simonyan and A. Zisserman. Two-stream convolutional networks for action recognition in videos. In NIPS, 2014.
- (2014) NIPS
- Simonyan, K.¹ Zisserman, A.²

40
- 85083953063
- Very deep convolutional networks for large-scale image recognition
- K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. ICLR, 2015.
- (2015) ICLR
- Simonyan, K.¹ Zisserman, A.²

41
- 84887335980
- Action recognition by hierarchical sequence summarization
- Y. Song, L.-P. Morency, and R. Davis. Action recognition by hierarchical sequence summarization. In CVPR, 2013.
- (2013) CVPR
- Song, Y.¹ Morency, L.-P.² Davis, R.³

42
- 84893702065
- Ucf101: A dataset of 101 human actions classes from videos in the wild
- abs/1212.0402
- K. Soomro, A. R. Zamir, and M. Shah. Ucf101: A dataset of 101 human actions classes from videos in the wild. CoRR, /abs/1212.0402, 2012.
- (2012) CoRR
- Soomro, K.¹ Zamir, A.R.² Shah, M.³

43
- 84962900096
- Unsupervised learning of video representations using lstms
- abs/1502.04681
- N. Srivastava, E. Mansimov, and R. Salakhutdinov. Unsupervised learning of video representations using lstms. CoRR, /abs/1502.04681, 2015.
- (2015) CoRR
- Srivastava, N.¹ Mansimov, E.² Salakhutdinov, R.³

44
- 84898775956
- Active: Activity concept transitions in video event classification
- C. Sun and R. Nevatia. Active: Activity concept transitions in video event classification. ICCV, 2013.
- (2013) ICCV
- Sun, C.¹ Nevatia, R.²

45
- 84962876036
- Temporal localization of fine-grained actions in videos by domain transfer from web images
- C. Sun, S. Shetty, R. Sukthankar, and R. Nevatia. Temporal localization of fine-grained actions in videos by domain transfer from web images. In ACM Multimedia, 2015.
- (2015) ACM Multimedia
- Sun, C.¹ Shetty, S.² Sukthankar, R.³ Nevatia, R.⁴

46
- 84973863239
- Human action recognition using factorized spatio-temporal convolutional networks
- L. Sun, K. Jia, D.-Y. Yeung, and B. E. Shi. Human action recognition using factorized spatio-temporal convolutional networks. In ICCV, 2015.
- (2015) ICCV
- Sun, L.¹ Jia, K.² Yeung, D.-Y.³ Shi, B.E.⁴

47
- 84866658784
- Learning latent temporal structure for complex event detection
- K. Tang, L. Fei-Fei, and D. Koller. Learning latent temporal structure for complex event detection. In CVPR, 2012.
- (2012) CVPR
- Tang, K.¹ Fei-Fei, L.² Koller, D.³

48
- 84867652321
- Convolutional learning of spatio-temporal features
- G. W. Taylor, R. Fergus, Y. LeCun, and C. Bregler. Convolutional learning of spatio-temporal features. In ECCV, 2010.
- (2010) ECCV
- Taylor, G.W.¹ Fergus, R.² LeCun, Y.³ Bregler, C.⁴

49
- 84973865953
- Learning spatiotemporal features with 3d convolutional networks
- D. Tran, L. Bourdev, R. Fergus, L. Torresani, and M. Paluri. Learning spatiotemporal features with 3d convolutional networks. In ICCV, 2015.
- (2015) ICCV
- Tran, D.¹ Bourdev, L.² Fergus, R.³ Torresani, L.⁴ Paluri, M.⁵

50
- 80052877143
- Action recognition by dense trajectories
- H. Wang, A. Klaser, C. Schmid, and C.-L. Liu. Action recognition by dense trajectories. In CVPR, 2011.
- (2011) CVPR
- Wang, H.¹ Klaser, A.² Schmid, C.³ Liu, C.-L.⁴

51
- 84898805910
- Action recognition with improved trajectories
- H. Wang and C. Schmid. Action recognition with improved trajectories. In ICCV, 2013.
- (2013) ICCV
- Wang, H.¹ Schmid, C.²

52
- 84955282488
- Action recognition with trajectory-pooled deep-convolutional descriptors
- L. Wang, Y. Qiao, and X. Tang. Action recognition with trajectory-pooled deep-convolutional descriptors. In CVPR, 2015.
- (2015) CVPR
- Wang, L.¹ Qiao, Y.² Tang, X.³

53
- 84961995462
- Towards good practices for very deep two-stream convnets
- abs/1507.02159
- L. Wang, Y. Xiong, Z. Wang, and Y. Qiao. Towards good practices for very deep two-stream convnets. CoRR, /abs/1507.02159, 2015.
- (2015) CoRR
- Wang, L.¹ Xiong, Y.² Wang, Z.³ Qiao, Y.⁴

54
- 84973889989
- Unsupervised learning of visual representations using videos
- X. Wang and A. Gupta. Unsupervised learning of visual representations using videos. ICCV, 2015.
- (2015) ICCV
- Wang, X.¹ Gupta, A.²

55
- 79957467077
- Hidden part models for human action recognition: Probabilistic vs max-margin
- Y. Wang and G. Mori. Hidden part models for human action recognition: Probabilistic vs. max-margin. TPAMI, 2011.
- (2011) TPAMI
- Wang, Y.¹ Mori, G.²

56
- 84911433150
- Towards good practices for action video encoding
- J. Wu, Y. Zhang, and W. Lin. Towards good practices for action video encoding. CVPR, 2014.
- (2014) CVPR
- Wu, J.¹ Zhang, Y.² Lin, W.³

57
- 84962921420
- Modeling spatial-temporal clues in a hybrid deep learning framework for video classification
- Z. Wu, X. Wang, Y.-G. Jiang, H. Ye, and X. Xue. Modeling spatial-temporal clues in a hybrid deep learning framework for video classification. In ACM Multimedia, 2015.
- (2015) ACM Multimedia
- Wu, Z.¹ Wang, X.² Jiang, Y.-G.³ Ye, H.⁴ Xue, X.⁵

58
- 84959226659
- A discriminative cnn video representation for event detection
- Z. Xu, Y. Yang, and A. G. Hauptmann. A discriminative cnn video representation for event detection. CVPR, 2015.
- (2015) CVPR
- Xu, Z.¹ Yang, Y.² Hauptmann, A.G.³

59
- 51849165417
- A duality based approach for realtime tv-l1 optical flow
- C. Zach, T. Pock, and H. Bischof. A duality based approach for realtime tv-l1 optical flow. 29th DAGM Symposium on Pattern Recognition, 2007.
- (2007) 29th Dagm Symposium on Pattern Recognition
- Zach, C.¹ Pock, T.² Bischof, H.³

60
- 84898805615
- Action recognition with actons
- J. Zhu, B. Wang, X. Yang, W. Zhang, and Z. Tu. Action recognition with actons. In ICCV, 2013.
- (2013) ICCV
- Zhu, J.¹ Wang, B.² Yang, X.³ Zhang, W.⁴ Tu, Z.⁵

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.