



Volume 2016-December, 2016, Pages 1049-1058

Temporal action localization in untrimmed videos via multi-stage CNNs

Author keywords

[No Author keywords available]

Indexed keywords

COMPUTER VISION;

EID: 84986268774     PISSN: 10636919     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/CVPR.2016.119     Document Type: Conference Paper
Times cited : (1130)

References (46)
  • 1
    • 85009885890
    • 6
    • Mexaction2. http://mexculture.cnam.fr/xwiki/bin/view/Datasets/Mex+action+dataset, 2015.
    • (2015) Mexaction2
  • 4
    • 80052915321
    • Actom sequence models for efficient action detection
    • 3
    • A. Gaidon, Z. Harchaoui, and C. Schmid. Actom sequence models for efficient action detection. In CVPR, 2011.
    • (2011) CVPR
    • Gaidon, A.1    Harchaoui, Z.2    Schmid, C.3
  • 5
    • 84884557275
    • Temporal localization of actions with actoms
    • 3
    • A. Gaidon, Z. Harchaoui, and C. Schmid. Temporal localization of actions with actoms. In TPAMI, 2013.
    • (2013) TPAMI
    • Gaidon, A.1    Harchaoui, Z.2    Schmid, C.3
  • 6
    • 85029359197
    • Fast r-cnn
    • 1, 3
    • R. Girshick. Fast r-cnn. In ICCV, 2015.
    • (2015) ICCV
    • Girshick, R.1
  • 7
    • 84911400494
    • Rich feature hierarchies for accurate object detection and semantic segmentation
    • 1, 3
    • R. Girshick, J. Donahue, T. Darrell, and J. Malik. Rich feature hierarchies for accurate object detection and semantic segmentation. In CVPR, 2014.
    • (2014) CVPR
    • Girshick, R.1    Donahue, J.2    Darrell, T.3    Malik, J.4
  • 8
    • 84959196122
    • Finding action tubes
    • 3
    • G. Gkioxari and J. Malik. Finding action tubes. In CVPR, 2015.
    • (2015) CVPR
    • Gkioxari, G.1    Malik, J.2
  • 11
    • 84973868024
    • Objects2action: Classifying and localizing actions without any video example
    • 3
    • M. Jain, J. van Gemert, T. Mensink, and C. Snoek. Objects2action: Classifying and localizing actions without any video example. In ICCV, 2015.
    • (2015) ICCV
    • Jain, M.1    Van Gemert, J.2    Mensink, T.3    Snoek, C.4
  • 12
    • 84959235126
    • What do 15,000 object categories tell us about classifying and localizing actions?
    • 3
    • M. Jain, J. van Gemert, and C. Snoek. What do 15,000 object categories tell us about classifying and localizing actions? In CVPR, 2015.
    • (2015) CVPR
    • Jain, M.1    Van Gemert, J.2    Snoek, C.3
  • 13
    • 84870183903
    • 3d convolutional neural networks for human action recognition
    • 1, 3
    • S. Ji, W. Xu, M. Yang, and K. Yu. 3d convolutional neural networks for human action recognition. In TPAMI, 2013.
    • (2013) TPAMI
    • Ji, S.1    Xu, W.2    Yang, M.3    Yu, K.4
  • 16
    • 84885382793
    • A unified tree-based framework for joint action localization, recognition and segmentation
    • 3
    • Z. Jiang, Z. Lin, and L. S. Davis. A unified tree-based framework for joint action localization, recognition and segmentation. In CVIU, 2013.
    • (2013) CVIU
    • Jiang, Z.1    Lin, Z.2    Davis, L.S.3
  • 17
    • 84986316707
    • Fast saliency based pooling of fisher encoded dense trajectories
    • 1, 6
    • S. Karaman, L. Seidenari, and A. D. Bimbo. Fast saliency based pooling of fisher encoded dense trajectories. In ECCV THUMOS Workshop, 2014.
    • (2014) ECCV THUMOS Workshop
    • Karaman, S.1    Seidenari, L.2    Bimbo, A.D.3
  • 20
    • 84876231242
    • Imagenet classification with deep convolutional neural networks
    • 3
    • A. Krizhevsky, I. Sutskever, and G. E. Hinton. Imagenet classification with deep convolutional neural networks. In NIPS, 2012.
    • (2012) NIPS
    • Krizhevsky, A.1    Sutskever, I.2    Hinton, G.E.3
  • 21
    • 84973861966
    • Deepbox: Learning objectness with convolutional networks
    • 1, 3
    • W. Kuo, B. Hariharan, and J. Malik. Deepbox: Learning objectness with convolutional networks. In ICCV, 2015.
    • (2015) ICCV
    • Kuo, W.1    Hariharan, B.2    Malik, J.3
  • 22
    • 84959216691
    • Recognizing complex events in videos by learning key static-dynamic evidences
    • 2
    • K.-T. Lai, D. Liu, M.-S. Chen, and S.-F. Chang. Recognizing complex events in videos by learning key static-dynamic evidences. In ECCV, 2014.
    • (2014) ECCV
    • Lai, K.-T.1    Liu, D.2    Chen, M.-S.3    Chang, S.-F.4
  • 23
    • 84911413388
    • Video event detection by inferring temporal instance labels
    • 2
    • K.-T. Lai, F. X. Yu, M.-S. Chen, and S.-F. Chang. Video event detection by inferring temporal instance labels. In CVPR, 2014.
    • (2014) CVPR
    • Lai, K.-T.1    Yu, F.X.2    Chen, M.-S.3    Chang, S.-F.4
  • 24
    • 84959241532
    • Beyond Gaussian pyramid: Multi-skip feature stacking for action recognition
    • 1, 2
    • Z. Lan, M. Lin, X. Li, A. G. Hauptmann, and B. Raj. Beyond Gaussian pyramid: Multi-skip feature stacking for action recognition. In CVPR, 2015.
    • (2015) CVPR
    • Lan, Z.1    Lin, M.2    Li, X.3    Hauptmann, A.G.4    Raj, B.5
  • 25
    • 84898791167
    • Action and event recognition with fisher vectors on a compact feature set
    • 1, 2, 6
    • D. Oneata, J. Verbeek, and C. Schmid. Action and event recognition with fisher vectors on a compact feature set. In ICCV, 2013.
    • (2013) ICCV
    • Oneata, D.1    Verbeek, J.2    Schmid, C.3
  • 26
    • 84911423364
    • Efficient action localization with approximately normalized fisher vectors
    • 3
    • D. Oneata, J. Verbeek, and C. Schmid. Efficient action localization with approximately normalized fisher vectors. In CVPR, 2014.
    • (2014) CVPR
    • Oneata, D.1    Verbeek, J.2    Schmid, C.3
  • 28
    • 77949275097
    • A survey on vision-based human action recognition
    • 1, 2
    • R. Poppe. A survey on vision-based human action recognition. In Image and vision computing, 2010.
    • (2010) Image and Vision Computing
    • Poppe, R.1
  • 29
    • 84973879045
    • Unsupervised tube extraction using transductive learning and dense trajectories
    • 3
    • M. M. Puscas, E. Sangineto, D. Culibrk, and N. Sebe. Unsupervised tube extraction using transductive learning and dense trajectories. In ICCV, 2015.
    • (2015) ICCV
    • Puscas, M.M.1    Sangineto, E.2    Culibrk, D.3    Sebe, N.4
  • 30
    • 84960980241
    • Faster r-cnn: Towards real-time object detection with region proposal networks
    • 1, 3
    • S. Ren, K. He, R. Girshick, and J. Sun. Faster r-cnn: Towards real-time object detection with region proposal networks. In NIPS, 2015.
    • (2015) NIPS
    • Ren, S.1    He, K.2    Girshick, R.3    Sun, J.4
  • 31
    • 84937862424
    • Two-stream convolutional networks for action recognition in videos
    • 1, 3
    • K. Simonyan and A. Zisserman. Two-stream convolutional networks for action recognition in videos. In NIPS, 2014.
    • (2014) NIPS
    • Simonyan, K.1    Zisserman, A.2
  • 33
    • 84973931629
    • Action localization in videos through context walk
    • 3
    • K. Soomro, H. Idrees, and M. Shah. Action localization in videos through context walk. In ICCV, 2015.
    • (2015) ICCV
    • Soomro, K.1    Idrees, H.2    Shah, M.3
  • 34
    • 84904972001
    • UCF101: A dataset of 101 human actions classes from videos in the wild
    • 3
    • K. Soomro, A. R. Zamir, and M. Shah. UCF101: A dataset of 101 human actions classes from videos in the wild. In CRCV-TR-12-01, 2012.
    • (2012) CRCV-TR-12-01
    • Soomro, K.1    Zamir, A.R.2    Shah, M.3
  • 36
    • 84986290264
    • Temporal localization of fine-grained actions in videos by domain transfer from web images
    • 2
    • C. Sun, S. Shetty, R. Sukthankar, and R. Nevatia. Temporal localization of fine-grained actions in videos by domain transfer from web images. In ACM MM, 2015.
    • (2015) ACM MM
    • Sun, C.1    Shetty, S.2    Sukthankar, R.3    Nevatia, R.4
  • 37
    • 84973865953
    • Learning spatiotemporal features with 3d convolutional networks
    • 1, 3, 4, 6
    • D. Tran, L. Bourdev, R. Fergus, L. Torresani, and M. Paluri. Learning spatiotemporal features with 3d convolutional networks. In ICCV, 2015.
    • (2015) ICCV
    • Tran, D.1    Bourdev, L.2    Fergus, R.3    Torresani, L.4    Paluri, M.5
  • 38
    • 84973913561
    • Apt: Action localization proposals from dense trajectories
    • 3
    • J. van Gemert, M. Jain, E. Gati, and C. Snoek. Apt: Action localization proposals from dense trajectories. In BMVC, 2015.
    • (2015) BMVC
    • Van Gemert, J.1    Jain, M.2    Gati, E.3    Snoek, C.4
  • 39
  • 40
    • 84898805910
    • Action recognition with improved trajectories
    • 1, 2
    • H. Wang and C. Schmid. Action recognition with improved trajectories. In ICCV, 2013.
    • (2013) ICCV
    • Wang, H.1    Schmid, C.2
  • 41
    • 84986274451
    • Action recognition and detection by combining motion and appearance features
    • 1, 6
    • L. Wang, Y. Qiao, and X. Tang. Action recognition and detection by combining motion and appearance features. In ECCV THUMOS Workshop, 2014.
    • (2014) ECCV THUMOS Workshop
    • Wang, L.1    Qiao, Y.2    Tang, X.3
  • 42
    • 78751648503
    • A survey of vision-based methods for action representation, segmentation and recognition
    • 1, 2
    • D. Weinland, R. Ronfard, and E. Boyer. A survey of vision-based methods for action representation, segmentation and recognition. In Computer Vision and Image Understanding, 2011.
    • (2011) Computer Vision and Image Understanding
    • Weinland, D.1    Ronfard, R.2    Boyer, E.3
  • 43
    • 84973931775
    • Learning to track for spatio-temporal action localization
    • 3
    • P. Weinzaepfel, Z. Harchaoui, and C. Schmid. Learning to track for spatio-temporal action localization. In ICCV, 2015.
    • (2015) ICCV
    • Weinzaepfel, P.1    Harchaoui, Z.2    Schmid, C.3
  • 44
    • 84959226659
    • A discriminative cnn video representation for event detection
    • 1, 3
    • Z. Xu, Y. Yang, and A. G. Hauptmann. A discriminative cnn video representation for event detection. In CVPR, 2015.
    • (2015) CVPR
    • Xu, Z.1    Yang, Y.2    Hauptmann, A.G.3
  • 45
    • 84959191147
    • Fast action proposals for human action detection and search
    • 3
    • G. Yu and J. Yuan. Fast action proposals for human action detection and search. In CVPR, 2015.
    • (2015) CVPR
    • Yu, G.1    Yuan, J.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.