Volume 2017-January, 2017, Pages 1417-1426

CDC: Convolutional-de-convolutional networks for precise temporal action localization in untrimmed videos

Author keywords

[No Author keywords available]

Indexed keywords

COMPUTER VISION; PATTERN RECOGNITION; SEMANTICS; SIGNAL SAMPLING

EID: 85044270610     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/CVPR.2017.155     Document Type: Conference Paper
Times cited: 621

References (79)
  • 1
    • 85044309526
    • Mexaction2. http://mexculture.cnam.fr/xwiki/bin/view/Datasets/Mex+action+dataset, 2015.
    • (2015)
  • 4
    • 85038956512
    • Segnet: A deep convolutional encoder-decoder architecture for image segmentation
    • V. Badrinarayanan, A. Kendall, and R. Cipolla. Segnet: A deep convolutional encoder-decoder architecture for image segmentation. TPAMI, 2016.
    • (2016) TPAMI
    • Badrinarayanan, V.1    Kendall, A.2    Cipolla, R.3
  • 5
    • 85083954148
    • Semantic image segmentation with deep convolutional nets and fully connected crfs
    • L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille. Semantic image segmentation with deep convolutional nets and fully connected crfs. In ICLR, 2015.
    • (2015) ICLR
    • Chen, L.-C.1    Papandreou, G.2    Kokkinos, I.3    Murphy, K.4    Yuille, A.L.5
  • 10
    • 84986266741
    • Convolutional two-stream network fusion for video action recognition
    • C. Feichtenhofer, A. Pinz, and A. Zisserman. Convolutional two-stream network fusion for video action recognition. In CVPR, 2016.
    • (2016) CVPR
    • Feichtenhofer, C.1    Pinz, A.2    Zisserman, A.3
  • 11
    • 80052915321
    • Actom sequence models for efficient action detection
    • A. Gaidon, Z. Harchaoui, and C. Schmid. Actom sequence models for efficient action detection. In CVPR, 2011.
    • (2011) CVPR
    • Gaidon, A.1    Harchaoui, Z.2    Schmid, C.3
  • 12
    • 84973872525
    • Temporal localization of actions with actoms
    • A. Gaidon, Z. Harchaoui, and C. Schmid. Temporal localization of actions with actoms. In TPAMI, 2013.
    • (2013) TPAMI
    • Gaidon, A.1    Harchaoui, Z.2    Schmid, C.3
  • 13
    • 84959230113
    • Devnet: A deep event network for multimedia event detection and evidence recounting
    • C. Gan, N. Wang, Y. Yang, D.-Y. Yeung, and A. G. Hauptmann. Devnet: A deep event network for multimedia event detection and evidence recounting. In CVPR, 2015.
    • (2015) CVPR
    • Gan, C.1    Wang, N.2    Yang, Y.3    Yeung, D.-Y.4    Hauptmann, A.G.5
  • 14
    • 84959196122
    • Finding action tubes
    • G. Gkioxari and J. Malik. Finding action tubes. In CVPR, 2015.
    • (2015) CVPR
    • Gkioxari, G.1    Malik, J.2
  • 16
    • 84986274465
    • Deep residual learning for image recognition
    • K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In CVPR, 2016.
    • (2016) CVPR
    • He, K.1    Zhang, X.2    Ren, S.3    Sun, J.4
  • 17
    • 84959216468
    • Activitynet: A large-scale video benchmark for human activity understanding
    • F. C. Heilbron, V. Escorcia, B. Ghanem, and J. C. Niebles. Activitynet: A large-scale video benchmark for human activity understanding. In CVPR, 2015.
    • (2015) CVPR
    • Heilbron, F.C.1    Escorcia, V.2    Ghanem, B.3    Niebles, J.C.4
  • 18
    • 84986275821
    • Fast temporal activity proposals for efficient detection of human actions in untrimmed videos
    • F. C. Heilbron, J. C. Niebles, and B. Ghanem. Fast temporal activity proposals for efficient detection of human actions in untrimmed videos. In CVPR, 2016.
    • (2016) CVPR
    • Heilbron, F.C.1    Niebles, J.C.2    Ghanem, B.3
  • 19
    • 84965099276
    • Decoupled deep neural network for semi-supervised semantic segmentation
    • S. Hong, H. Noh, and B. Han. Decoupled deep neural network for semi-supervised semantic segmentation. In NIPS, 2015.
    • (2015) NIPS
    • Hong, S.1    Noh, H.2    Han, B.3
  • 21
    • 84973868024
    • Objects2action: Classifying and localizing actions without any video example
    • M. Jain, J. van Gemert, T. Mensink, and C. Snoek. Objects2action: Classifying and localizing actions without any video example. In ICCV, 2015.
    • (2015) ICCV
    • Jain, M.1    Van Gemert, J.2    Mensink, T.3    Snoek, C.4
  • 22
    • 84959235126
    • What do 15,000 object categories tell us about classifying and localizing actions?
    • M. Jain, J. van Gemert, and C. Snoek. What do 15,000 object categories tell us about classifying and localizing actions? In CVPR, 2015.
    • (2015) CVPR
    • Jain, M.1    Van Gemert, J.2    Snoek, C.3
  • 23
    • 77956004473
    • Aggregating local descriptors into a compact image representation
    • H. Jégou, M. Douze, C. Schmid, and P. Pérez. Aggregating local descriptors into a compact image representation. In CVPR, 2010.
    • (2010) CVPR
    • Jégou, H.1    Douze, M.2    Schmid, C.3    Pérez, P.4
  • 26
    • 84986316707
    • Fast saliency based pooling of fisher encoded dense trajectories
    • S. Karaman, L. Seidenari, and A. D. Bimbo. Fast saliency based pooling of fisher encoded dense trajectories. In ECCV THUMOS Workshop, 2014.
    • (2014) ECCV THUMOS Workshop
    • Karaman, S.1    Seidenari, L.2    Bimbo, A.D.3
  • 29
    • 84876231242
    • Imagenet classification with deep convolutional neural networks
    • A. Krizhevsky, I. Sutskever, and G. E. Hinton. Imagenet classification with deep convolutional neural networks. In NIPS, 2012.
    • (2012) NIPS
    • Krizhevsky, A.1    Sutskever, I.2    Hinton, G.E.3
  • 30
    • 50649122769
    • Retrieving actions in movies
    • I. Laptev and P. Pérez. Retrieving actions in movies. In ICCV, 2007.
    • (2007) ICCV
    • Laptev, I.1    Pérez, P.2
  • 31
    • 85044291629
    • Segmental spatiotemporal cnns for fine-grained action segmentation
    • C. Lea, A. Reiter, R. Vidal, and G. D. Hager. Segmental spatiotemporal cnns for fine-grained action segmentation. In ECCV, 2016.
    • (2016) ECCV
    • Lea, C.1    Reiter, A.2    Vidal, R.3    Hager, G.D.4
  • 32
    • 84986261676
    • Efficient piecewise training of deep structured models for semantic segmentation
    • G. Lin, C. Shen, A. van den Hengel, and I. Reid. Efficient piecewise training of deep structured models for semantic segmentation. In CVPR, 2016.
    • (2016) CVPR
    • Lin, G.1    Shen, C.2    Van Den Hengel, A.3    Reid, I.4
  • 33
    • 84986256919
    • Multi-scale patch aggregation (mpa) for simultaneous detection and segmentation
    • S. Liu, X. Qi, J. Shi, H. Zhang, and J. Jia. Multi-scale patch aggregation (mpa) for simultaneous detection and segmentation. In CVPR, 2016.
    • (2016) CVPR
    • Liu, S.1    Qi, X.2    Shi, J.3    Zhang, H.4    Jia, J.5
  • 34
    • 84959205572
    • Fully convolutional networks for semantic segmentation
    • J. Long, E. Shelhamer, and T. Darrell. Fully convolutional networks for semantic segmentation. In CVPR, 2015.
    • (2015) CVPR
    • Long, J.1    Shelhamer, E.2    Darrell, T.3
  • 35
    • 84866710901
    • A database for fine grained activity detection of cooking activities
    • M. A. M. Rohrbach, S. Amin and B. Schiele. A database for fine grained activity detection of cooking activities. In CVPR, 2012.
    • (2012) CVPR
    • Rohrbach, M.A.M.1    Amin, S.2    Schiele, B.3
  • 36
    • 84994583262
    • Spot on: Action localization from pointly-supervised proposals
    • P. Mettes, J. van Gemert, and C. Snoek. Spot on: Action localization from pointly-supervised proposals. In ECCV, 2016.
    • (2016) ECCV
    • Mettes, P.1    Van Gemert, J.2    Snoek, C.3
  • 37
    • 84973879016
    • Learning deconvolution network for semantic segmentation
    • H. Noh, S. Hong, and B. Han. Learning deconvolution network for semantic segmentation. In ICCV, 2015.
    • (2015) ICCV
    • Noh, H.1    Hong, S.2    Han, B.3
  • 38
    • 84898791167
    • Action and event recognition with fisher vectors on a compact feature set
    • D. Oneata, J. Verbeek, and C. Schmid. Action and event recognition with fisher vectors on a compact feature set. In ICCV, 2013.
    • (2013) ICCV
    • Oneata, D.1    Verbeek, J.2    Schmid, C.3
  • 40
    • 79959771606
    • Improving the fisher kernel for large-scale image classification
    • F. Perronnin, J. Sánchez, and T. Mensink. Improving the fisher kernel for large-scale image classification. In ECCV, 2010.
    • (2010) ECCV
    • Perronnin, F.1    Sánchez, J.2    Mensink, T.3
  • 41
    • 77949275097
    • A survey on vision-based human action recognition
    • R. Poppe. A survey on vision-based human action recognition. In Image and vision computing, 2010.
    • (2010) Image and Vision Computing
    • Poppe, R.1
  • 42
    • 84973879045
    • Unsupervised tube extraction using transductive learning and dense trajectories
    • M. M. Puscas, E. Sangineto, D. Culibrk, and N. Sebe. Unsupervised tube extraction using transductive learning and dense trajectories. In ICCV, 2015.
    • (2015) ICCV
    • Puscas, M.M.1    Sangineto, E.2    Culibrk, D.3    Sebe, N.4
  • 43
    • 84986270053
    • Temporal action detection using a statistical language model
    • A. Richard and J. Gall. Temporal action detection using a statistical language model. In CVPR, 2016.
    • (2016) CVPR
    • Richard, A.1    Gall, J.2
  • 45
    • 85011076500
    • Fully convolutional networks for semantic segmentation
    • E. Shelhamer, J. Long, and T. Darrell. Fully convolutional networks for semantic segmentation. TPAMI, 2016.
    • (2016) TPAMI
    • Shelhamer, E.1    Long, J.2    Darrell, T.3
  • 47
    • 84986268774
    • Temporal action localization in untrimmed videos via multi-stage cnns
    • Z. Shou, D. Wang, and S.-F. Chang. Temporal action localization in untrimmed videos via multi-stage cnns. In CVPR, 2016.
    • (2016) CVPR
    • Shou, Z.1    Wang, D.2    Chang, S.-F.3
  • 49
    • 85041903747
    • Hollywood in homes: Crowdsourcing data collection for activity understanding
    • G. A. Sigurdsson, G. Varol, X. Wang, A. Farhadi, I. Laptev, and A. Gupta. Hollywood in homes: Crowdsourcing data collection for activity understanding. In ECCV, 2016.
    • (2016) ECCV
    • Sigurdsson, G.A.1    Varol, G.2    Wang, X.3    Farhadi, A.4    Laptev, I.5    Gupta, A.6
  • 50
    • 84937862424
    • Two-stream convolutional networks for action recognition in videos
    • K. Simonyan and A. Zisserman. Two-stream convolutional networks for action recognition in videos. In NIPS, 2014.
    • (2014) NIPS
    • Simonyan, K.1    Zisserman, A.2
  • 52
    • 84986328004
    • A multi-stream bi-directional recurrent neural network for fine-grained action detection
    • B. Singh, T. K. Marks, M. Jones, O. Tuzel, and M. Shao. A multi-stream bi-directional recurrent neural network for fine-grained action detection. In CVPR, 2016.
    • (2016) CVPR
    • Singh, B.1    Marks, T.K.2    Jones, M.3    Tuzel, O.4    Shao, M.5
  • 53
    • 85044257995
    • Untrimmed classification for activity detection: Submission to activitynet challenge
    • G. Singh and F. Cuzzolin. Untrimmed classification for activity detection: submission to activitynet challenge. In CVPR ActivityNet Workshop, 2016.
    • (2016) CVPR ActivityNet Workshop
    • Singh, G.1    Cuzzolin, F.2
  • 54
    • 84973931629
    • Action localization in videos through context walk
    • K. Soomro, H. Idrees, and M. Shah. Action localization in videos through context walk. In ICCV, 2015.
    • (2015) ICCV
    • Soomro, K.1    Idrees, H.2    Shah, M.3
  • 55
    • 84986246311
    • Predicting the where and what of actors and actions through online action localization
    • K. Soomro, H. Idrees, and M. Shah. Predicting the where and what of actors and actions through online action localization. In CVPR, 2016.
    • (2016) CVPR
    • Soomro, K.1    Idrees, H.2    Shah, M.3
  • 58
    • 84986265065
    • What if we do not have multiple videos of the same action? - Video action localization using web images
    • W. Sultani and M. Shah. What if we do not have multiple videos of the same action? - video action localization using web images. In CVPR, 2016.
    • (2016) CVPR
    • Sultani, W.1    Shah, M.2
  • 59
    • 84986290264
    • Temporal localization of fine-grained actions in videos by domain transfer from web images
    • C. Sun, S. Shetty, R. Sukthankar, and R. Nevatia. Temporal localization of fine-grained actions in videos by domain transfer from web images. In ACM MM, 2015.
    • (2015) ACM MM
    • Sun, C.1    Shetty, S.2    Sukthankar, R.3    Nevatia, R.4
  • 60
    • 84973865953
    • Learning spatiotemporal features with 3d convolutional networks
    • D. Tran, L. Bourdev, R. Fergus, L. Torresani, and M. Paluri. Learning spatiotemporal features with 3d convolutional networks. In ICCV, 2015.
    • (2015) ICCV
    • Tran, D.1    Bourdev, L.2    Fergus, R.3    Torresani, L.4    Paluri, M.5
  • 62
    • 84973913561
    • Apt: Action localization proposals from dense trajectories
    • J. van Gemert, M. Jain, E. Gati, and C. Snoek. Apt: Action localization proposals from dense trajectories. In BMVC, 2015.
    • (2015) BMVC
    • Van Gemert, J.1    Jain, M.2    Gati, E.3    Snoek, C.4
  • 64
    • 84898805910
    • Action recognition with improved trajectories
    • H. Wang and C. Schmid. Action recognition with improved trajectories. In ICCV, 2013.
    • (2013) ICCV
    • Wang, H.1    Schmid, C.2
  • 65
    • 84986274451
    • Action recognition and detection by combining motion and appearance features
    • L. Wang, Y. Qiao, and X. Tang. Action recognition and detection by combining motion and appearance features. In ECCV THUMOS Workshop, 2014.
    • (2014) ECCV THUMOS Workshop
    • Wang, L.1    Qiao, Y.2    Tang, X.3
  • 66
    • 85019099168
    • Temporal segment networks: Towards good practices for deep action recognition
    • L. Wang, Y. Xiong, Z. Wang, Y. Qiao, D. Lin, X. Tang, and L. V. Gool. Temporal segment networks: Towards good practices for deep action recognition. In ECCV, 2016.
    • (2016) ECCV
    • Wang, L.1    Xiong, Y.2    Wang, Z.3    Qiao, Y.4    Lin, D.5    Tang, X.6    Gool, L.V.7
  • 68
    • 78751648503
    • A survey of vision-based methods for action representation, segmentation and recognition
    • D. Weinland, R. Ronfard, and E. Boyer. A survey of vision-based methods for action representation, segmentation and recognition. In Computer Vision and Image Understanding, 2011.
    • (2011) Computer Vision and Image Understanding
    • Weinland, D.1    Ronfard, R.2    Boyer, E.3
  • 69
    • 84973931775
    • Learning to track for spatio-temporal action localization
    • P. Weinzaepfel, Z. Harchaoui, and C. Schmid. Learning to track for spatio-temporal action localization. In ICCV, 2015.
    • (2015) ICCV
    • Weinzaepfel, P.1    Harchaoui, Z.2    Schmid, C.3
  • 70
    • 84986313829
    • Actor-action semantic segmentation with grouping process models
    • C. Xu and J. J. Corso. Actor-action semantic segmentation with grouping process models. In CVPR, 2016.
    • (2016) CVPR
    • Xu, C.1    Corso, J.J.2
  • 71
    • 84959226659
    • A discriminative cnn video representation for event detection
    • Z. Xu, Y. Yang, and A. G. Hauptmann. A discriminative cnn video representation for event detection. In CVPR, 2015.
    • (2015) CVPR
    • Xu, Z.1    Yang, Y.2    Hauptmann, A.G.3
  • 73
    • 84986253505
    • End-to-end learning of action detection from frame glimpses in videos
    • S. Yeung, O. Russakovsky, G. Mori, and L. Fei-Fei. End-to-end learning of action detection from frame glimpses in videos. In CVPR, 2016.
    • (2016) CVPR
    • Yeung, S.1    Russakovsky, O.2    Mori, G.3    Fei-Fei, L.4
  • 74
    • 85083952059
    • Multi-scale context aggregation by dilated convolutions
    • F. Yu and V. Koltun. Multi-scale context aggregation by dilated convolutions. In ICLR, 2016.
    • (2016) ICLR
    • Yu, F.1    Koltun, V.2
  • 75
    • 84959191147
    • Fast action proposals for human action detection and search
    • G. Yu and J. Yuan. Fast action proposals for human action detection and search. In CVPR, 2015.
    • (2015) CVPR
    • Yu, G.1    Yuan, J.2
  • 76
    • 84986267340
    • Temporal action localization with pyramid of score distribution features
    • J. Yuan, B. Ni, X. Yang, and A. Kassim. Temporal action localization with pyramid of score distribution features. In CVPR, 2016.
    • (2016) CVPR
    • Yuan, J.1    Ni, B.2    Yang, X.3    Kassim, A.4
  • 77
    • 84921476116
    • Visualizing and understanding convolutional networks
    • M. Zeiler and R. Fergus. Visualizing and understanding convolutional networks. In ECCV, 2014.
    • (2014) ECCV
    • Zeiler, M.1    Fergus, R.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.