-
1
-
-
24644437539
-
Signature verification using a siamese time delay neural network
-
J. Bromley, I. Guyon, Y. LeCun, E. Sackinger, and R. Shah. Signature verification using a siamese time delay neural network. NIPS, 1993.
-
(1993)
NIPS
-
-
Bromley, J.1
Guyon, I.2
LeCun, Y.3
Sackinger, E.4
Shah, R.5
-
2
-
-
24644436425
-
Learning a similarity metric discriminatively, with application to face verification
-
S. Chopra, R. Hadsell, and Y. LeCun. Learning a similarity metric discriminatively, with application to face verification. CVPR, 2005.
-
(2005)
CVPR
-
-
Chopra, S.1
Hadsell, R.2
LeCun, Y.3
-
3
-
-
34948855444
-
Human detection using oriented histograms of flow and appearance
-
N. Dalal, B. Triggs, and C. Schmid. Human detection using oriented histograms of flow and appearance. In ECCV, 2006.
-
(2006)
ECCV
-
-
Dalal, N.1
Triggs, B.2
Schmid, C.3
-
4
-
-
84959236502
-
Long-term recurrent convolutional networks for visual recognition and description
-
J. Donahue, L. A. Hendricks, S. Guadarrama, M. Rohrbach, S. Venugopalan, K. Saenko, and T. Darrell. Long-term recurrent convolutional networks for visual recognition and description. In CVPR, 2015.
-
(2015)
CVPR
-
-
Donahue, J.1
Hendricks, L.A.2
Guadarrama, S.3
Rohrbach, M.4
Venugopalan, S.5
Saenko, K.6
Darrell, T.7
-
5
-
-
84887336656
-
Modeling actions through state changes
-
A. Fathi and J. M. Rehg. Modeling actions through state changes. In ICCV, 2013.
-
(2013)
ICCV
-
-
Fathi, A.1
Rehg, J.M.2
-
7
-
-
84959230113
-
Devnet: A deep event network for multimedia event detection and evidence recounting
-
C. Gan, N. Wang, Y. Yang, D.-Y. Yeung, and A. G. Hauptmann. Devnet: A deep event network for multimedia event detection and evidence recounting. CVPR, 2015.
-
(2015)
CVPR
-
-
Gan, C.1
Wang, N.2
Yang, Y.3
Yeung, D.-Y.4
Hauptmann, A.G.5
-
9
-
-
33845594569
-
Dimensionality reduction by learning an invariant mapping
-
R. Hadsell, S. Chopra, and Y. LeCun. Dimensionality reduction by learning an invariant mapping. CVPR, 2006.
-
(2006)
CVPR
-
-
Hadsell, R.1
Chopra, S.2
LeCun, Y.3
-
10
-
-
84959216468
-
Activitynet: A large-scale video benchmark for human activity understanding
-
F. C. Heilbron, V. Escorcia, B. Ghanem, and J. C. Niebles. Activitynet: A large-scale video benchmark for human activity understanding. CVPR, 2015.
-
(2015)
CVPR
-
-
Heilbron, F.C.1
Escorcia, V.2
Ghanem, B.3
Niebles, J.C.4
-
11
-
-
85009910964
-
Deep metric learning using triplet network
-
abs/1412.6622
-
E. Hoffer and N. Ailon. Deep metric learning using triplet network. CoRR, abs/1412.6622, 2014.
-
(2014)
CoRR
-
-
Hoffer, E.1
Ailon, N.2
-
12
-
-
84911459575
-
Discriminative deep metric learning for face verification in the wild
-
June
-
J. Hu, J. Lu, and Y.-P. Tan. Discriminative deep metric learning for face verification in the wild. In CVPR, June 2014.
-
(2014)
CVPR
-
-
Hu, J.1
Lu, J.2
Tan, Y.-P.3
-
13
-
-
84887479105
-
Recognizing complex events using large margin joint low-level event model
-
H. Izadinia and M. Shah. Recognizing complex events using large margin joint low-level event model. ECCV, 2012.
-
(2012)
ECCV
-
-
Izadinia, H.1
Shah, M.2
-
14
-
-
84887337772
-
Representing videos using mid-level discriminative patches
-
A. Jain, A. Gupta, M. Rodriguez, and L. S. Davis. Representing videos using mid-level discriminative patches. In CVPR, 2013.
-
(2013)
CVPR
-
-
Jain, A.1
Gupta, A.2
Rodriguez, M.3
Davis, L.S.4
-
15
-
-
84887398298
-
Better exploiting motion for better action recognition
-
M. Jain, H. Jegou, and P. Bouthemy. Better exploiting motion for better action recognition. CVPR, 2013.
-
(2013)
CVPR
-
-
Jain, M.1
Jegou, H.2
Bouthemy, P.3
-
16
-
-
84973897623
-
Learning image representations tied to ego-motion
-
D. Jayaraman and K. Grauman. Learning image representations tied to ego-motion. In ICCV, 2015.
-
(2015)
ICCV
-
-
Jayaraman, D.1
Grauman, K.2
-
17
-
-
84870183903
-
3d convolutional neural networks for human action recognition
-
S. Ji, W. Xu, M. Yang, and K. Yu. 3d convolutional neural networks for human action recognition. TPAMI, 2013.
-
(2013)
TPAMI
-
-
Ji, S.1
Xu, W.2
Yang, M.3
Yu, K.4
-
18
-
-
84877645596
-
Trajectory-based modeling of human actions with motion reference points
-
Y.-G. Jiang, Q. Dai, X. Xue, W. Liu, and C.-W. Ngo. Trajectory-based modeling of human actions with motion reference points. In ECCV, 2012.
-
(2012)
ECCV
-
-
Jiang, Y.-G.1
Dai, Q.2
Xue, X.3
Liu, W.4
Ngo, C.-W.5
-
19
-
-
84905052261
-
-
Y.-G. Jiang, J. Liu, A. R. Zamir, G. Toderici, I. Laptev, M. Shah, and R. Sukthankar. Thumos challenge: Action recognition with a large number of classes. http://crcv.ucf.edu/THUMOS14/, 2014.
-
(2014)
Thumos Challenge: Action Recognition with A Large Number of Classes
-
-
Jiang, Y.-G.1
Liu, J.2
Zamir, A.R.3
Toderici, G.4
Laptev, I.5
Shah, M.6
Sukthankar, R.7
-
20
-
-
79959766559
-
Consumer video understanding: A benchmark database and an evaluation of human and machine performance
-
Y.-G. Jiang, G. Ye, S.-F. Chang, D. Ellis, and A. C. Loui. Consumer video understanding: A benchmark database and an evaluation of human and machine performance. In ICMR, 2011.
-
(2011)
ICMR
-
-
Jiang, Y.-G.1
Ye, G.2
Chang, S.-F.3
Ellis, D.4
Loui, A.C.5
-
21
-
-
84911364368
-
Large-scale video classification with convolutional neural networks
-
A. Karpathy, G. Toderici, S. Shetty, T. Leung, R. Sukthankar, and L. Fei-Fei. Large-scale video classification with convolutional neural networks. In CVPR, 2014.
-
(2014)
CVPR
-
-
Karpathy, A.1
Toderici, G.2
Shetty, S.3
Leung, T.4
Sukthankar, R.5
Fei-Fei, L.6
-
22
-
-
84898426452
-
A spatiotemporal descriptor based on 3d-gradients
-
A. Klaser, M. Marszalek, and C. Schmid. A spatiotemporal descriptor based on 3d-gradients. In BMVC, 2008.
-
(2008)
BMVC
-
-
Klaser, A.1
Marszalek, M.2
Schmid, C.3
-
23
-
-
84856682691
-
Hmdb: A large video database for human motion recognition
-
H. Kuehne, H. Jhuang, E. Garrote, T. Poggio, and T. Serre. Hmdb: A large video database for human motion recognition. In ICCV, 2011.
-
(2011)
ICCV
-
-
Kuehne, H.1
Jhuang, H.2
Garrote, E.3
Poggio, T.4
Serre, T.5
-
24
-
-
84973931670
-
Action recognition by hierarchical mid-level action elements
-
T. Lan, Y. Zhu, A. R. Zamir, and S. Savarese. Action recognition by hierarchical mid-level action elements. In ICCV, 2015.
-
(2015)
ICCV
-
-
Lan, T.1
Zhu, Y.2
Zamir, A.R.3
Savarese, S.4
-
25
-
-
84959241532
-
Beyond Gaussian pyramid: Multi-skip feature stacking for action recognition
-
Z. Lan, M. Lin, X. Li, A. G. Hauptmann, and B. Raj. Beyond Gaussian pyramid: Multi-skip feature stacking for action recognition. In CVPR, 2015.
-
(2015)
CVPR
-
-
Lan, Z.1
Lin, M.2
Li, X.3
Hauptmann, A.G.4
Raj, B.5
-
26
-
-
24944451092
-
On space-time interest points
-
I. Laptev. On space-time interest points. IJCV, 64, 2005.
-
(2005)
IJCV
, vol.64
-
-
Laptev, I.1
-
28
-
-
80052874098
-
Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis
-
Q. V. Le, W. Y. Zou, S. Y. Yeung, and A. Y. Ng. Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis. In CVPR, 2011.
-
(2011)
CVPR
-
-
Le, Q.V.1
Zou, W.Y.2
Yeung, S.Y.3
Ng, A.Y.4
-
29
-
-
77953178862
-
Trajectons: Action recognition through the motion analysis of tracked features
-
P. Matikainen, M. Hebert, and R. Sukthankar. Trajectons: Action recognition through the motion analysis of tracked features. In ICCV Workshops, 2009.
-
(2009)
ICCV Workshops
-
-
Matikainen, P.1
Hebert, M.2
Sukthankar, R.3
-
30
-
-
84959228762
-
Beyond short snippets: Deep networks for video classification
-
J. Y.-H. Ng, M. Hausknecht, S. Vijayanarasimhan, O. Vinyals, R. Monga, and G. Toderici. Beyond short snippets: Deep networks for video classification. In CVPR, 2015.
-
(2015)
CVPR
-
-
Ng, J.Y.-H.1
Hausknecht, M.2
Vijayanarasimhan, S.3
Vinyals, O.4
Monga, R.5
Toderici, G.6
-
31
-
-
84906500926
-
Bag of visual words and fusion methods for action recognition: Comprehensive study and good practice
-
abs/1405.4506
-
X. Peng, L. Wang, X. Wang, and Y. Qiao. Bag of visual words and fusion methods for action recognition: Comprehensive study and good practice. CoRR, abs/1405.4506, 2014.
-
(2014)
CoRR
-
-
Peng, X.1
Wang, L.2
Wang, X.3
Qiao, Y.4
-
32
-
-
84947130265
-
Action recognition with stacked fisher vectors
-
X. Peng, C. Zou, Y. Qiao, and Q. Peng. Action recognition with stacked fisher vectors. In ECCV, 2014.
-
(2014)
ECCV
-
-
Peng, X.1
Zou, C.2
Qiao, Y.3
Peng, Q.4
-
33
-
-
77949275097
-
A survey on vision-based human action recognition
-
R. Poppe. A survey on vision-based human action recognition. Image and Vision Computing, 28(6):976-990, 2010.
-
(2010)
Image and Vision Computing
, vol.28
, Issue.6
, pp. 976-990
-
-
Poppe, R.1
-
34
-
-
84887351648
-
Script data for attribute-based recognition of composite activities
-
M. Rohrbach, M. Regneri, M. Andriluka, S. Amin, M. Pinkal, and B. Schiele. Script data for attribute-based recognition of composite activities. ECCV, 2012.
-
(2012)
ECCV
-
-
Rohrbach, M.1
Regneri, M.2
Andriluka, M.3
Amin, S.4
Pinkal, M.5
Schiele, B.6
-
35
-
-
84947041871
-
Imagenet large scale visual recognition challenge
-
O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, and L. Fei-Fei. ImageNet Large Scale Visual Recognition Challenge. IJCV, 115(3):211-252, 2015.
-
(2015)
IJCV
, vol.115
, Issue.3
, pp. 211-252
-
-
Russakovsky, O.1
Deng, J.2
Su, H.3
Krause, J.4
Satheesh, S.5
Ma, S.6
Huang, Z.7
Karpathy, A.8
Khosla, A.9
Bernstein, M.10
Berg, A.C.11
Fei-Fei, L.12
-
36
-
-
84866718894
-
Action bank: A high-level representation of activity in video
-
S. Sadanand and J. J. Corso. Action bank: A high-level representation of activity in video. In CVPR, 2012.
-
(2012)
CVPR
-
-
Sadanand, S.1
Corso, J.J.2
-
37
-
-
84938235221
-
Fracking deep convolutional image descriptors
-
abs/1412.6537
-
E. Simo-Serra, E. Trulls, L. Ferraz, I. Kokkinos, and F. Moreno-Noguer. Fracking deep convolutional image descriptors. CoRR, abs/1412.6537, 2014.
-
(2014)
CoRR
-
-
Simo-Serra, E.1
Trulls, E.2
Ferraz, L.3
Kokkinos, I.4
Moreno-Noguer, F.5
-
38
-
-
84938239875
-
Deep inside convolutional networks: Visualising image classification models and saliency maps
-
abs/1312.6034
-
K. Simonyan, A. Vedaldi, and A. Zisserman. Deep inside convolutional networks: Visualising image classification models and saliency maps. CoRR, abs/1312.6034, 2013.
-
(2013)
CoRR
-
-
Simonyan, K.1
Vedaldi, A.2
Zisserman, A.3
-
39
-
-
84937862424
-
Two-stream convolutional networks for action recognition in videos
-
K. Simonyan and A. Zisserman. Two-stream convolutional networks for action recognition in videos. In NIPS, 2014.
-
(2014)
NIPS
-
-
Simonyan, K.1
Zisserman, A.2
-
40
-
-
85083953063
-
Very deep convolutional networks for large-scale image recognition
-
K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. ICLR, 2015.
-
(2015)
ICLR
-
-
Simonyan, K.1
Zisserman, A.2
-
41
-
-
84887335980
-
Action recognition by hierarchical sequence summarization
-
Y. Song, L.-P. Morency, and R. Davis. Action recognition by hierarchical sequence summarization. In CVPR, 2013.
-
(2013)
CVPR
-
-
Song, Y.1
Morency, L.-P.2
Davis, R.3
-
42
-
-
84893702065
-
Ucf101: A dataset of 101 human actions classes from videos in the wild
-
abs/1212.0402
-
K. Soomro, A. R. Zamir, and M. Shah. Ucf101: A dataset of 101 human actions classes from videos in the wild. CoRR, abs/1212.0402, 2012.
-
(2012)
CoRR
-
-
Soomro, K.1
Zamir, A.R.2
Shah, M.3
-
43
-
-
84962900096
-
Unsupervised learning of video representations using lstms
-
abs/1502.04681
-
N. Srivastava, E. Mansimov, and R. Salakhutdinov. Unsupervised learning of video representations using lstms. CoRR, abs/1502.04681, 2015.
-
(2015)
CoRR
-
-
Srivastava, N.1
Mansimov, E.2
Salakhutdinov, R.3
-
44
-
-
84898775956
-
Active: Activity concept transitions in video event classification
-
C. Sun and R. Nevatia. Active: Activity concept transitions in video event classification. ICCV, 2013.
-
(2013)
ICCV
-
-
Sun, C.1
Nevatia, R.2
-
45
-
-
84962876036
-
Temporal localization of fine-grained actions in videos by domain transfer from web images
-
C. Sun, S. Shetty, R. Sukthankar, and R. Nevatia. Temporal localization of fine-grained actions in videos by domain transfer from web images. In ACM Multimedia, 2015.
-
(2015)
ACM Multimedia
-
-
Sun, C.1
Shetty, S.2
Sukthankar, R.3
Nevatia, R.4
-
46
-
-
84973863239
-
Human action recognition using factorized spatio-temporal convolutional networks
-
L. Sun, K. Jia, D.-Y. Yeung, and B. E. Shi. Human action recognition using factorized spatio-temporal convolutional networks. In ICCV, 2015.
-
(2015)
ICCV
-
-
Sun, L.1
Jia, K.2
Yeung, D.-Y.3
Shi, B.E.4
-
47
-
-
84866658784
-
Learning latent temporal structure for complex event detection
-
K. Tang, L. Fei-Fei, and D. Koller. Learning latent temporal structure for complex event detection. In CVPR, 2012.
-
(2012)
CVPR
-
-
Tang, K.1
Fei-Fei, L.2
Koller, D.3
-
49
-
-
84973865953
-
Learning spatiotemporal features with 3d convolutional networks
-
D. Tran, L. Bourdev, R. Fergus, L. Torresani, and M. Paluri. Learning spatiotemporal features with 3d convolutional networks. In ICCV, 2015.
-
(2015)
ICCV
-
-
Tran, D.1
Bourdev, L.2
Fergus, R.3
Torresani, L.4
Paluri, M.5
-
51
-
-
84898805910
-
Action recognition with improved trajectories
-
H. Wang and C. Schmid. Action recognition with improved trajectories. In ICCV, 2013.
-
(2013)
ICCV
-
-
Wang, H.1
Schmid, C.2
-
52
-
-
84955282488
-
Action recognition with trajectory-pooled deep-convolutional descriptors
-
L. Wang, Y. Qiao, and X. Tang. Action recognition with trajectory-pooled deep-convolutional descriptors. In CVPR, 2015.
-
(2015)
CVPR
-
-
Wang, L.1
Qiao, Y.2
Tang, X.3
-
53
-
-
84961995462
-
Towards good practices for very deep two-stream convnets
-
abs/1507.02159
-
L. Wang, Y. Xiong, Z. Wang, and Y. Qiao. Towards good practices for very deep two-stream convnets. CoRR, abs/1507.02159, 2015.
-
(2015)
CoRR
-
-
Wang, L.1
Xiong, Y.2
Wang, Z.3
Qiao, Y.4
-
54
-
-
84973889989
-
Unsupervised learning of visual representations using videos
-
X. Wang and A. Gupta. Unsupervised learning of visual representations using videos. ICCV, 2015.
-
(2015)
ICCV
-
-
Wang, X.1
Gupta, A.2
-
55
-
-
79957467077
-
Hidden part models for human action recognition: Probabilistic vs max-margin
-
Y. Wang and G. Mori. Hidden part models for human action recognition: Probabilistic vs. max-margin. TPAMI, 2011.
-
(2011)
TPAMI
-
-
Wang, Y.1
Mori, G.2
-
56
-
-
84911433150
-
Towards good practices for action video encoding
-
J. Wu, Y. Zhang, and W. Lin. Towards good practices for action video encoding. CVPR, 2014.
-
(2014)
CVPR
-
-
Wu, J.1
Zhang, Y.2
Lin, W.3
-
57
-
-
84962921420
-
Modeling spatial-temporal clues in a hybrid deep learning framework for video classification
-
Z. Wu, X. Wang, Y.-G. Jiang, H. Ye, and X. Xue. Modeling spatial-temporal clues in a hybrid deep learning framework for video classification. In ACM Multimedia, 2015.
-
(2015)
ACM Multimedia
-
-
Wu, Z.1
Wang, X.2
Jiang, Y.-G.3
Ye, H.4
Xue, X.5
-
58
-
-
84959226659
-
A discriminative cnn video representation for event detection
-
Z. Xu, Y. Yang, and A. G. Hauptmann. A discriminative cnn video representation for event detection. CVPR, 2015.
-
(2015)
CVPR
-
-
Xu, Z.1
Yang, Y.2
Hauptmann, A.G.3
-
60
-
-
84898805615
-
Action recognition with actons
-
J. Zhu, B. Wang, X. Yang, W. Zhang, and Z. Tu. Action recognition with actons. In ICCV, 2013.
-
(2013)
ICCV
-
-
Zhu, J.1
Wang, B.2
Yang, X.3
Zhang, W.4
Tu, Z.5