-
1
-
-
84898797429
-
Monte Carlo tree search for scheduling activity recognition
-
M. R. Amer, S. Todorovic, A. Fern, and S.-C. Zhu. Monte Carlo tree search for scheduling activity recognition. In ICCV, 2013.
-
(2013)
ICCV
-
-
Amer, M.R.1
Todorovic, S.2
Fern, A.3
Zhu, S.-C.4
-
2
-
-
84900675076
-
Diffrac: A discriminative and flexible framework for clustering
-
F. Bach and Z. Harchaoui. Diffrac: A discriminative and flexible framework for clustering. In NIPS, 2007.
-
(2007)
NIPS
-
-
Bach, F.1
Harchaoui, Z.2
-
3
-
-
0041876117
-
Matching words and pictures
-
K. Barnard, P. Duygulu, D. A. Forsyth, N. de Freitas, D. M. Blei, and M. I. Jordan. Matching words and pictures. JMLR, 2003.
-
(2003)
JMLR
-
-
Barnard, K.1
Duygulu, P.2
Forsyth, D.A.3
De Freitas, N.4
Blei, D.M.5
Jordan, M.I.6
-
6
-
-
84898792367
-
Finding actors and actions in movies
-
P. Bojanowski, F. Bach, I. Laptev, J. Ponce, C. Schmid, and J. Sivic. Finding actors and actions in movies. In ICCV, 2013.
-
(2013)
ICCV
-
-
Bojanowski, P.1
Bach, F.2
Laptev, I.3
Ponce, J.4
Schmid, C.5
Sivic, J.6
-
7
-
-
84943800045
-
Weakly supervised action labeling in videos under ordering constraints
-
P. Bojanowski, R. Lajugie, F. Bach, I. Laptev, J. Ponce, C. Schmid, and J. Sivic. Weakly supervised action labeling in videos under ordering constraints. In ECCV, 2014.
-
(2014)
ECCV
-
-
Bojanowski, P.1
Lajugie, R.2
Bach, F.3
Laptev, I.4
Ponce, J.5
Schmid, C.6
Sivic, J.7
-
8
-
-
84944046597
-
-
Long-term recurrent convolutional networks for visual recognition and description
-
J. Donahue, L. A. Hendricks, S. Guadarrama, M. Rohrbach, S. Venugopalan, K. Saenko, and T. Darrell. Long-term recurrent convolutional networks for visual recognition and description. arXiv preprint arXiv:1411.4389, 2014.
-
(2014)
arXiv preprint arXiv:1411.4389
-
-
Donahue, J.1
Hendricks, L.A.2
Guadarrama, S.3
Rohrbach, M.4
Venugopalan, S.5
Saenko, K.6
Darrell, T.7
-
9
-
-
0038401728
-
Object recognition as machine translation: Learning a lexicon for a fixed image vocabulary
-
P. Duygulu, K. Barnard, J. F. G. de Freitas, and D. A. Forsyth. Object recognition as machine translation: Learning a lexicon for a fixed image vocabulary. In ECCV, 2002.
-
(2002)
ECCV
-
-
Duygulu, P.1
Barnard, K.2
De Freitas, J.F.G.3
Forsyth, D.A.4
-
10
-
-
80052017343
-
Every picture tells a story: Generating sentences from images
-
A. Farhadi, S. M. M. Hejrati, M. A. Sadeghi, P. Young, C. Rashtchian, J. Hockenmaier, and D. A. Forsyth. Every picture tells a story: Generating sentences from images. In ECCV, 2010.
-
(2010)
ECCV
-
-
Farhadi, A.1
Hejrati, S.M.M.2
Sadeghi, M.A.3
Young, P.4
Rashtchian, C.5
Hockenmaier, J.6
Forsyth, D.A.7
-
12
-
-
84898958665
-
Devise: A deep visual-semantic embedding model
-
A. Frome, G. S. Corrado, J. Shlens, S. Bengio, J. Dean, M. Ranzato, and T. Mikolov. Devise: A deep visual-semantic embedding model. In NIPS, 2013.
-
(2013)
NIPS
-
-
Frome, A.1
Corrado, G.S.2
Shlens, J.3
Bengio, S.4
Dean, J.5
Ranzato, M.6
Mikolov, T.7
-
13
-
-
80052915321
-
Actom sequence models for efficient action detection
-
A. Gaidon, Z. Harchaoui, and C. Schmid. Actom sequence models for efficient action detection. In CVPR, 2011.
-
(2011)
CVPR
-
-
Gaidon, A.1
Harchaoui, Z.2
Schmid, C.3
-
14
-
-
84894905366
-
A multi-view embedding space for modeling internet images, tags, and their semantics
-
Y. Gong, Q. Ke, M. Isard, and S. Lazebnik. A multi-view embedding space for modeling internet images, tags, and their semantics. IJCV, 2014.
-
(2014)
IJCV
-
-
Gong, Y.1
Ke, Q.2
Isard, M.3
Lazebnik, S.4
-
15
-
-
84959394156
-
A Markovian approach to distributional semantics with application to semantic compositionality
-
E. Grave, G. Obozinski, and F. Bach. A Markovian approach to distributional semantics with application to semantic compositionality. In COLING, 2014.
-
(2014)
COLING
-
-
Grave, E.1
Obozinski, G.2
Bach, F.3
-
16
-
-
84898930423
-
Convex relaxations of latent variable training
-
Y. Guo and D. Schuurmans. Convex relaxations of latent variable training. In NIPS, 2007.
-
(2007)
NIPS
-
-
Guo, Y.1
Schuurmans, D.2
-
17
-
-
10044285992
-
Canonical correlation analysis: An overview with application to learning methods
-
D. Hardoon, S. Szedmak, and J. Shawe-Taylor. Canonical correlation analysis: An overview with application to learning methods. Neural Computation, 16(12):2639-2664, 2004.
-
(2004)
Neural Computation
, vol.16
, Issue.12
, pp. 2639-2664
-
-
Hardoon, D.1
Szedmak, S.2
Shawe-Taylor, J.3
-
18
-
-
84883394520
-
Framing image description as a ranking task: Data, models and evaluation metrics
-
M. Hodosh, P. Young, and J. Hockenmaier. Framing image description as a ranking task: Data, models and evaluation metrics. JAIR, pages 853-899, 2013.
-
(2013)
JAIR
, pp. 853-899
-
-
Hodosh, M.1
Young, P.2
Hockenmaier, J.3
-
19
-
-
0000107975
-
Relations between two sets of variates
-
H. Hotelling. Relations between two sets of variates. Biometrika, 28:321-377, 1936.
-
(1936)
Biometrika
, vol.28
, pp. 321-377
-
-
Hotelling, H.1
-
20
-
-
77955990943
-
Discriminative clustering for image co-segmentation
-
A. Joulin, F. Bach, and J. Ponce. Discriminative clustering for image co-segmentation. In CVPR, 2010.
-
(2010)
CVPR
-
-
Joulin, A.1
Bach, F.2
Ponce, J.3
-
22
-
-
84943738421
-
Efficient image and video co-localization with Frank-Wolfe algorithm
-
A. Joulin, K. Tang, and L. Fei-Fei. Efficient image and video co-localization with Frank-Wolfe algorithm. In ECCV, 2014.
-
(2014)
ECCV
-
-
Joulin, A.1
Tang, K.2
Fei-Fei, L.3
-
23
-
-
84937843643
-
Deep fragment embeddings for bidirectional image sentence mapping
-
A. Karpathy, A. Joulin, and L. Fei-Fei. Deep fragment embeddings for bidirectional image sentence mapping. In NIPS, 2014.
-
(2014)
NIPS
-
-
Karpathy, A.1
Joulin, A.2
Fei-Fei, L.3
-
24
-
-
84915757230
-
Combining per-frame and per-track cues for multi-person action recognition
-
S. Khamis, V. I. Morariu, and L. S. Davis. Combining per-frame and per-track cues for multi-person action recognition. In ECCV, 2012.
-
(2012)
ECCV
-
-
Khamis, S.1
Morariu, V.I.2
Davis, L.S.3
-
25
-
-
80052882471
-
Scenario-based video event recognition by constraint flow
-
S. Kwak, B. Han, and J. H. Han. Scenario-based video event recognition by constraint flow. In CVPR, 2011.
-
(2011)
CVPR
-
-
Kwak, S.1
Han, B.2
Han, J.H.3
-
27
-
-
34948883502
-
Leveraging temporal, contextual and ordering constraints for recognizing complex activities in video
-
B. Laxton, J. Lim, and D. J. Kriegman. Leveraging temporal, contextual and ordering constraints for recognizing complex activities in video. In CVPR, 2007.
-
(2007)
CVPR
-
-
Laxton, B.1
Lim, J.2
Kriegman, D.J.3
-
29
-
-
84959916685
-
What's cookin'? Interpreting cooking videos using text, speech and vision
-
J. Malmaud, J. Huang, V. Rathod, N. Johnston, A. Rabinovich, and K. Murphy. What's cookin'? interpreting cooking videos using text, speech and vision. NAACL, 2015.
-
(2015)
NAACL
-
-
Malmaud, J.1
Huang, J.2
Rathod, V.3
Johnston, N.4
Rabinovich, A.5
Murphy, K.6
-
30
-
-
85117622017
-
The Stanford CoreNLP natural language processing toolkit
-
C. D. Manning, M. Surdeanu, J. Bauer, J. Finkel, S. J. Bethard, and D. McClosky. The Stanford CoreNLP natural language processing toolkit. In ACL (Demo.), 2014.
-
(2014)
ACL (Demo.)
-
-
Manning, C.D.1
Surdeanu, M.2
Bauer, J.3
Finkel, J.4
Bethard, S.J.5
McClosky, D.6
-
32
-
-
84898956512
-
Distributed representations of words and phrases and their compositionality
-
T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean. Distributed representations of words and phrases and their compositionality. In NIPS, 2013.
-
(2013)
NIPS
-
-
Mikolov, T.1
Sutskever, I.2
Chen, K.3
Corrado, G.S.4
Dean, J.5
-
33
-
-
85162522202
-
Im2text: Describing images using 1 million captioned photographs
-
V. Ordonez, G. Kulkarni, and T. L. Berg. Im2text: Describing images using 1 million captioned photographs. In NIPS, 2011.
-
(2011)
NIPS
-
-
Ordonez, V.1
Kulkarni, G.2
Berg, T.L.3
-
35
-
-
84884994717
-
Addressing big data time series: Mining trillions of time series subsequences under dynamic time warping
-
T. Rakthanmanon, B. Campana, A. Mueen, G. Batista, B. Westover, Q. Zhu, J. Zakaria, and E. Keogh. Addressing big data time series: Mining trillions of time series subsequences under dynamic time warping. ACM Trans. on Knowledge Discovery from Data (TKDD), 7(3):10, 2013.
-
(2013)
ACM Trans. on Knowledge Discovery from Data (TKDD)
, vol.7
, Issue.3
, pp. 10
-
-
Rakthanmanon, T.1
Campana, B.2
Mueen, A.3
Batista, G.4
Westover, B.5
Zhu, Q.6
Zakaria, J.7
Keogh, E.8
-
36
-
-
84943782750
-
Linking people with "their" names using coreference resolution
-
V. Ramanathan, A. Joulin, P. Liang, and L. Fei-Fei. Linking people with "their" names using coreference resolution. In ECCV, 2014.
-
(2014)
ECCV
-
-
Ramanathan, V.1
Joulin, A.2
Liang, P.3
Fei-Fei, L.4
-
37
-
-
84898785648
-
Grounding action descriptions in videos
-
M. Regneri, M. Rohrbach, D. Wetzel, S. Thater, B. Schiele, and M. Pinkal. Grounding action descriptions in videos. TACL, 1:25-36, 2013.
-
(2013)
TACL
, vol.1
, pp. 25-36
-
-
Regneri, M.1
Rohrbach, M.2
Wetzel, D.3
Thater, S.4
Schiele, B.5
Pinkal, M.6
-
38
-
-
84898775239
-
Translating video content to natural language descriptions
-
M. Rohrbach, W. Qiu, I. Titov, S. Thater, M. Pinkal, and B. Schiele. Translating video content to natural language descriptions. In ICCV, 2013.
-
(2013)
ICCV
-
-
Rohrbach, M.1
Qiu, W.2
Titov, I.3
Thater, S.4
Pinkal, M.5
Schiele, B.6
-
39
-
-
33845588233
-
Recognition of composite human activities through context-free grammar based representation
-
M. S. Ryoo and J. K. Aggarwal. Recognition of composite human activities through context-free grammar based representation. In CVPR, 2006.
-
(2006)
CVPR
-
-
Ryoo, M.S.1
Aggarwal, J.K.2
-
41
-
-
80052901415
-
Modeling the temporal extent of actions
-
S. Satkin and M. Hebert. Modeling the temporal extent of actions. In ECCV, 2010.
-
(2010)
ECCV
-
-
Satkin, S.1
Hebert, M.2
-
42
-
-
77955998009
-
Connecting modalities: Semisupervised segmentation and annotation of images using unaligned text corpora
-
R. Socher and L. Fei-Fei. Connecting modalities: Semisupervised segmentation and annotation of images using unaligned text corpora. In CVPR, 2010.
-
(2010)
CVPR
-
-
Socher, R.1
Fei-Fei, L.2
-
43
-
-
84964474107
-
Grounded compositional semantics for finding and describing images with sentences
-
R. Socher, A. Karpathy, Q. V. Le, C. D. Manning, and A. Y. Ng. Grounded compositional semantics for finding and describing images with sentences. TACL, 2014.
-
(2014)
TACL
-
-
Socher, R.1
Karpathy, A.2
Le, Q.V.3
Manning, C.D.4
Ng, A.Y.5
-
44
-
-
84866659479
-
"Knock! knock! who is it?" probabilistic person identification in tv-series
-
M. Tapaswi, M. Bäuml, and R. Stiefelhagen. "knock! knock! who is it?" probabilistic person identification in tv-series. In CVPR, 2012.
-
(2012)
CVPR
-
-
Tapaswi, M.1
Bäuml, M.2
Stiefelhagen, R.3
-
45
-
-
84959255361
-
Book2movie: Aligning video scenes with book chapters
-
M. Tapaswi, M. Bäuml, and R. Stiefelhagen. Book2movie: Aligning video scenes with book chapters. In CVPR, 2015.
-
(2015)
CVPR
-
-
Tapaswi, M.1
Bäuml, M.2
Stiefelhagen, R.3
-
46
-
-
84898805910
-
Action recognition with improved trajectories
-
H. Wang and C. Schmid. Action recognition with improved trajectories. In ICCV, 2013.
-
(2013)
ICCV
-
-
Wang, H.1
Schmid, C.2
-
47
-
-
78149328370
-
Canonical time warping for alignment of human behavior
-
F. Zhou and F. De La Torre. Canonical time warping for alignment of human behavior. NIPS, 2009.
-
(2009)
NIPS
-
-
Zhou, F.1
De La Torre, F.2