-
1
-
-
80052886947
-
-
Aker, A., & Gaizauskas, R. (2010). Generating image descriptions using dependency relational patterns. In ACL.
-
-
-
-
2
-
-
0041876117
-
Barnard, K., Duygulu, P., de Freitas, N., Forsyth, D., Blei, D., & Jordan, M. (2003). Matching words and pictures. Journal of Machine Learning Research, 3, 1107–1135.
-
3
-
-
85083595611
-
-
Berg, T., Berg, A., Edwards, J., & Forsyth, D. (2004). Who’s in the picture? In NIPS.
-
-
-
-
4
-
-
85083603282
-
-
Berg, T., Berg, A., Edwards, J., Maire, M., White, R., Learned-Miller, E., Teh, Y., & Forsyth, D. (2004). Names and faces. In CVPR.
-
-
-
-
5
-
-
85083611986
-
-
Berg, T.L., Berg, A.C., & Shih, J. (2010). Automatic attribute discovery and characterization from noisy web data. In ECCV.
-
-
-
-
6
-
-
85083611323
-
-
Brants, T., & Franz, A. (2006). Web 1T 5-gram version 1. In LDC.
-
-
-
-
7
-
-
85083615261
-
-
Brin, S., & Page, L. (1998). The anatomy of a large-scale hypertextual web search engine. In WWW.
-
-
-
-
8
-
-
84898444828
-
-
Chum, O., Philbin, J., & Zisserman, A. (2008). Near duplicate image detection: min-hash and tf-idf weighting. In BMVC.
-
-
-
-
9
-
-
33645146449
-
-
Dalal, N., & Triggs, B. (2005). Histograms of oriented gradients for human detection. In CVPR.
-
-
-
-
10
-
-
80052910977
-
-
Deng, J., Berg, A.C., & Fei-Fei, L. (2011). Hierarchical semantic indexing for large scale image retrieval. In CVPR.
-
-
-
-
11
-
-
85083614126
-
-
Deng, J., Berg, A.C., Li, K., & Fei-Fei, L. (2010). What does classifying more than 10,000 image categories tell us? In ECCV.
-
-
-
-
12
-
-
84866674680
-
-
Deng, J., Krause, J., Berg, A.C., & Fei-Fei, L. (2012). Hedging your bets: Optimizing accuracy-specificity trade-offs in large scale visual recognition. In CVPR.
-
-
-
-
13
-
-
85162336771
-
-
Deng, J., Satheesh, S., Berg, A.C., & Fei-Fei, L. (2011). Fast and balanced: Efficient label tree learning for large scale object recognition. In NIPS.
-
-
-
-
14
-
-
85083606820
-
-
Duygulu, P., Barnard, K., de Freitas, N., & Forsyth, D. (2002). Object recognition as machine translation. In ECCV.
-
-
-
-
15
-
-
70450207704
-
-
Farhadi, A., Endres, I., Hoiem, D., & Forsyth, D.A. (2009). Describing objects by their attributes. In CVPR.
-
-
-
-
16
-
-
85083615281
-
-
Farhadi, A., Hejrati, M., Sadeghi, A., Young, P., Rashtchian, C., Hockenmaier, J., & Forsyth, D.A. (2010). Every picture tells a story: generating sentences for images. In ECCV.
-
-
-
-
17
-
-
85083604421
-
-
Felzenszwalb, P.F., Girshick, R.B., & McAllester, D. (2011). Discriminatively trained deformable part models, release 4. http://people.cs.uchicago.edu/~pff/latent-release4/
-
-
-
-
18
-
-
80052878949
-
-
Feng, Y., & Lapata, M. (2010). How many words is a picture worth? Automatic caption generation for news images. In ACL.
-
-
-
-
19
-
-
85083601734
-
-
Ferrari, V., & Zisserman, A. (2007). Learning visual attributes. In NIPS.
-
-
-
-
20
-
-
84898773262
-
-
Guadarrama, S., Krishnamoorthy, N., Malkarnenkar, G., Venugopalan, S., Mooney, R., Darrell, T., & Saenko, K. (2013). Youtube2text: Recognizing and describing arbitrary activities using semantic hierarchies and zero-shot recognition. In ICCV.
-
-
-
-
21
-
-
51949088643
-
-
Hays, J., & Efros, A.A. (2008). im2gps: estimating geographic information from a single image. In CVPR.
-
-
-
-
22
-
-
84883394520
-
Hodosh, M., Young, P., & Hockenmaier, J. (2013). Framing image description as a ranking task: Data, models and evaluation metrics. Journal of Artificial Intelligence Research, 47, 853–899.
-
23
-
-
33745947933
-
-
Hoiem, D., Efros, A.A., & Hebert, M. (2005). Geometric context from a single image. In ICCV.
-
-
-
-
24
-
-
50549087889
-
-
Jing, Y., & Baluja, S. (2008). Pagerank for product image search. In WWW.
-
-
-
-
25
-
-
84887601544
-
Kulkarni, G., Premraj, V., Ordonez, V., Dhar, S., Li, S., Choi, Y., Berg, A., & Berg, T. (2013). Babytalk: Understanding and generating simple image descriptions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35, 2891–2903.
-
26
-
-
77953185711
-
-
Kumar, N., Berg, A.C., Belhumeur, P.N., & Nayar, S.K. (2009). Attribute and simile classifiers for face verification. In ICCV.
-
-
-
-
27
-
-
84878189119
-
-
Kuznetsova, P., Ordonez, V., Berg, A., Berg, T.L., & Choi, Y. (2012). Collective generation of natural image descriptions. In ACL.
-
-
-
-
28
-
-
84907331257
-
-
Kuznetsova, P., Ordonez, V., Berg, A.C., Berg, T.L., & Choi, Y. (2013). Generalizing image captions for image-text parallel corpus. In ACL.
-
-
-
-
29
-
-
70450172710
-
-
Lampert, C., Nickisch, H., & Harmeling, S. (2009). Learning to detect unseen object classes by between-class attribute transfer. In CVPR.
-
-
-
-
30
-
-
85139271683
-
-
Leung, T.K., & Malik, J. (1999). Recognizing surfaces using three-dimensional textons. In ICCV.
-
-
-
-
31
-
-
84862279067
-
-
Li, S., Kulkarni, G., Berg, T.L., Berg, A.C., & Choi, Y. (2011). Composing simple image descriptions using web-scale n-grams. In CoNLL.
-
-
-
-
32
-
-
85083608335
-
-
Li, W., Xu, W., Wu, M., Yuan, C., & Lu, Q. (2006). Extractive summarization using inter- and intra-event relevance. In International Conference on Computational Linguistics.
-
-
-
-
33
-
-
85162518327
-
-
Li, L.-J., Su, H., Xing, E.P., & Fei-Fei, L. (2010). Object bank: A high-level image representation for scene classification and semantic feature sparsification. In NIPS.
-
-
-
-
34
-
-
85083610498
-
-
Lin, C.Y. (2004). Rouge: A package for automatic evaluation of summaries. In ACL.
-
-
-
-
35
-
-
3042535216
-
Lowe, D. G. (2004). Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60, 91–110.
-
36
-
-
84906925144
-
-
Mason, R., & Charniak, E. (2014). Nonparametric method for data-driven image captioning. In ACL.
-
-
-
-
37
-
-
85083602457
-
-
Mihalcea, R. (2005). Language independent extractive summarization. In AAAI.
-
-
-
-
38
-
-
85083599449
-
-
Mitchell, M., Dodge, J., Goyal, A., Yamaguchi, K., Stratos, K., Han, X., Mensch, A., Berg, A., Berg, T.L., & Daumé, III, H. (2012). Midge: Generating image descriptions from computer vision detections. In EACL.
-
-
-
-
39
-
-
33750346745
-
-
Nenkova, A., Vanderwende, L., & McKeown, K. (2006). A compositional context sensitive multi-document summarizer: exploring the factors that influence summarization. In SIGIR.
-
-
-
-
40
-
-
0035328421
-
Oliva, A., & Torralba, A. (2001). Modeling the shape of the scene: a holistic representation of the spatial envelope. International Journal of Computer Vision, 42, 145–175.
-
41
-
-
84898828265
-
-
Ordonez, V., Deng, J., Choi, Y., Berg, A.C., & Berg, T.L. (2013). From large scale image categorization to entry-level categories. In ICCV.
-
-
-
-
42
-
-
85162525042
-
-
Ordonez, V., Kulkarni, G., & Berg, T.L. (2011). Im2text: Describing images using 1 million captioned photographs. In NIPS.
-
-
-
-
43
-
-
85083599265
-
-
Papineni, K., Roukos, S., Ward, T., & Zhu, W.-J. (2002). Bleu: A method for automatic evaluation of machine translation. In ACL.
-
-
-
-
44
-
-
36348934026
-
-
Petrov, S., Barrett, L., Thibaux, R., & Klein, D. (2006). Learning accurate, compact, and interpretable tree annotation. In COLING/ACL.
-
-
-
-
45
-
-
84858380058
-
-
Petrov, S., & Klein, D. (2007). Improved inference for unlexicalized parsing. In HLT-NAACL.
-
-
-
-
46
-
-
84977916396
-
-
Radev, D.R., & Allison, T. (2004). Mead—A platform for multidocument multilingual text summarization. In LREC.
-
-
-
-
47
-
-
85083615224
-
-
Rashtchian, C., Young, P., Hodosh, M., & Hockenmaier, J. (2010). Collecting image annotations using amazon’s mechanical turk. In NAACL Workshop Creating Speech and Language Data With Amazon’s Mechanical Turk.
-
-
-
-
48
-
-
57349171311
-
-
Roelleke, T., & Wang, J. (2008). Tf-idf uncovered: a study of theories and probabilities. In SIGIR.
-
-
-
-
49
-
-
0345414182
-
-
Sivic, J., & Zisserman, A. (2003). Video google: A text retrieval approach to object matching in videos. In ICCV.
-
-
-
-
50
-
-
85083604656
-
-
Stratos, K., Sood, A., Mensch, A., Han, X., Mitchell, M., Yamaguchi, K., Dodge, J., Goyal, A., Daumé, III, H., Berg, A., & Berg, T.L. (2012). Understanding and predicting importance in images. In CVPR.
-
-
-
-
51
-
-
85083597773
-
-
Tighe, J., & Lazebnik, S. (2010). Superparsing: Scalable nonparametric image parsing with superpixels. In ECCV.
-
-
-
-
52
-
-
54749092170
-
Torralba, A., Fergus, R., & Freeman, W. (2008). 80 million tiny images: a large dataset for non-parametric object and scene recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30, 1958–1970.
-
53
-
-
80053403625
-
-
Wong, K.F., Wu, M., & Li, W. (2008). Extractive summarization using supervised and semi-supervised learning. In COLING.
-
-
-
-
54
-
-
77955988947
-
-
Xiao, J., Hays, J., Ehinger, K., Oliva, A., & Torralba, A. (2010). Sun database: Large-scale scene recognition from abbey to zoo. In CVPR.
-
-
-
-
55
-
-
80053258778
-
-
Yang, Y., Teo, C.L., Daumé, III, H., & Aloimonos, Y. (2011). Corpus-guided sentence generation of natural images. In EMNLP.
-
-
-
-
56
-
-
77954862144
-
Yao, B., Yang, X., Lin, L., Lee, M. W., & Zhu, S. C. (2010). I2t: Image parsing to text description. Proceedings of the IEEE.
-
57
-
-
84906494296
-
Young, P., Lai, A., Hodosh, M., & Hockenmaier, J. (2014). From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions. Transactions of the Association for Computational Linguistics, 2, 67–78.