Volume 119, Issue 1, 2016, Pages 46-59

Large Scale Retrieval and Generation of Image Descriptions

Author keywords

Big data; Data driven; Image description; Natural language processing; Retrieval

Indexed keywords

ALGORITHMS; BIG DATA; COMPUTATIONAL LINGUISTICS; DATA VISUALIZATION; NATURAL LANGUAGE PROCESSING SYSTEMS; OPTIMIZATION;

EID: 84936796885     PISSN: 09205691     EISSN: 15731405     Source Type: Journal    
DOI: 10.1007/s11263-015-0840-y     Document Type: Article
Times cited: 64

References (57)
  • 1
    • Aker, A., & Gaizauskas, R. (2010). Generating image descriptions using dependency relational patterns. In ACL.
  • 3
    • Berg, T., Berg, A., Edwards, J., & Forsyth, D. (2004). Who’s in the picture? In NIPS.
  • 4
    • Berg, T., Berg, A., Edwards, J., Maire, M., White, R., Learned-Miller, E., Teh, Y., & Forsyth, D. (2004). Names and faces. In CVPR.
  • 5
    • Berg, T.L., Berg, A.C., & Shih, J. (2010). Automatic attribute discovery and characterization from noisy web data. In ECCV.
  • 6
    • Brants, T., & Franz, A. (2006). Web 1T 5-gram version 1. In LDC.
  • 7
    • Brin, S., & Page, L. (1998). The anatomy of a large-scale hypertextual web search engine. In WWW.
  • 8
    • Chum, O., Philbin, J., & Zisserman, A. (2008). Near duplicate image detection: min-hash and tf-idf weighting. In BMVC.
  • 9
    • Dalal, N., & Triggs, B. (2005). Histograms of oriented gradients for human detection. In CVPR.
  • 10
    • Deng, J., Berg, A.C., & Fei-Fei, L. (2011). Hierarchical semantic indexing for large scale image retrieval. In CVPR.
  • 11
    • Deng, J., Berg, A.C., Li, K., & Fei-Fei, L. (2010). What does classifying more than 10,000 image categories tell us? In ECCV.
  • 12
    • Deng, J., Krause, J., Berg, A.C., & Fei-Fei, L. (2012). Hedging your bets: Optimizing accuracy-specificity trade-offs in large scale visual recognition. In CVPR.
  • 13
    • Deng, J., Satheesh, S., Berg, A.C., & Fei-Fei, L. (2011). Fast and balanced: Efficient label tree learning for large scale object recognition. In NIPS.
  • 14
    • Duygulu, P., Barnard, K., de Freitas, N., & Forsyth, D. (2002). Object recognition as machine translation. In ECCV.
  • 15
    • Farhadi, A., Endres, I., Hoiem, D., & Forsyth, D.A. (2009). Describing objects by their attributes. In CVPR.
  • 16
    • Farhadi, A., Hejrati, M., Sadeghi, A., Young, P., Rashtchian, C., Hockenmaier, J., & Forsyth, D.A. (2010). Every picture tells a story: generating sentences for images. In ECCV.
  • 17
    • Felzenszwalb, P.F., Girshick, R.B., & McAllester, D. (2011). Discriminatively trained deformable part models, release 4. http://people.cs.uchicago.edu/~pff/latent-release4/
  • 18
    • Feng, Y., & Lapata, M. (2010). How many words is a picture worth? automatic caption generation for news images. In ACL.
  • 19
    • Ferrari, V., & Zisserman, A. (2007). Learning visual attributes. In NIPS.
  • 20
    • Guadarrama, S., Krishnamoorthy, N., Malkarnenkar, G., Venugopalan, S., Mooney, R., Darrell, T., & Saenko, K. (2013). Youtube2text: Recognizing and describing arbitrary activities using semantic hierarchies and zero-shot recognition. In ICCV.
  • 21
    • Hays, J., & Efros, A.A. (2008). im2gps: estimating geographic information from a single image. In CVPR.
  • 22
    • Hodosh, M., Young, P., & Hockenmaier, J. (2013). Framing image description as a ranking task: Data, models and evaluation metrics. Journal of Artificial Intelligence Research, 47, 853–899.
  • 23
    • Hoiem, D., Efros, A.A., & Hebert, M. (2005). Geometric context from a single image. In ICCV.
  • 24
    • Jing, Y., & Baluja, S. (2008). Pagerank for product image search. In WWW.
  • 26
    • Kumar, N., Berg, A.C., Belhumeur, P.N., & Nayar, S.K. (2009). Attribute and simile classifiers for face verification. In ICCV.
  • 27
    • Kuznetsova, P., Ordonez, V., Berg, A., Berg, T.L., & Choi, Y. (2012). Collective generation of natural image descriptions. In ACL.
  • 28
    • Kuznetsova, P., Ordonez, V., Berg, A.C., Berg, T.L., & Choi, Y. (2013). Generalizing image captions for image-text parallel corpus. In ACL.
  • 29
    • Lampert, C., Nickisch, H., & Harmeling, S. (2009). Learning to detect unseen object classes by between-class attribute transfer. In CVPR.
  • 30
    • Leung, T.K., & Malik, J. (1999). Recognizing surfaces using three-dimensional textons. In ICCV.
  • 31
    • Li, S., Kulkarni, G., Berg, T.L., Berg, A.C., & Choi, Y. (2011). Composing simple image descriptions using web-scale n-grams. In CoNLL.
  • 32
    • Li, W., Xu, W., Wu, M., Yuan, C., & Lu, Q. (2006). Extractive summarization using inter- and intra- event relevance. In International Conference on Computational Linguistics.
  • 33
    • Li, L.-J., Su, H., Xing, E.P., & Fei-Fei, L. (2010). Object bank: A high-level image representation for scene classification and semantic feature sparsification. In NIPS.
  • 34
    • Lin, C.Y. (2004). Rouge: A package for automatic evaluation of summaries. In ACL.
  • 35
    • Lowe, D. G. (2004). Distinctive image features from scale invariant keypoints. International Journal of Computer Vision, 60, 91–110.
  • 36
    • Mason, R., & Charniak, E. (2014). Nonparametric method for data-driven image captioning. In ACL.
  • 37
    • Mihalcea, R. (2005). Language independent extractive summarization. In AAAI.
  • 38
    • Mitchell, M., Dodge, J., Goyal, A., Yamaguchi, K., Stratos, K., Han, X., Mensch, A., Berg, A., Berg, T.L., & Daumé, III, H. (2012). Midge: Generating image descriptions from computer vision detections. In EACL.
  • 39
    • Nenkova, A., Vanderwende, L., & McKeown, K. (2006). A compositional context sensitive multi-document summarizer: exploring the factors that influence summarization. In SIGIR.
  • 40
    • Oliva, A., & Torralba, A. (2001). Modeling the shape of the scene: a holistic representation of the spatial envelope. International Journal of Computer Vision, 42, 145–175.
  • 41
    • Ordonez, V., Deng, J., Choi, Y., Berg, A.C., & Berg, T.L. (2013). From large scale image categorization to entry-level categories. In ICCV.
  • 42
    • Ordonez, V., Kulkarni, G., & Berg, T.L. (2011). Im2text: Describing images using 1 million captioned photographs. In NIPS.
  • 43
    • Papineni, K., Roukos, S., Ward, T., & Zhu, W.-J. (2002). Bleu: A method for automatic evaluation of machine translation. In ACL.
  • 44
    • Petrov, S., Barrett, L., Thibaux, R., & Klein, D. (2006). Learning accurate, compact, and interpretable tree annotation. In COLING/ACL.
  • 45
    • Petrov, S., & Klein, D. (2007). Improved inference for unlexicalized parsing. In HLT-NAACL.
  • 46
    • Radev, D.R., & Allison, T. (2004). Mead—A platform for multidocument multilingual text summarization. In LREC.
  • 47
    • Rashtchian, C., Young, P., Hodosh, M., & Hockenmaier, J. (2010). Collecting image annotations using Amazon’s Mechanical Turk. In NAACL Workshop Creating Speech and Language Data With Amazon’s Mechanical Turk.
  • 48
    • Roelleke, T., & Wang, J. (2008). Tf-idf uncovered: a study of theories and probabilities. In SIGIR.
  • 49
    • Sivic, J., & Zisserman, A. (2003). Video google: A text retrieval approach to object matching in videos. In ICCV.
  • 50
    • Stratos, K., Sood, A., Mensch, A., Han, X., Mitchell, M., Yamaguchi, K., Dodge, J., Goyal, A., Daumé, III, H., Berg, A., & Berg, T.L. (2012). Understanding and predicting importance in images. In CVPR.
  • 51
    • Tighe, J., & Lazebnik, S. (2010). Superparsing: Scalable nonparametric image parsing with superpixels. In ECCV.
  • 53
    • Wong, K.F., Wu, M., & Li, W. (2008). Extractive summarization using supervised and semi-supervised learning. In COLING.
  • 54
    • Xiao, J., Hays, J., Ehinger, K., Oliva, A., & Torralba, A. (2010). Sun database: Large-scale scene recognition from abbey to zoo. In CVPR.
  • 55
    • Yang, Y., Teo, C.L., Daumé, III, H., & Aloimonos, Y. (2011). Corpus-guided sentence generation of natural images. In EMNLP.
  • 57


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.