SCOPUS Information Search Platform

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Volume 8692 LNCS, Issue PART 4, 2014, Pages 529-545

Improving image-sentence embeddings using large weakly annotated photo collections

Authors (5): Gong, Yunchao a; Wang, Liwei b; Hodosh, Micah b; Hockenmaier, Julia b; Lazebnik, Svetlana b

a UNIVERSITY OF NORTH CAROLINA (United States)

b UNIVERSITY OF ILLINOIS AT URBANA CHAMPAIGN (United States)

Author keywords

[No Author keywords available]

Indexed keywords

COMPUTER SCIENCE; COMPUTERS; ARTIFICIAL INTELLIGENCE; BIOINFORMATICS;

EMBEDDING METHOD; EMBEDDINGS; IMAGE DESCRIPTIONS; PHOTO COLLECTIONS; TRAINING IMAGE; TRAINING SETS;

ARTIFICIAL INTELLIGENCE; COMPUTER VISION;

EID: 84906484732 PISSN: 03029743 EISSN: 16113349 Source Type: Book Series
DOI: 10.1007/978-3-319-10593-2_35 Document Type: Conference Paper

Times cited : (212)

References (41)

1
- 78149311145
- Every picture tells a story: Generating sentences from images
- Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part IV. Springer, Heidelberg
- Farhadi, A., Hejrati, M., Sadeghi, M.A., Young, P., Rashtchian, C., Hockenmaier, J., Forsyth, D.: Every picture tells a story: Generating sentences from images. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part IV. LNCS, vol. 6314, pp. 15-29. Springer, Heidelberg (2010)
- (2010) LNCS , vol.6314 , pp. 15-29
- Farhadi, A.¹ Hejrati, M.² Sadeghi, M.A.³ Young, P.⁴ Rashtchian, C.⁵ Hockenmaier, J.⁶ Forsyth, D.⁷

2
- 80052901011
- Baby talk: Understanding and generating image descriptions
- Kulkarni, G., Premraj, V., Dhar, S., Li, S., Choi, Y., Berg, A.C., Berg, T.L.: Baby talk: Understanding and generating image descriptions. In: CVPR (2011)
- (2011) CVPR
- Kulkarni, G.¹ Premraj, V.² Dhar, S.³ Li, S.⁴ Choi, Y.⁵ Berg, A.C.⁶ Berg, T.L.⁷

3
- 84862279067
- Composing simple image descriptions using web-scale n-grams
- Li, S., Kulkarni, G., Berg, T.L., Berg, A.C., Choi, Y.: Composing simple image descriptions using web-scale n-grams. In: CoNLL (2011)
- (2011) CoNLL
- Li, S.¹ Kulkarni, G.² Berg, T.L.³ Berg, A.C.⁴ Choi, Y.⁵

4
- 85034832841
- Midge: Generating image descriptions from computer vision detections
- Mitchell, M., Han, X., Dodge, J., Mensch, A., Goyal, A., Berg, A., Yamaguchi, K., Berg, T., Stratos, K., Daumé III, H.: Midge: Generating image descriptions from computer vision detections. In: EACL (2012)
- (2012) EACL
- Mitchell, M.¹ Han, X.² Dodge, J.³ Mensch, A.⁴ Goyal, A.⁵ Berg, A.⁶ Yamaguchi, K.⁷ Berg, T.⁸ Stratos, K.⁹ Daumé III, H.¹⁰

5
- 84887365305
- A sentence is worth a thousand pixels
- Fidler, S., Sharma, A., Urtasun, R.: A sentence is worth a thousand pixels. In: CVPR (2013)
- (2013) CVPR
- Fidler, S.¹ Sharma, A.² Urtasun, R.³

6
- 77954862144
- I2T: Image parsing to text description
- Yao, B.Z., Yang, X., Lin, L., Lee, M.W., Zhu, S.C.: I2T: Image parsing to text description. Proceedings of the IEEE 98 (2010)
- (2010) Proceedings of the IEEE , vol.98
- Yao, B.Z.¹ Yang, X.² Lin, L.³ Lee, M.W.⁴ Zhu, S.C.⁵

7
- 84883394520
- Framing image description as a ranking task: Data, models and evaluation metrics
- Hodosh, M., Young, P., Hockenmaier, J.: Framing image description as a ranking task: Data, models and evaluation metrics. Journal of Artificial Intelligence Research (2013)
- (2013) Journal of Artificial Intelligence Research
- Hodosh, M.¹ Young, P.² Hockenmaier, J.³

8
- 85162522202
- Im2Text: Describing images using 1 million captioned photographs
- Ordonez, V., Kulkarni, G., Berg, T.L.: Im2Text: Describing images using 1 million captioned photographs. In: NIPS (2011)
- (2011) NIPS
- Ordonez, V.¹ Kulkarni, G.² Berg, T.L.³

9
- 84906925854
- Grounded compositional semantics for finding and describing images with sentences
- Socher, R., Le, Q.V., Manning, C.D., Ng, A.Y.: Grounded compositional semantics for finding and describing images with sentences. In: ACL (2013)
- (2013) ACL
- Socher, R.¹ Le, Q.V.² Manning, C.D.³ Ng, A.Y.⁴

10
- 84878189119
- Collective generation of natural image descriptions
- Kuznetsova, P., Ordonez, V., Berg, A.C., Berg, T.L., Choi, Y.: Collective generation of natural image descriptions. In: ACL (2012)
- (2012) ACL
- Kuznetsova, P.¹ Ordonez, V.² Berg, A.C.³ Berg, T.L.⁴ Choi, Y.⁵

11
- 85133336275
- Bleu: A method for automatic evaluation of machine translation
- Papineni, K., Roukos, S., Ward, T., Zhu, W.J.: Bleu: a method for automatic evaluation of machine translation. In: ACL, pp. 311-318 (2002)
- (2002) ACL , pp. 311-318
- Papineni, K.¹ Roukos, S.² Ward, T.³ Zhu, W.J.⁴

12
- 10044285992
- Canonical correlation analysis: An overview with application to learning methods
- Hardoon, D., Szedmak, S., Shawe-Taylor, J.: Canonical correlation analysis: An overview with application to learning methods. Neural Computation 16 (2004)
- (2004) Neural Computation , vol.16
- Hardoon, D.¹ Szedmak, S.² Shawe-Taylor, J.³

13
- 84906498766
- A multi-view embedding space for modeling internet images, tags, and their semantics
- Gong, Y., Ke, Q., Isard, M., Lazebnik, S.: A multi-view embedding space for modeling internet images, tags, and their semantics. IJCV (2013)
- (2013) IJCV
- Gong, Y.¹ Ke, Q.² Isard, M.³ Lazebnik, S.⁴

14
- 84897476317
- Connecting the dots with landmarks: Discriminatively learning domain-invariant features for unsupervised domain adaptation
- Gong, B., Grauman, K., Sha, F.: Connecting the dots with landmarks: Discriminatively learning domain-invariant features for unsupervised domain adaptation. In: ICML, pp. 222-230 (2013)
- (2013) ICML , pp. 222-230
- Gong, B.¹ Grauman, K.² Sha, F.³

15
- 78149318752
- Adapting visual category models to new domains
- Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part IV. Springer, Heidelberg
- Saenko, K., Kulis, B., Fritz, M., Darrell, T.: Adapting visual category models to new domains. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part IV. LNCS, vol. 6314, pp. 213-226. Springer, Heidelberg (2010)
- (2010) LNCS , vol.6314 , pp. 213-226
- Saenko, K.¹ Kulis, B.² Fritz, M.³ Darrell, T.⁴

16
- 82455188167
- Data-driven visual similarity for cross-domain image matching
- Shrivastava, A., Malisiewicz, T., Gupta, A., Efros, A.A.: Data-driven visual similarity for cross-domain image matching. ACM SIGGRAPH ASIA 30(6) (2011)
- (2011) ACM SIGGRAPH ASIA , vol.30 , Issue.6
- Shrivastava, A.¹ Malisiewicz, T.² Gupta, A.³ Efros, A.A.⁴

17
- 34547673022
- Scene completion using millions of photographs
- Hays, J., Efros, A.A.: Scene completion using millions of photographs. ACM Transactions on Graphics (SIGGRAPH) 26(3) (2007)
- (2007) ACM Transactions on Graphics (SIGGRAPH) , vol.26 , Issue.3
- Hays, J.¹ Efros, A.A.²

18
- 84866661767
- Large-scale knowledge transfer for object localization in ImageNet
- Guillaumin, M., Ferrari, V.: Large-scale knowledge transfer for object localization in ImageNet. In: CVPR, pp. 3202-3209 (2012)
- (2012) CVPR , pp. 3202-3209
- Guillaumin, M.¹ Ferrari, V.²

19
- 77956006653
- Multimodal semi-supervised learning for image classification
- Guillaumin, M., Verbeek, J., Schmid, C.: Multimodal semi-supervised learning for image classification. In: CVPR, pp. 902-909 (2010)
- (2010) CVPR , pp. 902-909
- Guillaumin, M.¹ Verbeek, J.² Schmid, C.³

20
- 35148862171
- Learning visual representations using images with captions
- Quattoni, A., Collins, M., Darrell, T.: Learning visual representations using images with captions. In: CVPR (2007)
- (2007) CVPR
- Quattoni, A.¹ Collins, M.² Darrell, T.³

21
- 70450207253
- Building text features for object image classification
- Wang, G., Hoiem, D., Forsyth, D.: Building text features for object image classification. In: CVPR (2009)
- (2009) CVPR
- Wang, G.¹ Hoiem, D.² Forsyth, D.³

22
- 84906494296
- From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions
- Young, P., Lai, A., Hodosh, M., Hockenmaier, J.: From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions. In: TACL (2014)
- (2014) TACL
- Young, P.¹ Lai, A.² Hodosh, M.³ Hockenmaier, J.⁴

23
- 0035328421
- Modeling the shape of the scene: A holistic representation of the spatial envelope
- Oliva, A., Torralba, A.: Modeling the shape of the scene: a holistic representation of the spatial envelope. IJCV (2001)
- (2001) IJCV
- Oliva, A.¹ Torralba, A.²

24
- 77955426203
- Evaluating color descriptors for object and scene recognition
- van de Sande, K.E.A., Gevers, T., Snoek, C.G.M.: Evaluating color descriptors for object and scene recognition. PAMI 32(9), 1582-1596 (2010)
- (2010) PAMI , vol.32 , Issue.9 , pp. 1582-1596
- Van De Sande, K.E.A.¹ Gevers, T.² Snoek, C.G.M.³

25
- 33645146449
- Histograms of oriented gradients for human detection
- Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: CVPR (2005)
- (2005) CVPR
- Dalal, N.¹ Triggs, B.²

26
- 77956004473
- Aggregating local descriptors into a compact image representation
- Jégou, H., Douze, M., Schmid, C., Perez, P.: Aggregating local descriptors into a compact image representation. In: CVPR (2010)
- (2010) CVPR
- Jégou, H.¹ Douze, M.² Schmid, C.³ Perez, P.⁴

27
- 84876231242
- ImageNet classification with deep convolutional neural networks
- Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: NIPS (2012)
- (2012) NIPS
- Krizhevsky, A.¹ Sutskever, I.² Hinton, G.E.³

28
- 84906504048
- DeCAF: A deep convolutional activation feature for generic visual recognition
- abs/1310.1531
- Donahue, J., Jia, Y., Vinyals, O., Hoffman, J., Zhang, N., Tzeng, E., Darrell, T.: DeCAF: A deep convolutional activation feature for generic visual recognition. CoRR abs/1310.1531 (2013)
- (2013) CoRR
- Donahue, J.¹ Jia, Y.² Vinyals, O.³ Hoffman, J.⁴ Zhang, N.⁵ Tzeng, E.⁶ Darrell, T.⁷

29
- 85198028989
- ImageNet: A large-scale hierarchical image database
- Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: A large-scale hierarchical image database. In: CVPR (2009)
- (2009) CVPR
- Deng, J.¹ Dong, W.² Socher, R.³ Li, L.J.⁴ Li, K.⁵ Fei-Fei, L.⁶

30
- 62949095898
- Nltk: The natural language toolkit
- Loper, E., Bird, S.: Nltk: The natural language toolkit. In: Proceedings of the ACL 2002 Workshop on Effective Tools and Methodologies for Teaching Natural Language Processing and Computational Linguistics, vol. 1 (2002)
- (2002) Proceedings of the ACL 2002 Workshop on Effective Tools and Methodologies for Teaching Natural Language Processing and Computational Linguistics , vol.1
- Loper, E.¹ Bird, S.²

31
- 84867117593
- Wsabie: Scaling up to large vocabulary image annotation
- Weston, J., Bengio, S., Usunier, N.: Wsabie: Scaling up to large vocabulary image annotation. In: IJCAI (2011)
- (2011) IJCAI
- Weston, J.¹ Bengio, S.² Usunier, N.³

32
- 80052250414
- Adaptive subgradient methods for online learning and stochastic optimization
- Duchi, J., Hazan, E., Singer, Y.: Adaptive subgradient methods for online learning and stochastic optimization. JMLR (2011)
- (2011) JMLR
- Duchi, J.¹ Hazan, E.² Singer, Y.³

33
- 84893382981
- ADADELTA: An adaptive learning rate method
- Zeiler, M.D.: ADADELTA: An adaptive learning rate method. arXiv preprint arXiv:1212.5701 (2012)
- (2012) arXiv preprint arXiv:1212.5701
- Zeiler, M.D.¹

34
- 84898938559
- Zero-shot learning through cross-modal transfer
- Socher, R., Ganjoo, M., Sridhar, H., Bastani, O., Manning, C.D., Ng, A.Y.: Zero-shot learning through cross-modal transfer. In: NIPS (2013)
- (2013) NIPS
- Socher, R.¹ Ganjoo, M.² Sridhar, H.³ Bastani, O.⁴ Manning, C.D.⁵ Ng, A.Y.⁶

35
- 0000107975
- Relations between two sets of variates
- Hotelling, H.: Relations between two sets of variates. Biometrika 28, 321-377 (1936)
- (1936) Biometrika , vol.28 , pp. 321-377
- Hotelling, H.¹

36
- 84866699225
- Leveraging category-level labels for instance-level image retrieval
- Gordo, A., Rodriguez-Serrano, J.A., Perronnin, F., Valveny, E.: Leveraging category-level labels for instance-level image retrieval. In: CVPR (2012)
- (2012) CVPR
- Gordo, A.¹ Rodriguez-Serrano, J.A.² Perronnin, F.³ Valveny, E.⁴

37
- 84863396387
- Domain adaptation for object recognition: An unsupervised approach
- Gopalan, R., Li, R., Chellappa, R.: Domain adaptation for object recognition: An unsupervised approach. In: ICCV (2011)
- (2011) ICCV
- Gopalan, R.¹ Li, R.² Chellappa, R.³

38
- 84906513179
- From sBoW to dCoT: Marginalized encoders for text representation
- Xu, Z., Chen, M., Weinberger, K.Q., Sha, F.: From sBoW to dCoT: Marginalized encoders for text representation. In: CIKM (2011)
- (2011) CIKM
- Xu, Z.¹ Chen, M.² Weinberger, K.Q.³ Sha, F.⁴

39
- 77953218689
- Random features for large-scale kernel machines
- Rahimi, A., Recht, B.: Random features for large-scale kernel machines. In: NIPS (2007)
- (2007) NIPS
- Rahimi, A.¹ Recht, B.²

40
- 56449089103
- Extracting and composing robust features with denoising autoencoders
- Vincent, P., Larochelle, H., Bengio, Y., Manzagol, P.A.: Extracting and composing robust features with denoising autoencoders. In: ICML, pp. 1096-1103 (2008)
- (2008) ICML , pp. 1096-1103
- Vincent, P.¹ Larochelle, H.² Bengio, Y.³ Manzagol, P.A.⁴

41
- 69349090197
- Learning deep architectures for AI
- Bengio, Y.: Learning deep architectures for AI. Foundations and Trends in Machine Learning 2(1), 1-127 (2009)
- (2009) Foundations and Trends in Machine Learning , vol.2 , Issue.1 , pp. 1-127
- Bengio, Y.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.