SCOPUS Information Retrieval Platform

Proceedings of the IEEE International Conference on Computer Vision

Volume , Issue , 2013, Pages 2712-2719

Youtube2text: Recognizing and describing arbitrary activities using semantic hierarchies and zero-shot recognition

(7) Guadarrama, Sergio (a); Krishnamoorthy, Niveda (b); Malkarnenkar, Girish (b); Venugopalan, Subhashini (b); Mooney, Raymond (b); Darrell, Trevor (a); Saenko, Kate (c)

a UNIVERSITY OF CALIFORNIA (United States)

b UNIVERSITY OF TEXAS AT AUSTIN (United States)

c UNIVERSITY OF MASSACHUSETTS LOWELL (United States)

Author keywords

Describing Activities in videos; Large scale activity recognition; Recognizing activities in videos; semantic hierarchies; zero shot learning

Indexed keywords

COMPUTATIONAL LINGUISTICS; OBJECT RECOGNITION; SEMANTICS; VIDEO CAMERAS;

ACTIVITY RECOGNITION; DESCRIBING ACTIVITIES IN VIDEOS; RECOGNIZING ACTIVITIES IN VIDEOS; SEMANTIC HIERARCHIES; ZERO-SHOT LEARNING;

SEMANTIC WEB;

EID: 84898773262 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ICCV.2013.337 Document Type: Conference Paper

Times cited : (537)

References (31)

1
- 77951155435
- Video2text: Learning to annotate video content
- ICDMW '09 IEEE International Conference on
- H. Aradhye, G. Toderici, and J. Yagnik. Video2text: Learning to annotate video content. In Data Mining Workshops, 2009. ICDMW '09. IEEE International Conference on, pages 144-151, 2009.
- (2009) In Data Mining Workshops 2009 , pp. 144-151
- Aradhye, H.¹ Toderici, G.² Yagnik, J.³

2
- 84885996388
- Video in sentences out
- A. Barbu, A. Bridge, Z. Burchill, D. Coroian, S. Dickinson, S. Fidler, A. Michaux, S. Mussman, S. Narayanaswamy, D. Salvi, et al. Video in sentences out. In Proceedings of the 28th Conference on Uncertainty in Artificial Intelligence (UAI), pages 102-112, 2012.
- (2012) In Proceedings of the 28th Conference on Uncertainty in Artificial Intelligence (UAI) , pp. 102-112
- Barbu, A.¹ Bridge, A.² Burchill, Z.³ Coroian, D.⁴ Dickinson, S.⁵ Fidler, S.⁶ Michaux, A.⁷ Mussman, S.⁸ Narayanaswamy, S.⁹ Salvi, D.¹⁰

3
- 79955702502
- Libsvm: A library for support vector machines
- C. Chang and C. Lin. Libsvm: a library for support vector machines. ACM Trans. on Intelligent Systems and Technology (TIST), 2(3):27, 2011.
- (2011) ACM Trans. on Intelligent Systems and Technology (TIST) , vol.2 , Issue.3 , pp. 27
- Chang, C.¹ Lin, C.²

4
- 84859089502
- Collecting highly parallel data for paraphrase evaluation
- Portland, Oregon
- D. L. Chen and W. B. Dolan. Collecting highly parallel data for paraphrase evaluation. In Proceedings of ACL, pages 190-200, Portland, Oregon, 2011.
- (2011) In Proceedings of ACL , pp. 190-200
- Chen, D.L.¹ Dolan, W.B.²

5
- 84887345951
- A thousand frames in just a few words: Lingual description of videos through latent topics and sparse object stitching
- IEEE Computer Society
- P. Das, C. Xu, R. F. Doell, and J. J. Corso. A thousand frames in just a few words: Lingual description of videos through latent topics and sparse object stitching. In Computer Vision and Pattern Recognition (CVPR), 2013., pages 2634-2641. IEEE Computer Society, 2013.
- (2013) In Computer Vision and Pattern Recognition (CVPR 2013) , pp. 2634-2641
- Das, P.¹ Xu, C.² Doell, R.F.³ Corso, J.J.⁴

6
- 84866674680
- Hedging your bets: Optimizing accuracy-specificity trade-offs in large scale visual recognition
- IEEE
- J. Deng, J. Krause, A. C. Berg, and L. Fei-Fei. Hedging your bets: Optimizing accuracy-specificity trade-offs in large scale visual recognition. In Computer Vision and Pattern Recognition (CVPR), 2012., pages 3450-3457. IEEE, 2012.
- (2012) In Computer Vision and Pattern Recognition (CVPR 2012) , pp. 3450-3457
- Deng, J.¹ Krause, J.² Berg, A.C.³ Fei-Fei, L.⁴

7
- 84946590544
- Construction and analysis of a large scale image ontology
- J. Deng, K. Li, M. Do, H. Su, and L. Fei-Fei. Construction and Analysis of a Large Scale Image Ontology. In Vision Sciences Society, 2009.
- (2009) In Vision Sciences Society
- Deng, J.¹ Li, K.² Do, M.³ Su, H.⁴ Fei-Fei, L.⁵

8
- 84864139941
- Beyond audio and video retrieval: Towards multimedia summarization
- ACM
- D. Ding, F. Metze, S. Rawat, P. Schulam, S. Burger, E. Younessian, L. Bao, M. Christel, and A. Hauptmann. Beyond audio and video retrieval: towards multimedia summarization. In Proceedings of the 2nd ACM International Conference on Multimedia Retrieval, page 2. ACM, 2012.
- (2012) In Proceedings of the 2nd ACM International Conference on Multimedia Retrieval , pp. 2
- Ding, D.¹ Metze, F.² Rawat, S.³ Schulam, P.⁴ Burger, S.⁵ Younessian, E.⁶ Bao, L.⁷ Christel, M.⁸ Hauptmann, A.⁹

9
- 77951298115
- The pascal visual object classes (voc) challenge
- June
- M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman. The pascal visual object classes (voc) challenge. International Journal of Computer Vision, 88(2):303-338, June 2010.
- (2010) International Journal of Computer Vision , vol.88 , Issue.2 , pp. 303-338
- Everingham, M.¹ Van Gool, L.² Williams, C.K.I.³ Winn, J.⁴ Zisserman, A.⁵

10
- 78149311145
- Every picture tells a story: Generating sentences from images
- A. Farhadi, M. Hejrati, M. Sadeghi, P. Young, C. Rashtchian, J. Hockenmaier, and D. Forsyth. Every picture tells a story: Generating sentences from images. Computer Vision-ECCV 2010, pages 15-29, 2010.
- (2010) Computer Vision-ECCV 2010 , pp. 15-29
- Farhadi, A.¹ Hejrati, M.² Sadeghi, M.³ Young, P.⁴ Rashtchian, C.⁵ Hockenmaier, J.⁶ Forsyth, D.⁷

11
- 77955422240
- Object detection with discriminatively trained part-based models
- P. F. Felzenszwalb, R. B. Girshick, D. McAllester, and D. Ramanan. Object detection with discriminatively trained part-based models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(9):1627-1645, 2010.
- (2010) IEEE Transactions on Pattern Analysis and Machine Intelligence , vol.32 , Issue.9 , pp. 1627-1645
- Felzenszwalb, P.F.¹ Girshick, R.B.² McAllester, D.³ Ramanan, D.⁴

12
- 84898785322
- Describing video contents in natural language
- M. Khan and Y. Gotoh. Describing video contents in natural language. Proceedings of the EACL Workshop on Innovative Hybrid Approaches to the Processing of Textual Data, pages 27-35, 2012.
- (2012) Proceedings of the EACL Workshop on Innovative Hybrid Approaches to the Processing of Textual Data , pp. 27-35
- Khan, M.¹ Gotoh, Y.²

13
- 0036843382
- Natural language description of human activities from video images based on concept hierarchy of actions
- A. Kojima, T. Tamura, and K. Fukunaga. Natural language description of human activities from video images based on concept hierarchy of actions. International Journal of Computer Vision, 50(2):171-184, 2002.
- (2002) International Journal of Computer Vision , vol.50 , Issue.2 , pp. 171-184
- Kojima, A.¹ Tamura, T.² Fukunaga, K.³

14
- 84893398951
- Generating natural-language video descriptions using text-mined knowledge
- N. Krishnamoorthy, G. Malkarnenkar, R. J. Mooney, K. Saenko, and S. Guadarrama. Generating natural-language video descriptions using text-mined knowledge. In Proceedings of AAAI, 2013.
- (2013) In Proceedings of AAAI 2013
- Krishnamoorthy, N.¹ Malkarnenkar, G.² Mooney, R.J.³ Saenko, K.⁴ Guadarrama, S.⁵

15
- 80052901011
- Baby talk: Understanding and generating simple image descriptions
- IEEE
- G. Kulkarni, V. Premraj, S. Dhar, S. Li, Y. Choi, A. Berg, and T. Berg. Baby talk: Understanding and generating simple image descriptions. In Computer Vision and Pattern Recognition (CVPR), 2011., pages 1601-1608. IEEE, 2011.
- (2011) In Computer Vision and Pattern Recognition (CVPR 2011) , pp. 1601-1608
- Kulkarni, G.¹ Premraj, V.² Dhar, S.³ Li, S.⁴ Choi, Y.⁵ Berg, A.⁶ Berg, T.⁷

16
- 50649122769
- Retrieving actions in movies
- ICCV 2007 IEEE
- I. Laptev and P. Perez. Retrieving actions in movies. In International Conference on Computer Vision, 2007. ICCV 2007., pages 1-8. IEEE, 2007.
- (2007) In International Conference on Computer Vision 2007 , pp. 1-8
- Laptev, I.¹ Perez, P.²

17
- 51849094354
- Save: A framework for semantic annotation of visual events
- CVPRW'08 IEEE
- M. Lee, A. Hakeem, N. Haering, and S. Zhu. Save: A framework for semantic annotation of visual events. In Computer Vision and Pattern Recognition Workshops, 2008. CVPRW'08., pages 1-8. IEEE, 2008.
- (2008) In Computer Vision and Pattern Recognition Workshops 2008 , pp. 1-8
- Lee, M.¹ Hakeem, A.² Haering, N.³ Zhu, S.⁴

18
- 85162513516
- Object bank: A high-level image representation for scene classification and semantic feature sparsification
- L. Li, H. Su, E. Xing, and L. Fei-Fei. Object bank: A high-level image representation for scene classification and semantic feature sparsification. Advances in Neural Information Processing Systems, 24, 2010.
- (2010) Advances in Neural Information Processing Systems , vol.24
- Li, L.¹ Su, H.² Xing, E.³ Fei-Fei, L.⁴

19
- 84862279067
- Composing simple image descriptions using web-scale n-grams
- Association for Computational Linguistics
- S. Li, G. Kulkarni, T. Berg, A. Berg, and Y. Choi. Composing simple image descriptions using web-scale n-grams. In Proceedings of the Fifteenth Conference on Computational Natural Language Learning, pages 220-228. Association for Computational Linguistics, 2011.
- (2011) In Proceedings of the Fifteenth Conference on Computational Natural Language Learning , pp. 220-228
- Li, S.¹ Kulkarni, G.² Berg, T.³ Berg, A.⁴ Choi, Y.⁵

20
- 84959182849
- Improving video activity recognition using object recognition and text mining
- ECAI
- T. Motwani and R. Mooney. Improving video activity recognition using object recognition and text mining. In European Conference on Artificial Intelligence. ECAI, 2012.
- (2012) In European Conference on Artificial Intelligence
- Motwani, T.¹ Mooney, R.²

21
- 85081941118
- Wordnet::Similarity: Measuring the relatedness of concepts
- Association for Computational Linguistics
- T. Pedersen, S. Patwardhan, and J. Michelizzi. Wordnet::Similarity: measuring the relatedness of concepts. In Demonstration Papers at HLT-NAACL 2004, pages 38-41. Association for Computational Linguistics, 2004.
- (2004) In Demonstration Papers at HLT-NAACL 2004 , pp. 38-41
- Pedersen, T.¹ Patwardhan, S.² Michelizzi, J.³

22
- 85123966307
- Distributional clustering of english words
- Association for Computational Linguistics
- F. Pereira, N. Tishby, and L. Lee. Distributional clustering of english words. In Proceedings of the 31st annual meeting on Association for Computational Linguistics, pages 183-190. Association for Computational Linguistics, 1993.
- (1993) In Proceedings of the 31st Annual Meeting on Association for Computational Linguistics , pp. 183-190
- Pereira, F.¹ Tishby, N.² Lee, L.³

23
- 0003243224
- Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods
- MIT Press
- J. C. Platt. Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. In Advances in Large Margin Classifiers, pages 61-74. MIT Press, 1999.
- (1999) In Advances in Large Margin Classifiers , pp. 61-74
- Platt, J.C.¹

24
- 84879550059
- Recognizing 50 human action categories of web videos
- K. Reddy and M. Shah. Recognizing 50 human action categories of web videos. Machine Vision and Applications, pages 1-11, 2012.
- (2012) Machine Vision and Applications , pp. 1-11
- Reddy, K.¹ Shah, M.²

25
- 51949083482
- Labelme: A database and web-based tool for image annotation
- B. Russell, A. Torralba, K. Murphy, and W. T. Freeman. Labelme: a database and web-based tool for image annotation. In International Journal of Computer Vision, 2007.
- (2007) In International Journal of Computer Vision
- Russell, B.¹ Torralba, A.² Murphy, K.³ Freeman, W.T.⁴

26
- 10044233701
- Recognizing human actions: A local svm approach
- ICPR 2004 IEEE
- C. Schuldt, I. Laptev, and B. Caputo. Recognizing human actions: A local svm approach. In Pattern Recognition, 2004. ICPR 2004., volume 3, pages 32-36. IEEE, 2004.
- (2004) In Pattern Recognition 2004 , vol.3 , pp. 32-36
- Schuldt, C.¹ Laptev, I.² Caputo, B.³

27
- 80052877143
- Action recognition by dense trajectories
- IEEE
- H. Wang, A. Klaser, C. Schmid, and C.-L. Liu. Action recognition by dense trajectories. In Computer Vision and Pattern Recognition (CVPR), 2011., pages 3169-3176. IEEE, 2011.
- (2011) In Computer Vision and Pattern Recognition (CVPR 2011) , pp. 3169-3176
- Wang, H.¹ Klaser, A.² Schmid, C.³ Liu, C.-L.⁴

28
- 85146676791
- Verbs semantics and lexical selection
- Association for Computational Linguistics
- Z. Wu and M. Palmer. Verbs semantics and lexical selection. In Proceedings of the 32nd annual meeting on Association for Computational Linguistics, pages 133-138. Association for Computational Linguistics, 1994.
- (1994) In Proceedings of the 32nd Annual Meeting on Association for Computational Linguistics , pp. 133-138
- Wu, Z.¹ Palmer, M.²

29
- 80053258778
- Corpus-guided sentence generation of natural images
- EMNLP '11
- Y. Yang, C. L. Teo, H. Daumé III, and Y. Aloimonos. Corpus-guided sentence generation of natural images. In Proc. of the Conference on Empirical Methods in Natural Language Processing, EMNLP '11, pages 444-454, 2011.
- (2011) In Proc. of the Conference on Empirical Methods in Natural Language Processing , pp. 444-454
- Yang, Y.¹ Teo, C.L.² Daumé III, H.³ Aloimonos, Y.⁴

30
- 77954862144
- I2t: Image parsing to text description
- B. Yao, X. Yang, L. Lin, M. Lee, and S. Zhu. I2t: Image parsing to text description. Proceedings of the IEEE, 98(8):1485-1508, 2010.
- (2010) Proceedings of the IEEE , vol.98 , Issue.8 , pp. 1485-1508
- Yao, B.¹ Yang, X.² Lin, L.³ Lee, M.⁴ Zhu, S.⁵

31
- 33846580425
- Local features and kernels for classification of texture and object categories: A comprehensive study
- J. Zhang, M. Marszałek, S. Lazebnik, and C. Schmid. Local features and kernels for classification of texture and object categories: A comprehensive study. International Journal of Computer Vision, 73(2):213-238, 2007.
- (2007) International Journal of Computer Vision , vol.73 , Issue.2 , pp. 213-238
- Zhang, J.¹ Marszałek, M.² Lazebnik, S.³ Schmid, C.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.