SCOPUS Information Search Platform

Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition

Volume , Issue , 2014, Pages 2657-2664

Visual semantic search: Retrieving videos via complex textual queries

(4) Lin, Dahua a; Fidler, Sanja a,b; Kong, Chen c; Urtasun, Raquel a,b

a TOYOTA TECHNOLOGICAL INSTITUTE AT CHICAGO (United States)

b UNIVERSITY OF TORONTO (Canada)

c TSINGHUA UNIVERSITY (China)

Author keywords

images and videos; scene understanding; Video retrieval

Indexed keywords

GRAPH ALGORITHMS; NATURAL LANGUAGE PROCESSING SYSTEMS; PATTERN RECOGNITION; SEMANTICS;

AUTONOMOUS DRIVING; BIPARTITE MATCHING ALGORITHM; IMAGES AND VIDEOS; NATURAL LANGUAGE QUERIES; OBJECT APPEARANCE; SCENE UNDERSTANDING; STRUCTURE PREDICTION; VIDEO RETRIEVAL;

SEMANTIC WEB;

EID: 84911442106 PISSN: 10636919 EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/CVPR.2014.340 Document Type: Conference Paper

Times cited : (140)

References (28)

1
- 51949084160
- Utilizing semantic word similarity measures for video retrieval
- Y. Aytar, M. Shah, and J. Luo. Utilizing semantic word similarity measures for video retrieval. In CVPR, 2008.
- (2008) CVPR
- Aytar, Y.¹ Shah, M.² Luo, J.³

2
- 84885996388
- Video-in-sentences out
- A. Barbu, A. Bridge, Z. Burchill, D. Coroian, S. Dickinson, S. Fidler, A. Michaux, S. Mussman, S. Narayanaswamy, D. Salvi, L. Schmidt, J. Shangguan, J. Siskind, J. Waggoner, S. Wang, J. Wei, Y. Yin, and Z. Zhang. Video-in-sentences out. In UAI, 2012.
- (2012) UAI
- Barbu, A.¹ Bridge, A.² Burchill, Z.³ Coroian, D.⁴ Dickinson, S.⁵ Fidler, S.⁶ Michaux, A.⁷ Mussman, S.⁸ Narayanaswamy, S.⁹ Salvi, D.¹⁰ Schmidt, L.¹¹ Shangguan, J.¹² Siskind, J.¹³ Waggoner, J.¹⁴ Wang, S.¹⁵ Wei, J.¹⁶ Yin, Y.¹⁷ Zhang, Z.¹⁸

3
- 0041876117
- Matching words and pictures
- K. Barnard, P. Duygulu, D. Forsyth, N. de Freitas, D. Blei, and M. Jordan. Matching words and pictures. In JMLR, 2003.
- (2003) JMLR
- Barnard, K.¹ Duygulu, P.² Forsyth, D.³ De Freitas, N.⁴ Blei, D.⁵ Jordan, M.⁶

4
- 0036538619
- Shape matching and object recognition using shape contexts
- S. Belongie, J. Malik, and J. Puzicha. Shape matching and object recognition using shape contexts. IEEE Transaction on PAMI, 24(4), 2002.
- (2002) IEEE Transaction on PAMI , vol.24 , Issue.4
- Belongie, S.¹ Malik, J.² Puzicha, J.³

5
- 84889607930
- Zero-shot video retrieval using content and concepts
- J. Dalton, J. Allan, and P. Mirajkar. Zero-shot video retrieval using content and concepts. In CIKM, 2013.
- (2013) CIKM
- Dalton, J.¹ Allan, J.² Mirajkar, P.³

6
- 80051961229
- Every picture tells a story: Generating sentences for images
- A. Farhadi, M. Hejrati, M. Sadeghi, P. Young, C. Rashtchian, J. Hockenmaier, and D. Forsyth. Every picture tells a story: Generating sentences for images. In ECCV, 2010.
- (2010) ECCV
- Farhadi, A.¹ Hejrati, M.² Sadeghi, M.³ Young, P.⁴ Rashtchian, C.⁵ Hockenmaier, J.⁶ Forsyth, D.⁷

7
- 77955422240
- Object detection with discriminatively trained part based models
- P. Felzenszwalb, R. Girshick, D. McAllester, and D. Ramanan. Object detection with discriminatively trained part based models. PAMI, 32(9), 2010.
- (2010) PAMI , vol.32 , Issue.9
- Felzenszwalb, P.¹ Girshick, R.² McAllester, D.³ Ramanan, D.⁴

8
- 84887365305
- A sentence is worth a thousand pixels
- S. Fidler, A. Sharma, and R. Urtasun. A sentence is worth a thousand pixels. In CVPR, 2013.
- (2013) CVPR
- Fidler, S.¹ Sharma, A.² Urtasun, R.³

9
- 84866704163
- Are we ready for autonomous driving? The KITTI vision benchmark suite
- A. Geiger, P. Lenz, and R. Urtasun. Are we ready for autonomous driving? The KITTI vision benchmark suite. In CVPR, 2012.
- (2012) CVPR
- Geiger, A.¹ Lenz, P.² Urtasun, R.³

10
- 84911393175
- Stereoscan: Dense 3D reconstruction in real-time
- A. Geiger, J. Ziegler, and C. Stiller. Stereoscan: Dense 3D reconstruction in real-time. In IVS (IV), 2011.
- (2011) IVS (IV)
- Geiger, A.¹ Ziegler, J.² Stiller, C.³

11
- 84883075039
- Joint visual-text modeling for automatic retrieval of multimedia documents
- G. Iyengar, P. Duygulu, S. Feng, P. Ircing, S. Khudanpur, D. Klakow, M. Krause, R. Manmatha, and H. Nock. Joint visual-text modeling for automatic retrieval of multimedia documents. In Proc. of ACM Multimedia, 2005.
- (2005) Proc. of ACM Multimedia
- Iyengar, G.¹ Duygulu, P.² Feng, S.³ Ircing, P.⁴ Khudanpur, S.⁵ Klakow, D.⁶ Krause, M.⁷ Manmatha, R.⁸ Nock, H.⁹

12
- 84911370987
- What are you talking about? Text-to-image coreference
- C. Kong, D. Lin, M. Bansal, R. Urtasun, and S. Fidler. What are you talking about? Text-to-image coreference. In CVPR, 2014.
- (2014) CVPR
- Kong, C.¹ Lin, D.² Bansal, M.³ Urtasun, R.⁴ Fidler, S.⁵

13
- 33745130042
- Content-based multimedia information retrieval
- M. S. Lew, N. Sebe, C. Djeraba, and R. Jain. Content-based multimedia information retrieval. ACM Trans. on Multimedia Computing, Comm., and Applications, 2006.
- (2006) ACM Trans. on Multimedia Computing, Comm., and Applications
- Lew, M.S.¹ Sebe, N.² Djeraba, C.³ Jain, R.⁴

14
- 70450219021
- Towards total scene understanding: Classification, annotation and segmentation in an automatic framework
- L. Li, R. Socher, and L. Fei-Fei. Towards total scene understanding: Classification, annotation and segmentation in an automatic framework. In CVPR, 2009.
- (2009) CVPR
- Li, L.¹ Socher, R.² Fei-Fei, L.³

15
- 84867118595
- A joint model of language and perception for grounded attribute learning
- C. Matuszek, N. FitzGerald, L. Zettlemoyer, L. Bo, and D. Fox. A joint model of language and perception for grounded attribute learning. In ICML, 2013.
- (2013) ICML
- Matuszek, C.¹ Fitzgerald, N.² Zettlemoyer, L.³ Bo, L.⁴ Fox, D.⁵

16
- 80052904076
- Globally-optimal greedy algorithms for tracking a variable number of objects
- H. Pirsiavash, D. Ramanan, and C. Fowlkes. Globally-optimal greedy algorithms for tracking a variable number of objects. In CVPR, 2011.
- (2011) CVPR
- Pirsiavash, H.¹ Ramanan, D.² Fowlkes, C.³

17
- 84898775239
- Translating video content to natural language descriptions
- M. Rohrbach, W. Qiu, I. Titov, S. Thater, M. Pinkal, and B. Schiele. Translating video content to natural language descriptions. In ICCV, 2013.
- (2013) ICCV
- Rohrbach, M.¹ Qiu, W.² Titov, I.³ Thater, S.⁴ Pinkal, M.⁵ Schiele, B.⁶

18
- 84881536861
- Indoor segmentation and support inference from RGBD images
- N. Silberman, D. Hoiem, P. Kohli, and R. Fergus. Indoor segmentation and support inference from RGBD images. In ECCV, 2012.
- (2012) ECCV
- Silberman, N.¹ Hoiem, D.² Kohli, P.³ Fergus, R.⁴

19
- 0345414182
- Video Google: A text retrieval approach to object matching in videos
- J. Sivic and A. Zisserman. Video Google: A text retrieval approach to object matching in videos. In ICCV, 2003.
- (2003) ICCV
- Sivic, J.¹ Zisserman, A.²

20
- 34547455218
- Adding semantics to detectors for video retrieval
- C. G. M. Snoek, B. Huurnink, L. Hollink, M. de Rijke, G. Schreiber, and M. Worring. Adding semantics to detectors for video retrieval. IEEE Transaction of Multimedia, 9(5):975-986, 2007.
- (2007) IEEE Transaction of Multimedia , vol.9 , Issue.5 , pp. 975-986
- Snoek, C.G.M.¹ Huurnink, B.² Hollink, L.³ De Rijke, M.⁴ Schreiber, G.⁵ Worring, M.⁶

21
- 68349121465
- Concept-based video retrieval
- C. G. M. Snoek and M. Worring. Concept-based video retrieval. Foundations and Trends in Information Retrieval, 2(4):215-232, 2008.
- (2008) Foundations and Trends in Information Retrieval , vol.2 , Issue.4 , pp. 215-232
- Snoek, C.G.M.¹ Worring, M.²

22
- 84893795422
- Parsing with compositional vector grammars
- R. Socher, J. Bauer, C. D. Manning, and A. Y. Ng. Parsing with compositional vector grammars. In ACL, 2013.
- (2013) ACL
- Socher, R.¹ Bauer, J.² Manning, C.D.³ Ng, A.Y.⁴

23
- 31844442382
- Learning structured prediction models: A large margin approach
- B. Taskar, V. Chatalbashev, D. Koller, and C. Guestrin. Learning structured prediction models: A large margin approach. In Proc. of ICML, 2005.
- (2005) Proc. of ICML
- Taskar, B.¹ Chatalbashev, V.² Koller, D.³ Guestrin, C.⁴

24
- 24944537843
- Large margin methods for structured and interdependent output variables
- I. Tsochantaridis, T. Joachims, T. Hofmann, and Y. Altun. Large margin methods for structured and interdependent output variables. JMLR, 6:1453-1484, 2005.
- (2005) JMLR , vol.6 , pp. 1453-1484
- Tsochantaridis, I.¹ Joachims, T.² Hofmann, T.³ Altun, Y.⁴

25
- 37848999897
- The importance of query-concept-mapping for automatic video retrieval
- D. Wang, X. Li, J. Li, and B. Zhang. The importance of query-concept-mapping for automatic video retrieval. In Proc. of ACM Multimedia, 2007.
- (2007) Proc. of ACM Multimedia
- Wang, D.¹ Li, X.² Li, J.³ Zhang, B.⁴

26
- 84887340824
- Robust monocular epipolar flow estimation
- K. Yamaguchi, D. McAllester, and R. Urtasun. Robust monocular epipolar flow estimation. In CVPR, 2013.
- (2013) CVPR
- Yamaguchi, K.¹ McAllester, D.² Urtasun, R.³

27
- 84911376686
- I2T: Image parsing to text description
- B. Yao, X. Yang, M. L. L. Lin, and S. Zhu. I2T: Image parsing to text description. In PAMI, 2010.
- (2010) PAMI
- Yao, B.¹ Yang, X.² Lin, M.L.L.³ Zhu, S.⁴

28
- 51949088494
- Global data association for multi-object tracking using network flows
- L. Zhang, Y. Li, and R. Nevatia. Global data association for multi-object tracking using network flows. In CVPR'08.
- CVPR'08
- Zhang, L.¹ Li, Y.² Nevatia, R.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.