



Volume 2015 International Conference on Computer Vision, ICCV 2015, 2015, Pages 19-27

Aligning books and movies: Towards story-like visual explanations by watching movies and reading books

Author keywords

[No Author keywords available]

Indexed keywords

COMPUTER VISION; SEMANTICS;

EID: 84973911532     PISSN: 15505499     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ICCV.2015.11     Document Type: Conference Paper
Times cited : (2647)

References (42)
  • 1
    • 85083953689
    • Neural machine translation by jointly learning to align and translate
    • D. Bahdanau, K. Cho, and Y. Bengio. Neural machine translation by jointly learning to align and translate. ICLR, 2015. 4
    • (2015) ICLR
    • Bahdanau, D.1    Cho, K.2    Bengio, Y.3
  • 3
    • 84961291190
    • Learning phrase representations using RNN encoder-decoder for statistical machine translation
    • K. Cho, B. van Merrienboer, C. Gulcehre, F. Bougares, H. Schwenk, and Y. Bengio. Learning phrase representations using RNN encoder-decoder for statistical machine translation. EMNLP, 2014. 4
    • (2014) EMNLP
    • Cho, K.1    Van Merrienboer, B.2    Gulcehre, C.3    Bougares, F.4    Schwenk, H.5    Bengio, Y.6
  • 5
    • 70450145539
    • Movie/script: Alignment and parsing of video and text transcription
    • T. Cour, C. Jordan, E. Miltsakaki, and B. Taskar. Movie/script: Alignment and parsing of video and text transcription. In ECCV, 2008. 2
    • (2008) ECCV
    • Cour, T.1    Jordan, C.2    Miltsakaki, E.3    Taskar, B.4
  • 6
    • 84898027861
    • "Hello! My name is... Buffy" - Automatic Naming of Characters in TV Video
    • M. Everingham, J. Sivic, and A. Zisserman. "Hello! My name is... Buffy" - Automatic Naming of Characters in TV Video. BMVC, pages 899-908, 2006. 2
    • (2006) BMVC , pp. 899-908
    • Everingham, M.1    Sivic, J.2    Zisserman, A.3
  • 8
    • 84887365305
    • A sentence is worth a thousand pixels
    • S. Fidler, A. Sharma, and R. Urtasun. A sentence is worth a thousand pixels. In CVPR, 2013. 2
    • (2013) CVPR
    • Fidler, S.1    Sharma, A.2    Urtasun, R.3
  • 9
    • 57149125139
    • Beyond nouns: Exploiting prepositions and comparative adjectives for learning visual classifiers
    • A. Gupta and L. Davis. Beyond nouns: Exploiting prepositions and comparative adjectives for learning visual classifiers. In ECCV, 2008. 1
    • (2008) ECCV
    • Gupta, A.1    Davis, L.2
  • 11
    • 84883394520
    • Framing image description as a ranking task: Data, models and evaluation metrics
    • M. Hodosh, P. Young, and J. Hockenmaier. Framing image description as a ranking task: Data, models and evaluation metrics. JAIR, 47: 853-899, 2013. 2
    • (2013) JAIR , vol.47 , pp. 853-899
    • Hodosh, M.1    Young, P.2    Hockenmaier, J.3
  • 12
    • 84926283798
    • Recurrent continuous translation models
    • N. Kalchbrenner and P. Blunsom. Recurrent continuous translation models. In EMNLP, pages 1700-1709, 2013. 4
    • (2013) EMNLP , pp. 1700-1709
    • Kalchbrenner, N.1    Blunsom, P.2
  • 13
    • 84952902559
    • Deep visual-semantic alignments for generating image descriptions
    • A. Karpathy and L. Fei-Fei. Deep visual-semantic alignments for generating image descriptions. In CVPR, 2015. 1, 2
    • (2015) CVPR
    • Karpathy, A.1    Fei-Fei, L.2
  • 17
    • 84911370987
    • What are you talking about? Text-to-image coreference
    • C. Kong, D. Lin, M. Bansal, R. Urtasun, and S. Fidler. What are you talking about? Text-to-image coreference. In CVPR, 2014. 1, 2
    • (2014) CVPR
    • Kong, C.1    Lin, D.2    Bansal, M.3    Urtasun, R.4    Fidler, S.5
  • 19
    • 84911442106
    • Visual semantic search: Retrieving videos via complex textual queries
    • D. Lin, S. Fidler, C. Kong, and R. Urtasun. Visual Semantic Search: Retrieving Videos via Complex Textual Queries. CVPR, pages 2657-2664, 2014. 1, 2
    • (2014) CVPR , pp. 2657-2664
    • Lin, D.1    Fidler, S.2    Kong, C.3    Urtasun, R.4
  • 21
    • 84959227898
    • Don't just listen, use your imagination: Leveraging visual common sense for non-visual tasks
    • X. Lin and D. Parikh. Don't just listen, use your imagination: Leveraging visual common sense for non-visual tasks. In CVPR, 2015. 1
    • (2015) CVPR
    • Lin, X.1    Parikh, D.2
  • 22
    • 84937822746
    • A multi-world approach to question answering about real-world scenes based on uncertain input
    • M. Malinowski and M. Fritz. A multi-world approach to question answering about real-world scenes based on uncertain input. In NIPS, 2014. 1
    • (2014) NIPS
    • Malinowski, M.1    Fritz, M.2
  • 26
    • 85133336275
    • BLEU: A method for automatic evaluation of machine translation
    • K. Papineni, S. Roukos, T. Ward, and W. J. Zhu. BLEU: A method for automatic evaluation of machine translation. In ACL, pages 311-318, 2002. 6
    • (2002) ACL , pp. 311-318
    • Papineni, K.1    Roukos, S.2    Ward, T.3    Zhu, W.J.4
  • 28
    • 84906510695
    • Linking people in videos with "their" names using coreference resolution
    • V. Ramanathan, A. Joulin, P. Liang, and L. Fei-Fei. Linking People in Videos with "Their" Names Using Coreference Resolution. In ECCV, pages 95-110. 2014. 2
    • (2014) ECCV , pp. 95-110
    • Ramanathan, V.1    Joulin, A.2    Liang, P.3    Fei-Fei, L.4
  • 29
    • 84898775557
    • Video event understanding using natural language descriptions
    • V. Ramanathan, P. Liang, and L. Fei-Fei. Video event understanding using natural language descriptions. In ICCV, 2013. 1
    • (2013) ICCV
    • Ramanathan, V.1    Liang, P.2    Fei-Fei, L.3
  • 31
    • 84898875082
    • Subtitle-free Movie to Script Alignment
    • P. Sankar, C. V. Jawahar, and A. Zisserman. Subtitle-free Movie to Script Alignment. In BMVC, 2009. 2
    • (2009) BMVC
    • Sankar, P.1    Jawahar, C.V.2    Zisserman, A.3
  • 32
    • 84867113207
    • Efficient structured prediction with latent variables for general graphical models
    • A. Schwing, T. Hazan, M. Pollefeys, and R. Urtasun. Efficient Structured Prediction with Latent Variables for General Graphical Models. In ICML, 2012. 7
    • (2012) ICML
    • Schwing, A.1    Hazan, T.2    Pollefeys, M.3    Urtasun, R.4
  • 33
    • 70450202706
    • "Who are you"-Learning person specific classifiers from video
    • J. Sivic, M. Everingham, and A. Zisserman. "Who are you"-Learning person specific classifiers from video. CVPR, pages 1145-1152, 2009. 2
    • (2009) CVPR , pp. 1145-1152
    • Sivic, J.1    Everingham, M.2    Zisserman, A.3
  • 34
    • 84964474107
    • Grounded compositional semantics for finding and describing images with sentences
    • R. Socher, A. Karpathy, Q. V. Le, C. D. Manning, and A. Y. Ng. Grounded compositional semantics for finding and describing images with sentences. ACL, 2: 207-218, 2014. 2
    • (2014) ACL , vol.2 , pp. 207-218
    • Socher, R.1    Karpathy, A.2    Le, Q.V.3    Manning, C.D.4    Ng, A.Y.5
  • 35
    • 84928547704
    • Sequence to sequence learning with neural networks
    • I. Sutskever, O. Vinyals, and Q. V. Le. Sequence to sequence learning with neural networks. In NIPS, 2014. 4
    • (2014) NIPS
    • Sutskever, I.1    Vinyals, O.2    Le, Q.V.3
  • 37
    • 84959255361
    • Book2Movie: Aligning Video scenes with Book chapters
    • M. Tapaswi, M. Bauml, and R. Stiefelhagen. Book2Movie: Aligning Video scenes with Book chapters. In CVPR, 2015. 2
    • (2015) CVPR
    • Tapaswi, M.1    Bauml, M.2    Stiefelhagen, R.3
  • 38
    • 84977834021
    • Aligning plot synopses to videos for story-based retrieval
    • M. Tapaswi, M. Bauml, and R. Stiefelhagen. Aligning Plot Synopses to Videos for Story-based Retrieval. IJMIR, 4: 3-16, 2015. 1, 2, 6
    • (2015) IJMIR , vol.4 , pp. 3-16
    • Tapaswi, M.1    Bauml, M.2    Stiefelhagen, R.3
  • 42
    • 85015194053
    • Learning deep features for scene recognition using places database
    • B. Zhou, A. Lapedriza, J. Xiao, A. Torralba, and A. Oliva. Learning Deep Features for Scene Recognition using Places Database. In NIPS, 2014. 5, 8
    • (2014) NIPS
    • Zhou, B.1    Lapedriza, A.2    Xiao, J.3    Torralba, A.4    Oliva, A.5


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.