Volume , Issue , 2013, Pages 433-440

Translating video content to natural language descriptions

Author keywords

[No Author keywords available]

Indexed keywords

SEMANTICS; VISION

EID: 84898775239     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ICCV.2013.61     Document Type: Conference Paper
Times cited: 382

References (27)
  • 1
    • 80052886947
    • Generating image descriptions using dependency relational patterns
    • A. Aker and R. J. Gaizauskas. Generating image descriptions using dependency relational patterns. In ACL, 2010.
    • (2010) ACL
    • Aker, A.1    Gaizauskas, R.J.2
  • 3
    • 84887345951
    • Thousand frames in just a few words: Lingual description of videos through latent topics and sparse object stitching
    • J. Corso, C. Xu, P. Das, R. F. Doell, and P. Rosebrough. Thousand frames in just a few words: Lingual description of videos through latent topics and sparse object stitching. In CVPR, 2013.
    • (2013) CVPR
    • Corso, J.1    Xu, C.2    Das, P.3    Doell, R.F.4    Rosebrough, P.5
  • 4
    • 0038401728
    • Object recognition as machine translation: Learning a lexicon for a fixed image vocabulary
    • P. Duygulu, K. Barnard, N. de Freitas, and D. A. Forsyth. Object recognition as machine translation: Learning a lexicon for a fixed image vocabulary. In ECCV, 2002.
    • (2002) ECCV
    • Duygulu, P.1    Barnard, K.2    De Freitas, N.3    Forsyth, D.A.4
  • 6
    • 84867209146
    • IRSTLM: An open source toolkit for handling large scale language models
    • M. Federico, N. Bertoldi, and M. Cettolo. IRSTLM: an open source toolkit for handling large scale language models. In Interspeech. ISCA, 2008.
    • (2008) Interspeech. ISCA
    • Federico, M.1    Bertoldi, N.2    Cettolo, M.3
  • 7
    • 84898793348
    • How many words is a picture worth? Automatic caption generation for news images
    • Y. Feng and M. Lapata. How many words is a picture worth? Automatic caption generation for news images. ACL'10.
    • (2010) ACL
    • Feng, Y.1    Lapata, M.2
  • 8
    • 84898773262
    • Youtube2text: Recognizing and describing arbitrary activities using semantic hierarchies and zero-shot recognition
    • S. Guadarrama, N. Krishnamoorthy, G. Malkarnenkar, R. Mooney, T. Darrell, and K. Saenko. Youtube2text: Recognizing and describing arbitrary activities using semantic hierarchies and zero-shot recognition. In ICCV, 2013.
    • (2013) ICCV
    • Guadarrama, S.1    Krishnamoorthy, N.2    Malkarnenkar, G.3    Mooney, R.4    Darrell, T.5    Saenko, K.6
  • 9
    • 70450202741
    • Understanding videos, constructing plots: Learning a visually grounded storyline model from annotated videos
    • A. Gupta, P. Srinivasan, J. B. Shi, and L. Davis. Understanding videos, constructing plots: Learning a visually grounded storyline model from annotated videos. In CVPR, 2009.
    • (2009) CVPR
    • Gupta, A.1    Srinivasan, P.2    Shi, J.B.3    Davis, L.4
  • 10
    • 84877964523
    • Automated textual descriptions for a wide range of video events with 48 human actions
    • P. Hanckmann, K. Schutte, and G. J. Burghouts. Automated textual descriptions for a wide range of video events with 48 human actions. In ECCV Workshops, 2012.
    • (2012) ECCV Workshops
    • Hanckmann, P.1    Schutte, K.2    Burghouts, G.J.3
  • 14
    • 0036843382
    • Natural language description of human activities from video images based on concept hierarchy of actions
    • A. Kojima, T. Tamura, and K. Fukunaga. Natural language description of human activities from video images based on concept hierarchy of actions. IJCV, 2002.
    • (2002) IJCV
    • Kojima, A.1    Tamura, T.2    Fukunaga, K.3
  • 17
    • 49449119085
    • Statistical machine translation
    • A. Lopez. Statistical machine translation. ACM, 2008.
    • (2008) ACM
    • Lopez, A.1
  • 19
    • 0042879653
    • A systematic comparison of various statistical alignment models
    • F. J. Och and H. Ney. A systematic comparison of various statistical alignment models. CL, 2003.
    • (2003) CL
    • Och, F.J.1    Ney, H.2
  • 20
    • 85162522202
    • Im2text: Describing images using 1 million captioned photographs
    • V. Ordonez, G. Kulkarni, and T. L. Berg. Im2text: Describing images using 1 million captioned photographs. In NIPS, 2011.
    • (2011) NIPS
    • Ordonez, V.1    Kulkarni, G.2    Berg, T.L.3
  • 21
    • 85133336275
    • BLEU: A method for automatic evaluation of machine translation
    • K. Papineni, S. Roukos, T. Ward, and W.-J. Zhu. BLEU: a method for automatic evaluation of machine translation. In ACL, 2002.
    • (2002) ACL
    • Papineni, K.1    Roukos, S.2    Ward, T.3    Zhu, W.-J.4
  • 22
    • 84898775557
    • Video event understanding using natural language descriptions
    • V. Ramanathan, P. Liang, and L. Fei-Fei. Video event understanding using natural language descriptions. In ICCV, 2013.
    • (2013) ICCV
    • Ramanathan, V.1    Liang, P.2    Fei-Fei, L.3
  • 26
    • 84455192418
    • Towards textually describing complex video contents with audio-visual concept classifiers
    • C. C. Tan, Y.-G. Jiang, and C.-W. Ngo. Towards textually describing complex video contents with audio-visual concept classifiers. In ACM Multimedia, 2011.
    • (2011) ACM Multimedia
    • Tan, C.C.1    Jiang, Y.-G.2    Ngo, C.-W.3
  • 27
    • 84876945537
    • Dense trajectories and motion boundary descriptors for action recognition
    • H. Wang, A. Kläser, C. Schmid, and C. Liu. Dense trajectories and motion boundary descriptors for action recognition. IJCV, 2013.
    • (2013) IJCV
    • Wang, H.1    Kläser, A.2    Schmid, C.3    Liu, C.4


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.