-
1
-
-
80052886947
-
Generating image descriptions using dependency relational patterns
-
A. Aker and R. J. Gaizauskas. Generating image descriptions using dependency relational patterns. In ACL, 2010.
-
(2010)
ACL
-
-
Aker, A.1
Gaizauskas, R.J.2
-
2
-
-
84885996388
-
Video in sentences out
-
A. Barbu, A. Bridge, Z. Burchill, D. Coroian, S. Dickinson, S. Fidler, A. Michaux, S. Mussman, S. Narayanaswamy, D. Salvi, L. Schmidt, J. Shangguan, J. M. Siskind, J. Waggoner, S. Wang, J. Wei, Y. Yin, and Z. Zhang. Video in sentences out. In UAI, 2012.
-
(2012)
UAI
-
-
Barbu, A.1
Bridge, A.2
Burchill, Z.3
Coroian, D.4
Dickinson, S.5
Fidler, S.6
Michaux, A.7
Mussman, S.8
Narayanaswamy, S.9
Salvi, D.10
Schmidt, L.11
Shangguan, J.12
Siskind, J.M.13
Waggoner, J.14
Wang, S.15
Wei, J.16
Yin, Y.17
Zhang, Z.18
-
3
-
-
84887345951
-
Thousand frames in just a few words: Lingual description of videos through latent topics and sparse object stitching
-
J. Corso, C. Xu, P. Das, R. F. Doell, and P. Rosebrough. Thousand frames in just a few words: Lingual description of videos through latent topics and sparse object stitching. In CVPR, 2013.
-
(2013)
CVPR
-
-
Corso, J.1
Xu, C.2
Das, P.3
Doell, R.F.4
Rosebrough, P.5
-
4
-
-
0038401728
-
Object recognition as machine translation: Learning a lexicon for a fixed image vocabulary
-
P. Duygulu, K. Barnard, N. de Freitas, and D. A. Forsyth. Object recognition as machine translation: Learning a lexicon for a fixed image vocabulary. In ECCV, 2002.
-
(2002)
ECCV
-
-
Duygulu, P.1
Barnard, K.2
De Freitas, N.3
Forsyth, D.A.4
-
5
-
-
80052017343
-
Every picture tells a story: Generating sentences from images
-
A. Farhadi, M. Hejrati, M. Sadeghi, P. Young, C. Rashtchian, J. Hockenmaier, and D. Forsyth. Every picture tells a story: Generating sentences from images. In ECCV, 2010.
-
(2010)
ECCV
-
-
Farhadi, A.1
Hejrati, M.2
Sadeghi, M.3
Young, P.4
Rashtchian, C.5
Hockenmaier, J.6
Forsyth, D.7
-
6
-
-
84867209146
-
IRSTLM: An open source toolkit for handling large scale language models
-
M. Federico, N. Bertoldi, and M. Cettolo. IRSTLM: an open source toolkit for handling large scale language models. In Interspeech. ISCA, 2008.
-
(2008)
Interspeech. ISCA
-
-
Federico, M.1
Bertoldi, N.2
Cettolo, M.3
-
7
-
-
84898793348
-
How many words is a picture worth? Automatic caption generation for news images
-
Y. Feng and M. Lapata. How many words is a picture worth? Automatic caption generation for news images. In ACL, 2010.
-
(2010)
ACL
-
-
Feng, Y.1
Lapata, M.2
-
8
-
-
84898773262
-
Youtube2text: Recognizing and describing arbitrary activities using semantic hierarchies and zero-shot recognition
-
S. Guadarrama, N. Krishnamoorthy, G. Malkarnenkar, R. Mooney, T. Darrell, and K. Saenko. Youtube2text: Recognizing and describing arbitrary activities using semantic hierarchies and zero-shot recognition. In ICCV, 2013.
-
(2013)
ICCV
-
-
Guadarrama, S.1
Krishnamoorthy, N.2
Malkarnenkar, G.3
Mooney, R.4
Darrell, T.5
Saenko, K.6
-
9
-
-
70450202741
-
Understanding videos, constructing plots: Learning a visually grounded storyline model from annotated videos
-
A. Gupta, P. Srinivasan, J. B. Shi, and L. Davis. Understanding videos, constructing plots: Learning a visually grounded storyline model from annotated videos. In CVPR, 2009.
-
(2009)
CVPR
-
-
Gupta, A.1
Srinivasan, P.2
Shi, J.B.3
Davis, L.4
-
10
-
-
84877964523
-
Automated textual descriptions for a wide range of video events with 48 human actions
-
P. Hanckmann, K. Schutte, and G. J. Burghouts. Automated textual descriptions for a wide range of video events with 48 human actions. In ECCV Workshops, 2012.
-
(2012)
ECCV Workshops
-
-
Hanckmann, P.1
Schutte, K.2
Burghouts, G.J.3
-
13
-
-
85110867932
-
Moses: Open source toolkit for statistical machine translation
-
P. Koehn, H. Hoang, A. Birch, C. Callison-Burch, M. Federico, N. Bertoldi, B. Cowan, W. Shen, C. Moran, R. Zens, C. Dyer, O. Bojar, A. Constantin, and E. Herbst. Moses: Open source toolkit for statistical machine translation. In ACL demo, 2007.
-
(2007)
ACL Demo
-
-
Koehn, P.1
Hoang, H.2
Birch, A.3
Callison-Burch, C.4
Federico, M.5
Bertoldi, N.6
Cowan, B.7
Shen, W.8
Moran, C.9
Zens, R.10
Dyer, C.11
Bojar, O.12
Constantin, A.13
Herbst, E.14
-
14
-
-
0036843382
-
Natural language description of human activities from video images based on concept hierarchy of actions
-
A. Kojima, T. Tamura, and K. Fukunaga. Natural language description of human activities from video images based on concept hierarchy of actions. IJCV, 2002.
-
(2002)
IJCV
-
-
Kojima, A.1
Tamura, T.2
Fukunaga, K.3
-
15
-
-
80052901011
-
Baby talk: Understanding and generating simple image descriptions
-
G. Kulkarni, V. Premraj, S. Dhar, S. Li, Y. Choi, A. C. Berg, and T. L. Berg. Baby talk: Understanding and generating simple image descriptions. In CVPR, 2011.
-
(2011)
CVPR
-
-
Kulkarni, G.1
Premraj, V.2
Dhar, S.3
Li, S.4
Choi, Y.5
Berg, A.C.6
Berg, T.L.7
-
16
-
-
84878189119
-
Collective generation of natural image descriptions
-
P. Kuznetsova, V. Ordonez, A. C. Berg, T. L. Berg, and Y. Choi. Collective generation of natural image descriptions. In ACL, 2012.
-
(2012)
ACL
-
-
Kuznetsova, P.1
Ordonez, V.2
Berg, A.C.3
Berg, T.L.4
Choi, Y.5
-
17
-
-
49449119085
-
Statistical machine translation
-
A. Lopez. Statistical machine translation. ACM, 2008.
-
(2008)
ACM
-
-
Lopez, A.1
-
18
-
-
85044322278
-
Midge: Generating image descriptions from computer vision detections
-
M. Mitchell, J. Dodge, A. Goyal, K. Yamaguchi, K. Stratos, X. Han, A. Mensch, A. C. Berg, T. L. Berg, and H. Daumé III. Midge: Generating image descriptions from computer vision detections. In EACL, 2012.
-
(2012)
EACL
-
-
Mitchell, M.1
Dodge, J.2
Goyal, A.3
Yamaguchi, K.4
Stratos, K.5
Han, X.6
Mensch, A.7
Berg, A.C.8
Berg, T.L.9
Daumé III, H.10
-
19
-
-
0042879653
-
A systematic comparison of various statistical alignment models
-
F. J. Och and H. Ney. A systematic comparison of various statistical alignment models. CL, 2003.
-
(2003)
CL
-
-
Och, F.J.1
Ney, H.2
-
20
-
-
85162522202
-
Im2text: Describing images using 1 million captioned photographs
-
V. Ordonez, G. Kulkarni, and T. L. Berg. Im2text: Describing images using 1 million captioned photographs. In NIPS, 2011.
-
(2011)
NIPS
-
-
Ordonez, V.1
Kulkarni, G.2
Berg, T.L.3
-
21
-
-
85133336275
-
BLEU: A method for automatic evaluation of machine translation
-
K. Papineni, S. Roukos, T. Ward, and W.-J. Zhu. BLEU: A method for automatic evaluation of machine translation. In ACL, 2002.
-
(2002)
ACL
-
-
Papineni, K.1
Roukos, S.2
Ward, T.3
Zhu, W.-J.4
-
22
-
-
84898775557
-
Video event understanding using natural language descriptions
-
V. Ramanathan, P. Liang, and L. Fei-Fei. Video event understanding using natural language descriptions. In ICCV, 2013.
-
(2013)
ICCV
-
-
Ramanathan, V.1
Liang, P.2
Fei-Fei, L.3
-
23
-
-
84898785648
-
Grounding action descriptions in videos
-
M. Regneri, M. Rohrbach, D. Wetzel, S. Thater, B. Schiele, and M. Pinkal. Grounding action descriptions in videos. TACL, 2013.
-
(2013)
TACL
-
-
Regneri, M.1
Rohrbach, M.2
Wetzel, D.3
Thater, S.4
Schiele, B.5
Pinkal, M.6
-
24
-
-
84887351648
-
Script data for attribute-based recognition of composite activities
-
M. Rohrbach, M. Regneri, M. Andriluka, S. Amin, M. Pinkal, and B. Schiele. Script data for attribute-based recognition of composite activities. In ECCV, 2012.
-
(2012)
ECCV
-
-
Rohrbach, M.1
Regneri, M.2
Andriluka, M.3
Amin, S.4
Pinkal, M.5
Schiele, B.6
-
26
-
-
84455192418
-
Towards textually describing complex video contents with audio-visual concept classifiers
-
C. C. Tan, Y.-G. Jiang, and C.-W. Ngo. Towards textually describing complex video contents with audio-visual concept classifiers. In ACM Multimedia, 2011.
-
(2011)
ACM Multimedia
-
-
Tan, C.C.1
Jiang, Y.-G.2
Ngo, C.-W.3
-
27
-
-
84876945537
-
Dense trajectories and motion boundary descriptors for action recognition
-
H. Wang, A. Kläser, C. Schmid, and C. Liu. Dense trajectories and motion boundary descriptors for action recognition. IJCV, 2013.
-
(2013)
IJCV
-
-
Wang, H.1
Kläser, A.2
Schmid, C.3
Liu, C.4