TY - GEN
T1 - Detecting visual text
AU - Dodge, Jesse
AU - Goyal, Amit
AU - Han, Xufeng
AU - Mensch, Alyssa
AU - Mitchell, Margaret
AU - Stratos, Karl
AU - Yamaguchi, Kota
AU - Choi, Yejin
AU - Daumé, Hal
AU - Berg, Alexander C.
AU - Berg, Tamara L.
N1 - Publisher Copyright: © 2012 Association for Computational Linguistics.
PY - 2012
Y1 - 2012
N2 - When people describe a scene, they often include information that is not visually apparent, sometimes based on background knowledge, sometimes to tell a story. We aim to separate visual text (descriptions of what is being seen) from non-visual text in natural images and their descriptions. To do so, we first concretely define what it means to be visual, annotate visual text, and then develop algorithms to automatically classify noun phrases as visual or non-visual. We find that using text alone we can achieve high accuracy on this task, and that incorporating features derived from computer vision algorithms improves performance. Finally, we show that we can reliably mine visual nouns and adjectives from large corpora and use them effectively in the classification task.
UR - https://www.scopus.com/pages/publications/84901455535
M3 - Conference contribution
T3 - NAACL HLT 2012 - 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference
SP - 762
EP - 772
BT - Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics
PB - Association for Computational Linguistics (ACL)
T2 - 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2012
Y2 - 3 June 2012 through 8 June 2012
ER -