Combining Text Semantics and Image Geometry to Improve Scene Interpretation

Dennis Medved; Fangyuan Jiang; Peter Exner; Magnus Oskarsson; Pierre Nugues; Kalle Aström

doi:10.5220/0004752004790486

2014

Abstract

In this paper, we describe a novel system that identifies relations between the objects extracted from an image. We started from the idea that, in addition to the geometric and visual properties of the image objects, we could exploit lexical and semantic information from the text accompanying the image. As an experimental setup, we gathered a corpus of images from Wikipedia together with their associated articles. We extracted two types of objects, human beings and horses, and considered three relations that could hold between them: Ride, Lead, or None. We used geometric features as a baseline to identify the relations between the entities, and we describe the improvements brought by adding bag-of-words features and predicate–argument structures derived from the text. The best semantic model resulted in a relative error reduction of more than 18% over the baseline.
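As a rough illustration of the idea in the abstract, the sketch below concatenates a few geometric features computed from two bounding boxes with bag-of-words counts from the accompanying text, and shows how a relative error reduction is computed. It is a minimal sketch under stated assumptions, not the authors' implementation: the vocabulary, the three geometric features, the box format, and all function names are hypothetical, since the abstract does not specify them.

```python
from collections import Counter

# Hypothetical vocabulary; the abstract does not list the words actually used.
VOCAB = ["ride", "rider", "lead", "jockey", "groom"]


def geometric_features(person, horse):
    """Return illustrative geometric features for a person/horse pair.

    Boxes are assumed to be (x_min, y_min, x_max, y_max) in pixels,
    with y growing downward. The feature choice is an assumption.
    """
    # Box centers.
    px, py = (person[0] + person[2]) / 2, (person[1] + person[3]) / 2
    hx, hy = (horse[0] + horse[2]) / 2, (horse[1] + horse[3]) / 2
    horse_w = horse[2] - horse[0]
    horse_h = horse[3] - horse[1]
    # Width and height of the intersection rectangle (0 if disjoint).
    ix = max(0, min(person[2], horse[2]) - max(person[0], horse[0]))
    iy = max(0, min(person[3], horse[3]) - max(person[1], horse[1]))
    person_area = (person[2] - person[0]) * (person[3] - person[1])
    return [
        (px - hx) / horse_w,       # horizontal offset, scaled by horse width
        (py - hy) / horse_h,       # vertical offset, scaled by horse height
        ix * iy / person_area,     # fraction of the person box overlapping the horse
    ]


def bag_of_words(text):
    """Count occurrences of each vocabulary word in whitespace-split text."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in VOCAB]


def combined_features(person, horse, text):
    """Concatenate geometric and textual features into one vector."""
    return geometric_features(person, horse) + bag_of_words(text)


def relative_error_reduction(baseline_error, model_error):
    """Fraction of the baseline error removed by the improved model."""
    return (baseline_error - model_error) / baseline_error
```

A vector produced by `combined_features` could then be passed to any standard classifier to predict Ride, Lead, or None; which classifier the authors used is not stated in the abstract.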

Paper Citation

in Harvard Style

Medved D., Jiang F., Exner P., Oskarsson M., Nugues P. and Aström K. (2014). Combining Text Semantics and Image Geometry to Improve Scene Interpretation. In Proceedings of the 3rd International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM, ISBN 978-989-758-018-5, pages 479-486. DOI: 10.5220/0004752004790486

in Bibtex Style

@conference{icpram14,
author={Dennis Medved and Fangyuan Jiang and Peter Exner and Magnus Oskarsson and Pierre Nugues and Kalle Aström},
title={Combining Text Semantics and Image Geometry to Improve Scene Interpretation},
booktitle={Proceedings of the 3rd International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM},
year={2014},
pages={479-486},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004752004790486},
isbn={978-989-758-018-5},
}

in EndNote Style

TY - CONF
JO - Proceedings of the 3rd International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM
TI - Combining Text Semantics and Image Geometry to Improve Scene Interpretation
SN - 978-989-758-018-5
AU - Medved D.
AU - Jiang F.
AU - Exner P.
AU - Oskarsson M.
AU - Nugues P.
AU - Aström K.
PY - 2014
SP - 479
EP - 486
DO - 10.5220/0004752004790486