A COMPLETE SYSTEM FOR DETECTION AND RECOGNITION OF TEXT IN GRAPHICAL DOCUMENTS USING BACKGROUND INFORMATION

Partha Pratim Roy, Josep Lladós, Umapada Pal

2009

Abstract

Automatic retrieval of text and symbols in graphical documents (maps, engineering drawings) is challenging because text lines are usually not parallel to each other. They are multi-oriented and curved, since they annotate curved graphical lines and therefore follow a curvilinear path. Text and symbols also frequently touch or overlap graphical components (rivers, streets, border lines), which compounds the problem. For OCR of such documents, individual text lines and their corresponding words and characters must be extracted. In this paper, we propose a methodology to extract individual text lines, and an approach for recognizing the extracted text characters, from such complex graphical documents. The methodology is based on the foreground and background information of the text components. To capture background information, the water reservoir concept and the convex hull are used. For recognition of multi-font, multi-scale and multi-oriented characters, a Support Vector Machine (SVM) based classifier is applied. Circular ring and convex hull features are used along with angular information of the contour pixels of the characters to make the features rotation and scale invariant.
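As a rough illustration of why circular rings can give rotation and scale invariance, the sketch below bins the points of a toy contour into concentric rings around their centroid, with radii normalized by the largest radius. It is not the feature set of the paper, which also uses the convex hull, angular information of contour pixels and an SVM classifier. The names `ring_histogram` and `transform`, the number of rings and the toy contour are all assumptions made for this example.

```python
import math

def ring_histogram(points, n_rings=4):
    """Fraction of points falling in each concentric ring around the centroid.

    Radii are divided by the largest radius, so uniform scaling does not
    change the result; only distances to the centroid are used, so rotation
    does not change it either. (Hypothetical helper, not from the paper.)
    """
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    radii = [math.hypot(x - cx, y - cy) for x, y in points]
    r_max = max(radii)
    if r_max == 0:
        raise ValueError("need at least two distinct points")
    counts = [0] * n_rings
    for r in radii:
        # The outermost point has r / r_max == 1.0, so clamp it to the last ring.
        ring = min(int(r / r_max * n_rings), n_rings - 1)
        counts[ring] += 1
    return [c / len(points) for c in counts]

def transform(points, angle_deg, scale):
    """Rotate points about the origin by angle_deg, then scale them uniformly."""
    a = math.radians(angle_deg)
    return [(scale * (x * math.cos(a) - y * math.sin(a)),
             scale * (x * math.sin(a) + y * math.cos(a))) for x, y in points]

# Toy "L"-shaped contour; the coordinates are made up for illustration.
contour = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 6), (0, 6)]

original = ring_histogram(contour)
rotated_scaled = ring_histogram(transform(contour, angle_deg=37, scale=2.5))
print(original)
print(rotated_scaled)
```

For this toy contour, the two printed histograms should match, since every point keeps its relative distance to the centroid under rotation and uniform scaling. Points lying very close to a ring boundary could switch rings through floating-point rounding, so a practical feature would likely need softer binning.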


Paper Citation


in Harvard Style

Pratim Roy P., Lladós J. and Pal U. (2009). A COMPLETE SYSTEM FOR DETECTION AND RECOGNITION OF TEXT IN GRAPHICAL DOCUMENTS USING BACKGROUND INFORMATION. In Proceedings of the Fourth International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2009) ISBN 978-989-8111-69-2, pages 209-216. DOI: 10.5220/0001801902090216

in Bibtex Style

@conference{visapp09,
author={Partha Pratim Roy and Josep Lladós and Umapada Pal},
title={A COMPLETE SYSTEM FOR DETECTION AND RECOGNITION OF TEXT IN GRAPHICAL DOCUMENTS USING BACKGROUND INFORMATION},
booktitle={Proceedings of the Fourth International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2009)},
year={2009},
pages={209-216},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001801902090216},
isbn={978-989-8111-69-2},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Fourth International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2009)
TI - A COMPLETE SYSTEM FOR DETECTION AND RECOGNITION OF TEXT IN GRAPHICAL DOCUMENTS USING BACKGROUND INFORMATION
SN - 978-989-8111-69-2
AU - Pratim Roy P.
AU - Lladós J.
AU - Pal U.
PY - 2009
SP - 209
EP - 216
DO - 10.5220/0001801902090216