Creating Facets Hierarchy for Unstructured Arabic Documents

Khaled Nagi, Dalia Halim

2013

Abstract

Faceted search is becoming the standard search method on modern websites. Implementing a faceted search system requires a well-defined metadata structure for the searched items. Unfortunately, online text documents are usually plain text, without any metadata to describe their content. A variety of methods that take advantage of external lexical hierarchies to extract plain and hierarchical facets from textual content have recently been introduced. Meanwhile, the volume of Arabic documents that can be accessed online is increasing every day. However, the Arabic language is not as established on the web as English. In our work, we introduce a faceted search system for unstructured Arabic text. Since Arabic processing tools are less mature than their English counterparts, we try two methods for building the facet hierarchy for Arabic terms. We then combine these methods into a hybrid one to get the best of both approaches. We assess the three methods using our prototype by searching real-life articles extracted from two sources: the BBC Arabic edition website and the Arab Sciencepedia website.

Paper Citation


in Harvard Style

Nagi K. and Halim D. (2013). Creating Facets Hierarchy for Unstructured Arabic Documents. In Proceedings of the International Conference on Knowledge Engineering and Ontology Development - Volume 1: KEOD, (IC3K 2013) ISBN 978-989-8565-81-5, pages 109-119. DOI: 10.5220/0004621201090119

in Bibtex Style

@conference{keod13,
author={Khaled Nagi and Dalia Halim},
title={Creating Facets Hierarchy for Unstructured Arabic Documents},
booktitle={Proceedings of the International Conference on Knowledge Engineering and Ontology Development - Volume 1: KEOD, (IC3K 2013)},
year={2013},
pages={109-119},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004621201090119},
isbn={978-989-8565-81-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Knowledge Engineering and Ontology Development - Volume 1: KEOD, (IC3K 2013)
TI - Creating Facets Hierarchy for Unstructured Arabic Documents
SN - 978-989-8565-81-5
AU - Nagi K.
AU - Halim D.
PY - 2013
SP - 109
EP - 119
DO - 10.5220/0004621201090119