Creating Facets Hierarchy for Unstructured Arabic Documents
Khaled Nagi, Dalia Halim
2013
Abstract
Faceted search is becoming the standard search method on modern websites. Implementing a faceted search system requires a well-defined metadata structure for the searched items. Unfortunately, online text documents are usually plain text without any metadata describing their content. A variety of methods that take advantage of external lexical hierarchies to extract plain and hierarchical facets from textual content have recently been introduced. Meanwhile, the volume of Arabic documents accessible online is increasing every day. However, the Arabic language is not as well established on the web as English. In our work, we introduce a faceted search system for unstructured Arabic text. Since Arabic processing tools are not as mature as their English counterparts, we try two methods for building the facet hierarchy for Arabic terms. We then combine these methods into a hybrid one to get the best of both approaches. We assess the three methods using our prototype by searching real-life articles extracted from two sources: the BBC Arabic edition website and the Arab Sciencepedia website.
Paper Citation
in Harvard Style
Nagi K. and Halim D. (2013). Creating Facets Hierarchy for Unstructured Arabic Documents. In Proceedings of the International Conference on Knowledge Engineering and Ontology Development - Volume 1: KEOD, (IC3K 2013) ISBN 978-989-8565-81-5, pages 109-119. DOI: 10.5220/0004621201090119
in Bibtex Style
@conference{keod13,
author={Khaled Nagi and Dalia Halim},
title={Creating Facets Hierarchy for Unstructured Arabic Documents},
booktitle={Proceedings of the International Conference on Knowledge Engineering and Ontology Development - Volume 1: KEOD, (IC3K 2013)},
year={2013},
pages={109-119},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004621201090119},
isbn={978-989-8565-81-5},
}
in EndNote Style
TY - CONF
JO - Proceedings of the International Conference on Knowledge Engineering and Ontology Development - Volume 1: KEOD, (IC3K 2013)
TI - Creating Facets Hierarchy for Unstructured Arabic Documents
SN - 978-989-8565-81-5
AU - Nagi K.
AU - Halim D.
PY - 2013
SP - 109
EP - 119
DO - 10.5220/0004621201090119