Facebook Posts Text Classification to Improve Information Filtering

Randa Benkhelifa, Fatima Zohra Laallam

2016

Abstract

Facebook is one of the most widely used social networking sites. It is more than a simple website; it is a popular communication tool. Social network users communicate with one another by exchanging several kinds of content, including free text, images, and video. Today, social media users have a distinctive way of expressing themselves. They have created a new language known as “internet slang”, which conveys the same meaning using different lexical units. This unstructured text has its own specific characteristics: it is massive, noisy, and dynamic. It therefore requires novel preprocessing methods adapted to those characteristics in order to make classification algorithms easier to apply and more effective. Most previous work on social media text classification eliminates stopwords and classifies posts by topic (e.g. politics, sport, art). In this paper, we propose to classify posts at a lower level, into diverse pre-chosen classes, using three machine learning algorithms: SVM, Naïve Bayes, and K-NN. To improve our classification, we propose a new preprocessing approach based on stopwords, internet slang, and other specific lexical units. Finally, we compare all the results for each classifier, and then compare the results across classifiers.
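
As an illustration of the kind of preprocessing the abstract refers to, the following is a minimal sketch, not the authors' method. It assumes a small hand-made slang lexicon (`SLANG`, hypothetical), collapses elongated character runs, and leaves stopwords in place rather than removing them. The names `SLANG`, `ELONGATION`, `TOKEN`, and `normalize` are introduced here for illustration only.

```python
import re

# Hypothetical slang lexicon for illustration; the abstract does not
# specify which resources the paper actually uses.
SLANG = {"u": "you", "gr8": "great", "2day": "today", "thx": "thanks"}

# Collapse runs of three or more identical characters to two ("soooo" -> "soo").
ELONGATION = re.compile(r"(.)\1{2,}")

# Tokens are runs of lowercase letters, digits, and apostrophes.
TOKEN = re.compile(r"[a-z0-9']+")


def normalize(post):
    """Lowercase a post, shorten elongations, and map slang to standard words.

    Stopwords are deliberately kept (an assumption of this sketch), so that a
    downstream classifier can still use them as features.
    """
    text = ELONGATION.sub(r"\1\1", post.lower())
    tokens = TOKEN.findall(text)
    return [SLANG.get(tok, tok) for tok in tokens]


if __name__ == "__main__":
    print(normalize("Thx u, 2day was gr8!!!"))
```

The resulting token list could then be passed to any bag-of-words feature extractor before training a classifier such as SVM, Naïve Bayes, or K-NN.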

Paper Citation


in Harvard Style

Benkhelifa R. and Laallam F. (2016). Facebook Posts Text Classification to Improve Information Filtering. In Proceedings of the 12th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST, ISBN 978-989-758-186-1, pages 202-207. DOI: 10.5220/0005907702020207

in Bibtex Style

@conference{webist16,
author={Randa Benkhelifa and Fatima Zohra Laallam},
title={Facebook Posts Text Classification to Improve Information Filtering},
booktitle={Proceedings of the 12th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST},
year={2016},
pages={202-207},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005907702020207},
isbn={978-989-758-186-1},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 12th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST
TI - Facebook Posts Text Classification to Improve Information Filtering
SN - 978-989-758-186-1
AU - Benkhelifa R.
AU - Laallam F.
PY - 2016
SP - 202
EP - 207
DO - 10.5220/0005907702020207