Efficient Social Network Multilingual Classification using Character, POS n-grams and Dynamic Normalization

Carlos-Emiliano González-Gallardo, Juan-Manuel Torres-Moreno, Azucena Montes Rendón, Gerardo Sierra

2016

Abstract

This paper describes a dynamic normalization process applied to multilingual social network documents (Facebook and Twitter) to improve performance on the author profiling task for short texts. After normalization, character n-grams and POS-tag n-grams are extracted to capture the stylistic information encoded in the documents (emoticons, character flooding, capital letters, references to other users, hyperlinks, hashtags, etc.). Experiments with an SVM classifier reached up to 90% performance.
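
To make the pipeline in the abstract concrete, here is a minimal sketch of the first two steps: normalizing a short text and counting character n-grams. It is an illustration under assumptions, not the authors' implementation. The placeholder tokens (`<url>`, `<user>`, `<tag>`), the regular expressions, and the function names `normalize` and `char_ngrams` are all hypothetical. POS tagging and the SVM step are left out.

```python
import re
from collections import Counter

# Hypothetical patterns and placeholder tokens; the paper's actual
# normalization rules are not given in the abstract.
URL_RE = re.compile(r"https?://\S+")
USER_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")


def normalize(text: str) -> str:
    """Replace hyperlinks, user references and hashtags with fixed tokens."""
    text = URL_RE.sub("<url>", text)
    text = USER_RE.sub("<user>", text)
    text = HASHTAG_RE.sub("<tag>", text)
    return text


def char_ngrams(text: str, n: int) -> Counter:
    """Count overlapping character n-grams of length n."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


sample = "Sooo goood!!! @ana see http://t.co/x #fun"
normalized = normalize(sample)
features = char_ngrams(normalized, 3)

print(normalized)
print(features.most_common(3))
```

Collapsing every link, mention and hashtag to one token keeps the signal that the element was used while discarding its specific value. Character flooding ("Sooo", "!!!") and capitalization stay in the text, so the n-gram counts still reflect them. Counts like these could then be turned into feature vectors for a classifier such as an SVM.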


Paper Citation


in Harvard Style

González-Gallardo C., Torres-Moreno J., Montes Rendón A. and Sierra G. (2016). Efficient Social Network Multilingual Classification using Character, POS n-grams and Dynamic Normalization. In Proceedings of the 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2016) ISBN 978-989-758-203-5, pages 307-314. DOI: 10.5220/0006052803070314

in Bibtex Style

@conference{kdir16,
author={Carlos-Emiliano González-Gallardo and Juan-Manuel Torres-Moreno and Azucena Montes Rendón and Gerardo Sierra},
title={Efficient Social Network Multilingual Classification using Character, POS n-grams and Dynamic Normalization},
booktitle={Proceedings of the 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2016)},
year={2016},
pages={307-314},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006052803070314},
isbn={978-989-758-203-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2016)
TI - Efficient Social Network Multilingual Classification using Character, POS n-grams and Dynamic Normalization
SN - 978-989-758-203-5
AU - González-Gallardo C.
AU - Torres-Moreno J.
AU - Montes Rendón A.
AU - Sierra G.
PY - 2016
SP - 307
EP - 314
DO - 10.5220/0006052803070314