Cluster Analysis of Twitter Data: A Review of Algorithms

Noufa Alnajran, Keeley Crockett, David McLean, Annabel Latham

2017

Abstract

Twitter, a microblogging online social network (OSN), has quickly gained prominence because it gives people the opportunity to communicate and share posts and topics. Tremendous value lies in automatically analysing and reasoning about such data to derive meaningful insights, which offers potential opportunities for businesses, users, and consumers. However, the sheer volume, noise, and dynamism of Twitter impose challenges that hinder the efficacy of observing clusters with high intra-cluster similarity (i.e. minimum variance) and low inter-cluster similarity. This review focuses on research that has used various clustering algorithms to analyse Twitter data streams and identify hidden patterns in tweets, where text is highly unstructured. This paper performs a comparative analysis of unsupervised learning approaches in order to determine whether empirical findings support the enhancement of decision support and pattern recognition applications. A review of the literature identified 13 studies that implemented different clustering methods. The studies were compared in terms of clustering methods, algorithms, number of clusters, dataset size(s), distance measure, clustering features, evaluation methods, and results. The conclusion reports that the use of unsupervised learning in mining social media data has several weaknesses. Success criteria and future directions for research and practice are discussed.
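The clustering goal named in the abstract, high intra-cluster and low inter-cluster similarity, can be illustrated with a minimal sketch. The toy term-count vectors, the four-word vocabulary, the use of cosine similarity, and all function names below are assumptions made for illustration only; they are not taken from the paper or from any of the 13 reviewed studies.

```python
from itertools import combinations
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    # Treat an all-zero vector as having no similarity to anything.
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mean(values):
    """Arithmetic mean of an iterable; 0.0 for an empty one."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0


def intra_cluster_similarity(cluster):
    """Average pairwise similarity among members of one cluster."""
    return mean(cosine(a, b) for a, b in combinations(cluster, 2))


def inter_cluster_similarity(cluster_a, cluster_b):
    """Average similarity between members of two different clusters."""
    return mean(cosine(a, b) for a in cluster_a for b in cluster_b)


# Hypothetical tweets as term counts over the assumed vocabulary
# ["goal", "match", "vote", "election"].
sports = [[2, 1, 0, 0], [1, 2, 0, 0], [1, 1, 0, 0]]
politics = [[0, 0, 2, 1], [0, 0, 1, 1], [0, 1, 1, 2]]

print("intra (sports):  ", round(intra_cluster_similarity(sports), 3))
print("intra (politics):", round(intra_cluster_similarity(politics), 3))
print("inter:           ", round(inter_cluster_similarity(sports, politics), 3))
```

In this toy grouping the intra-cluster averages come out well above the inter-cluster average, which is the pattern a good clustering is expected to show. Real tweets are far noisier and sparser, which is the difficulty the abstract points to.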


Paper Citation


in Harvard Style

Alnajran N., Crockett K., McLean D. and Latham A. (2017). Cluster Analysis of Twitter Data: A Review of Algorithms. In Proceedings of the 9th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART, ISBN 978-989-758-220-2, pages 239-249. DOI: 10.5220/0006202802390249

in Bibtex Style

@conference{icaart17,
author={Noufa Alnajran and Keeley Crockett and David McLean and Annabel Latham},
title={Cluster Analysis of Twitter Data: A Review of Algorithms},
booktitle={Proceedings of the 9th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART},
year={2017},
pages={239-249},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006202802390249},
isbn={978-989-758-220-2},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 9th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART
TI - Cluster Analysis of Twitter Data: A Review of Algorithms
SN - 978-989-758-220-2
AU - Alnajran N.
AU - Crockett K.
AU - McLean D.
AU - Latham A.
PY - 2017
SP - 239
EP - 249
DO - 10.5220/0006202802390249