Estimating Sentiment via Probability and Information Theory

Kevin Labille, Sultan Alfarhood, Susan Gauch

2016

Abstract

Opinion detection and opinion analysis are challenging but important tasks. Such sentiment analysis can be done using traditional supervised learning methods, such as naive Bayes classification and support vector machines (SVM), or using unsupervised approaches based on a lexicon. Because lexicon-based sentiment analysis methods rely on an opinion dictionary, that is, a list of opinion-bearing or sentiment words, sentiment lexicons play a key role. Our work focuses on the task of generating such a lexicon. We propose several novel methods to automatically generate a general-purpose sentiment lexicon using a corpus-based approach. While most existing methods generate a lexicon using a list of seed sentiment words and a domain corpus, our work differs by generating a lexicon from scratch, applying probabilistic and information-theoretic text mining techniques to a large, diverse corpus. We conclude by presenting an ensemble method that combines the two approaches. We evaluate and demonstrate the effectiveness of our methods by using the automatically generated lexicons for sentiment analysis. Our best single lexicon achieves an accuracy of 87.60% and the ensemble approach achieves 88.75%, both statistically significant improvements over the 81.60% obtained with a widely used sentiment lexicon.


Paper Citation


in Harvard Style

Labille K., Alfarhood S. and Gauch S. (2016). Estimating Sentiment via Probability and Information Theory. In Proceedings of the 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2016) ISBN 978-989-758-203-5, pages 121-129. DOI: 10.5220/0006072101210129

in Bibtex Style

@conference{kdir16,
author={Kevin Labille and Sultan Alfarhood and Susan Gauch},
title={Estimating Sentiment via Probability and Information Theory},
booktitle={Proceedings of the 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2016)},
year={2016},
pages={121-129},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006072101210129},
isbn={978-989-758-203-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2016)
TI - Estimating Sentiment via Probability and Information Theory
SN - 978-989-758-203-5
AU - Labille K.
AU - Alfarhood S.
AU - Gauch S.
PY - 2016
SP - 121
EP - 129
DO - 10.5220/0006072101210129