News Classifications with Labeled LDA

Yiqi Bai, Jie Wang

2015

Abstract

Automatically categorizing news articles with high accuracy is an important task in an automated quick news system. We present two classifiers for news articles based on Labeled Latent Dirichlet Allocation, called LLDA-C and SLLDA-C. To verify classification accuracy, we compare the results obtained by these classifiers with those produced by trained professionals. Through extensive experiments, we show that both LLDA-C and SLLDA-C outperform SVM (Support Vector Machine, our baseline classifier) in precision, particularly when only a small training dataset is available. SLLDA-C is also much more efficient than SVM. In terms of recall, we show that LLDA-C is better than SVM. In terms of average Macro-F1 and Micro-F1 scores, we show that the LLDA classifiers are superior to SVM. To further explore the classification of news articles, we introduce the notion of content complexity and study how it affects classification.
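The abstract refers to precision, recall, and average Macro-F1 and Micro-F1 scores, which are standard multi-class evaluation metrics. The snippet below is a minimal sketch, not taken from the paper, showing how such scores are typically computed with scikit-learn when comparing a classifier's predicted categories against the professionals' labels; the category names and label lists are hypothetical.

from sklearn.metrics import f1_score

# Hypothetical gold labels (assigned by trained professionals) and
# predicted labels from a classifier such as LLDA-C, SLLDA-C, or SVM.
y_true = ["sports", "politics", "sports", "finance", "politics", "finance"]
y_pred = ["sports", "politics", "finance", "finance", "sports", "finance"]

categories = ["sports", "politics", "finance"]

# Macro-F1: F1 computed per category, then averaged (all categories weighted equally).
macro_f1 = f1_score(y_true, y_pred, labels=categories, average="macro")

# Micro-F1: F1 computed from counts pooled over all categories
# (dominated by the more frequent categories).
micro_f1 = f1_score(y_true, y_pred, labels=categories, average="micro")

print(f"Macro-F1: {macro_f1:.3f}")
print(f"Micro-F1: {micro_f1:.3f}")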

Paper Citation


in Harvard Style

Bai Y. and Wang J. (2015). News Classifications with Labeled LDA. In Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2015), ISBN 978-989-758-158-8, pages 75-83. DOI: 10.5220/0005610600750083

in Bibtex Style

@conference{kdir15,
author={Yiqi Bai and Jie Wang},
title={News Classifications with Labeled LDA},
booktitle={Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2015)},
year={2015},
pages={75-83},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005610600750083},
isbn={978-989-758-158-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2015)
TI - News Classifications with Labeled LDA
SN - 978-989-758-158-8
AU - Bai Y.
AU - Wang J.
PY - 2015
SP - 75
EP - 83
DO - 10.5220/0005610600750083