Temporal-based Feature Selection and Transfer Learning for Text Categorization

Fumiyo Fukumoto, Yoshimi Suzuki

2015

Abstract

This paper addresses the text categorization problem in which the training data may derive from a different time period than the test data. We present a text categorization method that minimizes the impact of such temporal effects. Like much previous work on text categorization, we use feature selection. We select two types of informative terms according to corpus statistics: temporally independent terms, which are salient across the full temporal range of the training documents, and temporally dependent terms, which are important for a specific time period. To the training documents represented by these independent and dependent terms, we apply boosting-based transfer learning to learn an accurate model for timeline adaptation. Results on Japanese data show that the method is comparable to the current state-of-the-art biased-SVM method: the macro-averaged F-score obtained by our method was 0.688, and that of biased-SVM was 0.671. Moreover, we found that the method is especially effective when the creation time period of the test data differs greatly from that of the training data.
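The split between independent and dependent terms can be illustrated with a minimal sketch. The abstract does not say which corpus statistic the paper uses, so the criterion below is an assumption made purely for illustration: a term counts as period-dependent when one period holds most of its per-period document-frequency rate, and as period-independent otherwise. The function name `split_terms` and the parameters `share_threshold` and `min_df` are hypothetical and do not come from the paper.

```python
from collections import Counter, defaultdict


def split_terms(docs, share_threshold=0.8, min_df=2):
    """Split terms into period-independent and period-dependent sets.

    docs: iterable of (period, tokens) pairs.
    Returns (independent, dependent), where `independent` is a set of terms
    and `dependent` maps each term to its dominant period.

    NOTE: the concentration criterion is an illustrative assumption,
    not the statistic used in the paper.
    """
    period_sizes = Counter()             # period -> number of documents
    doc_freq = defaultdict(Counter)      # term -> period -> document frequency
    for period, tokens in docs:
        period_sizes[period] += 1
        for term in set(tokens):
            doc_freq[term][period] += 1

    independent, dependent = set(), {}
    for term, counts in doc_freq.items():
        if sum(counts.values()) < min_df:
            continue  # too rare to judge either way
        # Fraction of each period's documents that contain the term.
        rates = {p: counts[p] / period_sizes[p] for p in period_sizes}
        top_period, top_rate = max(rates.items(), key=lambda kv: kv[1])
        # Share of the summed rates that falls in the dominant period.
        share = top_rate / sum(rates.values())
        if share >= share_threshold:
            dependent[term] = top_period
        else:
            independent.add(term)
    return independent, dependent


# Toy corpus with two periods (hypothetical data, for illustration only).
docs = [
    ("1991", ["election", "soviet", "vote"]),
    ("1991", ["soviet", "election"]),
    ("2010", ["election", "twitter", "vote"]),
    ("2010", ["twitter", "election"]),
]
independent, dependent = split_terms(docs)
```

On this toy corpus, terms that appear at the same rate in both periods land in `independent`, while terms confined to one period land in `dependent` with that period attached. The boosting-based transfer learning step is not sketched here.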


Paper Citation


in Harvard Style

Fukumoto F. and Suzuki Y. (2015). Temporal-based Feature Selection and Transfer Learning for Text Categorization. In Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2015) ISBN 978-989-758-158-8, pages 17-26. DOI: 10.5220/0005593100170026

in Bibtex Style

@conference{kdir15,
author={Fumiyo Fukumoto and Yoshimi Suzuki},
title={Temporal-based Feature Selection and Transfer Learning for Text Categorization},
booktitle={Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2015)},
year={2015},
pages={17-26},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005593100170026},
isbn={978-989-758-158-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2015)
TI - Temporal-based Feature Selection and Transfer Learning for Text Categorization
SN - 978-989-758-158-8
AU - Fukumoto F.
AU - Suzuki Y.
PY - 2015
SP - 17
EP - 26
DO - 10.5220/0005593100170026