Temporal-based Feature Selection and Transfer Learning for Text Categorization
Fumiyo Fukumoto, Yoshimi Suzuki
2015
Abstract
This paper addresses the text categorization problem in which the training data may derive from a different time period than the test data. We present a method for text categorization that minimizes the impact of temporal effects. Like much previous work on text categorization, we use feature selection. We select two types of informative terms according to corpus statistics. One is temporal independent terms, which are salient across the full temporal range of the training documents. The other is temporal dependent terms, which are important for a specific time period. For the training documents represented by independent and dependent terms, we apply boosting-based transfer learning to learn an accurate model for timeline adaptation. Results on Japanese data show that the method is comparable to the current state-of-the-art biased-SVM method: the macro-averaged F-score obtained by our method was 0.688, and that of biased-SVM was 0.671. Moreover, we found that the method is especially effective when the creation time period of the test data differs greatly from that of the training data.
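The reported scores are macro-averaged F-scores, that is, the unweighted mean of the per-category F-scores, so rare categories count as much as frequent ones. The following is a minimal sketch of that metric, assuming single-label classification and the balanced F1 variant; the function name `macro_f1` and the category labels are hypothetical and are not taken from the paper, whose exact evaluation setup may differ.

```python
from collections import defaultdict


def macro_f1(gold, pred):
    """Return the macro-averaged F1 score for single-label predictions.

    Assumption: each document has exactly one gold and one predicted category.
    """
    tp = defaultdict(int)  # true positives per category
    fp = defaultdict(int)  # false positives per category
    fn = defaultdict(int)  # false negatives per category

    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted category p wrongly
            fn[g] += 1  # missed the gold category g

    categories = sorted(set(gold) | set(pred))
    scores = []
    for c in categories:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)

    # Unweighted mean over categories: every category has equal influence.
    return sum(scores) / len(scores)


# Hypothetical labels, for illustration only.
gold = ["sports", "sports", "economy", "economy", "politics"]
pred = ["sports", "economy", "economy", "economy", "sports"]
score = macro_f1(gold, pred)
```

In this toy example the per-category F1 values are 0.5 (sports), 0.8 (economy), and 0.0 (politics), so the macro average is about 0.433. The single missed politics document pulls the score down sharply, which is why macro-averaging is sensitive to performance on small categories.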
Paper Citation
in Harvard Style
Fukumoto F. and Suzuki Y. (2015). Temporal-based Feature Selection and Transfer Learning for Text Categorization. In Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2015) ISBN 978-989-758-158-8, pages 17-26. DOI: 10.5220/0005593100170026
in Bibtex Style
@conference{kdir15,
author={Fumiyo Fukumoto and Yoshimi Suzuki},
title={Temporal-based Feature Selection and Transfer Learning for Text Categorization},
booktitle={Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2015)},
year={2015},
pages={17-26},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005593100170026},
isbn={978-989-758-158-8},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2015)
TI - Temporal-based Feature Selection and Transfer Learning for Text Categorization
SN - 978-989-758-158-8
AU - Fukumoto F.
AU - Suzuki Y.
PY - 2015
SP - 17
EP - 26
DO - 10.5220/0005593100170026