Authorship Attribution using Variable Length Part-of-Speech Patterns

Yao Jean Marc Pokou, Philippe Fournier-Viger, Chadia Moghrabi

2016

Abstract

Identifying the author of a book or document is an interesting research topic with numerous real-life applications. A number of algorithms have been proposed for the automatic authorship attribution of texts. However, it remains an important challenge to find distinct and quantifiable features for accurately identifying, or narrowing the range of, likely authors of a text. In this paper, we propose a novel approach for authorship attribution, which relies on the discovery of variable-length sequential patterns of parts of speech to build signatures representing each author’s writing style. An experimental evaluation was carried out using 10 authors and 30 books from Project Gutenberg, consisting of 2,615,856 words. Results show that the proposed approach can accurately classify texts most of the time using a very small number of variable-length patterns. The proposed approach is also shown to perform better with variable-length patterns than with fixed-length patterns (bigrams or trigrams).
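The idea of a part-of-speech signature can be illustrated with a minimal sketch. It is not the method of the paper: the abstract does not specify the pattern-mining algorithm or the similarity measure, so the sketch assumes contiguous tag patterns, ranking by sentence-level support, and Jaccard overlap between signatures. The function names (`pos_patterns`, `build_signature`, `attribute`) and the parameters `k`, `min_len`, and `max_len` are hypothetical, and the input is assumed to be already tagged (for example with Penn Treebank tags).

```python
from collections import Counter


def pos_patterns(tags, min_len=1, max_len=4):
    """Yield every contiguous tag pattern of length min_len..max_len.

    Assumption: patterns are contiguous; the paper's sequential
    patterns may be defined differently.
    """
    for n in range(min_len, max_len + 1):
        for i in range(len(tags) - n + 1):
            yield tuple(tags[i:i + n])


def build_signature(tagged_sentences, k=5, min_len=1, max_len=4):
    """Return the k patterns that occur in the most sentences.

    Support is counted once per sentence; ties are broken by the
    lexicographic order of the pattern so the result is deterministic.
    """
    support = Counter()
    for tags in tagged_sentences:
        support.update(set(pos_patterns(tags, min_len, max_len)))
    ranked = sorted(support.items(), key=lambda kv: (-kv[1], kv[0]))
    return {pattern for pattern, _ in ranked[:k]}


def attribute(signatures, unknown_signature):
    """Return the author whose signature has the largest Jaccard overlap.

    Assumption: Jaccard similarity stands in for whatever measure
    the paper actually uses.
    """
    def jaccard(a, b):
        union = a | b
        return len(a & b) / len(union) if union else 0.0

    return max(sorted(signatures),
               key=lambda author: jaccard(signatures[author], unknown_signature))
```

A caller would build one signature per known author from that author's tagged sentences, build a signature for the anonymous text in the same way, and pass both to `attribute`. Setting `min_len` and `max_len` to the same value reduces the sketch to fixed-length patterns such as bigrams or trigrams.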

Paper Citation


in Harvard Style

Pokou Y., Fournier-Viger P. and Moghrabi C. (2016). Authorship Attribution using Variable Length Part-of-Speech Patterns. In Proceedings of the 8th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART, ISBN 978-989-758-172-4, pages 354-361. DOI: 10.5220/0005710103540361

in Bibtex Style

@conference{icaart16,
author={Yao Jean Marc Pokou and Philippe Fournier-Viger and Chadia Moghrabi},
title={Authorship Attribution using Variable Length Part-of-Speech Patterns},
booktitle={Proceedings of the 8th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART},
year={2016},
pages={354-361},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005710103540361},
isbn={978-989-758-172-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 8th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART
TI - Authorship Attribution using Variable Length Part-of-Speech Patterns
SN - 978-989-758-172-4
AU - Pokou Y.
AU - Fournier-Viger P.
AU - Moghrabi C.
PY - 2016
SP - 354
EP - 361
DO - 10.5220/0005710103540361