ANOMALY-BASED SPAM FILTERING

Igor Santos, Carlos Laorden, Xabier Ugarte-Pedrero, Borja Sanz, Pablo G. Bringas

2011

Abstract

Spam has become an important problem for computer security because it is a channel for spreading threats such as computer viruses, worms and phishing. Currently, more than 85% of received e-mails are spam. Historical approaches to combating these messages, including simple techniques such as sender blacklisting or the use of e-mail signatures, are no longer completely reliable. Many solutions use machine-learning approaches trained on statistical representations of the terms that usually appear in e-mails. However, these methods require a time-consuming training step with labelled data. The limited availability of labelled training instances slows the progress of filtering systems and gives spammers an advantage. In this paper, we present the first spam filtering method based on anomaly detection, which reduces the need to label spam messages and employs only the representation of legitimate e-mails. This approach represents legitimate e-mails as word frequency vectors. An e-mail is then classified as spam or legitimate by measuring its deviation from the representation of the legitimate e-mails. We show that this method achieves high accuracy in detecting spam while maintaining a low false positive rate and reducing the effort of labelling spam.
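
The idea in the abstract, representing legitimate e-mails as word frequency vectors and flagging a message by its deviation from them, can be illustrated with a minimal sketch. The abstract does not state the tokenisation, the distance measure, how distances are combined, or how the threshold is chosen, so everything below is an assumption made for illustration: a simple regex tokeniser, an extra slot for out-of-vocabulary words, Euclidean distance, the mean distance to all legitimate vectors, and a hand-picked threshold. All function names are hypothetical and not taken from the paper.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumption: lowercase alphanumeric tokens; the abstract does not specify this.
    return re.findall(r"[a-z0-9']+", text.lower())


def build_vocabulary(legit_texts):
    # The vocabulary is built from legitimate e-mails only.
    vocab = set()
    for text in legit_texts:
        vocab.update(tokenize(text))
    return sorted(vocab)


def term_frequencies(text, vocabulary):
    # Relative frequency of each vocabulary term, plus one final slot holding
    # the fraction of out-of-vocabulary tokens (an assumption of this sketch).
    tokens = tokenize(text)
    total = len(tokens) or 1
    counts = Counter(tokens)
    known = set(vocabulary)
    vector = [counts[term] / total for term in vocabulary]
    unknown = sum(c for term, c in counts.items() if term not in known)
    vector.append(unknown / total)
    return vector


def euclidean(a, b):
    # Assumed distance measure; other measures could be substituted.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def deviation(vector, legit_vectors):
    # Assumed combination rule: mean distance to every legitimate vector.
    return sum(euclidean(vector, v) for v in legit_vectors) / len(legit_vectors)


def is_spam(text, vocabulary, legit_vectors, threshold):
    # A message is flagged when it deviates from legitimate mail by more
    # than the (hand-picked, hypothetical) threshold.
    vector = term_frequencies(text, vocabulary)
    return deviation(vector, legit_vectors) > threshold


if __name__ == "__main__":
    legit = [
        "meeting at noon tomorrow",
        "project meeting notes attached",
        "lunch at noon",
    ]
    vocabulary = build_vocabulary(legit)
    legit_vectors = [term_frequencies(t, vocabulary) for t in legit]
    for message in ["meeting at noon", "win free money now win"]:
        vec = term_frequencies(message, vocabulary)
        print(message, round(deviation(vec, legit_vectors), 3))
```

Only legitimate messages are needed to build the model here, which mirrors the labelling saving the abstract describes; in practice the threshold would have to be tuned on held-out data.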

Paper Citation


in Harvard Style

Santos I., Laorden C., Ugarte-Pedrero X., Sanz B. and G. Bringas P. (2011). ANOMALY-BASED SPAM FILTERING. In Proceedings of the International Conference on Security and Cryptography - Volume 1: SECRYPT, (ICETE 2011) ISBN 978-989-8425-71-3, pages 5-14. DOI: 10.5220/0003444700050014

in Bibtex Style

@conference{secrypt11,
author={Igor Santos and Carlos Laorden and Xabier Ugarte-Pedrero and Borja Sanz and Pablo G. Bringas},
title={ANOMALY-BASED SPAM FILTERING},
booktitle={Proceedings of the International Conference on Security and Cryptography - Volume 1: SECRYPT, (ICETE 2011)},
year={2011},
pages={5-14},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003444700050014},
isbn={978-989-8425-71-3},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Security and Cryptography - Volume 1: SECRYPT, (ICETE 2011)
TI - ANOMALY-BASED SPAM FILTERING
SN - 978-989-8425-71-3
AU - Santos I.
AU - Laorden C.
AU - Ugarte-Pedrero X.
AU - Sanz B.
AU - G. Bringas P.
PY - 2011
SP - 5
EP - 14
DO - 10.5220/0003444700050014