Behavior-based Malware Analysis using Profile Hidden Markov Models

Saradha Ravi, N. Balakrishnan, Bharath Venkatesh

2013

Abstract

In malware analysis, static binary analysis is becoming increasingly difficult because of the code obfuscation and packing used by malware authors. For this reason, behavior-based analysis techniques are used in large malware analysis systems. In these dynamic analysis systems, malware samples are executed and monitored in a controlled environment using tools such as CWSandbox (Willems et al., 2007). Previous work has applied a number of clustering and classification techniques from machine learning and data mining to the resulting behavior reports, both to classify malware into families and to identify new families. In our work, we propose to use the Profile Hidden Markov Model (PHMM) to classify malware files into families or groups based on their behavior on the host system. PHMMs have been used extensively in bioinformatics to search for similar protein and DNA sequences in large databases. We find that this model helps overcome the hurdle posed by polymorphism, which is common in malware today. We show that the classification accuracy is high and comparable with state-of-the-art methods, even when very few training samples are used to build the models. The experiments were run first on a dataset with 24 families and later on a larger dataset with close to 400 different malware families. For the large dataset, a fast clustering method was used to group malware with similar behavior after scoring against the PHMM profile database. We present the challenges in evaluation methods and metrics for clustering large numbers of malware files and show the effectiveness of profile hidden Markov models for known malware families.
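
To make the scoring idea concrete, the following is a minimal sketch of classifying a behavior sequence by its best Viterbi log-odds score against one profile per family. It is not the authors' implementation. The single-letter behavior alphabet, the family names "A" and "B", the fixed transition probabilities, the uniform background, and the helper names (train_profile, viterbi_score, classify) are all assumptions made for illustration. A real PHMM would estimate position-specific transitions and insert emissions from a gapped multiple alignment.

```python
import math
from collections import Counter

NEG_INF = float("-inf")

# Hypothetical transition probabilities, shared by every profile position.
# A real PHMM would estimate these per position from a gapped alignment.
LOG_T = {name: math.log(p) for name, p in {
    "MM": 0.90, "MI": 0.05, "MD": 0.05,
    "IM": 0.80, "II": 0.15, "ID": 0.05,
    "DM": 0.80, "DI": 0.05, "DD": 0.15,
}.items()}


def train_profile(aligned, alphabet, pseudocount=1.0):
    """Build per-column log-odds match emissions from equal-length sequences."""
    background = 1.0 / len(alphabet)  # uniform background model (assumption)
    profile = []
    for column in zip(*aligned):
        counts = Counter(column)
        total = len(column) + pseudocount * len(alphabet)
        profile.append({
            sym: math.log((counts[sym] + pseudocount) / total / background)
            for sym in alphabet
        })
    return profile


def viterbi_score(profile, seq):
    """Log-odds score of the best match/insert/delete path through the profile.

    Every symbol in seq must belong to the alphabet used for training.
    """
    L, n = len(profile), len(seq)
    M = [[NEG_INF] * (n + 1) for _ in range(L + 1)]
    I = [[NEG_INF] * (n + 1) for _ in range(L + 1)]
    D = [[NEG_INF] * (n + 1) for _ in range(L + 1)]
    M[0][0] = 0.0  # begin state
    for j in range(L + 1):
        for i in range(n + 1):
            if j > 0 and i > 0:  # match: consume seq[i-1] at column j
                M[j][i] = profile[j - 1][seq[i - 1]] + max(
                    M[j - 1][i - 1] + LOG_T["MM"],
                    I[j - 1][i - 1] + LOG_T["IM"],
                    D[j - 1][i - 1] + LOG_T["DM"])
            if i > 0:  # insert: consume seq[i-1], stay at column j
                # Insert emissions equal the background, so their log-odds is 0.
                I[j][i] = max(
                    M[j][i - 1] + LOG_T["MI"],
                    I[j][i - 1] + LOG_T["II"],
                    D[j][i - 1] + LOG_T["DI"])
            if j > 0:  # delete: skip column j without consuming a symbol
                D[j][i] = max(
                    M[j - 1][i] + LOG_T["MD"],
                    I[j - 1][i] + LOG_T["ID"],
                    D[j - 1][i] + LOG_T["DD"])
    # Transition from the last column into the end state.
    return max(M[L][n] + LOG_T["MM"],
               I[L][n] + LOG_T["IM"],
               D[L][n] + LOG_T["DM"])


def classify(models, seq):
    """Return the best-scoring family and the score against every family."""
    scores = {family: viterbi_score(profile, seq)
              for family, profile in models.items()}
    return max(scores, key=scores.get), scores


# Toy behavior alphabet (hypothetical): F=file write, R=registry write,
# N=network connect, P=process create.
ALPHABET = "FRNP"
models = {
    "A": train_profile(["FFRNP", "FFRNP", "FFRPP"], ALPHABET),
    "B": train_profile(["NNPRF", "NNPRF", "NPPRF"], ALPHABET),
}

# A family-A style trace with one extra registry write inserted.
best, scores = classify(models, "FFRRNP")
print(best, scores)
```

In this toy setup the variant trace still scores highest against family A, because the insert state absorbs the extra symbol at a fixed penalty instead of shifting every later position out of alignment. That tolerance to inserted or dropped operations is the property the abstract relies on when it says the model helps with polymorphic samples.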


Paper Citation


in Harvard Style

Ravi S., Balakrishnan N. and Venkatesh B. (2013). Behavior-based Malware Analysis using Profile Hidden Markov Models. In Proceedings of the 10th International Conference on Security and Cryptography - Volume 1: SECRYPT, (ICETE 2013), ISBN 978-989-8565-73-0, pages 195-206. DOI: 10.5220/0004528201950206

in Bibtex Style

@conference{secrypt13,
author={Saradha Ravi and N. Balakrishnan and Bharath Venkatesh},
title={Behavior-based Malware Analysis using Profile Hidden Markov Models},
booktitle={Proceedings of the 10th International Conference on Security and Cryptography - Volume 1: SECRYPT, (ICETE 2013)},
year={2013},
pages={195-206},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004528201950206},
isbn={978-989-8565-73-0},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 10th International Conference on Security and Cryptography - Volume 1: SECRYPT, (ICETE 2013)
TI - Behavior-based Malware Analysis using Profile Hidden Markov Models
SN - 978-989-8565-73-0
AU - Ravi S.
AU - Balakrishnan N.
AU - Venkatesh B.
PY - 2013
SP - 195
EP - 206
DO - 10.5220/0004528201950206