An automated classification system based on the strings of trojan and virus families


Author(s): Tian, Ronghua; Batten, Lynn; Islam, Rafiqul; Versteeg, Steve
Contributor(s)

[Unknown]

Date(s)

01/01/2009

Abstract

Classifying malware correctly is an important research issue for anti-malware software producers. This paper presents an effective and efficient malware classification technique based on string information, using several well-known classification algorithms. In our testing we extracted the printable strings from 1367 samples, including unpacked trojans and viruses and clean files. Information describing the printable strings contained in each sample was input to various classification algorithms, including tree-based classifiers, a nearest neighbour algorithm, statistical algorithms and AdaBoost. Using k-fold cross validation on the unpacked malware and clean files, we achieved a classification accuracy of 97%. Our results reveal that strings from library code (rather than malicious code itself) can be utilised to distinguish different malware families.
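
The string-based feature extraction described in the abstract might look roughly like the following Python sketch. It is illustrative only and is not taken from the paper: the minimum string length of 4, the function names (`printable_strings`, `build_vocabulary`, `feature_vector`) and the binary present/absent encoding are all assumptions made for this example.

```python
import re

# Assumption: runs of at least 4 printable ASCII bytes count as a "string",
# mirroring the default of the Unix `strings` tool. The paper's own
# threshold is not stated in this record.
MIN_LEN = 4


def printable_strings(data: bytes, min_len: int = MIN_LEN) -> list[str]:
    """Return runs of printable ASCII characters found in raw file bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.decode("ascii") for match in pattern.findall(data)]


def build_vocabulary(samples: dict[str, bytes]) -> list[str]:
    """Collect every distinct string seen across all samples, sorted."""
    seen: set[str] = set()
    for data in samples.values():
        seen.update(printable_strings(data))
    return sorted(seen)


def feature_vector(data: bytes, vocabulary: list[str]) -> list[int]:
    """Encode one sample as 1/0 flags: is each vocabulary string present?"""
    present = set(printable_strings(data))
    return [1 if s in present else 0 for s in vocabulary]


if __name__ == "__main__":
    # Hypothetical toy "files"; real input would be bytes read from
    # unpacked executables.
    samples = {
        "sample_a": b"\x00kernel32.dll\x00LoadLibraryA\x00",
        "sample_b": b"\x01kernel32.dll\x02GetProcAddress\x03",
    }
    vocab = build_vocabulary(samples)
    for name, data in samples.items():
        print(name, feature_vector(data, vocab))
```

Vectors of this kind could then be passed to any off-the-shelf classifier and evaluated with k-fold cross validation, as the abstract describes.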

Identifier

http://hdl.handle.net/10536/DRO/DU:30028345

Language(s)

eng

Publisher

IEEE

Relation

http://dro.deakin.edu.au/eserv/DU:30028345/MALWARE_2009_evid_conf.pdf

http://dro.deakin.edu.au/eserv/DU:30028345/MALWARE_2009_evid_refereering.pdf

http://dro.deakin.edu.au/eserv/DU:30028345/tian-rh-anautomatedclassification-2009.pdf

http://dx.doi.org/10.1109/MALWARE.2009.5403021

http://isiom.wssrl.org/

Rights

2009, IEEE

Keywords

#malware #classification #strings

Type

Conference Paper