124 results for malware classification

in QUB Research Portal - Research Directory and Institutional Repository for Queen's University Belfast


Relevance:

100.00%

Publisher:

Abstract:

Malware detection is a growing problem, particularly on the Android mobile platform, due to its increasing popularity and accessibility to numerous third-party app markets. This has also been made worse by the increasingly sophisticated detection avoidance techniques employed by emerging malware families. This calls for more effective techniques for detection and classification of Android malware. Hence, in this paper we present an n-opcode analysis-based approach that utilizes machine learning to classify and categorize Android malware. This approach enables automated feature discovery that eliminates the need for applying expert or domain knowledge to define the needed features. Our experiments on 2520 samples that were performed using up to 10-gram opcode features showed that an f-measure of 98% is achievable using this approach.
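As a rough illustration of the automated feature discovery this abstract describes, the sketch below counts n-grams over an opcode sequence and turns each sample into a count vector. It is a minimal sketch, not the authors' pipeline: the function names, the opcode strings, and the two toy samples are all hypothetical, and the feature selection and classifier training steps are not shown.

```python
from collections import Counter
from typing import Iterable, List, Sequence


def opcode_ngrams(opcodes: Sequence[str], n: int) -> Counter:
    """Count every run of n consecutive opcodes in one sample."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return Counter(
        tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)
    )


def build_vocabulary(samples: Iterable[Sequence[str]], n: int) -> List[tuple]:
    """Collect the distinct n-grams seen across all samples, sorted."""
    vocab = set()
    for opcodes in samples:
        vocab.update(opcode_ngrams(opcodes, n))
    return sorted(vocab)


def to_vector(opcodes: Sequence[str], n: int, vocab: List[tuple]) -> List[int]:
    """Map one sample to a count vector over the discovered vocabulary."""
    counts = opcode_ngrams(opcodes, n)
    return [counts.get(gram, 0) for gram in vocab]


# Hypothetical opcode sequences; real ones would come from disassembled apps.
samples = [
    ["move", "invoke-virtual", "move-result", "return"],
    ["move", "invoke-virtual", "move", "invoke-virtual"],
]
vocab = build_vocabulary(samples, n=2)
vectors = [to_vector(s, 2, vocab) for s in samples]
```

No feature has to be named in advance: the vocabulary is whatever n-grams occur in the samples, which is the sense in which the features are discovered rather than defined by an expert.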

Relevance:

40.00%

Publisher:

Abstract:

Mobile malware has been growing in scale and complexity, spurred by the unabated uptake of smartphones worldwide. Android is fast becoming the most popular mobile platform, resulting in a sharp increase in malware targeting the platform. Additionally, Android malware is evolving rapidly to evade detection by traditional signature-based scanning. Despite current detection measures in place, timely discovery of new malware is still a critical issue. This calls for novel approaches to mitigate the growing threat of zero-day Android malware. Hence, the authors develop and analyse proactive machine-learning approaches based on Bayesian classification aimed at uncovering unknown Android malware via static analysis. The study, which is based on a large malware sample set covering the majority of the existing families, demonstrates detection capabilities with high accuracy. Empirical results and comparative analysis are presented, offering useful insight towards development of effective static-analytic Bayesian classification-based solutions for detecting unknown Android malware.

Relevance:

40.00%

Publisher:

Abstract:

Mobile malware has been growing in scale and complexity as smartphone usage continues to rise. Android has surpassed other mobile platforms as the most popular whilst also witnessing a dramatic increase in malware targeting the platform. A worrying trend that is emerging is the increasing sophistication of Android malware to evade detection by traditional signature-based scanners. As such, Android app marketplaces remain at risk of hosting malicious apps that could evade detection before being downloaded by unsuspecting users. Hence, in this paper we present an effective approach to alleviate this problem based on Bayesian classification models obtained from static code analysis. The models are built from a collection of code and app characteristics that provide indicators of potential malicious activities. The models are evaluated with real malware samples in the wild and results of experiments are presented to demonstrate the effectiveness of the proposed approach.
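One simple way to picture a Bayesian classification model built from binary app characteristics is a Bernoulli naive Bayes classifier. The abstract does not say which Bayesian model the authors use, so the choice of naive Bayes, the Laplace smoothing, the three feature names, and the toy training rows below are all assumptions made for illustration.

```python
import math
from typing import Dict, Sequence


def train_bernoulli_nb(X: Sequence[Sequence[int]], y: Sequence[int]) -> Dict:
    """Estimate class priors and per-feature probabilities (Laplace-smoothed)."""
    n_features = len(X[0])
    model = {"priors": {}, "probs": {}}
    for label in sorted(set(y)):
        rows = [x for x, lab in zip(X, y) if lab == label]
        model["priors"][label] = len(rows) / len(X)
        model["probs"][label] = [
            (sum(r[j] for r in rows) + 1) / (len(rows) + 2)
            for j in range(n_features)
        ]
    return model


def posterior(model: Dict, x: Sequence[int]) -> Dict[int, float]:
    """Return P(class | x) under the naive independence assumption."""
    log_scores = {}
    for label, prior in model["priors"].items():
        score = math.log(prior)
        for present, p in zip(x, model["probs"][label]):
            score += math.log(p if present else 1.0 - p)
        log_scores[label] = score
    top = max(log_scores.values())  # subtract the max for numerical stability
    exp = {k: math.exp(v - top) for k, v in log_scores.items()}
    total = sum(exp.values())
    return {k: v / total for k, v in exp.items()}


# Hypothetical binary features: [sends_sms, reads_contacts, uses_reflection].
# Label 1 = malware, 0 = benign. The rows are made up for illustration.
X = [[1, 1, 0], [1, 0, 1], [0, 0, 0], [0, 1, 0]]
y = [1, 1, 0, 0]
model = train_bernoulli_nb(X, y)
post = posterior(model, [1, 0, 1])
```

Working in log space avoids underflow when the number of features grows, which matters once hundreds of code and app characteristics are combined.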

Relevance:

30.00%

Publisher:

Abstract:

Mobile malware has continued to grow at an alarming rate despite on-going mitigation efforts. This has been much more prevalent on Android due to being an open platform that is rapidly overtaking other competing platforms in the mobile smart devices market. Recently, a new generation of Android malware families has emerged with advanced evasion capabilities which make them much more difficult to detect using conventional methods. This paper proposes and investigates a parallel machine-learning-based classification approach for early detection of Android malware. Using real malware samples and benign applications, a composite classification model is developed from a parallel combination of heterogeneous classifiers. The empirical evaluation of the model under different combination schemes demonstrates its efficacy and potential to improve detection accuracy. More importantly, by utilizing several classifiers with diverse characteristics, their strengths can be harnessed not only for enhanced Android malware detection but also quicker white box analysis by means of the more interpretable constituent classifiers.
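The idea of combining heterogeneous classifiers in parallel can be sketched as follows. The abstract does not name its combination schemes, so the three shown here (majority vote, average of probabilities, maximum probability) are common choices assumed for illustration, and the three base classifiers are hypothetical stand-ins rather than trained models.

```python
from typing import Callable, Dict, List, Sequence

# Each base classifier maps a feature vector to P(malware) in [0, 1].
Classifier = Callable[[Sequence[float]], float]


def majority_vote(probs: Sequence[float], threshold: float = 0.5) -> bool:
    """Malware if more than half of the classifiers vote malware."""
    votes = sum(p >= threshold for p in probs)
    return votes * 2 > len(probs)


def average_probability(probs: Sequence[float], threshold: float = 0.5) -> bool:
    """Malware if the mean of the classifier probabilities reaches the threshold."""
    return sum(probs) / len(probs) >= threshold


def max_probability(probs: Sequence[float]) -> bool:
    """Malware if the most confident malware score beats the most confident benign score."""
    return max(probs) > 1.0 - min(probs)


def combine(classifiers: List[Classifier], x: Sequence[float]) -> Dict[str, bool]:
    """Run every classifier on x independently, then apply each scheme."""
    probs = [clf(x) for clf in classifiers]
    return {
        "majority_vote": majority_vote(probs),
        "average_probability": average_probability(probs),
        "max_probability": max_probability(probs),
    }


# Hypothetical stand-ins for trained, heterogeneous base classifiers.
def rule_based(x: Sequence[float]) -> float:
    return 0.9 if x[0] > 0 else 0.1


def linear_score(x: Sequence[float]) -> float:
    return min(1.0, max(0.0, 0.25 * x[0] + 0.5 * x[1]))


def prior_only(x: Sequence[float]) -> float:
    return 0.3


ensemble = [rule_based, linear_score, prior_only]
suspicious = combine(ensemble, [1, 1])
clean = combine(ensemble, [0, 0])
```

Because each base classifier is evaluated independently, an interpretable member such as the rule-based one can still be inspected on its own when the composite model flags a sample.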

Relevance:

30.00%

Publisher:

Abstract:

N-gram analysis is an approach that investigates the structure of a program using bytes, characters or text strings. This research uses dynamic analysis to investigate malware detection using a classification approach based on N-gram analysis. A key issue with dynamic analysis is the length of time a program has to be run to ensure a correct classification. The motivation for this research is to find the optimum subset of operational codes (opcodes) that make the best indicators of malware and to determine how long a program has to be monitored to ensure an accurate support vector machine (SVM) classification of benign and malicious software. The experiments within this study represent programs as opcode density histograms gained through dynamic analysis for different program run periods. An SVM is used as the program classifier to determine the ability of different program run lengths to correctly determine the presence of malicious software. The findings show that malware can be detected with different program run lengths using a small number of opcodes.
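An opcode density histogram for several run lengths might be computed along these lines. This is a sketch under stated assumptions: run length is approximated here by the number of executed instructions rather than wall-clock time, the trace and the selected opcodes are invented, and the SVM that would consume these vectors is not shown.

```python
from collections import Counter
from typing import Dict, List, Sequence


def opcode_density(trace: Sequence[str], selected: Sequence[str]) -> List[float]:
    """Relative frequency of each selected opcode within the trace."""
    total = len(trace)
    if total == 0:
        return [0.0 for _ in selected]
    counts = Counter(trace)
    return [counts[op] / total for op in selected]


def densities_by_run_length(
    trace: Sequence[str], selected: Sequence[str], run_lengths: Sequence[int]
) -> Dict[int, List[float]]:
    """Build one density vector from the first k executed opcodes, for each k."""
    return {k: opcode_density(trace[:k], selected) for k in run_lengths}


# Hypothetical execution trace and opcode subset.
trace = ["mov", "push", "call", "mov", "xor", "mov", "pop", "ret"]
selected = ["mov", "xor"]
features = densities_by_run_length(trace, selected, run_lengths=[4, 8])
```

Comparing classifier accuracy across the vectors for each run length is one way to ask how long a program needs to be monitored before the histogram becomes a reliable indicator.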

Relevance:

30.00%

Publisher:

Abstract:

N-gram analysis is an approach that investigates the structure of a program using bytes, characters or text strings. This research uses dynamic analysis to investigate malware detection using a classification approach based on N-gram analysis. The motivation for this research is to find a subset of N-gram features that makes a robust indicator of malware. The experiments within this paper represent programs as N-gram density histograms, gained through dynamic analysis. A Support Vector Machine (SVM) is used as the program classifier to determine the ability of N-grams to correctly determine the presence of malicious software. The preliminary findings show that N-gram sizes of N=3 and N=4 present the best avenues for further analysis.

Relevance:

30.00%

Publisher:

Abstract:

The research presented investigates the optimal set of operational codes (opcodes) that create a robust indicator of malicious software (malware) and also determines a program’s execution duration for accurate classification of benign and malicious software. The features extracted from the dataset are opcode density histograms, extracted during the program execution. The classifier used is a support vector machine and is configured to select those features that produce the optimal classification of malware over different program run lengths. The findings demonstrate that malware can be detected using dynamic analysis with relatively few opcodes.

Relevance:

20.00%

Publisher:

Abstract:

Previous studies have revealed considerable interobserver and intraobserver variation in the histological classification of preinvasive cervical squamous lesions. The aim of the present study was to develop a decision support system (DSS) for the histological interpretation of these lesions. Knowledge and uncertainty were represented in the form of a Bayesian belief network that permitted the storage of diagnostic knowledge and, for a given case, the collection of evidence in a cumulative manner that provided a final probability for the possible diagnostic outcomes. The network comprised 8 diagnostic histological features (evidence nodes) that were each independently linked to the diagnosis (decision node) by a conditional probability matrix. Diagnostic outcomes comprised normal; koilocytosis; and cervical intraepithelial neoplasia (CIN) I, CIN II, and CIN III. For each evidence feature, a set of images was recorded that represented the full spectrum of change for that feature. The system was designed to be interactive in that the histopathologist was prompted to enter evidence into the network via a specifically designed graphical user interface (i-Path Diagnostics, Belfast, Northern Ireland). Membership functions were used to derive the relative likelihoods for the alternative feature outcomes, the likelihood vector was entered into the network, and the updated diagnostic belief was computed for the diagnostic outcomes and displayed. A cumulative probability graph was generated throughout the diagnostic process and presented on screen. The network was tested on 50 cervical colposcopic biopsy specimens, comprising 10 cases each of normal, koilocytosis, CIN I, CIN II, and CIN III. These had been preselected by a consultant gynecological pathologist. Using conventional morphological assessment, the cases were classified on 2 separate occasions by 2 consultant and 2 junior pathologists. The cases were also then classified using the DSS on 2 occasions by the 4 pathologists and by 2 medical students with no experience in cervical histology. Interobserver and intraobserver agreement using morphology and using the DSS was calculated with kappa statistics. Intraobserver reproducibility using conventional unaided diagnosis was reasonably good (kappa range, 0.688 to 0.861), but interobserver agreement was poor (kappa range, 0.347 to 0.747). Using the DSS improved overall reproducibility between individuals. Using the DSS, however, did not enhance the diagnostic performance of junior pathologists when comparing their DSS-based diagnosis against an experienced consultant. However, the generation of a cumulative probability graph also allowed a comparison of individual performance, how individual features were assessed in the same case, and how this contributed to diagnostic disagreement between individuals. Diagnostic features such as nuclear pleomorphism were shown to be particularly problematic and poorly reproducible. DSSs such as this therefore not only have a role to play in enhancing decision making but also in the study of diagnostic protocol, education, self-assessment, and quality control. (C) 2003 Elsevier Inc. All rights reserved.
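The cumulative evidence collection described in this abstract can be pictured as a repeated Bayesian update: each feature's likelihood vector is weighted by that feature's conditional probability matrix and multiplied into the current diagnostic belief. The sketch below assumes a uniform prior and uses two invented features with made-up probabilities; none of the numbers come from the study, and the actual network has 8 evidence nodes.

```python
from typing import List, Sequence, Tuple

DIAGNOSES = ["normal", "koilocytosis", "CIN I", "CIN II", "CIN III"]


def update_belief(
    belief: Sequence[float],
    cpt: Sequence[Sequence[float]],
    likelihood: Sequence[float],
) -> List[float]:
    """Multiply the current belief by the evidence likelihood and renormalise.

    cpt[d][s] is P(feature state s | diagnosis d); likelihood[s] is the
    relative likelihood the observer assigns to feature state s.
    """
    weighted = [
        b * sum(lam * p for lam, p in zip(likelihood, row))
        for b, row in zip(belief, cpt)
    ]
    total = sum(weighted)
    if total == 0:
        raise ValueError("evidence is incompatible with every diagnosis")
    return [w / total for w in weighted]


def cumulative_beliefs(
    prior: Sequence[float],
    evidence: Sequence[Tuple[Sequence[Sequence[float]], Sequence[float]]],
) -> List[List[float]]:
    """Return the belief after each piece of evidence, in the order entered."""
    history = [list(prior)]
    for cpt, likelihood in evidence:
        history.append(update_belief(history[-1], cpt, likelihood))
    return history


# Hypothetical two-state features (mild, marked); all numbers are made up.
cpt_pleomorphism = [[0.9, 0.1], [0.8, 0.2], [0.6, 0.4], [0.3, 0.7], [0.1, 0.9]]
cpt_mitoses = [[0.95, 0.05], [0.9, 0.1], [0.7, 0.3], [0.4, 0.6], [0.2, 0.8]]
prior = [0.2] * len(DIAGNOSES)
history = cumulative_beliefs(
    prior,
    [(cpt_pleomorphism, [0.0, 1.0]), (cpt_mitoses, [0.25, 0.75])],
)
final = dict(zip(DIAGNOSES, history[-1]))
```

Plotting each row of `history` against the order in which evidence was entered gives the kind of cumulative probability graph the abstract mentions, and comparing two observers' histories for the same case shows which feature pushed their beliefs apart.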