999 resultados para malware analysis


Relevância:

30.00% 30.00%

Publicador:

Resumo:

Static detection of polymorphic malware variants plays an important role to improve system security. Control flow has shown to be an effective characteristic that represents polymorphic malware instances. In our research, we propose a similarity search of malware using novel distance metrics of malware signatures. We describe a malware signature by the set of control flow graphs the malware contains. We propose two approaches and use the first to perform pre-filtering. Firstly, we use a distance metric based on the distance between feature vectors. The feature vector is a decomposition of the set of graphs into either fixed size k-sub graphs, or q-gram strings of the high-level source after decompilation. We also propose a more effective but less computationally efficient distance metric based on the minimum matching distance. The minimum matching distance uses the string edit distances between programs' decompiled flow graphs, and the linear sum assignment problem to construct a minimum sum weight matching between two sets of graphs. We implement the distance metrics in a complete malware variant detection system. The evaluation shows that our approach is highly effective in terms of a limited false positive rate and our system detects more malware variants when compared to the detection rates of other algorithms.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In statistical classification work, one method of speeding up the process is to use only a small percentage of the total parameter set available. In this paper, we apply this technique both to the classification of malware and the identification of malware from a set combined with cleanware. In order to demonstrate the usefulness of our method, we use the same sets of malware and cleanware as in an earlier paper. Using the statistical technique Information Gain (IG), we reduce the set of features used in the experiment from 7,605 to just over 1,000. The best accuracy obtained in the former paper using 7,605 features is 97.3% for malware versus cleanware detection and 97.4% for malware family classification; on the reduced feature set, we obtain a (best) accuracy of 94.6% on the malware versus cleanware test and 94.5% on the malware classification test. An interesting feature of the new tests presented here is the reduction in false negative rates by a factor of about 1/3 when compared with the results of the earlier paper. In addition, the speed with which our tests run is reduced by a factor of approximately 3/5 from the times posted for the original paper. The small loss in accuracy and improved false negative rate along with significant improvement in speed indicate that feature reduction should be further pursued as a tool to prevent algorithms from becoming intractable due to too much data.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Zero-day or unknown malware are created using code obfuscation techniques that can modify the parent code to produce offspring copies which have the same functionality but with different signatures. Current techniques reported in literature lack the capability of detecting zero-day malware with the required accuracy and efficiency. In this paper, we have proposed and evaluated a novel method of employing several data mining techniques to detect and classify zero-day malware with high levels of accuracy and efficiency based on the frequency of Windows API calls. This paper describes the methodology employed for the collection of large data sets to train the classifiers, and analyses the performance results of the various data mining algorithms adopted for the study using a fully automated tool developed in this research to conduct the various experimental investigations and evaluation. Through the performance results of these algorithms from our experimental analysis, we are able to evaluate and discuss the advantages of one data mining algorithm over the other for accurately detecting zero-day malware successfully. The data mining framework employed in this research learns through analysing the behavior of existing malicious and benign codes in large datasets. We have employed robust classifiers, namely Naïve Bayes (NB) Algorithm, k−Nearest Neighbor (kNN) Algorithm, Sequential Minimal Optimization (SMO) Algorithm with 4 differents kernels (SMO - Normalized PolyKernel, SMO – PolyKernel, SMO – Puk, and SMO- Radial Basis Function (RBF)), Backpropagation Neural Networks Algorithm, and J48 decision tree and have evaluated their performance. Overall, the automated data mining system implemented for this study has achieved high true positive (TP) rate of more than 98.5%, and low false positive (FP) rate of less than 0.025, which has not been achieved in literature so far. This is much higher than the required commercial acceptance level indicating that our novel technique is a major leap forward in detecting zero-day malware. This paper also offers future directions for researchers in exploring different aspects of obfuscations that are affecting the IT world today.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Smartphones are mobile phones that offer processing power and features like personal computers (PC) with the aim of improving user productivity as they allow users to access and manipulate data over networks and Internet, through various mobile applications. However, with such anywhere and anytime functionality, new security threats and risks of sensitive and personal data are envisaged to evolve. With the emergence of open mobile platforms that enable mobile users to install applications on their own, it opens up new avenues for propagating malware among various mobile users very quickly. In particular, they become crossover targets of PC malware through the synchronization function between smartphones and computers. Literature lacks detailed analysis of smartphones malware and synchronization vulnerabilities. This paper addresses these gaps in literature, by first identifying the similarities and differences between smartphone malware and PC malware, and then by investigating how hackers exploit synchronization vulnerabilities to launch their attacks.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Since its establishment, the Android applications market has been infected by a proliferation of malicious applications. Recent studies show that rogue developers are injecting malware into legitimate market applications which are then installed on open source sites for consumer uptake. Often, applications are infected several times. In this paper, we investigate the behavior of malicious Android applications, we present a simple and effective way to safely execute and analyze them. As part of this analysis, we use the Android application sandbox Droidbox to generate behavioral graphs for each sample and these provide the basis of the development of patterns to aid in identifying it. As a result, we are able to determine if family names have been correctly assigned by current anti-virus vendors. Our results indicate that the traditional anti-virus mechanisms are not able to correctly identify malicious Android applications.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Wire is a intermediate language to enable static program analysis on low level objects such as native executables. It has practical benefit in analysing the structure and semantics of malware, or for identifying software defects in closed source software. In this paper we describe how an executable program is disassembled and translated to the Wire intermediate language. We define the formal syntax and operational semantics of Wire and discuss our justifications for its language features. We use Wire in our previous work Malwise, a malware variant detection system. We also examine applications for when a formally defined intermediate language is given. Our results include showing the semantic equivalence between obfuscated and non obfuscated code samples. These examples stem from the obfuscations commonly used by malware.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Detecting malicious software or malware is one of the major concerns in information security governance as malware authors pose a major challenge to digital forensics by using a variety of highly sophisticated stealth techniques to hide malicious code in computing systems, including smartphones. The current detection techniques are futile, as forensic analysis of infected devices is unable to identify all the hidden malware, thereby resulting in zero day attacks. This chapter takes a key step forward to address this issue and lays foundation for deeper investigations in digital forensics. The goal of this chapter is, firstly, to unearth the recent obfuscation strategies employed to hide malware. Secondly, this chapter proposes innovative techniques that are implemented as a fully-automated tool, and experimentally tested to exhaustively detect hidden malware that leverage on system vulnerabilities. Based on these research investigations, the chapter also arrives at an information security governance plan that would aid in addressing the current and future cybercrime situations.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Static detection of malware variants plays an important role in system security and control flow has been shown as an effective characteristic that represents polymorphic malware. In our research, we propose a similarity search of malware to detect these variants using novel distance metrics. We describe a malware signature by the set of control flowgraphs the malware contains. We use a distance metric based on the distance between feature vectors of string-based signatures. The feature vector is a decomposition of the set of graphs into either fixed size k-subgraphs, or q-gram strings of the high-level source after decompilation. We use this distance metric to perform pre-filtering. We also propose a more effective but less computationally efficient distance metric based on the minimum matching distance. The minimum matching distance uses the string edit distances between programs' decompiled flowgraphs, and the linear sum assignment problem to construct a minimum sum weight matching between two sets of graphs. We implement the distance metrics in a complete malware variant detection system. The evaluation shows that our approach is highly effective in terms of a limited false positive rate and our system detects more malware variants when compared to the detection rates of other algorithms. © 2013 IEEE.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Malware is pervasive in networks, and poses a critical threat to network security. However, we have very limited understanding of malware behavior in networks to date. In this paper, we investigate how malware propagates in networks from a global perspective. We formulate the problem, and establish a rigorous two layer epidemic model for malware propagation from network to network. Based on the proposed model, our analysis indicates that the distribution of a given malware follows exponential distribution, power law distribution with a short exponential tail, and power law distribution at its early, late and final stages, respectively. Extensive experiments have been performed through two real-world global scale malware data sets, and the results confirm our theoretical findings.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Since Sharir and Pnueli, algorithms for context-sensitivity have been defined in terms of 'valid' paths in an interprocedural flow graph. The definition of valid paths requires atomic call and ret statements, and encapsulated procedures. Thus, the resulting algorithms are not directly applicable when behavior similar to call and ret instructions may be realized using non-atomic statements, or when procedures do not have rigid boundaries, such as with programs in low level languages like assembly or RTL. We present a framework for context-sensitive analysis that requires neither atomic call and ret instructions, nor encapsulated procedures. The framework presented decouples the transfer of control semantics and the context manipulation semantics of statements. A new definition of context-sensitivity, called stack contexts, is developed. A stack context, which is defined using trace semantics, is more general than Sharir and Pnueli's interprocedural path based calling-context. An abstract interpretation based framework is developed to reason about stack-contexts and to derive analogues of calling-context based algorithms using stack-context. The framework presented is suitable for deriving algorithms for analyzing binary programs, such as malware, that employ obfuscations with the deliberate intent of defeating automated analysis. The framework is used to create a context-sensitive version of Venable et al.'s algorithm for analyzing x86 binaries without requiring that a binary conforms to a standard compilation model for maintaining procedures, calls, and returns. Experimental results show that a context-sensitive analysis using stack-context performs just as well for programs where the use of Sharir and Pnueli's calling-context produces correct approximations. However, if those programs are transformed to use call obfuscations, a contextsensitive analysis using stack-context still provides the same, correct results and without any additional overhead. © Springer Science+Business Media, LLC 2011.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Malicious programs (malware) can cause severe damage on computer systems and data. The mechanism that the human immune system uses to detect and protect from organisms that threaten the human body is efficient and can be adapted to detect malware attacks. In this paper we propose a system to perform malware distributed collection, analysis and detection, this last inspired by the human immune system. After collecting malware samples from Internet, they are dynamically analyzed so as to provide execution traces at the operating system level and network flows that are used to create a behavioral model and to generate a detection signature. Those signatures serve as input to a malware detector, acting as the antibodies in the antigen detection process. This allows us to understand the malware attack and aids in the infection removal procedures. © 2012 Springer-Verlag.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Nel mondo della sicurezza informatica, le tecnologie si evolvono per far fronte alle minacce. Non è possibile prescindere dalla prevenzione, ma occorre accettare il fatto che nessuna barriera risulterà impenetrabile e che la rilevazione, unitamente ad una pronta risposta, rappresenta una linea estremamente critica di difesa, ma l’unica veramente attuabile per poter guadagnare più tempo possibile o per limitare i danni. Introdurremo quindi un nuovo modello operativo composto da procedure capaci di affrontare le nuove sfide che il malware costantemente offre e allo stesso tempo di sollevare i comparti IT da attività onerose e sempre più complesse, ottimizzandone il processo di comunicazione e di risposta.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

El Malware es una grave amenaza para la seguridad de los sistemas. Con el uso generalizado de la World Wide Web, ha habido un enorme aumento en los ataques de virus, haciendo que la seguridad informática sea esencial para todas las computadoras y se expandan las áreas de investigación sobre los nuevos incidentes que se generan, siendo una de éstas la clasificación del malware. Los “desarrolladores de malware” utilizan nuevas técnicas para generar malware polimórfico reutilizando los malware existentes, por lo cual es necesario agruparlos en familias para estudiar sus características y poder detectar nuevas variantes de los mismos. Este trabajo, además de presentar un detallado estado de la cuestión de la clasificación del malware de ficheros ejecutables PE, presenta un enfoque en el que se mejora el índice de la clasificación de la base de datos de Malware MALICIA utilizando las características estáticas de ficheros ejecutables Imphash y Pehash, utilizando dichas características se realiza un clustering con el algoritmo clustering agresivo el cual se cambia con la clasificación actual mediante el algoritmo de majority voting y la característica icon_label, obteniendo un Precision de 99,15% y un Recall de 99,32% mejorando la clasificación de MALICIA con un F-measure de 99,23%.---ABSTRACT---Malware is a serious threat to the security of systems. With the widespread use of the World Wide Web, there has been a huge increase in virus attacks, making the computer security essential for all computers. Near areas of research have append in this area including classifying malware into families, Malware developers use polymorphism to generate new variants of existing malware. Thus it is crucial to group variants of the same family, to study their characteristics and to detect new variants. This work, in addition to presenting a detailed analysis of the problem of classifying malware PE executable files, presents an approach in which the classification in the Malware database MALICIA is improved by using static characteristics of executable files, namely Imphash and Pehash. Both features are evaluated through clustering real malware with family labels with aggressive clustering algorithm and combining this with the current classification by Majority voting algorithm, obtaining a Precision of 99.15% and a Recall of 99.32%, improving the classification of MALICIA with an F-measure of 99,23%.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Esta tesis se centra en el análisis de dos aspectos complementarios de la ciberdelincuencia (es decir, el crimen perpetrado a través de la red para ganar dinero). Estos dos aspectos son las máquinas infectadas utilizadas para obtener beneficios económicos de la delincuencia a través de diferentes acciones (como por ejemplo, clickfraud, DDoS, correo no deseado) y la infraestructura de servidores utilizados para gestionar estas máquinas (por ejemplo, C & C, servidores explotadores, servidores de monetización, redirectores). En la primera parte se investiga la exposición a las amenazas de los ordenadores victimas. Para realizar este análisis hemos utilizado los metadatos contenidos en WINE-BR conjunto de datos de Symantec. Este conjunto de datos contiene metadatos de instalación de ficheros ejecutables (por ejemplo, hash del fichero, su editor, fecha de instalación, nombre del fichero, la versión del fichero) proveniente de 8,4 millones de usuarios de Windows. Hemos asociado estos metadatos con las vulnerabilidades en el National Vulnerability Database (NVD) y en el Opens Sourced Vulnerability Database (OSVDB) con el fin de realizar un seguimiento de la decadencia de la vulnerabilidad en el tiempo y observar la rapidez de los usuarios a remiendar sus sistemas y, por tanto, su exposición a posibles ataques. Hemos identificado 3 factores que pueden influir en la actividad de parches de ordenadores victimas: código compartido, el tipo de usuario, exploits. Presentamos 2 nuevos ataques contra el código compartido y un análisis de cómo el conocimiento usuarios y la disponibilidad de exploit influyen en la actividad de aplicación de parches. Para las 80 vulnerabilidades en nuestra base de datos que afectan código compartido entre dos aplicaciones, el tiempo entre el parche libera en las diferentes aplicaciones es hasta 118 das (con una mediana de 11 das) En la segunda parte se proponen nuevas técnicas de sondeo activos para detectar y analizar las infraestructuras de servidores maliciosos. Aprovechamos técnicas de sondaje activo, para detectar servidores maliciosos en el internet. Empezamos con el análisis y la detección de operaciones de servidores explotadores. Como una operación identificamos los servidores que son controlados por las mismas personas y, posiblemente, participan en la misma campaña de infección. Hemos analizado un total de 500 servidores explotadores durante un período de 1 año, donde 2/3 de las operaciones tenían un único servidor y 1/2 por varios servidores. Hemos desarrollado la técnica para detectar servidores explotadores a diferentes tipologías de servidores, (por ejemplo, C & C, servidores de monetización, redirectores) y hemos logrado escala de Internet de sondeo para las distintas categorías de servidores maliciosos. Estas nuevas técnicas se han incorporado en una nueva herramienta llamada CyberProbe. Para detectar estos servidores hemos desarrollado una novedosa técnica llamada Adversarial Fingerprint Generation, que es una metodología para generar un modelo único de solicitud-respuesta para identificar la familia de servidores (es decir, el tipo y la operación que el servidor apartenece). A partir de una fichero de malware y un servidor activo de una determinada familia, CyberProbe puede generar un fingerprint válido para detectar todos los servidores vivos de esa familia. Hemos realizado 11 exploraciones en todo el Internet detectando 151 servidores maliciosos, de estos 151 servidores 75% son desconocidos a bases de datos publicas de servidores maliciosos. Otra cuestión que se plantea mientras se hace la detección de servidores maliciosos es que algunos de estos servidores podrán estar ocultos detrás de un proxy inverso silente. Para identificar la prevalencia de esta configuración de red y mejorar el capacidades de CyberProbe hemos desarrollado RevProbe una nueva herramienta a través del aprovechamiento de leakages en la configuración de la Web proxies inversa puede detectar proxies inversos. RevProbe identifica que el 16% de direcciones IP maliciosas activas analizadas corresponden a proxies inversos, que el 92% de ellos son silenciosos en comparación con 55% para los proxies inversos benignos, y que son utilizado principalmente para equilibrio de carga a través de múltiples servidores. ABSTRACT In this dissertation we investigate two fundamental aspects of cybercrime: the infection of machines used to monetize the crime and the malicious server infrastructures that are used to manage the infected machines. In the first part of this dissertation, we analyze how fast software vendors apply patches to secure client applications, identifying shared code as an important factor in patch deployment. Shared code is code present in multiple programs. When a vulnerability affects shared code the usual linear vulnerability life cycle is not anymore effective to describe how the patch deployment takes place. In this work we show which are the consequences of shared code vulnerabilities and we demonstrate two novel attacks that can be used to exploit this condition. In the second part of this dissertation we analyze malicious server infrastructures, our contributions are: a technique to cluster exploit server operations, a tool named CyberProbe to perform large scale detection of different malicious servers categories, and RevProbe a tool that detects silent reverse proxies. We start by identifying exploit server operations, that are, exploit servers managed by the same people. We investigate a total of 500 exploit servers over a period of more 13 months. We have collected malware from these servers and all the metadata related to the communication with the servers. Thanks to this metadata we have extracted different features to group together servers managed by the same entity (i.e., exploit server operation), we have discovered that 2/3 of the operations have a single server while 1/3 have multiple servers. Next, we present CyberProbe a tool that detects different malicious server types through a novel technique called adversarial fingerprint generation (AFG). The idea behind CyberProbe’s AFG is to run some piece of malware and observe its network communication towards malicious servers. Then it replays this communication to the malicious server and outputs a fingerprint (i.e. a port selection function, a probe generation function and a signature generation function). Once the fingerprint is generated CyberProbe scans the Internet with the fingerprint and finds all the servers of a given family. We have performed a total of 11 Internet wide scans finding 151 new servers starting with 15 seed servers. This gives to CyberProbe a 10 times amplification factor. Moreover we have compared CyberProbe with existing blacklists on the internet finding that only 40% of the server detected by CyberProbe were listed. To enhance the capabilities of CyberProbe we have developed RevProbe, a reverse proxy detection tool that can be integrated with CyberProbe to allow precise detection of silent reverse proxies used to hide malicious servers. RevProbe leverages leakage based detection techniques to detect if a malicious server is hidden behind a silent reverse proxy and the infrastructure of servers behind it. At the core of RevProbe is the analysis of differences in the traffic by interacting with a remote server.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Kernel-level malware is one of the most dangerous threats to the security of users on the Internet, so there is an urgent need for its detection. The most popular detection approach is misuse-based detection. However, it cannot catch up with today's advanced malware that increasingly apply polymorphism and obfuscation. In this thesis, we present our integrity-based detection for kernel-level malware, which does not rely on the specific features of malware. ^ We have developed an integrity analysis system that can derive and monitor integrity properties for commodity operating systems kernels. In our system, we focus on two classes of integrity properties: data invariants and integrity of Kernel Queue (KQ) requests. ^ We adopt static analysis for data invariant detection and overcome several technical challenges: field-sensitivity, array-sensitivity, and pointer analysis. We identify data invariants that are critical to system runtime integrity from Linux kernel 2.4.32 and Windows Research Kernel (WRK) with very low false positive rate and very low false negative rate. We then develop an Invariant Monitor to guard these data invariants against real-world malware. In our experiment, we are able to use Invariant Monitor to detect ten real-world Linux rootkits and nine real-world Windows malware and one synthetic Windows malware. ^ We leverage static and dynamic analysis of kernel and device drivers to learn the legitimate KQ requests. Based on the learned KQ requests, we build KQguard to protect KQs. At runtime, KQguard rejects all the unknown KQ requests that cannot be validated. We apply KQguard on WRK and Linux kernel, and extensive experimental evaluation shows that KQguard is efficient (up to 5.6% overhead) and effective (capable of achieving zero false positives against representative benign workloads after appropriate training and very low false negatives against 125 real-world malware and nine synthetic attacks). ^ In our system, Invariant Monitor and KQguard cooperate together to protect data invariants and KQs in the target kernel. By monitoring these integrity properties, we can detect malware by its violation of these integrity properties during execution.^