901 results for Network anomaly detection
Abstract:
Subspaces and manifolds are two powerful models for high-dimensional signals. Subspaces model linear correlation and are a good fit to signals generated by physical systems, such as frontal images of human faces and multiple sources impinging on an antenna array. Manifolds model sources that are not linearly correlated, but where signals are determined by a small number of parameters. Examples are images of human faces under different poses or expressions, and handwritten digits with varying styles. However, there will always be some degree of model mismatch between the subspace or manifold model and the true statistics of the source. This dissertation exploits subspace and manifold models as prior information in various signal processing and machine learning tasks.
A near-low-rank Gaussian mixture model measures proximity to a union of linear or affine subspaces. This simple model can effectively capture the signal distribution when each class lies near a subspace. This dissertation studies how the pairwise geometry between these subspaces affects classification performance. When the model mismatch is vanishingly small, the probability of misclassification is determined by the product of the sines of the principal angles between subspaces. When the model mismatch is more significant, the probability of misclassification is determined by the sum of the squares of the sines of the principal angles. The reliability of classification is derived in terms of the distribution of signal energy across principal vectors. Larger principal angles lead to smaller classification error, motivating a linear transform that optimizes the principal angles. This linear transformation, termed TRAIT, also preserves specific features in each class, making it complementary to the recently developed Low Rank Transform (LRT). Moreover, when the model mismatch is more significant, TRAIT outperforms LRT.
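As a worked illustration of the geometry above, the following minimal Python sketch (not the dissertation's code; the random subspaces and dimensions are illustrative assumptions) computes principal angles via the standard SVD of the product of orthonormal bases, along with the two quantities the abstract identifies: the product of sines (small-mismatch regime) and the sum of squared sines (larger-mismatch regime).

```python
# A minimal sketch: principal angles between two subspaces and the two
# quantities that govern misclassification in the two mismatch regimes.
import numpy as np

def principal_angles(A, B):
    """Principal angles between the column spans of A and B."""
    Qa, _ = np.linalg.qr(A)          # orthonormal basis for span(A)
    Qb, _ = np.linalg.qr(B)          # orthonormal basis for span(B)
    s = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    s = np.clip(s, -1.0, 1.0)        # guard against round-off
    return np.arccos(s)              # angles in ascending order

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 3))     # two random 3-dim subspaces in R^50
B = rng.standard_normal((50, 3))
theta = principal_angles(A, B)

# Small-mismatch regime: error governed by the product of the sines.
prod_sines = np.prod(np.sin(theta))
# Larger-mismatch regime: error governed by the sum of squared sines.
sum_sq_sines = np.sum(np.sin(theta) ** 2)
print(prod_sines, sum_sq_sines)
```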
The manifold model enforces a constraint on the freedom of data variation. Learning features that are robust to data variation is very important, especially when the training set is small. A learning machine with a large number of parameters, e.g., a deep neural network, can describe a very complicated data distribution well. However, it is also more likely to be sensitive to small perturbations of the data, and to suffer degraded performance when generalizing to unseen (test) data.
From the perspective of the complexity of function classes, such a learning machine has a huge capacity (complexity), which tends to overfit. The manifold model provides a way of regularizing the learning machine so as to reduce the generalization error and thereby mitigate overfitting. Two overfitting-prevention approaches are proposed: one from the perspective of data variation, the other from capacity/complexity control. In the first approach, the learning machine is encouraged to make decisions that vary smoothly for data points in local neighborhoods on the manifold. In the second approach, a graph adjacency matrix is derived for the manifold, and the learned features are encouraged to align with the principal components of this adjacency matrix. Experimental results on benchmark datasets demonstrate a clear advantage of the proposed approaches when the training set is small.
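The first approach above, decisions varying smoothly over local manifold neighborhoods, is commonly expressed as a graph-Laplacian penalty. The sketch below is a generic illustration under assumed details (kNN graph, binary weights), not the thesis implementation: it builds an adjacency matrix W and evaluates the smoothness term f^T L f, which is small exactly when the prediction f agrees across graph edges.

```python
# A minimal graph-Laplacian smoothness penalty over a kNN manifold graph.
import numpy as np

def knn_adjacency(X, k=5):
    """Symmetric k-nearest-neighbor adjacency matrix for rows of X."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    W = np.zeros_like(d)
    for i in range(len(X)):
        W[i, np.argsort(d[i])[:k]] = 1.0
    return np.maximum(W, W.T)        # symmetrize

def smoothness_penalty(f, W):
    """f^T L f with L = D - W; small when f varies smoothly over edges."""
    L = np.diag(W.sum(axis=1)) - W
    return f @ L @ f

X = np.random.default_rng(1).standard_normal((100, 10))
W = knn_adjacency(X, k=5)
f = X @ np.ones(10)                  # some scalar prediction per point
print(smoothness_penalty(f, W))      # add this term to the training loss
```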
Stochastic optimization makes it possible to track a slowly varying subspace underlying streaming data. By approximating local neighborhoods with affine subspaces, a slowly varying manifold can be tracked efficiently as well, even with corrupted and noisy data. More local neighborhoods yield a better approximation, but at higher computational complexity. A multiscale approximation scheme is proposed, in which the local approximating subspaces are organized in a tree structure. Splitting and merging of tree nodes then allows efficient control of the number of neighborhoods. The deviation of each datum from the learned model is estimated, yielding a series of statistics for anomaly detection. This framework extends the classical changepoint detection technique, which only works for one-dimensional signals. Simulations and experiments highlight the robustness and efficacy of the proposed approach in detecting an abrupt change in an otherwise slowly varying low-dimensional manifold.
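A minimal single-scale sketch of this idea, with assumed details (sliding-window PCA rather than the proposed multiscale tree, illustrative dimensions and threshold): the deviation of each new datum from the currently tracked subspace serves as the anomaly statistic.

```python
# Track a slowly varying subspace with sliding-window PCA; use the
# projection residual of each new datum as the anomaly statistic.
import numpy as np
from collections import deque

def track_and_score(stream, dim=3, window=200, thresh=0.5):
    """Yield (residual, alarm) for each datum after a short warm-up."""
    buf = deque(maxlen=window)
    for x in stream:
        if len(buf) > dim:
            X = np.array(buf)
            mu = X.mean(axis=0)
            # Principal subspace of the recent window: a local affine
            # approximation of the slowly varying manifold.
            _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
            U = Vt[:dim].T
            c = x - mu
            resid = np.linalg.norm(c - U @ (U.T @ c))  # deviation statistic
            yield resid, resid > thresh
        buf.append(x)

rng = np.random.default_rng(2)
B1, B2 = rng.standard_normal((20, 3)), rng.standard_normal((20, 3))
data = [B1 @ rng.standard_normal(3) for _ in range(300)]
data += [B2 @ rng.standard_normal(3) for _ in range(50)]   # abrupt change
alarms = [i for i, (_, a) in enumerate(track_and_score(data)) if a]
print(alarms[:5])   # alarms cluster just after the change near index 300
```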
Abstract:
Stealthy attackers move patiently through computer networks, taking days, weeks, or months to accomplish their objectives in order to avoid detection. As networks scale up in size and speed, monitoring for such attack attempts is an increasing challenge. This paper presents an efficient monitoring technique for stealthy attacks. It investigates the feasibility of the proposed method under a number of different test cases and examines how the design of the network affects detection. A methodical way of tracing anonymous stealthy activities to their approximate sources is also presented. Bayesian fusion along with traffic sampling is employed as a data reduction method. The proposed method is able to monitor stealthy activities at sampling rates of 10-20% without degrading the quality of detection.
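A hedged sketch of the data-reduction idea (the Bernoulli likelihoods, prior, and sampling rate are illustrative assumptions, not the paper's model): sampled traffic observations are fused into a per-source posterior via a log-likelihood-ratio update.

```python
# Bayesian fusion over sampled traffic: update the posterior that a source
# is stealthy-malicious from the ~15% of its packets that get sampled.
import math, random

def llr_update(events, p_sample=0.15, p_mal=0.3, p_ben=0.01, prior=0.5):
    """Posterior P(stealthy | sampled events) under a Bernoulli model.
    p_mal / p_ben: per-packet probability of a suspicious feature for
    malicious vs. benign sources (illustrative numbers)."""
    logit = math.log(prior / (1 - prior))
    for suspicious in events:
        if random.random() > p_sample:
            continue                       # packet not sampled
        if suspicious:
            logit += math.log(p_mal / p_ben)
        else:
            logit += math.log((1 - p_mal) / (1 - p_ben))
    return 1 / (1 + math.exp(-logit))

random.seed(0)
stealthy = [random.random() < 0.3 for _ in range(2000)]   # slow scanner
print(llr_update(stealthy))                               # near 1.0
```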
Abstract:
The analysis of system calls is one method employed by anomaly detection systems to recognise malicious code execution. Similarities can be drawn between this process and the behaviour of certain cells belonging to the human immune system, and these similarities can be exploited to construct an artificial immune system. A recently developed hypothesis in immunology, the Danger Theory, states that our immune system responds to the presence of intruders by sensing molecules belonging to those invaders, plus signals generated by the host indicating danger and damage. We propose the incorporation of this concept into a responsive intrusion detection system, where behavioural information about the system and running processes is combined with information regarding individual system calls.
Abstract:
Network intrusion detection systems are themselves becoming targets of attackers. Alert flood attacks may be used to conceal malicious activity by hiding it among a deluge of false alerts sent by the attacker. Although these types of attacks are very hard to stop completely, our aim is to present techniques that improve alert throughput and capacity to such an extent that the resources required to successfully mount the attack become prohibitive. The key idea presented is to combine a token bucket filter with a real-time correlation algorithm. The proposed algorithm throttles alert output from the IDS when an attack is detected. The attack graph used in the correlation algorithm ensures that alerts crucial to forming attack strategies are not discarded by throttling.
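A minimal token-bucket sketch (illustrative parameters, not the paper's implementation): ordinary alerts consume tokens and are dropped once a flood exhausts the bucket, while alerts marked critical by the correlation algorithm bypass throttling.

```python
# Token bucket for alert throttling: output is capped at `rate` alerts/sec
# with bursts up to `capacity`; correlation-critical alerts are never dropped.
import time

class TokenBucket:
    def __init__(self, rate, capacity):
        self.rate, self.capacity = rate, capacity   # tokens/sec, bucket size
        self.tokens, self.last = capacity, time.monotonic()

    def allow(self, critical=False):
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if critical:
            return True                  # bypass: crucial to attack strategies
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False                     # throttled during an alert flood

bucket = TokenBucket(rate=100, capacity=500)
forwarded = sum(bucket.allow() for _ in range(10_000))  # flood of 10k alerts
print(forwarded)   # roughly capacity plus rate * elapsed time
```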
Abstract:
The immune system provides an ideal metaphor for anomaly detection in general and computer security in particular. Based on this idea, artificial immune systems have been used for a number of years for intrusion detection, unfortunately so far with little success. However, these previous systems were largely based on immunological theory from the 1970s and 1980s, and over the last decade our understanding of immunological processes has vastly improved. In this paper we present two new immune-inspired algorithms based on the latest immunological discoveries, such as the behaviour of Dendritic Cells. The resultant algorithms are applied to real-world intrusion problems and show encouraging results. Overall, we believe there is a bright future for these next-generation artificial immune algorithms.
Abstract:
Network Intrusion Detection Systems (NIDS) monitor a network with the aim of discerning malicious from benign activity on that network. While a wide range of approaches have met varying levels of success, most IDSs rely on having access to a database of known attack signatures written by security experts. Nowadays, in order to reduce false positive alerts, correlation algorithms are used to add structure to sequences of IDS alerts. However, such techniques are of no help in discovering novel attacks or variations of known attacks, something the human immune system (HIS) is capable of doing in its own specialised domain. This paper presents a novel immune algorithm for application to an intrusion detection problem. The goal is to discover packets containing novel variations of attacks covered by an existing signature base.
Abstract:
Ensuring the security of computers is a non-trivial task, with many techniques used by malicious users to compromise these systems. In recent years a new threat has emerged in the form of networks of hijacked zombie machines used to perform complex distributed attacks such as denial of service and to obtain sensitive data such as password information. These zombie machines are said to be infected with a 'bot', a malicious piece of software which is installed on a host machine and is controlled by a remote attacker, termed the 'botmaster' of a botnet. In this work, we use the biologically inspired dendritic cell algorithm (DCA) to detect the existence of a single bot on a compromised host machine. The DCA is an immune-inspired algorithm based on an abstract model of the behaviour of the dendritic cells of the human body. The anomaly detection performed by the DCA rests on the correlation of behavioural attributes such as keylogging and packet-flooding behaviour. The results of applying the DCA to the detection of a single bot show that the algorithm is a successful technique for detecting such malicious software without responding to normally running programs.
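A much-simplified sketch of dendritic-cell-style signal fusion (the weights, threshold, and signal names are assumptions, not Greensmith's full DCA): each cell accumulates weighted signals while sampling antigens (here, process names), and the fraction of a process's antigens presented in a "danger" context yields its anomaly score (the MCAV statistic).

```python
# Simplified dendritic-cell signal fusion: correlate behavioural signals
# with the processes (antigens) observed while those signals accumulated.
import random
from collections import defaultdict

W = {"pamp": 2.0, "danger": 1.0, "safe": -2.0}   # illustrative signal weights

def dca(observations, n_cells=10, threshold=5.0):
    cells = [{"k": 0.0, "antigens": []} for _ in range(n_cells)]
    votes = defaultdict(lambda: [0, 0])           # antigen -> [mature, total]
    for pid, signals in observations:
        cell = random.choice(cells)               # a cell samples the antigen
        cell["antigens"].append(pid)
        cell["k"] += sum(W[s] * v for s, v in signals.items())
        if abs(cell["k"]) > threshold:            # cell migrates
            mature = cell["k"] > 0                # danger context?
            for a in cell["antigens"]:
                votes[a][0] += mature
                votes[a][1] += 1
            cell["k"], cell["antigens"] = 0.0, []
    return {a: m / t for a, (m, t) in votes.items()}  # MCAV per antigen

random.seed(1)
obs = [("keylogger", {"pamp": 0.5, "danger": 1.0, "safe": 0.1})
       for _ in range(200)]
obs += [("editor", {"pamp": 0.0, "danger": 0.1, "safe": 1.0})
        for _ in range(200)]
random.shuffle(obs)
print(dca(obs))   # keylogger scores markedly higher than editor
```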
Abstract:
The human immune system has numerous properties that make it ripe for exploitation in the computational domain, such as robustness and fault tolerance, and many different algorithms, collectively termed Artificial Immune Systems (AIS), have been inspired by it. Two generations of AIS are currently in use, with the first generation relying on simplified immune models and the second generation utilising interdisciplinary collaboration to develop a deeper understanding of the immune system and hence produce more complex models. Both generations of algorithms have been successfully applied to a variety of problems, including anomaly detection, pattern recognition, optimisation and robotics. In this chapter an overview of AIS is presented, its evolution is discussed, and it is shown that the diversification of the field is linked to the diversity of the immune system itself, leading to a number of algorithms as opposed to one archetypal system. Two case studies are also presented to help provide insight into the mechanisms of AIS; these are the idiotypic network approach and the Dendritic Cell Algorithm.
Abstract:
To analyze the characteristics and predict the dynamic behaviors of complex systems over time, comprehensive research to enable the development of systems that can intelligently adapt to evolving conditions and infer new knowledge with algorithms that are not predesigned is crucially needed. This dissertation research studies the integration of techniques and methodologies resulting from the fields of pattern recognition, intelligent agents, artificial immune systems, and distributed computing platforms, to create technologies that can more accurately describe and control the dynamics of real-world complex systems. The need for such technologies is emerging in manufacturing, transportation, hazard mitigation, weather and climate prediction, homeland security, and emergency response. Motivated by the ability of mobile agents to dynamically incorporate additional computational and control algorithms into executing applications, mobile agent technology is employed in this research for adaptive sensing and monitoring in a wireless sensor network. Mobile agents are software components that can travel from one computing platform to another in a network, carrying the programs and data states needed to perform their assigned tasks. To support the generation, migration, communication, and management of mobile monitoring agents, an embeddable mobile agent system (Mobile-C) is integrated with sensor nodes. Mobile monitoring agents visit distributed sensor nodes, read real-time sensor data, and perform anomaly detection using the pattern recognition algorithms they carry. The optimal control of agents is achieved by mimicking the adaptive immune response and applying multi-objective optimization algorithms. The mobile agent approach has the potential to reduce the communication load and energy consumption in monitoring networks. The major research work of this dissertation project includes: (1) studying effective feature extraction methods for time series measurement data; (2) investigating the impact of the feature extraction methods and dissimilarity measures on the performance of pattern recognition; (3) researching the effects of environmental factors on the performance of pattern recognition; (4) integrating an embeddable mobile agent system with wireless sensor nodes; (5) optimizing agent generation and distribution using artificial immune system concepts and multi-objective algorithms; (6) applying mobile agent technology and pattern recognition algorithms to adaptive structural health monitoring and driving cycle pattern recognition; (7) developing a web-based monitoring network to enable the remote visualization and analysis of real-time sensor data. Techniques and algorithms developed in this dissertation project will contribute to research advances in networked distributed systems operating under changing environments.
Abstract:
Over the last decade, there has been a trend for water utility companies to make water distribution networks more intelligent by incorporating IoT technologies, in order to improve quality of service, reduce water waste, minimize maintenance costs, etc. Current state-of-the-art solutions use expensive, power-hungry deployments to monitor and transmit water network states periodically in order to detect anomalous behaviors such as water leakage and bursts. However, more than 97% of water network assets are remote from power sources and are often in geographically remote, underpopulated areas, facts that make current approaches unsuitable for the next generation of more dynamic, adaptive water networks. Battery-driven wireless sensor/actuator based solutions are theoretically the perfect choice to support next-generation water distribution. In this paper, we present an end-to-end water leak localization system which exploits edge processing and enables the use of battery-driven sensor nodes. Our system combines a lightweight edge anomaly detection algorithm based on compression rates with an efficient localization algorithm based on graph theory. The edge anomaly detection and localization elements of the system produce a timely and accurate localization result and reduce communication by 99% compared to traditional periodic communication. We evaluated our schemes by deploying non-intrusive sensors measuring vibrational data on a real-world water test rig on which controlled leakage and burst scenarios were implemented.
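A minimal sketch of compression-rate anomaly detection at the edge (window size, threshold, and signals are illustrative assumptions, not the paper's algorithm): repetitive vibration data from a healthy pipe compresses well, so a shift in compression ratio past a calibrated baseline flags a window as anomalous and worth transmitting.

```python
# Compression-rate anomaly detection: only anomalous windows are reported,
# keeping battery-driven nodes silent the rest of the time.
import zlib
import numpy as np

def compression_ratio(window):
    raw = np.asarray(window, dtype=np.float32).tobytes()
    return len(zlib.compress(raw)) / len(raw)

def detect(stream, window=256, tol=0.10):
    baseline, buf = None, []
    for sample in stream:
        buf.append(sample)
        if len(buf) == window:
            r = compression_ratio(buf)
            buf = []
            if baseline is None:
                baseline = r                 # calibrate on the first window
            elif abs(r - baseline) > tol:    # illustrative threshold
                yield r                      # anomalous window -> transmit
            # otherwise stay silent: far fewer transmissions than periodic

rng = np.random.default_rng(3)
quiet = np.round(np.sin(np.arange(2048) * 0.1), 2)   # repetitive vibration
leak = rng.standard_normal(512)                      # structure-breaking burst
for r in detect(np.concatenate([quiet, leak])):
    print("anomaly, compression ratio", r)
```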
Abstract:
Organizations and their environments are complex systems. Such systems are difficult to understand and predict. Nevertheless, prediction is a fundamental task for business management and for decision making, which always involves risk. Classical prediction methods (among them linear regression, the Autoregressive Moving Average, and exponential smoothing) impose assumptions such as linearity and stability in order to remain mathematically and computationally tractable. The limitations of such methods, however, have been demonstrated by various means. In recent decades, new prediction methods have emerged that aim to embrace the complexity of organizational systems and their environments rather than avoid it. Among them, the most promising are bio-inspired prediction methods (e.g., neural networks, genetic/evolutionary algorithms, and artificial immune systems). This article aims to establish a situational overview of the current and potential applications of bio-inspired prediction methods in management.
Abstract:
Intelligent systems are currently inherent to society, supporting a synergistic human-machine collaboration. Beyond economic and climate factors, energy consumption is strongly affected by the performance of computing systems. The quality of software functioning may invalidate any improvement attempt. In addition, data-driven machine learning algorithms are the basis for human-centered applications, with interpretability being one of the most important features of computational systems. Software maintenance is a critical discipline to support automatic and life-long system operation. As most software registers its inner events by means of logs, log analysis is an approach to maintaining system operation. Logs are characterized as Big Data assembled in large-flow streams, being unstructured, heterogeneous, imprecise, and uncertain. This thesis addresses fuzzy and neuro-granular methods to provide maintenance solutions applied to anomaly detection (AD) and log parsing (LP), dealing with data uncertainty and identifying ideal time periods for detailed software analyses. LP provides a deeper semantic interpretation of the anomalous occurrences. The solutions evolve over time and are general-purpose, being highly applicable, scalable, and maintainable. Granular classification models, namely the Fuzzy set-Based evolving Model (FBeM), the evolving Granular Neural Network (eGNN), and the evolving Gaussian Fuzzy Classifier (eGFC), are compared on the AD problem. The evolving Log Parsing (eLP) method is proposed to automate the parsing of system logs. All the methods perform recursive mechanisms to create, update, merge, and delete information granules according to the data behavior. For the first time in the evolving intelligent systems literature, the proposed method, eLP, is able to process streams of words and sentences. In terms of AD accuracy, FBeM achieved (85.64±3.69)%, eGNN reached (96.17±0.78)%, eGFC obtained (92.48±1.21)%, and eLP reached (96.05±1.04)%. Besides being competitive, eLP in particular generates a log grammar and presents a higher level of model interpretability.
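For readers unfamiliar with log parsing, the generic sketch below (the masking heuristics are assumptions; this is not the eLP method) shows the core task: collapsing raw log lines into templates by masking variable fields, the structure over which anomaly detection can then reason.

```python
# Generic log parsing: mask variable fields so lines collapse into templates.
import re
from collections import Counter

def template(line):
    line = re.sub(r"\b\d+(\.\d+)*\b", "<*>", line)     # numbers, IPs, versions
    line = re.sub(r"\b0x[0-9a-fA-F]+\b", "<*>", line)  # hex addresses
    return re.sub(r"/[\w./-]+", "<PATH>", line)        # file paths

logs = [
    "Connection from 10.0.0.5 port 51432",
    "Connection from 10.0.0.9 port 40021",
    "Failed to open /var/log/app/trace.log",
]
print(Counter(template(l) for l in logs))
# 'Connection from <*> port <*>' occurs twice; the path line once.
```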
Abstract:
Ship tracking systems allow maritime organizations concerned with safety at sea to obtain information on the current location and route of merchant vessels. Thanks to space technology, the geographical coverage of ship tracking platforms has increased significantly in recent years, from radar-based near-shore traffic monitoring towards a worldwide picture of the maritime traffic situation. The long-range tracking systems currently in operation allow the storage of ship position data over many years: a valuable source of knowledge about the shipping routes between different ocean regions. The outcome of this Master's project is a software prototype for estimating the most operated shipping route between any two geographical locations. The analysis is based on historical ship positions acquired with long-range tracking systems. The proposed approach applies a Genetic Algorithm to a training set of relevant ship positions extracted from the long-term tracking database of the European Maritime Safety Agency (EMSA). The analysis of some representative shipping routes is presented, and the quality of the results and their operational applications are assessed by a maritime safety expert.
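A compact sketch of the GA idea (the waypoint encoding, fitness function, and operators are assumptions, not the EMSA prototype): an individual is a sequence of intermediate waypoints between two endpoints, and fitness rewards routes whose waypoints lie close to historical ship positions.

```python
# Genetic Algorithm over waypoint sequences: evolve a route that hugs
# historical ship positions between two fixed endpoints.
import numpy as np

rng = np.random.default_rng(4)
history = rng.uniform(0, 10, size=(500, 2))         # stand-in ship positions
start, end = np.array([0.0, 0.0]), np.array([10.0, 10.0])

def fitness(route):
    pts = np.vstack([start, route, end])
    d = np.linalg.norm(pts[:, None, :] - history[None, :, :], axis=-1)
    return -d.min(axis=1).mean()                    # near historical traffic

def evolve(pop_size=50, n_way=8, gens=100):
    pop = rng.uniform(0, 10, size=(pop_size, n_way, 2))
    for _ in range(gens):
        scores = np.array([fitness(r) for r in pop])
        parents = pop[np.argsort(scores)[-pop_size // 2:]]  # keep the fittest
        kids = []
        while len(kids) < pop_size:
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n_way)
            child = np.vstack([a[:cut], b[cut:]])           # one-point crossover
            child += rng.normal(0, 0.1, child.shape)        # Gaussian mutation
            kids.append(child)
        pop = np.array(kids)
    return max(pop, key=fitness)

print(evolve())   # best waypoint sequence found
```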