999 results for Traffic Records.


Abstract:

Current DDoS attacks are carried out by attack tools, worms, and botnets that use different packet-transmission strategies and various forms of attack packets to defeat defense systems. As a result, defense systems need multiple detection methods to identify attacks. Moreover, DDoS attacks can blend their traffic into flash crowds, so that a complex defense system cannot detect the attack traffic in time. In this paper, we propose a behavior-based detection method that can discriminate DDoS attack traffic from traffic generated by real users. Using Pearson's correlation coefficient, our comparable detection methods can extract the repeatable features of packet arrivals. We tested the detection accuracy through extensive simulations. We then performed experiments with several datasets, and the results confirm that the proposed method can quickly differentiate the traffic of an attack source from legitimate traffic. We also discuss approaches to improving the proposed methods in the conclusion of this paper.
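As an illustration of the statistic named in this abstract, the following is a minimal Python sketch of Pearson's correlation coefficient applied to two series of packet counts per time interval. The abstract does not say which series the authors correlate, so the inputs, the `looks_repetitive` helper, and the 0.9 threshold are assumptions made only for this example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # The coefficient is undefined for a constant series; return 0 by convention here.
        return 0.0
    return cov / sqrt(var_x * var_y)


def looks_repetitive(arrivals_a, arrivals_b, threshold=0.9):
    """Hypothetical helper: flag two packet-count series as suspiciously similar.

    The threshold is an assumption for this sketch, not a value from the paper.
    """
    return pearson(arrivals_a, arrivals_b) >= threshold
```

Two windows of traffic from the same source could be passed to `looks_repetitive`; a coefficient close to 1 would suggest a repeating arrival pattern, which is the kind of feature the abstract describes.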

Abstract:

Due to the increasing unreliability of traditional port-based methods, Internet traffic classification has attracted considerable research effort in recent years. Many previous papers have focused on using statistical characteristics as discriminators and applying machine learning techniques to classify traffic flows. In this paper, we propose a novel machine-learning-based approach in which the features are extracted from the packet payload instead of from flow statistics. Specifically, every flow is represented by a feature vector in which each item indicates the occurrence of a particular token, i.e., a common substring, in the payload. We applied various machine learning algorithms to evaluate the idea and used different feature selection schemes to identify the critical tokens. Experimental results based on a real-world traffic data set show that the approach can achieve high accuracy with low overhead.
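The feature representation described in this abstract can be sketched in a few lines of Python. The token list below is purely hypothetical (the abstract does not list the tokens the authors selected); the sketch only shows how a payload maps to a binary occurrence vector.

```python
# Hypothetical token vocabulary; the real one would come from feature selection.
TOKENS = [b"GET ", b"HTTP/1.1", b"SSH-", b"BitTorrent protocol"]


def token_features(payload, tokens=TOKENS):
    """Return a 0/1 vector: 1 if the token occurs anywhere in the payload bytes."""
    return [1 if token in payload else 0 for token in tokens]
```

A vector built this way could then be fed to any standard classifier, one row per flow.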

Abstract:

Network traffic classification is an essential component of network management and security systems. To address the limitations of traditional port-based and payload-based methods, recent studies have focused on alternative approaches. One promising direction is applying machine learning techniques to classify traffic flows based on packet-level and flow-level statistics. In particular, previous papers have shown that clustering can achieve high accuracy and discover unknown application classes. In this work, we present a novel semi-supervised learning method using constrained clustering algorithms. The motivation is that, in the network domain, a lot of background information is available in addition to the data instances themselves. For example, we might know that flows ƒ1 and ƒ2 are using the same application protocol because they are visiting the same host address at the same port simultaneously. In this case, ƒ1 and ƒ2 should ideally be grouped into the same cluster. Therefore, we describe these correlations in the form of pair-wise must-link constraints and incorporate them into the clustering process. We applied three constrained variants of the K-Means algorithm, which perform hard or soft constraint satisfaction and metric learning from constraints. A number of real-world traffic traces were used to show the availability of constraints and to test the proposed approach. The experimental results indicate that incorporating constraints during clustering can significantly improve the overall accuracy and cluster purity.
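The must-link idea in this abstract (flows to the same host and port at about the same time probably share an application) can be sketched as follows. The flow tuple layout, the `must_link_pairs` name, and the 60-second window used to approximate "simultaneously" are assumptions for illustration only.

```python
from collections import defaultdict
from itertools import combinations


def must_link_pairs(flows, window=60.0):
    """Derive pair-wise must-link constraints from flow records.

    flows: iterable of (flow_id, dst_ip, dst_port, start_time_seconds).
    Flows with the same destination address and port whose start times fall
    into the same time bucket are linked. The bucket width is an assumption.
    """
    groups = defaultdict(list)
    for flow_id, dst_ip, dst_port, start in flows:
        bucket = int(start // window)
        groups[(dst_ip, dst_port, bucket)].append(flow_id)

    pairs = []
    for members in groups.values():
        # Every pair inside a group becomes one must-link constraint.
        pairs.extend(combinations(sorted(members), 2))
    return sorted(pairs)
```

Fixed buckets are a simplification: two flows that start a second apart but straddle a bucket boundary are not linked, so a sliding window would be a more faithful reading of "simultaneously".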

Abstract:

The web is a rich resource for information discovery, and as a result web mining is a hot topic. However, a reliable mining result depends on the reliability of the data set. Every second, the web generates a huge amount of data, such as web page requests and file transfers. The data reflect human behavior in cyberspace and are therefore valuable for analysis in various disciplines, e.g., social science and network security. How to store the data is a challenge. A common strategy is to save a summary of the data, for example by using aggregation functions to preserve the features of the original data in much less space. A key problem, however, is that such information can be distorted by the presence of illegitimate traffic, e.g., botnet recruitment scanning and DDoS attack traffic. An important consideration in web-related knowledge discovery, then, is the robustness of the aggregation method, which in turn may be affected by the reliability of the network traffic data. In this chapter, we first present the methods of aggregation functions, and then we employ information distances to filter out anomalous data in preparation for web data mining.

Abstract:

In today’s high-speed networks, it is becoming increasingly challenging for network managers to understand the nature of the traffic carried in their networks. A major problem for traffic analysis in this context is how to extract a concise yet accurate summary of the relevant aggregate traffic flows that are present in network traces. In this paper, we present two summarization techniques to minimize the size of the traffic flow report generated by a hierarchical cluster analysis tool. By analyzing the accuracy and compaction gain of our approach on a standard benchmark dataset, we demonstrate that it achieves more accurate summaries than those of an existing tool based on frequent itemset mining.

Abstract:

The lack of comprehensive data on transport operations is a long-standing problem in transport research. Information on road transport in particular has proved difficult to obtain. This paper documents a study aimed at developing and testing a technique to estimate long-distance passenger and freight movements based on direct observation of vehicle movements.

Abstract:

Anonymous communication has become a hot research topic as demand for web privacy protection increases. However, few such systems can provide a high level of anonymity for web browsing. The reason is that the currently dominant defense against traffic analysis attacks is dummy packet padding. This method incurs huge delay and bandwidth waste, which inhibits its use for web browsing. In this paper, we propose a predicted packet padding strategy to replace the dummy packet padding method in anonymous web browsing systems. The proposed strategy significantly reduces delay and bandwidth waste on average. We formulated the traffic analysis attack and defense problem, and defined a metric, the cost coefficient of anonymization (CCA), to measure the performance of anonymization. We thoroughly analyzed the problem in light of the characteristics of web browsing and concluded that, in theory, the proposed strategy is better than the current dummy packet padding strategy. We conducted extensive experiments on two real-world data sets, and the results confirmed the advantage of the proposed method.

Abstract:

This book focuses on network management and traffic engineering for Internet and distributed computing technologies, as well as present emerging technology trends and advanced platform

Abstract:

Due to the limitations of the traditional port-based and payload-based traffic classification approaches, the past decade has seen extensive work on using machine learning techniques to classify network traffic based on packet-level and flow-level features. In particular, previous studies have shown that the unsupervised clustering approach is both accurate and capable of discovering previously unknown application classes. In this paper, we explore the utility of side information in the process of traffic clustering. Specifically, we focus on the flow correlation information that can be efficiently extracted from packet headers and expressed as instance-level constraints, which indicate that particular sets of flows are using the same application and thus should be put into the same cluster. To incorporate the constraints, we propose a modified constrained K-Means algorithm. A variety of real-world traffic traces are used to show that the constraints are widely available. The experimental results indicate that the constrained approach not only improves the quality of the resulting clusters, but also speeds up the convergence of the clustering process.
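One simple way to honor instance-level constraints inside K-Means is to assign each set of linked flows to a centroid as a unit. The Python sketch below shows only that assignment step; it is a generic hard-constraint variant written for illustration, not the modified algorithm proposed in the paper, and all names and data layouts are assumptions.

```python
def assign_with_must_links(points, groups, centroids):
    """One K-Means assignment step that keeps each must-link group together.

    points:    dict mapping flow_id -> feature tuple
    groups:    list of lists of flow_ids; each list is one must-link set
               (unconstrained flows appear as singleton lists)
    centroids: list of feature tuples
    Returns a dict mapping flow_id -> index of the chosen centroid.
    """
    def sq_dist(p, c):
        return sum((a - b) ** 2 for a, b in zip(p, c))

    assignment = {}
    for group in groups:
        # Cost of putting the whole group into each cluster.
        costs = [sum(sq_dist(points[f], c) for f in group) for c in centroids]
        best = costs.index(min(costs))
        for f in group:
            assignment[f] = best
    return assignment
```

In a full clustering loop this step would alternate with the usual centroid update. Note that a flow can end up in a cluster other than its individually nearest one when its linked partners pull it there.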

Abstract:

This paper presents a new semi-supervised method to effectively improve traffic classification performance when few supervised training data are available. Existing semi-supervised methods label a large proportion of testing flows as unknown due to limited supervised information, which severely affects classification performance. To address this problem, we propose to incorporate flow correlation into both the training and testing stages. At the training stage, we use flow correlation to extend the supervised data set by automatically labeling unlabeled flows according to their correlation with the pre-labeled flows. Consequently, the traffic classifier performs better because of the increased size and quality of the supervised data set. At the testing stage, correlated flows are identified and classified jointly by combining their individual predictions, which further boosts classification accuracy. An empirical study on real-world network traffic shows that the proposed method outperforms state-of-the-art classification methods based on flow statistical features.
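The training-stage step described in this abstract, extending a small labeled set through flow correlation, might look like the following Python sketch. How correlated groups are found and how conflicts are handled are not stated in the abstract; here groups are given as input and a group whose labeled members disagree is left untouched, both of which are assumptions.

```python
def extend_labels(labeled, correlated_groups):
    """Propagate labels from pre-labeled flows to their correlated flows.

    labeled:           dict mapping flow_id -> application label
    correlated_groups: iterable of sets of flow_ids believed to be correlated
    Returns a new dict; the input dict is not modified.
    """
    extended = dict(labeled)
    for group in correlated_groups:
        labels = {labeled[f] for f in group if f in labeled}
        if len(labels) != 1:
            # No labeled member, or conflicting labels: skip (assumed policy).
            continue
        label = next(iter(labels))
        for f in group:
            extended.setdefault(f, label)
    return extended
```

The extended dictionary would then serve as the training set for whichever classifier is used.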

Abstract:

A critical problem for Internet traffic classification is how to obtain a high-performance classifier based on statistical features using a small set of training data. Solutions to this problem are essential for dealing with encrypted applications and newly emerging applications. In this paper, we propose a new Naive Bayes (NB) based classification scheme to tackle this problem, which uses two recent research findings: feature discretization and flow correlation. A new bag-of-flow (BoF) model is first introduced to describe the correlated flows, and it leads to a new BoF-based traffic classification problem. We cast BoF-based traffic classification as a specific classifier combination problem and theoretically analyze the classification benefit of flow aggregation. A number of combination methods are also formulated and used to aggregate the NB predictions of the correlated flows. Finally, we carry out a number of experiments on a large-scale real-world network dataset. The experimental results show that the proposed scheme achieves significantly higher classification accuracy and much faster classification speed than state-of-the-art traffic classification methods.
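To make the classifier-combination idea concrete, here is a small Python sketch that aggregates per-flow class posteriors for one bag of correlated flows. The sum, product, and majority-vote rules are standard combination rules chosen for illustration; the abstract does not say which rules the authors formulated, and the posterior values in the usage note are made up.

```python
from math import log


def combine_bag(posteriors, rule="sum"):
    """Pick one class for a bag of flows from their individual posteriors.

    posteriors: non-empty list of dicts, each mapping class -> probability,
                one dict per flow in the bag (all with the same classes).
    rule:       "sum", "product", or "vote".
    """
    classes = list(posteriors[0])
    if rule == "sum":
        score = {c: sum(p[c] for p in posteriors) for c in classes}
    elif rule == "product":
        # Sum of logs avoids underflow; probabilities are floored at 1e-12.
        score = {c: sum(log(max(p[c], 1e-12)) for p in posteriors) for c in classes}
    elif rule == "vote":
        score = {c: 0 for c in classes}
        for p in posteriors:
            score[max(p, key=p.get)] += 1
    else:
        raise ValueError("unknown rule: " + rule)
    return max(score, key=score.get)
```

For example, given the three illustrative posteriors `{"http": 0.9, "p2p": 0.1}`, `{"http": 0.4, "p2p": 0.6}`, and `{"http": 0.7, "p2p": 0.3}`, all three rules assign the whole bag to `"http"`, even though the second flow on its own would have been classified as `"p2p"`.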

Abstract:

Purpose – The application of “Google” econometrics (Geco) has evolved rapidly in recent years and can be applied in various fields of research. Based on accepted theories in existing economic literature, this paper seeks to contribute to the innovative use of research on Google search query data to provide a new, innovative approach to property research.

Design/methodology/approach – In this study, existing data from Google Insights for Search (GI4S) is extended into a new potential source of consumer sentiment data based on visits to a commonly-used UK online real-estate agent platform (Rightmove.co.uk). In order to contribute to knowledge about the use of Geco's black box, namely the unknown sampling population and the specific search queries influencing the variables, the GI4S series are compared to direct web navigation.

Findings – The main finding from this study is that GI4S data produce immediate real-time results with a high level of reliability in explaining the future volume of transactions and house prices in comparison to the direct website data. Furthermore, the results reveal that the number of visits to Rightmove.co.uk is driven by GI4S data and vice versa, and indeed without a contemporaneous relationship.

Originality/value – This study contributes to the newly emerging and innovative field of research involving search engine data. It also contributes to the knowledge base about the increasing use of online consumer data in economic research on property markets.