997 results for Traffic Breakdown


Relevance:

20.00% 20.00%

Publisher:

Abstract:

The web is a rich resource for information discovery, and as a result web mining is a hot topic. However, a reliable mining result depends on the reliability of the data set. Every second, the web generates a huge amount of data, such as web page requests and file transfers. These data reflect human behavior in cyberspace and are therefore valuable for analysis in various disciplines, e.g. social science and network security. How to store the data is a challenge. A usual strategy is to save an abstract of the data, for example by using aggregation functions to preserve the features of the original data in much less space. A key problem, however, is that such information can be distorted by the presence of illegitimate traffic, e.g. botnet recruitment scanning and DDoS attack traffic. An important consideration in web-related knowledge discovery, then, is the robustness of the aggregation method, which in turn may be affected by the reliability of network traffic data. In this chapter, we first present the methods of aggregation functions, and then we employ information distances to filter out anomalous data as a preparation for web data mining.
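The filtering step described in this abstract can be sketched as follows. The abstract does not say which information distance is used, so this example assumes the Jeffrey distance (symmetrised Kullback-Leibler divergence) between a traffic window's histogram and a trusted baseline histogram. The function names, the smoothing constant, and the threshold are all hypothetical and chosen for illustration only.

```python
import math

def to_distribution(counts, eps=1e-6):
    """Turn raw bin counts into probabilities (eps avoids zero bins; assumed value)."""
    total = sum(counts) + eps * len(counts)
    return [(c + eps) / total for c in counts]

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in bits."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jeffrey_distance(p, q):
    """Symmetrised KL divergence, one possible information distance."""
    return kl_divergence(p, q) + kl_divergence(q, p)

def filter_windows(baseline_counts, windows, threshold):
    """Keep only the windows whose distribution stays close to the baseline."""
    baseline = to_distribution(baseline_counts)
    kept = []
    for counts in windows:
        if jeffrey_distance(to_distribution(counts), baseline) <= threshold:
            kept.append(counts)
    return kept
```

With a baseline histogram of `[50, 30, 20]`, a window such as `[48, 32, 20]` would pass a threshold of 0.5, while a heavily skewed window such as `[5, 5, 90]` would be dropped before aggregation.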

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In today’s high-speed networks, it is becoming increasingly challenging for network managers to understand the nature of the traffic that is carried in their network. A major problem for traffic analysis in this context is how to extract a concise yet accurate summary of the relevant aggregate traffic flows that are present in network traces. In this paper, we present two summarization techniques to minimize the size of the traffic flow report that is generated by a hierarchical cluster analysis tool. By analyzing the accuracy and compaction gain of our approach on a standard benchmark dataset, we demonstrate that our approach achieves more accurate summaries than those of an existing tool that is based on frequent itemset mining.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The lack of comprehensive data on transport operations is a long-standing problem in transport research. Information on road transport in particular has proved difficult to obtain. This paper documents a study which was aimed at developing and testing a technique to estimate long-distance passenger and freight movements based on direct observation of vehicle movements.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We present an approach to computing high-breakdown regression estimators in parallel on graphics processing units (GPUs). We show that sorting the residuals is not necessary, and it can be substituted by calculating the median. We present and compare various methods to calculate the median and order statistics on GPUs. We introduce an alternative method based on the optimization of a convex function, and show its numerical superiority when calculating the order statistics of very large arrays on GPUs.
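To illustrate how a median can be found without sorting, here is a minimal CPU sketch. It assumes the convex function is the sum of absolute deviations f(t) = Σ|x_i − t|, whose minimiser is a median; the abstract does not name the function or the solver, so both the objective and the bisection scheme below are assumptions, and nothing here is GPU code.

```python
def median_by_convex_minimization(values, tol=1e-9):
    """Minimise f(t) = sum(|x - t|) by bisection on the sign of its subgradient.

    Each step needs only two counting passes over the data (a parallel-friendly
    reduction), never a sort.
    """
    lo, hi = min(values), max(values)
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        below = sum(1 for x in values if x < mid)
        above = sum(1 for x in values if x > mid)
        slope = below - above  # one subgradient of f at mid
        if slope < 0:
            lo = mid           # f still decreasing: minimiser lies to the right
        elif slope > 0:
            hi = mid           # f increasing: minimiser lies to the left
        else:
            return mid         # zero subgradient: mid is a minimiser
    return (lo + hi) / 2.0
```

For an even number of values any point between the two middle elements minimises f, so the function may return any value in that interval.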

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Anonymous communication has become a hot research topic in order to meet the increasing demand for web privacy protection. However, there are few such systems which can provide a high level of anonymity for web browsing. The reason is the currently dominant dummy packet padding method for anonymization against traffic analysis attacks. This method incurs huge delay and bandwidth waste, which inhibits its use for web browsing. In this paper, we propose a predicted packet padding strategy to replace the dummy packet padding method for anonymous web browsing systems. The proposed strategy mitigates delay and bandwidth waste significantly on average. We formulated the traffic analysis attack and defense problem, and defined a metric, cost coefficient of anonymization (CCA), to measure the performance of anonymization. We thoroughly analyzed the problem with the characteristics of web browsing and concluded that the proposed strategy is better than the current dummy packet padding strategy in theory. We have conducted extensive experiments on two real-world data sets, and the results confirmed the advantage of the proposed method.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This book focuses on network management and traffic engineering for Internet and distributed computing technologies, and presents emerging technology trends and advanced platform

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Due to the limitations of the traditional port-based and payload-based traffic classification approaches, the past decade has seen extensive work on utilizing machine learning techniques to classify network traffic based on packet-level and flow-level features. In particular, previous studies have shown that the unsupervised clustering approach is both accurate and capable of discovering previously unknown application classes. In this paper, we explore the utility of side information in the process of traffic clustering. Specifically, we focus on the flow correlation information that can be efficiently extracted from packet headers and expressed as instance-level constraints, which indicate that particular sets of flows are using the same application and thus should be put into the same cluster. To incorporate the constraints, we propose a modified constrained K-Means algorithm. A variety of real-world traffic traces are used to show that the constraints are widely available. The experimental results indicate that the constrained approach not only improves the quality of the resulting clusters, but also speeds up the convergence of the clustering process.
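One way instance-level constraints can enter K-Means is sketched below. This is not the authors' modified algorithm, whose details the abstract does not give. It assumes the constraints arrive as must-link groups of flow indices (each flow in exactly one group, singletons for unconstrained flows) and assigns every group to a single centroid as a unit. The function name and the initialisation rule are hypothetical.

```python
def constrained_kmeans(points, groups, k, iters=20):
    """K-Means where each must-link group of point indices shares one cluster."""
    assert len(groups) >= k, "need at least k groups to seed the centroids"
    # Assumed initialisation: the first member of each of the first k groups.
    centroids = [points[groups[i][0]] for i in range(k)]
    labels = [-1] * len(points)

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    for _ in range(iters):
        new_labels = labels[:]
        for group in groups:
            # The whole group goes to the centroid with the lowest total cost.
            costs = [sum(sq_dist(points[i], c) for i in group) for c in centroids]
            best = costs.index(min(costs))
            for i in group:
                new_labels[i] = best
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for j in range(k):
            members = [points[i] for i in range(len(points)) if labels[i] == j]
            if members:  # an empty cluster keeps its previous centroid
                dim = len(members[0])
                centroids[j] = tuple(
                    sum(m[d] for m in members) / len(members) for d in range(dim)
                )
    return labels, centroids
```

Because a group is assigned in one step, a flow that lies between two clusters is pulled to the cluster of its correlated flows, which is the intuition behind using flow correlation as side information.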

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper presents a new semi-supervised method to effectively improve traffic classification performance when few supervised training data are available. Existing semi-supervised methods label a large proportion of testing flows as unknown flows due to limited supervised information, which severely affects the classification performance. To address this problem, we propose to incorporate flow correlation into both training and testing stages. At the training stage, we make use of flow correlation to extend the supervised data set by automatically labeling unlabeled flows according to their correlation to the pre-labeled flows. Consequently, the traffic classifier has better performance due to the extended size and quality of the supervised data sets. At the testing stage, the correlated flows are identified and classified jointly by combining their individual predictions, so as to further boost the classification accuracy. An empirical study on real-world network traffic shows that the proposed method outperforms state-of-the-art classification methods based on flow statistical features.
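The training-stage idea of extending the labeled set through flow correlation can be sketched as follows. The abstract does not define how correlation is determined, so this example assumes two flows are correlated when they share a hypothetical key of (destination IP, destination port, protocol). The function name and data layout are illustrative only.

```python
from collections import Counter, defaultdict

def extend_labels(flow_keys, labels):
    """Propagate known labels to unlabeled flows that share a correlation key.

    flow_keys: list of hashable keys, one per flow (assumed correlation key).
    labels:    dict mapping flow index -> label for the pre-labeled flows.
    Returns a new dict that also covers the automatically labeled flows.
    """
    by_key = defaultdict(list)
    for index, key in enumerate(flow_keys):
        by_key[key].append(index)

    extended = dict(labels)
    for indices in by_key.values():
        known = [labels[i] for i in indices if i in labels]
        if not known:
            continue  # no pre-labeled flow in this group: leave it unlabeled
        majority = Counter(known).most_common(1)[0][0]
        for i in indices:
            extended.setdefault(i, majority)
    return extended
```

Flows whose group contains no pre-labeled member stay unlabeled, so the classifier is trained only on labels that can be traced back to supervised data.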

Relevance:

20.00% 20.00%

Publisher:

Abstract:

A critical problem for Internet traffic classification is how to obtain a high-performance classifier based on statistical features using a small set of training data. Solutions to this problem are essential for dealing with encrypted applications and newly emerging applications. In this paper, we propose a new Naive Bayes (NB) based classification scheme to tackle this problem, which utilizes two recent research findings, feature discretization and flow correlation. A new bag-of-flow (BoF) model is first introduced to describe the correlated flows, and it leads to a new BoF-based traffic classification problem. We cast the BoF-based traffic classification as a specific classifier combination problem and theoretically analyze the classification benefit from flow aggregation. A number of combination methods are also formulated and used to aggregate the NB predictions of the correlated flows. Finally, we carry out a number of experiments on a large-scale real-world network dataset. The experimental results show that the proposed scheme can achieve significantly higher classification accuracy and much faster classification speed in comparison with state-of-the-art traffic classification methods.
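Aggregating per-flow predictions within a bag of flows can be sketched with two generic classifier-combination rules. The abstract does not list which combination methods the paper formulates, so the sum rule and majority vote below are assumptions. The per-flow posteriors are taken as given, for example from a Naive Bayes classifier, and the function names are hypothetical.

```python
from collections import Counter

def combine_sum_rule(posteriors):
    """Pick the class with the largest summed posterior over the bag."""
    totals = Counter()
    for flow_posterior in posteriors:
        for label, prob in flow_posterior.items():
            totals[label] += prob
    return max(totals, key=totals.get)

def combine_majority_vote(posteriors):
    """Each flow votes for its own most probable class; the top vote wins."""
    votes = Counter(max(p, key=p.get) for p in posteriors)
    return votes.most_common(1)[0][0]

# Three correlated flows in one bag, with made-up posteriors.
bag = [
    {"http": 0.60, "p2p": 0.40},
    {"http": 0.55, "p2p": 0.45},
    {"http": 0.10, "p2p": 0.90},
]
```

On this made-up bag the two rules disagree: the sum rule weighs the one confident flow heavily, while majority vote follows the two weakly confident flows, which shows why the choice of combination method matters.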

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Purpose – The application of “Google” econometrics (Geco) has evolved rapidly in recent years and can be applied in various fields of research. Based on accepted theories in existing economic literature, this paper seeks to contribute to the innovative use of Google search query data in order to provide a new, innovative approach to property research.

Design/methodology/approach – In this study, existing data from Google Insights for Search (GI4S) is extended into a new potential source of consumer sentiment data based on visits to a commonly-used UK online real-estate agent platform (Rightmove.co.uk). In order to contribute to knowledge about the use of Geco's black box, namely the unknown sampling population and the specific search queries influencing the variables, the GI4S series are compared to direct web navigation.

Findings – The main finding from this study is that GI4S data produce immediate real-time results with a high level of reliability in explaining the future volume of transactions and house prices in comparison to the direct website data. Furthermore, the results reveal that the number of visits to Rightmove.co.uk is driven by GI4S data and vice versa, and indeed without a contemporaneous relationship.

Originality/value – This study contributes to the new emerging and innovative field of research involving search engine data. It also contributes to the knowledge base about the increasing use of online consumer data in economic research in property markets.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Traffic noise causes adverse effects on the health and quality of life of individuals and communities exposed to it, including annoyance, sleep disturbance, decreased performance at school or work, stress, hypertension, and ischemic heart disease. In Australia there are few standards or policies addressing noise in urban environments, with many discrepancies in noise level thresholds when comparing states and regions. Currently Victoria has a day-to-night threshold for noise levels well above accepted levels in Europe, and there is no standard for the late night period. A better understanding of the health impacts of noise in the Australian context is vital for informing development and implementation of policy and legislation for road traffic noise management. This paper reviews the evidence base and policies related to traffic noise in urban areas, and presents a case study of noise mapping and assessing population health impacts (e.g. sleep disturbance) in Geelong, Victoria, Australia.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In this paper, we propose a behavior-based detection approach that can discriminate Distributed Denial of Service (DDoS) attack traffic from legitimate traffic regardless of the types of attack packets and methods. Current DDoS attacks are carried out by attack tools, worms and botnets using different packet-transmission rates and packet forms to beat defense systems. These varied attack strategies force defense systems to use a variety of detection methods in order to identify the attacks. Moreover, DDoS attacks can craft their traffic to resemble flash crowd events and fly under the radar on the way to the victim. We notice that DDoS attacks have repeatable patterns which differ from legitimate flash crowd traffic. We propose comparable detection methods based on Pearson’s correlation coefficient. Our methods can extract the repeatable features from the packet arrivals in DDoS traffic, features that are absent from flash crowd traffic. Extensive simulations were run to optimize the detection methods. We then performed experiments with several datasets, and our results affirm that the proposed methods can differentiate DDoS attacks from legitimate traffic.
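The role of Pearson's correlation coefficient in spotting repeatable arrival patterns can be sketched as follows. The abstract does not say how the packet-arrival series are formed or compared, so this example assumes a series of per-interval packet counts split into two halves that are correlated against each other. The function names and the 0.9 threshold are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no pattern to correlate
    return cov / math.sqrt(var_x * var_y)

def looks_repeatable(packet_counts, threshold=0.9):
    """Flag a series whose first and second halves are strongly correlated."""
    half = len(packet_counts) // 2
    first, second = packet_counts[:half], packet_counts[half:2 * half]
    return pearson(first, second) >= threshold
```

A tool-generated series such as `[10, 50, 10, 50, 10, 50, 10, 50]` repeats itself and is flagged, whereas an irregular series such as `[3, 8, 5, 9, 7, 2, 6, 4]` is not.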

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper presents a novel traffic classification scheme to improve classification performance when few training data are available. In the proposed scheme, traffic flows are described using discretized statistical features, and flow correlation information is modeled by a bag-of-flow (BoF). We solve the BoF-based traffic classification in a classifier combination framework and theoretically analyze the performance benefit. Furthermore, a new BoF-based traffic classification method is proposed to aggregate the naive Bayes (NB) predictions of the correlated flows. We also present an analysis of the prediction error sensitivity of the aggregation strategies. Finally, a large number of experiments are carried out on two large-scale real-world traffic datasets to evaluate the proposed scheme. The experimental results show that the proposed scheme can achieve much better classification performance than existing state-of-the-art traffic classification methods.