990 resultados para National Science Foundation
Resumo:
We postulate that exogenous losses-which are typically regarded as introducing undesirable "noise" that needs to be filtered out or hidden from end points-can be surprisingly beneficial. In this paper we evaluate the effects of exogenous losses on transmission control loops, focusing primarily on efficiency and convergence to fairness properties. By analytically capturing the effects of exogenous losses, we are able to characterize the transient behavior of TCP. Our numerical results suggest that "noise" resulting from exogenous losses should not be filtered out blindly, and that a careful examination of the parameter space leads to better strategies regarding the treatment of exogenous losses inside the network. Specifically, we show that while low levels of exogenous losses do help connections converge to their fair share, higher levels of losses lead to inefficient network utilization. We draw the line between these two cases by determining whether or not it is advantageous to hide, or more interestingly introduce, exogenous losses. Our proposed approach is based on classifying the effects of exogenous losses into long-term and short-term effects. Such classification informs the extent to which we control exogenous losses, so as to operate in an efficient and fair region. We validate our results through simulations.
Resumo:
Network traffic arises from the superposition of Origin-Destination (OD) flows. Hence, a thorough understanding of OD flows is essential for modeling network traffic, and for addressing a wide variety of problems including traffic engineering, traffic matrix estimation, capacity planning, forecasting and anomaly detection. However, to date, OD flows have not been closely studied, and there is very little known about their properties. We present the first analysis of complete sets of OD flow timeseries, taken from two different backbone networks (Abilene and Sprint-Europe). Using Principal Component Analysis (PCA), we find that the set of OD flows has small intrinsic dimension. In fact, even in a network with over a hundred OD flows, these flows can be accurately modeled in time using a small number (10 or less) of independent components or dimensions. We also show how to use PCA to systematically decompose the structure of OD flow timeseries into three main constituents: common periodic trends, short-lived bursts, and noise. We provide insight into how the various constituents contribute to the overall structure of OD flows and explore the extent to which this decomposition varies over time.
Resumo:
In a recent paper, Structural Analysis of Network Traffic Flows, we analyzed the set of Origin Destination traffic flows from the Sprint-Europe and Abilene backbone networks. This report presents the complete set of results from analyzing data from both networks. The results in this report are specific to the Sprint-1 and Abilene datasets studied in the above paper. The following results are presented here: 1 Rows of Principal Matrix (V) 2 1.1 Sprint-1 Dataset ................................ 2 1.2 Abilene Dataset.................................. 9 2 Set of Eigenflows 14 2.1 Sprint-1 Dataset.................................. 14 2.2 Abilene Dataset................................... 21 3 Classifying Eigenflows 26 3.1 Sprint-1 Dataset.................................. 26 3.2 Abilene Datase.................................... 44
Resumo:
This paper introduces BoostMap, a method that can significantly reduce retrieval time in image and video database systems that employ computationally expensive distance measures, metric or non-metric. Database and query objects are embedded into a Euclidean space, in which similarities can be rapidly measured using a weighted Manhattan distance. Embedding construction is formulated as a machine learning task, where AdaBoost is used to combine many simple, 1D embeddings into a multidimensional embedding that preserves a significant amount of the proximity structure in the original space. Performance is evaluated in a hand pose estimation system, and a dynamic gesture recognition system, where the proposed method is used to retrieve approximate nearest neighbors under expensive image and video similarity measures. In both systems, BoostMap significantly increases efficiency, with minimal losses in accuracy. Moreover, the experiments indicate that BoostMap compares favorably with existing embedding methods that have been employed in computer vision and database applications, i.e., FastMap and Bourgain embeddings.
Resumo:
We consider the problem of efficiently and fairly allocating bandwidth at a highly congested link to a diverse set of flows, including TCP flows with various Round Trip Times (RTT), non-TCP-friendly flows such as Constant-Bit-Rate (CBR) applications using UDP, misbehaving, or malicious flows. Though simple, a FIFO queue management is vulnerable. Fair Queueing (FQ) can guarantee max-min fairness but fails at efficiency. RED-PD exploits the history of RED's actions in preferentially dropping packets from higher-rate flows. Thus, RED-PD attempts to achieve fairness at low cost. By relying on RED's actions, RED-PD turns out not to be effective in dealing with non-adaptive flows in settings with a highly heterogeneous mix of flows. In this paper, we propose a new approach we call RED-NB (RED with No Bias). RED-NB does not rely on RED's actions. Rather it explicitly maintains its own history for the few high-rate flows. RED-NB then adaptively adjusts flow dropping probabilities to achieve max-min fairness. In addition, RED-NB helps RED itself at very high loads by tuning RED's dropping behavior to the flow characteristics (restricted in this paper to RTTs) to eliminate its bias against long-RTT TCP flows while still taking advantage of RED's features at low loads. Through extensive simulations, we confirm the fairness of RED-NB and show that it outperforms RED, RED-PD, and CHOKe in all scenarios.
Resumo:
The best-effort nature of the Internet poses a significant obstacle to the deployment of many applications that require guaranteed bandwidth. In this paper, we present a novel approach that enables two edge/border routers-which we call Internet Traffic Managers (ITM)-to use an adaptive number of TCP connections to set up a tunnel of desirable bandwidth between them. The number of TCP connections that comprise this tunnel is elastic in the sense that it increases/decreases in tandem with competing cross traffic to maintain a target bandwidth. An origin ITM would then schedule incoming packets from an application requiring guaranteed bandwidth over that elastic tunnel. Unlike many proposed solutions that aim to deliver soft QoS guarantees, our elastic-tunnel approach does not require any support from core routers (as with IntServ and DiffServ); it is scalable in the sense that core routers do not have to maintain per-flow state (as with IntServ); and it is readily deployable within a single ISP or across multiple ISPs. To evaluate our approach, we develop a flow-level control-theoretic model to study the transient behavior of established elastic TCP-based tunnels. The model captures the effect of cross-traffic connections on our bandwidth allocation policies. Through extensive simulations, we confirm the effectiveness of our approach in providing soft bandwidth guarantees. We also outline our kernel-level ITM prototype implementation.
Resumo:
TCP performance degrades when end-to-end connections extend over wireless connections-links which are characterized by high bit error rate and intermittent connectivity. Such link characteristics can significantly degrade TCP performance as the TCP sender assumes wireless losses to be congestion losses resulting in unnecessary congestion control actions. Link errors can be reduced by increasing transmission power, code redundancy (FEC) or number of retransmissions (ARQ). But increasing power costs resources, increasing code redundancy reduces available channel bandwidth and increasing persistency increases end-to-end delay. The paper proposes a TCP optimization through proper tuning of power management, FEC and ARQ in wireless environments (WLAN and WWAN). In particular, we conduct analytical and numerical analysis taking into "wireless-aware" TCP) performance under different settings. Our results show that increasing power, redundancy and/or retransmission levels always improves TCP performance by reducing link-layer losses. However, such improvements are often associated with cost and arbitrary improvement cannot be realized without paying a lot in return. It is therefore important to consider some kind of net utility function that should be optimized, thus maximizing throughput at the least possible cost.
Resumo:
(This Technical Report revises TR-BUCS-2003-011) The Transmission Control Protocol (TCP) has been the protocol of choice for many Internet applications requiring reliable connections. The design of TCP has been challenged by the extension of connections over wireless links. In this paper, we investigate a Bayesian approach to infer at the source host the reason of a packet loss, whether congestion or wireless transmission error. Our approach is "mostly" end-to-end since it requires only one long-term average quantity (namely, long-term average packet loss probability over the wireless segment) that may be best obtained with help from the network (e.g. wireless access agent).Specifically, we use Maximum Likelihood Ratio tests to evaluate TCP as a classifier of the type of packet loss. We study the effectiveness of short-term classification of packet errors (congestion vs. wireless), given stationary prior error probabilities and distributions of packet delays conditioned on the type of packet loss (measured over a larger time scale). Using our Bayesian-based approach and extensive simulations, we demonstrate that congestion-induced losses and losses due to wireless transmission errors produce sufficiently different statistics upon which an efficient online error classifier can be built. We introduce a simple queueing model to underline the conditional delay distributions arising from different kinds of packet losses over a heterogeneous wired/wireless path. We show how Hidden Markov Models (HMMs) can be used by a TCP connection to infer efficiently conditional delay distributions. We demonstrate how estimation accuracy is influenced by different proportions of congestion versus wireless losses and penalties on incorrect classification.
Resumo:
Internet Traffic Managers (ITMs) are special machines placed at strategic places in the Internet. itmBench is an interface that allows users (e.g. network managers, service providers, or experimental researchers) to register different traffic control functionalities to run on one ITM or an overlay of ITMs. Thus itmBench offers a tool that is extensible and powerful yet easy to maintain. ITM traffic control applications could be developed either using a kernel API so they run in kernel space, or using a user-space API so they run in user space. We demonstrate the flexibility of itmBench by showing the implementation of both a kernel module that provides a differentiated network service, and a user-space module that provides an overlay routing service. Our itmBench Linux-based prototype is free software and can be obtained from http://www.cs.bu.edu/groups/itm/.
Resumo:
We present a type system, StaXML, which employs the stacked type syntax to represent essential aspects of the potential roles of XML fragments to the structure of complete XML documents. The simplest application of this system is to enforce well-formedness upon the construction of XML documents without requiring the use of templates or balanced "gap plugging" operators; this allows it to be applied to programs written according to common imperative web scripting idioms, particularly the echoing of unbalanced XML fragments to an output buffer. The system can be extended to verify particular XML applications such as XHTML and identifying individual XML tags constructed from their lexical components. We also present StaXML for PHP, a prototype precompiler for the PHP4 scripting language which infers StaXML types for expressions without assistance from the programmer.
Resumo:
Anomalies are unusual and significant changes in a network's traffic levels, which can often involve multiple links. Diagnosing anomalies is critical for both network operators and end users. It is a difficult problem because one must extract and interpret anomalous patterns from large amounts of high-dimensional, noisy data. In this paper we propose a general method to diagnose anomalies. This method is based on a separation of the high-dimensional space occupied by a set of network traffic measurements into disjoint subspaces corresponding to normal and anomalous network conditions. We show that this separation can be performed effectively using Principal Component Analysis. Using only simple traffic measurements from links, we study volume anomalies and show that the method can: (1) accurately detect when a volume anomaly is occurring; (2) correctly identify the underlying origin-destination (OD) flow which is the source of the anomaly; and (3) accurately estimate the amount of traffic involved in the anomalous OD flow. We evaluate the method's ability to diagnose (i.e., detect, identify, and quantify) both existing and synthetically injected volume anomalies in real traffic from two backbone networks. Our method consistently diagnoses the largest volume anomalies, and does so with a very low false alarm rate.
Resumo:
The Border Gateway Protocol (BGP) is an interdomain routing protocol that allows each Autonomous System (AS) to define its own routing policies independently and use them to select the best routes. By means of policies, ASes are able to prevent some traffic from accessing their resources, or direct their traffic to a preferred route. However, this flexibility comes at the expense of a possibility of divergence behavior because of mutually conflicting policies. Since BGP is not guaranteed to converge even in the absence of network topology changes, it is not safe. In this paper, we propose a randomized approach to providing safety in BGP. The proposed algorithm dynamically detects policy conflicts, and tries to eliminate the conflict by changing the local preference of the paths involved. Both the detection and elimination of policy conflicts are performed locally, i.e. by using only local information. Randomization is introduced to prevent synchronous updates of the local preferences of the paths involved in the same conflict.
Resumo:
Accurate knowledge of traffic demands in a communication network enables or enhances a variety of traffic engineering and network management tasks of paramount importance for operational networks. Directly measuring a complete set of these demands is prohibitively expensive because of the huge amounts of data that must be collected and the performance impact that such measurements would impose on the regular behavior of the network. As a consequence, we must rely on statistical techniques to produce estimates of actual traffic demands from partial information. The performance of such techniques is however limited due to their reliance on limited information and the high amount of computations they incur, which limits their convergence behavior. In this paper we study a two-step approach for inferring network traffic demands. First we elaborate and evaluate a modeling approach for generating good starting points to be fed to iterative statistical inference techniques. We call these starting points informed priors since they are obtained using actual network information such as packet traces and SNMP link counts. Second we provide a very fast variant of the EM algorithm which extends its computation range, increasing its accuracy and decreasing its dependence on the quality of the starting point. Finally, we evaluate and compare alternative mechanisms for generating starting points and the convergence characteristics of our EM algorithm against a recently proposed Weighted Least Squares approach.
Resumo:
A framework for the simultaneous localization and recognition of dynamic hand gestures is proposed. At the core of this framework is a dynamic space-time warping (DSTW) algorithm, that aligns a pair of query and model gestures in both space and time. For every frame of the query sequence, feature detectors generate multiple hand region candidates. Dynamic programming is then used to compute both a global matching cost, which is used to recognize the query gesture, and a warping path, which aligns the query and model sequences in time, and also finds the best hand candidate region in every query frame. The proposed framework includes translation invariant recognition of gestures, a desirable property for many HCI systems. The performance of the approach is evaluated on a dataset of hand signed digits gestured by people wearing short sleeve shirts, in front of a background containing other non-hand skin-colored objects. The algorithm simultaneously localizes the gesturing hand and recognizes the hand-signed digit. Although DSTW is illustrated in a gesture recognition setting, the proposed algorithm is a general method for matching time series, that allows for multiple candidate feature vectors to be extracted at each time step.
Resumo:
BoostMap is a recently proposed method for efficient approximate nearest neighbor retrieval in arbitrary non-Euclidean spaces with computationally expensive and possibly non-metric distance measures. Database and query objects are embedded into a Euclidean space, in which similarities can be rapidly measured using a weighted Manhattan distance. The key idea is formulating embedding construction as a machine learning task, where AdaBoost is used to combine simple, 1D embeddings into a multidimensional embedding that preserves a large amount of the proximity structure of the original space. This paper demonstrates that, using the machine learning formulation of BoostMap, we can optimize embeddings for indexing and classification, in ways that are not possible with existing alternatives for constructive embeddings, and without additional costs in retrieval time. First, we show how to construct embeddings that are query-sensitive, in the sense that they yield a different distance measure for different queries, so as to improve nearest neighbor retrieval accuracy for each query. Second, we show how to optimize embeddings for nearest neighbor classification tasks, by tuning them to approximate a parameter space distance measure, instead of the original feature-based distance measure.