33 resultados para computer networks
em Boston University Digital Common
Resumo:
Numerous problems exist that can be modeled as traffic through a network in which constraints exist to regulate flow. Vehicular road travel, computer networks, and cloud based resource distribution, among others all have natural representations in this manner. As these networks grow in size and/or complexity, analysis and certification of the safety invariants becomes increasingly costly. The NetSketch formalism introduces a lightweight verification framework that allows for greater scalability than traditional analysis methods. The NetSketch tool was developed to provide the power of this formalism in an easy to use and intuitive user interface.
Resumo:
Nearest neighbor retrieval is the task of identifying, given a database of objects and a query object, the objects in the database that are the most similar to the query. Retrieving nearest neighbors is a necessary component of many practical applications, in fields as diverse as computer vision, pattern recognition, multimedia databases, bioinformatics, and computer networks. At the same time, finding nearest neighbors accurately and efficiently can be challenging, especially when the database contains a large number of objects, and when the underlying distance measure is computationally expensive. This thesis proposes new methods for improving the efficiency and accuracy of nearest neighbor retrieval and classification in spaces with computationally expensive distance measures. The proposed methods are domain-independent, and can be applied in arbitrary spaces, including non-Euclidean and non-metric spaces. In this thesis particular emphasis is given to computer vision applications related to object and shape recognition, where expensive non-Euclidean distance measures are often needed to achieve high accuracy. The first contribution of this thesis is the BoostMap algorithm for embedding arbitrary spaces into a vector space with a computationally efficient distance measure. Using this approach, an approximate set of nearest neighbors can be retrieved efficiently - often orders of magnitude faster than retrieval using the exact distance measure in the original space. The BoostMap algorithm has two key distinguishing features with respect to existing embedding methods. First, embedding construction explicitly maximizes the amount of nearest neighbor information preserved by the embedding. Second, embedding construction is treated as a machine learning problem, in contrast to existing methods that are based on geometric considerations. The second contribution is a method for constructing query-sensitive distance measures for the purposes of nearest neighbor retrieval and classification. In high-dimensional spaces, query-sensitive distance measures allow for automatic selection of the dimensions that are the most informative for each specific query object. It is shown theoretically and experimentally that query-sensitivity increases the modeling power of embeddings, allowing embeddings to capture a larger amount of the nearest neighbor structure of the original space. The third contribution is a method for speeding up nearest neighbor classification by combining multiple embedding-based nearest neighbor classifiers in a cascade. In a cascade, computationally efficient classifiers are used to quickly classify easy cases, and classifiers that are more computationally expensive and also more accurate are only applied to objects that are harder to classify. An interesting property of the proposed cascade method is that, under certain conditions, classification time actually decreases as the size of the database increases, a behavior that is in stark contrast to the behavior of typical nearest neighbor classification systems. The proposed methods are evaluated experimentally in several different applications: hand shape recognition, off-line character recognition, online character recognition, and efficient retrieval of time series. In all datasets, the proposed methods lead to significant improvements in accuracy and efficiency compared to existing state-of-the-art methods. In some datasets, the general-purpose methods introduced in this thesis even outperform domain-specific methods that have been custom-designed for such datasets.
Resumo:
For communication-intensive parallel applications, the maximum degree of concurrency achievable is limited by the communication throughput made available by the network. In previous work [HPS94], we showed experimentally that the performance of certain parallel applications running on a workstation network can be improved significantly if a congestion control protocol is used to enhance network performance. In this paper, we characterize and analyze the communication requirements of a large class of supercomputing applications that fall under the category of fixed-point problems, amenable to solution by parallel iterative methods. This results in a set of interface and architectural features sufficient for the efficient implementation of the applications over a large-scale distributed system. In particular, we propose a direct link between the application and network layer, supporting congestion control actions at both ends. This in turn enhances the system's responsiveness to network congestion, improving performance. Measurements are given showing the efficacy of our scheme to support large-scale parallel computations.
Resumo:
We study the impact of heterogeneity of nodes, in terms of their energy, in wireless sensor networks that are hierarchically clustered. In these networks some of the nodes become cluster heads, aggregate the data of their cluster members and transmit it to the sink. We assume that a percentage of the population of sensor nodes is equipped with additional energy resources-this is a source of heterogeneity which may result from the initial setting or as the operation of the network evolves. We also assume that the sensors are randomly (uniformly) distributed and are not mobile, the coordinates of the sink and the dimensions of the sensor field are known. We show that the behavior of such sensor networks becomes very unstable once the first node dies, especially in the presence of node heterogeneity. Classical clustering protocols assume that all the nodes are equipped with the same amount of energy and as a result, they can not take full advantage of the presence of node heterogeneity. We propose SEP, a heterogeneous-aware protocol to prolong the time interval before the death of the first node (we refer to as stability period), which is crucial for many applications where the feedback from the sensor network must be reliable. SEP is based on weighted election probabilities of each node to become cluster head according to the remaining energy in each node. We show by simulation that SEP always prolongs the stability period compared to (and that the average throughput is greater than) the one obtained using current clustering protocols. We conclude by studying the sensitivity of our SEP protocol to heterogeneity parameters capturing energy imbalance in the network. We found that SEP yields longer stability region for higher values of extra energy brought by more powerful nodes.
Resumo:
Wireless sensor networks have recently emerged as enablers of important applications such as environmental, chemical and nuclear sensing systems. Such applications have sophisticated spatial-temporal semantics that set them aside from traditional wireless networks. For example, the computation of temperature averaged over the sensor field must take into account local densities. This is crucial since otherwise the estimated average temperature can be biased by over-sampling areas where a lot more sensors exist. Thus, we envision that a fundamental service that a wireless sensor network should provide is that of estimating local densities. In this paper, we propose a lightweight probabilistic density inference protocol, we call DIP, which allows each sensor node to implicitly estimate its neighborhood size without the explicit exchange of node identifiers as in existing density discovery schemes. The theoretical basis of DIP is a probabilistic analysis which gives the relationship between the number of sensor nodes contending in the neighborhood of a node and the level of contention measured by that node. Extensive simulations confirm the premise of DIP: it can provide statistically reliable and accurate estimates of local density at a very low energy cost and constant running time. We demonstrate how applications could be built on top of our DIP-based service by computing density-unbiased statistics from estimated local densities.
Resumo:
Wireless sensor networks are characterized by limited energy resources. To conserve energy, application-specific aggregation (fusion) of data reports from multiple sensors can be beneficial in reducing the amount of data flowing over the network. Furthermore, controlling the topology by scheduling the activity of nodes between active and sleep modes has often been used to uniformly distribute the energy consumption among all nodes by de-synchronizing their activities. We present an integrated analytical model to study the joint performance of in-network aggregation and topology control. We define performance metrics that capture the tradeoffs among delay, energy, and fidelity of the aggregation. Our results indicate that to achieve high fidelity levels under medium to high event reporting load, shorter and fatter aggregation/routing trees (toward the sink) offer the best delay-energy tradeoff as long as topology control is well coordinated with routing.
Resumo:
Routing protocols in wireless sensor networks (WSN) face two main challenges: first, the challenging environments in which WSNs are deployed negatively affect the quality of the routing process. Therefore, routing protocols for WSNs should recognize and react to node failures and packet losses. Second, sensor nodes are battery-powered, which makes power a scarce resource. Routing protocols should optimize power consumption to prolong the lifetime of the WSN. In this paper, we present a new adaptive routing protocol for WSNs, we call it M^2RC. M^2RC has two phases: mesh establishment phase and data forwarding phase. In the first phase, M^2RC establishes the routing state to enable multipath data forwarding. In the second phase, M^2RC forwards data packets from the source to the sink. Targeting hop-by-hop reliability, an M^2RC forwarding node waits for an acknowledgement (ACK) that its packets were correctly received at the next neighbor. Based on this feedback, an M^2RC node applies multiplicative-increase/additive-decrease (MIAD) to control the number of neighbors targeted by its packet broadcast. We simulated M^2RC in the ns-2 simulator and compared it to GRAB, Max-power, and Min-power routing schemes. Our simulations show that M^2RC achieves the highest throughput with at least 10-30% less consumed power per delivered report in scenarios where a certain number of nodes unexpectedly fail.
Resumo:
Recent work in sensor databases has focused extensively on distributed query problems, notably distributed computation of aggregates. Existing methods for computing aggregates broadcast queries to all sensors and use in-network aggregation of responses to minimize messaging costs. In this work, we focus on uniform random sampling across nodes, which can serve both as an alternative building block for aggregation and as an integral component of many other useful randomized algorithms. Prior to our work, the best existing proposals for uniform random sampling of sensors involve contacting all nodes in the network. We propose a practical method which is only approximately uniform, but contacts a number of sensors proportional to the diameter of the network instead of its size. The approximation achieved is tunably close to exact uniform sampling, and only relies on well-known existing primitives, namely geographic routing, distributed computation of Voronoi regions and von Neumann's rejection method. Ultimately, our sampling algorithm has the same worst-case asymptotic cost as routing a point-to-point message, and thus it is asymptotically optimal among request/reply-based sampling methods. We provide experimental results demonstrating the effectiveness of our algorithm on both synthetic and real sensor topologies.
Resumo:
The quality of available network connections can often have a large impact on the performance of distributed applications. For example, document transfer applications such as FTP, Gopher and the World Wide Web suffer increased response times as a result of network congestion. For these applications, the document transfer time is directly related to the available bandwidth of the connection. Available bandwidth depends on two things: 1) the underlying capacity of the path from client to server, which is limited by the bottleneck link; and 2) the amount of other traffic competing for links on the path. If measurements of these quantities were available to the application, the current utilization of connections could be calculated. Network utilization could then be used as a basis for selection from a set of alternative connections or servers, thus providing reduced response time. Such a dynamic server selection scheme would be especially important in a mobile computing environment in which the set of available servers is frequently changing. In order to provide these measurements at the application level, we introduce two tools: bprobe, which provides an estimate of the uncongested bandwidth of a path; and cprobe, which gives an estimate of the current congestion along a path. These two measures may be used in combination to provide the application with an estimate of available bandwidth between server and client thereby enabling application-level congestion avoidance. In this paper we discuss the design and implementation of our probe tools, specifically illustrating the techniques used to achieve accuracy and robustness. We present validation studies for both tools which demonstrate their reliability in the face of actual Internet conditions; and we give results of a survey of available bandwidth to a random set of WWW servers as a sample application of our probe technique. We conclude with descriptions of other applications of our measurement tools, several of which are currently under development.
Resumo:
Replication is a commonly proposed solution to problems of scale associated with distributed services. However, when a service is replicated, each client must be assigned a server. Prior work has generally assumed that assignment to be static. In contrast, we propose dynamic server selection, and show that it enables application-level congestion avoidance. To make dynamic server selection practical, we demonstrate the use of three tools. In addition to direct measurements of round-trip latency, we introduce and validate two new tools: bprobe, which estimates the maximum possible bandwidth along a given path; and cprobe, which estimates the current congestion along a path. Using these tools we demonstrate dynamic server selection and compare it to previous static approaches. We show that dynamic server selection consistently outperforms static policies by as much as 50%. Furthermore, we demonstrate the importance of each of our tools in performing dynamic server selection.
Resumo:
The popularity of TCP/IP coupled with the premise of high speed communication using Asynchronous Transfer Mode (ATM) technology have prompted the network research community to propose a number of techniques to adapt TCP/IP to ATM network environments. ATM offers Available Bit Rate (ABR) and Unspecified Bit Rate (UBR) services for best-effort traffic, such as conventional file transfer. However, recent studies have shown that TCP/IP, when implemented using ABR or UBR, leads to serious performance degradations, especially when the utilization of network resources (such as switch buffers) is high. Proposed techniques-switch-level enhancements, for example-that attempt to patch up TCP/IP over ATMs have had limited success in alleviating this problem. The major reason for TCP/IP's poor performance over ATMs has been consistently attributed to packet fragmentation, which is the result of ATM's 53-byte cell-oriented switching architecture. In this paper, we present a new transport protocol, TCP Boston, that turns ATM's 53-byte cell-oriented switching architecture into an advantage for TCP/IP. At the core of TCP Boston is the Adaptive Information Dispersal Algorithm (AIDA), an efficient encoding technique that allows for dynamic redundancy control. AIDA makes TCP/IP's performance less sensitive to cell losses, thus ensuring a graceful degradation of TCP/IP's performance when faced with congested resources. In this paper, we introduce AIDA and overview the main features of TCP Boston. We present detailed simulation results that show the superiority of our protocol when compared to other adaptations of TCP/IP over ATMs. In particular, we show that TCP Boston improves TCP/IP's performance over ATMs for both network-centric metrics (e.g., effective throughput) and application-centric metrics (e.g., response time).
Resumo:
High-speed networks, such as ATM networks, are expected to support diverse Quality of Service (QoS) constraints, including real-time QoS guarantees. Real-time QoS is required by many applications such as those that involve voice and video communication. To support such services, routing algorithms that allow applications to reserve the needed bandwidth over a Virtual Circuit (VC) have been proposed. Commonly, these bandwidth-reservation algorithms assign VCs to routes using the least-loaded concept, and thus result in balancing the load over the set of all candidate routes. In this paper, we show that for such reservation-based protocols|which allow for the exclusive use of a preset fraction of a resource's bandwidth for an extended period of time-load balancing is not desirable as it results in resource fragmentation, which adversely affects the likelihood of accepting new reservations. In particular, we show that load-balancing VC routing algorithms are not appropriate when the main objective of the routing protocol is to increase the probability of finding routes that satisfy incoming VC requests, as opposed to equalizing the bandwidth utilization along the various routes. We present an on-line VC routing scheme that is based on the concept of "load profiling", which allows a distribution of "available" bandwidth across a set of candidate routes to match the characteristics of incoming VC QoS requests. We show the effectiveness of our load-profiling approach when compared to traditional load-balancing and load-packing VC routing schemes.
Resumo:
Overlay networks have emerged as a powerful and highly flexible method for delivering content. We study how to optimize throughput of large, multipoint transfers across richly connected overlay networks, focusing on the question of what to put in each transmitted packet. We first make the case for transmitting encoded content in this scenario, arguing for the digital fountain approach which enables end-hosts to efficiently restitute the original content of size n from a subset of any n symbols from a large universe of encoded symbols. Such an approach affords reliability and a substantial degree of application-level flexibility, as it seamlessly tolerates packet loss, connection migration, and parallel transfers. However, since the sets of symbols acquired by peers are likely to overlap substantially, care must be taken to enable them to collaborate effectively. We provide a collection of useful algorithmic tools for efficient estimation, summarization, and approximate reconciliation of sets of symbols between pairs of collaborating peers, all of which keep messaging complexity and computation to a minimum. Through simulations and experiments on a prototype implementation, we demonstrate the performance benefits of our informed content delivery mechanisms and how they complement existing overlay network architectures.
Resumo:
The current congestion-oriented design of TCP hinders its ability to perform well in hybrid wireless/wired networks. We propose a new improvement on TCP NewReno (NewReno-FF) using a new loss labeling technique to discriminate wireless from congestion losses. The proposed technique is based on the estimation of average and variance of the round trip time using a filter cal led Flip Flop filter that is augmented with history information. We show the comparative performance of TCP NewReno, NewReno-FF, and TCP Westwood through extensive simulations. We study the fundamental gains and limits using TCP NewReno with varying Loss Labeling accuracy (NewReno-LL) as a benchmark. Lastly our investigation opens up important research directions. First, there is a need for a finer grained classification of losses (even within congestion and wireless losses) for TCP in heterogeneous networks. Second, it is essential to develop an appropriate control strategy for recovery after the correct classification of a packet loss.
Resumo:
Traditionally, slotted communication protocols have employed guard times to delineate and align slots. These guard times may expand the slot duration significantly, especially when clocks are allowed to drift for longer time to reduce clock synchronization overhead. Recently, a new class of lightweight protocols for statistical estimation in wireless sensor networks have been proposed. This new class requires very short transmission durations (jam signals), thus the traditional approach of using guard times would impose significant overhead. We propose a new, more efficient algorithm to align slots. Based on geometrical properties of space, we prove that our approach bounds the slot duration by only a constant factor of what is needed. Furthermore, we show by simulation that this bound is loose and an even smaller slot duration is required, making our approach even more efficient.