27 resultados para Servers

em Boston University Digital Common


Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper we examine a number of admission control and scheduling protocols for high-performance web servers based on a 2-phase policy for serving HTTP requests. The first "registration" phase involves establishing the TCP connection for the HTTP request and parsing/interpreting its arguments, whereas the second "service" phase involves the service/transmission of data in response to the HTTP request. By introducing a delay between these two phases, we show that the performance of a web server could be potentially improved through the adoption of a number of scheduling policies that optimize the utilization of various system components (e.g. memory cache and I/O). In addition, to its premise for improving the performance of a single web server, the delineation between the registration and service phases of an HTTP request may be useful for load balancing purposes on clusters of web servers. We are investigating the use of such a mechanism as part of the Commonwealth testbed being developed at Boston University.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Energy-efficient communication has recently become a key challenge for both researchers and industries. In this paper, we propose a new model in which a Content Provider and an Internet Service Provider cooperate to reduce the total power consumption. We solve the problem optimally and compare it with a classic formulation, whose aim is to minimize user delay. Results, although preliminary, show that power savings can be huge: up to 71% on real ISP topologies. We also show how the degree of cooperation impacts overall power consumption. Finally, we consider the impact of the Content Provider location on the total power savings.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper, we propose and evaluate an implementation of a prototype scalable web server. The prototype consists of a load-balanced cluster of hosts that collectively accept and service TCP connections. The host IP addresses are advertised using the Round Robin DNS technique, allowing any host to receive requests from any client. Once a client attempts to establish a TCP connection with one of the hosts, a decision is made as to whether or not the connection should be redirected to a different host---namely, the host with the lowest number of established connections. We use the low-overhead Distributed Packet Rewriting (DPR) technique to redirect TCP connections. In our prototype, each host keeps information about connections in hash tables and linked lists. Every time a packet arrives, it is examined to see if it has to be redirected or not. Load information is maintained using periodic broadcasts amongst the cluster hosts.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Under high loads, a Web server may be servicing many hundreds of connections concurrently. In traditional Web servers, the question of the order in which concurrent connections are serviced has been left to the operating system. In this paper we ask whether servers might provide better service by using non-traditional service ordering. In particular, for the case when a Web server is serving static files, we examine the costs and benefits of a policy that gives preferential service to short connections. We start by assessing the scheduling behavior of a commonly used server (Apache running on Linux) with respect to connection size and show that it does not appear to provide preferential service to short connections. We then examine the potential performance improvements of a policy that does favor short connections (shortest-connection-first). We show that mean response time can be improved by factors of four or five under shortest-connection-first, as compared to an (Apache-like) size-independent policy. Finally we assess the costs of shortest-connection-first scheduling in terms of unfairness (i.e., the degree to which long connections suffer). We show that under shortest-connection-first scheduling, long connections pay very little penalty. This surprising result can be understood as a consequence of heavy-tailed Web server workloads, in which most connections are small, but most server load is due to the few large connections. We support this explanation using analysis.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

A well-known paradigm for load balancing in distributed systems is the``power of two choices,''whereby an item is stored at the less loaded of two (or more) random alternative servers. We investigate the power of two choices in natural settings for distributed computing where items and servers reside in a geometric space and each item is associated with the server that is its nearest neighbor. This is in fact the backdrop for distributed hash tables such as Chord, where the geometric space is determined by clockwise distance on a one-dimensional ring. Theoretically, we consider the following load balancing problem. Suppose that servers are initially hashed uniformly at random to points in the space. Sequentially, each item then considers d candidate insertion points also chosen uniformly at random from the space,and selects the insertion point whose associated server has the least load. For the one-dimensional ring, and for Euclidean distance on the two-dimensional torus, we demonstrate that when n data items are hashed to n servers,the maximum load at any server is log log n / log d + O(1) with high probability. While our results match the well-known bounds in the standard setting in which each server is selected equiprobably, our applications do not have this feature, since the sizes of the nearest-neighbor regions around servers are non-uniform. Therefore, the novelty in our methods lies in developing appropriate tail bounds on the distribution of nearest-neighbor region sizes and in adapting previous arguments to this more general setting. In addition, we provide simulation results demonstrating the load balance that results as the system size scales into the millions.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

We analyzed the logs of our departmental HTTP server http://cs-www.bu.edu as well as the logs of the more popular Rolling Stones HTTP server http://www.stones.com. These servers have very different purposes; the former caters primarily to local clients, whereas the latter caters exclusively to remote clients all over the world. In both cases, our analysis showed that remote HTTP accesses were confined to a very small subset of documents. Using a validated analytical model of server popularity and file access profiles, we show that by disseminating the most popular documents on servers (proxies) closer to the clients, network traffic could be reduced considerably, while server loads are balanced. We argue that this process could be generalized so as to provide for an automated demand-based duplication of documents. We believe that such server-based information dissemination protocols will be more effective at reducing both network bandwidth and document retrieval times than client-based caching protocols [2].

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Speculative service implies that a client's request for a document is serviced by sending, in addition to the document requested, a number of other documents (or pointers thereto) that the server speculates will be requested by the client in the near future. This speculation is based on statistical information that the server maintains for each document it serves. The notion of speculative service is analogous to prefetching, which is used to improve cache performance in distributed/parallel shared memory systems, with the exception that servers (not clients) control when and what to prefetch. Using trace simulations based on the logs of our departmental HTTP server http://cs-www.bu.edu, we show that both server load and service time could be reduced considerably, if speculative service is used. This is above and beyond what is currently achievable using client-side caching [3] and server-side dissemination [2]. We identify a number of parameters that could be used to fine-tune the level of speculation performed by the server.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The World Wide Web (WWW or Web) is growing rapidly on the Internet. Web users want fast response time and easy access to a enormous variety of information across the world. Thus, performance is becoming a main issue in the Web. Fractals have been used to study fluctuating phenomena in many different disciplines, from the distribution of galaxies in astronomy to complex physiological control systems. The Web is also a complex, irregular, and random system. In this paper, we look at the document reference pattern at Internet Web servers and use fractal-based models to understand aspects (e.g. caching schemes) that affect the Web performance.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The quality of available network connections can often have a large impact on the performance of distributed applications. For example, document transfer applications such as FTP, Gopher and the World Wide Web suffer increased response times as a result of network congestion. For these applications, the document transfer time is directly related to the available bandwidth of the connection. Available bandwidth depends on two things: 1) the underlying capacity of the path from client to server, which is limited by the bottleneck link; and 2) the amount of other traffic competing for links on the path. If measurements of these quantities were available to the application, the current utilization of connections could be calculated. Network utilization could then be used as a basis for selection from a set of alternative connections or servers, thus providing reduced response time. Such a dynamic server selection scheme would be especially important in a mobile computing environment in which the set of available servers is frequently changing. In order to provide these measurements at the application level, we introduce two tools: bprobe, which provides an estimate of the uncongested bandwidth of a path; and cprobe, which gives an estimate of the current congestion along a path. These two measures may be used in combination to provide the application with an estimate of available bandwidth between server and client thereby enabling application-level congestion avoidance. In this paper we discuss the design and implementation of our probe tools, specifically illustrating the techniques used to achieve accuracy and robustness. We present validation studies for both tools which demonstrate their reliability in the face of actual Internet conditions; and we give results of a survey of available bandwidth to a random set of WWW servers as a sample application of our probe technique. We conclude with descriptions of other applications of our measurement tools, several of which are currently under development.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The exploding demand for services like the World Wide Web reflects the potential that is presented by globally distributed information systems. The number of WWW servers world-wide has doubled every 3 to 5 months since 1993, outstripping even the growth of the Internet. At each of these self-managed sites, the Common Gateway Interface (CGI) and Hypertext Transfer Protocol (HTTP) already constitute a rudimentary basis for contributing local resources to remote collaborations. However, the Web has serious deficiencies that make it unsuited for use as a true medium for metacomputing --- the process of bringing hardware, software, and expertise from many geographically dispersed sources to bear on large scale problems. These deficiencies are, paradoxically, the direct result of the very simple design principles that enabled its exponential growth. There are many symptoms of the problems exhibited by the Web: disk and network resources are consumed extravagantly; information search and discovery are difficult; protocols are aimed at data movement rather than task migration, and ignore the potential for distributing computation. However, all of these can be seen as aspects of a single problem: as a distributed system for metacomputing, the Web offers unpredictable performance and unreliable results. The goal of our project is to use the Web as a medium (within either the global Internet or an enterprise intranet) for metacomputing in a reliable way with performance guarantees. We attack this problem one four levels: (1) Resource Management Services: Globally distributed computing allows novel approaches to the old problems of performance guarantees and reliability. Our first set of ideas involve setting up a family of real-time resource management models organized by the Web Computing Framework with a standard Resource Management Interface (RMI), a Resource Registry, a Task Registry, and resource management protocols to allow resource needs and availability information be collected and disseminated so that a family of algorithms with varying computational precision and accuracy of representations can be chosen to meet realtime and reliability constraints. (2) Middleware Services: Complementary to techniques for allocating and scheduling available resources to serve application needs under realtime and reliability constraints, the second set of ideas aim at reduce communication latency, traffic congestion, server work load, etc. We develop customizable middleware services to exploit application characteristics in traffic analysis to drive new server/browser design strategies (e.g., exploit self-similarity of Web traffic), derive document access patterns via multiserver cooperation, and use them in speculative prefetching, document caching, and aggressive replication to reduce server load and bandwidth requirements. (3) Communication Infrastructure: Finally, to achieve any guarantee of quality of service or performance, one must get at the network layer that can provide the basic guarantees of bandwidth, latency, and reliability. Therefore, the third area is a set of new techniques in network service and protocol designs. (4) Object-Oriented Web Computing Framework A useful resource management system must deal with job priority, fault-tolerance, quality of service, complex resources such as ATM channels, probabilistic models, etc., and models must be tailored to represent the best tradeoff for a particular setting. This requires a family of models, organized within an object-oriented framework, because no one-size-fits-all approach is appropriate. This presents a software engineering challenge requiring integration of solutions at all levels: algorithms, models, protocols, and profiling and monitoring tools. The framework captures the abstract class interfaces of the collection of cooperating components, but allows the concretization of each component to be driven by the requirements of a specific approach and environment.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

As the World Wide Web (Web) is increasingly adopted as the infrastructure for large-scale distributed information systems, issues of performance modeling become ever more critical. In particular, locality of reference is an important property in the performance modeling of distributed information systems. In the case of the Web, understanding the nature of reference locality will help improve the design of middleware, such as caching, prefetching, and document dissemination systems. For example, good measurements of reference locality would allow us to generate synthetic reference streams with accurate performance characteristics, would allow us to compare empirically measured streams to explain differences, and would allow us to predict expected performance for system design and capacity planning. In this paper we propose models for both temporal and spatial locality of reference in streams of requests arriving at Web servers. We show that simple models based only on document popularity (likelihood of reference) are insufficient for capturing either temporal or spatial locality. Instead, we rely on an equivalent, but numerical, representation of a reference stream: a stack distance trace. We show that temporal locality can be characterized by the marginal distribution of the stack distance trace, and we propose models for typical distributions and compare their cache performance to our traces. We also show that spatial locality in a reference stream can be characterized using the notion of self-similarity. Self-similarity describes long-range correlations in the dataset, which is a property that previous researchers have found hard to incorporate into synthetic reference strings. We show that stack distance strings appear to be strongly self-similar, and we provide measurements of the degree of self-similarity in our traces. Finally, we discuss methods for generating synthetic Web traces that exhibit the properties of temporal and spatial locality that we measured in our data.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper addresses the problem of analyzing performance of WWW servers. The web has experienced a phenomenal growth and has become the most popular Internet application. As a consequence of its large popularity, the Internet has suffered from various performance problems, such as network congestion and overloaded servers. These days, it is not uncommon to find servers refusing connections because they are overloaded. Performance has always been a key issue in the design and operation of on-line systems. With regard to Internet, performance is also critical, because users want fast and easy access to all objects (i.e., documents, pictures, audio, and video) available on the net. Thus, it is important to understand WWW performance issues. This paper focuses on the performance analysis of a Web server. Using a synthetic benchmark (WebStone), we analyze three different Web server software running on top of a Windows NT platform and performing some typical WWW tasks.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Server performance has become a crucial issue for improving the overall performance of the World-Wide Web. This paper describes Webmonitor, a tool for evaluating and understanding server performance, and presents new results for a realistic workload. Webmonitor measures activity and resource consumption, both within the kernel and in HTTP processes running in user space. Webmonitor is implemented using an efficient combination of sampling and event-driven techniques that exhibit low overhead. Our initial implementation is for the Apache World-Wide Web server running on the Linux operating system. We demonstrate the utility of Webmonitor by measuring and understanding the performance of a Pentium-based PC acting as a dedicated WWW server. Our workload uses a file size distribution with a heavy tail. This captures the fact that Web servers must concurrently handle some requests for large audio and video files, and a large number of requests for small documents, containing text or images. Our results show that in a Web server saturated by client requests, over 90% of the time spent handling HTTP requests is spent in the kernel. Furthermore, keeping TCP connections open, as required by TCP, causes a factor of 2-9 increase in the elapsed time required to service an HTTP request. Data gathered from Webmonitor provide insight into the causes of this performance penalty. Specifically, we observe a significant increase in resource consumption along three dimensions: the number of HTTP processes running at the same time, CPU utilization, and memory utilization. These results emphasize the important role of operating system and network protocol implementation in determining Web server performance.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

One role for workload generation is as a means for understanding how servers and networks respond to variation in load. This enables management and capacity planning based on current and projected usage. This paper applies a number of observations of Web server usage to create a realistic Web workload generation tool which mimics a set of real users accessing a server. The tool, called Surge (Scalable URL Reference Generator) generates references matching empirical measurements of 1) server file size distribution; 2) request size distribution; 3) relative file popularity; 4) embedded file references; 5) temporal locality of reference; and 6) idle periods of individual users. This paper reviews the essential elements required in the generation of a representative Web workload. It also addresses the technical challenges to satisfying this large set of simultaneous constraints on the properties of the reference stream, the solutions we adopted, and their associated accuracy. Finally, we present evidence that Surge exercises servers in a manner significantly different from other Web server benchmarks.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Reliability and availability have long been considered twin system properties that could be enhanced by distribution. Paradoxically, the traditional definitions of these properties do not recognize the positive impact of recovery as distinct from simple repair and restart on reliability, nor the negative effect of recovery, and of internetworking of clients and servers, on availability. As a result of employing the standard definitions, reliability would tend to be underestimated, and availability overestimated. We offer revised definitions of these two critical metrics, which we call service reliability and service availability, that improve the match between their formal expression, and intuitive meaning. A fortuitous advantage of our approach is that the product of our two metrics yields a highly meaningful figure of merit for the overall dependability of a system. But techniques that enhance system dependability exact a performance cost, so we conclude with a cohesive definition of performability that rewards the system for performance that is delivered to its client applications, after discounting the following consequences of failure: service denial and interruption, lost work, and recovery cost.