14 resultados para Cache replacement policies

em Boston University Digital Common


Relevância:

100.00% 100.00%

Publicador:

Resumo:

There has been considerable work done in the study of Web reference streams: sequences of requests for Web objects. In particular, many studies have looked at the locality properties of such streams, because of the impact of locality on the design and performance of caching and prefetching systems. However, a general framework for understanding why reference streams exhibit given locality properties has not yet emerged. In this work we take a first step in this direction, based on viewing the Web as a set of reference streams that are transformed by Web components (clients, servers, and intermediaries). We propose a graph-based framework for describing this collection of streams and components. We identify three basic stream transformations that occur at nodes of the graph: aggregation, disaggregation and filtering, and we show how these transformations can be used to abstract the effects of different Web components on their associated reference streams. This view allows a structured approach to the analysis of why reference streams show given properties at different points in the Web. Applying this approach to the study of locality requires good metrics for locality. These metrics must meet three criteria: 1) they must accurately capture temporal locality; 2) they must be independent of trace artifacts such as trace length; and 3) they must not involve manual procedures or model-based assumptions. We describe two metrics meeting these criteria that each capture a different kind of temporal locality in reference streams. The popularity component of temporal locality is captured by entropy, while the correlation component is captured by interreference coefficient of variation. We argue that these metrics are more natural and more useful than previously proposed metrics for temporal locality. We use this framework to analyze a diverse set of Web reference traces. We find that this framework can shed light on how and why locality properties vary across different locations in the Web topology. For example, we find that filtering and aggregation have opposing effects on the popularity component of the temporal locality, which helps to explain why multilevel caching can be effective in the Web. Furthermore, we find that all transformations tend to diminish the correlation component of temporal locality, which has implications for the utility of different cache replacement policies at different points in the Web.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Web caching aims to reduce network traffic, server load, and user-perceived retrieval delays by replicating "popular" content on proxy caches that are strategically placed within the network. While key to effective cache utilization, popularity information (e.g. relative access frequencies of objects requested through a proxy) is seldom incorporated directly in cache replacement algorithms. Rather, other properties of the request stream (e.g. temporal locality and content size), which are easier to capture in an on-line fashion, are used to indirectly infer popularity information, and hence drive cache replacement policies. Recent studies suggest that the correlation between these secondary properties and popularity is weakening due in part to the prevalence of efficient client and proxy caches (which tend to mask these correlations). This trend points to the need for proxy cache replacement algorithms that directly capture and use popularity information. In this paper, we (1) present an on-line algorithm that effectively captures and maintains an accurate popularity profile of Web objects requested through a caching proxy, (2) propose a novel cache replacement policy that uses such information to generalize the well-known GreedyDual-Size algorithm, and (3) show the superiority of our proposed algorithm by comparing it to a host of recently-proposed and widely-used algorithms using extensive trace-driven simulations and a variety of performance metrics.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Temporal locality of reference in Web request streams emerges from two distinct phenomena: the popularity of Web objects and the {\em temporal correlation} of requests. Capturing these two elements of temporal locality is important because it enables cache replacement policies to adjust how they capitalize on temporal locality based on the relative prevalence of these phenomena. In this paper, we show that temporal locality metrics proposed in the literature are unable to delineate between these two sources of temporal locality. In particular, we show that the commonly-used distribution of reference interarrival times is predominantly determined by the power law governing the popularity of documents in a request stream. To capture (and more importantly quantify) both sources of temporal locality in a request stream, we propose a new and robust metric that enables accurate delineation between locality due to popularity and that due to temporal correlation. Using this metric, we characterize the locality of reference in a number of representative proxy cache traces. Our findings show that there are measurable differences between the degrees (and sources) of temporal locality across these traces, and that these differences are effectively captured using our proposed metric. We illustrate the significance of our findings by summarizing the performance of a novel Web cache replacement policy---called GreedyDual*---which exploits both long-term popularity and short-term temporal correlation in an adaptive fashion. Our trace-driven simulation experiments (which are detailed in an accompanying Technical Report) show the superior performance of GreedyDual* when compared to other Web cache replacement policies.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The relative importance of long-term popularity and short-term temporal correlation of references for Web cache replacement policies has not been studied thoroughly. This is partially due to the lack of accurate characterization of temporal locality that enables the identification of the relative strengths of these two sources of temporal locality in a reference stream. In [21], we have proposed such a metric and have shown that Web reference streams differ significantly in the prevalence of these two sources of temporal locality. These finding underscore the importance of a Web caching strategy that can adapt in a dynamic fashion to the prevalence of these two sources of temporal locality. In this paper, we propose a novel cache replacement algorithm, GreedyDual*, which is a generalization of GreedyDual-Size. GreedyDual* uses the metrics proposed in [21] to adjust the relative worth of long-term popularity versus short-term temporal correlation of references. Our trace-driven simulation experiments show the superior performance of GreedyDual* when compared to other Web cache replacement policies proposed in the literature.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In many networked applications, independent caching agents cooperate by servicing each other's miss streams, without revealing the operational details of the caching mechanisms they employ. Inference of such details could be instrumental for many other processes. For example, it could be used for optimized forwarding (or routing) of one's own miss stream (or content) to available proxy caches, or for making cache-aware resource management decisions. In this paper, we introduce the Cache Inference Problem (CIP) as that of inferring the characteristics of a caching agent, given the miss stream of that agent. While CIP is insolvable in its most general form, there are special cases of practical importance in which it is, including when the request stream follows an Independent Reference Model (IRM) with generalized power-law (GPL) demand distribution. To that end, we design two basic "litmus" tests that are able to detect LFU and LRU replacement policies, the effective size of the cache and of the object universe, and the skewness of the GPL demand for objects. Using extensive experiments under synthetic as well as real traces, we show that our methods infer such characteristics accurately and quite efficiently, and that they remain robust even when the IRM/GPL assumptions do not hold, and even when the underlying replacement policies are not "pure" LFU or LRU. We exemplify the value of our inference framework by considering example applications.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Although cooperation generally increases the amount of resources available to a community of nodes, thus improving individual and collective performance, it also allows for the appearance of potential mistreatment problems through the exposition of one node’s resources to others. We study such concerns by considering a group of independent, rational, self-aware nodes that cooperate using on-line caching algorithms, where the exposed resource is the storage of each node. Motivated by content networking applications – including web caching, CDNs, and P2P – this paper extends our previous work on the off-line version of the problem, which was limited to object replication and was conducted under a game-theoretic framework. We identify and investigate two causes of mistreatment: (1) cache state interactions (due to the cooperative servicing of requests) and (2) the adoption of a common scheme for cache replacement/redirection/admission policies. Using analytic models, numerical solutions of these models, as well as simulation experiments, we show that online cooperation schemes using caching are fairly robust to mistreatment caused by state interactions. When this becomes possible, the interaction through the exchange of miss-streams has to be very intense, making it feasible for the mistreated nodes to detect and react to the exploitation. This robustness ceases to exist when nodes fetch and store objects in response to remote requests, i.e., when they operate as Level-2 caches (or proxies) for other nodes. Regarding mistreatment due to a common scheme, we show that this can easily take place when the “outlier” characteristics of some of the nodes get overlooked. This finding underscores the importance of allowing cooperative caching nodes the flexibility of choosing from a diverse set of schemes to fit the peculiarities of individual nodes. To that end, we outline an emulation-based framework for the development of mistreatment-resilient distributed selfish caching schemes.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Although cooperation generally increases the amount of resources available to a community of nodes, thus improving individual and collective performance, it also allows for the appearance of potential mistreatment problems through the exposition of one node's resources to others. We study such concerns by considering a group of independent, rational, self-aware nodes that cooperate using on-line caching algorithms, where the exposed resource is the storage at each node. Motivated by content networking applications -- including web caching, CDNs, and P2P -- this paper extends our previous work on the on-line version of the problem, which was conducted under a game-theoretic framework, and limited to object replication. We identify and investigate two causes of mistreatment: (1) cache state interactions (due to the cooperative servicing of requests) and (2) the adoption of a common scheme for cache management policies. Using analytic models, numerical solutions of these models, as well as simulation experiments, we show that on-line cooperation schemes using caching are fairly robust to mistreatment caused by state interactions. To appear in a substantial manner, the interaction through the exchange of miss-streams has to be very intense, making it feasible for the mistreated nodes to detect and react to exploitation. This robustness ceases to exist when nodes fetch and store objects in response to remote requests, i.e., when they operate as Level-2 caches (or proxies) for other nodes. Regarding mistreatment due to a common scheme, we show that this can easily take place when the "outlier" characteristics of some of the nodes get overlooked. This finding underscores the importance of allowing cooperative caching nodes the flexibility of choosing from a diverse set of schemes to fit the peculiarities of individual nodes. To that end, we outline an emulation-based framework for the development of mistreatment-resilient distributed selfish caching schemes. Our framework utilizes a simple control-theoretic approach to dynamically parameterize the cache management scheme. We show performance evaluation results that quantify the benefits from instantiating such a framework, which could be substantial under skewed demand profiles.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

For many librarians, institutional repositories (IRs) promised significant change for academic libraries. We envisioned enlarging collection development scope to include locally produced scholarship and an expansion of library services to embrace scholarly publication and distribution. However, at the University of Rochester, as at many other institutions, this transformational technology was introduced in the conservative, controlled manner associated with stereotypical librarian culture, and so these expected changes never materialized. In this case study, we focus on the creation of our institutional repository (a potentially disruptive technology) and how its success was hampered by our organizational culture, manifested as a lengthy and complicated set of policies. In the following pages, we briefly describe our repository project, talk about our original policies, look at the ways those policies impeded our project, and discuss the disruption of those policies and the benefits in user uptake that resulted.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

With the increasing demand for document transfer services such as the World Wide Web comes a need for better resource management to reduce the latency of documents in these systems. To address this need, we analyze the potential for document caching at the application level in document transfer services. We have collected traces of actual executions of Mosaic, reflecting over half a million user requests for WWW documents. Using those traces, we study the tradeoffs between caching at three levels in the system, and the potential for use of application-level information in the caching system. Our traces show that while a high hit rate in terms of URLs is achievable, a much lower hit rate is possible in terms of bytes, because most profitably-cached documents are small. We consider the performance of caching when applied at the level of individual user sessions, at the level of individual hosts, and at the level of a collection of hosts on a single LAN. We show that the performance gain achievable by caching at the session level (which is straightforward to implement) is nearly all of that achievable at the LAN level (where caching is more difficult to implement). However, when resource requirements are considered, LAN level caching becomes much more desirable, since it can achieve a given level of caching performance using a much smaller amount of cache space. Finally, we consider the use of organizational boundary information as an example of the potential for use of application-level information in caching. Our results suggest that distinguishing between documents produced locally and those produced remotely can provide useful leverage in designing caching policies, because of differences in the potential for sharing these two document types among multiple users.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The World Wide Web (WWW or Web) is growing rapidly on the Internet. Web users want fast response time and easy access to a enormous variety of information across the world. Thus, performance is becoming a main issue in the Web. Fractals have been used to study fluctuating phenomena in many different disciplines, from the distribution of galaxies in astronomy to complex physiological control systems. The Web is also a complex, irregular, and random system. In this paper, we look at the document reference pattern at Internet Web servers and use fractal-based models to understand aspects (e.g. caching schemes) that affect the Web performance.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper we examine a number of admission control and scheduling protocols for high-performance web servers based on a 2-phase policy for serving HTTP requests. The first "registration" phase involves establishing the TCP connection for the HTTP request and parsing/interpreting its arguments, whereas the second "service" phase involves the service/transmission of data in response to the HTTP request. By introducing a delay between these two phases, we show that the performance of a web server could be potentially improved through the adoption of a number of scheduling policies that optimize the utilization of various system components (e.g. memory cache and I/O). In addition, to its premise for improving the performance of a single web server, the delineation between the registration and service phases of an HTTP request may be useful for load balancing purposes on clusters of web servers. We are investigating the use of such a mechanism as part of the Commonwealth testbed being developed at Boston University.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

With web caching and cache-related services like CDNs and edge services playing an increasingly significant role in the modern internet, the problem of the weak consistency and coherence provisions in current web protocols is becoming increasingly significant and drawing the attention of the standards community [LCD01]. Toward this end, we present definitions of consistency and coherence for web-like environments, that is, distributed client-server information systems where the semantics of interactions with resource are more general than the read/write operations found in memory hierarchies and distributed file systems. We then present a brief review of proposed mechanisms which strengthen the consistency of caches in the web, focusing upon their conceptual contributions and their weaknesses in real-world practice. These insights motivate a new mechanism, which we call "Basis Token Consistency" or BTC; when implemented at the server, this mechanism allows any client (independent of the presence and conformity of any intermediaries) to maintain a self-consistent view of the server's state. This is accomplished by annotating responses with additional per-resource application information which allows client caches to recognize the obsolescence of currently cached entities and identify responses from other caches which are already stale in light of what has already been seen. The mechanism requires no deviation from the existing client-server communication model, and does not require servers to maintain any additional per-client state. We discuss how our mechanism could be integrated into a fragment-assembling Content Management System (CMS), and present a simulation-driven performance comparison between the BTC algorithm and the use of the Time-To-Live (TTL) heuristic.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We consider the problem of delivering popular streaming media to a large number of asynchronous clients. We propose and evaluate a cache-and-relay end-system multicast approach, whereby a client joining a multicast session caches the stream, and if needed, relays that stream to neighboring clients which may join the multicast session at some later time. This cache-and-relay approach is fully distributed, scalable, and efficient in terms of network link cost. In this paper we analytically derive bounds on the network link cost of our cache-and-relay approach, and we evaluate its performance under assumptions of limited client bandwidth and limited client cache capacity. When client bandwidth is limited, we show that although finding an optimal solution is NP-hard, a simple greedy algorithm performs surprisingly well in that it incurs network link costs that are very close to a theoretical lower bound. When client cache capacity is limited, we show that our cache-and-relay approach can still significantly reduce network link cost. We have evaluated our cache-and-relay approach using simulations over large, synthetic random networks, power-law degree networks, and small-world networks, as well as over large real router-level Internet maps.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Transport protocols are an integral part of the inter-process communication (IPC) service used by application processes to communicate over the network infrastructure. With almost 30 years of research on transport, one would have hoped that we have a good handle on the problem. Unfortunately, that is not true. As the Internet continues to grow, new network technologies and new applications continue to emerge putting transport protocols in a never-ending flux as they are continuously adapted for these new environments. In this work, we propose a clean-slate transport architecture that renders all possible transport solutions as simply combinations of policies instantiated on a single common structure. We identify a minimal set of mechanisms that once instantiated with the appropriate policies allows any transport solution to be realized. Given our proposed architecture, we contend that there are no more transport protocols to design—only policies to specify. We implement our transport architecture in a declarative language, Network Datalog (NDlog), making the specification of different transport policies easy, compact, reusable, dynamically configurable and potentially verifiable. In NDlog, transport state is represented as database relations, state is updated/queried using database operations, and transport policies are specified using declarative rules. We identify limitations with NDlog that could potentially threaten the correctness of our specification. We propose several language extensions to NDlog that would significantly improve the programmability of transport policies.