14 resultados para prefetch


Relevância:

20.00% 20.00%

Publicador:

Resumo:

We investigate speculative prefetching under a model in which prefetching is neither aborted nor preempted by demand fetch but instead gets equal priority in network bandwidth utilisation. We argue that the non-abortive assumption is appropriate for wireless networks where bandwidth is low and latency is high, and the non-preemptive assumption is appropriate for Internet where prioritization is not always possible. This paper assumes the existence of an access model to provide some knowledge about future accesses and investigates analytically the performance of a prefetcher that utilises this knowledge. In mobile computing, because resources are severely constrained, performance prediction is as important as access prediction. For uniform retrieval time, we derive a theoretical limit of improvement in access time due to prefetching. This leads to the formulation of an optimal algorithrn for prefetching one access ahead. For non-uniform retrieval time, two different types of prefetching of multiple documents, namely mainline and branch prefetch, are evaluated against prefetch of single document. In mainline prefetch, the most probable sequence of future accesses is prefetched. In branch prefetch, a set of different alternatives for future accesses is prefetched. Under some conditions, mainline prefetch may give slight improvement in user-perceived access time over single prefetch with nominal extra retrieval cost, where retrieval cost is defined as the expected network time wasted in non-useful prefetch. Branch prefetch performs better than mainline prefetch but incurs more retrieval cost.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

CD-ROMs have proliferated as a distribution media for desktop machines for a large variety of multimedia applications (targeted for a single-user environment) like encyclopedias, magazines and games. With CD-ROM capacities up to 3 GB being available in the near future, they will form an integral part of Video on Demand (VoD) servers to store full-length movies and multimedia. In the first section of this paper we look at issues related to the single- user desktop environment. Since these multimedia applications are highly interactive in nature, we take a pragmatic approach, and have made a detailed study of the multimedia application behavior in terms of the I/O request patterns generated to the CD-ROM subsystem by tracing these patterns. We discuss prefetch buffer design and seek time characteristics in the context of the analysis of these traces. We also propose an adaptive main-memory hosted cache that receives caching hints from the application to reduce the latency when the user moves from one node of the hyper graph to another. In the second section we look at the use of CD-ROM in a VoD server and discuss the problem of scheduling multiple request streams and buffer management in this scenario. We adapt the C-SCAN (Circular SCAN) algorithm to suit the CD-ROM drive characteristics and prove that it is optimal in terms of buffer size management. We provide computationally inexpensive relations by which this algorithm can be implemented. We then propose an admission control algorithm which admits new request streams without disrupting the continuity of playback of the previous request streams. The algorithm also supports operations such as fast forward and replay. Finally, we discuss the problem of optimal placement of MPEG streams on CD-ROMs in the third section.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Data Prefetchers identify and make use of any regularity present in the history/training stream to predict future references and prefetch them into the cache. The training information used is typically the primary misses seen at a particular cache level, which is a filtered version of the accesses seen by the cache. In this work we demonstrate that extending the training information to include secondary misses and hits along with primary misses helps improve the performance of prefetchers. In addition to empirical evaluation, we use the information theoretic metric entropy, to quantify the regularity present in extended histories. Entropy measurements indicate that extended histories are more regular than the default primary miss only training stream. Entropy measurements also help corroborate our empirical findings. With extended histories, further benefits can be achieved by triggering prefetches during secondary misses also. In this paper we explore the design space of extended prefetch histories and alternative prefetch trigger points for delta correlation prefetchers. We observe that different prefetch schemes benefit to a different extent with extended histories and alternative trigger points. Also the best performing design point varies on a per-benchmark basis. To meet these requirements, we propose a simple adaptive scheme that identifies the best performing design point for a benchmark-prefetcher combination at runtime. In SPEC2000 benchmarks, using all the L2 accesses as history for prefetcher improves the performance in terms of both IPC and misses reduced over techniques that use only primary misses as history. The adaptive scheme improves the performance of CZone prefetcher over Baseline by 4.6% on an average. These performance gains are accompanied by a moderate reduction in the memory traffic requirements.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

We leverage the buffering capabilities of end-systems to achieve scalable, asynchronous delivery of streams in a peer-to-peer environment. Unlike existing cache-and-relay schemes, we propose a distributed prefetching protocol where peers prefetch and store portions of the streaming media ahead of their playout time, thus not only turning themselves to possible sources for other peers but their prefetched data can allow them to overcome the departure of their source-peer. This stands in sharp contrast to existing cache-and-relay schemes where the departure of the source-peer forces its peer children to go the original server, thus disrupting their service and increasing server and network load. Through mathematical analysis and simulations, we show the effectiveness of maintaining such asynchronous multicasts from several source-peers to other children peers, and the efficacy of prefetching in the face of peer departures. We confirm the scalability of our dPAM protocol as it is shown to significantly reduce server load.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Speculative service implies that a client's request for a document is serviced by sending, in addition to the document requested, a number of other documents (or pointers thereto) that the server speculates will be requested by the client in the near future. This speculation is based on statistical information that the server maintains for each document it serves. The notion of speculative service is analogous to prefetching, which is used to improve cache performance in distributed/parallel shared memory systems, with the exception that servers (not clients) control when and what to prefetch. Using trace simulations based on the logs of our departmental HTTP server http://cs-www.bu.edu, we show that both server load and service time could be reduced considerably, if speculative service is used. This is above and beyond what is currently achievable using client-side caching [3] and server-side dissemination [2]. We identify a number of parameters that could be used to fine-tune the level of speculation performed by the server.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Task-based dataflow programming models and runtimes emerge as promising candidates for programming multicore and manycore architectures. These programming models analyze dynamically task dependencies at runtime and schedule independent tasks concurrently to the processing elements. In such models, cache locality, which is critical for performance, becomes more challenging in the presence of fine-grain tasks, and in architectures with many simple cores.

This paper presents a combined hardware-software approach to improve cache locality and offer better performance is terms of execution time and energy in the memory system. We propose the explicit bulk prefetcher (EBP) and epoch-based cache management (ECM) to help runtimes prefetch task data and guide the replacement decisions in caches. The runtimem software can use this hardware support to expose its internal knowledge about the tasks to the architecture and achieve more efficient task-based execution. Our combined scheme outperforms HW-only prefetchers and state-of-the-art replacement policies, improves performance by an average of 17%, generates on average 26% fewer L2 misses, and consumes on average 28% less energy in the components of the memory system.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Anonymous web browsing is a hot topic with many potential applications for privacy reasons. The current dominant strategy to achieve anonymity is packet padding with dummy packets as cover traffic. However, this method introduces extra bandwidth cost and extra delay. Therefore, it is not practical for anonymous web browsing applications. In order to solve this problem, we propose to use the predicted web pages that users are going to access as the cover traffic rather than dummy packets. Moreover, we defined anonymity level as a metric to measure anonymity degrees, and established a mathematical model for anonymity systems, and transformed the anonymous communication problem into an optimization problem. As a result, users can find tradeoffs among anonymity level and cost. With the proposed model, we can describe and compare our proposal and the previous schemas in a theoretical style. The preliminary experiments on the real data set showed the huge potential of the proposed strategy in terms of resource saving.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Speculative prefetching has been proposed to improve the response time of network access. Previous studies in speculative prefetching focus on building and evaluating access models for the purpose of access prediction. This paper investigates a complementary area which has been largely ignored, that of performance modeling. We analyze the performance of a prefetcher that has uncertain knowledge about future accesses. Our performance metric is the improvement in access time, for which we derive a formula in terms of resource parameters (time available and time required for prefetehing) and speculative parameters (probabilities for next access). We develop a prefetch algorithm to maximize the improvement in access time. The algorithm is based on finding the best solution to a stretch knapsack problem, using theoretically proven apparatus to reduce the search space. An integration between speculative prefetching and caching is also investigated.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Mobile users connected to wireless networks expect performance comparable to those on wired networks for interactive multimedia applications. Satisfying Quality of Service (QoS) requirements for such applications in wireless networks is a challenging problem due to limitations of low bandwidth, high error rate and frequent disconnections of wireless channels. In addition, wireless networks suffer from varying bandwidth. In this paper we investigate object prefetching during times of connectedness and bandwidth availability to enhance user perceived connectedness. This paper presents an access model that is suitable for multimedia access in wireless networks. Access modelling for the purpose of predicting future accesses in the context of speculative prefetching has received much attention in the literature. The model recognizes that a web page, instead of just a single file, is typically a compound of several files. When it comes to making prefetch decisions, most previous studies in speculative prefetching resort to simple heuristics, such as prefetching an item with access probabilities larger than a manually tuned threshold. This paper takes a different approach. Specifically, it models the performance of the prefetcher, taking into account access predictions and resource parameters, and develops a prefetch policy based on a theoretical analysis of the model. Since the analysis considers cache as one of the resource parameters, the resulting policy integrates prefetch and cache replacement decisions. The paper investigates the effect of prefetching on network load. In order to make effective use of available resources and maximize access improvement, it is beneficial to prefetch all items with access probabilities exceeding certain threshold.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

To improve the accuracy of access prediction, a prefetcher for web browsing should recognize the fact that a web page is a compound. By this term we mean that a user request for a single web page may require the retrieval of several multimedia items. Our prediction algorithm builds an access graph that captures the dynamics of web navigation rather than merely attaching probabilities to hypertext structure. When it comes to making prefetch decisions, most previous studies in speculative prefetching resort to simple heuristics, such as prefetching an item with access probabilities larger than a manually tuned threshold. The paper takes a different approach. Specifically, it models the performance of the prefetcher and develops a prefetch policy based on a theoretical analysis of the model. In the analysis, we derive a formula for the expected improvement in access time when prefetch is performed in anticipation for a compound request. We then develop an algorithm that integrates prefetch and cache replacement decisions so as to maximize this improvement. We present experimental results to demonstrate the effectiveness of compound-based prefetching in low bandwidth networks.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Previous studies in speculative prefetching focus on building and evaluating access models for the purpose of access prediction. This paper on the other hand investigates the performance of speculative prefetching. When prefetching is performed speculatively, there is bound to be an increase in the network load. Furthermore, the prefetched items must compete for space with existing cache occupants. These two factors-increased load and eviction of potentially useful cache entries-are considered in the analysis. We obtain the following conclusion: to maximise the improvement in access time, prefetch exclusively all items with access probabilities exceeding a certain threshold.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Dynamically reconfigurable hardware is a promising technology that combines in the same device both the high performance and the flexibility that many recent applications demand. However, one of its main drawbacks is the reconfiguration overhead, which involves important delays in the task execution, usually in the order of hundreds of milliseconds, as well as high energy consumption. One of the most powerful ways to tackle this problem is configuration reuse, since reusing a task does not involve any reconfiguration overhead. In this paper we propose a configuration replacement policy for reconfigurable systems that maximizes task reuse in highly dynamic environments. We have integrated this policy in an external taskgraph execution manager that applies task prefetch by loading and executing the tasks as soon as possible (ASAP). However, we have also modified this ASAP technique in order to make the replacements more flexible, by taking into account the mobility of the tasks and delaying some of the reconfigurations. In addition, this replacement policy is a hybrid design-time/run-time approach, which performs the bulk of the computations at design time in order to save run-time computations. Our results illustrate that the proposed strategy outperforms other state-ofthe-art replacement policies in terms of reuse rates and achieves near-optimal reconfiguration overhead reductions. In addition, by performing the bulk of the computations at design time, we reduce the execution time of the replacement technique by 10 times with respect to an equivalent purely run-time one.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

New generation embedded systems demand high performance, efficiency and flexibility. Reconfigurable hardware can provide all these features. However the costly reconfiguration process and the lack of management support have prevented a broader use of these resources. To solve these issues we have developed a scheduler that deals with task-graphs at run-time, steering its execution in the reconfigurable resources while carrying out both prefetch and replacement techniques that cooperate to hide most of the reconfiguration delays. In our scheduling environment task-graphs are analyzed at design-time to extract useful information. This information is used at run-time to obtain near-optimal schedules, escaping from local-optimum decisions, while only carrying out simple computations. Moreover, we have developed a hardware implementation of the scheduler that applies all the optimization techniques while introducing a delay of only a few clock cycles. In the experiments our scheduler clearly outperforms conventional run-time schedulers based on As-Soon-As-Possible techniques. In addition, our replacement policy, specially designed for reconfigurable systems, achieves almost optimal results both regarding reuse and performance.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Reconfigurable hardware can be used to build a multitasking system where tasks are assigned to HW resources at run-time according to the requirements of the running applications. These tasks are frequently represented as direct acyclic graphs and their execution is typically controlled by an embedded processor that schedules the graph execution. In order to improve the efficiency of the system, the scheduler can apply prefetch and reuse techniques that can greatly reduce the reconfiguration latencies. For an embedded processor all these computations represent a heavy computational load that can significantly reduce the system performance. To overcome this problem we have implemented a HW scheduler using reconfigurable resources. In addition we have implemented both prefetch and replacement techniques that obtain as good results as previous complex SW approaches, while demanding just a few clock cycles to carry out the computations. We consider that the HW cost of the system (in our experiments 3% of a Virtex-II PRO xc2vp30 FPGA) is affordable taking into account the great efficiency of the techniques applied to hide the reconfiguration latency and the negligible run-time penalty introduced by the scheduler computations.