989 resultados para Modèles cache


Relevância:

70.00% 70.00%

Publicador:

Resumo:

Mémoire numérisé par la Direction des bibliothèques de l'Université de Montréal.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Mémoire numérisé par la Direction des bibliothèques de l'Université de Montréal.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

L’objectif de la présente étude est d’étudier les différences entre les homicides par strangulation et les suicides par pendaison ainsi que l’importance du rôle que jouent les lésions dans la détermination du mode de décès dans les cas apparents de pendaison, et ce dans l’optique d’établir un outil permettant de renseigner le coroner ou médecin légiste sur le mode de décès probable dans les cas apparents de pendaison. Deux cent quatorze cas de suicide par pendaison ont été révisés rétrospectivement et comparés à 51 cas d’homicide par strangulation. La fréquence d’ecchymoses (6,1 %), d’abrasions (4,7 %) et de lacérations (0,5 %) était significativement plus faible chez les victimes de suicide par pendaison que chez les victimes d’homicides par strangulation (58,8 %, 51,0 % et 5,9 % respectivement). Les ecchymoses, chez les victimes de suicide par pendaison, se trouvent habituellement sur les membres supérieurs antérieurs et postérieurs ou sur les membres inférieurs antérieurs. Elles se situent généralement soit sur les membres supérieurs, soit sur les membres inférieurs, et non aux deux endroits à la fois. Les abrasions sont davantage susceptibles de se trouver sur la face postérieure des membres supérieurs et sur la face antérieure des membres inférieurs. Cette concentration préférentielle n’est pas observée chez les victimes d’homicide par strangulation. De possibles critères de suspicion et des modèles de prédiction du mode de décès sont évalués.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Computer resource allocation represents a significant challenge particularly for multiprocessor systems, which consist of shared computing resources to be allocated among co-runner processes and threads. While an efficient resource allocation would result in a highly efficient and stable overall multiprocessor system and individual thread performance, ineffective poor resource allocation causes significant performance bottlenecks even for the system with high computing resources. This thesis proposes a cache aware adaptive closed loop scheduling framework as an efficient resource allocation strategy for the highly dynamic resource management problem, which requires instant estimation of highly uncertain and unpredictable resource patterns. Many different approaches to this highly dynamic resource allocation problem have been developed but neither the dynamic nature nor the time-varying and uncertain characteristics of the resource allocation problem is well considered. These approaches facilitate either static and dynamic optimization methods or advanced scheduling algorithms such as the Proportional Fair (PFair) scheduling algorithm. Some of these approaches, which consider the dynamic nature of multiprocessor systems, apply only a basic closed loop system; hence, they fail to take the time-varying and uncertainty of the system into account. Therefore, further research into the multiprocessor resource allocation is required. Our closed loop cache aware adaptive scheduling framework takes the resource availability and the resource usage patterns into account by measuring time-varying factors such as cache miss counts, stalls and instruction counts. More specifically, the cache usage pattern of the thread is identified using QR recursive least square algorithm (RLS) and cache miss count time series statistics. For the identified cache resource dynamics, our closed loop cache aware adaptive scheduling framework enforces instruction fairness for the threads. Fairness in the context of our research project is defined as a resource allocation equity, which reduces corunner thread dependence in a shared resource environment. In this way, instruction count degradation due to shared cache resource conflicts is overcome. In this respect, our closed loop cache aware adaptive scheduling framework contributes to the research field in two major and three minor aspects. The two major contributions lead to the cache aware scheduling system. The first major contribution is the development of the execution fairness algorithm, which degrades the co-runner cache impact on the thread performance. The second contribution is the development of relevant mathematical models, such as thread execution pattern and cache access pattern models, which in fact formulate the execution fairness algorithm in terms of mathematical quantities. Following the development of the cache aware scheduling system, our adaptive self-tuning control framework is constructed to add an adaptive closed loop aspect to the cache aware scheduling system. This control framework in fact consists of two main components: the parameter estimator, and the controller design module. The first minor contribution is the development of the parameter estimators; the QR Recursive Least Square(RLS) algorithm is applied into our closed loop cache aware adaptive scheduling framework to estimate highly uncertain and time-varying cache resource patterns of threads. The second minor contribution is the designing of a controller design module; the algebraic controller design algorithm, Pole Placement, is utilized to design the relevant controller, which is able to provide desired timevarying control action. The adaptive self-tuning control framework and cache aware scheduling system in fact constitute our final framework, closed loop cache aware adaptive scheduling framework. The third minor contribution is to validate this cache aware adaptive closed loop scheduling framework efficiency in overwhelming the co-runner cache dependency. The timeseries statistical counters are developed for M-Sim Multi-Core Simulator; and the theoretical findings and mathematical formulations are applied as MATLAB m-file software codes. In this way, the overall framework is tested and experiment outcomes are analyzed. According to our experiment outcomes, it is concluded that our closed loop cache aware adaptive scheduling framework successfully drives co-runner cache dependent thread instruction count to co-runner independent instruction count with an error margin up to 25% in case cache is highly utilized. In addition, thread cache access pattern is also estimated with 75% accuracy.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Packet forwarding is a memory-intensive application requiring multiple accesses through a trie structure. With the requirement to process packets at line rates, high-performance routers need to forward millions of packets every second with each packet needing up to seven memory accesses. Earlier work shows that a single cache for the nodes of a trie can reduce the number of external memory accesses. It is observed that the locality characteristics of the level-one nodes of a trie are significantly different from those of lower level nodes. Hence, we propose a heterogeneously segmented cache architecture (HSCA) which uses separate caches for level-one and lower level nodes, each with carefully chosen sizes. Besides reducing misses, segmenting the cache allows us to focus on optimizing the more frequently accessed level-one node segment. We find that due to the nonuniform distribution of nodes among cache sets, the level-one nodes cache is susceptible t high conflict misses. We reduce conflict misses by introducing a novel two-level mapping-based cache placement framework. We also propose an elegant way to fit the modified placement function into the cache organization with minimal increase in access time. Further, we propose an attribute preserving trace generation methodology which emulates real traces and can generate traces with varying locality. Performanc results reveal that our HSCA scheme results in a 32 percent speedup in average memory access time over a unified nodes cache. Also, HSC outperforms IHARC, a cache for lookup results, with as high as a 10-fold speedup in average memory access time. Two-level mappin further enhances the performance of the base HSCA by up to 13 percent leading to an overall improvement of up to 40 percent over the unified scheme.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Mobile applications are being increasingly deployed on a massive scale in various mobile sensor grid database systems. With limited resources from the mobile devices, how to process the huge number of queries from mobile users with distributed sensor grid databases becomes a critical problem for such mobile systems. While the fundamental semantic cache technique has been investigated for query optimization in sensor grid database systems, the problem is still difficult due to the fact that more realistic multi-dimensional constraints have not been considered in existing methods. To solve the problem, a new semantic cache scheme is presented in this paper for location-dependent data queries in distributed sensor grid database systems. It considers multi-dimensional constraints or factors in a unified cost model architecture, determines the parameters of the cost model in the scheme by using the concept of Nash equilibrium from game theory, and makes semantic cache decisions from the established cost model. The scenarios of three factors of semantic, time and locations are investigated as special cases, which improve existing methods. Experiments are conducted to demonstrate the semantic cache scheme presented in this paper for distributed sensor grid database systems.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

CMPs enable simultaneous execution of multiple applications on the same platforms that share cache resources. Diversity in the cache access patterns of these simultaneously executing applications can potentially trigger inter-application interference, leading to cache pollution. Whereas a large cache can ameliorate this problem, the issues of larger power consumption with increasing cache size, amplified at sub-100nm technologies, makes this solution prohibitive. In this paper in order to address the issues relating to power-aware performance of caches, we propose a caching structure that addresses the following: 1. Definition of application-specific cache partitions as an aggregation of caching units (molecules). The parameters of each molecule namely size, associativity and line size are chosen so that the power consumed by it and access time are optimal for the given technology. 2. Application-Specific resizing of cache partitions with variable and adaptive associativity per cache line, way size and variable line size. 3. A replacement policy that is transparent to the partition in terms of size, heterogeneity in associativity and line size. Through simulation studies we establish the superiority of molecular cache (caches built as aggregations of molecules) that offers a 29% power advantage over that of an equivalently performing traditional cache.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Software transactional memory (STM) is a promising programming paradigm for shared memory multithreaded programs as an alternative to traditional lock based synchronization. However adoption of STM in mainstream software has been quite low due to its considerable overheads and its poor cache/memory performance. In this paper, we perform a detailed study of the cache behavior of STM applications and quantify the impact of different STM factors on the cache misses experienced by the applications. Based on our analysis, we propose a compiler driven Lock-Data Colocation (LDC), targeted at reducing the cache overheads on STM. We show that LDC is effective in improving the cache behavior of STM applications by reducing the dcache miss latency and improving execution time performance.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this work, we propose a new organization for the last level shared cache of a rnulticore system. Our design is based on the observation that the Next-Use distance, measured in terms of intervening misses between the eviction of a line and its next use, for lines brought in by a given delinquent PC falls within a predictable range of values. We exploit this correlation to improve the performance of shared caches in multi-core architectures by proposing the NUcache organization.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper we present a cache coherence protocol for multistage interconnection network (MIN)-based multiprocessors with two distinct private caches: private-blocks caches (PCache) containing blocks private to a process and shared-blocks caches (SCache) containing data accessible by all processes. The architecture is extended by a coherence control bus connecting all shared-block cache controllers. Timing problems due to variable transit delays through the MIN are dealt with by introducing Transient states in the proposed cache coherence protocol. The impact of the coherence protocol on system performance is evaluated through a performance study of three phases. Assuming homogeneity of all nodes, a single-node queuing model (phase 3) is developed to analyze system performance. This model is solved for processor and coherence bus utilizations using the mean value analysis (MVA) technique with shared-blocks steady state probabilities (phase 1) and communication delays (phase 2) as input parameters. The performance of our system is compared to that of a system with an equivalent-sized unified cache and with a multiprocessor implementing a directory-based coherence protocol. System performance measures are verified through simulation.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Several replacement policies for web caches have been proposed and studied extensively in the literature. Different replacement policies perform better in terms of (i) the number of objects found in the cache (cache hit), (ii) the network traffic avoided by fetching the referenced object from the cache, or (iii) the savings in response time. In this paper, we propose a simple and efficient replacement policy (hereafter known as SE) which improves all three performance measures. Trace-driven simulations were done to evaluate the performance of SE. We compare SE with two widely used and efficient replacement policies, namely Least Recently Used (LRU) and Least Unified Value (LUV) algorithms. Our results show that SE performs at least as well as, if not better than, both these replacement policies. Unlike various other replacement policies proposed in literature, our SE policy does not require parameter tuning or a-priori trace analysis and has an efficient and simple implementation that can be incorporated in any existing proxy server or web server with ease.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Today's feature-rich multimedia products require embedded system solution with complex System-on-Chip (SoC) to meet market expectations of high performance at a low cost and lower energy consumption. The memory architecture of the embedded system strongly influences critical system design objectives like area, power and performance. Hence the embedded system designer performs a complete memory architecture exploration to custom design a memory architecture for a given set of applications. Further, the designer would be interested in multiple optimal design points to address various market segments. However, tight time-to-market constraints enforces short design cycle time. In this paper we address the multi-level multi-objective memory architecture exploration problem through a combination of exhaustive-search based memory exploration at the outer level and a two step based integrated data layout for SPRAM-Cache based architectures at the inner level. We present a two step integrated approach for data layout for SPRAM-Cache based hybrid architectures with the first step as data-partitioning that partitions data between SPRAM and Cache, and the second step is the cache conscious data layout. We formulate the cache-conscious data layout as a graph partitioning problem and show that our approach gives up to 34% improvement over an existing approach and also optimizes the off-chip memory address space. We experimented our approach with 3 embedded multimedia applications and our approach explores several hundred memory configurations for each application, yielding several optimal design points in a few hours of computation on a standard desktop.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The inherent temporal locality in memory accesses is filtered out by the L1 cache. As a consequence, an L2 cache with LRU replacement incurs significantly higher misses than the optimal replacement policy (OPT). We propose to narrow this gap through a novel replacement strategy that mimics the replacement decisions of OPT. The L2 cache is logically divided into two components, a Shepherd Cache (SC) with a simple FIFO replacement and a Main Cache (MC) with an emulation of optimal replacement. The SC plays the dual role of caching lines and guiding the replacement decisions in MC. Our pro- posed organization can cover 40% of the gap between OPT and LRU for a 2MB cache resulting in 7% overall speedup. Comparison with the dynamic insertion policy, a victim buffer, a V-Way cache and an LRU based fully associative cache demonstrates that our scheme performs better than all these strategies.