989 resultados para Memory management


Relevância:

100.00% 100.00%

Publicador:

Resumo:

This thesis presents a novel program parallelization technique incorporating with dynamic and static scheduling. It utilizes a problem specific pattern developed from the prior knowledge of the targeted problem abstraction. Suitable for solving complex parallelization problems such as data intensive all-to-all comparison constrained by memory, the technique delivers more robust and faster task scheduling compared to the state-of-the art techniques. Good performance is achieved from the technique in data intensive bioinformatics applications.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The research question of this thesis was how knowledge can be managed with information systems. Information systems can support but not replace knowledge management. Systems can mainly store epistemic organisational knowledge included in content, and process data and information. Certain value can be achieved by adding communication technology to systems. All communication, however, can not be managed. A new layer between communication and manageable information was named as knowformation. Knowledge management literature was surveyed, together with information species from philosophy, physics, communication theory, and information system science. Positivism, post-positivism, and critical theory were studied, but knowformation in extended organisational memory seemed to be socially constructed. A memory management model of an extended enterprise (M3.exe) and knowformation concept were findings from iterative case studies, covering data, information and knowledge management systems. The cases varied from groups towards extended organisation. Systems were investigated, and administrators, users (knowledge workers) and managers interviewed. The model building required alternative sets of data, information and knowledge, instead of using the traditional pyramid. Also the explicit-tacit dichotomy was reconsidered. As human knowledge is the final aim of all data and information in the systems, the distinction between management of information vs. management of people was harmonised. Information systems were classified as the core of organisational memory. The content of the systems is in practice between communication and presentation. Firstly, the epistemic criterion of knowledge is not required neither in the knowledge management literature, nor from the content of the systems. Secondly, systems deal mostly with containers, and the knowledge management literature with applied knowledge. Also the construction of reality based on the system content and communication supports the knowformation concept. Knowformation belongs to memory management model of an extended enterprise (M3.exe) that is divided into horizontal and vertical key dimensions. Vertically, processes deal with content that can be managed, whereas communication can be supported, mainly by infrastructure. Horizontally, the right hand side of the model contains systems, and the left hand side content, which should be independent from each other. A strategy based on the model was defined.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Exploiting the performance potential of GPUs requires managing the data transfers to and from them efficiently which is an error-prone and tedious task. In this paper, we develop a software coherence mechanism to fully automate all data transfers between the CPU and GPU without any assistance from the programmer. Our mechanism uses compiler analysis to identify potential stale accesses and uses a runtime to initiate transfers as necessary. This allows us to avoid redundant transfers that are exhibited by all other existing automatic memory management proposals. We integrate our automatic memory manager into the X10 compiler and runtime, and find that it not only results in smaller and simpler programs, but also eliminates redundant memory transfers. Tested on eight programs ported from the Rodinia benchmark suite it achieves (i) a 1.06x speedup over hand-tuned manual memory management, and (ii) a 1.29x speedup over another recently proposed compiler--runtime automatic memory management system. Compared to other existing runtime-only and compiler-only proposals, it also transfers 2.2x to 13.3x less data on average.

Relevância:

100.00% 100.00%

Publicador:

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The interactions among three important issues involved in the implementation of logic programs in parallel (goal scheduling, precedence, and memory management) are discussed. A simplified, parallel memory management model and an efficient, load-balancing goal scheduling strategy are presented. It is shown how, for systems which support "don't know" non-determinism, special care has to be taken during goal scheduling if the space recovery characteristics of sequential systems are to be preserved. A solution based on selecting only "newer" goals for execution is described, and an algorithm is proposed for efficiently maintaining and determining precedence relationships and variable ages across parallel goals. It is argued that the proposed schemes and algorithms make it possible to extend the storage performance of sequential systems to parallel execution without the considerable overhead previously associated with it. The results are applicable to a wide class of parallel and coroutining systems, and they represent an efficient alternative to "all heap" or "spaghetti stack" allocation models.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We present an overview of the stack-based memory management techniques that we used in our non-deterministic and-parallel Prolog systems: &-Prolog and DASWAM. We believe that the problems associated with non-deterministic and-parallel systems are more general than those encountered in or-parallel and deterministic and-parallel systems, which can be seen as subsets of this more general case. We develop on the previously proposed "marker scheme", lifting some of the restrictions associated with the selection of goals while keeping (virtual) memory consumption down. We also review some of the other problems associated with the stack-based management scheme, such as handling of forward and backward execution, cut, and roll-backs.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper, we examine the issue of memory management in the parallel execution of logic programs. We concentrate on non-deterministic and-parallel schemes which we believe present a relatively general set of problems to be solved, including most of those encountered in the memory management of or-parallel systems. We present a distributed stack memory management model which allows flexible scheduling of goals. Previously proposed models (based on the "Marker model") are lacking in that they impose restrictions on the selection of goals to be executed or they may require consume a large amount of virtual memory. This paper first presents results which imply that the above mentioned shortcomings can have significant performance impacts. An extension of the Marker Model is then proposed which allows flexible scheduling of goals while keeping (virtual) memory consumption down. Measurements are presented which show the advantage of this solution. Methods for handling forward and backward execution, cut and roll back are discussed in the context of the proposed scheme. In addition, the paper shows how the same mechanism for flexible scheduling can be applied to allow the efficient handling of the very general form of suspension that can occur in systems which combine several types of and-parallelism and more sophisticated methods of executing logic programs. We believe that the results are applicable to many and- and or-parallel systems.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

GraphChi is the first reported disk-based graph engine that can handle billion-scale graphs on a single PC efficiently. GraphChi is able to execute several advanced data mining, graph mining and machine learning algorithms on very large graphs. With the novel technique of parallel sliding windows (PSW) to load subgraph from disk to memory for vertices and edges updating, it can achieve data processing performance close to and even better than those of mainstream distributed graph engines. GraphChi mentioned that its memory is not effectively utilized with large dataset, which leads to suboptimal computation performances. In this paper we are motivated by the concepts of 'pin ' from TurboGraph and 'ghost' from GraphLab to propose a new memory utilization mode for GraphChi, which is called Part-in-memory mode, to improve the GraphChi algorithm performance. The main idea is to pin a fixed part of data inside the memory during the whole computing process. Part-in-memory mode is successfully implemented with only about 40 additional lines of code to the original GraphChi engine. Extensive experiments are performed with large real datasets (including Twitter graph with 1.4 billion edges). The preliminary results show that Part-in-memory mode memory management approach effectively reduces the GraphChi running time by up to 60% in PageRank algorithm. Interestingly it is found that a larger portion of data pinned in memory does not always lead to better performance in the case that the whole dataset cannot be fitted in memory. There exists an optimal portion of data which should be kept in the memory to achieve the best computational performance.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

DRAM technology faces density and power challenges to increase capacity because of limitations of physical cell design. To overcome these limitations, system designers are exploring alternative solutions that combine DRAM and emerging NVRAM technologies. Previous work on heterogeneous memories focuses, mainly, on two system designs: PCache, a hierarchical, inclusive memory system, and HRank, a flat, non-inclusive memory system. We demonstrate that neither of these designs can universally achieve high performance and energy efficiency across a suite of HPC workloads. In this work, we investigate the impact of a number of multilevel memory designs on the performance, power, and energy consumption of applications. To achieve this goal and overcome the limited number of available tools to study heterogeneous memories, we created HMsim, an infrastructure that enables n-level, heterogeneous memory studies by leveraging existing memory simulators. We, then, propose HpMC, a new memory controller design that combines the best aspects of existing management policies to improve performance and energy. Our energy-aware memory management system dynamically switches between PCache and HRank based on the temporal locality of applications. Our results show that HpMC reduces energy consumption from 13% to 45% compared to PCache and HRank, while providing the same bandwidth and higher capacity than a conventional DRAM system.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

The generation of a correlation matrix from a large set of long gene sequences is a common requirement in many bioinformatics problems such as phylogenetic analysis. The generation is not only computationally intensive but also requires significant memory resources as, typically, few gene sequences can be simultaneously stored in primary memory. The standard practice in such computation is to use frequent input/output (I/O) operations. Therefore, minimizing the number of these operations will yield much faster run-times. This paper develops an approach for the faster and scalable computing of large-size correlation matrices through the full use of available memory and a reduced number of I/O operations. The approach is scalable in the sense that the same algorithms can be executed on different computing platforms with different amounts of memory and can be applied to different problems with different correlation matrix sizes. The significant performance improvement of the approach over the existing approaches is demonstrated through benchmark examples.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

The generation of a correlation matrix for set of genomic sequences is a common requirement in many bioinformatics problems such as phylogenetic analysis. Each sequence may be millions of bases long and there may be thousands of such sequences which we wish to compare, so not all sequences may fit into main memory at the same time. Each sequence needs to be compared with every other sequence, so we will generally need to page some sequences in and out more than once. In order to minimize execution time we need to minimize this I/O. This paper develops an approach for faster and scalable computing of large-size correlation matrices through the maximal exploitation of available memory and reducing the number of I/O operations. The approach is scalable in the sense that the same algorithms can be executed on different computing platforms with different amounts of memory and can be applied to different bioinformatics problems with different correlation matrix sizes. The significant performance improvement of the approach over previous work is demonstrated through benchmark examples.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

We propose and evaluate admission control mechanisms for ACCORD, an Admission Control and Capacity Overload management Real-time Database framework-an architecture and a transaction model-for hard deadline RTDB systems. The system architecture consists of admission control and scheduling components which provide early notification of failure to submitted transactions that are deemed not valuable or incapable of completing on time. In particular, our Concurrency Admission Control Manager (CACM) ensures that transactions which are admitted do not overburden the system by requiring a level of concurrency that is not sustainable. The transaction model consists of two components: a primary task and a compensating task. The execution requirements of the primary task are not known a priori, whereas those of the compensating task are known a priori. Upon the submission of a transaction, the Admission Control Mechanisms are employed to decide whether to admit or reject that transaction. Once admitted, a transaction is guaranteed to finish executing before its deadline. A transaction is considered to have finished executing if exactly one of two things occur: Either its primary task is completed (successful commitment), or its compensating task is completed (safe termination). Committed transactions bring a profit to the system, whereas a terminated transaction brings no profit. The goal of the admission control and scheduling protocols (e.g., concurrency control, I/O scheduling, memory management) employed in the system is to maximize system profit. In that respect, we describe a number of concurrency admission control strategies and contrast (through simulations) their relative performance.