791 resultados para Transactional memory


Relevância:

100.00% 100.00%

Publicador:

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Concurrency control is mostly based on locks and is therefore notoriously difficult to use. Even though some programming languages provide high-level constructs, these add complexity and potentially hard-to-detect bugs to the application. Transactional memory is an attractive mechanism that does not have the drawbacks of locks, however the underlying implementation is often difficult to integrate into an existing language. In this paper we show how we have introduced transactional semantics into Smalltalk by using the reflective facilities of the language. Our approach is based on method annotations, incremental parse tree transformations and an optimistic commit protocol. The implementation does not depend on modifications to the virtual machine and therefore can be changed at the language level. We report on a practical case study, benchmarks and further and on-going work.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In the multi-core CPU world, transactional memory (TM)has emerged as an alternative to lock-based programming for thread synchronization. Recent research proposes the use of TM in GPU architectures, where a high number of computing threads, organized in SIMT fashion, requires an effective synchronization method. In contrast to CPUs, GPUs offer two memory spaces: global memory and local memory. The local memory space serves as a shared scratch-pad for a subset of the computing threads, and it is used by programmers to speed-up their applications thanks to its low latency. Prior work from the authors proposed a lightweight hardware TM (HTM) support based in the local memory, modifying the SIMT execution model and adding a conflict detection mechanism. An efficient implementation of these features is key in order to provide an effective synchronization mechanism at the local memory level. After a quick description of the main features of our HTM design for GPU local memory, in this work we gather together a number of proposals designed with the aim of improving those mechanisms with high impact on performance. Firstly, the SIMT execution model is modified to increase the parallelism of the application when transactions must be serialized in order to make forward progress. Secondly, the conflict detection mechanism is optimized depending on application characteristics, such us the read/write sets, the probability of conflict between transactions and the existence of read-only transactions. As these features can be present in hardware simultaneously, it is a task of the compiler and runtime to determine which ones are more important for a given application. This work includes a discussion on the analysis to be done in order to choose the best configuration solution.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Current industry proposals for Hardware Transactional Memory (HTM) focus on best-effort solutions (BE-HTM) where hardware limits are imposed on transactions. These designs may show a significant performance degradation due to high contention scenarios and different hardware and operating system limitations that abort transactions, e.g. cache overflows, hardware and software exceptions, etc. To deal with these events and to ensure forward progress, BE-HTM systems usually provide a software fallback path to execute a lock-based version of the code. In this paper, we propose a hardware implementation of an irrevocability mechanism as an alternative to the software fallback path to gain insight into the hardware improvements that could enhance the execution of such a fallback. Our mechanism anticipates the abort that causes the transaction serialization, and stalls other transactions in the system so that transactional work loss is mini- mized. In addition, we evaluate the main software fallback path approaches and propose the use of ticket locks that hold precise information of the number of transactions waiting to enter the fallback. Thus, the separation of transactional and fallback execution can be achieved in a precise manner. The evaluation is carried out using the Simics/GEMS simulator and the complete range of STAMP transactional suite benchmarks. We obtain significant performance benefits of around twice the speedup and an abort reduction of 50% over the software fallback path for a number of benchmarks.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Hardware vendors make an important effort creating low-power CPUs that keep battery duration and durability above acceptable levels. In order to achieve this goal and provide good performance-energy for a wide variety of applications, ARM designed the big.LITTLE architecture. This heterogeneous multi-core architecture features two different types of cores: big cores oriented to performance and little cores, slower and aimed to save energy consumption. As all the cores have access to the same memory, multi-threaded applications must resort to some mutual exclusion mechanism to coordinate the access to shared data by the concurrent threads. Transactional Memory (TM) represents an optimistic approach for shared-memory synchronization. To take full advantage of the features offered by software TM, but also benefit from the characteristics of the heterogeneous big.LITTLE architectures, our focus is to propose TM solutions that take into account the power/performance requirements of the application and what it is offered by the architecture. In order to understand the current state-of-the-art and obtain useful information for future power-aware software TM solutions, we have performed an analysis of a popular TM library running on top of an ARM big.LITTLE processor. Experiments show, in general, better scalability for the LITTLE cores for most of the applications except for one, which requires the computing performance that the big cores offer.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

In this paper, we propose a new technique that can identify transaction-local memory (i.e. captured memory), in managed environments, while having a low runtime overhead. We implemented our proposal in a well known STM framework (Deuce) and we tested it in STMBench7 with two different STMs: TL2 and LSA. In both STMs the performance improved significantly (4 times and 2.6 times, respectively). Moreover, running the STAMP benchmarks with our approach shows improvements of 7 times in the best case for the Vacation application.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Dissertação apresentada na Faculdade de Ciências e Tecnologia da Universidade Nova de Lisboa para a obtenção do Grau de Mestre em Engenharia Informática

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Dissertação de Mestrado em Engenharia Informática

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Even though Software Transactional Memory (STM) is one of the most promising approaches to simplify concurrent programming, current STM implementations incur significant overheads that render them impractical for many real-sized programs. The key insight of this work is that we do not need to use the same costly barriers for all the memory managed by a real-sized application, if only a small fraction of the memory is under contention lightweight barriers may be used in this case. In this work, we propose a new solution based on an approach of adaptive object metadata (AOM) to promote the use of a fast path to access objects that are not under contention. We show that this approach is able to make the performance of an STM competitive with the best fine-grained lock-based approaches in some of the more challenging benchmarks. (C) 2015 Elsevier Inc. All rights reserved.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Classical lock-based concurrency control does not scale with current and foreseen multi-core architectures, opening space for alternative concurrency control mechanisms. The concept of transactions executing concurrently in isolation with an underlying mechanism maintaining a consistent system state was already explored in fault-tolerant and distributed systems, and is currently being explored by transactional memory, this time being used to manage concurrent memory access. In this paper we discuss the use of Software Transactional Memory (STM), and how Ada can provide support for it. Furthermore, we draft a general programming interface to transactional memory, supporting future implementations of STM oriented to real-time systems.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Dissertação para obtenção do Grau de Mestre em Engenharia Informática

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Heutzutage haben selbst durchschnittliche Computersysteme mehrere unabhängige Recheneinheiten (Kerne). Wird ein rechenintensives Problem in mehrere Teilberechnungen unterteilt, können diese parallel und damit schneller verarbeitet werden. Obwohl die Entwicklung paralleler Programme mittels Abstraktionen vereinfacht werden kann, ist es selbst für Experten anspruchsvoll, effiziente und korrekte Programme zu schreiben. Während traditionelle Programmiersprachen auf einem eher geringen Abstraktionsniveau arbeiten, bieten funktionale Programmiersprachen wie z.B. Haskell, Möglichkeiten zur fortgeschrittenen Abstrahierung. Das Ziel der vorliegenden Dissertation war es, zu untersuchen, wie gut verschiedene Arten der Abstraktion das Programmieren mit Concurrent Haskell unterstützen. Concurrent Haskell ist eine Bibliothek für Haskell, die parallele Programmierung auf Systemen mit gemeinsamem Speicher ermöglicht. Im Mittelpunkt der Dissertation standen zwei Forschungsfragen. Erstens wurden verschiedene Synchronisierungsansätze verglichen, die sich in ihrem Abstraktionsgrad unterscheiden. Zweitens wurde untersucht, wie Abstraktionen verwendet werden können, um die Komplexität der Parallelisierung vor dem Entwickler zu verbergen. Bei dem Vergleich der Synchronisierungsansätze wurden Locks, Compare-and-Swap Operationen und Software Transactional Memory berücksichtigt. Die Ansätze wurden zunächst bezüglich ihrer Eignung für die Synchronisation einer Prioritätenwarteschlange auf Basis von Skiplists untersucht. Anschließend wurden verschiedene Varianten des Taskpool Entwurfsmusters implementiert (globale Taskpools sowie private Taskpools mit und ohne Taskdiebstahl). Zusätzlich wurde für das Entwurfsmuster eine Abstraktionsschicht entwickelt, welche eine einfache Formulierung von Taskpool-basierten Algorithmen erlaubt. Für die Untersuchung der Frage, ob Haskells Abstraktionsmethoden die Komplexität paralleler Programmierung verbergen können, wurden zunächst stencil-basierte Algorithmen betrachtet. Es wurde eine Bibliothek entwickelt, die eine deklarative Beschreibung von stencil-basierten Algorithmen sowie ihre parallele Ausführung erlaubt. Mit Hilfe dieses deklarativen Interfaces wurde die parallele Implementation vollständig vor dem Anwender verborgen. Anschließend wurde eine eingebettete domänenspezifische Sprache (EDSL) für Knoten-basierte Graphalgorithmen sowie eine entsprechende Ausführungsplattform entwickelt. Die Plattform erlaubt die automatische parallele Verarbeitung dieser Algorithmen. Verschiedene Beispiele zeigten, dass die EDSL eine knappe und dennoch verständliche Formulierung von Graphalgorithmen ermöglicht.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Software Transactional Memory (STM) systems have poor performance under high contention scenarios. Since many transactions compete for the same data, most of them are aborted, wasting processor runtime. Contention management policies are typically used to avoid that, but they are passive approaches as they wait for an abort to happen so they can take action. More proactive approaches have emerged, trying to predict when a transaction is likely to abort so its execution can be delayed. Such techniques are limited, as they do not replace the doomed transaction by another or, when they do, they rely on the operating system for that, having little or no control on which transaction should run. In this paper we propose LUTS, a Lightweight User-Level Transaction Scheduler, which is based on an execution context record mechanism. Unlike other techniques, LUTS provides the means for selecting another transaction to run in parallel, thus improving system throughput. Moreover, it avoids most of the issues caused by pseudo parallelism, as it only launches as many system-level threads as the number of available processor cores. We discuss LUTS design and present three conflict-avoidance heuristics built around LUTS scheduling capabilities. Experimental results, conducted with STMBench7 and STAMP benchmark suites, show LUTS efficiency when running high contention applications and how conflict-avoidance heuristics can improve STM performance even more. In fact, our transaction scheduling techniques are capable of improving program performance even in overloaded scenarios. © 2011 Springer-Verlag.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Software transaction memory (STM) systems have been used as an approach to improve performance, by allowing the concurrent execution of atomic blocks. However, under high-contention workloads, STM-based systems can considerably degrade performance, as transaction conflict rate increases. Contention management policies have been used as a way to select which transaction to abort when a conflict occurs. In general, contention managers are not capable of avoiding conflicts, as they can only select which transaction to abort and the moment it should restart. Since contention managers act only after a conflict is detected, it becomes harder to effectively increase transaction throughput. More proactive approaches have emerged, aiming at predicting when a transaction is likely to abort, postponing its execution. Nevertheless, most of the proposed proactive techniques are limited, as they do not replace the doomed transaction by another or, when they do, they rely on the operating system for that, having little or no control on which transaction to run. This article proposes LUTS, a lightweight user-level transaction scheduler. Unlike other techniques, LUTS provides the means for selecting another transaction to run in parallel, thus improving system throughput. We discuss LUTS design and propose a dynamic conflict-avoidance heuristic built around its scheduling capabilities. Experimental results, conducted with the STAMP and STMBench7 benchmark suites, running on TinySTM and SwissTM, show how our conflict-avoidance heuristic can effectively improve STM performance on high contention applications. © 2012 Springer Science+Business Media, LLC.