Biblioteca Digital

853 resultados para parallel execution

Towards the Combination of Work-Stealing and Semi-Partitioned Scheduling for Parallel Tasks

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Paper/Poster presented in Work in Progress Session, 28th GI/ITG International Conference on Architecture of Computing Systems (ARCS 2015). 24 to 26, Mar, 2015. Porto, Portugal.

Veja mais

Towards the Combination of Work-Stealing and Semi-Partitioned Scheduling for Parallel Tasks

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Poster presented in Work in Progress Session, 28th GI/ITG International Conference on Architecture of Computing Systems (ARCS 2015). 24 to 26, Mar, 2015. Porto, Portugal.

Veja mais

Non-preemptive and SRP-based fullypreemptive scheduling of real-time Software Transactional Memory

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Recent embedded processor architectures containing multiple heterogeneous cores and non-coherent caches renewed attention to the use of Software Transactional Memory (STM) as a building block for developing parallel applications. STM promises to ease concurrent and parallel software development, but relies on the possibility of abort conflicting transactions to maintain data consistency, which in turns affects the execution time of tasks carrying transactions. Because of this fact the timing behaviour of the task set may not be predictable, thus it is crucial to limit the execution time overheads resulting from aborts. In this paper we formalise a FIFO-based algorithm to order the sequence of commits of concurrent transactions. Then, we propose and evaluate two non-preemptive and one SRP-based fully-preemptive scheduling strategies, in order to avoid transaction starvation.

Veja mais

Using Quicktrace to collect runtime execution traces easily and automatically

Relevância:

20.00% 20.00%

Publicador:

Resumo:

IEEE Real-Time Systems Symposium (RTSS 2015). 1 to 4, Dec, 2015. U.S.A.

Veja mais

Parallel Software framework for time-critical many-core systems

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The recent technological advancements and market trends are causing an interesting phenomenon towards the convergence of High-Performance Computing (HPC) and Embedded Computing (EC) domains. On one side, new kinds of HPC applications are being required by markets needing huge amounts of information to be processed within a bounded amount of time. On the other side, EC systems are increasingly concerned with providing higher performance in real-time, challenging the performance capabilities of current architectures. The advent of next-generation many-core embedded platforms has the chance of intercepting this converging need for predictable high-performance, allowing HPC and EC applications to be executed on efficient and powerful heterogeneous architectures integrating general-purpose processors with many-core computing fabrics. To this end, it is of paramount importance to develop new techniques for exploiting the massively parallel computation capabilities of such platforms in a predictable way. P-SOCRATES will tackle this important challenge by merging leading research groups from the HPC and EC communities. The time-criticality and parallelisation challenges common to both areas will be addressed by proposing an integrated framework for executing workload-intensive applications with real-time requirements on top of next-generation commercial-off-the-shelf (COTS) platforms based on many-core accelerated architectures. The project will investigate new HPC techniques that fulfil real-time requirements. The main sources of indeterminism will be identified, proposing efficient mapping and scheduling algorithms, along with the associated timing and schedulability analysis, to guarantee the real-time and performance requirements of the applications.

Veja mais

Parallel programming in biomedical signal processing

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Dissertação para obtenção do Grau de Mestre em Engenharia Biomédica

Veja mais

Distributed processing of large remote sensing images using MapReduce - A case of Edge Detection

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Dissertation submitted in partial fulfillment of the requirements for the Degree of Master of Science in Geospatial Technologies.

Veja mais

Algorithmic skeleton framework for the orchestration of GPU computations

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Dissertação para obtenção do Grau de Mestre em Engenharia Informática

Veja mais

A distributed platform for the volunteer execution of workflows on a local area network

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Thesis submitted in fulfilment of the requirements for the Degree of Master of Science in Computer Science

Veja mais

Optimization of breast tomosynthesis image reconstruction using parallel computing

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Breast cancer is the most common cancer among women, being a major public health problem. Worldwide, X-ray mammography is the current gold-standard for medical imaging of breast cancer. However, it has associated some well-known limitations. The false-negative rates, up to 66% in symptomatic women, and the false-positive rates, up to 60%, are a continued source of concern and debate. These drawbacks prompt the development of other imaging techniques for breast cancer detection, in which Digital Breast Tomosynthesis (DBT) is included. DBT is a 3D radiographic technique that reduces the obscuring effect of tissue overlap and appears to address both issues of false-negative and false-positive rates. The 3D images in DBT are only achieved through image reconstruction methods. These methods play an important role in a clinical setting since there is a need to implement a reconstruction process that is both accurate and fast. This dissertation deals with the optimization of iterative algorithms, with parallel computing through an implementation on Graphics Processing Units (GPUs) to make the 3D reconstruction faster using Compute Unified Device Architecture (CUDA). Iterative algorithms have shown to produce the highest quality DBT images, but since they are computationally intensive, their clinical use is currently rejected. These algorithms have the potential to reduce patient dose in DBT scans. A method of integrating CUDA in Interactive Data Language (IDL) is proposed in order to accelerate the DBT image reconstructions. This method has never been attempted before for DBT. In this work the system matrix calculation, the most computationally expensive part of iterative algorithms, is accelerated. A speedup of 1.6 is achieved proving the fact that GPUs can accelerate the IDL implementation.

Veja mais

Towards an algorithmic skeleton framework for programming the Intel R Xeon PhiTM processor

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The Intel R Xeon PhiTM is the first processor based on Intel’s MIC (Many Integrated Cores) architecture. It is a co-processor specially tailored for data-parallel computations, whose basic architectural design is similar to the ones of GPUs (Graphics Processing Units), leveraging the use of many integrated low computational cores to perform parallel computations. The main novelty of the MIC architecture, relatively to GPUs, is its compatibility with the Intel x86 architecture. This enables the use of many of the tools commonly available for the parallel programming of x86-based architectures, which may lead to a smaller learning curve. However, programming the Xeon Phi still entails aspects intrinsic to accelerator-based computing, in general, and to the MIC architecture, in particular. In this thesis we advocate the use of algorithmic skeletons for programming the Xeon Phi. Algorithmic skeletons abstract the complexity inherent to parallel programming, hiding details such as resource management, parallel decomposition, inter-execution flow communication, thus removing these concerns from the programmer’s mind. In this context, the goal of the thesis is to lay the foundations for the development of a simple but powerful and efficient skeleton framework for the programming of the Xeon Phi processor. For this purpose we build upon Marrow, an existing framework for the orchestration of OpenCLTM computations in multi-GPU and CPU environments. We extend Marrow to execute both OpenCL and C++ parallel computations on the Xeon Phi. We evaluate the newly developed framework, several well-known benchmarks, like Saxpy and N-Body, will be used to compare, not only its performance to the existing framework when executing on the co-processor, but also to assess the performance on the Xeon Phi versus a multi-GPU environment.

Veja mais

A public transport bus assignment problem: parallel metaheuristics assessment

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Combinatorial Optimization Problems occur in a wide variety of contexts and generally are NP-hard problems. At a corporate level solving this problems is of great importance since they contribute to the optimization of operational costs. In this thesis we propose to solve the Public Transport Bus Assignment problem considering an heterogeneous fleet and line exchanges, a variant of the Multi-Depot Vehicle Scheduling Problem in which additional constraints are enforced to model a real life scenario. The number of constraints involved and the large number of variables makes impracticable solving to optimality using complete search techniques. Therefore, we explore metaheuristics, that sacrifice optimality to produce solutions in feasible time. More concretely, we focus on the development of algorithms based on a sophisticated metaheuristic, Ant-Colony Optimization (ACO), which is based on a stochastic learning mechanism. For complex problems with a considerable number of constraints, sophisticated metaheuristics may fail to produce quality solutions in a reasonable amount of time. Thus, we developed parallel shared-memory (SM) synchronous ACO algorithms, however, synchronism originates the straggler problem. Therefore, we proposed three SM asynchronous algorithms that break the original algorithm semantics and differ on the degree of concurrency allowed while manipulating the learned information. Our results show that our sequential ACO algorithms produced better solutions than a Restarts metaheuristic, the ACO algorithms were able to learn and better solutions were achieved by increasing the amount of cooperation (number of search agents). Regarding parallel algorithms, our asynchronous ACO algorithms outperformed synchronous ones in terms of speedup and solution quality, achieving speedups of 17.6x. The cooperation scheme imposed by asynchronism also achieved a better learning rate than the original one.

Veja mais

What do broadband consumers want? Design and execution of a questionnaire on demand for bundled services on the Portuguese telecomunications market

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The report addresses the question of what are the preferences of broadband consumers on the Portuguese telecommunication market. A triple play bundle is being investigated. The discrete choice analysis, adopted in the study, base on 110 responses, mainly from NOVA students. The data for the analysis was collected via manually designed on-line survey. The results show that the price attribute is relatively the most important one while the television attribute is being overlooked in the decision making process. Main effects examined in the research are robust. In addition, "extras" components are being tested in terms of users' preferences.

Veja mais

An effective on-the-job training model-to facilitate strategy execution

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Strategy execution has been a heated topic in the management world in recent years. However, according to a survey done by the Conference Board (2014), the chief executives are so concerned about the execution in their companies and have rated it as the No.1 or No.2 most challenging issue. Many of them choose to invest in training with a purpose to harvest the most for strategy execution. Therefore, this research is trying to find out a model to design training programs that can at most contribute to the success of strategy execution with three real-life training cases done by BTS Consulting Service. It was found that strategy execution could be greatly supported by training programs that take into consideration the four factors, namely Alignment, Mindset to Change, Capability and Organization Support. Main implications of the findings are presented and discussed. Key

Veja mais

Estudo de viabilidade de paralelização de códigos de análise de dados em PROOF

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Dissertação de mestrado em Engenharia Informática

Veja mais

853 resultados para parallel execution

Filtro por publicador