997 results for structured parallel computations


Relevance: 30.00%

Abstract:

Background: The aim of the SPHERE study is to design, implement and evaluate tailored practice and personal care plans to improve the process of care and objective clinical outcomes for patients with established coronary heart disease (CHD) in general practice across two different health systems on the island of Ireland. CHD is a common cause of death and a significant cause of morbidity in Ireland. Secondary prevention has been recommended as a key strategy for reducing levels of CHD mortality, and general practice has been highlighted as an ideal setting for secondary prevention initiatives. Current indications suggest that there is considerable room for improvement in the provision of secondary prevention for patients with established heart disease on the island of Ireland. The review literature recommends structured programmes with continued support and follow-up of patients; the provision of training, tailored to practice needs, on access to evidence of the effectiveness of secondary prevention; structured recall programmes that also take account of individual practice needs; and patient-centred consultations accompanied by attention to disease management guidelines.

Methods: SPHERE is a cluster randomised controlled trial, with practice-level randomisation to intervention and control groups, recruiting 960 patients from 48 practices in three study centres (Belfast, Dublin and Galway). Primary outcomes are blood pressure, total cholesterol, physical and mental health status (SF-12) and hospital re-admissions. The intervention takes place over two years and data is collected at baseline, one-year and two-year follow-up. Data is obtained from medical charts, consultations with practitioners, and patient postal questionnaires. The SPHERE intervention involves the implementation of a structured systematic programme of care for patients with CHD attending general practice. It is a multi-faceted intervention that has been developed to respond to barriers and solutions to optimal secondary prevention identified in preliminary qualitative research with practitioners and patients. General practitioners and practice nurses attend training sessions in facilitating behaviour change and medication prescribing guidelines for secondary prevention of CHD. Patients are invited to attend regular four-monthly consultations over two years, during which targets and goals for secondary prevention are set and reviewed. The analysis will be strengthened by economic, policy and qualitative components.

Relevance: 30.00%

Abstract:

In this paper a model of grid computation that supports both heterogeneity and dynamicity is presented. The model presupposes that user sites contain software components awaiting execution on the grid. User sites and grid sites interact by means of managers which control dynamic behaviour. The orchestration language ORC [9,10] offers an abstract means of specifying operations for resource acquisition and execution monitoring while allowing for the possibility of non-responsive hardware. It is demonstrated that ORC is sufficiently expressive to model typical kinds of grid interactions.
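
As a rough illustration of the kind of interaction the model describes (in Python rather than Orc), a manager can offer a job to several grid sites, keep the first successful response, and treat sites that fail to answer within a timeout as non-responsive. Site names and timings below are hypothetical.

    # Sketch of an Orc-style orchestration pattern: acquire a result from
    # whichever grid site responds first, with a timeout guarding against
    # non-responsive hardware. All site behaviour is mocked.
    import concurrent.futures as cf
    import random
    import time

    def run_on_site(site, job):
        # Stand-in for dispatching a software component to a grid site.
        time.sleep(random.uniform(0.1, 2.0))   # simulated, possibly slow, site
        return f"{job} completed on {site}"

    def orchestrate(job, sites, timeout=1.0):
        with cf.ThreadPoolExecutor(max_workers=len(sites)) as pool:
            futures = [pool.submit(run_on_site, s, job) for s in sites]
            # Like Orc's pruning combinator: keep the first result, drop the rest.
            done, not_done = cf.wait(futures, timeout=timeout,
                                     return_when=cf.FIRST_COMPLETED)
            for f in not_done:
                f.cancel()      # slow sites; shutdown still waits for stragglers,
                                # a real manager would abandon them
            return next(iter(done)).result() if done else None  # None: all timed out

    print(orchestrate("component-42", ["siteA", "siteB", "siteC"]))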

Relevance: 30.00%

Abstract:

As data analytics grows in importance, it is quickly becoming one of the dominant application domains that require parallel processing. This paper investigates the applicability of OpenMP, the dominant shared-memory parallel programming model in high-performance computing, to the domain of data analytics. We contrast the performance and programmability of key data analytics benchmarks against Phoenix++, a state-of-the-art shared memory map/reduce programming system. Our study shows that OpenMP outperforms the Phoenix++ system by a large margin for several benchmarks. In other cases, however, the programming model lacks support for this application domain.
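
To illustrate the programmability contrast (in Python rather than C/OpenMP or Phoenix++), a typical analytics kernel such as a histogram can be phrased in map/reduce style, as below; under OpenMP the same computation would be a plain parallel loop with a reduction clause. The input data and chunking are hypothetical.

    # Histogram of word lengths in map/reduce style, mirroring how
    # Phoenix++-like systems structure the computation. The OpenMP analogue
    # in C would be a single '#pragma omp parallel for reduction' loop.
    from collections import Counter
    from functools import reduce
    from multiprocessing import Pool

    words = ["alpha", "beta", "gamma", "delta"] * 100_000   # toy input

    def map_chunk(chunk):
        # 'map' phase: emit a partial histogram for one chunk of the input.
        return Counter(len(w) for w in chunk)

    if __name__ == "__main__":
        chunks = [words[i::4] for i in range(4)]            # split the input
        with Pool(4) as pool:
            partials = pool.map(map_chunk, chunks)          # parallel map
        hist = reduce(lambda a, b: a + b, partials)         # reduce phase
        print(dict(hist))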

Relevance: 30.00%

Abstract:

Approximate execution is a viable technique for environments with energy constraints, provided that applications are given the mechanisms to produce outputs of the highest possible quality within the available energy budget. This paper introduces a framework for energy-constrained execution with controlled and graceful quality loss. A simple programming model allows developers to structure the computation in different tasks, and to express the relative importance of these tasks for the quality of the end result. For non-significant tasks, the developer can also supply less costly, approximate versions. The target energy consumption for a given execution is specified when the application is launched. A significance-aware runtime system employs an application-specific analytical energy model to decide how many cores to use for the execution, the operating frequency for these cores, as well as the degree of task approximation, so as to maximize the quality of the output while meeting the user-specified energy constraints. Evaluation on a dual-socket 16-core Intel platform using 9 benchmark kernels shows that the proposed framework picks the optimal configuration with high accuracy. Also, a comparison with loop perforation (a well-known compile-time approximation technique), shows that the proposed framework results in significantly higher quality for the same energy budget.
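
A minimal sketch of the significance-aware idea described above, with made-up task, energy-cost and quality numbers (the actual framework drives core counts and frequencies from an analytical energy model, which is not modelled here): tasks carry a significance tag and an optional cheaper approximate version, and the runtime spends the budget on the most significant tasks first.

    # Significance-aware task selection under an energy budget (sketch).
    # Costs and quality contributions are illustrative numbers only.
    from dataclasses import dataclass
    from typing import Callable, Optional

    @dataclass
    class Task:
        significance: float            # importance for output quality
        accurate: Callable[[], float]
        approximate: Optional[Callable[[], float]] = None
        cost_accurate: float = 1.0     # modelled energy per execution
        cost_approx: float = 0.2

    def run(tasks, budget):
        quality = 0.0
        # Spend the budget on the most significant tasks first.
        for t in sorted(tasks, key=lambda t: t.significance, reverse=True):
            if budget >= t.cost_accurate:
                budget -= t.cost_accurate
                quality += t.significance * t.accurate()
            elif t.approximate is not None and budget >= t.cost_approx:
                budget -= t.cost_approx
                quality += t.significance * t.approximate()
            # else: the task is dropped -- graceful quality loss
        return quality

    tasks = [Task(1.0, lambda: 1.0),   # significant: no approximate version
             Task(0.3, lambda: 1.0, lambda: 0.7, 1.0, 0.2),
             Task(0.1, lambda: 1.0, lambda: 0.5, 1.0, 0.2)]
    print(run(tasks, budget=1.5))      # quality achieved within the budget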

Relevance: 30.00%

Abstract:

The UMTS turbo encoder is composed of a parallel concatenation of two Recursive Systematic Convolutional (RSC) encoders which start and end at a known state. This trellis termination directly affects the performance of turbo codes. This paper presents a performance analysis of multi-point trellis termination of turbo codes, in which the RSC encoders are terminated at more than one point of the current frame while keeping the interleaver length the same. For long interleaver lengths, this approach allows a data frame to be divided into sub-frames which can be treated as independent blocks. A novel decoding architecture using multi-point trellis termination and collision-free interleavers is presented. Collision-free interleavers are used to solve the memory collision problems encountered in parallel decoding of turbo codes. The proposed parallel decoding architecture reduces the decoding delay caused by the iterative nature and forward-backward metric computations of turbo decoding algorithms. Our simulations verified that this turbo encoding and decoding scheme shows Bit Error Rate (BER) performance very close to that of standard UMTS turbo coding while providing almost 50% time saving for the 2-point termination and 80% time saving for the 5-point termination.
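
A small Python sketch of the mechanism: a rate-1/2 RSC encoder with the UMTS constituent generator polynomials (feedback 1 + D^2 + D^3, feedforward 1 + D + D^3), where the frame is split into sub-frames and the trellis is driven back to the all-zero state at the end of each one. The two-block framing below is illustrative, not the paper's exact scheme.

    # RSC encoding with per-sub-frame trellis termination (sketch).
    # Registers r = [r0, r1, r2], r0 newest; feedback taps at D^2, D^3 and
    # parity feedforward taps at 1, D, D^3 (the UMTS constituent code).
    def rsc_encode_terminated(bits):
        r = [0, 0, 0]
        out = []                          # (systematic, parity) pairs
        def step(u):
            fb = u ^ r[1] ^ r[2]          # feedback g0 = 1 + D^2 + D^3
            p = fb ^ r[0] ^ r[2]          # feedforward g1 = 1 + D + D^3
            r[2], r[1], r[0] = r[1], r[0], fb
            return p
        for u in bits:
            out.append((u, step(u)))
        for _ in range(3):                # tail: choose input so feedback = 0
            u = r[1] ^ r[2]
            out.append((u, step(u)))
        assert r == [0, 0, 0]             # trellis terminated at a known state
        return out

    frame = [1, 0, 1, 1, 0, 0, 1, 0]
    subframes = [frame[:4], frame[4:]]    # 2-point termination: 2 blocks
    print([rsc_encode_terminated(sf) for sf in subframes])

Because each sub-frame ends in the zero state, the forward-backward recursions of the decoder can start from known boundary states, which is what lets the sub-frames be decoded as independent blocks.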

Relevance: 30.00%

Abstract:

Turbo codes experience a significant decoding delay because of the iterative nature of the decoding algorithms, the high number of metric computations and the complexity added by the (de)interleaver. The extrinsic information is exchanged sequentially between two Soft-Input Soft-Output (SISO) decoders. Instead of this sequential process, a received frame can be divided into smaller windows to be processed in parallel. In this paper, a novel parallel processing methodology is proposed, based on previous parallel decoding techniques. A novel Contention-Free (CF) interleaver is proposed as part of the decoding architecture, which allows extrinsic Log-Likelihood Ratios (LLRs) to be used immediately as a-priori LLRs to start the second half of the iterative turbo decoding. The simulation case studies performed in this paper show that our parallel decoding method can provide 80% time saving compared to standard decoding and 30% time saving compared to previous parallel decoding methods, at the expense of 0.3 dB Bit Error Rate (BER) performance degradation.
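
A small sketch of the property that makes such parallel decoding possible: with M windows of length W, an interleaver pi is contention-free if, at every local step j, the M positions pi(j), pi(j+W), ..., pi(j+(M-1)W) fall into M distinct windows (memory banks), so the parallel SISO workers never access the same bank at the same time. The example interleaver below is a quadratic permutation polynomial (QPP), not the CF interleaver proposed in the paper.

    # Check the contention-free (CF) property of an interleaver (sketch).
    def is_contention_free(pi, window):
        n = len(pi)
        assert n % window == 0
        m = n // window                    # number of parallel windows
        for j in range(window):            # local step within each window
            banks = {pi[j + k * window] // window for k in range(m)}
            if len(banks) != m:            # two workers hit the same bank
                return False
        return True

    # A QPP interleaver pi(i) = (f1*i + f2*i^2) mod n; QPP permutations are
    # known to be contention-free for window sizes dividing n (LTE uses them).
    n, f1, f2 = 40, 3, 10
    pi = [(f1 * i + f2 * i * i) % n for i in range(n)]
    print(is_contention_free(pi, window=8))   # 5 windows of length 8 -> True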

Relevance: 30.00%

Abstract:

Embedded real-time applications increasingly present high computation requirements which need to be completed within specific deadlines, but whose load patterns are highly variable, depending on the set of data available at a given instant. The current trend towards parallel processing in the embedded domain provides higher processing power; however, it does not address the variability in the processing pattern. Dimensioning each device for its worst-case scenario implies lower average utilization, and processing capacity that is available in the overall system but unusable. A solution to this problem is to extend the parallel execution of the applications, allowing networked nodes to distribute the workload to neighbour nodes in peak situations. In this context, this report proposes a framework to develop parallel and distributed real-time embedded applications, transparently using OpenMP and the Message Passing Interface (MPI) within a programming model based on OpenMP. The technical report also devises an integrated timing model, which enables structured reasoning on the timing behaviour of these hybrid architectures.
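
A minimal mpi4py sketch of the workload-distribution idea only (not the report's framework or its OpenMP side): a node that detects a local overload ships part of its work queue to a neighbour rank and collects the results. The threshold, work items and script name are hypothetical.

    # Peak-load offloading between neighbouring nodes (sketch); run under
    # 'mpiexec -n 2 python offload.py'. Work items and threshold are toy values.
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()

    def process(item):
        return item * item                 # stand-in for the real computation

    if rank == 0:
        work = list(range(100))
        THRESHOLD = 60                     # beyond this, deadlines would be missed
        if len(work) > THRESHOLD:
            local, remote = work[:THRESHOLD], work[THRESHOLD:]
            comm.send(remote, dest=1, tag=0)   # offload the peak to a neighbour
            results = [process(x) for x in local]
            results += comm.recv(source=1, tag=1)
        else:
            results = [process(x) for x in work]
        print(len(results), "items processed")
    elif rank == 1:
        items = comm.recv(source=0, tag=0)
        comm.send([process(x) for x in items], dest=0, tag=1)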

Relevance: 30.00%

Abstract:

This dissertation introduces and investigates systems of parallel communicating restarting automata (PCRA systems). It links two well-known concepts from formal languages and automata theory: the model of restarting automata and so-called PC systems (systems of parallel communicating components). A PCRA system consists of finitely many restarting automata which, on the one hand, perform local computations in parallel and independently of one another and, on the other hand, may communicate with each other. Communication follows a fixed protocol realized by means of special communication states. An essential feature of the communication structure in systems of cooperating components is whether communication is centralized or non-centralized. While in a non-centralized communication structure every component may communicate with every other component, all communication within a centralized communication structure takes place exclusively with a designated master component. One of the main results of this work shows that centralized and non-centralized systems have the same computational power (which is in general not the case for PC systems). Moreover, using multicast or broadcast communication in addition to point-to-point communication does not increase the computational power either. Furthermore, the expressive power of PCRA systems is investigated and compared with that of PC systems of finite automata and with that of multi-head automata. PC systems of finite automata are known to have the same expressive power as one-way multi-head automata and form a lower bound for the expressive power of PCRA systems with one-way components. In fact, PCRA systems are stronger than PC systems of finite automata even when their components, taken individually, have the same expressive power, i.e., characterize the regular languages. For PCRA systems with two-way components, the language classes of two-way multi-head automata in the deterministic and in the nondeterministic case are shown as lower bounds; these in turn correspond to the well-known complexity classes L (deterministic logarithmic space) and NL (nondeterministic logarithmic space). The class of context-sensitive languages is shown as an upper bound. In addition, extensions of restarting automata (the non-forgetting property and the shrinking property) are considered, which increase the computational power of individual components but do not increase the power of systems. The language classes characterized by PCRA systems are closed under various language operations, and some of these classes even form abstract families of languages (so-called AFLs). Finally, problems specific to PCRA systems are examined with respect to their decidability. It is shown that emptiness, universality, inclusion, equivalence and finiteness are not even semi-decidable for systems with two restarting automata of the weakest type. The membership problem is shown to be decidable in quadratic time in the deterministic case and in exponential time in the nondeterministic case.

Relevance: 30.00%

Abstract:

In this paper we address the problem of positioning a camera attached to the end-effector of a robotic manipulator so that it becomes parallel to a planar object. This problem has been treated for a long time in visual servoing. Our approach is based on attaching several laser pointers to the camera, configured so as to produce a suitable set of visual features. The aim of using structured light is not only to ease the image processing and to allow low-textured objects to be treated, but also to produce a control scheme with desirable properties such as decoupling, stability, good conditioning and a good camera trajectory.
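
A geometric sketch of the underlying idea (not the paper's visual-feature control law): three laser spots on the object give three 3-D points in camera coordinates; the cross product of two in-plane vectors gives the plane normal, and the camera is parallel to the plane exactly when that normal is aligned with the optical axis. The point coordinates below are hypothetical measurements.

    # Estimate the normal of a planar object from three laser-spot points
    # and measure how far the camera is from being parallel to it (sketch).
    import numpy as np

    def plane_normal(p1, p2, p3):
        n = np.cross(p2 - p1, p3 - p1)
        return n / np.linalg.norm(n)

    # Three laser spots, in camera coordinates (hypothetical measurements).
    p1 = np.array([0.10, 0.00, 0.95])
    p2 = np.array([-0.05, 0.08, 1.02])
    p3 = np.array([0.00, -0.09, 1.00])

    n = plane_normal(p1, p2, p3)
    z = np.array([0.0, 0.0, 1.0])          # camera optical axis
    # Angle between the optical axis and the plane normal: zero means the
    # image plane is parallel to the object plane.
    angle = np.degrees(np.arccos(abs(n @ z)))
    print(f"misalignment: {angle:.2f} degrees")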

Relevance: 30.00%

Abstract:

The chess endgame is increasingly being seen through the lens of, and therefore effectively defined by, a data 'model' of itself. It is vital that such models are clearly faithful to the reality they purport to represent. This paper examines that issue and systems engineering responses to it, using the chess endgame as the exemplar scenario. A structured survey has been carried out of the intrinsic challenges and complexity of creating endgame data by reviewing the past pattern of errors during work in progress, surfacing in publications and occurring after the data was generated. Specific measures are proposed to counter observed classes of error-risk, including a preliminary survey of techniques for using state-of-the-art verification tools to generate endgame tables (EGTs) that are correct by construction. The approach may be applied generically beyond the game domain.

Relevance: 30.00%

Abstract:

Among the most influential and popular data mining methods is the k-Means algorithm for cluster analysis. Techniques for improving the efficiency of k-Means have been largely explored in two main directions. The amount of computation can be significantly reduced by adopting geometrical constraints and an efficient data structure, notably a multidimensional binary search tree (KD-Tree). These techniques make it possible to reduce the number of distance computations the algorithm performs at each iteration. A second direction is parallel processing, where data and computation loads are distributed over many processing nodes. However, little work has been done to provide a parallel formulation of the efficient sequential techniques based on KD-Trees. Such approaches are expected to have an irregular distribution of computation load and can suffer from load imbalance. This issue has so far limited the adoption of these efficient k-Means variants in parallel computing environments. In this work, we provide a parallel formulation of the KD-Tree based k-Means algorithm for distributed memory systems and address its load balancing issue. Three solutions have been developed and tested: two approaches are based on a static partitioning of the data set, and a third solution incorporates a dynamic load balancing policy.
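
A compact sketch of the general scheme, under two simplifying assumptions: the data is statically partitioned across worker processes (one of the paper's strategies), and a KD-tree is built over the centroids to answer nearest-centroid queries, whereas the cited sequential techniques index the data points themselves. Each worker returns partial sums and counts; the master recomputes the centroids. The data here is synthetic.

    # One k-Means run parallelised over a static partition of the data (sketch).
    import numpy as np
    from multiprocessing import Pool
    from scipy.spatial import cKDTree

    def partial_update(args):
        chunk, centroids = args
        _, label = cKDTree(centroids).query(chunk)  # nearest centroid per point
        k, d = centroids.shape
        sums = np.zeros((k, d))
        np.add.at(sums, label, chunk)               # per-cluster coordinate sums
        counts = np.bincount(label, minlength=k)
        return sums, counts

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        data = rng.normal(size=(100_000, 2))
        centroids = data[:8].copy()                 # k = 8, trivial initialisation
        chunks = np.array_split(data, 4)            # static partitioning
        with Pool(4) as pool:
            for _ in range(10):
                parts = pool.map(partial_update,
                                 [(c, centroids) for c in chunks])
                sums = sum(p[0] for p in parts)
                counts = np.maximum(sum(p[1] for p in parts), 1)  # avoid /0
                centroids = sums / counts[:, None]
        print(centroids)

With a static partition like this, any skew in how points fall across chunks translates directly into idle workers, which is the load-imbalance issue the paper's dynamic policy addresses.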

Relevance: 30.00%

Abstract:

Many scientific and engineering applications involve inverting large matrices or solving systems of linear algebraic equations. Solving these problems with proven algorithms for direct methods can take a very long time, as their cost depends on the size of the matrix. The computational complexity of stochastic Monte Carlo methods depends only on the number of chains and the length of those chains. The computing power needed by inherently parallel Monte Carlo methods can be satisfied very efficiently by distributed computing technologies such as Grid computing. In this paper we show how a load balanced Monte Carlo method for computing the inverse of a dense matrix can be constructed, show how the method can be implemented on the Grid, and demonstrate how efficiently the method scales on multiple processors.
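
A minimal serial sketch of the classical Monte Carlo scheme such methods build on (not the paper's load-balanced Grid implementation): writing A = I - C with spectral radius of C below one, the entries of A^-1 = sum over m of C^m are estimated by random walks whose weights accumulate entries of C. The chains are independent, which is what makes the method embarrassingly parallel.

    # Monte Carlo estimation of A^-1 via the Neumann series (sketch).
    import numpy as np

    def mc_inverse(A, n_chains=10_000, max_steps=50, seed=0):
        rng = np.random.default_rng(seed)
        n = A.shape[0]
        C = np.eye(n) - A                  # A = I - C, ||C|| < 1 assumed
        inv = np.zeros((n, n))
        for i in range(n):                 # one row of the inverse at a time
            for _ in range(n_chains):
                state, weight = i, 1.0
                inv[i, state] += weight    # m = 0 term: the identity
                for _ in range(max_steps):
                    nxt = rng.integers(n)  # uniform transition, probability 1/n
                    weight *= C[state, nxt] * n   # importance weight
                    state = nxt
                    if weight == 0.0:
                        break
                    inv[i, state] += weight
        return inv / n_chains

    A = np.array([[0.8, 0.1], [0.2, 0.9]])  # diagonally dominant, ||C|| < 1
    print(mc_inverse(A))
    print(np.linalg.inv(A))                 # reference answer

Each chain touches only one row of the estimate, so distributing chains across Grid nodes needs no communication until the final averaging step.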

Relevance: 30.00%

Abstract:

This paper is concerned with the uniformization of a system of affine recurrence equations. This transformation is used in the design (or compilation) of highly parallel embedded systems (VLSI systolic arrays, signal processing filters, etc.). In this paper, we present and implement an automatic system to achieve uniformization of systems of affine recurrence equations. We unify the results from many earlier papers, develop some theoretical extensions, and then propose effective uniformization algorithms. Our results can be used in any high-level synthesis tool based on a polyhedral representation of nested loop computations.
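
As a textbook-style illustration of the transformation (not an example taken from the paper), consider a broadcast-type affine dependence, which uniformization replaces by a propagation chain with constant (uniform) dependence vectors:

    Original (affine dependence on column 0):
        Y(i, j) = f( Y(i, j-1), X(i, 0) )        for 0 < j <= N

    Uniformized (pipeline variable P carries X(i, 0) along j):
        P(i, 0) = X(i, 0)
        P(i, j) = P(i, j-1)                      for 0 < j <= N
        Y(i, j) = f( Y(i, j-1), P(i, j) )        for 0 < j <= N

    All dependence vectors are now the constant (0, 1), so the system maps
    directly onto a locally connected (systolic) array.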

Relevance: 30.00%

Abstract:

We examine efficient computer implementation of one method of deterministic global optimisation, the cutting angle method. In this method the objective function is approximated from below by a piecewise linear auxiliary function. The global minimum of the objective function is approximated from the sequence of minima of this auxiliary function. Computing the minima of the auxiliary function is a combinatorial problem, and we show that it can be effectively parallelised. We discuss the improvements made to the serial implementation of the cutting angle method, and ways of distributing computations across multiple processors on parallel and cluster computers.
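
A small sketch of the auxiliary function at the heart of the method, in the standard formulation for IPH (increasing positively homogeneous) functions on the unit simplex: each evaluated point x^k contributes a support vector l^k with l^k_i = f(x^k)/x^k_i, and the underestimator is h(x) = max over k of min over i of l^k_i * x_i. The brute-force minimisation below stands in for the combinatorial step that the paper parallelises; the toy objective and grid are hypothetical (and the stand-in objective is not actually IPH, so it serves illustration only).

    # Cutting angle method: piecewise linear auxiliary function and its
    # minimum over a sample of the unit simplex (brute-force sketch).
    import itertools
    import numpy as np

    def f(x):                              # toy stand-in objective
        return 1.0 + np.sum((x - 1/3) ** 2)

    def support_vector(xk):
        return f(xk) / xk                  # l^k_i = f(x^k) / x^k_i

    def h(x, L):
        # Auxiliary function: max over support vectors of min_i l^k_i * x_i.
        return max(np.min(lk * x) for lk in L)

    # Points evaluated so far (interior simplex points, chosen arbitrarily).
    points = [np.array(p) for p in [(0.6, 0.2, 0.2), (0.2, 0.6, 0.2),
                                    (0.2, 0.2, 0.6), (1/3, 1/3, 1/3)]]
    L = [support_vector(p) for p in points]

    # Brute-force minimum of h over a grid on the simplex; in the parallel
    # setting this search is what gets distributed across processors.
    grid = [np.array((i, j, 20 - i - j)) / 20.0
            for i, j in itertools.product(range(1, 20), repeat=2) if i + j < 20]
    best = min(grid, key=lambda x: h(x, L))
    print(best, h(best, L), f(best))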