942 results for Parallel programming model


Relevance:

40.00%

Publisher:

Abstract:

3rd Workshop on High-performance and Real-time Embedded Systems (HIRES 2015), 21 January 2015, Amsterdam, Netherlands.

Relevance:

40.00%

Publisher:

Abstract:

With the shift towards many-core computer architectures, dataflow programming has been proposed as one potential solution for producing software that scales to a varying number of processor cores. Programming for parallel architectures is considered difficult because the current popular programming languages are inherently sequential and introducing parallelism is typically left to the programmer. Dataflow, however, is inherently parallel, describing an application as a directed graph in which nodes represent calculations and edges represent data dependencies in the form of queues. These queues are the only allowed communication between the nodes, making the dependencies between the nodes explicit and thereby also the parallelism. Once a node has sufficient inputs available, it can, independently of any other node, perform calculations, consume inputs, and produce outputs. Dataflow models have existed for several decades and have become popular for describing signal processing applications, as the graph representation is a very natural representation within this field; digital filters are typically described with boxes and arrows in textbooks as well. Dataflow is also becoming more interesting in other domains, and in principle any application working on an information stream fits the dataflow paradigm. Such applications include, among others, network protocols, cryptography, and multimedia applications. As an example, the MPEG group standardized a dataflow language called RVC-CAL to be used within reconfigurable video coding. Describing a video coder as a dataflow network instead of with conventional programming languages makes the coder more readable, as it describes how the video data flows through the different coding tools. While dataflow provides an intuitive representation for many applications, it also introduces some new problems that need to be solved in order for dataflow to be more widely used. The explicit parallelism of a dataflow program is descriptive and enables improved utilization of the available processing units; however, the independent nodes also imply that some kind of scheduling is required. The need for efficient scheduling becomes even more evident when the number of nodes is larger than the number of processing units and several nodes run concurrently on one processor core. There exist several dataflow models of computation, with different trade-offs between expressiveness and analyzability. These vary from rather restricted but statically schedulable, with minimal scheduling overhead, to dynamic, where each firing requires a firing rule to be evaluated. The model used in this work, namely RVC-CAL, is a very expressive language and in the general case requires dynamic scheduling; however, the strong encapsulation of dataflow nodes enables analysis, and the scheduling overhead can be reduced by using quasi-static, or piecewise static, scheduling techniques. The scheduling problem is concerned with finding the few scheduling decisions that must be made at run time, while most decisions are pre-calculated. The result is then a set of static schedules, as small as possible, that are dynamically scheduled. To identify these dynamic decisions and to find the concrete schedules, this thesis shows how quasi-static scheduling can be represented as a model checking problem. This involves identifying the relevant information needed to generate a minimal but complete model to be used for model checking.
The model must describe everything that may affect scheduling of the application while omitting everything else in order to avoid state space explosion. This kind of simplification is necessary to make the state space analysis feasible. For the model checker to find the actual schedules, a set of scheduling strategies is defined that is able to produce quasi-static schedulers for a wide range of applications. The results of this work show that actor composition with quasi-static scheduling can be used to transform dataflow programs to fit many different computer architectures with different types and numbers of cores. This, in turn, enables dataflow to provide a more platform-independent representation, as one application can be fitted to a specific processor architecture without changing the actual program representation. Instead, the program representation is optimized by the development tools, in the context of design space exploration, to fit the target platform. This work focuses on representing the dataflow scheduling problem as a model checking problem and is implemented as part of a compiler infrastructure. The thesis also presents experimental results as evidence of the usefulness of the approach.
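
The firing discipline described here (a node fires only when its input queues hold enough tokens, then consumes inputs and produces outputs) can be sketched in a few lines. The following Python fragment is a minimal illustration; the Queue and AddActor names and the trivial scheduling loop are invented for this sketch and are not taken from RVC-CAL or the thesis.

from collections import deque

class Queue:
    # FIFO edge between two dataflow nodes; the only allowed communication.
    def __init__(self):
        self.tokens = deque()
    def push(self, token):
        self.tokens.append(token)
    def pop(self):
        return self.tokens.popleft()
    def __len__(self):
        return len(self.tokens)

class AddActor:
    # A node that may fire once both input queues hold at least one token.
    def __init__(self, in_a, in_b, out):
        self.in_a, self.in_b, self.out = in_a, in_b, out
    def can_fire(self):                      # the firing rule
        return len(self.in_a) >= 1 and len(self.in_b) >= 1
    def fire(self):                          # consume inputs, produce an output
        self.out.push(self.in_a.pop() + self.in_b.pop())

# A trivial dynamic scheduler: keep firing while the firing rule holds.
a, b, c = Queue(), Queue(), Queue()
adder = AddActor(a, b, c)
for x, y in [(1, 2), (3, 4)]:
    a.push(x)
    b.push(y)
while adder.can_fire():
    adder.fire()
print(list(c.tokens))                        # [3, 7]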

Relevance:

40.00%

Publisher:

Abstract:

NEONATAL OXYGEN EXPOSURE LEADS TO CHANGES IN MITOCHONDRIAL FUNCTION IN THE ADULT RAT.

Introduction: Exposure of newborn rat pups to oxygen (O2) has consequences in adulthood, including arterial hypertension, vascular dysfunction, nephropenia, and indices of oxidative stress. Considering that the kidneys of rats are still actively developing during the first days after birth, that they play a key role in the development of hypertension, and that mitochondrial dysfunction is associated with increased oxidative stress, we postulate that these deleterious neonatal conditions can have a significant renal impact on the modulation of the expression of key proteins of mitochondrial function and on excessive mitochondrial production of reactive oxygen species.

Methods: Sprague-Dawley rat pups were exposed to 80% O2 (H) or 21% O2 (Ctrl) from day 3 to day 10 of life. Since several rat organs are still actively developing at birth, these rodents are a recognized model for studying the complications of neonatal hyperoxia, such as those associated with premature birth in humans. At 4 and 16 weeks, the kidneys were collected and the mitochondria extracted following a standard extraction method, with a buffer containing 0.32 M sucrose and successive centrifugations. Mitochondrial protein expression was measured by Western blot, while H2O2 production and the activities of key Krebs cycle enzymes were assessed by spectrophotometry. Results are expressed as mean ± SD.

Results: At 16 weeks, H males (n=6) showed increased citrate synthase activity (considered an internal standard of mitochondrial protein expression and abundance) (12.4 ± 8.4 vs 4.1 ± 0.5 μmol/mL/min), decreased aconitase activity (an enzyme sensitive to mitochondrial redox state) (0.11 ± 0.05 vs 0.20 ± 0.04 μmol/min/mg mitochondria), and increased H2O2 production (7.0 ± 1.3 vs 5.4 ± 0.8 pmol/mg mitochondrial protein) compared with the Ctrl group (n=6 males and 4 females). The H group (vs Ctrl) also showed decreased expression of peroxiredoxin-3 (Prx3) (H 0.61±0.06 vs. Ctrl 0.78±0.02 relative units, -23%; p<0.05), a protein involved in H2O2 elimination, and of cytochrome c oxidase (Complex IV) (H 1.02±0.04 vs. Ctrl 1.20±0.02 relative units, -15%; p<0.05), a protein of the mitochondrial respiratory chain, whereas the expression of uncoupling protein-2 (UCP2), involved in dissipating the proton gradient, was significantly increased (H 1.05±0.02 vs. Ctrl 0.90±0.03 relative units, +17%; p<0.05). At 16 weeks, H females (n=6) (vs Ctrl, n=6) showed a significant increase in aconitase activity (0.33±0.03 vs 0.17±0.02 μmol/min/mg mitochondria), in ATP synthase β-subunit expression (H 0.73±0.02 vs. Ctrl 0.59±0.02 relative units, +25%; p<0.05) and in MnSOD expression (H 0.89±0.02 vs. Ctrl 0.74±0.03 relative units, +20%; p<0.05) (mitochondrial superoxide dismutase, an important antioxidant), whereas Prx3 expression was significantly reduced (H 1.1±0.07 vs. Ctrl 0.85±0.01 relative units, -24%; p<0.05). At 4 weeks, H males (vs Ctrl) showed a significant increase in Prx3 expression (H 0.72±0.03 vs. Ctrl 0.56±0.04 relative units, +31%; p<0.05), and females showed significant increases in UCP2 expression (H 1.22±0.05 vs. Ctrl 1.03±0.04 relative units, +18%; p<0.05) and in MnSOD expression (H 1.36±0.01 vs. Ctrl 1.19±0.06 relative units, +14%; p<0.05).

Conclusions: Neonatal O2 exposure in the rat leads to indices of mitochondrial dysfunction in the adult kidney, associated with increased production of reactive oxygen species, suggesting that these mitochondrial changes could play a role in arterial hypertension and oxidative stress and, consequently, be a possible factor in the progression towards cardiovascular disease.

Keywords: Mitochondria, Kidneys, Hypertension, Oxygen, Oxidative Stress, Programming

Relevance:

40.00%

Publisher:

Abstract:

Genetic programming is known to provide good solutions for many problems, such as the evolution of network protocols and distributed algorithms. In such cases it is most often a hardwired module of a design framework that assists the engineer in optimizing specific aspects of the system to be developed, providing its results in a fixed format through an internal interface. In this paper we show how the utility of genetic programming can be increased remarkably by isolating it as a component and integrating it into the model-driven software development process. Our genetic programming framework produces XMI-encoded UML models that can easily be loaded into widely available modeling tools, which in turn provide code generation as well as additional analysis and test capabilities. We use the evolution of a distributed election algorithm as an example to illustrate how genetic programming can be combined with model-driven development. This example clearly illustrates the advantages of our approach: the generation of source code in different programming languages.
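
To make the genetic programming loop itself concrete, here is a minimal, self-contained sketch in Python that evolves arithmetic expression trees against a toy target function. It is purely illustrative: the paper's framework evolves XMI-encoded UML models of protocols rather than expression trees, and the representation, operators and fitness used here are my own assumptions.

import operator
import random

# Toy genetic programming loop: evolve expression trees that fit f(x) = x*x + x.
OPS = [(operator.add, "+"), (operator.sub, "-"), (operator.mul, "*")]

def random_tree(depth=3):
    if depth == 0 or random.random() < 0.3:
        return "x" if random.random() < 0.5 else random.randint(-2, 2)
    return (random.choice(OPS), random_tree(depth - 1), random_tree(depth - 1))

def evaluate(tree, x):
    if tree == "x":
        return x
    if isinstance(tree, int):
        return tree
    (fn, _), left, right = tree
    return fn(evaluate(left, x), evaluate(right, x))

def fitness(tree):
    # Sum of squared errors against the target on a small sample (lower is better).
    return sum((evaluate(tree, x) - (x * x + x)) ** 2 for x in range(-5, 6))

def mutate(tree):
    # Crude mutation: occasionally replace the whole tree with a fresh random one.
    return random_tree() if random.random() < 0.2 else tree

population = [random_tree() for _ in range(200)]
for generation in range(30):
    population.sort(key=fitness)
    parents = population[:50]                          # truncation selection
    population = parents + [mutate(random.choice(parents)) for _ in range(150)]

best = min(population, key=fitness)
print("best fitness:", fitness(best))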

Relevance:

40.00%

Publisher:

Abstract:

A foundational model of concurrency is developed in this thesis. We examine issues in the design of parallel systems and show why the actor model is suitable for exploiting large-scale parallelism. Concurrency in actors is constrained only by the availability of hardware resources and by the logical dependence inherent in the computation. Unlike dataflow and functional programming, however, actors are dynamically reconfigurable and can model shared resources with changing local state. Concurrency is spawned in actors using asynchronous message-passing, pipelining, and the dynamic creation of actors. This thesis deals with some central issues in distributed computing. Specifically, problems of divergence and deadlock are addressed. For example, actors permit dynamic deadlock detection and removal. The problem of divergence is contained because independent transactions can execute concurrently and potentially infinite processes are nevertheless available for interaction.
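
As a concrete illustration of the basic mechanism, the sketch below implements a minimal actor in Python: a mailbox, asynchronous message sends, and local state that may change between messages. It is only a sketch of the idea under my own assumptions (the Actor class and the counter example are invented here) and does not reproduce the formal actor model or any system developed in the thesis.

import queue
import threading

class Actor:
    # Minimal actor: a mailbox plus a behaviour run on its own thread.
    # Communication happens only by asynchronous message passing, never shared state.
    def __init__(self, behaviour):
        self.mailbox = queue.Queue()
        self.behaviour = behaviour
        threading.Thread(target=self._run).start()

    def send(self, message):
        # Asynchronous: the sender does not wait for the message to be processed.
        self.mailbox.put(message)

    def _run(self):
        while True:
            message = self.mailbox.get()
            if message is None:              # poison pill shuts the actor down
                break
            self.behaviour(self, message)

def counter_behaviour(actor, message):
    # Local state that changes from message to message (unlike pure dataflow).
    actor.count = getattr(actor, "count", 0) + message
    print("count is now", actor.count)

counter = Actor(counter_behaviour)
for increment in (1, 2, 3):
    counter.send(increment)
counter.send(None)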

Relevance:

40.00%

Publisher:

Abstract:

The Danish Eulerian Model (DEM) is a powerful air pollution model, designed to calculate the concentrations of various dangerous species over a large geographical region (e.g. Europe). It takes into account the main physical and chemical processes between these species, the actual meteorological conditions, emissions, etc. This is a huge computational task and requires significant resources of storage and CPU time. Parallel computing is essential for the efficient practical use of the model. Several efficient parallel versions of the model have been created over the past several years. A suitable parallel version of DEM, using the Message Passing Interface (MPI) library, was implemented on two powerful supercomputers at EPCC, Edinburgh, available via the HPC-Europa programme for transnational access to research infrastructures in the EC: a Sun Fire E15K and an IBM HPCx cluster. Although the implementation is, in principle, the same for both supercomputers, a few modifications had to be made to port the code successfully to the IBM HPCx cluster. Performance analysis and parallel optimization were done next. Results from benchmarking experiments are presented in this paper. Another set of experiments was carried out in order to investigate the sensitivity of the model to variation of some chemical rate constants in the chemical submodel. Certain modifications of the code were necessary for this task. The obtained results will be used for further sensitivity analysis studies using Monte Carlo simulation.
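
The abstract does not include code, but the flavour of an MPI parallelization of such a model can be sketched as follows. This is a minimal Python sketch using mpi4py (my choice; the actual DEM code is Fortran-based): the concentration grid is split row-wise among processes, each process advances its own slab, and a reduction collects a global quantity. The grid size and all names are hypothetical.

import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

# Hypothetical concentration grid, split row-wise among the MPI processes.
NY, NX = 480, 480
rows_per_proc = NY // size
local = np.zeros((rows_per_proc, NX))     # this process's slab of the domain

# Each process advances its own slab; in the real model this would be the
# chemistry, advection and deposition steps for the local subdomain.
local += rank + 1                         # placeholder for the local computation

# Collect a global quantity on rank 0, e.g. for a mass-balance check.
local_sum = np.array([local.sum()])
global_sum = np.zeros(1)
comm.Reduce(local_sum, global_sum, op=MPI.SUM, root=0)
if rank == 0:
    print("global sum:", global_sum[0])

# Run with e.g.: mpiexec -n 4 python dem_sketch.py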

Relevance:

40.00%

Publisher:

Abstract:

Large-scale air pollution models are powerful tools, designed to meet the increasing demand in different environmental studies. The atmosphere is the most dynamic component of the environment, where pollutants can be transported quickly over long distances. Therefore air pollution modeling must be done over a large computational domain. Moreover, all relevant physical, chemical and photochemical processes must be taken into account. In such complex models operator splitting is very often applied in order to achieve sufficient accuracy as well as efficiency of the numerical solution. The Danish Eulerian Model (DEM) is one of the most advanced such models. Its space domain (4800 × 4800 km) covers Europe, most of the Mediterranean and neighbouring parts of Asia and the Atlantic Ocean. Efficient parallelization is crucial for the performance and practical capabilities of this huge computational model. Different splitting schemes, based on the main processes mentioned above, have been implemented and tested with respect to accuracy and performance in the new version of DEM. Some numerical results of these experiments are presented in this paper.
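
To make the idea of operator splitting concrete, the sketch below applies simple sequential (Lie) splitting to a toy equation du/dt = A(u) + B(u), where A is a transport-like linear decay and B is a chemistry-like nonlinear loss. The toy equation, coefficients and names are my own choices for illustration; the splitting schemes tested in DEM are considerably more elaborate.

import math

# Sequential (Lie) operator splitting on a toy problem
#   du/dt = A(u) + B(u),  A(u) = -a*u (transport-like),  B(u) = -k*u**2 (chemistry-like).
# Each sub-process is advanced exactly over one step; the splitting error is first order.
a, k, dt, steps = 0.5, 0.3, 0.1, 100

def step_A(u, h):
    return u * math.exp(-a * h)          # exact solution of du/dt = -a*u over a step h

def step_B(u, h):
    return u / (1.0 + k * h * u)         # exact solution of du/dt = -k*u**2 over a step h

u_split = 1.0
for _ in range(steps):
    u_split = step_B(step_A(u_split, dt), dt)    # apply A, then B, within each step

# Reference: the same scheme with a much finer step, standing in for the true solution.
u_ref, fine = 1.0, dt / 100
for _ in range(steps * 100):
    u_ref = step_B(step_A(u_ref, fine), fine)

print(f"split: {u_split:.6f}  reference: {u_ref:.6f}  error: {abs(u_split - u_ref):.2e}")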

Relevance:

40.00%

Publisher:

Abstract:

A connection between a fuzzy neural network model and the mixture-of-experts network (MEN) modelling approach is established. Based on this linkage, two new neuro-fuzzy MEN construction algorithms are proposed to overcome the curse of dimensionality that is inherent in the majority of associative memory networks and/or other rule-based systems. The first construction algorithm employs a function selection manager module in an MEN system. The second construction algorithm is based on a new parallel learning algorithm in which each model rule is trained independently, and the parameter convergence property of the new learning method is established. As in the first approach, an expert selection criterion is utilised in this algorithm. The two construction methods are equally effective in overcoming the curse of dimensionality by reducing the dimensionality of the regression vector, but the latter has the additional computational advantage of parallel processing. The proposed algorithms are analysed for effectiveness, followed by numerical examples that illustrate their efficacy for some difficult data-based modelling problems.
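
For readers unfamiliar with the mixture-of-experts idea that the construction algorithms build on, the following is a minimal prediction sketch in Python: a softmax gate blends the outputs of several simple experts. The linear experts, random weights and names are illustrative assumptions only and do not reproduce the paper's neuro-fuzzy construction algorithms.

import numpy as np

# Minimal mixture-of-experts prediction: a softmax gate blends linear experts.
rng = np.random.default_rng(0)
n_experts, n_features = 3, 2

W_experts = rng.normal(size=(n_experts, n_features))   # one linear expert per row
W_gate = rng.normal(size=(n_experts, n_features))      # gating network weights

def predict(x):
    expert_outputs = W_experts @ x                      # each expert's prediction
    scores = W_gate @ x
    gate = np.exp(scores - scores.max())
    gate /= gate.sum()                                  # softmax gating weights
    return gate @ expert_outputs                        # gate-weighted combination

print(predict(np.array([0.5, -1.2])))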

Relevance:

40.00%

Publisher:

Abstract:

We have optimised the atmospheric radiation algorithm of the FAMOUS climate model on several hardware platforms. The optimisation involved translating the Fortran code to C and restructuring the algorithm around the computation of a single air column. Instead of the existing MPI-based domain decomposition, we used a task queue and a thread pool to schedule the computation of individual columns on the available processors. Finally, four air columns are packed together in a single data structure and computed simultaneously using Single Instruction Multiple Data operations. The modified algorithm runs more than 50 times faster on the CELL’s Synergistic Processing Elements than on its main PowerPC processing element. On Intel-compatible processors, the new radiation code runs 4 times faster. On the tested graphics processor, using OpenCL, we find a speed-up of more than 2.5 times as compared to the original code on the main CPU. Because the radiation code takes more than 60% of the total CPU time, FAMOUS executes more than twice as fast. Our version of the algorithm returns bit-wise identical results, which demonstrates the robustness of our approach. We estimate that this project required around two and a half man-years of work.
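
The task-queue and thread-pool scheduling of independent air columns can be illustrated with a minimal sketch. The Python version below (using concurrent.futures) is only an analogy of my own; the actual implementation is C code restructured around single columns, with four columns packed together for SIMD.

import math
from concurrent.futures import ThreadPoolExecutor

# Each independent air column becomes one task pulled from a shared queue by the pool.
def compute_column(column_id):
    # Placeholder for the per-column radiative transfer computation.
    return column_id, sum(math.sin(level * 0.1 + column_id) for level in range(50))

columns = range(1000)                        # one task per air column
with ThreadPoolExecutor(max_workers=8) as pool:
    results = dict(pool.map(compute_column, columns))

print(len(results), "columns computed")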

Relevance:

40.00%

Publisher:

Abstract:

In this paper we describe the development of a program that aims at the optimal integration of observed data in an oceanographic model describ

Relevance:

40.00%

Publisher:

Abstract:

The paper presents a constructive heuristic algorithm (CHA) for directly solving the long-term transmission network expansion planning (LTTNEP) problem using the DC model. The LTTNEP problem is a very complex mixed-integer nonlinear programming problem and exhibits combinatorial growth of the search space. The CHA is used to find a good-quality solution to the LTTNEP problem. In each step of the CHA a sensitivity index is used to add circuits to the system. This sensitivity index is obtained by solving the relaxed LTTNEP problem, i.e. treating the number of circuits to be added as a continuous variable. The relaxed problem is a large and complex nonlinear programming problem and was solved with the interior-point method (IPM). Tests were performed using Garver's system, the modified IEEE 24-bus system and the Southern Brazilian reduced system. The results show the good performance of the IPM inside the CHA.
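
The overall shape of such a constructive heuristic can be sketched as follows: repeatedly solve a relaxed (continuous) problem, use a sensitivity index to pick the next circuit to add, and stop when the relaxed plan needs no further integer additions. Everything in the Python sketch below (the corridors, the toy "fractional need" values standing in for the interior-point solution, and the stopping rule) is a placeholder assumption of mine, not the paper's DC-model formulation.

# Shape of a constructive heuristic for network expansion (toy sketch).

def solve_relaxed(added_circuits):
    # Placeholder for the relaxed LTTNEP solved by an interior-point method.
    # Returns a hypothetical fractional "circuits needed" value per corridor.
    base_need = {"1-2": 1.6, "2-3": 0.4, "3-5": 2.2}
    return {corridor: max(0.0, need - added_circuits.get(corridor, 0))
            for corridor, need in base_need.items()}

def constructive_heuristic(max_iterations=10):
    added = {}
    for _ in range(max_iterations):
        fractional = solve_relaxed(added)                 # relaxed problem
        # Sensitivity index here: the largest remaining fractional requirement.
        corridor, need = max(fractional.items(), key=lambda item: item[1])
        if need < 0.5:                                    # plan is (nearly) integral: stop
            break
        added[corridor] = added.get(corridor, 0) + 1      # add one circuit to that corridor
    return added

print(constructive_heuristic())   # {'3-5': 2, '1-2': 2} with the toy numbers above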

Relevance:

40.00%

Publisher:

Abstract:

This paper presents a new parallel methodology for calculating the determinant of matrices of order n, with computational complexity O(n), using the Gauss-Jordan elimination method and Chio's rule as references. We present our step-by-step methodology in clear mathematical language and demonstrate how to calculate the determinant of a matrix of order n in an analytical format. We also present a computational model with one sequential algorithm and one parallel algorithm, given in pseudo-code.
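
As background for the condensation step the abstract refers to, here is a minimal sequential Python sketch of Chio's rule, which reduces an n×n determinant to an (n-1)×(n-1) one built around the leading pivot. It assumes a nonzero pivot at every step and is not the parallel algorithm proposed in the paper.

def chio_determinant(matrix):
    # Determinant via Chio's condensation; assumes the leading pivot is nonzero
    # at every step, so no row pivoting is performed.
    n = len(matrix)
    if n == 1:
        return matrix[0][0]
    pivot = matrix[0][0]
    if pivot == 0:
        raise ValueError("zero pivot: a row swap (with a sign change) would be needed")
    # Condense to an (n-1) x (n-1) matrix of 2x2 determinants taken against the pivot.
    condensed = [[pivot * matrix[i][j] - matrix[i][0] * matrix[0][j]
                  for j in range(1, n)]
                 for i in range(1, n)]
    return chio_determinant(condensed) / pivot ** (n - 2)

print(chio_determinant([[2, 1, 0], [1, 3, 1], [0, 1, 2]]))   # 8.0

Each entry of the condensed matrix is an independent 2x2 determinant, which is the kind of structure a parallel version can exploit.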