Biblioteca Digital

952 results for parallel systems

Experimenting with independent and-parallel prolog using standard prolog

Relevance: 70.00%

Abstract:

This paper presents an approach to the study of parallel systems using sequential tools. Independent And-parallelism in Prolog is an example of a parallel processing paradigm in the framework of logic programming, and implementations like parallel processing. But this potential can also be explored using only sequential systems. Since the aim of this paper is to show how this can be done with a standard system, only standard Prolog is used in the implementations included. Such implementations include tests for parallelism in And-Prolog, a correctness-checking meta-interpreter of parallel execution for

On real-time partitioned multicore systems

Relevance: 70.00%

Abstract:

Partitioning is a common approach to developing mixed-criticality systems, where partitions are isolated from each other both in the temporal and the spatial domain in order to prevent low-criticality subsystems from compromising other subsystems with a high level of criticality in case of misbehaviour. The advent of many-core processors, on the other hand, opens the way to highly parallel systems in which all partitions can be allocated to dedicated processor cores. This trend will simplify processor scheduling, although other issues such as mutual interference in the temporal domain may arise as a consequence of memory and device sharing. The paper describes an architecture for multi-core partitioned systems including critical subsystems built with the Ada Ravenscar profile. Some implementation issues are discussed, and experience with implementing the ORK kernel on the XtratuM partitioning hypervisor is presented.

Memory isolation in many-core embedded systems

Relevance: 70.00%

Abstract:

The current approach to developing mixed-criticality systems is by partitioning the hardware resources (processors, memory and I/O devices) among the different applications. Partitions are isolated from each other both in the temporal and the spatial domain, so that low-criticality applications cannot compromise other applications with a higher level of criticality in case of misbehaviour. New architectures based on many-core processors open the way to highly parallel systems in which each partition can be allocated to a set of dedicated processor cores, thus simplifying partition scheduling and temporal separation. Moreover, spatial isolation can also benefit from many-core architectures, by using simpler hardware mechanisms to protect the address spaces of different applications. This paper describes an architecture for many-core embedded partitioned systems, together with some implementation advice for spatial isolation.

Resource use pattern analysis for predicting resource availability in opportunistic grids

Relevance: 60.00%

Abstract:

This work presents a method for predicting resource availability in opportunistic grids by means of use pattern analysis (UPA), a technique based on non-supervised learning methods. This prediction method is based on the assumption of the existence of several classes of computational resource use patterns, which can be used to predict the resource availability. Trace-driven simulations validate this basic assumption and also provide the parameter settings for the accurate learning of resource use patterns. Experiments made with an implementation of the UPA method show the feasibility of its use in the scheduling of grid tasks with very little overhead. The experiments also demonstrate the method's superiority over other predictive and non-predictive methods. An adaptive prediction method is suggested to deal with the lack of training data at initialization. Further adaptive behaviour is motivated by experiments which show that, in some special environments, reliable resource use patterns may not always be detected. Copyright (C) 2009 John Wiley & Sons, Ltd.

Combining RTSJ with Fork/Join: a priority-based model

Relevance: 60.00%

Abstract:

This paper discusses the increased need to support dynamic task-level parallelism in embedded real-time systems and proposes a Java framework that combines the Real-Time Specification for Java (RTSJ) with the Fork/Join (FJ) model, following a fixed priority-based scheduling scheme. Our work intends to support parallel runtimes that will coexist with a wide range of other complex independently developed applications, without any previous knowledge about their real execution requirements, number of parallel sub-tasks, and when those sub-tasks will be generated.

Sintonización dinámica de aplicaciones MPI

Relevance: 60.00%

Abstract:

High-performance computing is currently used in many scientific fields where the problems under study are solved with parallel/distributed applications. These applications require large computing capacity, either because of the complexity of the problems or because of the need to handle situations in real time. The resources and high computational capabilities of the parallel systems on which these applications run must therefore be exploited to obtain good performance. However, achieving this performance for an application running on a system is a hard task that requires a high degree of expertise, especially for applications that exhibit dynamic behaviour or when heterogeneous systems are used. In these cases, automatic and dynamic performance improvement of applications is currently proposed as the best approach to performance analysis. This research work falls within this field of study, and its main objective is to dynamically tune, by means of MATE (Monitoring, Analysis and Tuning Environment), an MPI application used in high-performance computing that follows a Master/Worker paradigm. The tuning techniques integrated in MATE were developed from the study of a performance model that reflects the bottlenecks typical of Master/Worker applications: load balancing and number of workers. Running the chosen application under the dynamic control of MATE and of the implemented tuning strategy made it possible to observe how the application's behaviour adapts to the current conditions of the system on which it runs, thereby improving its performance.

Optimització d'una aplicacio bioinformàtica d'alineament de seqüències executada en processadors multi-core i many-core (GPUs)

Relevance: 60.00%

Abstract:

Sequence alignment applications are an important tool for the scientific community. These bioinformatics applications are used in many different fields, such as medicine, biology, pharmacology and genetics. Today, sequence alignment algorithms have high complexity and must handle an ever larger volume of data. For this reason, alternatives must be sought so that these applications can cope with the day-by-day growth of sequence databases. This project studies and investigates improvements to this type of application, such as the use of parallel systems, which can improve performance considerably.

Fine-grained multi-phase array designs

Relevance: 60.00%

Abstract:

Hybrid multiprocessor architectures which combine re-configurable computing and multiprocessors on a chip are being proposed to transcend the performance of standard multi-core parallel systems. Both fine-grained and coarse-grained parallel algorithm implementations are feasible in such hybrid frameworks. A compositional strategy for designing fine-grained multi-phase regular processor arrays to target hybrid architectures is presented in this paper. The method is based on deriving component designs using classical regular array techniques and composing the components into a unified global design. Effective designs with phase-changes and data routing at run-time are characteristics of these designs. In order to describe the data transfer between phases, the concept of communication domain is introduced so that the producer–consumer relationship arising from multi-phase computation can be treated in a unified way as a data routing phase. This technique is applied to derive new designs of multi-phase regular arrays with different dataflow between phases of computation.

Caracterização morfodinâmica do estuário do Rio Açu, Macau/RN

Relevance: 60.00%

Abstract:

Estuaries are coastal environments that are short-lived on geological time scales, formed by the drowning of the shoreline as relative sea level rises. Such parallel systems are characterized by having two sources of sediment, the river and the sea. The study area comprises the Açu River estuary, located on the northern coast of Rio Grande do Norte State, in a region of intense economic activity, mainly focused on onshore and offshore oil exploration, which is prone to accidental spills. Alongside the oil sector, salt production, shrimp farming, agriculture, fisheries and tourism are carried out in the region; by interacting with sensitive ecosystems such as estuaries, these activities may alter the natural conditions. Because the area is susceptible to contamination, understanding the morphodynamic variables that occur in this environment is essential for obtaining an environmental license. Information about the submarine relief of estuaries is of great importance for planning environmental monitoring, development and coastal systems, among other activities; it facilitates the management of risk areas and assists in the creation of thematic maps of the main aspects of the landscape. Morphodynamic studies were performed in this estuary in different seasonal periods in 2009 to observe and quantify the morphological changes that occurred and to relate them to the hydrodynamic forcing from the river and its interaction with the tides. The bottom morphology was determined from good-quality records acquired with high-resolution geophysical equipment (side-scan sonar and acoustic Doppler current profiler).
The combination of these data enabled the identification of different bedforms for the winter and summer. These bedforms were classified in the lower flow regime according to the Froude number and may later have been destroyed or modified, with different characteristics due mainly to the variation in depth, the type of sedimentary material of which they are made, and other hydrodynamic parameters. These bottom features are imprinted in the channel, sand banks and mud flats that border the entire area

Speedup and scalability analysis of Master-Slave applications on large heterogeneous clusters

Relevance: 60.00%

Abstract:

Although cluster environments have enormous potential processing power, real applications that take advantage of this power remain an elusive goal. This is due, in part, to the lack of understanding about the characteristics of the applications best suited for these environments. This paper focuses on Master/Slave applications for large heterogeneous clusters. It defines application, cluster and execution models to derive an analytic expression for the execution time. It defines speedup and derives speedup bounds based on the inherent parallelism of the application and the aggregated computing power of the cluster. The paper derives an analytical expression for efficiency and uses it to define scalability of the algorithm-cluster combination based on the isoefficiency metric. Furthermore, the paper establishes necessary and sufficient conditions for an algorithm-cluster combination to be scalable, which are easy to verify and use in practice. Finally, it covers the impact of network contention as the number of processors grows. (C) 2007 Elsevier B.V. All rights reserved.

Classificação e comparação de ferramentas para análise de desempenho de sistemas paralelos

Relevance: 60.00%

Abstract:

Graduate Program in Electrical Engineering - FEIS

IDRA (IDeal Resource Allocation): A tool for computing ideal speedups

Relevance: 60.00%

Abstract:

Performance studies of actual parallel systems usually tend to concentrate on the effectiveness of a given implementation. This is often done in the absolute, without quantitative reference to the potential parallelism contained in the programs from the point of view of the execution paradigm. We feel that studying the parallelism inherent to the programs is interesting, as it gives information about the best possible behavior of any implementation and thus allows contrasting the results obtained. We propose a method for obtaining ideal speedups for programs through a combination of sequential or parallel execution and simulation, and the algorithms that allow implementing the method. Our approach is novel and, we argue, more accurate than previously proposed methods, in that a crucial part of the data - the execution times of tasks - is obtained from actual executions, while speedup is computed by simulation. This allows obtaining speedup (and other) data under controlled and ideal assumptions regarding issues such as number of processors, scheduling algorithm and overheads, etc. The results obtained can be used for example to evaluate the ideal parallelism that a program contains for a given model of execution and to compare such "perfect" parallelism to that obtained by a given implementation of that model. We also present a tool, IDRA, which implements the proposed method, and results obtained with IDRA for benchmark programs, which are then compared with those obtained in actual executions on real parallel systems.
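
The idea of combining measured task times with simulation can be illustrated with a minimal Python sketch. It is not IDRA's algorithm: it assumes an unlimited number of processors, zero scheduling overhead, and a task graph given as a plain dictionary. The function name `ideal_speedup` and the example task names are hypothetical.

```python
def ideal_speedup(durations, deps):
    """Estimate an ideal speedup from measured task times.

    durations: dict mapping task name -> measured execution time
    deps:      dict mapping task name -> list of tasks it must wait for
    Assumes unlimited processors and no overheads (an illustrative
    simplification, not the method described in the abstract).
    """
    finish = {}

    def finish_time(task):
        # A task starts as soon as all of its prerequisites have finished.
        if task not in finish:
            start = max((finish_time(d) for d in deps.get(task, ())), default=0.0)
            finish[task] = start + durations[task]
        return finish[task]

    # Parallel time is the length of the longest dependency chain.
    critical_path = max(finish_time(t) for t in durations)
    # Sequential time is the sum of all measured task times.
    total_work = sum(durations.values())
    return total_work / critical_path


# Hypothetical measurements: "b" and "c" can run in parallel after "a".
durations = {"a": 2.0, "b": 3.0, "c": 3.0, "d": 1.0}
deps = {"b": ["a"], "c": ["a"], "d": ["b", "c"]}
print(ideal_speedup(durations, deps))
```

In this example the total work is 9.0 time units and the longest chain (a, then b or c, then d) takes 6.0, so the sketch reports a speedup of 1.5. Limiting the number of processors or modelling overheads would need a real scheduler simulation.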

A proposal for a flexible scheduling and memory management scheme for non-deterministic, and-parallel execution of logic programs

Relevance: 60.00%

Abstract:

In this paper, we examine the issue of memory management in the parallel execution of logic programs. We concentrate on non-deterministic and-parallel schemes which we believe present a relatively general set of problems to be solved, including most of those encountered in the memory management of or-parallel systems. We present a distributed stack memory management model which allows flexible scheduling of goals. Previously proposed models (based on the "Marker model") are lacking in that they impose restrictions on the selection of goals to be executed or they may consume a large amount of virtual memory. This paper first presents results which imply that the above-mentioned shortcomings can have significant performance impacts. An extension of the Marker Model is then proposed which allows flexible scheduling of goals while keeping (virtual) memory consumption down. Measurements are presented which show the advantage of this solution. Methods for handling forward and backward execution, cut and roll back are discussed in the context of the proposed scheme. In addition, the paper shows how the same mechanism for flexible scheduling can be applied to allow the efficient handling of the very general form of suspension that can occur in systems which combine several types of and-parallelism and more sophisticated methods of executing logic programs. We believe that the results are applicable to many and- and or-parallel systems.

Submodular Optimization and Data Processing

Relevance: 60.00%

Abstract:

Thesis (Ph.D.)--University of Washington, 2016-06

Scalable unstructured mesh decomposition

Relevance: 60.00%

Abstract:

As the efficiency of parallel software increases it is becoming common to measure near linear speedup for many applications. For a problem size N on P processors then with software running at O(N/P) the performance restrictions due to file i/o systems and mesh decomposition running at O(N) become increasingly apparent especially for large P. For distributed memory parallel systems an additional limit to scalability results from the finite memory size available for i/o scatter/gather operations. Simple strategies developed to address the scalability of scatter/gather operations for unstructured mesh based applications have been extended to provide scalable mesh decomposition through the development of a parallel graph partitioning code, JOSTLE [8]. The focus of this work is directed towards the development of generic strategies that can be incorporated into the Computer Aided Parallelisation Tools (CAPTools) project.
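
The scaling argument in this abstract can be made concrete with a small Python sketch. It assumes a simple cost model that is not taken from the paper: the solver costs `compute_cost` per mesh element and parallelises perfectly (the O(N/P) term), while file I/O and mesh decomposition cost `serial_cost` per element and stay serial (the O(N) term). Both parameter names and their default values are hypothetical.

```python
def modelled_speedup(n, p, compute_cost=1.0, serial_cost=0.01):
    """Speedup on p processors for a problem of size n.

    Assumed model (illustrative only):
      parallel part: compute_cost * n / p   -> O(N/P)
      serial part:   serial_cost * n        -> O(N), e.g. I/O and decomposition
    """
    serial_time = (compute_cost + serial_cost) * n
    parallel_time = compute_cost * n / p + serial_cost * n
    return serial_time / parallel_time


# The O(N) term dominates as p grows, so the speedup flattens out.
for p in (1, 10, 100, 1000, 10000):
    print(p, round(modelled_speedup(10**6, p), 2))
```

Under this model the speedup can never exceed (compute_cost + serial_cost) / serial_cost, which is 101 with the default values, however many processors are added. This is why a decomposition step that itself scales with P becomes important for large P.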
