882 resultados para parallel processing systems


Relevância:

30.00% 30.00%

Publicador:

Resumo:

Dissertação para obtenção do grau de Mestre em Engenharia Electrotécnica Ramo Automação e Electrónica Industrial

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Os sistemas de tempo real modernos geram, cada vez mais, cargas computacionais pesadas e dinâmicas, começando-se a tornar pouco expectável que sejam implementados em sistemas uniprocessador. Na verdade, a mudança de sistemas com um único processador para sistemas multi- processador pode ser vista, tanto no domínio geral, como no de sistemas embebidos, como uma forma eficiente, em termos energéticos, de melhorar a performance das aplicações. Simultaneamente, a proliferação das plataformas multi-processador transformaram a programação paralela num tópico de elevado interesse, levando o paralelismo dinâmico a ganhar rapidamente popularidade como um modelo de programação. A ideia, por detrás deste modelo, é encorajar os programadores a exporem todas as oportunidades de paralelismo através da simples indicação de potenciais regiões paralelas dentro das aplicações. Todas estas anotações são encaradas pelo sistema unicamente como sugestões, podendo estas serem ignoradas e substituídas, por construtores sequenciais equivalentes, pela própria linguagem. Assim, o modo como a computação é na realidade subdividida, e mapeada nos vários processadores, é da responsabilidade do compilador e do sistema computacional subjacente. Ao retirar este fardo do programador, a complexidade da programação é consideravelmente reduzida, o que normalmente se traduz num aumento de produtividade. Todavia, se o mecanismo de escalonamento subjacente não for simples e rápido, de modo a manter o overhead geral em níveis reduzidos, os benefícios da geração de um paralelismo com uma granularidade tão fina serão meramente hipotéticos. Nesta perspetiva de escalonamento, os algoritmos que empregam uma política de workstealing são cada vez mais populares, com uma eficiência comprovada em termos de tempo, espaço e necessidades de comunicação. Contudo, estes algoritmos não contemplam restrições temporais, nem outra qualquer forma de atribuição de prioridades às tarefas, o que impossibilita que sejam diretamente aplicados a sistemas de tempo real. Além disso, são tradicionalmente implementados no runtime da linguagem, criando assim um sistema de escalonamento com dois níveis, onde a previsibilidade, essencial a um sistema de tempo real, não pode ser assegurada. Nesta tese, é descrita a forma como a abordagem de work-stealing pode ser resenhada para cumprir os requisitos de tempo real, mantendo, ao mesmo tempo, os seus princípios fundamentais que tão bons resultados têm demonstrado. Muito resumidamente, a única fila de gestão de processos convencional (deque) é substituída por uma fila de deques, ordenada de forma crescente por prioridade das tarefas. De seguida, aplicamos por cima o conhecido algoritmo de escalonamento dinâmico G-EDF, misturamos as regras de ambos, e assim nasce a nossa proposta: o algoritmo de escalonamento RTWS. Tirando partido da modularidade oferecida pelo escalonador do Linux, o RTWS é adicionado como uma nova classe de escalonamento, de forma a avaliar na prática se o algoritmo proposto é viável, ou seja, se garante a eficiência e escalonabilidade desejadas. Modificar o núcleo do Linux é uma tarefa complicada, devido à complexidade das suas funções internas e às fortes interdependências entre os vários subsistemas. Não obstante, um dos objetivos desta tese era ter a certeza que o RTWS é mais do que um conceito interessante. Assim, uma parte significativa deste documento é dedicada à discussão sobre a implementação do RTWS e à exposição de situações problemáticas, muitas delas não consideradas em teoria, como é o caso do desfasamento entre vários mecanismo de sincronização. Os resultados experimentais mostram que o RTWS, em comparação com outro trabalho prático de escalonamento dinâmico de tarefas com restrições temporais, reduz significativamente o overhead de escalonamento através de um controlo de migrações, e mudanças de contexto, eficiente e escalável (pelo menos até 8 CPUs), ao mesmo tempo que alcança um bom balanceamento dinâmico da carga do sistema, até mesmo de uma forma não custosa. Contudo, durante a avaliação realizada foi detetada uma falha na implementação do RTWS, pela forma como facilmente desiste de roubar trabalho, o que origina períodos de inatividade, no CPU em questão, quando a utilização geral do sistema é baixa. Embora o trabalho realizado se tenha focado em manter o custo de escalonamento baixo e em alcançar boa localidade dos dados, a escalonabilidade do sistema nunca foi negligenciada. Na verdade, o algoritmo de escalonamento proposto provou ser bastante robusto, não falhando qualquer meta temporal nas experiências realizadas. Portanto, podemos afirmar que alguma inversão de prioridades, causada pela sub-política de roubo BAS, não compromete os objetivos de escalonabilidade, e até ajuda a reduzir a contenção nas estruturas de dados. Mesmo assim, o RTWS também suporta uma sub-política de roubo determinística: PAS. A avaliação experimental, porém, não ajudou a ter uma noção clara do impacto de uma e de outra. No entanto, de uma maneira geral, podemos concluir que o RTWS é uma solução promissora para um escalonamento eficiente de tarefas paralelas com restrições temporais.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

TLE in infancy has been the subject of varied research. Topographical and structural evidence is coincident with the neuronal systems responsible for auditory processing of the highest specialization and complexity. Recent studies have been showing the need of a hemispheric asymmetry for an optimization in central auditory processing (CAP) and acquisition and learning of a language system. A new functional research paradigm is required to study mental processes that require methods of cognitive-sensory information analysis processed in very short periods of time (msec), such as the ERPs. Thus, in this article, we hypothesize that the TLE in infancy could be a good model for topographic and functional study of CAP and its development process, contributing to a better understanding of the learning difficulties that children with this neurological disorder have.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Coronary artery disease (CAD) is currently one of the most prevalent diseases in the world population and calcium deposits in coronary arteries are one direct risk factor. These can be assessed by the calcium score (CS) application, available via a computed tomography (CT) scan, which gives an accurate indication of the development of the disease. However, the ionising radiation applied to patients is high. This study aimed to optimise the protocol acquisition in order to reduce the radiation dose and explain the flow of procedures to quantify CAD. The main differences in the clinical results, when automated or semiautomated post-processing is used, will be shown, and the epidemiology, imaging, risk factors and prognosis of the disease described. The software steps and the values that allow the risk of developingCADto be predicted will be presented. A64-row multidetector CT scan with dual source and two phantoms (pig hearts) were used to demonstrate the advantages and disadvantages of the Agatston method. The tube energy was balanced. Two measurements were obtained in each of the three experimental protocols (64, 128, 256 mAs). Considerable changes appeared between the values of CS relating to the protocol variation. The predefined standard protocol provided the lowest dose of radiation (0.43 mGy). This study found that the variation in the radiation dose between protocols, taking into consideration the dose control systems attached to the CT equipment and image quality, was not sufficient to justify changing the default protocol provided by the manufacturer.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In Distributed Computer-Controlled Systems (DCCS), both real-time and reliability requirements are of major concern. Architectures for DCCS must be designed considering the integration of processing nodes and the underlying communication infrastructure. Such integration must be provided by appropriate software support services. In this paper, an architecture for DCCS is presented, its structure is outlined, and the services provided by the support software are presented. These are considered in order to guarantee the real-time and reliability requirements placed by current and future systems.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper proposes a global multiprocessor scheduling algorithm for the Linux kernel that combines the global EDF scheduler with a priority-aware work-stealing load balancing scheme, enabling parallel real-time tasks to be executed on more than one processor at a given time instant. We state that some priority inversion may actually be acceptable, provided it helps reduce contention, communication, synchronisation and coordination between parallel threads, while still guaranteeing the expected system’s predictability. Experimental results demonstrate the low scheduling overhead of the proposed approach comparatively to an existing real-time deadline-oriented scheduling class for the Linux kernel.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Dynamic parallel scheduling using work-stealing has gained popularity in academia and industry for its good performance, ease of implementation and theoretical bounds on space and time. Cores treat their own double-ended queues (deques) as a stack, pushing and popping threads from the bottom, but treat the deque of another randomly selected busy core as a queue, stealing threads only from the top, whenever they are idle. However, this standard approach cannot be directly applied to real-time systems, where the importance of parallelising tasks is increasing due to the limitations of multiprocessor scheduling theory regarding parallelism. Using one deque per core is obviously a source of priority inversion since high priority tasks may eventually be enqueued after lower priority tasks, possibly leading to deadline misses as in this case the lower priority tasks are the candidates when a stealing operation occurs. Our proposal is to replace the single non-priority deque of work-stealing with ordered per-processor priority deques of ready threads. The scheduling algorithm starts with a single deque per-core, but unlike traditional work-stealing, the total number of deques in the system may now exceed the number of processors. Instead of stealing randomly, cores steal from the highest priority deque.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Multicore platforms have transformed parallelism into a main concern. Parallel programming models are being put forward to provide a better approach for application programmers to expose the opportunities for parallelism by pointing out potentially parallel regions within tasks, leaving the actual and dynamic scheduling of these regions onto processors to be performed at runtime, exploiting the maximum amount of parallelism. It is in this context that this paper proposes a scheduling approach that combines the constant-bandwidth server abstraction with a priority-aware work-stealing load balancing scheme which, while ensuring isolation among tasks, enables parallel tasks to be executed on more than one processor at a given time instant.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Consider the problem of designing an algorithm for acquiring sensor readings. Consider specifically the problem of obtaining an approximate representation of sensor readings where (i) sensor readings originate from different sensor nodes, (ii) the number of sensor nodes is very large, (iii) all sensor nodes are deployed in a small area (dense network) and (iv) all sensor nodes communicate over a communication medium where at most one node can transmit at a time (a single broadcast domain). We present an efficient algorithm for this problem, and our novel algorithm has two desired properties: (i) it obtains an interpolation based on all sensor readings and (ii) it is scalable, that is, its time-complexity is independent of the number of sensor nodes. Achieving these two properties is possible thanks to the close interlinking of the information processing algorithm, the communication system and a model of the physical world.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

It has been shown that in reality at least two general scenarios of data structuring are possible: (a) a self-similar (SS) scenario when the measured data form an SS structure and (b) a quasi-periodic (QP) scenario when the repeated (strongly correlated) data form random sequences that are almost periodic with respect to each other. In the second case it becomes possible to describe their behavior and express a part of their randomness quantitatively in terms of the deterministic amplitude–frequency response belonging to the generalized Prony spectrum. This possibility allows us to re-examine the conventional concept of measurements and opens a new way for the description of a wide set of different data. In particular, it concerns different complex systems when the ‘best-fit’ model pretending to be the description of the data measured is absent but the barest necessity of description of these data in terms of the reduced number of quantitative parameters exists. The possibilities of the proposed approach and detection algorithm of the QP processes were demonstrated on actual data: spectroscopic data recorded for pure water and acoustic data for a test hole. The suggested methodology allows revising the accepted classification of different incommensurable and self-affine spatial structures and finding accurate interpretation of the generalized Prony spectroscopy that includes the Fourier spectroscopy as a partial case.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The availability of small inexpensive sensor elements enables the employment of large wired or wireless sensor networks for feeding control systems. Unfortunately, the need to transmit a large number of sensor measurements over a network negatively affects the timing parameters of the control loop. This paper presents a solution to this problem by representing sensor measurements with an approximate representation-an interpolation of sensor measurements as a function of space coordinates. A priority-based medium access control (MAC) protocol is used to select the sensor messages with high information content. Thus, the information from a large number of sensor measurements is conveyed within a few messages. This approach greatly reduces the time for obtaining a snapshot of the environment state and therefore supports the real-time requirements of feedback control loops.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Cooperating objects (COs) is a recently coined term used to signify the convergence of classical embedded computer systems, wireless sensor networks and robotics and control. We present essential elements of a reference architecture for scalable data processing for the CO paradigm.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Workflows have been successfully applied to express the decomposition of complex scientific applications. This has motivated many initiatives that have been developing scientific workflow tools. However the existing tools still lack adequate support to important aspects namely, decoupling the enactment engine from workflow tasks specification, decentralizing the control of workflow activities, and allowing their tasks to run autonomous in distributed infrastructures, for instance on Clouds. Furthermore many workflow tools only support the execution of Direct Acyclic Graphs (DAG) without the concept of iterations, where activities are executed millions of iterations during long periods of time and supporting dynamic workflow reconfigurations after certain iteration. We present the AWARD (Autonomic Workflow Activities Reconfigurable and Dynamic) model of computation, based on the Process Networks model, where the workflow activities (AWA) are autonomic processes with independent control that can run in parallel on distributed infrastructures, e. g. on Clouds. Each AWA executes a Task developed as a Java class that implements a generic interface allowing end-users to code their applications without concerns for low-level details. The data-driven coordination of AWA interactions is based on a shared tuple space that also enables support to dynamic workflow reconfiguration and monitoring of the execution of workflows. We describe how AWARD supports dynamic reconfiguration and discuss typical workflow reconfiguration scenarios. For evaluation we describe experimental results of AWARD workflow executions in several application scenarios, mapped to a small dedicated cluster and the Amazon (Elastic Computing EC2) Cloud.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The scarcity and diversity of resources among the devices of heterogeneous computing environments may affect their ability to perform services with specific Quality of Service constraints, particularly in dynamic distributed environments where the characteristics of the computational load cannot always be predicted in advance. Our work addresses this problem by allowing resource constrained devices to cooperate with more powerful neighbour nodes, opportunistically taking advantage of global distributed resources and processing power. Rather than assuming that the dynamic configuration of this cooperative service executes until it computes its optimal output, the paper proposes an anytime approach that has the ability to tradeoff deliberation time for the quality of the solution. Extensive simulations demonstrate that the proposed anytime algorithms are able to quickly find a good initial solution and effectively optimise the rate at which the quality of the current solution improves at each iteration, with an overhead that can be considered negligible.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This letter presents a new parallel method for hyperspectral unmixing composed by the efficient combination of two popular methods: vertex component analysis (VCA) and sparse unmixing by variable splitting and augmented Lagrangian (SUNSAL). First, VCA extracts the endmember signatures, and then, SUNSAL is used to estimate the abundance fractions. Both techniques are highly parallelizable, which significantly reduces the computing time. A design for the commodity graphics processing units of the two methods is presented and evaluated. Experimental results obtained for simulated and real hyperspectral data sets reveal speedups up to 100 times, which grants real-time response required by many remotely sensed hyperspectral applications.