59 resultados para Parallel algorithm

em Instituto Politécnico do Porto, Portugal


Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper proposes a global multiprocessor scheduling algorithm for the Linux kernel that combines the global EDF scheduler with a priority-aware work-stealing load balancing scheme, enabling parallel real-time tasks to be executed on more than one processor at a given time instant. We state that some priority inversion may actually be acceptable, provided it helps reduce contention, communication, synchronisation and coordination between parallel threads, while still guaranteeing the expected system’s predictability. Experimental results demonstrate the low scheduling overhead of the proposed approach comparatively to an existing real-time deadline-oriented scheduling class for the Linux kernel.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Dynamic parallel scheduling using work-stealing has gained popularity in academia and industry for its good performance, ease of implementation and theoretical bounds on space and time. Cores treat their own double-ended queues (deques) as a stack, pushing and popping threads from the bottom, but treat the deque of another randomly selected busy core as a queue, stealing threads only from the top, whenever they are idle. However, this standard approach cannot be directly applied to real-time systems, where the importance of parallelising tasks is increasing due to the limitations of multiprocessor scheduling theory regarding parallelism. Using one deque per core is obviously a source of priority inversion since high priority tasks may eventually be enqueued after lower priority tasks, possibly leading to deadline misses as in this case the lower priority tasks are the candidates when a stealing operation occurs. Our proposal is to replace the single non-priority deque of work-stealing with ordered per-processor priority deques of ready threads. The scheduling algorithm starts with a single deque per-core, but unlike traditional work-stealing, the total number of deques in the system may now exceed the number of processors. Instead of stealing randomly, cores steal from the highest priority deque.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The recent trends of chip architectures with higher number of heterogeneous cores, and non-uniform memory/non-coherent caches, brings renewed attention to the use of Software Transactional Memory (STM) as a fundamental building block for developing parallel applications. Nevertheless, although STM promises to ease concurrent and parallel software development, it relies on the possibility of aborting conflicting transactions to maintain data consistency, which impacts on the responsiveness and timing guarantees required by embedded real-time systems. In these systems, contention delays must be (efficiently) limited so that the response times of tasks executing transactions are upper-bounded and task sets can be feasibly scheduled. In this paper we assess the use of STM in the development of embedded real-time software, defending that the amount of contention can be reduced if read-only transactions access recent consistent data snapshots, progressing in a wait-free manner. We show how the required number of versions of a shared object can be calculated for a set of tasks. We also outline an algorithm to manage conflicts between update transactions that prevents starvation.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper addresses the problem of finding several different solutions with the same optimum performance in single objective real-world engineering problems. In this paper a parallel robot design is proposed. Thereby, this paper presents a genetic algorithm to optimize uni-objective problems with an infinite number of optimal solutions. The algorithm uses the maximin concept and ε-dominance to promote diversity over the admissible space. The performance of the proposed algorithm is analyzed with three well-known test functions and a function obtained from practical real-world engineering optimization problems. A spreading analysis is performed showing that the solutions drawn by the algorithm are well dispersed.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Face à estagnação da tecnologia uniprocessador registada na passada década, aos principais fabricantes de microprocessadores encontraram na tecnologia multi-core a resposta `as crescentes necessidades de processamento do mercado. Durante anos, os desenvolvedores de software viram as suas aplicações acompanhar os ganhos de performance conferidos por cada nova geração de processadores sequenciais, mas `a medida que a capacidade de processamento escala em função do número de processadores, a computação sequencial tem de ser decomposta em várias partes concorrentes que possam executar em paralelo, para que possam utilizar as unidades de processamento adicionais e completar mais rapidamente. A programação paralela implica um paradigma completamente distinto da programação sequencial. Ao contrário dos computadores sequenciais tipificados no modelo de Von Neumann, a heterogeneidade de arquiteturas paralelas requer modelos de programação paralela que abstraiam os programadores dos detalhes da arquitectura e simplifiquem o desenvolvimento de aplicações concorrentes. Os modelos de programação paralela mais populares incitam os programadores a identificar instruções concorrentes na sua lógica de programação, e a especificá-las sob a forma de tarefas que possam ser atribuídas a processadores distintos para executarem em simultâneo. Estas tarefas são tipicamente lançadas durante a execução, e atribuídas aos processadores pelo motor de execução subjacente. Como os requisitos de processamento costumam ser variáveis, e não são conhecidos a priori, o mapeamento de tarefas para processadores tem de ser determinado dinamicamente, em resposta a alterações imprevisíveis dos requisitos de execução. `A medida que o volume da computação cresce, torna-se cada vez menos viável garantir as suas restrições temporais em plataformas uniprocessador. Enquanto os sistemas de tempo real se começam a adaptar ao paradigma de computação paralela, há uma crescente aposta em integrar execuções de tempo real com aplicações interativas no mesmo hardware, num mundo em que a tecnologia se torna cada vez mais pequena, leve, ubíqua, e portável. Esta integração requer soluções de escalonamento que simultaneamente garantam os requisitos temporais das tarefas de tempo real e mantenham um nível aceitável de QoS para as restantes execuções. Para tal, torna-se imperativo que as aplicações de tempo real paralelizem, de forma a minimizar os seus tempos de resposta e maximizar a utilização dos recursos de processamento. Isto introduz uma nova dimensão ao problema do escalonamento, que tem de responder de forma correcta a novos requisitos de execução imprevisíveis e rapidamente conjeturar o mapeamento de tarefas que melhor beneficie os critérios de performance do sistema. A técnica de escalonamento baseado em servidores permite reservar uma fração da capacidade de processamento para a execução de tarefas de tempo real, e assegurar que os efeitos de latência na sua execução não afectam as reservas estipuladas para outras execuções. No caso de tarefas escalonadas pelo tempo de execução máximo, ou tarefas com tempos de execução variáveis, torna-se provável que a largura de banda estipulada não seja consumida por completo. Para melhorar a utilização do sistema, os algoritmos de partilha de largura de banda (capacity-sharing) doam a capacidade não utilizada para a execução de outras tarefas, mantendo as garantias de isolamento entre servidores. Com eficiência comprovada em termos de espaço, tempo, e comunicação, o mecanismo de work-stealing tem vindo a ganhar popularidade como metodologia para o escalonamento de tarefas com paralelismo dinâmico e irregular. O algoritmo p-CSWS combina escalonamento baseado em servidores com capacity-sharing e work-stealing para cobrir as necessidades de escalonamento dos sistemas abertos de tempo real. Enquanto o escalonamento em servidores permite partilhar os recursos de processamento sem interferências a nível dos atrasos, uma nova política de work-stealing que opera sobre o mecanismo de capacity-sharing aplica uma exploração de paralelismo que melhora os tempos de resposta das aplicações e melhora a utilização do sistema. Esta tese propõe uma implementação do algoritmo p-CSWS para o Linux. Em concordância com a estrutura modular do escalonador do Linux, ´e definida uma nova classe de escalonamento que visa avaliar a aplicabilidade da heurística p-CSWS em circunstâncias reais. Ultrapassados os obstáculos intrínsecos `a programação da kernel do Linux, os extensos testes experimentais provam que o p-CSWS ´e mais do que um conceito teórico atrativo, e que a exploração heurística de paralelismo proposta pelo algoritmo beneficia os tempos de resposta das aplicações de tempo real, bem como a performance e eficiência da plataforma multiprocessador.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In the last years there has been a huge growth and consolidation of the Data Mining field. Some efforts are being done that seek the establishment of standards in the area. Included on these efforts there can be enumerated SEMMA and CRISP-DM. Both grow as industrial standards and define a set of sequential steps that pretends to guide the implementation of data mining applications. The question of the existence of substantial differences between them and the traditional KDD process arose. In this paper, is pretended to establish a parallel between these and the KDD process as well as an understanding of the similarities between them.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In the last years there has been a huge growth and consolidation of the Data Mining field. Some efforts are being done that seek the establishment of standards in the area. Included on these efforts there can be enumerated SEMMA and CRISP-DM. Both grow as industrial standards and define a set of sequential steps that pretends to guide the implementation of data mining applications. The question of the existence of substantial differences between them and the traditional KDD process arose. In this paper, is pretended to establish a parallel between these and the KDD process as well as an understanding of the similarities between them.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

5th. European Congress on Computational Methods in Applied Sciences and Engineering (ECCOMAS 2008) 8th. World Congress on Computational Mechanics (WCCM8)

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In recent years the use of several new resources in power systems, such as distributed generation, demand response and more recently electric vehicles, has significantly increased. Power systems aim at lowering operational costs, requiring an adequate energy resources management. In this context, load consumption management plays an important role, being necessary to use optimization strategies to adjust the consumption to the supply profile. These optimization strategies can be integrated in demand response programs. The control of the energy consumption of an intelligent house has the objective of optimizing the load consumption. This paper presents a genetic algorithm approach to manage the consumption of a residential house making use of a SCADA system developed by the authors. Consumption management is done reducing or curtailing loads to keep the power consumption in, or below, a specified energy consumption limit. This limit is determined according to the consumer strategy and taking into account the renewable based micro generation, energy price, supplier solicitations, and consumers’ preferences. The proposed approach is compared with a mixed integer non-linear approach.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

To maintain a power system within operation limits, a level ahead planning it is necessary to apply competitive techniques to solve the optimal power flow (OPF). OPF is a non-linear and a large combinatorial problem. The Ant Colony Search (ACS) optimization algorithm is inspired by the organized natural movement of real ants and has been successfully applied to different large combinatorial optimization problems. This paper presents an implementation of Ant Colony optimization to solve the OPF in an economic dispatch context. The proposed methodology has been developed to be used for maintenance and repairing planning with 48 to 24 hours antecipation. The main advantage of this method is its low execution time that allows the use of OPF when a large set of scenarios has to be analyzed. The paper includes a case study using the IEEE 30 bus network. The results are compared with other well-known methodologies presented in the literature.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper presents a Unit Commitment model with reactive power compensation that has been solved by Genetic Algorithm (GA) optimization techniques. The GA has been developed a computational tools programmed/coded in MATLAB. The main objective is to find the best generations scheduling whose active power losses are minimal and the reactive power to be compensated, subjected to the power system technical constraints. Those are: full AC power flow equations, active and reactive power generation constraints. All constraints that have been represented in the objective function are weighted with a penalty factors. The IEEE 14-bus system has been used as test case to demonstrate the effectiveness of the proposed algorithm. Results and conclusions are dully drawn.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Electricity market players operating in a liberalized environment requires access to an adequate decision support tool, allowing them to consider all the business opportunities and take strategic decisions. Ancillary services represent a good negotiation opportunity that must be considered by market players. For this, decision support tools must include ancillary market simulation. This paper proposes two different methods (Linear Programming and Genetic Algorithm approaches) for ancillary services dispatch. The methodologies are implemented in MASCEM, a multi-agent based electricity market simulator. A test case concerning the dispatch of Regulation Down, Regulation Up, Spinning Reserve and Non-Spinning Reserve services is included in this paper.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Although it is always weak between RFID Tag and Terminal in focus of the security, there are no security skills in RFID Tag. Recently there are a lot of studying in order to protect it, but because it has some physical limitation of RFID, that is it should be low electric power and high speed, it is impossible to protect with the skills. At present, the methods of RFID security are using a security server, a security policy and security. One of them the most famous skill is the security module, then they has an authentication skill and an encryption skill. In this paper, we designed and implemented after modification original SEED into 8 Round and 64 bits for Tag.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Real-time embedded applications require to process large amounts of data within small time windows. Parallelize and distribute workloads adaptively is suitable solution for computational demanding applications. The purpose of the Parallel Real-Time Framework for distributed adaptive embedded systems is to guarantee local and distributed processing of real-time applications. This work identifies some promising research directions for parallel/distributed real-time embedded applications.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Embedded real-time applications increasingly present high computation requirements, which need to be completed within specific deadlines, but that present highly variable patterns, depending on the set of data available in a determined instant. The current trend to provide parallel processing in the embedded domain allows providing higher processing power; however, it does not address the variability in the processing pattern. Dimensioning each device for its worst-case scenario implies lower average utilization, and increased available, but unusable, processing in the overall system. A solution for this problem is to extend the parallel execution of the applications, allowing networked nodes to distribute the workload, on peak situations, to neighbour nodes. In this context, this report proposes a framework to develop parallel and distributed real-time embedded applications, transparently using OpenMP and Message Passing Interface (MPI), within a programming model based on OpenMP. The technical report also devises an integrated timing model, which enables the structured reasoning on the timing behaviour of these hybrid architectures.