957 results for parallel applications


Relevance:

60.00%

Publisher:

Abstract:

This work presents the concept, design and implementation of an MP-SoC platform named STORM (MP-SoC DirecTory-Based PlatfORM). The platform currently comprises the following modules: SPARC V8 processor, GPOP processor, Cache module, Memory module, Directory module and two different Network-on-Chip models, NoCX4 and Obese Tree. All modules were implemented in SystemC, then simulated and validated individually or in groups. The modules are described in detail. To program the platform in C, a SPARC assembler was implemented that is fully compatible with the assembly code generated by gcc. For parallel programming, a mutex-management library was implemented using the support provided by the assembler. A total of 10 simulations of increasing complexity are presented to validate the concepts. The simulations include real parallel applications, such as matrix multiplication, Mergesort, KMP, Motion Estimation and 2D DCT.

Relevance:

60.00%

Publisher:

Abstract:

The processing capacity of research institutions has grown significantly as ever more powerful processors and workstations reach the market. Given the performance improvements in computer networking and the ever-increasing demand for processing, the idea arose of using independent networked computers as a platform for running parallel applications, giving rise to the field of grid computing. In a network under a single administrative domain, resources such as disks and printers are commonly shared. When the network extends beyond one administrative domain, however, this sharing becomes very limited. The purpose of computing grids is to allow resource sharing even when the resources are spread across several administrative domains. This dissertation proposes an architecture for the dynamic establishment of multi-domain connections that uses Optical Burst Switching (OBS) with a GMPLS (Generalized Multiprotocol Label Switching) control plane. The architecture is based on storing information about the grid resources of distinct Autonomous Systems (AS) in a component called the GOBS (Grid OBS) Root Server, and on using explicit routing to reserve resources along a route that satisfies an application's performance constraints. The proposal is validated through simulations, which show that the architecture can guarantee differentiated performance levels according to the application class and provides better utilization of network and computing resources.

Relevance:

60.00%

Publisher:

Abstract:

Breakthrough advances in microprocessor technology and efficient power management have altered the course of processor development with the emergence of multi-core technology, which delivers higher levels of processing power. The use of many-core technology has boosted the computing power provided by clusters of workstations or SMPs, providing large computational power at an affordable cost using solely commodity components. Different implementations of message-passing libraries and system software (including operating systems) are installed in such cluster and multi-cluster computing systems. To guarantee correct execution of a message-passing parallel application in a computing environment other than the one for which it was originally developed, the application code must be reviewed. In this paper, a hybrid communication interfacing strategy is proposed to execute a parallel application on a group of computing nodes belonging to different clusters or multi-clusters (the computing systems may be running different operating systems and MPI implementations), interconnected with public or private IP addresses, and responding interchangeably to user execution requests. Experimental results from the execution of benchmark parallel applications demonstrate the feasibility and effectiveness of the proposed strategy.

Relevance:

60.00%

Publisher:

Abstract:

Parallel computing offers a number of advantages for running large-scale applications, and the effective use of parallel computational resources is an important aspect of high-performance computing. This work presents a methodology that provides automated execution of parallel applications based on the BSP model with heterogeneous tasks. The adopted model assumes that the computation time of each secondary task does not vary greatly from one iteration to the next. The methodology is called ASE and consists of three stages: Acquisition, Scheduling and Execution. In the Acquisition stage, the processing times of the tasks are obtained; in the Scheduling stage, the methodology seeks the task distribution that maximizes the execution speed of the parallel application while minimizing resource usage, by means of an algorithm developed in this work; and finally, the Execution stage runs the parallel application with the distribution defined in the previous stage. The tools applied in the methodology were implemented. A set of tests applying the methodology was carried out, and the results presented show that the objectives of the proposal were achieved.
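
The Scheduling stage of ASE can be illustrated with a minimal sketch. The abstract does not describe the algorithm developed in that work, so the code below substitutes a well-known heuristic (greedy longest-processing-time-first assignment) and then picks the smallest processor count that still reaches the best superstep time found. The function names, the `tolerance` parameter and the sample task times are all hypothetical.

```python
import heapq


def schedule(task_times, num_procs):
    """Greedy LPT assignment (an assumed stand-in, not the ASE algorithm).

    task_times maps task name -> measured time (the Acquisition output).
    Returns (makespan, assignment), where makespan is the time of one BSP
    superstep, i.e. the load of the most heavily loaded processor.
    """
    heap = [(0.0, p) for p in range(num_procs)]  # (current load, processor)
    heapq.heapify(heap)
    assignment = {p: [] for p in range(num_procs)}
    # Place the longest tasks first, always on the least loaded processor.
    for task, time in sorted(task_times.items(), key=lambda kv: (-kv[1], kv[0])):
        load, proc = heapq.heappop(heap)
        assignment[proc].append(task)
        heapq.heappush(heap, (load + time, proc))
    makespan = max(load for load, _ in heap)
    return makespan, assignment


def fewest_procs_for_best_makespan(task_times, max_procs, tolerance=0.0):
    """Smallest processor count whose makespan is within tolerance of the best."""
    results = {p: schedule(task_times, p) for p in range(1, max_procs + 1)}
    best = min(makespan for makespan, _ in results.values())
    for procs in range(1, max_procs + 1):
        if results[procs][0] <= best * (1.0 + tolerance):
            return procs, results[procs]


# Hypothetical per-task times gathered in the Acquisition stage.
times = {"a": 4.0, "b": 3.0, "c": 2.0, "d": 1.0}
procs, (makespan, assignment) = fewest_procs_for_best_makespan(times, max_procs=4)
```

With these sample times, a fourth processor does not shorten the superstep, so the sketch settles on three. This mirrors the stated goal of maximizing speed while minimizing resource usage.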

Relevance:

60.00%

Publisher:

Abstract:

This talk explores how the runtime system and operating system can leverage metrics that express the significance and resilience of application components in order to reduce the energy footprint of parallel applications. We will explore in particular how software can tolerate and indeed exploit higher error rates in future processors and memory technologies that may operate outside their safe margins.

Relevance:

40.00%

Publisher:

Abstract:

In this and a preceding paper, we provide an introduction to the Fujitsu VPP range of vector-parallel supercomputers and to some of the computational chemistry software available for the VPP. Here, we consider the implementation and performance of seven popular chemistry application packages. The codes discussed range from classical molecular dynamics to semiempirical and ab initio quantum chemistry. All have evolved from sequential codes, and have typically been parallelised using a replicated data approach. As such they are well suited to the large-memory/fast-processor architecture of the VPP. For one code, CASTEP, a distributed-memory data-driven parallelisation scheme is presented. (C) 2000 Published by Elsevier Science B.V. All rights reserved.

Relevance:

40.00%

Publisher:

Abstract:

Real-time embedded applications must process large amounts of data within small time windows. Parallelizing and distributing workloads adaptively is a suitable solution for computationally demanding applications. The purpose of the Parallel Real-Time Framework for distributed adaptive embedded systems is to guarantee local and distributed processing of real-time applications. This work identifies some promising research directions for parallel/distributed real-time embedded applications.

Relevance:

40.00%

Publisher:

Abstract:

The performance of the parallel vector implementation of the one- and two-dimensional orthogonal transforms is evaluated. The orthogonal transforms are computed using actual or modified fast Fourier transform (FFT) kernels. The factors considered in comparing the speed-up of these vectorized digital signal processing algorithms are discussed, and it is shown that the traditional way of comparing the execution speed of digital signal processing algorithms by the ratios of the number of multiplications and additions is no longer effective for vector implementation; the structure of the algorithm must also be considered as a factor when comparing the execution speed of vectorized digital signal processing algorithms. Simulation results on the Cray X/MP with the following orthogonal transforms are presented: discrete Fourier transform (DFT), discrete cosine transform (DCT), discrete sine transform (DST), discrete Hartley transform (DHT), discrete Walsh transform (DWHT), and discrete Hadamard transform (DHDT). A comparison between the DHT and the fast Hartley transform is also included. (34 refs)
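
As an illustration of computing an orthogonal transform from an FFT kernel, the sketch below obtains the discrete Hartley transform from a DFT. It uses the standard identity H[k] = Re(F[k]) - Im(F[k]), which follows from the definitions H[k] = Σ x[n](cos θ + sin θ) and F[k] = Σ x[n](cos θ - i sin θ) with θ = 2πnk/N. The recursive radix-2 FFT is a plain textbook version written for clarity, not the vectorized kernels evaluated in the paper, and all function names are hypothetical.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddled = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddled
        out[k + n // 2] = even[k] - twiddled
    return out


def dht_via_fft(x):
    """Discrete Hartley transform of a real sequence, derived from its DFT."""
    return [f.real - f.imag for f in fft(x)]


def dht_direct(x):
    """O(N^2) reference DHT straight from the cas(t) = cos(t) + sin(t) definition."""
    n = len(x)
    return [
        sum(x[m] * (math.cos(2 * math.pi * m * k / n) + math.sin(2 * math.pi * m * k / n))
            for m in range(n))
        for k in range(n)
    ]


signal = [1.0, 2.0, 3.0, 4.0]
hartley = dht_via_fft(signal)
```

Because the DHT is its own inverse up to a factor of 1/N, applying `dht_via_fft` twice and dividing by the length recovers the input, which gives a quick sanity check.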

Relevance:

40.00%

Publisher:

Abstract:

The main focus of this research is to design and develop a high-performance linear actuator based on a four-bar mechanism. The present work includes the detailed analysis (kinematics and dynamics), design, implementation and experimental validation of the newly designed actuator. High performance is characterized by the acceleration of the actuator end effector. The principle of the newly designed actuator is to network the four-bar rhombus configuration (where some bars are extended to form an X shape) to attain high acceleration. First, a detailed kinematic analysis of the actuator is presented and its kinematic performance is evaluated through MATLAB simulations. A dynamic equation of the actuator is derived using the Lagrangian dynamic formulation. A SIMULINK control model of the actuator is developed using the dynamic equation. In addition, the Bond Graph methodology is presented for the dynamic simulation. The Bond Graph model comprises individual component models of the actuator along with its control. The required torque was simulated using the Bond Graph model. Results indicate that high acceleration (around 20g) can be achieved with modest (3 N-m or less) torque input. A practical prototype of the actuator is designed using SOLIDWORKS and then produced to verify the proof of concept. The design goal was to achieve a peak acceleration of more than 10g at the midpoint of the travel length when the end effector travels the stroke length (around 1 m). The actuator is primarily designed to operate standalone and later to be used in the 3RPR parallel robot. A DC motor is used to operate the actuator. A quadrature encoder is attached to the DC motor to control the end effector. The associated control scheme of the actuator is analyzed and integrated with the physical prototype. In standalone experiments, the end effector achieved an acceleration of around 17g (stroke length was 0.2 m to 0.78 m). The results indicate that the developed dynamic model is in good agreement with the experiments. Finally, a Design of Experiment (DOE) based statistical approach is introduced to identify the parametric combination that yields the greatest performance. Data are collected using the Bond Graph model. This approach is helpful in designing the actuator without much complexity.

Relevance:

40.00%

Publisher:

Abstract:

This paper deals with monolithic decoupled XYZ compliant parallel mechanisms (CPMs) for multi-function applications, which can be fabricated monolithically without assembly and are capable of kinetostatic decoupling. First, the conceptual design of monolithic decoupled XYZ CPMs is presented using identical spatial compliant multi-beam modules based on a decoupled 3-PPPR parallel kinematic mechanism. Three types of applications are described in principle: motion/positioning stages, force/acceleration sensors and energy harvesting devices. Kinetostatic and dynamic modelling is then conducted to capture the displacements of any stage under loads acting at any stage, as well as the natural frequency, with comparisons against FEA results. Finally, the performance characteristics for motion stage applications are investigated in detail to show how changes in the geometrical parameters affect them, which provides initial optimal estimates. Results show that thinner beams and larger cubic stages can improve the performance characteristics, excluding natural frequency, under allowable conditions. To improve the natural frequency characteristic, a stiffness-enhanced monolithic decoupled configuration can be adopted, achieved by employing more beams in the spatial modules or by reducing the mass of each cubic stage. In addition, an isotropic variation with a different motion range along each axis and the same payload in each leg is proposed. A redundant design for monolithic fabrication is also introduced, which overcomes the drawback of monolithic fabrication that a failed compliant beam is difficult to replace, and extends the CPM's life.

Relevance:

40.00%

Publisher:

Abstract:

In many areas of simulation, a crucial component for efficient numerical computations is the use of solution-driven adaptive features: locally adapted meshing or re-meshing, and dynamically changing computational tasks. The full advantages of high performance computing (HPC) technology can thus be exploited only when efficient parallel adaptive solvers can be realised. The resulting requirement for HPC software is for dynamic load balancing, which for many mesh-based applications means dynamic mesh re-partitioning. The DRAMA project has been initiated to address this issue, with a particular focus on the requirements of industrial Finite Element codes, but codes using Finite Volume formulations will also be able to make use of the project results.

Relevance:

40.00%

Publisher:

Abstract:

The central product of the DRAMA (Dynamic Re-Allocation of Meshes for parallel Finite Element Applications) project is a library comprising a variety of tools for dynamic re-partitioning of unstructured Finite Element (FE) applications. The input to the DRAMA library is the computational mesh, and corresponding costs, partitioned into sub-domains. The core library functions then perform a parallel computation of a mesh re-allocation that will re-balance the costs based on the DRAMA cost model. We discuss the basic features of this cost model, which allows a general approach to load identification, modelling and imbalance minimisation. Results from crash simulations are presented which show the necessity for multi-phase/multi-constraint partitioning components.
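
The idea of re-balancing per-element costs across sub-domains can be sketched as follows. This is not the DRAMA cost model or its library interface, neither of which is specified in the abstract. It is a deliberately simplified illustration that measures imbalance as the ratio of the heaviest sub-domain load to the mean load and greedily migrates single elements. It ignores mesh adjacency and any communication costs, which a real re-partitioner would need to consider. All names and sample values are hypothetical.

```python
def part_loads(costs, partition, num_parts):
    """Sum the per-element costs assigned to each sub-domain."""
    loads = [0.0] * num_parts
    for element, part in partition.items():
        loads[part] += costs[element]
    return loads


def imbalance(loads):
    """Heaviest load divided by mean load; 1.0 means perfectly balanced."""
    mean = sum(loads) / len(loads)
    return max(loads) / mean if mean > 0 else 1.0


def migrate_once(costs, partition, num_parts):
    """Move one element from the heaviest to the lightest sub-domain.

    Only elements with positive cost smaller than the load gap are eligible,
    so every move strictly reduces the spread between the two sub-domains.
    Returns True if an element was moved, False if no move helps.
    """
    loads = part_loads(costs, partition, num_parts)
    src = max(range(num_parts), key=loads.__getitem__)
    dst = min(range(num_parts), key=loads.__getitem__)
    gap = loads[src] - loads[dst]
    candidates = [e for e, p in partition.items() if p == src and 0 < costs[e] < gap]
    if not candidates:
        return False
    # Prefer the element whose cost is closest to half the gap.
    best = min(candidates, key=lambda e: abs(costs[e] - gap / 2))
    partition[best] = dst
    return True


# Hypothetical element costs and an initially skewed two-way partition.
costs = {"e0": 4.0, "e1": 3.0, "e2": 2.0, "e3": 1.0}
partition = {"e0": 0, "e1": 0, "e2": 0, "e3": 1}
before = imbalance(part_loads(costs, partition, 2))
while migrate_once(costs, partition, 2):
    pass
after = imbalance(part_loads(costs, partition, 2))
```

In this small example, one migration brings the two sub-domains to equal load.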

Relevance:

30.00%

Publisher:

Abstract:

Background: Feature selection is a pattern recognition approach to choose important variables according to some criteria in order to distinguish or explain certain phenomena (i.e., for dimensionality reduction). There are many genomic and proteomic applications that rely on feature selection to answer questions such as selecting signature genes which are informative about some biological state, e.g., normal tissues and several types of cancer; or inferring a prediction network among elements such as genes, proteins and external stimuli. In these applications, a recurrent problem is the lack of samples to perform an adequate estimate of the joint probabilities between element states. A myriad of feature selection algorithms and criterion functions have been proposed, although it is difficult to point to the best solution for each application. Results: The intent of this work is to provide an open-source multiplatform graphical environment for bioinformatics problems, which supports many feature selection algorithms, criterion functions and graphic visualization tools such as scatterplots, parallel coordinates and graphs. A feature selection approach for growing genetic networks from seed genes (targets or predictors) is also implemented in the system. Conclusion: The proposed feature selection environment allows data analysis using several algorithms, criterion functions and graphic visualization tools. Our experiments have shown the software's effectiveness in two distinct types of biological problems. In addition, the environment can be used in different pattern recognition applications, although its main focus is bioinformatics tasks.

Relevance:

30.00%

Publisher:

Abstract:

In this paper, artificial neural networks are employed in a novel approach to identify harmonic components of single-phase nonlinear load currents, whose amplitude and phase angle are subject to unpredictable changes, even in steady state. The first six harmonic current components are identified through the variation analysis of waveform characteristics. The effectiveness of this method is tested by applying it to the model of a single-phase active power filter, dedicated to the selective compensation of harmonic current drawn by an AC controller. Simulation and experimental results are presented to validate the proposed approach. (C) 2010 Elsevier B.V. All rights reserved.