Biblioteca Digital

988 resultados para parallel implementation

Design and implementation of a windows-based parallel computing environment for large scale optimization

Relevância:

40.00% 40.00%

Publicador:

Resumo:

A parallel computing environment to support optimization of large-scale engineering systems is designed and implemented on Windows-based personal computer networks, using the master-worker model and the Parallel Virtual Machine (PVM). It is involved in decomposition of a large engineering system into a number of smaller subsystems optimized in parallel on worker nodes and coordination of subsystem optimization results on the master node. The environment consists of six functional modules, i.e. the master control, the optimization model generator, the optimizer, the data manager, the monitor, and the post processor. Object-oriented design of these modules is presented. The environment supports steps from the generation of optimization models to the solution and the visualization on networks of computers. User-friendly graphical interfaces make it easy to define the problem, and monitor and steer the optimization process. It has been verified by an example of a large space truss optimization. (C) 2004 Elsevier Ltd. All rights reserved.

A parallel neuroscale implementation using C++ and MPI

Relevância:

40.00% 40.00%

Publicador:

Resumo:

DUE TO COPYRIGHT RESTRICTIONS ONLY AVAILABLE FOR CONSULTATION AT ASTON UNIVERSITY LIBRARY AND INFORMATION SERVICES WITH PRIOR ARRANGEMENT

Design and implementation of a floating point unit for rigel, a massively parallel accelerator

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Scientific applications rely heavily on floating point data types. Floating point operations are complex and require complicated hardware that is both area and power intensive. The emergence of massively parallel architectures like Rigel creates new challenges and poses new questions with respect to floating point support. The massively parallel aspect of Rigel places great emphasis on area efficient, low power designs. At the same time, Rigel is a general purpose accelerator and must provide high performance for a wide class of applications. This thesis presents an analysis of various floating point unit (FPU) components with respect to Rigel, and attempts to present a candidate design of an FPU that balances performance, area, and power and is suitable for massively parallel architectures like Rigel.

Harmonic identification using parallel neural networks in single-phase systems

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper, artificial neural networks are employed in a novel approach to identify harmonic components of single-phase nonlinear load currents, whose amplitude and phase angle are subject to unpredictable changes, even in steady-state. The first six harmonic current components are identified through the variation analysis of waveform characteristics. The effectiveness of this method is tested by applying it to the model of a single-phase active power filter, dedicated to the selective compensation of harmonic current drained by an AC controller. Simulation and experimental results are presented to validate the proposed approach. (C) 2010 Elsevier B. V. All rights reserved.

Computational chemistry on Fujitsu vector-parallel processors: Development and performance of applications software

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this and a preceding paper, we provide an introduction to the Fujitsu VPP range of vector-parallel supercomputers and to some of the computational chemistry software available for the VPP. Here, we consider the implementation and performance of seven popular chemistry application packages. The codes discussed range from classical molecular dynamics to semiempirical and ab initio quantum chemistry. All have evolved from sequential codes, and have typically been parallelised using a replicated data approach. As such they are well suited to the large-memory/fast-processor architecture of the VPP. For one code, CASTEP, a distributed-memory data-driven parallelisation scheme is presented. (C) 2000 Published by Elsevier Science B.V. All rights reserved.

Target similarity effects: Support for the parallel distributed processing assumptions

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Recent research has begun to provide support for the assumptions that memories are stored as a composite and are accessed in parallel (Tehan & Humphreys, 1998). New predictions derived from these assumptions and from the Chappell and Humphreys (1994) implementation of these assumptions were tested. In three experiments, subjects studied relatively short lists of words. Some of the Lists contained two similar targets (thief and theft) or two dissimilar targets (thief and steal) associated with the same cue (ROBBERY). AS predicted, target similarity affected performance in cued recall but not free association. Contrary to predictions, two spaced presentations of a target did not improve performance in free association. Two additional experiments confirmed and extended this finding. Several alternative explanations for the target similarity effect, which incorporate assumptions about separate representations and sequential search, are rejected. The importance of the finding that, in at least one implicit memory paradigm, repetition does not improve performance is also discussed.

Alternatives for parallel Krylov subspace basis computation

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Numerical methods related to Krylov subspaces are widely used in large sparse numerical linear algebra. Vectors in these subspaces are manipulated via their representation onto orthonormal bases. Nowadays, on serial computers, the method of Arnoldi is considered as a reliable technique for constructing such bases. However, although easily parallelizable, this technique is not as scalable as expected for communications. In this work we examine alternative methods aimed at overcoming this drawback. Since they retrieve upon completion the same information as Arnoldi's algorithm does, they enable us to design a wide family of stable and scalable Krylov approximation methods for various parallel environments. We present timing results obtained from their implementation on two distributed-memory multiprocessor supercomputers: the Intel Paragon and the IBM Scalable POWERparallel SP2. (C) 1997 by John Wiley & Sons, Ltd.

Kdd, semma crisp-dm: a parallel overviee

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In the last years there has been a huge growth and consolidation of the Data Mining field. Some efforts are being done that seek the establishment of standards in the area. Included on these efforts there can be enumerated SEMMA and CRISP-DM. Both grow as industrial standards and define a set of sequential steps that pretends to guide the implementation of data mining applications. The question of the existence of substantial differences between them and the traditional KDD process arose. In this paper, is pretended to establish a parallel between these and the KDD process as well as an understanding of the similarities between them.

KDD, SEMMA and CRISP-DM: a parallel overview

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In the last years there has been a huge growth and consolidation of the Data Mining field. Some efforts are being done that seek the establishment of standards in the area. Included on these efforts there can be enumerated SEMMA and CRISP-DM. Both grow as industrial standards and define a set of sequential steps that pretends to guide the implementation of data mining applications. The question of the existence of substantial differences between them and the traditional KDD process arose. In this paper, is pretended to establish a parallel between these and the KDD process as well as an understanding of the similarities between them.

Towards the wide implementation of standards IEC 61968/70 (CIM) and IEC 61850 in the distribution system (CIGRÉ C6-105_2010)

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Currently, power systems (PS) already accommodate a substantial penetration of distributed generation (DG) and operate in competitive environments. In the future, as the result of the liberalisation and political regulations, PS will have to deal with large-scale integration of DG and other distributed energy resources (DER), such as storage and provide market agents to ensure a flexible and secure operation. This cannot be done with the traditional PS operational tools used today like the quite restricted information systems Supervisory Control and Data Acquisition (SCADA) [1]. The trend to use the local generation in the active operation of the power system requires new solutions for data management system. The relevant standards have been developed separately in the last few years so there is a need to unify them in order to receive a common and interoperable solution. For the distribution operation the CIM models described in the IEC 61968/70 are especially relevant. In Europe dispersed and renewable energy resources (D&RER) are mostly operated without remote control mechanisms and feed the maximal amount of available power into the grid. To improve the network operation performance the idea of virtual power plants (VPP) will become a reality. In the future power generation of D&RER will be scheduled with a high accuracy. In order to realize VPP decentralized energy management, communication facilities are needed that have standardized interfaces and protocols. IEC 61850 is suitable to serve as a general standard for all communication tasks in power systems [2]. The paper deals with international activities and experiences in the implementation of a new data management and communication concept in the distribution system. The difficulties in the coordination of the inconsistent developed in parallel communication and data management standards - are first addressed in the paper. The upcoming unification work taking into account the growing role of D&RER in the PS is shown. It is possible to overcome the lag in current practical experiences using new tools for creating and maintenance the CIM data and simulation of the IEC 61850 protocol – the prototype of which is presented in the paper –. The origin and the accuracy of the data requirements depend on the data use (e.g. operation or planning) so some remarks concerning the definition of the digital interface incorporated in the merging unit idea from the power utility point of view are presented in the paper too. To summarize some required future work has been identified.

Supporting real-time parallel task models with work-stealing

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Dynamic parallel scheduling using work-stealing has gained popularity in academia and industry for its good performance, ease of implementation and theoretical bounds on space and time. Cores treat their own double-ended queues (deques) as a stack, pushing and popping threads from the bottom, but treat the deque of another randomly selected busy core as a queue, stealing threads only from the top, whenever they are idle. However, this standard approach cannot be directly applied to real-time systems, where the importance of parallelising tasks is increasing due to the limitations of multiprocessor scheduling theory regarding parallelism. Using one deque per core is obviously a source of priority inversion since high priority tasks may eventually be enqueued after lower priority tasks, possibly leading to deadline misses as in this case the lower priority tasks are the candidates when a stealing operation occurs. Our proposal is to replace the single non-priority deque of work-stealing with ordered per-processor priority deques of ready threads. The scheduling algorithm starts with a single deque per-core, but unlike traditional work-stealing, the total number of deques in the system may now exceed the number of processors. Instead of stealing randomly, cores steal from the highest priority deque.

A debugging engine for parallel and distributed programs

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Dissertação apresentada para a obtenção do Grau de Doutor em Informática pela Universidade Nova de Lisboa, Faculdade de Ciências e Tecnologia.

Control of a 6-DOF parallel manipulator through a mechatronic approach

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper shows that a hierarchical architecture, distributing several control actions in growing levels of complexity and using resources of reconfigurable computing, enables one to take into account the ease of future modifications, updates and improvements in robotic applications. An experimental example of a Stewart—Gough platform control (a platform applied as the solution to countless practical problems) is presented using reconfigurable computing. The software and hardware developed are structured in independent blocks. This open architecture implementation allows easy expansion of the system and better adaptation of the platform to its related tasks.

Optimization of multi-stage amplifiers in deep-submicron CMOS using a distributed/parallel genetic algorithm

Relevância:

30.00% 30.00%

Publicador:

Resumo:

IEEE International Symposium on Circuits and Systems, pp. 724 – 727, Seattle, EUA

Parallel hyperspectral unmixing method via split augmented lagrangian on GPU

Relevância:

30.00% 30.00%

Publicador:

Resumo:

One of the main problems of hyperspectral data analysis is the presence of mixed pixels due to the low spatial resolution of such images. Linear spectral unmixing aims at inferring pure spectral signatures and their fractions at each pixel of the scene. The huge data volumes acquired by hyperspectral sensors put stringent requirements on processing and unmixing methods. This letter proposes an efficient implementation of the method called simplex identification via split augmented Lagrangian (SISAL) which exploits the graphics processing unit (GPU) architecture at low level using Compute Unified Device Architecture. SISAL aims to identify the endmembers of a scene, i.e., is able to unmix hyperspectral data sets in which the pure pixel assumption is violated. The proposed implementation is performed in a pixel-by-pixel fashion using coalesced accesses to memory and exploiting shared memory to store temporary data. Furthermore, the kernels have been optimized to minimize the threads divergence, therefore achieving high GPU occupancy. The experimental results obtained for the simulated and real hyperspectral data sets reveal speedups up to 49 times, which demonstrates that the GPU implementation can significantly accelerate the method's execution over big data sets while maintaining the methods accuracy.

«
1
2
3
4
5
6
7
8
...
65
66
»