45 results for HPC parallel computer architecture queues fault tolerance programmability ADAM

in Repositório Institucional UNESP - Universidade Estadual Paulista "Julio de Mesquita Filho"


Relevance: 100.00%

Abstract:

This work presents the design, simulation, and analysis of two optical interconnection networks for a Dataflow parallel computer architecture. To verify the performance of the optical interconnection networks on the Dataflow architecture, we analyzed the load balancing among the processors during the execution of parallel programs. Load balancing is a very important parameter because it is directly associated with the degree of dataflow parallelism. This article demonstrates that optical interconnection networks designed with simple optical devices can efficiently meet the dataflow requirements of a high-performance communication system.

Relevance: 100.00%

Abstract:

This paper presents a set of guidelines for applying the conservative distributed simulation paradigm (CMB protocol) to develop efficient applications. Using these guidelines, even a user with little experience in distributed simulation and computer architecture can obtain good performance in distributed simulations that use conservative synchronization protocols for parallel processes. The set of guidelines focuses on a specific application domain, the performance evaluation of computer systems, considering models with coarse granularity and few logical processes, running on two platforms: parallel (high-performance communication environment) and distributed (low-performance communication environment).

Relevance: 100.00%

Abstract:

This paper describes a data mining environment for knowledge discovery in bioinformatics applications. The system has a generic kernel that implements the mining functions to be applied to input primary databases of biomedical information organized in a warehouse architecture. Both supervised and unsupervised classification can be implemented within the kernel and applied to data extracted from the primary database, with the results stored in a complex object database for knowledge discovery. The kernel also includes a specific high-performance library that allows the mining functions to be designed and applied on parallel machines. The experimental results obtained by applying the kernel functions are reported. © 2003 Elsevier Ltd. All rights reserved.

Relevance: 100.00%

Abstract:

The main objective of this paper is to present the results obtained from applying artificial neural networks and statistical tools to the automatic identification and classification of faults in electric power distribution systems. The techniques developed to address this problem integrate several approaches that contribute to successful fault detection, so that it is carried out reliably and safely. The results compiled from practical experiments on a pilot radial distribution feeder demonstrate that the developed techniques provide accurate results, efficiently identifying and classifying the various faults observed in the feeder.

Relevance: 100.00%

Abstract:

To simplify computer management, several system administrators are adopting advanced techniques to manage the software configuration of enterprise computer networks. However, the tight coupling between hardware and software makes every PC an individually managed entity, which lowers scalability and increases the cost of managing hundreds or thousands of PCs. Virtualization is an established technology, but its use has focused more on server consolidation and virtual desktop infrastructure than on managing distributed computers over a network. This paper discusses the feasibility of the Distributed Virtual Machine Environment, a new approach for enterprise computer management that combines virtualization and distributed system architecture as the basis of the management architecture. © 2008 IEEE.

Relevance: 100.00%

Abstract:

Transactional memory (TM) is a new synchronization mechanism devised to simplify parallel programming, thereby helping programmers to unleash the power of current multicore processors. Although software implementations of TM (STM) have been extensively analyzed in terms of runtime performance, little attention has been paid to an equally important constraint faced by nearly all computer systems: energy consumption. In this work we conduct a comprehensive study of energy and runtime tradeoffs in software transactional memory systems. We characterize the behavior of three state-of-the-art lock-based STM algorithms, along with three different conflict resolution schemes. As a result of this characterization, we propose a DVFS-based technique that can be integrated into the resolution policies so as to improve the energy-delay product (EDP). Experimental results show that our DVFS-enhanced policies are indeed beneficial for applications with high contention levels. Improvements of up to 59% in EDP can be observed in this scenario, with an average EDP reduction of 16% across the STAMP workloads. © 2012 IEEE.
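
The energy-delay product named in this abstract is the product of the energy consumed and the execution time, so a lower value is better. The following is a minimal sketch of how an EDP reduction could be computed. The function names and all measurement values are hypothetical illustrations and do not come from the paper.

```python
def energy_delay_product(energy_joules: float, runtime_seconds: float) -> float:
    """EDP = energy * delay; lower is better."""
    return energy_joules * runtime_seconds


def edp_reduction(baseline_edp: float, new_edp: float) -> float:
    """Fractional EDP reduction relative to the baseline (0.16 means 16%)."""
    return (baseline_edp - new_edp) / baseline_edp


# Hypothetical measurements (NOT from the paper): running at a lower
# frequency during contention saves energy but lengthens the run slightly.
baseline = energy_delay_product(energy_joules=100.0, runtime_seconds=10.0)
with_dvfs = energy_delay_product(energy_joules=70.0, runtime_seconds=11.0)
reduction = edp_reduction(baseline, with_dvfs)
print(f"EDP reduction: {reduction:.1%}")
```

Because EDP multiplies the two quantities, a policy that slows execution can still come out ahead as long as the energy saving is proportionally larger than the slowdown.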

Relevance: 100.00%

Abstract:

Graduate Program in Physics - IGCE

Relevance: 100.00%

Abstract:

Huge image collections have recently become available. In this scenario, the use of Content-Based Image Retrieval (CBIR) systems has emerged as a promising approach to support image searches. The objective of CBIR systems is to retrieve the most similar images in a collection, given a query image, by taking into account image visual properties such as texture, color, and shape. In these systems, the effectiveness of the retrieval process depends heavily on the accuracy of ranking approaches. Recently, re-ranking approaches have been proposed to improve the effectiveness of CBIR systems by taking into account the relationships among images. The re-ranking approaches consider the relationships among all images in a given dataset. These approaches typically demand a huge amount of computational power, which hampers their use in practical situations. On the other hand, these methods can be massively parallelized. In this paper, we propose to speed up the computation of the RL-Sim algorithm, a recently proposed image re-ranking approach, by using the computational power of Graphics Processing Units (GPUs). GPUs are emerging as relatively inexpensive parallel processors that are becoming available on a wide range of computer systems. We address the image re-ranking performance challenges by proposing a parallel solution designed to fit the computational model of GPUs. We conducted an experimental evaluation considering different implementations and devices. Experimental results demonstrate that significant performance gains can be obtained. Our approach achieves speedups of 7x over the serial implementation considering the overall algorithm and up to 36x on its core steps.

Relevance: 100.00%

Abstract:

Image combination techniques, which extract objects to compose a final scene, are widely used in applications ranging from photo montages to cinematographic productions. These techniques are called digital matting. They make it possible to reduce production costs, because the actor does not need to be filmed in the location where the final scene takes place. This feature also favors their use in programs made for digital television, which demands high image quality. Many digital matting algorithms rely on markings made on the images to demarcate the foreground, the background, and the uncertain areas. This marking is called a trimap, a three-valued map containing these three kinds of information. The trimap is typically produced from manual markings. In this project, methods were created that can be used in digital matting algorithms under time restrictions and without human interaction, that is, an algorithm that generates the trimap automatically. The trimap can be generated from the difference between the color of an arbitrary background and the foreground, or by using a depth map. A matting method was also created, based on Geodesic Matting (BAI; SAPIRO, 2009), with a shorter processing time than the original. To improve the performance of the applications that generate the trimap and of the algorithms that generate the alpha map (a map that associates a transparency value with each pixel of the image), and thus allow their use in applications with time restrictions, the CUDA architecture was used, taking advantage of the computational power and features of the GPGPU, which is massively parallel.
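
One way to picture the color-difference route to an automatic trimap is the sketch below. It is an illustration under stated assumptions, not the method of the work itself: the Euclidean RGB distance, the two thresholds `low` and `high`, and the label values 0, 128, and 255 are all hypothetical choices, and the code runs serially on the CPU rather than on CUDA.

```python
from math import dist

# Hypothetical label values for the three trimap regions.
BACKGROUND, UNKNOWN, FOREGROUND = 0, 128, 255


def make_trimap(image, background_color, low, high):
    """Label each pixel by its color distance to a known background color.

    image: list of rows, each row a list of (r, g, b) tuples.
    Pixels within `low` of the background color are background, pixels at
    least `high` away are foreground, and everything between is uncertain.
    """
    trimap = []
    for row in image:
        labels = []
        for pixel in row:
            d = dist(pixel, background_color)
            if d <= low:
                labels.append(BACKGROUND)
            elif d >= high:
                labels.append(FOREGROUND)
            else:
                labels.append(UNKNOWN)
        trimap.append(labels)
    return trimap


# Tiny one-row example against a green backdrop (made-up pixel values).
example_image = [[(0, 255, 0), (10, 240, 20), (200, 50, 40)]]
example_trimap = make_trimap(example_image, (0, 255, 0), low=10, high=100)
print(example_trimap)
```

Each pixel is labeled independently of its neighbors, which is the property that makes this kind of step a natural fit for a massively parallel device.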

Relevance: 100.00%

Abstract:

Resistive-type superconducting fault current limiters (RSFCLs) have been developed for the medium-voltage class, aiming to operate at 1 MVA power capacity with a short recovery time (< 2 s). An RSFCL in the form of a superconducting modular device was designed and constructed using 50 m of YBCO coated conductor tapes for operation under 1 kV / 1 kA with an acting time of 0.1 s. To extend the fault limiting time up to 1 s, the RSFCL was combined with an air-core reactor in parallel. The tests determined the electrical and thermal characteristics of the combined resistive/inductive protection unit. The combined fault current limiter reached a limiting current of 583 A, corresponding to a limiting factor of 3.3, within an acting time of up to 1 s.

Relevance: 100.00%

Abstract:

This paper presents results from an efficient approach to the automatic detection and extraction of human faces from images with any color, texture, or objects in the background. The approach consists of finding isosceles triangles formed by the eyes and mouth.
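
The geometric test at the heart of this idea can be sketched as follows. This is only an illustration: the abstract does not say which sides must match or how much deviation is allowed, so the choice of comparing the two eye-to-mouth sides and the `tolerance` parameter are assumptions, and the function name is hypothetical.

```python
from math import dist


def is_face_triangle(left_eye, right_eye, mouth, tolerance=0.1):
    """Return True if the eye-eye-mouth triangle is roughly isosceles.

    The two eye-to-mouth sides are compared; their relative difference
    must not exceed `tolerance` (an assumed parameter).
    """
    left_side = dist(left_eye, mouth)
    right_side = dist(right_eye, mouth)
    longest = max(left_side, right_side)
    if longest == 0:
        return False  # degenerate: all candidate points coincide
    return abs(left_side - right_side) / longest <= tolerance


# Made-up pixel coordinates for candidate eye and mouth regions.
centered = is_face_triangle((30, 40), (70, 40), (50, 80))
skewed = is_face_triangle((30, 40), (70, 40), (90, 80))
print(centered, skewed)
```

In a full detector, a check like this would be applied to every triple of candidate regions, with the tolerance absorbing small head rotations.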

Relevance: 100.00%

Abstract:

This article describes a technique for partitioning Large Scale Virtual Environments (LSVEs) into hexagonal cells and using portals at the cell interfaces to reduce the number of messages on the network and the complexity of the virtual world. These environments usually demand a high volume of data, which must be sent only to those users who need the information [Greenhalgh, Benford 1997].
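
A rough sketch of hexagonal cell partitioning is given below. It assumes pointy-top hexagons addressed by axial (q, r) coordinates and a rule that updates go only to users in the same or an adjacent cell. Both are illustrative assumptions rather than details from the article, and portals are not modeled. All function names are hypothetical.

```python
from math import sqrt


def world_to_cell(x, y, size):
    """Map a 2D world position to the axial (q, r) hexagonal cell containing it.

    `size` is the distance from a cell center to one of its corners.
    """
    q = (sqrt(3) / 3 * x - y / 3) / size
    r = (2 / 3 * y) / size
    # Round fractional cube coordinates to the nearest valid hex cell.
    cx, cz = q, r
    cy = -cx - cz
    rx, ry, rz = round(cx), round(cy), round(cz)
    dx, dy, dz = abs(rx - cx), abs(ry - cy), abs(rz - cz)
    if dx > dy and dx > dz:
        rx = -ry - rz
    elif dy > dz:
        ry = -rx - rz
    else:
        rz = -rx - ry
    return (rx, rz)


def cell_distance(a, b):
    """Number of cell steps between two axial coordinates."""
    dq, dr = a[0] - b[0], a[1] - b[1]
    return (abs(dq) + abs(dr) + abs(dq + dr)) // 2


def should_receive(sender_cell, receiver_cell):
    """Assumed interest rule: same cell or a directly adjacent one."""
    return cell_distance(sender_cell, receiver_cell) <= 1


origin_cell = world_to_cell(0.0, 0.0, size=10.0)
east_cell = world_to_cell(17.32, 0.0, size=10.0)
print(origin_cell, east_cell, should_receive(origin_cell, east_cell))
```

Every hexagonal cell has exactly six neighbors, all at the same center-to-center distance, which keeps an adjacency rule like this uniform in every direction.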

Relevance: 100.00%

Abstract:

Most architectures proposed for developing Distributed Virtual Environments (DVEs) support only a limited number of users. To support the development of applications that use the Internet infrastructure, with hundreds or perhaps thousands of users logged in simultaneously to a DVE, several techniques for managing resources, such as bandwidth and processing capacity, must be implemented. The strategy presented in this paper combines methods to attain the required scalability, in particular the multicast protocol at the application level.

Relevance: 100.00%

Abstract:

This paper addresses the problem of processing biological data, such as cardiac beats in the audio and ultrasonic range, and of calculating wavelet coefficients in real time, with the processor clock running at the frequencies of present-day application-specific integrated circuits and field-programmable gate arrays. The parallel filter architecture for the discrete wavelet transform (DWT) has been improved, calculating the wavelet coefficients in real time with hardware reduced by up to 60%. The new architecture, which also processes the inverse DWT, is implemented with Radix-2 or Booth-Wallace constant multipliers. One integrated circuit signal analyzer in the ultrasonic range, including series memory register banks, is presented. © 2007 IEEE.
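
For readers unfamiliar with the DWT and its inverse, a one-level software sketch follows. It uses the unnormalized Haar wavelet purely because it is the simplest case. The abstract does not say which wavelet filters the hardware implements, so this is an assumed stand-in and not the paper's architecture. The function names and sample values are hypothetical.

```python
def haar_forward(signal):
    """One DWT level: pairwise averages (approximation) and half-differences (detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [(a + b) / 2 for a, b in pairs]
    detail = [(a - b) / 2 for a, b in pairs]
    return approx, detail


def haar_inverse(approx, detail):
    """Rebuild the original samples from one level of coefficients."""
    signal = []
    for a, d in zip(approx, detail):
        signal.extend((a + d, a - d))
    return signal


samples = [4, 6, 10, 12]  # made-up input samples
approx, detail = haar_forward(samples)
restored = haar_inverse(approx, detail)
print(approx, detail, restored)
```

The two output lists correspond to the low-pass and high-pass filter branches, each computed independently for every sample pair, which is the kind of structure a parallel filter architecture can exploit.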

Relevance: 100.00%

Abstract:

Reconfigurable computing is one of the most recent research topics in computer science. The Altera Nios II soft-core processor can be included in a large set of reconfigurable architectures, especially because it is designed in software, allowing it to be configured according to the application. The recent growth in applications that demand reconfigurable computing has made it necessary to build compilers that translate high-level language source code into the instruction sets of reconfigurable devices. In this paper we present a compiler that takes as input the bytecodes generated by a Java front-end compiler and generates a set of instructions that conforms to the Nios II processor instruction set rules. Our work shows how we process Java bytecodes into intermediate code, in the Nios II instruction format, and build the control flow and control dependence graphs. © 2009 IEEE.
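
Building a control flow graph usually starts by splitting the instruction stream into basic blocks at branch targets and after branches. The sketch below shows that textbook step on a simplified, hypothetical instruction format of `(opcode, target_index)` tuples. It is neither real Java bytecode nor the Nios II encoding, it is not the compiler described above, and it does not cover the control dependence graph.

```python
def build_cfg(instructions):
    """Split instructions into basic blocks and connect them.

    instructions: list of (opcode, target_index_or_None) tuples, where
    "if" is a conditional branch and "goto" an unconditional one.
    Returns (blocks, edges): blocks are (start, end) index ranges with
    end exclusive; edges are (from_block, to_block) pairs.
    """
    if not instructions:
        return [], set()
    n = len(instructions)
    leaders = {0}  # the first instruction always starts a block
    for i, (op, target) in enumerate(instructions):
        if op in ("goto", "if"):
            leaders.add(target)  # a branch target starts a block
            if i + 1 < n:
                leaders.add(i + 1)  # so does the instruction after a branch
    starts = sorted(leaders)
    blocks = list(zip(starts, starts[1:] + [n]))
    block_of = {start: index for index, (start, _) in enumerate(blocks)}
    edges = set()
    for index, (_, end) in enumerate(blocks):
        op, target = instructions[end - 1]
        if op in ("goto", "if"):
            edges.add((index, block_of[target]))  # taken branch
        if op not in ("goto", "return") and end < n:
            edges.add((index, block_of[end]))  # fall-through
    return blocks, edges


# Hypothetical program: an if/else that joins at a final return.
program = [
    ("load", None),    # 0
    ("if", 4),         # 1: jump to 4 when the condition holds
    ("add", None),     # 2
    ("goto", 5),       # 3
    ("sub", None),     # 4
    ("return", None),  # 5
]
blocks, edges = build_cfg(program)
print(blocks, sorted(edges))
```

The example yields four blocks in a diamond shape: the entry block branches to two alternatives, and both rejoin at the block holding the return.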