813 results for Parallel algorithms


Relevance:

20.00%

Publisher:

Abstract:

Today's advances in computing power are driven by parallel processing, a consequence of the characteristics of current hardware architectures. Using this hardware properly accelerates the execution of algorithms (programs). However, converting an algorithm into an adequate parallel form is complex, and that form is in turn specific to each type of parallel hardware. The most common general-purpose processors today are multicore parallel processors, also known as Symmetric Multi-Processors (SMP). It is now hard to find a desktop processor without some form of SMP-style parallelism, and the industry trend is toward processors with an ever larger number of cores. Graphics Processing Units (GPU), originally designed for video processing, have in turn increased their computing power by integrating many processing units, to the point that boards supporting 200 to 400 parallel processing threads are now common. These processors are very fast and specialized for the task they were designed for, mainly video processing; because that task has much in common with scientific computing, the devices have been reoriented under the name General-Purpose Graphics Processing Unit (GPGPU). Unlike the SMP processors mentioned above, GPGPUs are not general-purpose: their use is complicated by the limited memory available on each board and by the kind of parallel processing required to make them productive, so algorithm implementation must be addressed carefully. Finally, Field Programmable Gate Arrays (FPGA) are programmable logic devices able to perform large numbers of operations in parallel, with low latency and deep pipelines, so they can be used to implement specific algorithms that exploit this parallelism and must run at very high speed. Their drawback is the complexity of programming and testing the algorithm instantiated on the device. Given this diversity of parallel processors, our work aims to analyze the specific characteristics of each of them and their impact on the structure of algorithms, so that their use yields processing performance commensurate with the resources employed and so that the different devices can be combined in a mutually beneficial way. Specifically, starting from the characteristics of the hardware, we determine the properties a parallel algorithm must have in order to be accelerated; those properties in turn determine which of these types of hardware is most suitable for its instantiation. In particular, we consider the degree of data dependence, the need for synchronization during parallel processing, the size of the data to be processed, and the complexity of parallel programming on each type of hardware.
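As a rough illustration of the data-dependence criterion (a sketch for this listing, not code from the thesis), the Python fragment below contrasts an element-wise update with no data dependence, which splits trivially across SMP cores or GPU threads, with a prefix sum whose loop-carried dependence forces sequential execution or an explicitly synchronized reformulation:

from multiprocessing import Pool

def independent_update(x):
    # No data dependence: each element is processed in isolation, so the
    # work can be split freely across SMP cores (or mapped to GPU threads).
    return x * x + 1.0

def prefix_sum(values):
    # Loop-carried dependence: every step needs the previous result, so a
    # naive element-wise split across workers is not possible without
    # restructuring the algorithm (e.g. a parallel scan) and synchronizing.
    out, acc = [], 0.0
    for v in values:
        acc += v
        out.append(acc)
    return out

if __name__ == "__main__":
    data = [float(i) for i in range(100_000)]
    with Pool() as pool:
        squared = pool.map(independent_update, data, chunksize=10_000)  # parallel part
    totals = prefix_sum(squared)                                        # sequential part
    print(totals[-1])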

Relevance:

20.00%

Publisher:

Abstract:

As digital image processing techniques become increasingly used in a broad range of consumer applications, the critical need to evaluate algorithm performance has become recognised by developers as an area of vital importance. With digital image processing algorithms now playing a greater role in security and protection applications, it is of crucial importance that we are able to study their performance empirically. Apart from the field of biometrics, little emphasis has been put on algorithm performance evaluation until now, and where evaluation has taken place, it has been carried out in a somewhat cumbersome and unsystematic fashion, without any standardised approach. This paper presents a comprehensive testing methodology and framework aimed at automating the evaluation of image processing algorithms. Ultimately, the test framework aims to shorten the algorithm development life cycle by helping to identify algorithm performance problems quickly and more efficiently.
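As a rough sketch of what such automation can look like, the short Python harness below runs an algorithm under test over a set of (input, ground truth) pairs and records an error metric and the execution time per case; the algorithm, the dataset layout, and the metric are hypothetical stand-ins, not the framework described in the paper.

import time

def mean_absolute_error(output, reference):
    # Simple pixel-wise metric over two same-sized images given as lists of rows.
    total, count = 0.0, 0
    for out_row, ref_row in zip(output, reference):
        for o, r in zip(out_row, ref_row):
            total += abs(o - r)
            count += 1
    return total / count

def evaluate(algorithm, test_cases):
    # Run the algorithm on every (name, input image, ground truth) case
    # and collect accuracy and timing figures for later reporting.
    results = []
    for name, image, ground_truth in test_cases:
        start = time.perf_counter()
        output = algorithm(image)
        elapsed = time.perf_counter() - start
        results.append({
            "case": name,
            "error": mean_absolute_error(output, ground_truth),
            "seconds": elapsed,
        })
    return results

if __name__ == "__main__":
    # Toy "denoiser" (identity) and a toy test case, used only to exercise the harness.
    identity = lambda img: img
    noisy = [[0, 10], [20, 30]]
    clean = [[0, 12], [18, 30]]
    for row in evaluate(identity, [("toy", noisy, clean)]):
        print(row)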

Relevance:

20.00%

Publisher:

Abstract:

Magdeburg, University, Faculty of Process and Systems Engineering, dissertation, 2012

Relevance:

20.00%

Publisher:

Abstract:

Magdeburg, University, Faculty of Mathematics, habilitation thesis, 2006

Relevance:

20.00%

Publisher:

Abstract:

Magdeburg, University, Faculty of Process and Systems Engineering, dissertation, 2012

Relevance:

20.00%

Publisher:

Abstract:

This work describes a test tool that allows performance testing of different end-to-end available-bandwidth estimation algorithms, along with their different implementations. The goal of such tests is to find the best-performing algorithm and implementation and to use it in the congestion control mechanism of high-performance reliable transport protocols. The main idea of this paper is to describe the options that provide an available-bandwidth estimation mechanism for high-speed data transport protocols, and to develop the basic functionality of such a test tool, with which it will be possible to manage the instances of the test application on all hosts involved in testing, aided by some middleware.
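A minimal sketch of the comparison such a tool automates is given below: several estimator implementations are run repeatedly against the same path with a known available bandwidth and ranked by error and spread. The estimator callables and the reference bandwidth are hypothetical stand-ins; the tool described in the paper drives real implementations (e.g. pathload- or pathChirp-style estimators) on distributed test hosts through middleware.

import random
import statistics

def run_trials(estimator, true_bw_mbps, trials=10):
    # Collect repeated estimates for one implementation and summarize
    # its mean error and spread against the known reference bandwidth.
    estimates = [estimator(true_bw_mbps) for _ in range(trials)]
    return {
        "mean_error_mbps": abs(statistics.mean(estimates) - true_bw_mbps),
        "stdev_mbps": statistics.pstdev(estimates),
    }

if __name__ == "__main__":
    true_bw = 940.0  # assumed unused capacity of a 1 Gbit/s test path

    # Stand-in estimators that mimic differently biased/noisy implementations.
    estimators = {
        "estimator_a": lambda bw: bw + random.gauss(0, 15),
        "estimator_b": lambda bw: bw * random.uniform(0.85, 1.05),
    }

    for name, est in estimators.items():
        print(name, run_trials(est, true_bw))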

Relevance:

20.00%

Publisher:

Abstract:

Nowadays, considerable attention from academia and research teams is focused on the potential of the 60 GHz frequency band in wireless communications. The use of the 60 GHz band offers great possibilities for a wide variety of applications that are yet to be implemented, but these applications also imply huge implementation challenges. One such example is building a high-data-rate transceiver that at the same time has very low power consumption. In this paper we present a prototype of a Single Carrier (SC) transceiver system, giving a brief overview of the baseband design and emphasizing the most important decisions that need to be made. A brief overview of possible approaches to implementing the equalizer, as the most complex module in the SC transceiver, is also presented. The main focus of this paper is to suggest a parallel architecture for the receiver in a Single Carrier communication system. This raises the data rates the communication system can achieve, at the price of higher power consumption. The suggested receiver architecture is illustrated in this paper, and the results of its implementation are given in comparison with the corresponding serial implementation.
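To make the serial-versus-parallel trade-off concrete, the sketch below equalizes independent received blocks concurrently, roughly mirroring how a hardware receiver replicates its equalizer datapath to raise throughput. It is a generic one-tap frequency-domain equalizer written in Python/NumPy for illustration only, not the architecture implemented in the paper.

import numpy as np
from concurrent.futures import ThreadPoolExecutor

def equalize_block(received_block, channel_freq_response):
    # One-tap zero-forcing equalization of a single block in the frequency domain.
    spectrum = np.fft.fft(received_block)
    return np.fft.ifft(spectrum / channel_freq_response)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    block_len, n_blocks = 64, 8
    channel = np.fft.fft(np.array([1.0, 0.4, 0.1] + [0.0] * (block_len - 3)))
    blocks = [rng.standard_normal(block_len) for _ in range(n_blocks)]

    # Serial reference: one block after another.
    serial = [equalize_block(b, channel) for b in blocks]

    # Parallel variant: independent blocks dispatched to a pool of workers.
    with ThreadPoolExecutor() as pool:
        parallel = list(pool.map(lambda b: equalize_block(b, channel), blocks))

    print(np.allclose(serial, parallel))  # same result, higher throughput, more resources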

Relevance:

20.00%

Publisher:

Abstract:

Magdeburg, University, Faculty of Natural Sciences, dissertation, 2014

Relevance:

20.00%

Publisher:

Abstract:

Some practical aspects of implementing genetic algorithms for the life cycle management of electrotechnical equipment are considered.
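For readers unfamiliar with the technique, a minimal genetic algorithm is sketched below; the bit-string encoding and the toy fitness function (a stand-in for, say, a maintenance-schedule cost model) are illustrative assumptions, not the formulation used in the paper.

import random

GENES, POP_SIZE, GENERATIONS = 16, 30, 50

def fitness(individual):
    # Toy objective: maximize the number of 1-bits ("scheduled actions kept").
    return sum(individual)

def crossover(a, b):
    # Single-point crossover of two parent bit strings.
    cut = random.randint(1, GENES - 1)
    return a[:cut] + b[cut:]

def mutate(ind, rate=0.05):
    # Flip each bit independently with a small probability.
    return [1 - g if random.random() < rate else g for g in ind]

def evolve():
    population = [[random.randint(0, 1) for _ in range(GENES)] for _ in range(POP_SIZE)]
    for _ in range(GENERATIONS):
        population.sort(key=fitness, reverse=True)
        parents = population[: POP_SIZE // 2]  # truncation selection
        children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                    for _ in range(POP_SIZE - len(parents))]
        population = parents + children
    return max(population, key=fitness)

if __name__ == "__main__":
    best = evolve()
    print(best, fitness(best))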

Relevance:

20.00%

Publisher:

Abstract:

Advances in computer memory technology justify research towards new and different views on computer organization. This paper proposes a novel memory-centric computing architecture with the goal of merging memory and processing elements in order to provide better conditions for parallelization and performance. The paper introduces the architectural concepts and then presents the design and implementation of a corresponding assembler and simulator.
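As an illustration of what merging memory and processing implies at the instruction level, the sketch below simulates a tiny hypothetical memory-to-memory instruction set in which operands are memory addresses rather than registers; the instruction format and opcodes are invented for this example and are not the ISA, assembler, or simulator presented in the paper.

def simulate(program, memory):
    # Execute a list of (op, dst, src1, src2) instructions directly against
    # the `memory` dictionary; there are no registers, operands are addresses.
    for op, dst, src1, src2 in program:
        if op == "ADD":
            memory[dst] = memory[src1] + memory[src2]
        elif op == "MUL":
            memory[dst] = memory[src1] * memory[src2]
        else:
            raise ValueError("unknown opcode: " + op)
    return memory

if __name__ == "__main__":
    mem = {0: 3, 1: 4, 2: 0, 3: 0}
    prog = [
        ("ADD", 2, 0, 1),  # mem[2] = mem[0] + mem[1]
        ("MUL", 3, 2, 2),  # mem[3] = mem[2] * mem[2]
    ]
    print(simulate(prog, mem))  # {0: 3, 1: 4, 2: 7, 3: 49}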