Biblioteca Digital

91 resultados para CUDA

Aceleração GPU da animação de superfícies deformáveis

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Dissertação de Mestrado em Engenharia Informática

Veja mais

Paralelização de algoritmos de Filtragem baseados em XPATH/XML com recurso a GPUs

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Dissertação de Mestrado em Engenharia Informática

Veja mais

GPU implementation of the simplex identification via split augmented Lagrangian

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Hyperspectral imaging can be used for object detection and for discriminating between different objects based on their spectral characteristics. One of the main problems of hyperspectral data analysis is the presence of mixed pixels, due to the low spatial resolution of such images. This means that several spectrally pure signatures (endmembers) are combined into the same mixed pixel. Linear spectral unmixing follows an unsupervised approach which aims at inferring pure spectral signatures and their material fractions at each pixel of the scene. The huge data volumes acquired by such sensors put stringent requirements on processing and unmixing methods. This paper proposes an efficient implementation of a unsupervised linear unmixing method on GPUs using CUDA. The method finds the smallest simplex by solving a sequence of nonsmooth convex subproblems using variable splitting to obtain a constraint formulation, and then applying an augmented Lagrangian technique. The parallel implementation of SISAL presented in this work exploits the GPU architecture at low level, using shared memory and coalesced accesses to memory. The results herein presented indicate that the GPU implementation can significantly accelerate the method's execution over big datasets while maintaining the methods accuracy.

Veja mais

Parallel GPU architecture for hyperspectral unmixing based on augmented Lagrangian method

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Hyperspectral imaging has become one of the main topics in remote sensing applications, which comprise hundreds of spectral bands at different (almost contiguous) wavelength channels over the same area generating large data volumes comprising several GBs per flight. This high spectral resolution can be used for object detection and for discriminate between different objects based on their spectral characteristics. One of the main problems involved in hyperspectral analysis is the presence of mixed pixels, which arise when the spacial resolution of the sensor is not able to separate spectrally distinct materials. Spectral unmixing is one of the most important task for hyperspectral data exploitation. However, the unmixing algorithms can be computationally very expensive, and even high power consuming, which compromises the use in applications under on-board constraints. In recent years, graphics processing units (GPUs) have evolved into highly parallel and programmable systems. Specifically, several hyperspectral imaging algorithms have shown to be able to benefit from this hardware taking advantage of the extremely high floating-point processing performance, compact size, huge memory bandwidth, and relatively low cost of these units, which make them appealing for onboard data processing. In this paper, we propose a parallel implementation of an augmented Lagragian based method for unsupervised hyperspectral linear unmixing on GPUs using CUDA. The method called simplex identification via split augmented Lagrangian (SISAL) aims to identify the endmembers of a scene, i.e., is able to unmix hyperspectral data sets in which the pure pixel assumption is violated. The efficient implementation of SISAL method presented in this work exploits the GPU architecture at low level, using shared memory and coalesced accesses to memory.

Veja mais

Parallel hyperspectral compressive sensing method on GPU

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Remote hyperspectral sensors collect large amounts of data per flight usually with low spatial resolution. It is known that the bandwidth connection between the satellite/airborne platform and the ground station is reduced, thus a compression onboard method is desirable to reduce the amount of data to be transmitted. This paper presents a parallel implementation of an compressive sensing method, called parallel hyperspectral coded aperture (P-HYCA), for graphics processing units (GPU) using the compute unified device architecture (CUDA). This method takes into account two main properties of hyperspectral dataset, namely the high correlation existing among the spectral bands and the generally low number of endmembers needed to explain the data, which largely reduces the number of measurements necessary to correctly reconstruct the original data. Experimental results conducted using synthetic and real hyperspectral datasets on two different GPU architectures by NVIDIA: GeForce GTX 590 and GeForce GTX TITAN, reveal that the use of GPUs can provide real-time compressive sensing performance. The achieved speedup is up to 20 times when compared with the processing time of HYCA running on one core of the Intel i7-2600 CPU (3.4GHz), with 16 Gbyte memory.

Veja mais

Parallel hyperspectral coded aperture for compressive sensing on GPUs

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The application of compressive sensing (CS) to hyperspectral images is an active area of research over the past few years, both in terms of the hardware and the signal processing algorithms. However, CS algorithms can be computationally very expensive due to the extremely large volumes of data collected by imaging spectrometers, a fact that compromises their use in applications under real-time constraints. This paper proposes four efficient implementations of hyperspectral coded aperture (HYCA) for CS, two of them termed P-HYCA and P-HYCA-FAST and two additional implementations for its constrained version (CHYCA), termed P-CHYCA and P-CHYCA-FAST on commodity graphics processing units (GPUs). HYCA algorithm exploits the high correlation existing among the spectral bands of the hyperspectral data sets and the generally low number of endmembers needed to explain the data, which largely reduces the number of measurements necessary to correctly reconstruct the original data. The proposed P-HYCA and P-CHYCA implementations have been developed using the compute unified device architecture (CUDA) and the cuFFT library. Moreover, this library has been replaced by a fast iterative method in the P-HYCA-FAST and P-CHYCA-FAST implementations that leads to very significant speedup factors in order to achieve real-time requirements. The proposed algorithms are evaluated not only in terms of reconstruction error for different compressions ratios but also in terms of computational performance using two different GPU architectures by NVIDIA: 1) GeForce GTX 590; and 2) GeForce GTX TITAN. Experiments are conducted using both simulated and real data revealing considerable acceleration factors and obtaining good results in the task of compressing remotely sensed hyperspectral data sets.

Veja mais

Estrutura a partir do Movimento em Sistemas Autónomos

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Mestrado em Engenharia Electrotécnica e de Computadores - Ramo de Sistemas Autónomos

Veja mais

Reconstrução/processamento de imagem médica com GPU em tomossíntese

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Dissertação para obtenção do Grau de Mestre em Engenharia Biomédica

Veja mais

Reconstrução de imagem médica de mamografia por emissão de positrões (PEM) com GPU

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Dissertação para obtenção do Grau de Mestre em Engenharia Biomédica

Veja mais

Geometry based visualization with OpenCL

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Dissertação para obtenção do Grau de Mestre em Engenharia Informática

Veja mais

Multi-GPU support on the marrow algorithmic skeleton framework

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Dissertação para obtenção do Grau de Mestre em Engenharia Informática

Veja mais

Optimization of breast tomosynthesis image reconstruction using parallel computing

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Breast cancer is the most common cancer among women, being a major public health problem. Worldwide, X-ray mammography is the current gold-standard for medical imaging of breast cancer. However, it has associated some well-known limitations. The false-negative rates, up to 66% in symptomatic women, and the false-positive rates, up to 60%, are a continued source of concern and debate. These drawbacks prompt the development of other imaging techniques for breast cancer detection, in which Digital Breast Tomosynthesis (DBT) is included. DBT is a 3D radiographic technique that reduces the obscuring effect of tissue overlap and appears to address both issues of false-negative and false-positive rates. The 3D images in DBT are only achieved through image reconstruction methods. These methods play an important role in a clinical setting since there is a need to implement a reconstruction process that is both accurate and fast. This dissertation deals with the optimization of iterative algorithms, with parallel computing through an implementation on Graphics Processing Units (GPUs) to make the 3D reconstruction faster using Compute Unified Device Architecture (CUDA). Iterative algorithms have shown to produce the highest quality DBT images, but since they are computationally intensive, their clinical use is currently rejected. These algorithms have the potential to reduce patient dose in DBT scans. A method of integrating CUDA in Interactive Data Language (IDL) is proposed in order to accelerate the DBT image reconstructions. This method has never been attempted before for DBT. In this work the system matrix calculation, the most computationally expensive part of iterative algorithms, is accelerated. A speedup of 1.6 is achieved proving the fact that GPUs can accelerate the IDL implementation.

Veja mais

Otimização da implementação em GPUs de um algoritmo de simulação de materiais granulares

Relevância:

10.00% 10.00%

Publicador:

Resumo:

As simulações que pretendam modelar fenómenos reais com grande precisão em tempo útil exigem enormes quantidades de recursos computacionais, sejam estes de processamento, de memória, ou comunicação. Se até há pouco tempo estas capacidades estavam confinadas a grandes supercomputadores, com o advento dos processadores multicore e GPUs manycore os recursos necessários para este tipo de problemas estão agora acessíveis a preços razoáveis não só a investigadores como aos utilizadores em geral. O presente trabalho está focado na otimização de uma aplicação que simula o comportamento dinâmico de materiais granulares secos, um problema do âmbito da Engenharia Civil, mais especificamente na área da Geotecnia, na qual estas simulações permitem por exemplo investigar a deslocação de grandes massas sólidas provocadas pelo colapso de taludes. Assim, tem havido interesse em abordar esta temática e produzir simulações representativas de situações reais, nomeadamente por parte do CGSE (Australian Research Council Centre of Excellence for Geotechnical Science and Engineering) da Universidade de Newcastle em colaboração com um membro da UNIC (Centro de Investigação em Estruturas de Construção da FCT/UNL) que tem vindo a desenvolver a sua própria linha de investigação, que se materializou na implementação, em CUDA, de um algoritmo para GPUs que possibilita simulações de sistemas com um elevado número de partículas. O trabalho apresentado consiste na otimização, assente na premissa da não alteração (ou alteração mínima) do código original, da supracitada implementação, de forma a obter melhorias significativas tanto no tempo global de execução da aplicação, como no aumento do número de partículas a simular. Ao mesmo tempo, valida-se a formulação proposta ao conseguir simulações que refletem, com grande precisão, os fenómenos físicos. Com as otimizações realizadas, conseguiu-se obter uma redução de cerca de 30% do tempo inicial cumprindo com os requisitos de correção e precisão necessários.

Veja mais

Programação paralela estruturada aplicada a um problema de reconstrução de imagem

Relevância:

10.00% 10.00%

Publicador:

Resumo:

A Digital Breast Tomosynthesis (DBT) é uma técnica que permite obter imagens mamárias 3D de alta qualidade, que só podem ser obtidas através de métodos de re-construção. Os métodos de reconstrução mais rápidos são os iterativos, sendo no en-tanto computacionalmente exigentes, necessitando de sofrer muitas optimizações. Exis-tem optimizações que usam computação paralela através da implementação em GPUs usando CUDA. Como é sabido, o desenvolvimento de programas eficientes que usam GPUs é ainda uma tarefa demorada, dado que os modelos de programação disponíveis são de baixo nível, e a portabilidade do código para outras arquitecturas não é imedia-ta. É uma mais valia poder criar programas paralelos de forma rápida, com possibili-dade de serem usados em diferentes arquitecturas, sem exigir muitos conhecimentos sobre a arquitectura subjacente e sobre os modelos de programação de baixo nível. Para resolver este problema, propomos a utilização de soluções existentes que reduzam o esforço de paralelização, permitindo a sua portabilidade, garantindo ao mesmo tempo um desempenho aceitável. Para tal, vamos utilizar um framework (FastFlow) com suporte para Algorithmic Skeletons, que tiram partido da programação paralela estruturada, capturando esquemas/padrões recorrentes que são comuns na programação paralela. O trabalho realizado centrou-se na paralelização de uma das fases de reconstru-ção da imagem 3D – geração da matriz de sistema – que é uma das mais demoradas do processo de reconstrução; esse trabalho incluiu um método de ordenação modificado em relação ao existente. Foram realizadas diferentes implementações em CPU e GPU (usando OpenMP, CUDA e FastFlow) o que permitiu comparar estes ambientes de programação em termos de facilidade de desenvolvimento e eficiência da solução. A comparação feita permite concluir que o desempenho das soluções baseadas no FastFlow não é muito diferente das tradicionais o que sugere que ferramentas deste tipo podem simplificar e agilizar a implementação de um algoritmos na área de recons-trução de imagens 3D, mantendo um bom desempenho.

Veja mais

Estudi GPGPU dedicat al processament de neuroimatges

Relevância:

10.00% 10.00%

Publicador:

Resumo:

La tecnologia GPGPU permet paral∙lelitzar càlculs executant operacions aritmètiques en els múltiples processadors de que disposen els xips gràfics. S'ha fet servir l'entorn de desenvolupament CUDA de la companyia NVIDIA, que actualment és la solució GPGPU més avançada del mercat. L'algorisme de neuroimatge implementat pertany a un estudi VBM desenvolupat amb l'eina SPM. Es tracta concretament del procés de segmentació d'imatges de ressonància magnètica cerebrals, en els diferents teixits dels quals es composa el cervell: matèria blanca, matèria grisa i líquid cefaloraquidi. S'han implementat models en els llenguatges Matlab, C i CUDA, i s'ha fet un estudi comparatiu per plataformes hardware diferents.

Veja mais

91 resultados para CUDA

Filtro por publicador