523 results for GPGPU, CUDA, OpenCL, Parallel Programming
Abstract:
Agent-oriented programming (AOP) is a programming paradigm that conceives software as a set of agents that are autonomous and proactive and that can communicate with other agents. Although it has been used mainly in the field of artificial intelligence, this style of programming proves useful for developing distributed systems, since it handles concurrency problems with ease. The aim of this thesis is to analyze the characteristics of the paradigm and of agent-based software, using Sarl, a very recent general-purpose language, as a case study. The main part of the work describes the theoretical models that led to the birth of agent programming, in particular the BDI model, and the main frameworks for developing multi-agent systems.
Abstract:
Image and video compression play a major role in the world today, allowing the storage and transmission of large volumes of multimedia content. However, processing this information requires substantial computational resources, so improving the computational performance of these compression algorithms is very important. The Multidimensional Multiscale Parser (MMP) is a pattern-matching-based compression algorithm for multimedia content, namely images, that achieves high compression ratios while maintaining good image quality (Rodrigues et al. [2008]). However, in comparison with other existing algorithms, it is comparatively slow to execute. Therefore, two parallel implementations for GPUs were proposed by Ribeiro [2016] and Silva [2015] in CUDA and OpenCL-GPU, respectively. In this dissertation, to complement that work, we propose two parallel versions that run the MMP algorithm on the CPU: one resorting to OpenMP and another that converts the existing OpenCL-GPU version into OpenCL-CPU. The proposed solutions improve the computational performance of MMP by factors of 3× and 2.7×, respectively. High Efficiency Video Coding (HEVC/H.265) is the most recent standard for image and video compression. Its impressive compression performance makes it a target for many adaptations, particularly for holoscopic (or light field) image/video processing. Some of the modifications proposed to encode this new multimedia content are based on geometry-based disparity compensations (SS), developed by Conti et al. [2014], and a Geometric Transformations (GT) module, proposed by Monteiro et al. [2015]. These HEVC-based compression algorithms for holoscopic images implement a dedicated search for similar micro-images that is more efficient than the one performed by HEVC, but their implementation is considerably slower than HEVC.
To reduce execution times, we use the OpenCL API as the GPU-enabling language to increase the module's performance. With its most costly setting, we are able to reduce the GT module execution time from 6.9 days to less than 4 hours, effectively attaining a speedup of 45×.
Abstract:
This thesis addresses the problem of determining the K best solutions for two optimization problems, the 0-1 Knapsack Problem and the Shortest Path Problem. Such solutions can be used within column generation methods for solving real-world problems, for example Bin Packing Problems and vehicle and crew scheduling problems. New dynamic programming algorithms, developed within a research program, were implemented in order to verify their performance experimentally. Initially, for both problems, an algorithm was described that determines the K best solutions for every possible subproblem; starting from a knapsack with zero capacity, in the case of the 0-1 Knapsack Problem, and from the determination of a path from the source vertex to itself for the Shortest Path Problem, the algorithm determines the best solutions of progressively larger subproblems, using the solutions built for the previous states, until it obtains the best solutions of the global problem. Subsequently, an algorithm based on a backward recursion approach was defined; in this case a recursive function is used which, called starting from the state corresponding to the global problem, is invoked only on the strictly necessary intermediate states, and for each of them no superfluous solutions are determined.
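The forward scheme described in this abstract (K best solutions for every subproblem, built up from the zero-capacity knapsack) can be illustrated with a minimal sketch. This is a textbook-style forward dynamic program, not the thesis's own algorithm; the function name `k_best_knapsack`, the `(weight, value)` item encoding, and the choice to return only objective values (not the item sets) are assumptions made for illustration.

```python
import heapq


def k_best_knapsack(items, capacity, k):
    """Return the k largest total values over distinct item subsets that fit.

    items is a list of (weight, value) pairs with positive integer weights.
    Hypothetical helper for illustration; only values are tracked, not the
    subsets that produce them.
    """
    # best[c] holds, in descending order, the values of the (at most) k best
    # subsets of the items processed so far whose total weight is <= c.
    # Before any item is processed, only the empty subset (value 0) exists.
    best = [[0] for _ in range(capacity + 1)]
    for weight, value in items:
        # Sweep capacities downwards so each item is used at most once.
        for c in range(capacity, weight - 1, -1):
            # Subsets that include the current item extend a state with
            # less remaining capacity; subsets that exclude it are best[c].
            with_item = [v + value for v in best[c - weight]]
            best[c] = heapq.nlargest(k, best[c] + with_item)
    return best[capacity]


if __name__ == "__main__":
    example_items = [(2, 3), (3, 4), (4, 5)]  # illustrative data only
    print(k_best_knapsack(example_items, capacity=5, k=3))
```

Because subsets that include an item and subsets that exclude it are disjoint, merging the two candidate lists and keeping the top k never counts the same subset twice. The backward recursion mentioned in the abstract would instead start from the full-capacity state and visit only the states it actually needs.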
Abstract:
The end of the pandemic, the Russia-Ukraine conflict, and the rush for building bonuses have caused a general rise in the price of construction materials and temporary works. Scaffolding has seen particularly steep increases, because the superbonus and the facade bonus require extensive use of it. This work seeks a solution to this problem in the directions the construction sector is taking internationally, toward digitalization and the application of software solutions capable of optimizing processes. A parametric model of prefabricated-frame scaffolding was therefore created that adapts as well as possible to any type of building. Specifically, an algorithm was developed that, given the footprint and height of the building, automatically generates the three-dimensional model of the scaffolding, allowing immediate visualization of its layout within the construction site and the extraction of drawings such as plans and elevations. This dissertation illustrates how the algorithm was developed using the Revit software and, in particular, the Dynamo visual programming tool. Finally, since the goal is to identify an algorithm applicable to any given building, the model is tested on a particular case study: the renovation of a building within the Cooperativa Sociale CIM complex.
Abstract:
The television landscape of the last twenty years has undergone dramatic changes. Business models are changing, as are the logics of production, distribution, and consumption. Today we live in a hybrid world, where broadcast and broadband merge, and in which all the players are active: content providers, MVPDs, free and pay broadcasters, telephone companies, OTT platforms, and content aggregators. A closer look, however, reveals a constellation of other important players in this landscape. The 1970s saw the vigorous flourishing of many local broadcasters that made the history of television in Italy. The idea for this work stems precisely from the role these broadcasters play today, given their territorial, cultural, and social relevance. In particular, the case of RTV38 in Tuscany is emblematic of this important role, balanced on the one hand by a public service rendered to the community, establishing itself as a true institution, and on the other by a focus on strictly commercial and private activity.
Abstract:
Floating-point computing with more than one TFLOP of peak performance is already a reality in recent Field-Programmable Gate Arrays (FPGAs). General-Purpose Graphics Processing Units (GPGPUs) and recent many-core CPUs have also taken advantage of recent technological innovations in integrated circuit (IC) design and have dramatically improved their peak performance. In this paper, we compare the trends of these computing architectures for high-performance computing and survey these platforms in the execution of algorithms belonging to different scientific application domains. Trends in peak performance, power consumption, and sustained performance for particular applications show that the gap between FPGAs and GPUs or many-core CPUs is widening, moving FPGAs away from high-performance computing with intensive floating-point calculations. FPGAs become competitive for custom floating-point or fixed-point representations, for smaller input sizes of certain algorithms, for combinational logic problems, and for parallel map-reduce problems. © 2014 Technical University of Munich (TUM).
Abstract:
Master's dissertation in Informatics Engineering
Abstract:
Hyperspectral imaging can be used for object detection and for discriminating between different objects based on their spectral characteristics. One of the main problems of hyperspectral data analysis is the presence of mixed pixels, due to the low spatial resolution of such images. This means that several spectrally pure signatures (endmembers) are combined into the same mixed pixel. Linear spectral unmixing follows an unsupervised approach which aims at inferring pure spectral signatures and their material fractions at each pixel of the scene. The huge data volumes acquired by such sensors put stringent requirements on processing and unmixing methods. This paper proposes an efficient implementation of an unsupervised linear unmixing method on GPUs using CUDA. The method finds the smallest simplex by solving a sequence of nonsmooth convex subproblems using variable splitting to obtain a constraint formulation, and then applying an augmented Lagrangian technique. The parallel implementation of SISAL presented in this work exploits the GPU architecture at a low level, using shared memory and coalesced memory accesses. The results presented herein indicate that the GPU implementation can significantly accelerate the method's execution over big datasets while maintaining the method's accuracy.
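The mixed-pixel model that linear unmixing inverts can be illustrated with a short sketch. It shows only the forward direction (composing a mixed pixel from known endmembers and material fractions), not SISAL or any GPU code. The function name `mix_pixel`, the list-of-lists layout, and the non-negativity and sum-to-one checks on the fractions are assumptions made for illustration; the abstract itself does not state these constraints.

```python
def mix_pixel(endmembers, abundances):
    """Compose one mixed pixel under the linear mixing model.

    endmembers: list of spectra, one list of band values per pure material.
    abundances: list of material fractions, one per endmember.
    Hypothetical helper for illustration only.
    """
    if len(endmembers) != len(abundances):
        raise ValueError("one abundance per endmember is required")
    # Assumed constraints: fractions are non-negative and sum to one.
    if any(a < 0 for a in abundances) or abs(sum(abundances) - 1.0) > 1e-9:
        raise ValueError("abundances must be non-negative and sum to one")
    bands = len(endmembers[0])
    # Each band of the pixel is the abundance-weighted sum of the
    # corresponding band of every endmember.
    return [
        sum(a * spectrum[b] for a, spectrum in zip(abundances, endmembers))
        for b in range(bands)
    ]


if __name__ == "__main__":
    pure_spectra = [[1.0, 0.0, 2.0], [0.0, 1.0, 4.0]]  # illustrative data only
    print(mix_pixel(pure_spectra, [0.25, 0.75]))
```

Unmixing is the inverse task: given only the mixed pixels, estimate both the endmember spectra and the fractions, which is what makes the problem expensive on large scenes.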
Abstract:
Hyperspectral imaging, which captures hundreds of spectral bands at different (almost contiguous) wavelength channels over the same area, has become one of the main topics in remote sensing applications, generating large data volumes of several GBs per flight. This high spectral resolution can be used for object detection and for discriminating between different objects based on their spectral characteristics. One of the main problems involved in hyperspectral analysis is the presence of mixed pixels, which arise when the spatial resolution of the sensor is not able to separate spectrally distinct materials. Spectral unmixing is one of the most important tasks for hyperspectral data exploitation. However, unmixing algorithms can be computationally very expensive, and even power-hungry, which compromises their use in applications under onboard constraints. In recent years, graphics processing units (GPUs) have evolved into highly parallel and programmable systems. Specifically, several hyperspectral imaging algorithms have been shown to benefit from this hardware, taking advantage of the extremely high floating-point processing performance, compact size, huge memory bandwidth, and relatively low cost of these units, which make them appealing for onboard data processing. In this paper, we propose a parallel implementation of an augmented Lagrangian based method for unsupervised hyperspectral linear unmixing on GPUs using CUDA. The method, called simplex identification via split augmented Lagrangian (SISAL), aims to identify the endmembers of a scene, i.e., it is able to unmix hyperspectral data sets in which the pure pixel assumption is violated. The efficient implementation of the SISAL method presented in this work exploits the GPU architecture at a low level, using shared memory and coalesced memory accesses.
Abstract:
Remote hyperspectral sensors collect large amounts of data per flight, usually with low spatial resolution. The bandwidth of the connection between the satellite/airborne platform and the ground station is known to be limited, so an onboard compression method is desirable to reduce the amount of data to be transmitted. This paper presents a parallel implementation of a compressive sensing method, called parallel hyperspectral coded aperture (P-HYCA), for graphics processing units (GPUs) using the compute unified device architecture (CUDA). This method takes into account two main properties of hyperspectral datasets, namely the high correlation existing among the spectral bands and the generally low number of endmembers needed to explain the data, which largely reduces the number of measurements necessary to correctly reconstruct the original data. Experimental results obtained using synthetic and real hyperspectral datasets on two different GPU architectures by NVIDIA, GeForce GTX 590 and GeForce GTX TITAN, reveal that the use of GPUs can provide real-time compressive sensing performance. The achieved speedup is up to 20 times when compared with the processing time of HYCA running on one core of the Intel i7-2600 CPU (3.4 GHz) with 16 GB of memory.
Abstract:
The application of compressive sensing (CS) to hyperspectral images has been an active area of research over the past few years, both in terms of the hardware and the signal processing algorithms. However, CS algorithms can be computationally very expensive due to the extremely large volumes of data collected by imaging spectrometers, a fact that compromises their use in applications under real-time constraints. This paper proposes four efficient implementations of hyperspectral coded aperture (HYCA) for CS on commodity graphics processing units (GPUs): two of them termed P-HYCA and P-HYCA-FAST, and two additional implementations for its constrained version (CHYCA), termed P-CHYCA and P-CHYCA-FAST. The HYCA algorithm exploits the high correlation existing among the spectral bands of hyperspectral data sets and the generally low number of endmembers needed to explain the data, which largely reduces the number of measurements necessary to correctly reconstruct the original data. The proposed P-HYCA and P-CHYCA implementations have been developed using the compute unified device architecture (CUDA) and the cuFFT library. In the P-HYCA-FAST and P-CHYCA-FAST implementations, this library has been replaced by a fast iterative method, which leads to very significant speedup factors and helps meet real-time requirements. The proposed algorithms are evaluated not only in terms of reconstruction error for different compression ratios but also in terms of computational performance using two different GPU architectures by NVIDIA: 1) GeForce GTX 590; and 2) GeForce GTX TITAN. Experiments are conducted using both simulated and real data, revealing considerable acceleration factors and obtaining good results in the task of compressing remotely sensed hyperspectral data sets.
Abstract:
Master's degree in Electrical and Computer Engineering - Autonomous Systems Branch
Abstract:
Dissertation submitted to obtain the Master's degree in Informatics Engineering
Abstract:
Dissertation submitted to obtain the Master's degree in Biomedical Engineering
Abstract:
Dissertation submitted to obtain the Master's degree in Informatics Engineering