Biblioteca Digital

14 results for Parallel processing (Electronic computers) - Research

in Reposit

A many-core co-processor for embedded parallel computing on FPGA

Relevance: 100.00%

Abstract:

Single-processor architectures are unable to provide the performance required by high-performance embedded systems. Parallel processing based on general-purpose processors can achieve this performance, but with a considerable increase in required resources. However, in many cases, simplified, optimized parallel cores can be used instead of general-purpose processors, achieving better performance at lower resource utilization. In this paper, we propose a configurable many-core architecture to serve as a co-processor for high-performance embedded computing on Field-Programmable Gate Arrays. The architecture consists of an array of configurable simple cores with support for floating-point operations, interconnected with a configurable interconnection network. For each core it is possible to configure the size of the internal memory, the supported operations and the number of interfacing ports. The architecture was tested on a ZYNQ-7020 FPGA in the execution of several parallel algorithms. The results show that the proposed many-core architecture achieves better performance than that achieved with a parallel general-purpose processor and that up to 32 floating-point cores can be implemented in a ZYNQ-7020 SoC FPGA.

Sparse matrix multiplication on a reconfigurable many-core architecture

Relevance: 100.00%

Abstract:

Sparse matrix-vector multiplication (SMVM) is a fundamental operation in many scientific and engineering applications. In many cases sparse matrices have thousands of rows and columns where most of the entries are zero, while the non-zero data is spread over the matrix. This lack of data locality reduces the effectiveness of the data cache in general-purpose processors, considerably reducing their performance efficiency when compared to what is achieved with dense matrix multiplication. In this paper, we propose a parallel processing solution for SMVM in a many-core architecture. The architecture is tested with known benchmarks using a ZYNQ-7020 FPGA. The architecture is scalable in the number of core elements and limited only by the available memory bandwidth. It achieves performance efficiencies up to almost 70% and better performance than previous FPGA designs.
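
As an illustration of the operation discussed in this abstract, the following is a minimal serial sketch of sparse matrix-vector multiplication. It assumes the compressed sparse row (CSR) storage format, which the abstract does not mention; the function name `csr_matvec` and the array names are hypothetical. It is not the many-core implementation described in the paper.

```python
def csr_matvec(values, col_idx, row_ptr, x):
    """Multiply a CSR-encoded sparse matrix by a dense vector x.

    values  : non-zero entries, stored row by row
    col_idx : column index of each entry in `values`
    row_ptr : row_ptr[i]..row_ptr[i+1] delimits the entries of row i
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        # Only the stored non-zeros of this row are visited.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            # x is read at scattered positions, which is the source
            # of the poor cache locality mentioned in the abstract.
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


# Assumed 3x3 example matrix:
# [[10,  0,  0],
#  [ 0,  0, 20],
#  [30,  0, 40]]
values = [10.0, 20.0, 30.0, 40.0]
col_idx = [0, 2, 0, 2]
row_ptr = [0, 1, 2, 4]
result = csr_matvec(values, col_idx, row_ptr, [1.0, 2.0, 3.0])
```

Each row's dot product is independent of the others, so rows could in principle be distributed over separate cores; how the paper actually partitions the work is not stated in the abstract.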

Parallel hyperspectral coded aperture for compressive sensing on GPUs

Relevance: 40.00%

Abstract:

The application of compressive sensing (CS) to hyperspectral images has been an active area of research over the past few years, both in terms of the hardware and the signal processing algorithms. However, CS algorithms can be computationally very expensive due to the extremely large volumes of data collected by imaging spectrometers, a fact that compromises their use in applications under real-time constraints. This paper proposes four efficient implementations of hyperspectral coded aperture (HYCA) for CS on commodity graphics processing units (GPUs): two of them, termed P-HYCA and P-HYCA-FAST, and two additional implementations for its constrained version (CHYCA), termed P-CHYCA and P-CHYCA-FAST. The HYCA algorithm exploits the high correlation existing among the spectral bands of the hyperspectral data sets and the generally low number of endmembers needed to explain the data, which largely reduces the number of measurements necessary to correctly reconstruct the original data. The proposed P-HYCA and P-CHYCA implementations have been developed using the compute unified device architecture (CUDA) and the cuFFT library. Moreover, this library has been replaced by a fast iterative method in the P-HYCA-FAST and P-CHYCA-FAST implementations, which leads to very significant speedup factors in order to achieve real-time requirements. The proposed algorithms are evaluated not only in terms of reconstruction error for different compression ratios but also in terms of computational performance using two different GPU architectures by NVIDIA: 1) GeForce GTX 590; and 2) GeForce GTX TITAN. Experiments are conducted using both simulated and real data, revealing considerable acceleration factors and obtaining good results in the task of compressing remotely sensed hyperspectral data sets.

Solid-state Marx based two-switch voltage modulator for the On-Line Isotope Mass Separator accelerator at the European Organization for Nuclear Research

Relevance: 30.00%

Abstract:

A new circuit topology is proposed to replace the current pulse transformer and thyratron based resonant modulator that supplies the 60 kV target potential for the ion acceleration of the On-Line Isotope Mass Separator accelerator at the European Organization for Nuclear Research, the stability of which is critical for the mass resolution of the downstream separator. The improved modulator uses two solid-state switches working together, each one based on the Marx generator concept, operating as series and parallel switches, reducing the stress on the series stacked semiconductors, and also as an auxiliary pulse generator in order to fulfill the target requirements. Preliminary results of a 10 kV prototype, using 1200 V insulated gate bipolar transistors and capacitors in the solid-state Marx circuits, ten stages each, with an electrical equivalent circuit of the target, are presented, demonstrating both the improved voltage stability and the pulse flexibility desired for this new modulator.

Parallel implementation of vertex component analysis for hyperspectral endmember extraction

Relevance: 30.00%

Abstract:

International Conference with Peer Review 2012 IEEE International Conference in Geoscience and Remote Sensing Symposium (IGARSS), 22-27 July 2012, Munich, Germany

Climatology of the Iberia coastal low-level wind jet: weather research forecasting model high-resolution results

Relevance: 30.00%

Abstract:

Coastal low-level jets (CLLJ) are a low-tropospheric wind feature driven by the pressure gradient produced by a sharp contrast between high temperatures over land and lower temperatures over the sea. This contrast between the cold ocean and the warm land in the summer is intensified by the impact of the coast-parallel winds on the ocean, generating upwelling currents, sharpening the temperature gradient close to the coast and giving rise to strong baroclinic structures at the coast. During summertime, the Iberian Peninsula is often under the effect of the Azores High and of a thermal low pressure system inland, leading to a seasonal wind on the west coast called the Nortada (northerly wind). This study presents a regional climatology of the CLLJ off the west coast of the Iberian Peninsula, based on a 9 km resolution downscaling dataset, produced using the Weather Research and Forecasting (WRF) mesoscale model, forced by 19 years of ERA-Interim reanalysis (1989-2007). The simulation results show that the jet hourly frequency of occurrence in the summer is above 30% and decreases to about 10% during spring and autumn. The monthly frequencies of occurrence can reach higher values, around 40% in summer months, and reveal large inter-annual variability in all three seasons. In the summer, on a daily basis, the CLLJ is present on almost 70% of the days. The CLLJ wind direction is mostly north-northeasterly, and the jet occurs more persistently in three areas where the interaction of the jet flow with local capes and headlands is more pronounced. The coastal jets in this area occur at heights between 300 and 400 m, and their mean speed is around 15 m/s, reaching maximum speeds of 25 m/s.

Parallel hyperspectral unmixing on GPUs

Relevance: 30.00%

Abstract:

This letter presents a new parallel method for hyperspectral unmixing composed of the efficient combination of two popular methods: vertex component analysis (VCA) and sparse unmixing by variable splitting and augmented Lagrangian (SUNSAL). First, VCA extracts the endmember signatures, and then SUNSAL is used to estimate the abundance fractions. Both techniques are highly parallelizable, which significantly reduces the computing time. A design of the two methods for commodity graphics processing units is presented and evaluated. Experimental results obtained for simulated and real hyperspectral data sets reveal speedups of up to 100 times, which provides the real-time response required by many remotely sensed hyperspectral applications.

Large-area homogeneous periodic surface structures generated on the surface of sputtered boron carbide thin films by femtosecond laser processing

Relevance: 30.00%

Abstract:

Amorphous and crystalline sputtered boron carbide thin films have a very high hardness, even surpassing that of bulk crystalline boron carbide (≈41 GPa). However, magnetron sputtered B-C films have high friction coefficients (C.o.F), which limit their industrial application. Nanopatterning of material surfaces has been proposed as a solution to decrease the C.o.F. The contact area of the nanopatterned surfaces is decreased due to the nanometre size of the asperities, which results in a significant reduction of adhesion and friction. In the present work, the surface of amorphous and polycrystalline B-C thin films deposited by magnetron sputtering was nanopatterned using infrared femtosecond laser radiation. Successive parallel laser tracks 10 μm apart were overlapped in order to obtain a processed area of about 3 mm². Sinusoidal-like undulations with the same spatial period as the laser tracks were formed on the surface of the amorphous boron carbide films after laser processing. The undulation amplitude increases with increasing laser fluence. The formation of undulations with a 10 μm period was also observed on the surface of the crystalline boron carbide film processed with a pulse energy of 72 μJ. The amplitude of the undulations is about 10 times higher than in the amorphous films processed at the same pulse energy due to the higher roughness of the films and the consequent increase in laser radiation absorption. LIPSS formation on the surface of the films was achieved for the three B-C films under study. However, LIPSS are formed under different circumstances. Processing of the amorphous films at low fluence (72 μJ) results in LIPSS formation only on localized spots on the film surface. LIPSS formation was also observed on the top of the undulations formed after laser processing with 78 μJ of the amorphous film deposited at 800 °C. Finally, large-area homogeneous LIPSS coverage of the surface of the crystalline boron carbide films was achieved within a large range of laser fluences, although holes are also formed at higher laser fluences.

Parallel GPU architecture for hyperspectral unmixing based on augmented Lagrangian method

Relevance: 30.00%

Abstract:

Hyperspectral imaging has become one of the main topics in remote sensing applications. Hyperspectral images comprise hundreds of spectral bands at different (almost contiguous) wavelength channels over the same area, generating large data volumes of several GBs per flight. This high spectral resolution can be used for object detection and to discriminate between different objects based on their spectral characteristics. One of the main problems involved in hyperspectral analysis is the presence of mixed pixels, which arise when the spatial resolution of the sensor is not able to separate spectrally distinct materials. Spectral unmixing is one of the most important tasks for hyperspectral data exploitation. However, the unmixing algorithms can be computationally very expensive, and even highly power consuming, which compromises their use in applications under on-board constraints. In recent years, graphics processing units (GPUs) have evolved into highly parallel and programmable systems. Specifically, several hyperspectral imaging algorithms have been shown to benefit from this hardware, taking advantage of the extremely high floating-point processing performance, compact size, huge memory bandwidth, and relatively low cost of these units, which make them appealing for onboard data processing. In this paper, we propose a parallel implementation of an augmented Lagrangian based method for unsupervised hyperspectral linear unmixing on GPUs using CUDA. The method, called simplex identification via split augmented Lagrangian (SISAL), aims to identify the endmembers of a scene, i.e., it is able to unmix hyperspectral data sets in which the pure pixel assumption is violated. The efficient implementation of the SISAL method presented in this work exploits the GPU architecture at low level, using shared memory and coalesced accesses to memory.

Parallel hyperspectral compressive sensing method on GPU

Relevance: 30.00%

Abstract:

Remote hyperspectral sensors collect large amounts of data per flight, usually with low spatial resolution. It is known that the bandwidth of the connection between the satellite/airborne platform and the ground station is limited, thus an onboard compression method is desirable to reduce the amount of data to be transmitted. This paper presents a parallel implementation of a compressive sensing method, called parallel hyperspectral coded aperture (P-HYCA), for graphics processing units (GPU) using the compute unified device architecture (CUDA). This method takes into account two main properties of hyperspectral datasets, namely the high correlation existing among the spectral bands and the generally low number of endmembers needed to explain the data, which largely reduces the number of measurements necessary to correctly reconstruct the original data. Experimental results obtained using synthetic and real hyperspectral datasets on two different GPU architectures by NVIDIA, GeForce GTX 590 and GeForce GTX TITAN, reveal that the use of GPUs can provide real-time compressive sensing performance. The achieved speedup is up to 20 times when compared with the processing time of HYCA running on one core of the Intel i7-2600 CPU (3.4 GHz), with 16 GB of memory.

Parallel hyperspectral unmixing method via split augmented Lagrangian on GPU

Relevance: 30.00%

Abstract:

One of the main problems of hyperspectral data analysis is the presence of mixed pixels due to the low spatial resolution of such images. Linear spectral unmixing aims at inferring pure spectral signatures and their fractions at each pixel of the scene. The huge data volumes acquired by hyperspectral sensors put stringent requirements on processing and unmixing methods. This letter proposes an efficient implementation of the method called simplex identification via split augmented Lagrangian (SISAL), which exploits the graphics processing unit (GPU) architecture at low level using the Compute Unified Device Architecture (CUDA). SISAL aims to identify the endmembers of a scene, i.e., it is able to unmix hyperspectral data sets in which the pure pixel assumption is violated. The proposed implementation is performed in a pixel-by-pixel fashion using coalesced accesses to memory and exploiting shared memory to store temporary data. Furthermore, the kernels have been optimized to minimize thread divergence, therefore achieving high GPU occupancy. The experimental results obtained for the simulated and real hyperspectral data sets reveal speedups of up to 49 times, which demonstrates that the GPU implementation can significantly accelerate the method's execution over big data sets while maintaining the method's accuracy.

Parallel method for sparse semisupervised hyperspectral unmixing

Relevance: 30.00%

Abstract:

The parallel hyperspectral unmixing problem is considered in this paper. A semisupervised approach is developed under the linear mixture model, where the abundances' physical constraints are taken into account. The proposed approach relies on the increasing availability of spectral libraries of materials measured on the ground instead of resorting to endmember extraction methods. Since libraries are potentially very large and hyperspectral datasets are of high dimensionality, a parallel implementation in a pixel-by-pixel fashion is derived to properly exploit the graphics processing unit (GPU) architecture at low level, thus taking full advantage of the computational power of GPUs. Experimental results obtained for real hyperspectral datasets reveal significant speedup factors, up to 164 times, with respect to an optimized serial implementation.
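
The abstract does not write out the linear mixture model or its constraints. For orientation, the form commonly used in the unmixing literature is sketched below; the symbols are assumptions introduced here, not notation taken from the paper. The vector $\mathbf{y}$ is the observed spectrum of one pixel over $L$ bands, the columns of $\mathbf{M}$ are the $p$ spectral signatures (here, library members), $\boldsymbol{\alpha}$ holds the abundance fractions, and $\mathbf{n}$ is additive noise.

```latex
\mathbf{y} = \mathbf{M}\,\boldsymbol{\alpha} + \mathbf{n},
\qquad \mathbf{y}, \mathbf{n} \in \mathbb{R}^{L},\;
\mathbf{M} \in \mathbb{R}^{L \times p},\;
\boldsymbol{\alpha} \in \mathbb{R}^{p}

% Physical constraints usually placed on the abundances:
\boldsymbol{\alpha} \ge \mathbf{0}
\quad \text{(non-negativity)},
\qquad
\mathbf{1}^{T}\boldsymbol{\alpha} = 1
\quad \text{(sum-to-one)}
```

Because the model is written per pixel, each pixel can be unmixed independently of the others, which is consistent with the pixel-by-pixel parallelization the abstract describes.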

Parallel hyperspectral unmixing method on GPU

Relevance: 30.00%

Abstract:

Many hyperspectral imagery applications require a response in real time or near-real time. To meet this requirement, this paper proposes a parallel unmixing method developed for graphics processing units (GPU). This method is based on vertex component analysis (VCA), a highly parallelizable geometry-based method. VCA is a very fast and accurate method that extracts endmember signatures from large hyperspectral datasets without the use of any a priori knowledge about the constituent spectra. Experimental results obtained for simulated and real hyperspectral datasets reveal considerable acceleration factors, up to 24 times.

Parallel sparse unmixing of hyperspectral data

Relevance: 30.00%

Abstract:

In this paper, a new parallel method for sparse spectral unmixing of remotely sensed hyperspectral data on commodity graphics processing units (GPUs) is presented. A semi-supervised approach is adopted, which relies on the increasing availability of spectral libraries of materials measured on the ground instead of resorting to endmember extraction methods. This method is based on the spectral unmixing by splitting and augmented Lagrangian (SUNSAL) algorithm, which estimates the materials' abundance fractions. The parallel method is performed in a pixel-by-pixel fashion and its implementation properly exploits the GPU architecture at low level, thus taking full advantage of the computational power of GPUs. Experimental results obtained for simulated and real hyperspectral datasets reveal significant speedup factors, up to 164 times, with respect to an optimized serial implementation.
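
To make the pixel-by-pixel idea concrete, here is a toy serial sketch of abundance estimation. It is not SUNSAL: it assumes only two known signatures and uses the closed-form least-squares solution under the non-negativity and sum-to-one constraints, which exists only in this two-signature case. The function names and the sample spectra are hypothetical.

```python
def unmix_two_endmembers(pixel, m1, m2):
    """Return abundances (a1, a2) with a1 + a2 = 1 and both >= 0.

    Minimizes ||pixel - a1*m1 - (1 - a1)*m2||^2 over a1 in [0, 1].
    """
    d = [u - v for u, v in zip(m1, m2)]     # m1 - m2
    r = [p - v for p, v in zip(pixel, m2)]  # pixel - m2
    denom = sum(v * v for v in d)
    if denom == 0.0:
        raise ValueError("the two signatures must differ")
    a1 = sum(rv * dv for rv, dv in zip(r, d)) / denom
    a1 = min(1.0, max(0.0, a1))  # clip onto the feasible interval
    return a1, 1.0 - a1


def unmix_image(pixels, m1, m2):
    """Unmix every pixel independently of all the others."""
    return [unmix_two_endmembers(p, m1, m2) for p in pixels]


# Assumed three-band signatures and two sample pixels.
m1 = [1.0, 0.0, 0.5]
m2 = [0.0, 1.0, 0.5]
pixels = [
    [0.25, 0.75, 0.5],  # 25% of m1 mixed with 75% of m2
    [1.0, 0.0, 0.5],    # a pure m1 pixel
]
abundances = unmix_image(pixels, m1, m2)
```

No pixel's result depends on any other pixel, so the loop in `unmix_image` is the part a GPU implementation could map onto parallel threads; how the paper's implementation organizes that mapping is not described in the abstract.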
