35 resultados para Parallel numerical algorithms
em Repositório Científico do Instituto Politécnico de Lisboa - Portugal
Resumo:
A previously developed model is used to numerically simulate real clinical cases of the surgical correction of scoliosis. This model consists of one-dimensional finite elements with spatial deformation in which (i) the column is represented by its axis; (ii) the vertebrae are assumed to be rigid; and (iii) the deformability of the column is concentrated in springs that connect the successive rigid elements. The metallic rods used for the surgical correction are modeled by beam elements with linear elastic behavior. To obtain the forces at the connections between the metallic rods and the vertebrae geometrically, non-linear finite element analyses are performed. The tightening sequence determines the magnitude of the forces applied to the patient column, and it is desirable to keep those forces as small as possible. In this study, a Genetic Algorithm optimization is applied to this model in order to determine the sequence that minimizes the corrective forces applied during the surgery. This amounts to find the optimal permutation of integers 1, ... , n, n being the number of vertebrae involved. As such, we are faced with a combinatorial optimization problem isomorph to the Traveling Salesman Problem. The fitness evaluation requires one computing intensive Finite Element Analysis per candidate solution and, thus, a parallel implementation of the Genetic Algorithm is developed.
Resumo:
Single processor architectures are unable to provide the required performance of high performance embedded systems. Parallel processing based on general-purpose processors can achieve these performances with a considerable increase of required resources. However, in many cases, simplified optimized parallel cores can be used instead of general-purpose processors achieving better performance at lower resource utilization. In this paper, we propose a configurable many-core architecture to serve as a co-processor for high-performance embedded computing on Field-Programmable Gate Arrays. The architecture consists of an array of configurable simple cores with support for floating-point operations interconnected with a configurable interconnection network. For each core it is possible to configure the size of the internal memory, the supported operations and number of interfacing ports. The architecture was tested in a ZYNQ-7020 FPGA in the execution of several parallel algorithms. The results show that the proposed many-core architecture achieves better performance than that achieved with a parallel generalpurpose processor and that up to 32 floating-point cores can be implemented in a ZYNQ-7020 SoC FPGA.
Resumo:
Hyperspectral imaging has become one of the main topics in remote sensing applications, which comprise hundreds of spectral bands at different (almost contiguous) wavelength channels over the same area generating large data volumes comprising several GBs per flight. This high spectral resolution can be used for object detection and for discriminate between different objects based on their spectral characteristics. One of the main problems involved in hyperspectral analysis is the presence of mixed pixels, which arise when the spacial resolution of the sensor is not able to separate spectrally distinct materials. Spectral unmixing is one of the most important task for hyperspectral data exploitation. However, the unmixing algorithms can be computationally very expensive, and even high power consuming, which compromises the use in applications under on-board constraints. In recent years, graphics processing units (GPUs) have evolved into highly parallel and programmable systems. Specifically, several hyperspectral imaging algorithms have shown to be able to benefit from this hardware taking advantage of the extremely high floating-point processing performance, compact size, huge memory bandwidth, and relatively low cost of these units, which make them appealing for onboard data processing. In this paper, we propose a parallel implementation of an augmented Lagragian based method for unsupervised hyperspectral linear unmixing on GPUs using CUDA. The method called simplex identification via split augmented Lagrangian (SISAL) aims to identify the endmembers of a scene, i.e., is able to unmix hyperspectral data sets in which the pure pixel assumption is violated. The efficient implementation of SISAL method presented in this work exploits the GPU architecture at low level, using shared memory and coalesced accesses to memory.
Resumo:
The application of compressive sensing (CS) to hyperspectral images is an active area of research over the past few years, both in terms of the hardware and the signal processing algorithms. However, CS algorithms can be computationally very expensive due to the extremely large volumes of data collected by imaging spectrometers, a fact that compromises their use in applications under real-time constraints. This paper proposes four efficient implementations of hyperspectral coded aperture (HYCA) for CS, two of them termed P-HYCA and P-HYCA-FAST and two additional implementations for its constrained version (CHYCA), termed P-CHYCA and P-CHYCA-FAST on commodity graphics processing units (GPUs). HYCA algorithm exploits the high correlation existing among the spectral bands of the hyperspectral data sets and the generally low number of endmembers needed to explain the data, which largely reduces the number of measurements necessary to correctly reconstruct the original data. The proposed P-HYCA and P-CHYCA implementations have been developed using the compute unified device architecture (CUDA) and the cuFFT library. Moreover, this library has been replaced by a fast iterative method in the P-HYCA-FAST and P-CHYCA-FAST implementations that leads to very significant speedup factors in order to achieve real-time requirements. The proposed algorithms are evaluated not only in terms of reconstruction error for different compressions ratios but also in terms of computational performance using two different GPU architectures by NVIDIA: 1) GeForce GTX 590; and 2) GeForce GTX TITAN. Experiments are conducted using both simulated and real data revealing considerable acceleration factors and obtaining good results in the task of compressing remotely sensed hyperspectral data sets.
Resumo:
Storm- and tsunami-deposits are generated by similar depositional mechanisms making their discrimination hard to establish using classic sedimentologic methods. Here we propose an original approach to identify tsunami-induced deposits by combining numerical simulation and rock magnetism. To test our method, we investigate the tsunami deposit of the Boca do Rio estuary generated by the 1755 earthquake in Lisbon which is well described in the literature. We first test the 1755 tsunami scenario using a numerical inundation model to provide physical parameters for the tsunami wave. Then we use concentration (MS. SIRM) and grain size (chi(ARM), ARM, B1/2, ARM/SIRM) sensitive magnetic proxies coupled with SEM microscopy to unravel the magnetic mineralogy of the tsunami-induced deposit and its associated depositional mechanisms. In order to study the connection between the tsunami deposit and the different sedimentologic units present in the estuary, magnetic data were processed by multivariate statistical analyses. Our numerical simulation show a large inundation of the estuary with flow depths varying from 0.5 to 6 m and run up of similar to 7 m. Magnetic data show a dominance of paramagnetic minerals (quartz) mixed with lesser amount of ferromagnetic minerals, namely titanomagnetite and titanohematite both of a detrital origin and reworked from the underlying units. Multivariate statistical analyses indicate a better connection between the tsunami-induced deposit and a mixture of Units C and D. All these results point to a scenario where the energy released by the tsunami wave was strong enough to overtop and erode important amount of sand from the littoral dune and mixed it with reworked materials from underlying layers at least 1 m in depth. The method tested here represents an original and promising tool to identify tsunami-induced deposits in similar embayed beach environments.
Resumo:
Earthquakes and tsunamis along Morocco's coasts have been reported since historical times. The threat posed by tsunamis must be included in coastal risk studies. This study focuses on the tsunami impact and vulnerability assessment of the Casablanca harbour and surrounding area using a combination of tsunami inundation numerical modelling, field survey data and geographic information system. The tsunami scenario used here is compatible with the 1755 Lisbon event that we considered to be the worst case tsunami scenario. Hydrodynamic modelling was performed with an adapted version of the Cornell Multigrid Coupled Tsunami Model from Cornell University. The simulation covers the eastern domain of the Azores-Gibraltar fracture zone corresponding to the largest tsunamigenic area in the North Atlantic. The proposed vulnerability model attempts to provide an insight into the tsunami vulnerability of building stock. Results in the form of a vulnerability map will be useful for decision makers and local authorities in preventing the community resiliency for tsunami hazards.
Resumo:
The effects of the Miocene through Present compression in the Tagus Abyssal Plain are mapped using the most up to date available to scientific community multi-channel seismic reflection and refraction data. Correlation of the rift basin fault pattern with the deep crustal structure is presented along seismic line IAM-5. Four structural domains were recognized. In the oceanic realm mild deformation concentrates in Domain I adjacent to the Tore-Madeira Rise. Domain 2 is characterized by the absence of shortening structures, except near the ocean-continent transition (OCT), implying that Miocene deformation did not propagate into the Abyssal Plain, In Domain 3 we distinguish three sub-domains: Sub-domain 3A which coincides with the OCT, Sub-domain 3B which is a highly deformed adjacent continental segment, and Sub-domain 3C. The Miocene tectonic inversion is mainly accommodated in Domain 3 by oceanwards directed thrusting at the ocean-continent transition and continentwards on the continental slope. Domain 4 corresponds to the non-rifted continental margin where only minor extensional and shortening deformation structures are observed. Finite element numerical models address the response of the various domains to the Miocene compression, emphasizing the long-wavelength differential vertical movements and the role of possible rheologic contrasts. The concentration of the Miocene deformation in the transitional zone (TC), which is the addition of Sub-domain 3A and part of 3B, is a result of two main factors: (1) focusing of compression in an already stressed region due to plate curvature and sediment loading; and (2) theological weakening. We estimate that the frictional strength in the TC is reduced in 30% relative to the surrounding regions. A model of compressive deformation propagation by means of horizontal impingement of the middle continental crust rift wedge and horizontal shearing on serpentinized mantle in the oceanic realm is presented. This model is consistent with both the geological interpretation of seismic data and the results of numerical modelling.
Resumo:
Storm- and tsunami-deposits are generated by similar depositional mechanisms making their discrimination hard to establish using classic sedimentologic methods. Here we propose an original approach to identify tsunami-induced deposits by combining numerical simulation and rock magnetism. To test our method, we investigate the tsunami deposit of the Boca do Rio estuary generated by the 1755 earthquake in Lisbon which is well described in the literature. We first test the 1755 tsunami scenario using a numerical inundation model to provide physical parameters for the tsunami wave. Then we use concentration (MS. SIRM) and grain size (chi(ARM), ARM, B1/2, ARM/SIRM) sensitive magnetic proxies coupled with SEM microscopy to unravel the magnetic mineralogy of the tsunami-induced deposit and its associated depositional mechanisms. In order to study the connection between the tsunami deposit and the different sedimentologic units present in the estuary, magnetic data were processed by multivariate statistical analyses. Our numerical simulation show a large inundation of the estuary with flow depths varying from 0.5 to 6 m and run up of similar to 7 m. Magnetic data show a dominance of paramagnetic minerals (quartz) mixed with lesser amount of ferromagnetic minerals, namely titanomagnetite and titanohematite both of a detrital origin and reworked from the underlying units. Multivariate statistical analyses indicate a better connection between the tsunami-induced deposit and a mixture of Units C and D. All these results point to a scenario where the energy released by the tsunami wave was strong enough to overtop and erode important amount of sand from the littoral dune and mixed it with reworked materials from underlying layers at least 1 m in depth. The method tested here represents an original and promising tool to identify tsunami-induced deposits in similar embayed beach environments.
Resumo:
Microcrystalline silicon is a two-phase material. Its composition can be interpreted as a series of grains of crystalline silicon imbedded in an amorphous silicon tissue, with a high concentration of dangling bonds in the transition regions. In this paper, results for the transport properties of a mu c-Si:H p-i-n junction obtained by means of two-dimensional numerical simulation are reported. The role played by the boundary regions between the crystalline grains and the amorphous matrix is taken into account and these regions are treated similar to a heterojunction interface. The device is analysed under AM1.5 illumination and the paper outlines the influence of the local electric field at the grain boundary transition regions on the internal electric configuration of the device and on the transport mechanism within the mu c-Si:H intrinsic layer.
Resumo:
Storm- and tsunami-deposits are generated by similar depositional mechanisms making their discrimination hard to establish using classic sedimentologic methods. Here we propose an original approach to identify tsunami-induced deposits by combining numerical simulation and rock magnetism. To test our method, we investigate the tsunami deposit of the Boca do Rio estuary generated by the 1755 earthquake in Lisbon which is well described in the literature. We first test the 1755 tsunami scenario using a numerical inundation model to provide physical parameters for the tsunami wave. Then we use concentration (MS. SIRM) and grain size (chi(ARM), ARM, B1/2, ARM/SIRM) sensitive magnetic proxies coupled with SEM microscopy to unravel the magnetic mineralogy of the tsunami-induced deposit and its associated depositional mechanisms. In order to study the connection between the tsunami deposit and the different sedimentologic units present in the estuary, magnetic data were processed by multivariate statistical analyses. Our numerical simulation show a large inundation of the estuary with flow depths varying from 0.5 to 6 m and run up of similar to 7 m. Magnetic data show a dominance of paramagnetic minerals (quartz) mixed with lesser amount of ferromagnetic minerals, namely titanomagnetite and titanohematite both of a detrital origin and reworked from the underlying units. Multivariate statistical analyses indicate a better connection between the tsunami-induced deposit and a mixture of Units C and D. All these results point to a scenario where the energy released by the tsunami wave was strong enough to overtop and erode important amount of sand from the littoral dune and mixed it with reworked materials from underlying layers at least 1 m in depth. The method tested here represents an original and promising tool to identify tsunami-induced deposits in similar embayed beach environments.
Resumo:
A novel high throughput and scalable unified architecture for the computation of the transform operations in video codecs for advanced standards is presented in this paper. This structure can be used as a hardware accelerator in modern embedded systems to efficiently compute all the two-dimensional 4 x 4 and 2 x 2 transforms of the H.264/AVC standard. Moreover, its highly flexible design and hardware efficiency allows it to be easily scaled in terms of performance and hardware cost to meet the specific requirements of any given video coding application. Experimental results obtained using a Xilinx Virtex-5 FPGA demonstrated the superior performance and hardware efficiency levels provided by the proposed structure, which presents a throughput per unit of area relatively higher than other similar recently published designs targeting the H.264/AVC standard. Such results also showed that, when integrated in a multi-core embedded system, this architecture provides speedup factors of about 120x concerning pure software implementations of the transform algorithms, therefore allowing the computation, in real-time, of all the above mentioned transforms for Ultra High Definition Video (UHDV) sequences (4,320 x 7,680 @ 30 fps).
Resumo:
The application of a-SiC:H/a-Si:H pinpin photodiodes for optoelectronic applications as a WDM demultiplexer device has been demonstrated useful in optical communications that use the WDM technique to encode multiple signals in the visible light range. This is required in short range optical communication applications, where for costs reasons the link is provided by Plastic Optical Fibers. Characterization of these devices has shown the presence of large photocapacitive effects. By superimposing background illumination to the pulsed channel the device behaves as a filter, producing signal attenuation, or as an amplifier, producing signal gain, depending on the channel/background wavelength combination. We present here results, obtained by numerical simulations, about the internal electric configuration of a-SiC:H/a-Si:H pinpin photodiode. These results address the explanation of the device functioning in the frequency domain to a wavelength tunable photo-capacitance due to the accumulation of space charge localized at the bottom diode that, according to the Shockley-Read-Hall model, it is mainly due to defect trapping. Experimental result about measurement of the photodiode capacitance under different conditions of illumination and applied bias will be also presented. The combination of these analyses permits the description of a wavelength controlled photo-capacitance that combined with the series and parallel resistance of the diodes may result in the explicit definition of cut off frequencies for frequency capacitive filters activated by the light background or an oscillatory resonance of photogenerated carriers between the two diodes. (C) 2013 Elsevier B.V. All rights reserved.
Resumo:
This work aims at investigating the impact of treating breast cancer using different radiation therapy (RT) techniques – forwardly-planned intensity-modulated, f-IMRT, inversely-planned IMRT and dynamic conformal arc (DCART) RT – and their effects on the whole-breast irradiation and in the undesirable irradiation of the surrounding healthy tissues. Two algorithms of iPlan BrainLAB treatment planning system were compared: Pencil Beam Convolution (PBC) and commercial Monte Carlo (iMC). Seven left-sided breast patients submitted to breast-conserving surgery were enrolled in the study. For each patient, four RT techniques – f-IMRT, IMRT using 2-fields and 5-fields (IMRT2 and IMRT5, respectively) and DCART – were applied. The dose distributions in the planned target volume (PTV) and the dose to the organs at risk (OAR) were compared analyzing dose–volume histograms; further statistical analysis was performed using IBM SPSS v20 software. For PBC, all techniques provided adequate coverage of the PTV. However, statistically significant dose differences were observed between the techniques, in the PTV, OAR and also in the pattern of dose distribution spreading into normal tissues. IMRT5 and DCART spread low doses into greater volumes of normal tissue, right breast, right lung and heart than tangential techniques. However, IMRT5 plans improved distributions for the PTV, exhibiting better conformity and homogeneity in target and reduced high dose percentages in ipsilateral OAR. DCART did not present advantages over any of the techniques investigated. Differences were also found comparing the calculation algorithms: PBC estimated higher doses for the PTV, ipsilateral lung and heart than the iMC algorithm predicted.
Resumo:
Trabalho Final de Mestrado para obtenção do grau de Mestre em Engenharia Mecânica