983 results for Graphics processing unit programming


Relevance: 100.00%

Abstract:

Graphics Processing Units (GPUs) are becoming popular accelerators in modern High-Performance Computing (HPC) clusters. Installing GPUs on every node of the cluster is not efficient, resulting in high costs and power consumption as well as underutilisation of the accelerators. The research reported in this paper is motivated by the use of a few physical GPUs, giving cluster nodes on-demand access to remote GPUs for a financial risk application. We hypothesise that sharing GPUs between several nodes, referred to as multi-tenancy, reduces the execution time and energy consumed by an application. Two data transfer modes between the CPU and the GPUs, namely concurrent and sequential, are explored. The key result from the experiments is that multi-tenancy with a few physical GPUs using sequential data transfers lowers the execution time and the energy consumed, thereby improving the overall performance of the application.
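
As a rough, hypothetical illustration of the two transfer modes compared above (not the authors' implementation; the kernel and data are placeholders), a Python/CuPy sketch can issue host-to-device copies either one after another on the default stream or on per-chunk streams:

```python
import numpy as np
import cupy as cp  # assumes a CUDA-capable GPU with CuPy installed

def sequential_transfers(chunks):
    # Copy and process one chunk after another on the default stream.
    out = []
    for c in chunks:
        d = cp.asarray(c)              # host -> device copy
        out.append(cp.sqrt(d))         # stand-in for the risk kernel
    cp.cuda.Device().synchronize()
    return out

def concurrent_transfers(chunks):
    # Issue each chunk on its own stream; with pinned host buffers the
    # copies and kernels of different chunks can overlap.
    streams = [cp.cuda.Stream(non_blocking=True) for _ in chunks]
    out = []
    for c, s in zip(chunks, streams):
        with s:
            d = cp.asarray(c)
            out.append(cp.sqrt(d))
    for s in streams:
        s.synchronize()
    return out

chunks = [np.random.rand(1 << 20) for _ in range(4)]   # placeholder data
```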

Relevance: 100.00%

Abstract:

A major weakness among loading models for pedestrians walking on flexible structures proposed in recent years is the various uncorroborated assumptions made in their development. This applies to the spatio-temporal characteristics of pedestrian loading and the nature of multi-object interactions. To alleviate this problem, a framework for the determination of localised pedestrian forces on full-scale structures is presented using wireless attitude and heading reference systems (AHRS). An AHRS comprises a triad of tri-axial accelerometers, gyroscopes and magnetometers managed by a dedicated data processing unit, allowing motion in three-dimensional space to be reconstructed. A pedestrian loading model based on a single-point inertial measurement from an AHRS is derived and shown to perform well against benchmark data collected on an instrumented treadmill. Unlike other models, the current model does not take any predefined form, nor does it require any extrapolations as to the timing and amplitude of pedestrian loading. In order to correctly assess the influence of the moving pedestrian on the behaviour of a structure, an algorithm for tracking the point of application of the pedestrian force is developed based on data from a single AHRS attached to a foot. A set of controlled walking tests with a single pedestrian is conducted on a real footbridge for validation purposes. A remarkably good match between the measured and simulated bridge response is found, confirming the applicability of the proposed framework.
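
As a loose sketch of the general idea of an inertia-based loading model only (the abstract does not give the actual formulation, and the mass, signal and parameters below are assumptions), a single-point vertical force estimate could follow Newton's second law:

```python
import numpy as np

def vertical_force(mass_kg, accel_global_z, g=9.81):
    """Loose sketch, not the paper's derivation: given the body-centre
    vertical acceleration resolved in the global frame (as an AHRS can
    provide), return the vertical ground reaction force in newtons."""
    return mass_kg * (g + accel_global_z)

# Example: 75 kg pedestrian, 2 Hz sinusoidal vertical acceleration trace.
t = np.linspace(0.0, 2.0, 400)
a_z = 2.0 * np.sin(2.0 * np.pi * 2.0 * t)      # m/s^2, placeholder signal
force = vertical_force(75.0, a_z)
```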

Relevance: 100.00%

Abstract:

The problem addressed concerns the determination of the average number of successive attempts of guessing a word of a certain length consisting of letters with given probabilities of occurrence. Both first- and second-order approximations to a natural language are considered. The guessing strategy used is guessing words in decreasing order of probability. When word and alphabet sizes are large, approximations are necessary in order to estimate the number of guesses. Several kinds of approximations are discussed, demonstrating moderate requirements regarding both memory and central processing unit (CPU) time. When considering realistic sizes of alphabets and words (100), the number of guesses can be estimated within minutes with reasonable accuracy (a few percent) and may therefore constitute an alternative to, e.g., various entropy expressions. For many probability distributions, the density of the logarithm of probability products is close to a normal distribution. For those cases, it is possible to derive an analytical expression for the average number of guesses. The proportion of guesses needed on average compared to the total number decreases almost exponentially with the word length. The leading term in an asymptotic expansion can be used to estimate the number of guesses for large word lengths. Comparisons with analytical lower bounds and entropy expressions are also provided.
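
For intuition, under the first-order (i.i.d. letters) approximation the exact expected guess count can be computed by brute force for tiny alphabets and short words; it is exactly this enumeration that becomes infeasible at realistic sizes and motivates the approximations above. A minimal Python sketch:

```python
import itertools
import numpy as np

def average_guesses(letter_probs, word_len):
    # Enumerate all words, sort their probabilities in decreasing order
    # (the optimal guessing order) and return the expected rank.
    probs = [np.prod(w) for w in itertools.product(letter_probs, repeat=word_len)]
    p = np.sort(np.asarray(probs))[::-1]
    return float(np.sum(np.arange(1, p.size + 1) * p))

print(average_guesses([0.5, 0.3, 0.2], word_len=4))   # 3-letter alphabet
```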

Relevance: 100.00%

Abstract:

In this paper, we develop a fast implementation of the hyperspectral coded aperture (HYCA) algorithm on different platforms using OpenCL, an open standard for parallel programming on heterogeneous systems that covers a wide variety of devices, from dense multicore systems by major manufacturers such as Intel or ARM to accelerators such as graphics processing units (GPUs), field programmable gate arrays (FPGAs), the Intel Xeon Phi and other custom devices. Our proposed implementation of HYCA significantly reduces its computational cost. Our experiments, conducted using simulated data, reveal considerable acceleration factors. Implementations written once in the same descriptive language and run on different architectures are very important in order to realistically assess the potential of heterogeneous platforms for efficient hyperspectral image processing in real remote sensing missions.
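
Not the HYCA code itself, but a minimal PyOpenCL round trip illustrating the portability point: the same kernel source runs unchanged on whichever CPU, GPU or accelerator the OpenCL platform exposes (the kernel and data below are placeholders):

```python
import numpy as np
import pyopencl as cl

src = """
__kernel void scale(__global const float *x, __global float *y, float a) {
    int i = get_global_id(0);
    y[i] = a * x[i];
}
"""

ctx = cl.create_some_context()            # picks any available OpenCL device
queue = cl.CommandQueue(ctx)
prog = cl.Program(ctx, src).build()

x = np.arange(1024, dtype=np.float32)
mf = cl.mem_flags
x_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=x)
y_buf = cl.Buffer(ctx, mf.WRITE_ONLY, x.nbytes)

prog.scale(queue, x.shape, None, x_buf, y_buf, np.float32(2.0))
y = np.empty_like(x)
cl.enqueue_copy(queue, y, y_buf)          # device -> host copy of the result
```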

Relevance: 100.00%

Abstract:

This work presents a low-cost architecture for the development of synchronized phasor measurement units (PMUs). The device is intended to be connected to the low-voltage grid, which allows the monitoring of transmission and distribution networks. The project comprises a complete PMU, with an instrumentation module for use in the low-voltage network, a GPS module providing the synchronization signal and time stamp for the measurements, a processing unit with the acquisition system, phasor estimation and data formatting according to the standard, and finally a communication module for data transmission. To develop and evaluate the performance of this PMU, a set of applications was built in the LabVIEW environment with specific features that allow the behaviour of the measurements to be analysed, the sources of error of the PMU to be identified, and all the tests proposed by the standard to be applied. The first application, useful for the development of the instrumentation, consists of a function generator integrated with an oscilloscope, which allows signals to be generated and acquired synchronously, in addition to the handling of samples. The second and main application is the test platform, capable of generating all the tests specified by the synchronized phasor measurement standard IEEE C37.118.1 and allowing data to be stored or the measurements to be analysed in real time. Finally, a third application was developed to evaluate the test results and generate calibration curves to adjust the PMU. The results include all the tests proposed by the synchrophasor standard and an additional test that evaluates the impact of noise. Moreover, two prototypes connected to the electrical installations of consumers in the same distribution circuit produced monitoring records that allowed the identification of loads at the consumer, power quality analysis, and event detection at the distribution and transmission levels.
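
As a simplified illustration of the phasor estimation stage only (a compliant PMU adds filtering, interpolation and timestamp alignment per IEEE C37.118.1; this is not the authors' estimator, and the sampling rate and waveform are placeholders), a one-cycle DFT estimate in Python:

```python
import numpy as np

def estimate_phasor(samples, fs, f0=60.0):
    # One-cycle DFT: magnitude is the RMS value, angle is measured
    # relative to the start of the analysed cycle.
    n = int(round(fs / f0))                    # samples per nominal cycle
    x = np.asarray(samples[-n:], dtype=float)  # latest full cycle
    k = np.arange(n)
    ph = (np.sqrt(2.0) / n) * np.sum(x * np.exp(-2j * np.pi * k / n))
    return abs(ph), np.angle(ph)

fs = 4800.0                                    # 80 samples/cycle at 60 Hz
t = np.arange(0, 0.2, 1 / fs)
signal = 311.0 * np.cos(2 * np.pi * 60.0 * t + 0.5)   # placeholder waveform
mag, ang = estimate_phasor(signal, fs)         # mag ~ 311/sqrt(2), ang ~ 0.5 rad
```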

Relevance: 100.00%

Abstract:

This thesis develops and tests various transient and steady-state computational models, such as direct numerical simulation (DNS), large eddy simulation (LES), filtered unsteady Reynolds-averaged Navier-Stokes (URANS) and steady Reynolds-averaged Navier-Stokes (RANS), with and without magnetic field, to investigate turbulent flows in canonical geometries as well as in the nozzle and mold geometries of the continuous casting process. Direct numerical simulations are first performed in channel, square-duct and 2:1 aspect-ratio rectangular-duct geometries to investigate the effect of the magnetic field on turbulent flows. The rectangular duct is a more practical geometry for the continuous casting nozzle and mold and offers the option of applying the magnetic field perpendicular to either the broader or the shorter side. This work forms part of the development of a graphics processing unit (GPU) based CFD code (CU-FLOW) for magnetohydrodynamic (MHD) turbulent flows. The DNS results revealed interesting effects of the magnetic field and its orientation on primary and secondary flows (instantaneous and mean), Reynolds stresses, turbulent kinetic energy (TKE) budgets, momentum budgets and frictional losses, besides providing a DNS database for two-wall-bounded square and rectangular duct MHD turbulent flows. Further, low- and high-Reynolds-number RANS models (k-ε and Reynolds stress models) are developed and tested against the DNS databases for channel and square-duct flows with and without magnetic field. The MHD sink terms in the k- and ε-equations are implemented as proposed by Kenjereš and Hanjalić using a user-defined function (UDF) in FLUENT. This work revealed the varying accuracy of the different RANS models and helps industry, including continuous casting, understand the accuracy of these models. After assessing the accuracy and computational cost of the RANS models, the steady-state k-ε model is combined with particle image velocimetry (PIV) and impeller-probe velocity measurements in a 1/3rd-scale water model to study the flow quality coming out of well- and mountain-bottom nozzles and the effect of stopper-rod misalignment on the fluid flow. The mountain-bottom nozzle was found to be more prone to long-time asymmetries and higher surface velocities. A left misalignment of the stopper gave a higher surface velocity on the right, leading to a significantly larger number of vortices forming behind the nozzle on the left. Later, transient and steady-state models, namely LES, filtered URANS and steady RANS, are combined with ultrasonic Doppler velocimetry (UDV) measurements in a GaInSn model of a typical continuous casting process. LES-CU-FLOW was the fastest and most accurate model, owing to a much finer mesh and a smaller timestep, and this work provided a good understanding of the performance of these models. The behaviour of the instantaneous flows, the Reynolds stresses and a proper orthogonal decomposition (POD) analysis quantified the nozzle-bottom swirl and its importance for the turbulent flow in the mold. Afterwards, the aforementioned work in the GaInSn model is extended with electromagnetic braking (EMBr) to help optimize a ruler-type brake and its location for the continuous casting process. The magnetic field suppressed turbulence and promoted vortical structures with their axes aligned with the magnetic field, suggesting a tendency towards two-dimensional turbulence. The stronger magnetic field at the nozzle well and around the jet region created large-scale, lower-frequency flow behaviour by suppressing the nozzle-bottom swirl and its front-back alternation. Based on this work, it is advised to avoid a strong magnetic field around the jet and nozzle bottom in order to obtain a more stable and less defect-prone flow.
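
For context (standard definitions, not specific to this thesis or to the Kenjereš-Hanjalić closure it uses), the strength of the Lorentz-force effects discussed above is usually characterised by the Hartmann and Stuart (interaction) numbers,

$$\mathrm{Ha} = B_0 L \sqrt{\frac{\sigma}{\rho\nu}}, \qquad N = \frac{\sigma B_0^2 L}{\rho U},$$

where $B_0$ is the applied field strength, $L$ a characteristic length, $U$ a characteristic velocity, $\sigma$ the electrical conductivity, $\rho$ the density and $\nu$ the kinematic viscosity; large $N$ favours the quasi-two-dimensional turbulence noted above.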

Relevance: 100.00%

Abstract:

The Solar Intensity X-ray and particle Spectrometer (SIXS) on board BepiColombo's Mercury Planetary Orbiter (MPO) will study solar energetic particles moving towards Mercury and solar X-rays on the dayside of Mercury. The SIXS instrument consists of two detector sub-systems: the X-ray detector SIXS-X and the particle detector SIXS-P. The SIXS-P sub-detector will detect solar energetic electrons and protons over a broad energy range using a particle-telescope approach, with five outer Si detectors around a central CsI(Tl) scintillator. The measurements made by the SIXS instrument are necessary for other instruments on board the spacecraft. SIXS data will be used to study the solar X-ray corona, solar flares, solar energetic particles, the Hermean magnetosphere, and solar eruptions. The SIXS-P detector was calibrated by comparing experimental measurement data from the instrument with Geant4 simulation data. Calibration curves were produced for the different side detectors and the core scintillator, for electrons and protons respectively. The side-detector energy response was found to be linear for both electrons and protons, while the core-scintillator energy response to protons was found to be non-linear; the core-scintillator calibration for electrons was omitted due to insufficient experimental data. The electron and proton acceptance of the SIXS-P detector was determined with Geant4 simulations. The electron and proton energy channels are clean in the main energy range of the instrument; at higher energies, protons and electrons produce a non-ideal response in the energy channels. Due to the limited bandwidth of the spacecraft's telemetry, the particle measurements made by SIXS-P have to be pre-processed in the data processing unit of the SIXS instrument. A lookup table was created for this pre-processing with Geant4 simulations, and the ability of the lookup table to provide spectral information from a simulated electron event was analysed. The lookup table produces clean electron and proton channels and is able to separate protons from electrons. Based on a simulated solar energetic electron event, however, the incident electron spectrum cannot be determined from the channel particle counts with a standard analysis method.
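
As a purely hypothetical sketch of how an onboard lookup table might map a delta-E/E measurement pair to a counter channel (the actual SIXS table, binning and channel definitions are not described in the abstract; `de_edges`, `e_edges` and `table` are assumed inputs):

```python
import numpy as np

def classify(delta_e_keV, core_e_keV, de_edges, e_edges, table):
    # Bin the side-detector energy deposit (delta_e) and the core
    # scintillator energy (core_e), then read the channel id from a
    # precomputed 2-D table (e.g. an electron, proton or reject channel).
    i = np.clip(np.searchsorted(de_edges, delta_e_keV) - 1, 0, table.shape[0] - 1)
    j = np.clip(np.searchsorted(e_edges, core_e_keV) - 1, 0, table.shape[1] - 1)
    return table[i, j]
```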

Relevance: 100.00%

Abstract:

The research described in this thesis was motivated by the need for a robust model capable of representing 3D data obtained with 3D sensors, which are inherently noisy. In addition, time constraints have to be considered, as these sensors are capable of providing a 3D data stream in real time. This thesis proposes the use of Self-Organizing Maps (SOMs) as a 3D representation model. In particular, we propose the use of the Growing Neural Gas (GNG) network, which has been successfully used for clustering, pattern recognition and topology representation of multi-dimensional data. Until now, Self-Organizing Maps have been primarily computed offline, and their application to 3D data has mainly focused on noise-free models without considering time constraints. A hardware implementation is proposed that leverages the computing power of modern GPUs, taking advantage of a new paradigm coined General-Purpose Computing on Graphics Processing Units (GPGPU). The proposed methods were applied to different problems and applications in the area of computer vision, such as the recognition and localization of objects, visual surveillance and 3D reconstruction.
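
For illustration, a single GNG adaptation step looks roughly as follows (a minimal CPU sketch in Python, not the thesis's GPU implementation; a full GNG also ages edges, accumulates error and inserts nodes periodically):

```python
import numpy as np

def gng_adapt(nodes, neighbours, x, eps_b=0.05, eps_n=0.005):
    # nodes: (N, 3) float array of unit positions; neighbours: dict mapping
    # a unit index to the set of units it is connected to. Move the
    # best-matching unit towards the input sample x, and its topological
    # neighbours by a smaller fraction.
    d = np.linalg.norm(nodes - x, axis=1)
    s1 = int(np.argmin(d))
    nodes[s1] += eps_b * (x - nodes[s1])
    for n in neighbours.get(s1, ()):
        nodes[n] += eps_n * (x - nodes[n])
    return s1
```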

Relevance: 100.00%

Abstract:

Virtual Screening (VS) methods can considerably aid clinical research by predicting how ligands interact with drug targets. Most VS methods assume a unique binding site for the target, but it has been demonstrated that diverse ligands interact with unrelated parts of the target, and many VS methods do not take this relevant fact into account. This problem is circumvented by a novel VS methodology named BINDSURF, which scans the whole protein surface to find new hotspots where ligands might potentially interact and which is implemented on massively parallel Graphics Processing Units, allowing fast processing of large ligand databases. BINDSURF can thus be used in drug discovery, drug design and drug repurposing, and therefore helps considerably in clinical research. However, the accuracy of most VS methods is constrained by limitations in the scoring function that describes biomolecular interactions, and even nowadays these uncertainties are not completely understood. In order to address this problem, we propose a novel approach in which neural networks are trained with databases of known active (drugs) and inactive compounds and later used to improve VS predictions.
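
As an illustrative sketch of the rescoring idea only (descriptors, labels and network size are placeholders, not the authors' setup), a small neural network trained on actives and inactives could re-rank candidate ligands by its predicted activity probability:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X_train = rng.random((200, 16))          # placeholder molecular descriptors
y_train = rng.integers(0, 2, 200)        # 1 = known active, 0 = inactive

clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500).fit(X_train, y_train)

X_hits = rng.random((10, 16))            # descriptors of docking hits
rescore = clf.predict_proba(X_hits)[:, 1]   # higher = more likely active
ranking = np.argsort(-rescore)           # re-ranked hit list
```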

Relevance: 100.00%

Abstract:

Various mechanisms have been proposed to explain extreme waves or rogue waves in an oceanic environment, including directional focusing, dispersive focusing, wave-current interaction, and nonlinear modulational instability. The Benjamin-Feir instability (nonlinear modulational instability), however, is considered to be one of the primary mechanisms for rogue-wave occurrence. The nonlinear Schrödinger equation is a well-established approximate model based on the same assumptions as required for the derivation of the Benjamin-Feir theory. Solutions of the nonlinear Schrödinger equation, including new rogue-wave-type solutions, are presented in the author's dissertation work. The solutions are obtained using a predictive eigenvalue-map-based predictor-corrector procedure developed by the author. Features of the predictive map are explored and the influences of certain parameter variations are investigated. The solutions are rescaled to match the length scales of waves generated in a wave tank. Based on the information provided by the map and the details of physical scaling, a framework is developed that can serve as a basis for experimental investigations into a variety of extreme waves as well as localizations in wave fields. To derive further fundamental insights into the complexity of extreme wave conditions, Smoothed Particle Hydrodynamics (SPH) simulations are carried out on an advanced Graphics Processing Unit (GPU) based parallel computational platform. Free-surface gravity wave simulations have successfully characterized water-wave dispersion in the SPH model while demonstrating extreme energy focusing and wave growth in both linear and nonlinear regimes. A virtual wave tank is simulated wherein wave motions can be excited from either side. Focusing of several wave trains and isolated waves has been simulated. With properly chosen parameters, dispersion effects are observed, causing a chirped wave train to focus and exhibit growth. By using the insights derived from the study of the nonlinear Schrödinger equation, modulational instability, or self-focusing, has been induced in a numerical wave tank and studied through several numerical simulations. Due to the inherent dissipative nature of SPH models, simulating persistent progressive waves can be problematic; this issue has been addressed and an observation-based solution has been provided. The efficacy of SPH in modeling wave focusing can be critical to furthering our understanding and prediction of extreme wave phenomena through simulations. A deeper understanding of the mechanisms underlying extreme energy localization phenomena can help facilitate energy harnessing and serve as a basis to predict and mitigate the impact of energy focusing.
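
For reference, in one standard normalization (not necessarily the scaling used in the dissertation) the focusing nonlinear Schrödinger equation for the complex wave envelope $\psi(x,t)$ reads

$$ i\,\psi_t + \tfrac{1}{2}\,\psi_{xx} + |\psi|^2\psi = 0, $$

whose modulationally unstable plane-wave solutions underlie the Benjamin-Feir mechanism discussed above, and whose breather-type solutions (e.g. the Peregrine solution) are commonly used as rogue-wave prototypes.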

Relevance: 100.00%

Abstract:

Nowadays, the new generation of computers provides high performance that makes it possible to build computationally expensive computer vision applications for mobile robotics. Building a map of the environment is a common task for a robot and an essential capability for moving through that environment. Traditionally, mobile robots have used a combination of sensors from different technologies: lasers, sonars and contact sensors have typically been used in mobile robotic architectures. However, color cameras are an important sensor because we want robots to use the same information that humans use to sense and move through different environments. Color cameras are cheap and flexible, but a lot of work needs to be done to give robots enough visual understanding of the scenes. Computer vision algorithms are computationally complex problems, but nowadays robots have access to different and powerful architectures that can be used for mobile robotics purposes. The advent of low-cost RGB-D sensors like Microsoft Kinect, which provide 3D colored point clouds at high frame rates, has made computer vision even more relevant in the mobile robotics field. The combination of visual and 3D data allows systems to use both computer vision and 3D processing and therefore to be aware of more details of the surrounding environment. The research described in this thesis was motivated by the need for scene mapping. Being aware of the surrounding environment is a key feature in many mobile robotics applications, from simple robotic navigation to complex surveillance applications. In addition, the acquisition of a 3D model of a scene is useful in many areas, such as video game scene modeling, where well-known places are reconstructed and added to game systems, or advertising, where once the 3D model of a room is obtained the system can add furniture pieces using augmented reality techniques. In this thesis we perform an experimental study of state-of-the-art registration methods to find which one best fits our scene-mapping purposes. Different methods are tested and analyzed on scenes with different distributions of visual and geometric appearance. In addition, this thesis proposes two methods for 3D data compression and representation of 3D maps. Our 3D representation proposal is based on the use of the Growing Neural Gas (GNG) method. This type of Self-Organizing Map (SOM) has been successfully used for clustering, pattern recognition and topology representation of various kinds of data. Until now, Self-Organizing Maps have been primarily computed offline, and their application to 3D data has mainly focused on noise-free models without considering time constraints. Self-organising neural models have the ability to provide a good representation of the input space. In particular, the Growing Neural Gas (GNG) is a suitable model because of its flexibility, rapid adaptation and excellent quality of representation. However, this type of learning is time-consuming, especially for high-dimensional input data. Since real applications often work under time constraints, it is necessary to adapt the learning process so that it completes within a predefined time. This thesis proposes a hardware implementation leveraging the computing power of modern GPUs, taking advantage of a new paradigm coined General-Purpose Computing on Graphics Processing Units (GPGPU). Our proposed geometric 3D compression method seeks to reduce the 3D information using plane detection as the basic structure to compress the data. This is because our target environments are man-made and therefore contain many points that belong to planar surfaces. Our proposed method achieves good compression results in such man-made scenarios. The detected and compressed planes can also be used in other applications, such as surface reconstruction or plane-based registration algorithms. Finally, we have also demonstrated the capabilities of GPU technologies by obtaining a high-performance implementation of a common CAD/CAM technique called Virtual Digitizing.
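
As a small illustration of the plane-based compression idea (not the thesis's actual method; the tolerance and inputs are assumptions), a least-squares plane can be fitted to a point cluster and the near-planar points replaced by the plane parameters:

```python
import numpy as np

def fit_plane(points):
    # Least-squares plane through a point cluster: centroid plus the
    # direction of least variance (last right-singular vector of the SVD).
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid, full_matrices=False)
    normal = vt[-1]
    return centroid, normal

def planar_mask(points, centroid, normal, tol=0.01):
    # Points within `tol` metres of the plane can be stored as the plane
    # parameters (plus a 2-D boundary) instead of raw 3-D samples.
    return np.abs((points - centroid) @ normal) < tol
```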

Relevance: 100.00%

Abstract:

Master's dissertation – Universidade de Brasília, Universidade UnB de Planaltina, Programa de Pós-Graduação em Ciência de Materiais, 2015.

Relevance: 100.00%

Abstract:

This work presents and describes the procedural methodology, and the reflections underpinning it, behind the creation of an interactive digital artefact built so that some of its two-dimensional elements are manipulated to give the player the illusion of three-dimensionality. In 1992, id Software's game Wolfenstein 3D introduced a visual reference to three-dimensionality using 2D technology: through a system of scaling and positioning images, it managed to convey a sense of three dimensions, at the time in a first-person game, i.e. the player experiences a visual field that aims to reproduce the experience of the tactile world in the relation between spaces and objects. Using Processing, a programming language built on Java, these conceptual goals are reproduced, seeking on the one hand to convey this apparent illusion of three-dimensionality and, on the other, not to build a digital artefact that provides first-person gameplay, but rather to give the player a visual experience covering the whole space in which they are allowed to move, exposing the adversities they must overcome to progress. To make this possible, the player assumes the role of a character and, through interaction with the artefact, builds a visual narrative intended to involve them with the theme represented. The theme used is a representation of the search for the sarcophagus of the 18th Dynasty pharaoh Tutankhamun (1332-1323 BC) by the British explorer Howard Carter (1874-1939), whose 1922 expedition in the Valley of the Kings remains to this day the most celebrated archaeological discovery related to Ancient Egypt. Throughout this dissertation, topics are addressed that seek solutions both in the technical and technological field, through programming and its language, and in the visual and aesthetic field, aiming at a conscious connection with the theme to be represented and experienced.
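
As a tiny illustration of the 2D-to-pseudo-3D trick described above (Wolfenstein-style sprite scaling; the parameters are illustrative, and the dissertation's artefact is written in Processing rather than Python):

```python
def sprite_screen_size(base_size_px, distance, focal=1.0):
    # Classic 2.5-D illusion: an object's on-screen size shrinks in inverse
    # proportion to its distance from the viewer.
    return base_size_px * focal / max(distance, 1e-6)

for d in (1.0, 2.0, 4.0, 8.0):
    print(d, sprite_screen_size(64, d))   # 64, 32, 16, 8 pixels
```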

Relevance: 100.00%

Abstract:

Solving a complex Constraint Satisfaction Problem (CSP) is a computationally hard task that may require a considerable amount of time. Parallelism has been applied successfully to this task, and there are already many applications capable of harnessing the parallel power of modern CPUs to speed up the solving process. Current Graphics Processing Units (GPUs), containing from a few hundred to a few thousand cores, possess a level of parallelism that surpasses that of CPUs, yet there are far fewer applications capable of solving CSPs on GPUs, leaving space for further improvement. This paper describes work in progress on solving CSPs on GPUs, CPUs and other devices, such as Intel Many Integrated Cores (MICs), in parallel. It presents the gains obtained when applying more devices to solve some problems and the main challenges that must be faced when using devices with architectures as different as CPUs and GPUs, with a particular focus on how to effectively achieve good load balancing between such heterogeneous devices.
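
A minimal sketch of one way to balance work between heterogeneous devices (a shared queue of search-space blocks with one host thread per device; `solve_block` is a hypothetical per-device solver call, not the paper's scheduler):

```python
import queue
import threading

def balance(devices, blocks, solve_block):
    work, results = queue.Queue(), []
    for b in blocks:
        work.put(b)

    def worker(dev):
        # Each device keeps pulling blocks until the queue is empty, so
        # faster devices naturally end up exploring more of the search space.
        while True:
            try:
                b = work.get_nowait()
            except queue.Empty:
                return
            results.append(solve_block(dev, b))   # list.append is thread-safe

    threads = [threading.Thread(target=worker, args=(d,)) for d in devices]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```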

Relevance: 100.00%

Abstract:

To reduce the time needed to solve the most complex Constraint Satisfaction Problems (CSPs), multi-core CPUs are usually used, and there are already many applications capable of harnessing the parallel power of these devices to speed up the solving process. Nowadays, Graphics Processing Units (GPUs), containing from a few hundred to a few thousand cores, possess a level of parallelism that surpasses that of CPUs, yet there are far fewer applications capable of solving CSPs on GPUs, leaving space for possible improvements. This article describes work in progress on solving CSPs on GPUs and CPUs and compares the results with some state-of-the-art solvers, already showing good results on GPUs.