969 resultados para graphics processing units


Relevância:

80.00% 80.00%

Publicador:

Resumo:

In this work, possibility of simulating biological organs in realtime using the Boundary Element Method (BEM) is investigated, with specific reference to the speed and the accuracy offered by BEM. First, a Graphics Processing Unit (GPU) is used to speed up the BEM computations to achieve the realtime performance. Next, instead of the GPU, a computer cluster is used. A pig liver is the biological organ considered. Results indicate that BEM is an interesting choice for the simulation of biological organs. Although the use of BEM for the simulation of biological organs is not new, the results presented in the present study are not found elsewhere in the literature.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

In this work, possibility of simulating biological organs in realtime using the Boundary Element Method (BEM) is investigated. Biological organs are assumed to follow linear elastostatic material behavior, and constant boundary element is the element type used. First, a Graphics Processing Unit (GPU) is used to speed up the BEM computations to achieve the realtime performance. Next, instead of the GPU, a computer cluster is used. Results indicate that BEM is fast enough to provide for realtime graphics if biological organs are assumed to follow linear elastostatic material behavior. Although the present work does not conduct any simulation using nonlinear material models, results from using the linear elastostatic material model imply that it would be difficult to obtain realtime performance if highly nonlinear material models that properly characterize biological organs are used. Although the use of BEM for the simulation of biological organs is not new, the results presented in the present study are not found elsewhere in the literature.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

PurposeTo extend the previously developed temporally constrained reconstruction (TCR) algorithm to allow for real-time availability of three-dimensional (3D) temperature maps capable of monitoring MR-guided high intensity focused ultrasound applications. MethodsA real-time TCR (RT-TCR) algorithm is developed that only uses current and previously acquired undersampled k-space data from a 3D segmented EPI pulse sequence, with the image reconstruction done in a graphics processing unit implementation to overcome computation burden. Simulated and experimental data sets of HIFU heating are used to evaluate the performance of the RT-TCR algorithm. ResultsThe simulation studies demonstrate that the RT-TCR algorithm has subsecond reconstruction time and can accurately measure HIFU-induced temperature rises of 20 degrees C in 15 s for 3D volumes of 16 slices (RMSE = 0.1 degrees C), 24 slices (RMSE = 0.2 degrees C), and 32 slices (RMSE = 0.3 degrees C). Experimental results in ex vivo porcine muscle demonstrate that the RT-TCR approach can reconstruct temperature maps with 192 x 162 x 66 mm 3D volume coverage, 1.5 x 1.5 x 3.0 mm resolution, and 1.2-s scan time with an accuracy of 0.5 degrees C. ConclusionThe RT-TCR algorithm offers an approach to obtaining large coverage 3D temperature maps in real-time for monitoring MR-guided high intensity focused ultrasound treatments. Magn Reson Med 71:1394-1404, 2014. (c) 2013 Wiley Periodicals, Inc.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The growing number of applications and processing units in modern Multiprocessor Systems-on-Chips (MPSoCs) come along with reduced time to market. Different IP cores can come from different vendors, and their trust levels are also different, but typically they use Network-on-Chip (NoC) as their communication infrastructure. An MPSoC can have multiple Trusted Execution Environments (TEEs). Apart from performance, power, and area research in the field of MPSoC, robust and secure system design is also gaining importance in the research community. To build a secure system, the designer must know beforehand all kinds of attack possibilities for the respective system (MPSoC). In this paper we survey the possible attack scenarios on present-day MPSoCs and investigate a new attack scenario, i.e., router attack targeted toward NoC architecture. We show the validity of this attack by analyzing different present-day NoC architectures and show that they are all vulnerable to this type of attack. By launching a router attack, an attacker can control the whole chip very easily, which makes it a very serious issue. Both routing tables and routing logic-based routers are vulnerable to such attacks. In this paper, we address attacks on routing tables. We propose different monitoring-based countermeasures against routing table-based router attack in an MPSoC having multiple TEEs. Synthesis results show that proposed countermeasures, viz. Runtime-monitor, Restart-monitor, Intermediate manager, and Auditor, occupy areas that are 26.6, 22, 0.2, and 12.2 % of a routing table-based router area. Apart from these, we propose Ejection address checker and Local monitoring module inside a router that cause 3.4 and 10.6 % increase of a router area, respectively. Simulation results are also given, which shows effectiveness of proposed monitoring-based countermeasures.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Rapid reconstruction of multidimensional image is crucial for enabling real-time 3D fluorescence imaging. This becomes a key factor for imaging rapidly occurring events in the cellular environment. To facilitate real-time imaging, we have developed a graphics processing unit (GPU) based real-time maximum a-posteriori (MAP) image reconstruction system. The parallel processing capability of GPU device that consists of a large number of tiny processing cores and the adaptability of image reconstruction algorithm to parallel processing (that employ multiple independent computing modules called threads) results in high temporal resolution. Moreover, the proposed quadratic potential based MAP algorithm effectively deconvolves the images as well as suppresses the noise. The multi-node multi-threaded GPU and the Compute Unified Device Architecture (CUDA) efficiently execute the iterative image reconstruction algorithm that is similar to 200-fold faster (for large dataset) when compared to existing CPU based systems. (C) 2015 Author(s). All article content, except where otherwise noted, is licensed under a Creative Commons Attribution 3.0 Unported License.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Coarse Grained Reconfigurable Architectures (CGRA) are emerging as embedded application processing units in computing platforms for Exascale computing. Such CGRAs are distributed memory multi- core compute elements on a chip that communicate over a Network-on-chip (NoC). Numerical Linear Algebra (NLA) kernels are key to several high performance computing applications. In this paper we propose a systematic methodology to obtain the specification of Compute Elements (CE) for such CGRAs. We analyze block Matrix Multiplication and block LU Decomposition algorithms in the context of a CGRA, and obtain theoretical bounds on communication requirements, and memory sizes for a CE. Support for high performance custom computations common to NLA kernels are met through custom function units (CFUs) in the CEs. We present results to justify the merits of such CFUs.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

A propriedade de auto-cura, em redes inteligente de distribuição de energia elétrica, consiste em encontrar uma proposta de reconfiguração do sistema de distribuição com o objetivo de recuperar parcial ou totalmente o fornecimento de energia aos clientes da rede, na ocorrência de uma falha na rede que comprometa o fornecimento. A busca por uma solução satisfatória é um problema combinacional cuja complexidade está ligada ao tamanho da rede. Um método de busca exaustiva se torna um processo muito demorado e muitas vezes computacionalmente inviável. Para superar essa dificuldade, pode-se basear nas técnicas de geração de árvores de extensão mínima do grafo, representando a rede de distribuição. Porém, a maioria dos estudos encontrados nesta área são implementações centralizadas, onde proposta de reconfiguração é obtida por um sistema de supervisão central. Nesta dissertação, propõe-se uma implementação distribuída, onde cada chave da rede colabora na elaboração da proposta de reconfiguração. A solução descentralizada busca uma redução no tempo de reconfiguração da rede em caso de falhas simples ou múltiplas, aumentando assim a inteligência da rede. Para isso, o algoritmo distribuído GHS é utilizado como base na elaboração de uma solução de auto-cura a ser embarcada nos elementos processadores que compõem as chaves de comutação das linhas da rede inteligente de distribuição. A solução proposta é implementada utilizando robôs como unidades de processamento que se comunicam via uma mesma rede, constituindo assim um ambiente de processamento distribuído. Os diferentes estudos de casos testados mostram que, para redes inteligentes de distribuição compostas por um único alimentador, a solução proposta obteve sucesso na reconfiguração da rede, indiferentemente do número de falhas simultâneas. Na implementação proposta, o tempo de reconfiguração da rede não depende do número de linhas nela incluídas. A implementação apresentou resultados de custo de comunicação e tempo dentro dos limites teóricos estabelecidos pelo algoritmo GHS.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

An approach of rapid hologram generation for the realistic three-dimensional (3-D) image reconstruction based on the angular tiling concept is proposed, using a new graphic rendering approach integrated with a previously developed layer-based method for hologram calculation. A 3-D object is simplified as layered cross-sectional images perpendicular to a chosen viewing direction, and our graphics rendering approach allows the incorporation of clear depth cues, occlusion, and shading in the generated holograms for angular tiling. The combination of these techniques together with parallel computing reduces the computation time of a single-view hologram for a 3-D image of extended graphics array resolution to 176 ms using a single consumer graphics processing unit card. © 2014 SPIE and IS and T.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

In the fluid simulation, the fluids and their surroundings may greatly change properties such as shape and temperature simultaneously, and different surroundings would characterize different interactions, which would change the shape and motion of the fluids in different ways. On the other hand, interactions among fluid mixtures of different kinds would generate more comprehensive behavior. To investigate the interaction behavior in physically based simulation of fluids, it is of importance to build physically correct models to represent the varying interactions between fluids and the environments, as well as interactions among the mixtures. In this paper, we will make a simple review of the interactions, and focus on those most interesting to us, and model them with various physical solutions. In particular, more detail will be given on the simulation of miscible and immiscible binary mixtures. In some of the methods, it is advantageous to be taken with the graphics processing unit (GPU) to achieve real-time computation for middle-scale simulation.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

运动目标跟踪技术是未知环境下移动机器人研究领域的一个重要研究方向。该文提出了一种基于主动视觉和超声信息的移动机器人运动目标跟踪设计方法,利用一台SONY EV-D31彩色摄像机、自主研制的摄像机控制模块、图像采集与处理单元等构建了主动视觉系统。移动机器人采用了基于行为的分布式控制体系结构,利用主动视觉锁定运动目标,通过超声系统感知外部环境信息,能在未知的、动态的、非结构化复杂环境中可靠地跟踪运动目标。实验表明机器人具有较高的鲁棒性,运动目标跟踪系统运行可靠。

Relevância:

80.00% 80.00%

Publicador:

Resumo:

In this thesis I theoretically study quantum states of ultracold atoms. The majority of the Chapters focus on engineering specific quantum states of single atoms with high fidelity in experimentally realistic systems. In the sixth Chapter, I investigate the stability and dynamics of new multidimensional solitonic states that can be created in inhomogeneous atomic Bose-Einstein condensates. In Chapter three I present two papers in which I demonstrate how the coherent tunnelling by adiabatic passage (CTAP) process can be implemented in an experimentally realistic atom chip system, to coherently transfer the centre-of-mass of a single atom between two spatially distinct magnetic waveguides. In these works I also utilise GPU (Graphics Processing Unit) computing which offers a significant performance increase in the numerical simulation of the Schrödinger equation. In Chapter four I investigate the CTAP process for a linear arrangement of radio frequency traps where the centre-of-mass of both, single atoms and clouds of interacting atoms, can be coherently controlled. In Chapter five I present a theoretical study of adiabatic radio frequency potentials where I use Floquet theory to more accurately model situations where frequencies are close and/or field amplitudes are large. I also show how one can create highly versatile 2D adiabatic radio frequency potentials using multiple radio frequency fields with arbitrary field orientation and demonstrate their utility by simulating the creation of ring vortex solitons. In the sixth Chapter I discuss the stability and dynamics of a family of multidimensional solitonic states created in harmonically confined Bose-Einstein condensates. I demonstrate that these solitonic states have interesting dynamical instabilities, where a continuous collapse and revival of the initial state occurs. Through Bogoliubov analysis, I determine the modes responsible for the observed instabilities of each solitonic state and also extract information related to the time at which instability can be observed.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This article describes advances in statistical computation for large-scale data analysis in structured Bayesian mixture models via graphics processing unit (GPU) programming. The developments are partly motivated by computational challenges arising in fitting models of increasing heterogeneity to increasingly large datasets. An example context concerns common biological studies using high-throughput technologies generating many, very large datasets and requiring increasingly high-dimensional mixture models with large numbers of mixture components.We outline important strategies and processes for GPU computation in Bayesian simulation and optimization approaches, give examples of the benefits of GPU implementations in terms of processing speed and scale-up in ability to analyze large datasets, and provide a detailed, tutorial-style exposition that will benefit readers interested in developing GPU-based approaches in other statistical models. Novel, GPU-oriented approaches to modifying existing algorithms software design can lead to vast speed-up and, critically, enable statistical analyses that presently will not be performed due to compute time limitations in traditional computational environments. Supplementalmaterials are provided with all source code, example data, and details that will enable readers to implement and explore the GPU approach in this mixture modeling context. © 2010 American Statistical Association, Institute of Mathematical Statistics, and Interface Foundation of North America.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

In this paper, we propose a multi-camera application capable of processing high resolution images and extracting features based on colors patterns over graphic processing units (GPU). The goal is to work in real time under the uncontrolled environment of a sport event like a football match. Since football players are composed for diverse and complex color patterns, a Gaussian Mixture Models (GMM) is applied as segmentation paradigm, in order to analyze sport live images and video. Optimization techniques have also been applied over the C++ implementation using profiling tools focused on high performance. Time consuming tasks were implemented over NVIDIA's CUDA platform, and later restructured and enhanced, speeding up the whole process significantly. Our resulting code is around 4-11 times faster on a low cost GPU than a highly optimized C++ version on a central processing unit (CPU) over the same data. Real time has been obtained processing until 64 frames per second. An important conclusion derived from our study is the scalability of the application to the number of cores on the GPU. © 2011 Springer-Verlag.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Differential equations are often directly solvable by analytical means only in their one dimensional version. Partial differential equations are generally not solvable by analytical means in two and three dimensions, with the exception of few special cases. In all other cases, numerical approximation methods need to be utilized. One of the most popular methods is the finite element method. The main areas of focus, here, are the Poisson heat equation and the plate bending equation. The purpose of this paper is to provide a quick walkthrough of the various approaches that the authors followed in pursuit of creating optimal solvers, accelerated with the use of graphical processing units, and comparing them in terms of accuracy and time efficiency with existing or self-made non-accelerated solvers.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

In the reinsurance market, the risks natural catastrophes pose to portfolios of properties must be quantified, so that they can be priced, and insurance offered. The analysis of such risks at a portfolio level requires a simulation of up to 800 000 trials with an average of 1000 catastrophic events per trial. This is sufficient to capture risk for a global multi-peril reinsurance portfolio covering a range of perils including earthquake, hurricane, tornado, hail, severe thunderstorm, wind storm, storm surge and riverine flooding, and wildfire. Such simulations are both computation and data intensive, making the application of high-performance computing techniques desirable.

In this paper, we explore the design and implementation of portfolio risk analysis on both multi-core and many-core computing platforms. Given a portfolio of property catastrophe insurance treaties, key risk measures, such as probable maximum loss, are computed by taking both primary and secondary uncertainties into account. Primary uncertainty is associated with whether or not an event occurs in a simulated year, while secondary uncertainty captures the uncertainty in the level of loss due to the use of simplified physical models and limitations in the available data. A combination of fast lookup structures, multi-threading and careful hand tuning of numerical operations is required to achieve good performance. Experimental results are reported for multi-core processors and systems using NVIDIA graphics processing unit and Intel Phi many-core accelerators.