972 resultados para Graphical processing units


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Recent advances in the massively parallel computational abilities of graphical processing units (GPUs) have increased their use for general purpose computation, as companies look to take advantage of big data processing techniques. This has given rise to the potential for malicious software targeting GPUs, which is of interest to forensic investigators examining the operation of software. The ability to carry out reverse-engineering of software is of great importance within the security and forensics elds, particularly when investigating malicious software or carrying out forensic analysis following a successful security breach. Due to the complexity of the Nvidia CUDA (Compute Uni ed Device Architecture) framework, it is not clear how best to approach the reverse engineering of a piece of CUDA software. We carry out a review of the di erent binary output formats which may be encountered from the CUDA compiler, and their implications on reverse engineering. We then demonstrate the process of carrying out disassembly of an example CUDA application, to establish the various techniques available to forensic investigators carrying out black-box disassembly and reverse engineering of CUDA binaries. We show that the Nvidia compiler, using default settings, leaks useful information. Finally, we demonstrate techniques to better protect intellectual property in CUDA algorithm implementations from reverse engineering.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Biomedical engineering solutions like surgical simulators need High Performance Computing (HPC) to achieve real-time performance. Graphics Processing Units (GPUs) offer HPC capabilities at low cost and low power consumption. In this work, it is demonstrated that a liver which is discretized by about 2500 finite element nodes, can be graphically simulated in realtime, by making use of a GPU. Present work takes into consideration the time needed for the data transfer from CPU to GPU and back from GPU to CPU. Although behaviour of liver is very complicated, present computer simulation assumes linear elastostatics. One needs to use the commercial software ANSYS to obtain the global stiffness matrix of the liver. Results show that GPUs are useful for the real-time graphical simulation of liver, which in turn is needed in simulators that are used for training surgeons in laparoscopic surgery. Although the computer simulation should involve rendering also, neither rendering, nor the time needed for rendering and displaying the liver on a screen, is considered in the present work. The present work is just a demonstration of a concept; the concept is not really implemented and validated. Future work is to develop software which can accomplish real-time and very realistic graphical simulation of liver, with rendered image of liver on the screen changing in real-time according to the position of the surgical tool tip approximated as the mouse cursor in 3D.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Diffuse optical tomographic image reconstruction uses advanced numerical models that are computationally costly to be implemented in the real time. The graphics processing units (GPUs) offer desktop massive parallelization that can accelerate these computations. An open-source GPU-accelerated linear algebra library package is used to compute the most intensive matrix-matrix calculations and matrix decompositions that are used in solving the system of linear equations. These open-source functions were integrated into the existing frequency-domain diffuse optical image reconstruction algorithms to evaluate the acceleration capability of the GPUs (NVIDIA Tesla C 1060) with increasing reconstruction problem sizes. These studies indicate that single precision computations are sufficient for diffuse optical tomographic image reconstruction. The acceleration per iteration can be up to 40, using GPUs compared to traditional CPUs in case of three-dimensional reconstruction, where the reconstruction problem is more underdetermined, making the GPUs more attractive in the clinical settings. The current limitation of these GPUs in the available onboard memory (4 GB) that restricts the reconstruction of a large set of optical parameters, more than 13, 377. (C) 2010 Society of Photo-Optical Instrumentation Engineers. DOI: 10.1117/1.3506216]

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Spatial light modulators based around liquid crystal on silicon have found use in a variety of telecommunications applications, including the optimization of multimode fibers, free-space communications, and wavelength selective switching. Ferroelectric liquid crystals are attractive in these areas due to their fast switching times and high phase stability, but the necessity for the liquid crystal to spend equal time in each of its two possible states is an issue of practical concern. Using the highly parallel nature of a graphics processing unit architecture, it is possible to calculate DC balancing schemes of exceptional quality and stability.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

High-throughput DNA sequencing (HTS) instruments today are capable of generating millions of sequencing reads in a short period of time, and this represents a serious challenge to current bioinformatics pipeline in processing such an enormous amount of data in a fast and economical fashion. Modern graphics cards are powerful processing units that consist of hundreds of scalar processors in parallel in order to handle the rendering of high-definition graphics in real-time. It is this computational capability that we propose to harness in order to accelerate some of the time-consuming steps in analyzing data generated by the HTS instruments. We have developed BarraCUDA, a novel sequence mapping software that utilizes the parallelism of NVIDIA CUDA graphics cards to map sequencing reads to a particular location on a reference genome. While delivering a similar mapping fidelity as other mainstream programs , BarraCUDA is a magnitude faster in mapping throughput compared to its CPU counterparts. The software is also capable of supporting multiple CUDA devices in parallel to further accelerate the mapping throughput. BarraCUDA is designed to take advantage of the parallelism of GPU to accelerate the mapping of millions of sequencing reads generated by HTS instruments. By doing this, we could, at least in part streamline the current bioinformatics pipeline such that the wider scientific community could benefit from the sequencing technology. BarraCUDA is currently available at http://seqbarracuda.sf.net

Relevância:

100.00% 100.00%

Publicador:

Resumo:

BACKGROUND: With the maturation of next-generation DNA sequencing (NGS) technologies, the throughput of DNA sequencing reads has soared to over 600 gigabases from a single instrument run. General purpose computing on graphics processing units (GPGPU), extracts the computing power from hundreds of parallel stream processors within graphics processing cores and provides a cost-effective and energy efficient alternative to traditional high-performance computing (HPC) clusters. In this article, we describe the implementation of BarraCUDA, a GPGPU sequence alignment software that is based on BWA, to accelerate the alignment of sequencing reads generated by these instruments to a reference DNA sequence. FINDINGS: Using the NVIDIA Compute Unified Device Architecture (CUDA) software development environment, we ported the most computational-intensive alignment component of BWA to GPU to take advantage of the massive parallelism. As a result, BarraCUDA offers a magnitude of performance boost in alignment throughput when compared to a CPU core while delivering the same level of alignment fidelity. The software is also capable of supporting multiple CUDA devices in parallel to further accelerate the alignment throughput. CONCLUSIONS: BarraCUDA is designed to take advantage of the parallelism of GPU to accelerate the alignment of millions of sequencing reads generated by NGS instruments. By doing this, we could, at least in part streamline the current bioinformatics pipeline such that the wider scientific community could benefit from the sequencing technology.BarraCUDA is currently available from http://seqbarracuda.sf.net.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

BACKGROUND: Many analyses of microarray association studies involve permutation, bootstrap resampling and cross-validation, that are ideally formulated as embarrassingly parallel computing problems. Given that these analyses are computationally intensive, scalable approaches that can take advantage of multi-core processor systems need to be developed. RESULTS: We have developed a CUDA based implementation, permGPU, that employs graphics processing units in microarray association studies. We illustrate the performance and applicability of permGPU within the context of permutation resampling for a number of test statistics. An extensive simulation study demonstrates a dramatic increase in performance when using permGPU on an NVIDIA GTX 280 card compared to an optimized C/C++ solution running on a conventional Linux server. CONCLUSIONS: permGPU is available as an open-source stand-alone application and as an extension package for the R statistical environment. It provides a dramatic increase in performance for permutation resampling analysis in the context of microarray association studies. The current version offers six test statistics for carrying out permutation resampling analyses for binary, quantitative and censored time-to-event traits.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper investigates sub-integer implementations of the adaptive Gaussian mixture model (GMM) for background/foreground segmentation to allow the deployment of the method on low cost/low power processors that lack Floating Point Unit (FPU). We propose two novel integer computer arithmetic techniques to update Gaussian parameters. Specifically, the mean value and the variance of each Gaussian are updated by a redefined and generalised "round'' operation that emulates the original updating rules for a large set of learning rates. Weights are represented by counters that are updated following stochastic rules to allow a wider range of learning rates and the weight trend is approximated by a line or a staircase. We demonstrate that the memory footprint and computational cost of GMM are significantly reduced, without significantly affecting the performance of background/foreground segmentation.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The present study was undertaken to evaluate the effectiveness of a few physico-chemical and biological methods for the treatment of effluents from natural rubber processing units. The overall objective of this study is to evaluate the effectiveness of certain physico-chemical and biological methods for the treatment of effluents from natural rubber processing units. survey of the chemical characteristics of the effluents discharged from rubber processing units showed that the effluents from latex concentration units were the most polluting

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We present an approach to computing high-breakdown regression estimators in parallel on graphics processing units (GPU).We show that sorting the residuals is not necessary, and it can be substituted by calculating the median. We present and compare various methods to calculate the median and order statistics on GPUs. We introduce an alternative method based on the optimization of a convex function, and showits numerical superiority when calculating the order statistics of very large arrays on GPUs.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We found an interesting relation between convex optimization and sorting problem. We present a parallel algorithm to compute multiple order statistics of the data by minimizing a number of related convex functions. The computed order statistics serve as splitters that group the data into buckets suitable for parallel bitonic sorting. This led us to a parallel bucket sort algorithm, which we implemented for many-core architecture of graphics processing units (GPUs). The proposed sorting method is competitive to the state-of-the-art GPU sorting algorithms and is superior to most of them for long sorting keys.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Ein System in einem metastabilen Zustand muss eine bestimmte Barriere in derrnfreien Energie überwinden um einen Tropfen der stabilen Phase zu formen.rnHerkömmliche Untersuchungen nehmen hierbei kugelförmige Tropfen an. Inrnanisotropen Systemen (wie z.B. Kristallen) ist diese Annahme aber nicht ange-rnbracht. Bei tiefen Temperaturen wirkt sich die Anisotropie des Systems starkrnauf die freie Energie ihrer Oberfläche aus. Diese Wirkung wird oberhalb derrnAufrauungstemperatur T R schwächer. Das Ising-Modell ist ein einfaches Mo-rndell, welches eine solche Anisotropie aufweist. Wir führen großangelegte Sim-rnulationen durch, um die Effekte, die mit einer endlichen Simulationsbox ein-rnhergehen, sowie statistische Ungenauigkeiten möglichst klein zu halten. DasrnAusmaß der Simulationen die benötigt werden um sinnvolle Ergebnisse zu pro-rnduzieren, erfordert die Entwicklung eines skalierbaren Simulationsprogrammsrnfür das Ising-Modell, welcher auf verschiedenen parallelen Architekturen (z.B.rnGrafikkarten) verwendet werden kann. Plattformunabhängigkeit wird durch ab-rnstrakte Schnittstellen erreicht, welche plattformspezifische Implementierungs-rndetails verstecken. Wir benutzen eine Systemgeometrie die es erlaubt eine Ober-rnfläche mit einem variablen Winkel zur Kristallebene zu untersuchen. Die Ober-rnfläche ist in Kontakt mit einer harten Wand, wobei der Kontaktwinkel Θ durchrnein Oberflächenfeld eingestellt werden kann. Wir leiten eine Differenzialglei-rnchung ab, welche das Verhalten der freien Energie der Oberfläche in einemrnanisotropen System beschreibt. Kombiniert mit thermodynamischer Integrationrnkann die Gleichung benutzt werden, um die anisotrope Oberflächenspannungrnüber einen großen Winkelbereich zu integrieren. Vergleiche mit früheren Mes-rnsungen in anderen Geometrien und anderen Methoden zeigen hohe Überein-rnstimung und Genauigkeit, welche vor allem durch die im Vergleich zu früherenrnMessungen wesentlich größeren Simulationsdomänen erreicht wird. Die Temper-rnaturabhängigkeit der Oberflächensteifheit κ wird oberhalb von T R durch diernKrümmung der freien Energie der Oberfläche für kleine Winkel gemessen. DiesernMessung lässt sich mit Simulationsergebnissen in der Literatur vergleichen undrnhat bessere Übereinstimmung mit theoretischen Voraussagen über das Skalen-rnverhalten von κ. Darüber hinaus entwickeln wir ein Tieftemperatur-Modell fürrndas Verhalten um Θ = 90 Grad weit unterhalb von T R. Der Winkel bleibt bis zu einemrnkritischen Feld H C quasi null; oberhalb des kritischen Feldes steigt der Winkelrnrapide an. H C wird mit der freien Energie einer Stufe in Verbindung gebracht,rnwas es ermöglicht, das kritische Verhalten dieser Größe zu analysieren. Die harternWand muss in die Analyse einbezogen werden. Durch den Vergleich freier En-rnergien bei geschickt gewählten Systemgrößen ist es möglich, den Beitrag derrnKontaktlinie zur freien Energie in Abhängigkeit von Θ zu messen. Diese Anal-rnyse wird bei verschiedenen Temperaturen durchgeführt. Im letzten Kapitel wirdrneine 2D Fluiddynamik Simulation für Grafikkarten parallelisiert, welche u. a.rnbenutzt werden kann um die Dynamik der Atmosphäre zu simulieren. Wir im-rnplementieren einen parallelen Evolution Galerkin Operator und erreichen