246 results for virtualised GPU


Relevance: 20.00%

Abstract:

Background: Modern cancer research often involves large datasets and the use of sophisticated statistical techniques. Together these add a heavy computational load to the analysis, which is often coupled with issues surrounding data accessibility. Connectivity mapping is an advanced bioinformatic and computational technique dedicated to therapeutics discovery and drug re-purposing around differential gene expression analysis. On a normal desktop PC, it is common for the connectivity mapping task with a single gene signature to take >2h to complete using sscMap, a popular Java application that runs on standard CPUs (Central Processing Units). Here, we describe new software, cudaMap, which has been implemented using CUDA C/C++ to harness the computational power of NVIDIA GPUs (Graphics Processing Units) to greatly reduce processing times for connectivity mapping.

Results: cudaMap can identify candidate therapeutics from the same signature in just over thirty seconds when using an NVIDIA Tesla C2050 GPU. Results from the analysis of multiple gene signatures, which would previously have taken several days, can now be obtained in as little as 10 minutes, greatly facilitating candidate therapeutics discovery with high throughput. We are able to demonstrate dramatic speed differentials between GPU assisted performance and CPU executions as the computational load increases for high accuracy evaluation of statistical significance.

Conclusion: Emerging 'omics' technologies are constantly increasing the volume of data and information to be processed in all areas of biomedical research. Embracing the multicore functionality of GPUs represents a major avenue of local accelerated computing. cudaMap will make a strong contribution to the discovery of candidate therapeutics by enabling speedy execution of heavy-duty connectivity mapping tasks, which are increasingly required in modern cancer research. cudaMap is open source and can be freely downloaded from http://purl.oclc.org/NET/cudaMap.

Relevance: 20.00%

Abstract:

We propose a methodology for optimizing the execution of data parallel (sub-)tasks on CPU and GPU cores of the same heterogeneous architecture. The methodology is based on two main components: i) an analytical performance model for scheduling tasks among CPU and GPU cores, such that the global execution time of the overall data parallel pattern is optimized; and ii) an autonomic module which uses the analytical performance model to implement the data parallel computations in a completely autonomic way, requiring no programmer intervention to optimize the computation across CPU and GPU cores. The analytical performance model uses a small set of simple parameters to devise a partitioning, between CPU and GPU cores, of the tasks derived from structured data parallel patterns/algorithmic skeletons. The model takes into account both hardware-related and application-dependent parameters. It computes the percentage of tasks to be executed on CPU and GPU cores such that both kinds of cores are exploited and performance figures are optimized. The autonomic module, implemented in FastFlow, executes a generic map (reduce) data parallel pattern, scheduling part of the tasks to the GPU and part to CPU cores so as to achieve optimal execution time. Experimental results on state-of-the-art CPU/GPU architectures are shown that assess both performance model properties and autonomic module effectiveness. © 2013 IEEE.
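
The partitioning idea can be illustrated with a minimal sketch. It assumes a much simpler model than the one the abstract describes: all tasks are identical and independent, each takes a fixed time on one CPU core (`t_cpu`) and a fixed time on one GPU (`t_gpu`), and data-transfer cost is ignored. The function and parameter names are hypothetical and are not taken from the paper or from FastFlow. Under these assumptions the overall completion time is smallest when both sides finish together, so the GPU share is proportional to its throughput.

```python
def gpu_fraction(t_cpu, t_gpu, n_cpu_cores, n_gpus=1):
    """Fraction of tasks to send to the GPU side so both sides finish together.

    Assumed model (not the paper's): identical, independent tasks; each takes
    t_cpu seconds on one CPU core and t_gpu seconds on one GPU; no transfer cost.
    """
    cpu_rate = n_cpu_cores / t_cpu   # tasks per second across all CPU cores
    gpu_rate = n_gpus / t_gpu        # tasks per second across all GPUs
    return gpu_rate / (cpu_rate + gpu_rate)


def split_tasks(n_tasks, t_cpu, t_gpu, n_cpu_cores, n_gpus=1):
    """Return (tasks_for_gpu, tasks_for_cpu), two integers summing to n_tasks."""
    n_gpu = round(n_tasks * gpu_fraction(t_cpu, t_gpu, n_cpu_cores, n_gpus))
    return n_gpu, n_tasks - n_gpu
```

For example, with four CPU cores that each need 4 s per task and one GPU that needs 1 s per task, both sides process one task per second, so `split_tasks(100, 4.0, 1.0, 4)` divides the work evenly. A real model would also have to account for the hardware and application parameters that the abstract mentions.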

Relevance: 20.00%

Abstract:

Cloud computing is a technological advancement that provides resources through the internet on a pay-as-you-go basis. Cloud computing uses virtualisation technology to enhance the efficiency and effectiveness of its advantages. Virtualisation is the key to consolidating computing resources so that multiple instances run on each piece of hardware, which increases the utilization rate of every resource and thus reduces the number of resources that must be bought, racked, powered, cooled, and managed. Cloud computing has very appealing features; however, many enterprises and users are still reluctant to move to the cloud because of serious security concerns related to the virtualisation layer. It is therefore of foremost importance to secure the virtual environment. In this paper, we present an elastic framework for securing virtualised environments for trusted cloud computing, called the Server Virtualisation Security System (SVSS). SVSS provides security solutions located on the hypervisor for virtual machines by deploying malicious activity detection techniques, network traffic analysis techniques, and system resource utilization analysis techniques. SVSS consists of four modules: the Anti-Virus Control Module, the Traffic Behavior Monitoring Module, the Malicious Activity Detection Module, and the Virtualisation Security Management Module. An SVSS prototype has been deployed to validate its feasibility, efficiency, and accuracy in a Xen virtualised environment.

Relevance: 20.00%

Abstract:

While virtualisation can provide many benefits to a network's infrastructure, securing the virtualised environment is a big challenge. The security of a fully virtualised solution depends on the security of each of its underlying components, such as the hypervisor, guest operating systems, and storage.

This paper presents a single security service running on the hypervisor that could potentially work to provide security service to all virtual machines running on the system. This paper presents a hypervisor hosted framework which performs specialised security tasks for all underlying virtual machines to protect against any malicious attacks by passively analysing the network traffic of VMs. This framework has been implemented using Xen Server and has been evaluated by detecting a Zeus Server setup and infected clients, distributed over a number of virtual machines. This framework is capable of detecting and identifying all infected VMs with no false positive or false negative detection.

Relevance: 20.00%

Abstract:

Differential equations are often directly solvable by analytical means only in their one-dimensional version. Partial differential equations are generally not solvable by analytical means in two and three dimensions, with the exception of a few special cases. In all other cases, numerical approximation methods need to be used. One of the most popular methods is the finite element method. The main areas of focus here are the Poisson heat equation and the plate bending equation. The purpose of this paper is to provide a quick walkthrough of the various approaches that the authors followed in pursuit of creating optimal solvers, accelerated with the use of graphical processing units, and to compare them in terms of accuracy and time efficiency with existing or self-made non-accelerated solvers.
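
As a small illustration of the finite element method mentioned above, the sketch below solves the one-dimensional Poisson problem -u''(x) = f(x) on [0, 1] with u(0) = u(1) = 0 using linear elements on a uniform mesh. It is a plain CPU example, not the authors' GPU-accelerated solver. The function name, the uniform mesh, and the lumped load vector are assumptions made for illustration. The abstract concerns two- and three-dimensional problems, where the assembled system is much larger and is the part that benefits from acceleration.

```python
def solve_poisson_1d(n_elements, f=lambda x: 1.0):
    """Linear finite elements for -u''(x) = f(x) on [0, 1], u(0) = u(1) = 0.

    Returns (nodes, values). The load vector uses the nodal value of f times h,
    a lumped approximation that is exact when f is constant.
    """
    if n_elements < 2:
        raise ValueError("need at least two elements (one interior node)")
    h = 1.0 / n_elements
    n = n_elements - 1                       # number of interior nodes
    nodes = [i * h for i in range(n_elements + 1)]
    # The stiffness matrix is tridiagonal: 2/h on the diagonal, -1/h beside it.
    lower = [-1.0 / h] * n
    diag = [2.0 / h] * n
    upper = [-1.0 / h] * n
    rhs = [f(nodes[i + 1]) * h for i in range(n)]
    # Thomas algorithm: forward elimination, then back substitution.
    for i in range(1, n):
        m = lower[i] / diag[i - 1]
        diag[i] -= m * upper[i - 1]
        rhs[i] -= m * rhs[i - 1]
    u = [0.0] * n
    u[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        u[i] = (rhs[i] - upper[i] * u[i + 1]) / diag[i]
    return nodes, [0.0] + u + [0.0]
```

With the default f(x) = 1 the exact solution is u(x) = x(1 - x)/2, which gives a convenient check on the nodal values.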

Relevance: 20.00%

Abstract:

The design cycle for complex special-purpose computing systems is extremely costly and time-consuming. It involves a multiparametric design space exploration for optimization, followed by design verification. Designers of special purpose VLSI implementations often need to explore parameters, such as optimal bitwidth and data representation, through time-consuming Monte Carlo simulations. A prominent example of this simulation-based exploration process is the design of decoders for error correcting systems, such as the Low-Density Parity-Check (LDPC) codes adopted by modern communication standards, which involves thousands of Monte Carlo runs for each design point. Currently, high-performance computing offers a wide set of acceleration options that range from multicore CPUs to Graphics Processing Units (GPUs) and Field Programmable Gate Arrays (FPGAs). The exploitation of diverse target architectures is typically associated with developing multiple code versions, often using distinct programming paradigms. In this context, we evaluate the concept of retargeting a single OpenCL program to multiple platforms, thereby significantly reducing design time. A single OpenCL-based parallel kernel is used without modifications or code tuning on multicore CPUs, GPUs, and FPGAs. We use SOpenCL (Silicon to OpenCL), a tool that automatically converts OpenCL kernels to RTL in order to introduce FPGAs as a potential platform to efficiently execute simulations coded in OpenCL. We use LDPC decoding simulations as a case study. Experimental results were obtained by testing a variety of regular and irregular LDPC codes that range from short/medium (e.g., 8,000 bit) to long length (e.g., 64,800 bit) DVB-S2 codes. We observe that, depending on the design parameters to be simulated, on the dimension and phase of the design, the GPU or FPGA may suit different purposes more conveniently, thus providing different acceleration factors over conventional multicore CPUs.

Relevance: 20.00%

Abstract:

The source code of the library that was developed accompanies this deposit, in the state it was in at that time. A more up-to-date version can be found on GitHub (http://github.com/abergeron).

Relevance: 20.00%

Abstract:

Large-scale simulations of parts of the brain using detailed neuronal models to improve our understanding of brain functions are becoming a reality with the use of supercomputers and large clusters. However, the high acquisition and maintenance cost of these computers, including the physical space, air conditioning, and electrical power, limits the number of simulations of this kind that scientists can perform. Modern commodity graphics cards, based on the CUDA platform, contain graphical processing units (GPUs) composed of hundreds of processors that can simultaneously execute thousands of threads and thus constitute a low-cost solution for many high-performance computing applications. In this work, we present a CUDA algorithm that enables the execution, on multiple GPUs, of simulations of large-scale networks composed of biologically realistic Hodgkin-Huxley neurons. The algorithm represents each neuron as a CUDA thread, which solves the set of coupled differential equations that model each neuron. Communication among neurons located in different GPUs is coordinated by the CPU. We obtained speedups of 40 for the simulation of 200k neurons that received random external input and speedups of 9 for a network with 200k neurons and 20M neuronal connections, in a single computer with two graphics boards with two GPUs each, when compared with a modern quad-core CPU. Copyright (C) 2010 John Wiley & Sons, Ltd.
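
The per-neuron work described above can be sketched on the CPU as follows. This is not the authors' CUDA algorithm. It assumes the textbook Hodgkin-Huxley parameter set with a resting potential near -65 mV, forward Euler integration, and unconnected neurons driven only by a constant external current. All function names are hypothetical. The inner list comprehension over neurons marks the work that a one-thread-per-neuron GPU design would run in parallel.

```python
import math

# Textbook Hodgkin-Huxley constants (mV, ms, uF/cm^2, mS/cm^2); assumed values.
C_M, G_NA, G_K, G_L = 1.0, 120.0, 36.0, 0.3
E_NA, E_K, E_L = 50.0, -77.0, -54.387


def _ratio(x, y):
    """x / (1 - exp(-x / y)), with the x -> 0 limit handled explicitly."""
    return y if abs(x) < 1e-9 else x / (1.0 - math.exp(-x / y))


def rates(v):
    """Opening (alpha) and closing (beta) rates for the m, h and n gates."""
    return {
        "m": (0.1 * _ratio(v + 40.0, 10.0), 4.0 * math.exp(-(v + 65.0) / 18.0)),
        "h": (0.07 * math.exp(-(v + 65.0) / 20.0),
              1.0 / (1.0 + math.exp(-(v + 35.0) / 10.0))),
        "n": (0.01 * _ratio(v + 55.0, 10.0), 0.125 * math.exp(-(v + 65.0) / 80.0)),
    }


def resting_state(v=-65.0):
    """Neuron state (v, m, h, n) with each gate at its steady state for v."""
    gates = {k: a / (a + b) for k, (a, b) in rates(v).items()}
    return (v, gates["m"], gates["h"], gates["n"])


def step(state, i_ext, dt=0.01):
    """Advance one neuron by dt milliseconds with forward Euler."""
    v, m, h, n = state
    r = rates(v)
    i_ion = (G_NA * m**3 * h * (v - E_NA)
             + G_K * n**4 * (v - E_K)
             + G_L * (v - E_L))
    dv = (i_ext - i_ion) / C_M
    new_gates = [g + dt * (r[k][0] * (1.0 - g) - r[k][1] * g)
                 for k, g in (("m", m), ("h", h), ("n", n))]
    return (v + dt * dv, *new_gates)


def simulate(i_ext_per_neuron, t_ms=50.0, dt=0.01):
    """Simulate independent neurons; return the peak voltage of each."""
    states = [resting_state() for _ in i_ext_per_neuron]
    peaks = [s[0] for s in states]
    for _ in range(int(round(t_ms / dt))):
        # On a GPU, each neuron's update here would be handled by its own thread.
        states = [step(s, i) for s, i in zip(states, i_ext_per_neuron)]
        peaks = [max(p, s[0]) for p, s in zip(peaks, states)]
    return peaks
```

Calling `simulate([0.0, 10.0])` runs one neuron without input, which stays near rest, and one neuron driven by 10 uA/cm^2, which fires action potentials. Synaptic connections and the CPU-coordinated exchange between GPUs that the abstract describes are outside the scope of this sketch.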

Relevance: 20.00%

Abstract:

This research aims to improve the accessibility of cluster computer systems by introducing autonomic self-management facilities incorporating: 1) resource discovery and self-awareness, 2) virtualised resource pools, and 3) automated cluster membership and self-configuration. These facilities simplify the user's programming workload and improve system usability.

Relevance: 20.00%

Abstract:

This work presents a set of tools that exploit the recent capabilities of personal-computer graphics cards to provide visualization of, and interaction with, volume data. The goal is to offer the user tools that allow interactive removal of non-relevant parts of the volume. The user can thus select a volume of interest, which can ease understanding of both its structure and its relation to the surrounding volumes. Direct volume rendering through texture mapping is used to develop these tools. Programmable control of the computations that the graphics hardware performs to generate the appearance of each pixel on screen is used to resolve the visibility of each point of the volume in real time. The proposed tools allow the visibility of each point to be modified inside the graphics hardware, extending the benefit of hardware-accelerated visualization. Three interaction tools are proposed: a planar clipping tool that allows selection of a convex volume of interest; an “eraser” tool for eliminating non-relevant parts of the image; and a “digger” tool for removing layers of the volume. These tools exploit distinct parts of the texture-based visualization pipeline where the decision about the visibility of each point of the volume can be made. Each tool addresses a shortcoming of the previous one. With planar clipping, the user roughly approximates the volume of interest; with the eraser, the user refines the selected volume, which is finally finished with the digger. To apply the proposed tools to the visualized volume, well-known interaction techniques common in 2D visualization systems are used. This minimizes the effort the user must spend learning to use the tools.
Finally, potential applications of the proposed tools to the study of human liver anatomy are illustrated. In these applications it was possible to identify some user needs in the interactive visualization of medical datasets. From these observations, new interaction tools are also proposed, based on modifications to the proposed tools.

Relevance: 20.00%

Abstract:

Vascular segmentation is important in diagnosing vascular diseases such as stroke, and it is hampered by image noise and by very thin vessels that can pass unnoticed. One way to accomplish the segmentation is to extract the centerline of the vessel with height ridges, which use intensity as the feature for segmentation. This process can take from seconds to minutes, depending on the technology employed. To accelerate the segmentation method proposed by Aylward [Aylward & Bullitt 2002], we adapted it to run in parallel using the CUDA architecture. The performance of the segmentation method running on the GPU is compared with both the same method running on the CPU and Aylward's original method, also running on the CPU. The improvement of the new method over the original one is twofold: the starting point for the segmentation process is not a single point in the blood vessel but a volume, which makes it easier for the user to segment a region of interest; and the new method ran 873 times faster on the GPU and 150 times faster on the CPU than Aylward's original CPU method.

Relevance: 20.00%

Abstract:

The X-ray crystal structure of a complex between ribonuclease T-1 and guanylyl(3'-6')-6'-deoxyhomouridine (GpcU) has been determined at 2.0 Å resolution. This ligand is an isosteric analogue of the minimal RNA substrate, guanylyl(3'-5')uridine (GpU), where a methylene is substituted for the uridine 5'-oxygen atom. Two protein molecules are part of the asymmetric unit and both have a GpcU bound at the active site in the same manner. The protein-protein interface reveals an extended aromatic stack involving both guanines and three enzyme phenolic groups. A third GpcU has its guanine moiety stacked on His92 at the active site on enzyme molecule A and interacts with GpcU on molecule B in a neighboring unit via hydrogen bonding between uridine ribose 2'- and 3'-OH groups. None of the uridine moieties of the three GpcU molecules in the asymmetric unit interacts directly with the protein. GpcU-active-site interactions involve extensive hydrogen bonding of the guanine moiety at the primary recognition site and of the guanosine 2'-hydroxyl group with His40 and Glu58. On the other hand, the phosphonate group is weakly bound only by a single hydrogen bond with Tyr38, unlike ligand phosphate groups of other substrate analogues and 3'-GMP, which hydrogen-bonded with three additional active-site residues. Hydrogen bonding of the guanylyl 2'-OH group and the phosphonate moiety is essentially the same as that recently observed for a novel structure of a RNase T-1-3'-GMP complex obtained immediately after in situ hydrolysis of exo-(S-p)-guanosine 2',3'-cyclophosphorothioate [Zegers et al. (1998) Nature Struct. Biol. 5, 280-283]. It is likely that GpcU at the active site represents a nonproductive binding mode for GpU [Steyaert, J., and Engleborghs (1995) Eur. J. Biochem. 233, 140-144].
The results suggest that the active site of ribonuclease T-1 is adapted for optimal tight binding of both the guanylyl 2'-OH and phosphate groups (of GpU) only in the transition state for catalytic transesterification, which is stabilized by adjacent binding of the leaving nucleoside (U) group.