246 results for virtualised GPU


Relevance:

20.00%

Publisher:

Abstract:

Fuel treatment is considered a suitable way to mitigate the hazard related to potential wildfires on a landscape. However, designing an optimal spatial layout of treatment units is a difficult optimization problem. Budget constraints, the probabilistic nature of fire spread, and interactions among the different area units composing the whole treatment give rise to challenging search spaces on typical landscapes. In this paper we formulate this optimization problem with the objective of minimizing the extent of land characterized by high fire hazard. Then, we propose a computational approach that leads to a spatially optimized treatment layout, exploiting Tabu Search and General-Purpose computing on Graphics Processing Units (GPGPU). Using an application example, we also show that the proposed methodology can provide high-quality design solutions in low computing time. © 2013 The Authors. Published by Elsevier B.V.
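As a rough illustration of the Tabu Search idea mentioned in this abstract, the following minimal sketch searches for a treatment layout on a toy one-dimensional landscape. The objective `high_hazard_count`, the neighbourhood rule, and the parameters `budget` and `tenure` are assumptions made for illustration only; the paper's actual hazard model relies on probabilistic fire spread evaluated on the GPU, which is not reproduced here.

```python
def high_hazard_count(treated):
    """Toy objective (assumed): a cell stays high-hazard if it and
    both of its neighbours are untreated."""
    n = len(treated)
    count = 0
    for i in range(n):
        if not any(treated[max(0, i - 1): i + 2]):
            count += 1
    return count


def tabu_search(n_cells, budget, iterations=50, tenure=3):
    """Flip one cell per iteration, never treating more than `budget` cells."""
    current = [False] * n_cells
    best, best_cost = current[:], high_hazard_count(current)
    tabu = {}  # cell index -> last iteration at which flipping it is forbidden
    for it in range(iterations):
        candidates = []
        for i in range(n_cells):
            neighbour = current[:]
            neighbour[i] = not neighbour[i]
            if sum(neighbour) > budget:
                continue  # layout exceeds the treatment budget
            cost = high_hazard_count(neighbour)
            is_tabu = tabu.get(i, -1) >= it
            # Aspiration: a tabu move is allowed only if it beats the best so far.
            if is_tabu and cost >= best_cost:
                continue
            candidates.append((cost, i, neighbour))
        if not candidates:
            break
        cost, i, current = min(candidates)  # best admissible move, even if worse
        tabu[i] = it + tenure
        if cost < best_cost:
            best, best_cost = current[:], cost
    return best, best_cost


plan, cost = tabu_search(n_cells=9, budget=3)
print(plan, cost)
```

Each candidate layout is scored independently of the others, which is the part of the search that would plausibly be offloaded to a GPU in a real implementation.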

Relevance:

20.00%

Publisher:

Abstract:

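The rendering pipeline of modern GPUs is generally based on scan-line conversion. After rasterization, the triangles of a model are projected onto the screen, and a basic processing unit, called a fragment, is generated at each pixel position in the projected region. When objects in the scene occlude one another, rasterization generates multiple fragments for the same pixel position. Typical graphics applications only need to process the surfaces directly visible from the viewpoint, that is, each pixel only needs to keep the fragment nearest to or farthest from the viewpoint. Some rendering effects, however, must process several fragments at the same pixel position simultaneously. These are commonly called multi-fragment effects and include order-independent transparency, translucency, volume rendering, and refraction. Existing GPUs are optimized in hardware only for rendering opaque objects, so after rasterization each pixel position keeps only the nearest or farthest fragment and the rest are discarded when the final image is generated. Rendering multi-fragment effects therefore currently requires rasterizing the whole scene multiple times to collect all fragments for each pixel. Such algorithms perform well on small scenes, but for large, complex scenes the repeated vertex transformation of the model becomes the rendering bottleneck and efficiency drops, which makes them hard to use widely in interactive applications.

To address this problem, this work proposes an efficient depth peeling algorithm based on bucket sort. It uses the multiple render target buffers on the GPU as a bucket array and applies the bucket sort principle together with max/min blending to collect the fragments projected onto the same pixel and sort them by depth. The scene is then shaded with deferred shading in a post-processing pass to render multi-fragment effects. When fragments collide within a bucket, multi-pass rendering or adaptive subdivision can be used to reduce the collision probability and further improve rendering accuracy. For rendering transparency, an improved algorithm based on dynamic in-bucket blending is proposed: it uses concurrent reads and writes to blend, one by one, all fragments that fall into the same bucket, and blends the color values of the buckets in front-to-back order during post-processing. Because the probability that an in-bucket fragment collision and a read-write conflict occur simultaneously is very small, rendering accuracy improves further. Experimental results show that the bucket-sort-based depth peeling algorithm handles multi-fragment effects in large scenes efficiently while producing results very close to the ground truth.

To address the shortcomings of the traditional graphics pipeline, this work further designs and implements the CUDA renderer, the first fully programmable graphics pipeline system that can run on current graphics hardware. Based on this framework, two new efficient single-pass strategies for rendering transparency are designed. The first, called the multi-level depth test strategy, uses CUDA's atomic operation atomicMin to collect and sort all fragments dynamically in a single rendering pass. The second, called the fixed-size array buffer strategy, uses CUDA's atomic operation atomicInc to collect all fragments in rasterization order in a single pass and sort them in post-processing. Experimental results show that both fragment collection strategies render multi-fragment effects efficiently in a single traversal of the scene while producing results very close to the ground truth.

Future work will refine the bucket-sort-based depth peeling algorithm by designing a better depth-interval partitioning scheme so that buckets correspond one-to-one with fragments, eliminating in-bucket fragment collisions entirely. The CUDA renderer can also be improved to handle other classic graphics problems, such as occlusion culling and anti-aliasing, more efficiently.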
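The bucket idea described in this abstract can be sketched on the CPU for a single pixel. This is a simplified illustration under stated assumptions, not the authors' implementation: each bucket is assumed to hold one depth value, the depth range is assumed to be split uniformly, and min blending is emulated with Python's `min`. The function name `collect_fragments` and its parameters are hypothetical.

```python
def collect_fragments(depths, num_buckets=8, z_near=0.0, z_far=1.0):
    """Route one pixel's fragment depths into uniform depth buckets.

    Returns the surviving depths in front-to-back order and the number of
    in-bucket collisions (fragments that landed in an occupied bucket).
    """
    empty = float("inf")
    buckets = [empty] * num_buckets
    collisions = 0
    for d in depths:  # fragments arrive in arbitrary rasterization order
        idx = int((d - z_near) / (z_far - z_near) * num_buckets)
        idx = min(idx, num_buckets - 1)  # clamp d == z_far into the last bucket
        if buckets[idx] != empty:
            collisions += 1
        # Emulates min blending: the nearer fragment wins the bucket.
        buckets[idx] = min(buckets[idx], d)
    # Reading the buckets in index order yields depth-sorted fragments.
    ordered = [d for d in buckets if d != empty]
    return ordered, collisions


print(collect_fragments([0.9, 0.1, 0.5]))    # no collisions
print(collect_fragments([0.10, 0.12, 0.7]))  # 0.10 and 0.12 share a bucket
```

In the second call one fragment is lost to a collision, which is the situation the abstract proposes to handle with additional passes or adaptive subdivision of the depth range.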

Relevance:

20.00%

Publisher:

Abstract:

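This paper presents a GPU-accelerated real-time image-based rendering algorithm. The algorithm uses a polar coordinate system to generate spherical depth images that sample an object uniformly from all directions. Then, using two derived pre-warping equations, a single spherical depth image is pre-warped onto a view-dependent tangent plane of the object's bounding sphere to produce an intermediate image, and texture mapping is used to generate the final target image. Exploiting the programmability and parallelism of modern graphics hardware, the pre-warping is ported to the Vertex Shader to speed up rendering, and the hardware's rasterization is used to interpolate the image, yielding a continuous, hole-free result. In addition, per-pixel lighting and environment mapping are computed in the Pixel Shader to produce high-quality lighting effects. Finally, the paper resolves the algorithm's restricted-viewpoint problem and designs a dynamic LOD (Level of Detail) algorithm, implementing a real-time walkthrough system that preserves correct occlusion relationships between objects.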

Relevance:

20.00%

Publisher:

Abstract:

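By extending the shadow map algorithm, this paper proposes an approximate real-time soft shadow rendering algorithm that runs entirely on the GPU. It is a three-pass algorithm. The first pass computes the depth map of the scene from the center of the light source. The second pass uses the geometry shader to extract the silhouette edges of objects and to generate new geometric primitives on those edges, uses the hardware's automatic interpolation to render a linearly approximated penumbra map outward, and culls, in the pixel shader, the penumbra regions formed by back-facing silhouettes according to the depth map from the first pass. For overlapping penumbra regions, a pseudo-depth value is assigned to the fragments and the hardware blends them automatically. The third pass queries the depth map and the penumbra map to determine the brightness of pixels in the umbra and penumbra regions of the scene, yielding an approximate soft shadow effect under area light illumination.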

Relevance:

20.00%

Publisher:

Abstract:

This article describes advances in statistical computation for large-scale data analysis in structured Bayesian mixture models via graphics processing unit (GPU) programming. The developments are partly motivated by computational challenges arising in fitting models of increasing heterogeneity to increasingly large datasets. An example context concerns common biological studies using high-throughput technologies generating many, very large datasets and requiring increasingly high-dimensional mixture models with large numbers of mixture components. We outline important strategies and processes for GPU computation in Bayesian simulation and optimization approaches, give examples of the benefits of GPU implementations in terms of processing speed and scale-up in ability to analyze large datasets, and provide a detailed, tutorial-style exposition that will benefit readers interested in developing GPU-based approaches in other statistical models. Novel, GPU-oriented approaches to modifying existing algorithms software design can lead to vast speed-up and, critically, enable statistical analyses that presently will not be performed due to compute time limitations in traditional computational environments. Supplemental materials are provided with all source code, example data, and details that will enable readers to implement and explore the GPU approach in this mixture modeling context. © 2010 American Statistical Association, Institute of Mathematical Statistics, and Interface Foundation of North America.

Relevance:

20.00%

Publisher:

Abstract:

The use of accelerators, with compute architectures different and distinct from the CPU, has become a new research frontier in high-performance computing over the past five years. This paper is a case study on how the instruction-level parallelism offered by three accelerator technologies, FPGA, GPU and ClearSpeed, can be exploited in atomic physics. The algorithm studied is the evaluation of two-electron integrals, using direct numerical quadrature, a task that arises in the study of intermediate-energy electron scattering by hydrogen atoms. The results of our ‘productivity’ study show that while each accelerator is viable, there are considerable differences in the implementation strategies that must be followed on each.

Relevance:

20.00%

Publisher:

Abstract:

In this paper, we propose a multi-camera application capable of processing high-resolution images and extracting features based on color patterns on graphics processing units (GPUs). The goal is to work in real time in the uncontrolled environment of a sports event such as a football match. Since football players present diverse and complex color patterns, a Gaussian Mixture Model (GMM) is applied as the segmentation paradigm in order to analyze live sports images and video. Optimization techniques have also been applied to the C++ implementation using profiling tools focused on high performance. Time-consuming tasks were implemented on NVIDIA's CUDA platform, and later restructured and enhanced, speeding up the whole process significantly. Our resulting code is around 4-11 times faster on a low-cost GPU than a highly optimized C++ version on a central processing unit (CPU) over the same data. Real-time performance was achieved, processing up to 64 frames per second. An important conclusion derived from our study is the scalability of the application with the number of cores on the GPU. © 2011 Springer-Verlag.
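To make the GMM-based segmentation step concrete, here is a minimal sketch of per-pixel classification. It assumes the mixture components have already been fitted, uses diagonal covariances for simplicity, and assigns each pixel to the component with the highest weighted likelihood. The names `log_gaussian` and `classify_pixel`, the labels, and all parameter values are hypothetical and do not come from the paper.

```python
import math


def log_gaussian(x, mean, var):
    """Log-density of a diagonal-covariance Gaussian at colour vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def classify_pixel(rgb, components):
    """Return the label of the most likely component for one pixel.

    components: list of (label, weight, mean, var) tuples, assumed to be
    the result of an earlier fitting step that is not shown here.
    """
    best = max(
        components,
        key=lambda c: math.log(c[1]) + log_gaussian(rgb, c[2], c[3]),
    )
    return best[0]


# Hypothetical fitted components for a red kit and the pitch.
components = [
    ("team_a", 0.5, (200.0, 30.0, 30.0), (400.0, 400.0, 400.0)),
    ("grass", 0.5, (40.0, 160.0, 40.0), (400.0, 400.0, 400.0)),
]
print(classify_pixel((190.0, 40.0, 35.0), components))
```

Every pixel is classified independently of its neighbours, which is why this kind of step maps naturally onto one GPU thread per pixel.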

Relevance:

20.00%

Publisher:

Abstract:

The R-matrix method, when applied to the study of intermediate-energy electron scattering by the hydrogen atom, gives rise to a large number of two-electron integrals between numerical basis functions. Each integral is evaluated independently of the others, making this a prime candidate for a parallel implementation. In this paper, we present a parallel implementation of this routine that uses a Graphical Processing Unit as a co-processor, giving a speedup of approximately 20 times compared with a sequential version. We briefly consider properties of this calculation that make a GPU implementation appropriate, with a view to identifying other calculations that might similarly benefit.
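The independence property described in this abstract can be illustrated with a small sketch in which each two-dimensional integral is evaluated by numerical quadrature as its own task. The midpoint rule, the integration domain, the toy integrands, and the use of a thread pool as a stand-in for a GPU are all assumptions made for illustration; the paper's basis functions and quadrature scheme are not reproduced.

```python
from concurrent.futures import ThreadPoolExecutor


def midpoint_quadrature_2d(f, a, b, n=64):
    """Approximate the integral of f(x, y) over [a, b] x [a, b]
    with an n-by-n midpoint rule."""
    h = (b - a) / n
    nodes = [a + (i + 0.5) * h for i in range(n)]
    return sum(f(x, y) for x in nodes for y in nodes) * h * h


def evaluate_all(integrands, a=0.0, b=1.0, n=64, workers=4):
    """Evaluate every integral as an independent task.

    No task reads another task's result, so the work can be distributed
    freely; here a thread pool plays the role of the parallel device.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(
            pool.map(lambda f: midpoint_quadrature_2d(f, a, b, n), integrands)
        )


# Toy integrands standing in for products of basis functions.
results = evaluate_all([lambda x, y: x * y, lambda x, y: 1.0])
print(results)
```

Because the tasks share no state, the same decomposition applies whether the workers are CPU threads or GPU thread blocks.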

Relevance:

20.00%

Publisher: