Digital Library

105 results for GPU computing

Minimizing damage in postbuckling stiffened composite panels: An optimization strategy using High Performance Computing

Real-time GPU color-based segmentation of football players

Abstract:

In this paper, we propose a multi-camera application capable of processing high-resolution images and extracting features based on color patterns on graphics processing units (GPUs). The goal is to work in real time under the uncontrolled environment of a sporting event such as a football match. Since football players present diverse and complex color patterns, a Gaussian Mixture Model (GMM) is applied as the segmentation paradigm in order to analyze live sports images and video. Optimization techniques have also been applied to the C++ implementation using profiling tools focused on high performance. Time-consuming tasks were implemented on NVIDIA's CUDA platform, and later restructured and enhanced, speeding up the whole process significantly. Our resulting code is around 4-11 times faster on a low-cost GPU than a highly optimized C++ version on a central processing unit (CPU) over the same data. Real-time performance has been achieved, processing up to 64 frames per second. An important conclusion derived from our study is the scalability of the application with the number of cores on the GPU. © 2011 Springer-Verlag.
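
As a rough illustration of color-based GMM segmentation, the sketch below labels each pixel by comparing its likelihood under two Gaussian mixtures. It is not the authors' implementation: the mixture parameters, the names `pitch` and `player`, and the diagonal-covariance simplification are all assumptions made for the example. Each pixel is classified independently of the others, which is the property that makes this kind of task map well onto GPU threads.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a diagonal-covariance Gaussian at the RGB point x."""
    p = 1.0
    for xi, mi, vi in zip(x, mean, var):
        p *= math.exp(-0.5 * (xi - mi) ** 2 / vi) / math.sqrt(2 * math.pi * vi)
    return p

def mixture_pdf(x, components):
    """Weighted sum of component densities; components = [(weight, mean, var), ...]."""
    return sum(w * gaussian_pdf(x, m, v) for w, m, v in components)

# Hypothetical, hand-picked mixtures (a real system would fit these from data).
pitch = [
    (0.7, (40, 140, 50), (400, 400, 400)),
    (0.3, (60, 170, 70), (400, 400, 400)),
]
player = [
    (0.6, (200, 30, 30), (600, 600, 600)),    # assumed red shirt
    (0.4, (240, 240, 240), (600, 600, 600)),  # assumed white shorts
]

def segment(pixels):
    """Return True for pixels more likely under the player mixture than the pitch one."""
    return [mixture_pdf(p, player) > mixture_pdf(p, pitch) for p in pixels]

mask = segment([(45, 150, 55), (205, 35, 28), (235, 238, 241)])
```

Here `mask` marks the grass-colored pixel as background and the two shirt- and shorts-colored pixels as foreground.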

Application-based workload model for wireless sensor node computing platforms

Abstract:

Wireless sensor node platforms are very diversified and very constrained, particularly in power consumption. When choosing or sizing a platform for a given application, it is necessary to be able to evaluate the impact of those choices at an early design stage. For the computing platform implemented on the sensor node, this requires a good understanding of the workload it must perform. However, this workload is highly application-dependent: it depends on the data sampling frequency together with application-specific data processing and management. It is thus necessary to have a model that can represent the workload of applications with various needs and characteristics. In this paper, we propose a workload model for wireless sensor node computing platforms. This model is based on a synthetic application that models the different computational tasks that the computing platform will perform to process sensor data. It allows the workload of various applications to be modeled by tuning the data sampling rate and processing. A case study is performed by modeling different applications and by showing how the model can be used for workload characterization. © 2011 IEEE.

Enabling science through emerging HPC technologies: accelerating numerical quadrature using a GPU

Abstract:

The R-matrix method, when applied to the study of intermediate-energy electron scattering by the hydrogen atom, gives rise to a large number of two-electron integrals between numerical basis functions. Each integral is evaluated independently of the others, thereby rendering this a prime candidate for a parallel implementation. In this paper, we present a parallel implementation of this routine which uses a Graphical Processing Unit as a co-processor, giving a speedup of approximately 20 times when compared with a sequential version. We briefly consider properties of this calculation which make a GPU implementation appropriate, with a view to identifying other calculations which might similarly benefit.
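
The independence of the integrals can be sketched as a plain map over tasks. The example below is only an illustration under stated assumptions: the integrands are simple one-dimensional stand-ins rather than two-electron integrals, composite Simpson's rule is swapped in for whatever quadrature the authors use, and a CPU thread pool takes the place of the GPU. It shows the dependency structure, not the reported speedup.

```python
import math
from concurrent.futures import ThreadPoolExecutor

def simpson(f, a, b, n=200):
    """Composite Simpson's rule on [a, b] using n subintervals (n made even)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Hypothetical stand-ins for the basis-function integrands: (function, lower, upper).
tasks = [
    (math.sin, 0.0, math.pi),
    (math.exp, 0.0, 1.0),
    (lambda x: x * x, 0.0, 3.0),
]

def integrate_all(tasks):
    """Evaluate every integral independently; no task reads another task's result."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda t: simpson(*t), tasks))

results = integrate_all(tasks)
```

Because no task shares state with another, the same list could in principle be handed to one GPU thread or thread block per integral.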

Using Machine Descriptors to Select Parallelization Models and Strategies on Hierarchical Systems: Supercomputing'2001: High Performance Computing and Networking Conference (SC)

Application Awareness in Adaptation Middleware: Balancing Transparency with Performance and Adaptivity: SIAM Conference on Parallel Processing for Scientific Computing (SIAM PP), Miniworkshop on Adaptivity in Parallel and Distributed Computing through Interoperating Systems and Applications

Unified Scheduling of Polymorphic Parallelism on the Cell Processor: Abstracts of the 2008 SIAM Conference on Parallel Processing for Scientific Computing, Miniworkshop on the Cell Processor (SIAM PP)

Model-Based Hybrid MPI/OpenMP Power-Aware Computing: ACM/IEEE Supercomputing'2009: High-performance Computing, Networking, Storage and Analysis (SC): Poster Session

Hybrid MPI/OpenMP Power-Aware Computing

Power-aware MPI Task Aggregation Prediction for High-End Computing Systems

Reconciling Explicit with Implicit Parallelism: 2012 SIAM Conference on Parallel Processing for Scientific Computing (SIAM PP), Savannah, GA, USA

Generic low-latency NoC router architecture for FPGA computing systems

Abstract:

A novel cost-effective and low-latency wormhole router for packet-switched NoC designs, tailored for FPGA, is presented. It has been designed to be scalable at system level to fully exploit the characteristics and constraints of FPGA-based systems, rather than custom ASIC technology. A key feature is that it achieves a low packet propagation latency of only two cycles per hop, including both router pipeline delay and link traversal delay (a significant enhancement over existing FPGA designs), whilst being very competitive in terms of performance and hardware complexity. It can also be configured in various network topologies including 1-D, 2-D, and 3-D. Detailed design-space exploration has been carried out for a range of scaling parameters, with the results of various design trade-offs being presented and discussed. By taking advantage of abundant built-in reconfigurable logic and routing resources, we have been able to create a new scalable on-chip FPGA-based router that exhibits high dimensionality and connectivity. The proposed architecture can be easily migrated across many FPGA families to provide flexible, robust and cost-effective NoC solutions suitable for the implementation of high-performance FPGA computing systems. © 2011 IEEE.

Optimal Job Scheduling of Grid Computing Using Efficient Binary Artificial Bee Colony

Swarm scheduling approaches for work-flow applications with security constraints in distributed data-intensive computing environments

Abstract:

The scheduling problem in distributed data-intensive computing environments has become an active research topic due to the tremendous growth in grid and cloud computing environments. As an innovative distributed intelligent paradigm, swarm intelligence provides a novel approach to solving these potentially intractable problems. In this paper, we formulate the scheduling problem for work-flow applications with security constraints in distributed data-intensive computing environments and present a novel security constraint model. Several meta-heuristic adaptations to the particle swarm optimization algorithm are introduced to deal with the formulation of efficient schedules. A variable neighborhood particle swarm optimization algorithm is compared with a multi-start particle swarm optimization and a multi-start genetic algorithm. Experimental results illustrate that population-based meta-heuristic approaches usually provide a good balance between global exploration and local exploitation, and demonstrate their feasibility and effectiveness for scheduling work-flow applications. © 2010 Elsevier Inc. All rights reserved.

Trading Computing Resources across gLite and XtreemOS Platforms
