312 resultados para Single Graphics Processing Units

em Queensland University of Technology - ePrints Archive


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Tridiagonal diagonally dominant linear systems arise in many scientific and engineering applications. The standard Thomas algorithm for solving such systems is inherently serial forming a bottleneck in computation. Algorithms such as cyclic reduction and SPIKE reduce a single large tridiagonal system into multiple small independent systems which can be solved in parallel. We have developed portable cyclic reduction and SPIKE algorithm OpenCL implementations with the intent to target a range of co-processors in a heterogeneous computing environment including Field Programmable Gate Arrays (FPGAs), Graphics Processing Units (GPUs) and other multi-core processors. In this paper, we evaluate these designs in the context of solver performance, resource efficiency and numerical accuracy.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Machine vision represents a particularly attractive solution for sensing and detecting potential collision-course targets due to the relatively low cost, size, weight, and power requirements of vision sensors (as opposed to radar and TCAS). This paper describes the development and evaluation of a real-time vision-based collision detection system suitable for fixed-wing aerial robotics. Using two fixed-wing UAVs to recreate various collision-course scenarios, we were able to capture highly realistic vision (from an onboard camera perspective) of the moments leading up to a collision. This type of image data is extremely scarce and was invaluable in evaluating the detection performance of two candidate target detection approaches. Based on the collected data, our detection approaches were able to detect targets at distances ranging from 400m to about 900m. These distances (with some assumptions about closing speeds and aircraft trajectories) translate to an advanced warning of between 8-10 seconds ahead of impact, which approaches the 12.5 second response time recommended for human pilots. We overcame the challenge of achieving real-time computational speeds by exploiting the parallel processing architectures of graphics processing units found on commercially-off-the-shelf graphics devices. Our chosen GPU device suitable for integration onto UAV platforms can be expected to handle real-time processing of 1024 by 768 pixel image frames at a rate of approximately 30Hz. Flight trials using manned Cessna aircraft where all processing is performed onboard will be conducted in the near future, followed by further experiments with fully autonomous UAV platforms.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The efficient computation of matrix function vector products has become an important area of research in recent times, driven in particular by two important applications: the numerical solution of fractional partial differential equations and the integration of large systems of ordinary differential equations. In this work we consider a problem that combines these two applications, in the form of a numerical solution algorithm for fractional reaction diffusion equations that after spatial discretisation, is advanced in time using the exponential Euler method. We focus on the efficient implementation of the algorithm on Graphics Processing Units (GPU), as we wish to make use of the increased computational power available with this hardware. We compute the matrix function vector products using the contour integration method in [N. Hale, N. Higham, and L. Trefethen. Computing Aα, log(A), and related matrix functions by contour integrals. SIAM J. Numer. Anal., 46(5):2505–2523, 2008]. Multiple levels of preconditioning are applied to reduce the GPU memory footprint and to further accelerate convergence. We also derive an error bound for the convergence of the contour integral method that allows us to pre-determine the appropriate number of quadrature points. Results are presented that demonstrate the effectiveness of the method for large two-dimensional problems, showing a speedup of more than an order of magnitude compared to a CPU-only implementation.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This article describes a maximum likelihood method for estimating the parameters of the standard square-root stochastic volatility model and a variant of the model that includes jumps in equity prices. The model is fitted to data on the S&P 500 Index and the prices of vanilla options written on the index, for the period 1990 to 2011. The method is able to estimate both the parameters of the physical measure (associated with the index) and the parameters of the risk-neutral measure (associated with the options), including the volatility and jump risk premia. The estimation is implemented using a particle filter whose efficacy is demonstrated under simulation. The computational load of this estimation method, which previously has been prohibitive, is managed by the effective use of parallel computing using graphics processing units (GPUs). The empirical results indicate that the parameters of the models are reliably estimated and consistent with values reported in previous work. In particular, both the volatility risk premium and the jump risk premium are found to be significant.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Games and related virtual environments have been a much-hyped area of the entertainment industry. The classic quote is that games are now approaching the size of Hollywood box office sales [1]. Books are now appearing that talk up the influence of games on business [2], and it is one of the key drivers of present hardware development. Some of this 3D technology is now embedded right down at the operating system level via the Windows Presentation Foundations – hit Windows/Tab on your Vista box to find out... In addition to this continued growth in the area of games, there are a number of factors that impact its development in the business community. Firstly, the average age of gamers is approaching the mid thirties. Therefore, a number of people who are in management positions in large enterprises are experienced in using 3D entertainment environments. Secondly, due to the pressure of demand for more computational power in both CPU and Graphical Processing Units (GPUs), your average desktop, any decent laptop, can run a game or virtual environment. In fact, the demonstrations at the end of this paper were developed at the Queensland University of Technology (QUT) on a standard Software Operating Environment, with an Intel Dual Core CPU and basic Intel graphics option. What this means is that the potential exists for the easy uptake of such technology due to 1. a broad range of workers being regularly exposed to 3D virtual environment software via games; 2. present desktop computing power now strong enough to potentially roll out a virtual environment solution across an entire enterprise. We believe such visual simulation environments can have a great impact in the area of business process modeling. Accordingly, in this article we will outline the communication capabilities of such environments, giving fantastic possibilities for business process modeling applications, where enterprises need to create, manage, and improve their business processes, and then communicate their processes to stakeholders, both process and non-process cognizant. The article then concludes with a demonstration of the work we are doing in this area at QUT.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The School of Electrical and Electronic Systems Engineering at Queensland University of Technology, Brisbane, Australia (QUT), offers three bachelor degree courses in electrical and computer engineering. In all its courses there is a strong emphasis on signal processing. A newly established Signal Processing Research Centre (SPRC) has played an important role in the development of the signal processing units in these courses. This paper describes the unique design of the undergraduate program in signal processing at QUT, the laboratories developed to support it, and the criteria that influenced the design.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Machine vision represents a particularly attractive solution for sensing and detecting potential collision-course targets due to the relatively low cost, size, weight, and power requirements of the sensors involved (as opposed to radar). This paper describes the development and evaluation of a vision-based collision detection algorithm suitable for fixed-wing aerial robotics. The system was evaluated using highly realistic vision data of the moments leading up to a collision. Based on the collected data, our detection approaches were able to detect targets at distances ranging from 400m to about 900m. These distances (with some assumptions about closing speeds and aircraft trajectories) translate to an advanced warning of between 8-10 seconds ahead of impact, which approaches the 12.5 second response time recommended for human pilots. We make use of the enormous potential of graphic processing units to achieve processing rates of 30Hz (for images of size 1024-by- 768). Currently, integration in the final platform is under way.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The emergence of pseudo-marginal algorithms has led to improved computational efficiency for dealing with complex Bayesian models with latent variables. Here an unbiased estimator of the likelihood replaces the true likelihood in order to produce a Bayesian algorithm that remains on the marginal space of the model parameter (with latent variables integrated out), with a target distribution that is still the correct posterior distribution. Very efficient proposal distributions can be developed on the marginal space relative to the joint space of model parameter and latent variables. Thus psuedo-marginal algorithms tend to have substantially better mixing properties. However, for pseudo-marginal approaches to perform well, the likelihood has to be estimated rather precisely. This can be difficult to achieve in complex applications. In this paper we propose to take advantage of multiple central processing units (CPUs), that are readily available on most standard desktop computers. Here the likelihood is estimated independently on the multiple CPUs, with the ultimate estimate of the likelihood being the average of the estimates obtained from the multiple CPUs. The estimate remains unbiased, but the variability is reduced. We compare and contrast two different technologies that allow the implementation of this idea, both of which require a negligible amount of extra programming effort. The superior performance of this idea over the standard approach is demonstrated on simulated data from a stochastic volatility model.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A "self-exciting" market is one in which the probability of observing a crash increases in response to the occurrence of a crash. It essentially describes cases where the initial crash serves to weaken the system to some extent, making subsequent crashes more likely. This thesis investigates if equity markets possess this property. A self-exciting extension of the well-known jump-based Bates (1996) model is used as the workhorse model for this thesis, and a particle-filtering algorithm is used to facilitate estimation by means of maximum likelihood. The estimation method is developed so that option prices are easily included in the dataset, leading to higher quality estimates. Equilibrium arguments are used to price the risks associated with the time-varying crash probability, and in turn to motivate a risk-neutral system for use in option pricing. The option pricing function for the model is obtained via the application of widely-used Fourier techniques. An application to S&P500 index returns and a panel of S&P500 index option prices reveals evidence of self excitation.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The nonlinear problem of steady free-surface flow past a submerged source is considered as a case study for three-dimensional ship wave problems. Of particular interest is the distinctive wedge-shaped wave pattern that forms on the surface of the fluid. By reformulating the governing equations with a standard boundary-integral method, we derive a system of nonlinear algebraic equations that enforce a singular integro-differential equation at each midpoint on a two-dimensional mesh. Our contribution is to solve the system of equations with a Jacobian-free Newton-Krylov method together with a banded preconditioner that is carefully constructed with entries taken from the Jacobian of the linearised problem. Further, we are able to utilise graphics processing unit acceleration to significantly increase the grid refinement and decrease the run-time of our solutions in comparison to schemes that are presently employed in the literature. Our approach provides opportunities to explore the nonlinear features of three-dimensional ship wave patterns, such as the shape of steep waves close to their limiting configuration, in a manner that has been possible in the two-dimensional analogue for some time.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The evolution of technological systems is hindered by systemic components, referred to as reverse salients, which fail to deliver the necessary level of technological performance thereby inhibiting the performance delivery of the system as a whole. This paper develops a performance gap measure of reverse salience and applies this measurement in the study of the PC (personal computer) technological system, focusing on the evolutions of firstly the CPU (central processing unit) and PC game sub-systems, and secondly the GPU (graphics processing unit) and PC game sub-systems. The measurement of the temporal behavior of reverse salience indicates that the PC game sub-system is the reverse salient, continuously trailing behind the technological performance of the CPU and GPU sub-systems from 1996 through 2006. The technological performance of the PC game sub-system as a reverse salient trails that of the CPU sub-system by up to 2300 MHz with a gradually decreasing performance disparity in recent years. In contrast, the dynamics of the PC game sub-system as a reverse salient trails the GPU sub-system with an ever increasing performance gap throughout the timeframe of analysis. In addition, we further discuss the research and managerial implications of our findings.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

utomatic pain monitoring has the potential to greatly improve patient diagnosis and outcomes by providing a continuous objective measure. One of the most promising methods is to do this via automatically detecting facial expressions. However, current approaches have failed due to their inability to: 1) integrate the rigid and non-rigid head motion into a single feature representation, and 2) incorporate the salient temporal patterns into the classification stage. In this paper, we tackle the first problem by developing a “histogram of facial action units” representation using Active Appearance Model (AAM) face features, and then utilize a Hidden Conditional Random Field (HCRF) to overcome the second issue. We show that both of these methods improve the performance on the task of pain detection in sequence level compared to current state-of-the-art-methods on the UNBC-McMaster Shoulder Pain Archive.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The present study explored whether semantic and motor systems are functionally interwoven via the use of a dual-task paradigm. According to embodied language accounts that propose an automatic and necessary involvement of the motor system in conceptual processing, concurrent processing of hand-related information should interfere more with hand movements than processing of unrelated body-part (i.e., foot, mouth) information. Across three experiments, 100 right-handed participants performed left- or right-hand tapping movements while repeatedly reading action words related to different body-parts, or different body-part names, in both aloud and silent conditions. Concurrent reading of single words related to specific body-parts, or the same words embedded in sentences differing in syntactic and phonological complexity (to manipulate context-relevant processing), and reading while viewing videos of the actions and body-parts described by the target words (to elicit visuomotor associations) all interfered with right-hand but not left-hand tapping rate. However, this motor interference was not affected differentially by hand-related stimuli. Thus, the results provide no support for proposals that body-part specific resources in cortical motor systems are shared between overt manual movements and meaning-related processing of words related to the hand.