14 results for virtualised GPU

in Queensland University of Technology - ePrints Archive


Relevance:

20.00%

Abstract:

The use of graphics processing unit (GPU) parallel processing is becoming a part of mainstream statistical practice. The reliance of Bayesian statistics on Markov Chain Monte Carlo (MCMC) methods makes the applicability of parallel processing not immediately obvious. It is illustrated that computing the likelihood using GPU parallel processing yields substantial reductions in computational time for MCMC and other methods of evaluation. Examples use data from the Global Terrorism Database to model terrorist activity in Colombia from 2000 through 2010 and a likelihood based on the explicit convolution of two negative-binomial processes. Results show decreases in computational time by a factor of over 200. Factors influencing these improvements and guidelines for programming parallel implementations of the likelihood are discussed.
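As a rough illustration of why such a likelihood parallelises well: with independent observations, the log-likelihood is a sum of per-observation terms, and each term can be evaluated separately. The sketch below is plain Python, not GPU code, and evaluates the convolution of two negative-binomial distributions term by term. The (r, p) parameterisation, the function names, and the sample counts are assumptions made for illustration; the abstract does not give the authors' model details.

```python
import math

def nb_pmf(k, r, p):
    """Negative-binomial pmf P(X = k), with size r > 0 and success probability 0 < p < 1."""
    log_pmf = (math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
               + r * math.log(p) + k * math.log1p(-p))
    return math.exp(log_pmf)

def conv_pmf(y, r1, p1, r2, p2):
    """P(X1 + X2 = y) by explicit convolution of two negative binomials."""
    return sum(nb_pmf(k, r1, p1) * nb_pmf(y - k, r2, p2) for k in range(y + 1))

def log_likelihood(counts, r1, p1, r2, p2):
    """Sum of independent per-observation terms; on a GPU each term could be one thread."""
    return sum(math.log(conv_pmf(y, r1, p1, r2, p2)) for y in counts)

counts = [0, 3, 1, 7, 2]  # hypothetical event counts, not data from the paper
print(log_likelihood(counts, r1=2.0, p1=0.4, r2=1.5, p2=0.6))
```

Inside an MCMC sampler this function would be called once per proposed parameter set, which is why speeding up the likelihood evaluation alone can dominate the overall run time.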

Relevance:

20.00%

Abstract:

The nonlinear problem of steady free-surface flow past a submerged source is considered as a case study for three-dimensional ship wave problems. Of particular interest is the distinctive wedge-shaped wave pattern that forms on the surface of the fluid. By reformulating the governing equations with a standard boundary-integral method, we derive a system of nonlinear algebraic equations that enforce a singular integro-differential equation at each midpoint on a two-dimensional mesh. Our contribution is to solve the system of equations with a Jacobian-free Newton-Krylov method together with a banded preconditioner that is carefully constructed with entries taken from the Jacobian of the linearised problem. Further, we are able to utilise graphics processing unit acceleration to significantly increase the grid refinement and decrease the run-time of our solutions in comparison to schemes that are presently employed in the literature. Our approach provides opportunities to explore the nonlinear features of three-dimensional ship wave patterns, such as the shape of steep waves close to their limiting configuration, in a manner that has been possible in the two-dimensional analogue for some time.

Relevance:

20.00%

Abstract:

The efficient computation of matrix function vector products has become an important area of research in recent times, driven in particular by two important applications: the numerical solution of fractional partial differential equations and the integration of large systems of ordinary differential equations. In this work we consider a problem that combines these two applications, in the form of a numerical solution algorithm for fractional reaction diffusion equations that, after spatial discretisation, are advanced in time using the exponential Euler method. We focus on the efficient implementation of the algorithm on Graphics Processing Units (GPUs), as we wish to make use of the increased computational power available with this hardware. We compute the matrix function vector products using the contour integration method in [N. Hale, N. Higham, and L. Trefethen. Computing A^α, log(A), and related matrix functions by contour integrals. SIAM J. Numer. Anal., 46(5):2505–2523, 2008]. Multiple levels of preconditioning are applied to reduce the GPU memory footprint and to further accelerate convergence. We also derive an error bound for the convergence of the contour integral method that allows us to pre-determine the appropriate number of quadrature points. Results are presented that demonstrate the effectiveness of the method for large two-dimensional problems, showing a speedup of more than an order of magnitude compared to a CPU-only implementation.
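To make the time-stepping scheme concrete, here is a minimal scalar sketch of the exponential Euler method for u' = a*u + g(u). In the setting of the abstract, a would be a large matrix from the spatial discretisation and phi1(h*a) applied to a vector would be the matrix function vector product. Here a is a single number, and the test problem and all names are assumptions for illustration.

```python
import math

def phi1(z):
    """phi_1(z) = (exp(z) - 1) / z, with the limit value 1 at z = 0."""
    return math.expm1(z) / z if z != 0.0 else 1.0

def exponential_euler(a, g, u0, h, steps):
    """Integrate u' = a*u + g(u) using u_{n+1} = u_n + h*phi1(a*h)*(a*u_n + g(u_n))."""
    u = u0
    for _ in range(steps):
        u = u + h * phi1(a * h) * (a * u + g(u))
    return u

# Hypothetical test problem: linear decay with a constant source term c.
a, c, u0 = -2.0, 1.0, 3.0
approx = exponential_euler(a, lambda u: c, u0, h=0.1, steps=10)
# Closed-form solution at t = 1 for comparison.
exact = math.exp(a * 1.0) * u0 + (math.exp(a * 1.0) - 1.0) / a * c
print(approx, exact)
```

For a constant source term the scheme reproduces the exact solution up to rounding, because the linear part is integrated exactly. That property is what makes the method attractive for stiff linear operators.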

Relevance:

10.00%

Abstract:

Machine vision represents a particularly attractive solution for sensing and detecting potential collision-course targets due to the relatively low cost, size, weight, and power requirements of vision sensors (as opposed to radar and TCAS). This paper describes the development and evaluation of a real-time vision-based collision detection system suitable for fixed-wing aerial robotics. Using two fixed-wing UAVs to recreate various collision-course scenarios, we were able to capture highly realistic vision (from an onboard camera perspective) of the moments leading up to a collision. This type of image data is extremely scarce and was invaluable in evaluating the detection performance of two candidate target detection approaches. Based on the collected data, our detection approaches were able to detect targets at distances ranging from 400m to about 900m. These distances (with some assumptions about closing speeds and aircraft trajectories) translate to an advance warning of between 8 and 10 seconds ahead of impact, which approaches the 12.5 second response time recommended for human pilots. We overcame the challenge of achieving real-time computational speeds by exploiting the parallel processing architectures of graphics processing units found on commercial-off-the-shelf graphics devices. Our chosen GPU device, which is suitable for integration onto UAV platforms, can be expected to handle real-time processing of 1024 by 768 pixel image frames at a rate of approximately 30Hz. Flight trials using manned Cessna aircraft where all processing is performed onboard will be conducted in the near future, followed by further experiments with fully autonomous UAV platforms.

Relevance:

10.00%

Abstract:

Uninhabited aerial vehicles (UAVs) are a cutting-edge technology that is at the forefront of aviation/aerospace research and development worldwide. Many consider their current military and defence applications as just a token of their enormous potential. Unlocking and fully exploiting this potential will see UAVs in a multitude of civilian applications and routinely operating alongside piloted aircraft. The key to realising the full potential of UAVs lies in addressing a host of regulatory, public relations, and technological challenges never encountered before. Aircraft collision avoidance is considered to be one of the most important issues to be addressed, given its safety critical nature. The collision avoidance problem can be roughly organised into three areas: 1) Sense; 2) Detect; and 3) Avoid. Sensing is concerned with obtaining accurate and reliable information about other aircraft in the air; detection involves identifying potential collision threats based on available information; avoidance deals with the formulation and execution of appropriate manoeuvres to maintain safe separation. This thesis tackles the detection aspect of collision avoidance, via the development of a target detection algorithm that is capable of real-time operation onboard a UAV platform. One of the key challenges of the detection problem is the need to provide early warning. This translates to detecting potential threats whilst they are still far away, when their presence is likely to be obscured and hidden by noise. Another important consideration is the choice of sensors to capture target information, which has implications for the design and practical implementation of the detection algorithm. 
The main contributions of the thesis are: 1) the proposal of a dim target detection algorithm combining image morphology and hidden Markov model (HMM) filtering approaches; 2) the novel use of relative entropy rate (RER) concepts for HMM filter design; 3) the characterisation of algorithm detection performance based on simulated data as well as real in-flight target image data; and 4) the demonstration of the proposed algorithm's capacity for real-time target detection. We also consider the extension of HMM filtering techniques and the application of RER concepts for target heading angle estimation. In this thesis we propose a computer-vision based detection solution, due to the commercial-off-the-shelf (COTS) availability of camera hardware and the hardware's relatively low cost, power, and size requirements. The proposed target detection algorithm adopts a two-stage processing paradigm that begins with an image enhancement pre-processing stage followed by a track-before-detect (TBD) temporal processing stage that has been shown to be effective in dim target detection. We compare the performance of two candidate morphological filters for the image pre-processing stage, and propose a multiple hidden Markov model (MHMM) filter for the TBD temporal processing stage. The role of the morphological pre-processing stage is to exploit the spatial features of potential collision threats, while the MHMM filter serves to exploit the temporal characteristics or dynamics. The problem of optimising our proposed MHMM filter has been examined in detail. Our investigation has produced a novel design process for the MHMM filter that exploits information theory and entropy related concepts. The filter design process is posed as a mini-max optimisation problem based on a joint RER cost criterion. 
We provide proof that this joint RER cost criterion provides a bound on the conditional mean estimate (CME) performance of our MHMM filter, and this in turn establishes a strong theoretical basis connecting our filter design process to filter performance. Through this connection we can intelligently compare and optimise candidate filter models at the design stage, rather than having to resort to time-consuming Monte Carlo simulations to gauge the relative performance of candidate designs. Moreover, the underlying entropy concepts are not constrained to any particular model type. This suggests that the RER concepts established here may be generalised to provide a useful design criterion for multiple model filtering approaches outside the class of HMM filters. In this thesis we also evaluate the performance of our proposed target detection algorithm under realistic operation conditions, and give consideration to the practical deployment of the detection algorithm onboard a UAV platform. Two fixed-wing UAVs were engaged to recreate various collision-course scenarios to capture highly realistic vision (from an onboard camera perspective) of the moments leading up to a collision. Based on this collected data, our proposed detection approach was able to detect targets out to distances ranging from about 400m to 900m. These distances (with some assumptions about closing speeds and aircraft trajectories) translate to an advance warning ahead of impact that approaches the 12.5 second response time recommended for human pilots. Furthermore, readily available graphics processing unit (GPU) based hardware is exploited for its parallel computing capabilities to demonstrate the practical feasibility of the proposed target detection algorithm. A prototype hardware-in-the-loop system has been found to be capable of achieving data processing rates sufficient for real-time operation. There is also scope for further improvement in performance through code optimisations. 
Overall, our proposed image-based target detection algorithm offers UAVs a cost-effective real-time target detection capability that is a step forward in addressing the collision avoidance issue that is currently one of the most significant obstacles preventing widespread civilian applications of uninhabited aircraft. We also highlight that the algorithm development process has led to the discovery of a powerful multiple HMM filtering approach and a novel RER-based multiple filter design process. The utility of our multiple HMM filtering approach and RER concepts, however, extends beyond the target detection problem. This is demonstrated by our application of HMM filters and RER concepts to a heading angle estimation problem.
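For readers unfamiliar with HMM filtering, the basic recursion behind the temporal processing stage can be sketched as follows. This is a generic single-filter forward step (predict, weight by the measurement likelihood, normalise), not the thesis's multiple-HMM filter bank or its RER-based design. The three-state model and all numbers are hypothetical.

```python
def hmm_filter_step(prior, transition, likelihood):
    """One HMM filter recursion.

    prior[i]         -- P(state = i | past measurements)
    transition[i][j] -- P(next state = j | current state = i)
    likelihood[j]    -- P(current measurement | state = j)
    """
    n = len(prior)
    # Prediction: propagate the belief through the state dynamics.
    predicted = [sum(prior[i] * transition[i][j] for i in range(n)) for j in range(n)]
    # Update: weight by the measurement likelihood, then normalise.
    unnormalised = [predicted[j] * likelihood[j] for j in range(n)]
    total = sum(unnormalised)
    return [w / total for w in unnormalised]

# Hypothetical 3-state example: a target that tends to stay put or drift right.
transition = [[0.8, 0.2, 0.0],
              [0.0, 0.8, 0.2],
              [0.0, 0.0, 1.0]]
belief = [1 / 3, 1 / 3, 1 / 3]
for likelihood in ([0.1, 0.7, 0.2], [0.1, 0.3, 0.6]):
    belief = hmm_filter_step(belief, transition, likelihood)
print(belief)
```

In an image-based detector the states would typically correspond to pixel locations, so the prediction and update steps become per-pixel operations that map naturally onto GPU threads.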

Relevance:

10.00%

Abstract:

In this paper a real-time vision-based power line extraction solution is investigated for active UAV guidance. The line extraction algorithm starts from ridge points detected by steerable filters. A collinear line segment fitting algorithm follows, which considers global and local information together with multiple collinear measurements. A GPU-accelerated implementation of the algorithm is also investigated in the experiment. The experimental results show that the proposed algorithm outperforms two baseline line detection algorithms and is able to fit long collinear line segments. The low computational cost of the algorithm makes it suitable for real-time applications.

Relevance:

10.00%

Abstract:

The ‘Fashion Tales’ Conference identifies three fashion discourses: that of making, that of media, and that of scholarship. We propose a fourth, which provides a foundational base for the others: the discourse of fashion pedagogy. We begin with the argument that to thrive in any of these discourses, all fashion graduates require the ability to navigate the complexities of the 21st century fashion industry. Fashion graduates emerge into a professional world which demands a range of high level capabilities above and beyond those traditionally acknowledged by the discipline. Professional education in fashion must transform itself to accommodate these imperatives. In this paper, we document a tale of fashion learning, teaching and scholarship – the tale of a highly successful future-orientated boutique university-based undergraduate fashion course in Queensland, Australia. The Discipline consistently maintains the highest student satisfaction and lowest attrition of any course in the university, achieves extremely competitive student satisfaction scores when compared with other courses nationally and internationally, and reports outstanding graduate employment outcomes. The core of the article addresses how the course effectively balances five key pedagogical tensions identified from the findings of in-depth focus groups with graduating students, and interviews with teaching staff. The pedagogical tensions are: high concept/ authenticity; high disciplinarity/ interdisciplinarity; high rigour/ play; high autonomy/ scaffolding; and high individuality/ community, where community can be further divided into high challenge and high support. We discuss each of these tensions and how they are characterised within the course, using rich descriptions given by the students. 
We also draw upon the wider andragogical and learning futures literatures to link the tensions with what is already known about excellence in 21st century higher and further education curriculum and pedagogic practice. We ask: as the fashion industry becomes truly globalised, virtualised, and diversified, and as initial professional training for the industry becomes increasingly massified and performatised, what are the best teaching approaches to produce autonomous, professionally capable, enterprising and responsible graduates into the future? Can the pedagogical balances described in this case study be maintained in the light of these powerful external forces, and if so, how?

Relevance:

10.00%

Abstract:

The objective of this PhD research program is to investigate numerical methods for simulating variably-saturated flow and sea water intrusion in coastal aquifers in a high-performance computing environment. The work is divided into three overlapping tasks: to develop an accurate and stable finite volume discretisation and numerical solution strategy for the variably-saturated flow and salt transport equations; to implement the chosen approach in a high performance computing environment that may have multiple GPUs or CPU cores; and to verify and test the implementation. The geological description of aquifers is often complex, with porous materials possessing highly variable properties, that are best described using unstructured meshes. The finite volume method is a popular method for the solution of the conservation laws that describe sea water intrusion, and is well-suited to unstructured meshes. In this work we apply a control volume-finite element (CV-FE) method to an extension of a recently proposed formulation (Kees and Miller, 2002) for variably saturated groundwater flow. The CV-FE method evaluates fluxes at points where material properties and gradients in pressure and concentration are consistently defined, making it both suitable for heterogeneous media and mass conservative. Using the method of lines, the CV-FE discretisation gives a set of differential algebraic equations (DAEs) amenable to solution using higher-order implicit solvers. Heterogeneous computer systems that use a combination of computational hardware such as CPUs and GPUs, are attractive for scientific computing due to the potential advantages offered by GPUs for accelerating data-parallel operations. We present a C++ library that implements data-parallel methods on both CPU and GPUs. The finite volume discretisation is expressed in terms of these data-parallel operations, which gives an efficient implementation of the nonlinear residual function. 
This makes the implicit solution of the DAE system possible on the GPU, because the inexact Newton-Krylov method used by the implicit time stepping scheme can approximate the action of a matrix on a vector using residual evaluations. We also propose preconditioning strategies that are amenable to GPU implementation, so that all computationally-intensive aspects of the implicit time stepping scheme are implemented on the GPU. Results are presented that demonstrate the efficiency and accuracy of the proposed numerical methods and formulation. The formulation offers excellent conservation of mass, and higher-order temporal integration increases both the numerical efficiency and the accuracy of the solutions. Flux limiting produces accurate, oscillation-free solutions on coarse meshes, where much finer meshes are required to obtain solutions with equivalent accuracy using upstream weighting. The computational efficiency of the software is investigated using CPUs and GPUs on a high-performance workstation. The GPU version offers considerable speedup over the CPU version, with one GPU giving a speedup factor of 3 over the eight-core CPU implementation.
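The idea of approximating the action of a matrix on a vector using residual evaluations can be sketched with a forward finite difference, which is the usual device in Jacobian-free Newton-Krylov methods. The toy residual, the step size eps, and the function names below are assumptions for illustration and are not taken from the thesis.

```python
import math

def residual(u):
    """Hypothetical nonlinear residual F(u) for a system with two unknowns."""
    x, y = u
    return [x * x + y - 3.0, x + math.sin(y)]

def jacobian_vector_product(F, u, v, eps=1e-7):
    """Approximate J(u) v with one extra residual evaluation:
    J(u) v ~= (F(u + eps*v) - F(u)) / eps."""
    Fu = F(u)
    Fu_eps = F([ui + eps * vi for ui, vi in zip(u, v)])
    return [(b - a) / eps for a, b in zip(Fu, Fu_eps)]

u = [1.0, 0.5]
v = [0.3, -0.2]
approx = jacobian_vector_product(residual, u, v)
# Analytic Jacobian-vector product of the toy residual, used only as a check.
exact = [2 * u[0] * v[0] + v[1], v[0] + math.cos(u[1]) * v[1]]
print(approx, exact)
```

Because a Krylov solver only needs such products, the Jacobian never has to be assembled or stored. The residual function, once it runs efficiently on the GPU, then carries the implicit solve.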

Relevance:

10.00%

Abstract:

Safety concerns in the operation of autonomous aerial systems require safe-landing protocols be followed during situations where the mission should be aborted due to mechanical or other failure. On-board cameras provide information that can be used in the determination of potential landing sites, which are continually updated and ranked to prevent injury and minimize damage. Pulse Coupled Neural Networks (PCNNs) have been used for the detection of features in images that assist in the classification of vegetation and can be used to minimize damage to the aerial vehicle. However, a significant drawback in the use of PCNNs is that they are computationally expensive and have been more suited to off-line applications on conventional computing architectures. As heterogeneous computing architectures are becoming more common, an OpenCL implementation of a PCNN feature generator is presented and its performance is compared across OpenCL kernels designed for CPU, GPU and FPGA platforms. This comparison examines the compute times required for network convergence under a variety of images obtained during unmanned aerial vehicle trials to determine the plausibility for real-time feature detection.

Relevance:

10.00%

Abstract:

During the evolution of the music industry, developments in the media environment have required music firms to adapt in order to survive. Changes in broadcast radio programming during the 1950s; the Compact Cassette during the 1970s; and the deregulation of media ownership during the 1990s are all examples of changes which have heavily affected the music industry. This study explores similar contemporary dynamics, examines how decision makers in the music industry perceive and make sense of the developments, and reveals how they revise their business strategies, based on their mental models of the media environment. A qualitative system dynamics model is developed in order to support the reasoning brought forward by the study. The model is empirically grounded, but is also based on previous music industry research and a theoretical platform constituted by concepts from evolutionary economics and sociology of culture. The empirical data primarily consist of 36 personal interviews with decision makers in the American, British and Swedish music industrial ecosystems. The study argues that the proposed model explains contemporary music industry dynamics more effectively than music industry models presented by previous research initiatives. Supported by the model, the study is able to show how “new” media outlets make old music business models obsolete and challenge the industry’s traditional power structures. It is no longer possible to expose music at one outlet (usually broadcast radio) in the hope that it will lead to sales of the same music at another (e.g. a compact disc). The study shows that many music industry decision makers still have not embraced the new logic, and have not yet challenged their traditional mental models of the media environment. Rather, they remain focused on preserving the pivotal role held by the CD and other physical distribution technologies. 
Further, the study shows that while many music firms remain attached to the old models, other firms, primarily music publishers, have accepted the transformation, and have reluctantly recognised the realities of a virtualised environment.

Relevance:

10.00%

Abstract:

Safety concerns in the operation of autonomous aerial systems require safe-landing protocols be followed during situations where the mission should be aborted due to mechanical or other failure. This article presents a pulse-coupled neural network (PCNN) to assist in the vegetation classification in a vision-based landing site detection system for an unmanned aircraft. We propose a heterogeneous computing architecture and an OpenCL implementation of a PCNN feature generator. Its performance is compared across OpenCL kernels designed for CPU, GPU, and FPGA platforms. This comparison examines the compute times required for network convergence under a variety of images to determine the plausibility for real-time feature detection.

Relevance:

10.00%

Abstract:

This study uses the reverse salient methodology to contrast subsystems in video game consoles in order to discover, characterize, and forecast the most significant technology gap. We build on the current methodologies (Performance Gap and Time Gap) for measuring the magnitude of Reverse Salience, by showing the effectiveness of Performance Gap Ratio (PGR). The three subject subsystems in this analysis are the CPU Score, GPU core frequency, and video memory bandwidth. CPU Score is a metric developed for this project, which is the product of the core frequency, number of parallel cores, and instruction size. We measure the Performance Gap of each subsystem against concurrently available PC hardware on the market. Using PGR, we normalize the evolution of these technologies for comparative analysis. The results indicate that while CPU performance has historically been the Reverse Salient, video memory bandwidth has taken over as the quickest growing technology gap in the current generation. Finally, we create a technology forecasting model that shows how much the video RAM bandwidth gap will grow through 2019 should the current trend continue. This analysis can assist console developers in assigning resources to the next generation of platforms, which will ultimately result in longer hardware life cycles.
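The CPU Score defined in the abstract is a plain product of three quantities, which can be written down directly. The units and the sample values below are assumptions for illustration; the abstract does not state them.

```python
def cpu_score(core_frequency, parallel_cores, instruction_size):
    """CPU Score as described in the abstract: the product of the three quantities."""
    return core_frequency * parallel_cores * instruction_size

# Hypothetical CPU: 1.6 GHz core frequency, 8 parallel cores, 64-bit instruction size.
print(cpu_score(1.6, 8, 64))
```

Because the score is a product, doubling any one factor doubles the score, so the metric treats frequency, core count, and instruction size as interchangeable sources of performance.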

Relevance:

10.00%

Abstract:

The evolution of technological systems is hindered by systemic components, referred to as reverse salients, which fail to deliver the necessary level of technological performance thereby inhibiting the performance delivery of the system as a whole. This paper develops a performance gap measure of reverse salience and applies this measurement in the study of the PC (personal computer) technological system, focusing on the evolutions of firstly the CPU (central processing unit) and PC game sub-systems, and secondly the GPU (graphics processing unit) and PC game sub-systems. The measurement of the temporal behavior of reverse salience indicates that the PC game sub-system is the reverse salient, continuously trailing behind the technological performance of the CPU and GPU sub-systems from 1996 through 2006. The technological performance of the PC game sub-system as a reverse salient trails that of the CPU sub-system by up to 2300 MHz with a gradually decreasing performance disparity in recent years. In contrast, the dynamics of the PC game sub-system as a reverse salient trails the GPU sub-system with an ever increasing performance gap throughout the timeframe of analysis. In addition, we further discuss the research and managerial implications of our findings.

Relevance:

10.00%

Abstract:

Tridiagonal diagonally dominant linear systems arise in many scientific and engineering applications. The standard Thomas algorithm for solving such systems is inherently serial, forming a bottleneck in computation. Algorithms such as cyclic reduction and SPIKE reduce a single large tridiagonal system into multiple small independent systems which can be solved in parallel. We have developed portable OpenCL implementations of the cyclic reduction and SPIKE algorithms with the intent to target a range of co-processors in a heterogeneous computing environment including Field Programmable Gate Arrays (FPGAs), Graphics Processing Units (GPUs) and other multi-core processors. In this paper, we evaluate these designs in the context of solver performance, resource efficiency and numerical accuracy.
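To show why the Thomas algorithm is described as inherently serial, here is a minimal Python sketch of it. Each step of the forward sweep and of the back substitution depends on the result of the previous step. The argument layout and the example system are assumptions for illustration, and the sketch does not cover the parallel cyclic reduction or SPIKE algorithms that the paper implements.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the Thomas algorithm.

    lower[i] multiplies x[i-1] in row i (lower[0] is unused),
    diag[i] multiplies x[i], upper[i] multiplies x[i+1] (upper[-1] is unused).
    Assumes a diagonally dominant system, so no pivoting is performed.
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    # Forward sweep: row i needs the results of row i-1, hence the serial chain.
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    # Back substitution: x[i] needs x[i+1], again strictly sequential.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

# Diagonally dominant example constructed so that the solution is [1, 2, 3, 4].
lower = [0.0, 1.0, 1.0, 1.0]
diag = [4.0, 4.0, 4.0, 4.0]
upper = [1.0, 1.0, 1.0, 0.0]
rhs = [6.0, 12.0, 18.0, 19.0]
print(thomas_solve(lower, diag, upper, rhs))
```

The two loops carry a dependency from one iteration to the next, so they cannot be distributed across threads as written. Cyclic reduction and SPIKE restructure the elimination to remove that dependency.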