8 results for parallel processing

in Aston University Research Archive


Relevance: 100.00%

Abstract:

The trend in modal extraction algorithms is to use all the available frequency response function data to obtain a global estimate of the natural frequencies, damping ratios and mode shapes. Improvements in transducer and signal processing technology allow the simultaneous measurement of many hundreds of channels of response data. The quantity of data available and the complexity of the extraction algorithms make considerable demands on the available computing power and require a powerful computer or dedicated workstation to perform satisfactorily. An alternative to waiting for faster sequential processors is to implement the algorithm in parallel, for example on a network of Transputers. Parallel architectures are a cost-effective means of increasing computational power, and a larger number of response channels would simply require more processors. This thesis considers how two typical modal extraction algorithms, the Rational Fraction Polynomial method and the Ibrahim Time Domain method, may be implemented on a network of Transputers. The Rational Fraction Polynomial method is a well-known and robust frequency-domain 'curve fitting' algorithm. The Ibrahim Time Domain method is an efficient algorithm that 'curve fits' in the time domain. This thesis reviews the algorithms, considers the problems involved in a parallel implementation, and shows how they were implemented on a real Transputer network.
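As a rough illustration of how such an algorithm parallelises, the sketch below distributes per-channel FRF curve fits over a pool of worker processes, mirroring the processor-farm pattern used on Transputer networks. It is a minimal sketch in Python, assuming a plain polynomial least-squares fit as a stand-in for the Rational Fraction Polynomial fit; all names and data are illustrative, not taken from the thesis.

```python
# Processor-farm pattern: one worker per "Transputer", one task per channel.
import numpy as np
from multiprocessing import Pool

def fit_channel(args):
    """Curve-fit one frequency response function (one response channel)."""
    freqs, frf = args
    # Placeholder for a rational fraction polynomial fit: an ordinary
    # polynomial least-squares fit on the magnitude data.
    return np.polyfit(freqs, np.abs(frf), deg=8)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    freqs = np.linspace(1.0, 100.0, 512)
    # Simulated FRF data for 200 response channels.
    channels = [(freqs, rng.standard_normal(512) + 1j * rng.standard_normal(512))
                for _ in range(200)]
    # More response channels would simply mean more workers in the farm.
    with Pool(processes=4) as pool:
        results = pool.map(fit_channel, channels)
    print(f"fitted {len(results)} channels")
```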

Relevance: 70.00%

Abstract:

In Fourier domain optical coherence tomography (FD-OCT), a large amount of interference data needs to be resampled from the wavelength domain to the wavenumber domain prior to Fourier transformation. We present an approach to optimizing this data processing, using a graphics processing unit (GPU) and parallel processing algorithms. We demonstrate an increased processing and rendering rate over that previously reported by using GPU paged memory to render data in the GPU rather than copying it back to the CPU. This avoids unnecessary and slow data transfer, enabling a processing and display rate of well over 524,000 A-scans/s for a single frame. To the best of our knowledge, this is the fastest processing demonstrated to date and the first time that FD-OCT processing and rendering have been demonstrated entirely on a GPU.
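A minimal sketch of the processing chain described above, written with NumPy rather than on a GPU: resampling from a uniform wavelength grid to a uniform wavenumber grid, followed by a Fourier transform per A-scan. On a GPU the same steps would run as parallel kernels (e.g. via CuPy), with rendering done from device memory to avoid the device-to-host copy the abstract warns against. All names and parameters are illustrative.

```python
import numpy as np

def process_frame(spectra, wavelengths):
    """spectra: (n_ascans, n_samples) raw interference data."""
    k = 2.0 * np.pi / wavelengths          # non-uniform in k, decreasing
    k_uniform = np.linspace(k.min(), k.max(), k.size)
    # Resample every A-scan onto the uniform wavenumber grid
    # (np.interp needs increasing abscissae, hence the reversals).
    resampled = np.array([np.interp(k_uniform, k[::-1], row[::-1])
                          for row in spectra])
    # Depth profiles: FFT along the wavenumber axis.
    return np.abs(np.fft.fft(resampled, axis=1))

wavelengths = np.linspace(800e-9, 880e-9, 1024)
frame = np.random.rand(500, 1024)          # 500 A-scans per frame
depth_profiles = process_frame(frame, wavelengths)
print(depth_profiles.shape)
```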

Relevance: 30.00%

Abstract:

Very large spatially-referenced datasets, for example those derived from satellite-based sensors which sample across the globe or from large monitoring networks of individual sensors, are becoming increasingly common and more widely available for use in environmental decision making. In large or dense sensor networks, huge quantities of data can be collected over small time periods. In many applications the generation of maps, or predictions at specific locations, from the data in (near) real-time is crucial. Geostatistical operations such as interpolation are vital in this map-generation process, and in emergency situations the resulting predictions need to be available almost instantly, so that decision makers can make informed decisions and define risk and evacuation zones. It is also helpful, when analysing data in less time-critical applications such as interactive exploratory analysis, that the algorithms are responsive within a reasonable time frame. Performing geostatistical analysis on such large spatial datasets can present a number of problems, particularly in the case where maximum likelihood estimation is used. Although the storage requirements for the data only scale linearly with the number of observations, the memory and computational costs of likelihood evaluation scale quadratically and cubically, respectively. Most modern commodity hardware has at least two processor cores, if not more, and other mechanisms for parallel computation, such as Grid-based systems, are also becoming increasingly available. However, there currently seems to be little interest in exploiting this extra processing power within the context of geostatistics. In this paper we review the existing parallel approaches for geostatistics. By recognising that different natural parallelisms exist and can be exploited depending on whether the dataset is sparsely or densely sampled with respect to the range of variation, we introduce two contrasting novel implementations of parallel algorithms based on approximating the data likelihood, extending the methods of Vecchia [1988] and Tresp [2000]. Using parallel maximum likelihood variogram estimation and parallel prediction algorithms, we show that computational time can be significantly reduced. We demonstrate this with both sparsely and densely sampled data on a variety of architectures, ranging from the dual-core processors found in many modern desktop computers to large multi-node supercomputers. To highlight the strengths and weaknesses of the different methods we employ synthetic data sets, and go on to show how the methods allow maximum-likelihood-based inference on the exhaustive Walker Lake data set.
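To make the blocked-likelihood idea concrete, the sketch below, in the spirit of the Vecchia [1988] and Tresp [2000] style approximations the paper extends (not the paper's own code), splits the observations into local blocks, evaluates an independent Gaussian log-likelihood per block in parallel, and sums. Each block costs O(m^3) for block size m instead of O(n^3) for the full dataset. The exponential covariance and all parameter names are assumptions.

```python
import numpy as np
from multiprocessing import Pool

def block_loglik(args):
    """Exact Gaussian log-likelihood for one spatial block."""
    coords, y, variance, length_scale = args
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    K = variance * np.exp(-d / length_scale) + 1e-8 * np.eye(len(y))
    L = np.linalg.cholesky(K)               # O(m^3) per block, not O(n^3)
    alpha = np.linalg.solve(L, y)
    return (-0.5 * (alpha @ alpha)
            - np.log(np.diag(L)).sum()
            - 0.5 * len(y) * np.log(2.0 * np.pi))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    coords = rng.random((4000, 2))
    y = rng.standard_normal(4000)
    # Partition the data into blocks of 200 points; blocks are independent,
    # so they can be farmed out to however many cores are available.
    blocks = [(coords[i:i + 200], y[i:i + 200], 1.0, 0.2)
              for i in range(0, 4000, 200)]
    with Pool() as pool:
        total = sum(pool.map(block_loglik, blocks))
    print(f"approximate log-likelihood: {total:.1f}")
```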

Relevance: 30.00%

Abstract:

Image segmentation is one of the most computationally intensive operations in image processing and computer vision, because a large volume of data is involved and many different features have to be extracted from the image data. This thesis is concerned with the investigation of practical issues related to the implementation of several classes of image segmentation algorithms on parallel architectures. The Transputer is used as the basic building block of the hardware architectures, and Occam is used as the programming language. The segmentation methods chosen for implementation are convolution, for edge-based segmentation; the Split and Merge algorithm, for segmenting non-textured regions; and the Granlund method, for segmentation of textured images. Three different convolution methods have been implemented. The direct method of convolution, carried out in the spatial domain, uses the array architecture. The other two methods, based on convolution in the frequency domain, require the use of the two-dimensional Fourier transform. Parallel implementations of two different Fast Fourier Transform algorithms have been developed, incorporating original solutions: for the Row-Column method the array architecture has been adopted, and for the Vector-Radix method, the pyramid architecture. The texture segmentation algorithm, for which a system-level design is given, demonstrates a further application of the Vector-Radix Fourier transform. A novel concurrent version of the quad-tree based Split and Merge algorithm has been implemented on the pyramid architecture. The performance of the developed parallel implementations is analysed, and many of the obtained speed-up and efficiency measures show values close to their respective theoretical maxima. Where appropriate, comparisons are drawn between different implementations. The thesis concludes with comments on general issues related to the use of the Transputer system as a development tool for image processing applications, and on issues related to the engineering of concurrent image processing applications.
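As an illustration of the Row-Column decomposition named above: 1D FFTs along the rows, a transpose, then 1D FFTs along the former columns. In the thesis each Transputer in the array held a band of rows; the sketch below maps row bands over a Python process pool instead. Names are illustrative.

```python
import numpy as np
from multiprocessing import Pool

def fft_rows(band):
    """1D FFT of every row in one horizontal band of the image."""
    return np.fft.fft(band, axis=1)

def fft2_row_column(image, workers=4):
    bands = np.array_split(image, workers, axis=0)
    with Pool(workers) as pool:
        # Pass 1: row FFTs, one band per worker.
        rows_done = np.vstack(pool.map(fft_rows, bands))
        # Transpose so the column transform is again a row transform.
        cols = np.array_split(rows_done.T, workers, axis=0)
        # Pass 2: column FFTs, then undo the transpose.
        result = np.vstack(pool.map(fft_rows, cols)).T
    return result

if __name__ == "__main__":
    image = np.random.rand(256, 256)
    # The row-column result matches the library 2D FFT.
    assert np.allclose(fft2_row_column(image), np.fft.fft2(image))
```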

Relevance: 30.00%

Abstract:

The orientations of lines and edges are important in defining the structure of the visual environment, and observers can detect differences in line orientation within the first few hundred milliseconds of scene viewing. The present work is a psychophysical investigation of the mechanisms of early visual orientation-processing. In experiments with briefly presented displays of line elements, observers indicated whether all the elements were uniformly oriented or whether a uniquely oriented target was present among uniformly oriented nontargets. The minimum difference between nontarget and target orientations that was required for effective target-detection (the orientation increment threshold) varied little with the number of elements and their spatial density, but the percentage of correct responses in detection of a large orientation-difference increased with increasing element density. The different dependences of thresholds and percent-correct scores on element density may indicate the operation of more than one mechanism in early visual orientation-processing. Reducing element length caused thresholds to increase with increasing number of elements, showing that the effectiveness of rapid, spatially parallel orientation-processing depends on element length. Orientational anisotropy in line-target detection has been reported previously: a coarse periodic variation and some finer variations in orientation increment threshold with nontarget orientation have been found. In the present work, the prominence of the coarse variation in relation to the finer variations decreased with increasing effective viewing duration, as if the operation of coarse orientation-processing mechanisms precedes the operation of finer ones. Orientational anisotropy was prominent even when observers lay horizontally and viewed displays by looking upwards through a black cylinder that excluded all possible visual references for orientation. So, gravitational and visual cues are not essential to the definition of an orientational reference frame for early vision, and such a reference can be well defined by retinocentric neural coding, awareness of body-axis orientation, or both.

Relevance: 30.00%

Abstract:

The focus of this study is the development of parallelised versions of inherently sequential and iterative numerical algorithms on a multi-threaded parallel platform such as a graphics processing unit. This requires the design and development of a platform-specific numerical solution that can benefit from the parallel capabilities of the chosen platform. A graphics processing unit was chosen as the parallel platform for the design and development of a numerical solution for a specific physical model in non-linear optics. The problem arises in describing ultra-short pulse propagation in bulk transparent media, which has recently been the subject of several theoretical and numerical studies. The mathematical model describing this phenomenon is challenging and complex, and its numerical modelling is limited on current workstations. Numerical modelling of this problem requires the parallelisation of essentially serial algorithms and the elimination of numerical bottlenecks. The main challenge to overcome is the parallelisation of the globally non-local mathematical model. This thesis presents a numerical solution that eliminates the numerical bottleneck associated with the non-local nature of the mathematical model. The accuracy and performance of the parallel code are verified by back-to-back testing against a similar serial version.
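For orientation, a minimal split-step Fourier sketch of one-dimensional NLSE-type pulse propagation is given below. The model treated in the thesis (ultra-short pulses with a globally non-local term) is considerably more involved; the sketch only illustrates why such schemes map well to a GPU: each step is an FFT plus elementwise array operations, both highly parallel. Replacing numpy with cupy would run essentially the same code on a GPU. All parameters are illustrative.

```python
import numpy as np

n, dz, steps = 4096, 1e-3, 1000
beta2, gamma = -1.0, 1.0                   # dispersion, nonlinearity
t = np.linspace(-10, 10, n)
w = 2 * np.pi * np.fft.fftfreq(n, t[1] - t[0])
field = 1.0 / np.cosh(t)                   # sech input pulse

half_linear = np.exp(0.25j * beta2 * w**2 * dz)  # half linear step
for _ in range(steps):
    # Symmetric split-step: half linear, full nonlinear, half linear.
    field = np.fft.ifft(half_linear * np.fft.fft(field))
    field *= np.exp(1j * gamma * np.abs(field)**2 * dz)
    field = np.fft.ifft(half_linear * np.fft.fft(field))
print(f"peak intensity after propagation: {np.abs(field).max()**2:.3f}")
```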

Relevance: 30.00%

Abstract:

We report the performance of a group of adult dyslexics and matched controls in an array-matching task in which two strings of either consonants or symbols are presented side by side and have to be judged to be the same or different. The arrays may differ either in the order or in the identity of two adjacent characters. This task does not require naming – which has been argued to be the cause of dyslexics' difficulty in processing visual arrays – but instead has a strong serial component, as demonstrated by the fact that, in both groups, reaction times (RTs) increase monotonically with the position of a mismatch. The dyslexics are clearly impaired in all conditions, and performance in the identity conditions predicts performance across orthographic tasks even after age, performance IQ and phonology are partialled out. Moreover, the shapes of the serial position curves are revealing of the underlying impairment. In the dyslexics, RTs increase with position at the same rate as in the controls (the lines are parallel), ruling out reduced processing speed or difficulties in shifting attention. Instead, error rates show a catastrophic increase for positions which are either searched later or more subject to interference. These results are consistent with a reduction in the attentional capacity needed in a serial task to bind together identity and positional information. This capacity is best seen as a reduction in the number of spotlights into which attention can be split to process information at different locations, rather than as a more generic reduction of resources, which would also affect processing of the details of single objects.

Relevance: 30.00%

Abstract:

Femtosecond laser microfabrication has emerged over the last decade as a flexible 3D technology in photonics. Numerical simulations provide important insight into spatial and temporal beam and pulse shaping during the course of extremely intricate nonlinear propagation (see e.g. [1,2]). The electromagnetics of such propagation is typically described by the generalized Non-Linear Schrödinger Equation (NLSE) coupled with a Drude model for the plasma [3]. In this paper we consider a multi-threaded parallel numerical solution for a specific model which describes femtosecond laser pulse propagation in transparent media [4,5]; our approach can, however, be extended to similar models. The numerical code is implemented on an NVIDIA Graphics Processing Unit (GPU), which provides an efficient hardware platform for multi-threaded computing. We compare the performance of the parallel GPU code, implemented using the CUDA programming interface [3], with the serial CPU version used in our previous papers [4,5]. © 2011 IEEE.
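A minimal sketch of the kind of GPU-versus-CPU comparison described: the same elementwise nonlinear step (a Kerr phase term plus a schematic Drude-like plasma absorption term) timed with NumPy on the CPU and, where available, CuPy on an NVIDIA GPU. This illustrates the methodology only; it is not the authors' code, and all constants are invented.

```python
import time
import numpy as np

def nonlinear_step(xp, field, rho, dz, gamma=1.0, sigma=0.1):
    """One elementwise step: Kerr phase rotation plus plasma absorption."""
    return field * xp.exp(1j * gamma * xp.abs(field)**2 * dz
                          - 0.5 * sigma * rho * dz)

def bench(xp, n=2**22, reps=20):
    field = xp.ones(n, dtype=xp.complex128)
    rho = xp.zeros(n, dtype=xp.float64)    # schematic plasma density
    t0 = time.perf_counter()
    for _ in range(reps):
        field = nonlinear_step(xp, field, rho, 1e-3)
    if xp is not np:                       # wait for GPU kernels to finish
        xp.cuda.Stream.null.synchronize()
    return time.perf_counter() - t0

print(f"CPU (NumPy): {bench(np):.2f} s")
try:
    import cupy as cp                      # requires an NVIDIA GPU + CUDA
    print(f"GPU (CuPy):  {bench(cp):.2f} s")
except ImportError:
    print("CuPy not available; GPU timing skipped")
```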