30 results for Single Graphics Processing Units
in Aston University Research Archive
Abstract:
In Fourier domain optical coherence tomography (FD-OCT), a large amount of interference data needs to be resampled from the wavelength domain to the wavenumber domain prior to Fourier transformation. We present an approach to optimize this data processing, using a graphics processing unit (GPU) and parallel processing algorithms. We demonstrate an increased processing and rendering rate over that previously reported by using GPU paged memory to render data in the GPU rather than copying back to the CPU. This avoids unnecessary and slow data transfer, enabling a processing and display rate of well over 524,000 A-scan/s for a single frame. To the best of our knowledge this is the fastest processing demonstrated to date and the first time that FD-OCT processing and rendering has been demonstrated entirely on a GPU.
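The two processing steps named in this abstract (resampling from wavelength to wavenumber, then Fourier transformation) can be illustrated with a minimal CPU-side sketch. It is not the authors' GPU implementation: the function names, the choice of linear interpolation, and the naive DFT are all assumptions made for illustration, and a real pipeline would use an FFT library on the GPU.

```python
import cmath
import math


def resample_to_wavenumber(wavelengths, spectrum):
    """Interpolate a spectrum sampled at ascending wavelengths onto a grid
    that is evenly spaced in wavenumber k = 2*pi/lambda.

    Linear interpolation is an assumption of this sketch."""
    # k falls as wavelength rises, so reverse both lists to get ascending k.
    k_src = [2 * math.pi / lam for lam in reversed(wavelengths)]
    s_src = list(reversed(spectrum))
    n = len(k_src)
    step = (k_src[-1] - k_src[0]) / (n - 1)
    k_uniform = [k_src[0] + i * step for i in range(n)]
    resampled = []
    j = 0
    for k in k_uniform:
        # Advance to the source interval [k_src[j], k_src[j + 1]] containing k.
        while j < n - 2 and k_src[j + 1] < k:
            j += 1
        t = (k - k_src[j]) / (k_src[j + 1] - k_src[j])
        resampled.append(s_src[j] + t * (s_src[j + 1] - s_src[j]))
    return k_uniform, resampled


def a_scan(spectrum_k):
    """Depth profile as the magnitude of a naive O(n^2) DFT.

    Only the first n // 2 bins are returned (positive depths)."""
    n = len(spectrum_k)
    mean = sum(spectrum_k) / n  # subtract the DC background
    centred = [s - mean for s in spectrum_k]
    return [
        abs(sum(c * cmath.exp(-2j * math.pi * m * i / n)
                for i, c in enumerate(centred)))
        for m in range(n // 2)
    ]
```

Without the resampling step, a single reflector produces a fringe whose period varies across the spectrum, which broadens the peak in the depth profile; after resampling the fringe is periodic in k and the transform concentrates it in one bin.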
Abstract:
This thesis describes advances in the characterisation, calibration and data processing of optical coherence tomography (OCT) systems. Femtosecond (fs) laser inscription was used for producing OCT-phantoms. Transparent materials are generally inert to infrared radiation, but with fs lasers material modification occurs via non-linear processes when the highly focused light source interacts with the material. This modification is confined to the focal volume and is highly reproducible. In order to select the best inscription parameters, combinations of different inscription parameters were tested, using three fs laser systems with different operating properties, on a variety of materials. This facilitated the understanding of the key characteristics of the produced structures with the aim of producing viable OCT-phantoms. Finally, OCT-phantoms were successfully designed and fabricated in fused silica. The use of these phantoms to characterise many properties (resolution, distortion, sensitivity decay, scan linearity) of an OCT system was demonstrated. Quantitative methods were developed to support the characterisation of an OCT system collecting images from phantoms and also to improve the quality of the OCT images. Characterisation methods include the measurement of the spatially variant resolution (point spread function (PSF) and modulation transfer function (MTF)), sensitivity and distortion. Processing of OCT data is a computationally intensive process. Standard central processing unit (CPU) based processing might take several minutes to a few hours to process acquired data, thus data processing is a significant bottleneck. An alternative choice is to use expensive hardware-based processing such as field programmable gate arrays (FPGAs). However, recently graphics processing unit (GPU) based data processing methods have been developed to minimize this data processing and rendering time.
These processing techniques include standard processing methods, which comprise a set of algorithms to process the raw data (interference) obtained by the detector and generate A-scans. The work presented here describes accelerated data processing and post-processing techniques for OCT systems. The GPU-based processing developed during the PhD was later implemented in a custom-built Fourier domain optical coherence tomography (FD-OCT) system. This system currently processes and renders data in real time. Processing throughput of this system is currently limited by the camera capture rate. OCT-phantoms have been heavily used for the qualitative characterization and adjustment/fine tuning of the operating conditions of the OCT system. Currently, investigations are under way to characterize OCT systems using our phantoms. The work presented in this thesis demonstrates several novel techniques for fabricating OCT-phantoms and accelerating OCT data processing using GPUs. In the process of developing phantoms and quantitative methods, a thorough understanding and practical knowledge of OCT and fs laser processing systems was developed. This understanding leads to several novel pieces of research that are not only relevant to OCT but have broader importance. For example, extensive understanding of the properties of fs-inscribed structures will be useful in other photonic applications such as the making of phase masks, waveguides and microfluidic channels. Acceleration of data processing with GPUs is also useful in other fields.
Abstract:
The focus of this study is the development of parallelised versions of severely sequential and iterative numerical algorithms on a multi-threaded parallel platform such as a graphics processing unit. This requires design and development of a platform-specific numerical solution that can benefit from the parallel capabilities of the chosen platform. A graphics processing unit was chosen as the parallel platform for design and development of a numerical solution for a specific physical model in non-linear optics. This problem appears in describing ultra-short pulse propagation in bulk transparent media, which has recently been the subject of several theoretical and numerical studies. The mathematical model describing this phenomenon is a challenging and complex problem, and its numerical modeling is limited on current workstations. Numerical modeling of this problem requires parallelisation of essentially serial algorithms and elimination of numerical bottlenecks. The main challenge to overcome is parallelisation of the globally non-local mathematical model. This thesis presents a numerical solution for elimination of the numerical bottleneck associated with the non-local nature of the mathematical model. The accuracy and performance of the parallel code are assessed by back-to-back testing against a similar serial version.
Abstract:
Femtosecond laser microfabrication has emerged over the last decade as a 3D flexible technology in photonics. Numerical simulations provide an important insight into spatial and temporal beam and pulse shaping during the course of extremely intricate nonlinear propagation (see e.g. [1,2]). Electromagnetics of such propagation is typically described in the form of the generalized Non-Linear Schrödinger Equation (NLSE) coupled with the Drude model for plasma [3]. In this paper we consider a multi-threaded parallel numerical solution for a specific model which describes femtosecond laser pulse propagation in transparent media [4, 5]. However, our approach can be extended to similar models. The numerical code is implemented on an NVIDIA Graphics Processing Unit (GPU), which provides an efficient hardware platform for multi-threaded computing. We compare the performance of the parallel code described below, implemented for GPU using the CUDA programming interface [3], with a serial CPU version used in our previous papers [4,5]. © 2011 IEEE.
Abstract:
We compared reading acquisition in English and Italian children up to late primary school analyzing RTs and errors as a function of various psycholinguistic variables and changes due to experience. Our results show that reading becomes progressively more reliant on larger processing units with age, but that this is modulated by consistency of the language. In English, an inconsistent orthography, reliance on larger units occurs earlier on and it is demonstrated by faster RTs, a stronger effect of lexical variables and lack of length effect (by fifth grade). However, not all English children are able to master this mode of processing yielding larger inter-individual variability. In Italian, a consistent orthography, reliance on larger units occurs later and it is less pronounced. This is demonstrated by larger length effects which remain significant even in older children and by larger effects of a global factor (related to speed of orthographic decoding) explaining changes of performance across ages. Our results show the importance of considering not only overall performance, but inter-individual variability and variability between conditions when interpreting cross-linguistic differences.
Abstract:
The perception of an object as a single entity within a visual scene requires that its features are bound together and segregated from the background and/or other objects. Here, we used magnetoencephalography (MEG) to assess the hypothesis that coherent percepts may arise from the synchronized high frequency (gamma) activity between neurons that code features of the same object. We also assessed the role of low frequency (alpha, beta) activity in object processing. The target stimulus (i.e. object) was a small patch of a concentric grating of 3c/°, viewed eccentrically. The background stimulus was either a blank field or a concentric grating of 3c/° periodicity, viewed centrally. With patterned backgrounds, the target stimulus emerged--through rotation about its own centre--as a circular subsection of the background. Data were acquired using a 275-channel whole-head MEG system and analyzed using Synthetic Aperture Magnetometry (SAM), which allows one to generate images of task-related cortical oscillatory power changes within specific frequency bands. Significant oscillatory activity across a broad range of frequencies was evident at the V1/V2 border, and subsequent analyses were based on a virtual electrode at this location. When the target was presented in isolation, we observed that: (i) contralateral stimulation yielded a sustained power increase in gamma activity; and (ii) both contra- and ipsilateral stimulation yielded near identical transient power changes in alpha (and beta) activity. When the target was presented against a patterned background, we observed that: (i) contralateral stimulation yielded an increase in high-gamma (>55 Hz) power together with a decrease in low-gamma (40-55 Hz) power; and (ii) both contra- and ipsilateral stimulation yielded a transient decrease in alpha (and beta) activity, though the reduction tended to be greatest for contralateral stimulation. 
The opposing power changes across different regions of the gamma spectrum with 'figure/ground' stimulation suggest a possible dual role for gamma rhythms in visual object coding, and provide general support of the binding-by-synchronization hypothesis. As the power changes in alpha and beta activity were largely independent of the spatial location of the target, however, we conclude that their role in object processing may relate principally to changes in visual attention.
Abstract:
In a group of adult dyslexics word reading and, especially, word spelling are predicted more by what we have called lexical learning (tapped by a paired-associate task with pictures and written nonwords) than by phonological skills. Nonword reading and spelling, instead, are not associated with this task but they are predicted by phonological tasks. Consistently, surface and phonological dyslexics show opposite profiles on lexical learning and phonological tasks. The phonological dyslexics are more impaired on the phonological tasks, while the surface dyslexics are equally or more impaired on the lexical learning tasks. Finally, orthographic lexical learning explains more variation in spelling than in reading, and subtyping based on spelling returns more interpretable results than that based on reading. These results suggest that the quality of lexical representations is crucial to adult literacy skills. This is best measured by spelling and best predicted by a task of lexical learning. We hypothesize that lexical learning taps a uniquely human capacity to form new representations by recombining the units of a restricted set.
Abstract:
In this paper we propose an alternative method for measuring the efficiency of decision making units, which allows the presence of variables with both negative and positive values. The model is applied to data on a notional effluent processing system to compare the results with recently developed methods: the Modified Slacks Based Model suggested by Sharp et al (2007) and the Range Directional Measures developed by Silva Portela et al (2004). A further example explores the advantages of using the new model.
Abstract:
Data Envelopment Analysis (DEA) is a nonparametric method for measuring the efficiency of a set of decision making units such as firms or public sector agencies, first introduced into the operational research and management science literature by Charnes, Cooper, and Rhodes (CCR) [Charnes, A., Cooper, W.W., Rhodes, E., 1978. Measuring the efficiency of decision making units. European Journal of Operational Research 2, 429–444]. The original DEA models were applicable only to technologies characterized by positive inputs/outputs. In subsequent literature there have been various approaches to enable DEA to deal with negative data. In this paper, we propose a semi-oriented radial measure, which permits the presence of variables which can take both negative and positive values. The model is applied to data on a notional effluent processing system to compare the results with those yielded by two alternative methods for dealing with negative data in DEA: The modified slacks-based model suggested by Sharp et al. [Sharp, J.A., Liu, W.B., Meng, W., 2006. A modified slacks-based measure model for data envelopment analysis with ‘natural’ negative outputs and inputs. Journal of Operational Research Society 57 (11) 1–6] and the range directional model developed by Portela et al. [Portela, M.C.A.S., Thanassoulis, E., Simpson, G., 2004. A directional distance approach to deal with negative data in DEA: An application to bank branches. Journal of Operational Research Society 55 (10) 1111–1121]. A further example explores the advantages of using the new model.
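As a small illustration of why the original DEA models assume positive data, the sketch below computes efficiency in the simplest special case only: one input, one output, constant returns to scale, where the CCR score reduces to each unit's output/input ratio divided by the best ratio observed. It is not the semi-oriented radial measure proposed in the abstract, which requires solving linear programmes; the function name and the sample figures are hypothetical.

```python
def ccr_efficiency_single(inputs, outputs):
    """Efficiency of each decision making unit in the single-input,
    single-output, constant-returns-to-scale case.

    Each score is (output / input) divided by the best ratio in the set,
    so the most productive unit scores 1.0."""
    if not inputs or len(inputs) != len(outputs):
        raise ValueError("inputs and outputs must be non-empty and equal in length")
    # A ratio score has no sensible meaning for zero or negative values,
    # which mirrors the positivity restriction of the classical models.
    if any(x <= 0 for x in inputs) or any(y <= 0 for y in outputs):
        raise ValueError("the ratio model assumes strictly positive data")
    ratios = [y / x for x, y in zip(inputs, outputs)]
    best = max(ratios)
    return [r / best for r in ratios]


# Three hypothetical units: inputs 2, 4, 5 and outputs 4, 6, 5.
scores = ccr_efficiency_single([2, 4, 5], [4, 6, 5])
```

A unit with a negative output would receive a negative ratio, which cannot be read as a proportion of best practice; handling such variables is the gap the negative-data approaches compared in the abstract are meant to close.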
Abstract:
How does nearby motion affect the perceived speed of a target region? When a central drifting Gabor patch is surrounded by translating noise, its speed can be misperceived over a fourfold range. Typically, when a surround moves in the same direction, perceived centre speed is reduced; for opposite-direction surrounds it increases. Measuring this illusion for a variety of surround properties reveals that the motion context effects are a saturating function of surround speed (Experiment I) and contrast (Experiment II). Our analyses indicate that the effects are consistent with a subtractive process, rather than with speed being averaged over area. In Experiment III we exploit known properties of the motion system to ask where these surround effects impact. Using 2D plaid stimuli, we find that surround-induced shifts in perceived speed of one plaid component produce substantial shifts in perceived plaid direction. This indicates that surrounds exert their influence early in processing, before pattern motion direction is computed. These findings relate to ongoing investigations of surround suppression for direction discrimination, and are consistent with single-cell findings of direction-tuned suppressive and facilitatory interactions in primary visual cortex (V1).
Abstract:
The following thesis presents results obtained from both numerical simulation and laboratory experimentation (both of which were carried out by the author). When data is propagated along an optical transmission line, timing irregularities such as timing jitter and phase wander can occur. Traditionally these timing problems would have been corrected by converting the optical signal into the electrical domain and then compensating for the timing irregularity before converting the signal back into the optical domain. However, this thesis proposes a potential solution to the problem that remains completely in the optical domain, eliminating the need for electronics. This is desirable as not only does optical processing reduce the latency associated with its electronic counterpart, it also holds the possibility of an increase in overall speed. A scheme was proposed which utilises the principle of wavelength conversion to dynamically convert timing irregularities (timing jitter and phase wander) into a change in wavelength (this occurs on a bit-by-bit level and so timing jitter and phase wander can be compensated for simultaneously). This was achieved by optically sampling a linearly chirped, locally generated clock source (the sampling function was achieved using a nonlinear optical loop mirror). The data, now with each bit or code word having a unique wavelength, is then propagated through a dispersion compensation module. The dispersion compensation effectively re-aligns the data in time, and thus the timing irregularities are removed. The principle of operation was tested using computer simulation before being re-tested in a laboratory environment. A second stage was added to the device to create 3R regeneration. The second stage simply converts the data, with its timing irregularities suppressed, back into a single wavelength. By controlling the relative timing displacement between stage one and stage two, the wavelength that is finally produced can be controlled.
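The retiming principle described above can be sketched as a first-order model: a bit arriving with timing offset dt samples the chirped clock at wavelength centre + chirp * dt, and a dispersive element then delays it by dispersion * (wavelength - centre). When dispersion equals -1 / chirp the two terms cancel. The function name and every numeric value below are hypothetical and chosen only to make the arithmetic easy to follow; the model ignores pulse shape, noise and higher-order dispersion.

```python
def retime(arrival_offsets_ps, chirp_nm_per_ps, dispersion_ps_per_nm,
           centre_nm=1550.0):
    """Return (wavelength_nm, output_offset_ps) for each arriving bit.

    arrival_offsets_ps   -- timing error of each bit relative to its slot
    chirp_nm_per_ps      -- slope of the linearly chirped clock
    dispersion_ps_per_nm -- total dispersion of the compensation module
    """
    results = []
    for dt in arrival_offsets_ps:
        # Stage 1: the timing error is mapped onto a wavelength shift.
        wavelength = centre_nm + chirp_nm_per_ps * dt
        # Stage 2: dispersion turns that wavelength shift back into a delay.
        delay = dispersion_ps_per_nm * (wavelength - centre_nm)
        results.append((wavelength, dt + delay))
    return results


# Hypothetical values: 0.5 nm/ps chirp matched by -2 ps/nm of dispersion.
retimed = retime([-3.0, 0.0, 2.0], chirp_nm_per_ps=0.5, dispersion_ps_per_nm=-2.0)
```

With these matched values every bit leaves with zero residual offset but a different wavelength, which is why the abstract adds a second stage to convert the data back to a single wavelength.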
Abstract:
One of the major problems associated with communication via a loudspeaking telephone (LST) is that, using analogue processing, duplex transmission is limited to low-loss lines and produces a low acoustic output. An architecture for an instrument has been developed and tested, which uses digital signal processing to provide duplex transmission between an LST and a telephone handset over most of the B.T. network. Digital adaptive filters are used in the duplex LST to cancel coupling between the loudspeaker and microphone, and across the transmit to receive paths of the 2-to-4-wire converter. Normal movement of a person in the acoustic path causes a loss of stability by increasing the level of coupling from the loudspeaker to the microphone, since there is a lag associated with the adaptive filters learning about a non-stationary path. Control of the loop stability and the level of sidetone heard by the handset user is by a microprocessor, which continually monitors the system and regulates the gain. The result is a system which offers the best compromise available based on a set of measured parameters. A theory has been developed which gives the loop stability requirements based on the error between the parameters of the filter and those of the unknown path. The programme to develop a low-cost adaptive filter for the LST produced a unique architecture which has a number of features not available in any similar system. These include automatic compensation for the rate of adaptation over a 36 dB range of output level, 4 rates of adaptation (with a maximum of 465 dB/s), plus the ability to cascade up to 4 filters without loss of performance. A complex theory has been developed to determine the adaptation which can be achieved using finite-precision arithmetic. This enabled the development of an architecture which distributed the normalisation required to achieve the optimum rate of adaptation over the useful input range.
Comparison of theory and measurement for the adaptive filter shows very close agreement. A single experimental LST was built and tested on connections to handset telephones over the BT network. The LST demonstrated that duplex transmission was feasible using signal processing and produced a more comfortable means of communication between people than methods employing deep voice-switching to regulate the local-loop gain. However, with the current level of processing power, it is not a panacea, and attention must be directed toward the physical acoustic isolation between loudspeaker and microphone.
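The idea of an adaptive filter learning the loudspeaker-to-microphone path and subtracting its estimate can be sketched with a normalised least-mean-squares (NLMS) update. The abstract does not name the adaptation algorithm, so NLMS is an assumption here, chosen because it normalises the step size by input power; the function names, tap count, step size and echo path are all hypothetical, and the sketch uses floating point rather than the finite-precision arithmetic the thesis analyses.

```python
import random


def nlms_echo_cancel(far_end, microphone, taps=8, mu=0.5, eps=1e-8):
    """Adapt an FIR estimate of the echo path and return (weights, residual).

    far_end    -- samples driving the loudspeaker
    microphone -- samples picked up by the microphone (echo plus any speech)
    """
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, microphone):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        error = d - estimate  # what remains after echo subtraction
        # Normalising by input power keeps the adaptation rate
        # roughly independent of signal level.
        power = sum(h * h for h in history) + eps
        weights = [w + mu * error * h / power
                   for w, h in zip(weights, history)]
        residual.append(error)
    return weights, residual


def simulate_echo(far_end, path):
    """Convolve the far-end signal with a fixed (hypothetical) echo path."""
    return [
        sum(path[k] * far_end[i - k] for k in range(len(path)) if i - k >= 0)
        for i in range(len(far_end))
    ]


def demo(seed=0, samples=2000):
    """Run the canceller on white noise through a 3-tap hypothetical path."""
    rng = random.Random(seed)
    far_end = [rng.gauss(0.0, 1.0) for _ in range(samples)]
    microphone = simulate_echo(far_end, [0.6, -0.3, 0.1])
    return nlms_echo_cancel(far_end, microphone)
```

In this noise-free, stationary demo the weights settle on the echo path; if the path changes, as when a person moves near the loudspeaker, the filter needs time to re-converge, which is the lag behind the stability problem described in the abstract.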
Abstract:
Grafting of antioxidants and other modifiers onto polymers by reactive extrusion has been performed successfully by the Polymer Processing and Performance Group at Aston University. Traditionally the optimum conditions for the grafting process have been established within a Brabender internal mixer. Transfer of this batch process to a continuous processor, such as an extruder, has typically been empirical. To have more confidence in the success of direct transfer of the process requires knowledge of, and comparison between, residence times, mixing intensities, shear rates and flow regimes in the internal mixer and in the continuous processor. The continuous processor chosen for the current work is the closely intermeshing, co-rotating twin-screw extruder (CICo-TSE). CICo-TSEs contain screw elements that convey material with a self-wiping action and are widely used for polymer compounding and blending. Of the different mixing modules contained within the CICo-TSE, the trilobal elements, which impose intensive mixing, and the mixing discs, which impose extensive mixing, are of importance when establishing the intensity of mixing. In this thesis, the flow patterns within the various regions of the single-flighted conveying screw elements and within both the trilobal element and mixing disc zones of a Betol BTS40 CICo-TSE have been modelled using the computational fluid dynamics package Polyflow. A major obstacle encountered when solving the flow problem within all of these sets of elements arises from both the complex geometry and the time-dependent flow boundaries as the elements rotate about their fixed axes. Simulation of the time-dependent boundaries was overcome by selecting a number of sequential 2D and 3D geometries, used to represent partial mixing cycles.
The flow fields were simulated using the ideal rheological properties of polypropylene and characterised in terms of velocity vectors, shear stresses generated and a parameter known as the mixing efficiency. The majority of the large 3D simulations were performed on the Cray J90 supercomputer situated at the Rutherford-Appleton laboratories, with pre- and postprocessing operations achieved via a Silicon Graphics Indy workstation. A mechanical model was constructed consisting of various CICo-TSE elements rotating within a transparent outer barrel. A technique has been developed using coloured viscous clays whereby the flow patterns and mixing characteristics within the CICo-TSE may be visualised. In order to test and verify the simulated predictions, the patterns observed within the mechanical model were compared with the flow patterns predicted by the computational model. The flow patterns within the single-flighted conveying screw elements in particular, showed good agreement between the experimental and simulated results.
Abstract:
This thesis describes the design and engineering of a pressurised biomass gasification test facility. A detailed examination of the major elements within the plant has been undertaken in relation to specification of equipment, evaluation of options and final construction. The retrospective project assessment was developed from consideration of relevant literature and theoretical principles. The literature review includes a discussion on legislation and applicable design codes. From this analysis, each of the necessary equipment units was reviewed and important design decisions and procedures highlighted and explored. Particular emphasis was placed on examination of the stringent demands of the ASME VIII design codes. The inter-relationship of functional units was investigated and areas of deficiency, such as biomass feeders and gas cleaning, have been commented upon. Finally, plant costing was summarized in relation to the plant design and proposed experimental programme. The main conclusion drawn from the study is that pressurised gasification of biomass is far more difficult and expensive to support than atmospheric gasification. A number of recommendations have been made regarding future work in this area.
Abstract:
The aim of this Interdisciplinary Higher Degrees project was the development of a high-speed method of photometrically testing vehicle headlamps, based on the use of image processing techniques, for Lucas Electrical Limited. Photometric testing involves measuring the illuminance produced by a lamp at certain points in its beam distribution. Headlamp performance is best represented by an iso-lux diagram, showing illuminance contours, produced from a two-dimensional array of data. Conventionally, the tens of thousands of measurements required are made using a single stationary photodetector and a two-dimensional mechanical scanning system which enables a lamp's horizontal and vertical orientation relative to the photodetector to be changed. Even using motorised scanning and computerised data-logging, the data acquisition time for a typical iso-lux test is about twenty minutes. A detailed study was made of the concept of using a video camera and a digital image processing system to scan and measure a lamp's beam without the need for the time-consuming mechanical movement. Although the concept was shown to be theoretically feasible, and a prototype system designed, it could not be implemented because of the technical limitations of commercially-available equipment. An alternative high-speed approach was developed, however, and a second prototype system designed. The proposed arrangement again uses an image processing system, but in conjunction with a one-dimensional array of photodetectors and a one-dimensional mechanical scanning system in place of a video camera. This system can be implemented using commercially-available equipment and, although not entirely eliminating the need for mechanical movement, greatly reduces the amount required, resulting in a predicted data acquisition time of about twenty seconds for a typical iso-lux test. As a consequence of the work undertaken, the company initiated an 80,000 programme to implement the system proposed by the author.