2 results for General-purpose computing on graphics processing units (GPGPU)
in Glasgow Theses Service
Abstract:
Processors with large numbers of cores are becoming commonplace. To utilise the available resources in such systems, the programming paradigm has to move towards increased parallelism. However, increased parallelism does not necessarily lead to better performance. Parallel programming models have to provide not only flexible ways of defining parallel tasks, but also efficient methods to manage the created tasks. Moreover, in a general-purpose system, applications residing in the system compete for the shared resources. Thread and task scheduling in such a multiprogrammed multithreaded environment is a significant challenge. In this thesis, we introduce a new task-based parallel reduction model, called the Glasgow Parallel Reduction Machine (GPRM). Our main objective is to provide high performance while maintaining ease of programming. GPRM supports native parallelism; it provides a modular way of expressing parallel tasks and the communication patterns between them. Compiling a GPRM program results in an Intermediate Representation (IR) containing useful information about tasks and their dependencies, as well as initial mapping information. This compile-time information helps reduce the overhead of runtime task scheduling and is key to high performance. Generally speaking, the granularity and the number of tasks are major factors in achieving high performance. These factors are even more important in the case of GPRM, as it is highly dependent on tasks, rather than threads. We use three basic benchmarks to provide a detailed comparison of GPRM with Intel OpenMP, Cilk Plus, and Threading Building Blocks (TBB) on the Intel Xeon Phi, and with GNU OpenMP on the Tilera TILEPro64. GPRM shows superior performance in almost all cases, simply by controlling the number of tasks. GPRM also provides a low-overhead mechanism, called “Global Sharing”, which improves performance in multiprogramming situations. 
We use OpenMP, the most popular model for shared-memory parallel programming, as the main competitor to GPRM for solving three well-known problems on both platforms: LU factorisation of Sparse Matrices, Image Convolution, and Linked List Processing. We focus on proposing solutions that best fit GPRM’s model of execution. GPRM outperforms OpenMP in all cases on the TILEPro64. On the Xeon Phi, our solution for the LU Factorisation yields a notable performance improvement for sparse matrices with large numbers of small blocks. We investigate the overhead of GPRM’s task creation and distribution for very short computations using the Image Convolution benchmark. We show that this overhead can be mitigated by combining smaller tasks into larger ones. As a result, GPRM can outperform OpenMP for convolving large 2D matrices on the Xeon Phi. Finally, we demonstrate that our parallel worksharing construct provides an efficient solution for Linked List processing and performs better than OpenMP implementations on the Xeon Phi. The results are very promising, as they verify that our parallel programming framework for manycore processors is flexible and scalable, and can provide high performance without sacrificing productivity.
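The idea of combining smaller tasks into larger ones can be illustrated with a minimal sketch. This is not GPRM code and does not reproduce the thesis's implementation: the function names (`convolve_row`, `chunk_rows`, `convolve`) and the `n_tasks` parameter are hypothetical, and Python's standard thread pool stands in for a task runtime. It only shows how the number of tasks can be controlled independently of the amount of work, by grouping per-row computations of a 2D convolution into a chosen number of contiguous blocks.

```python
from concurrent.futures import ThreadPoolExecutor


def convolve_row(image, kernel, r):
    """Compute one output row (valid region, no padding).

    The kernel is applied without flipping, which is equivalent to a
    convolution for symmetric kernels.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_w = len(image[0]) - kw + 1
    return [
        sum(image[r + i][c + j] * kernel[i][j]
            for i in range(kh) for j in range(kw))
        for c in range(out_w)
    ]


def chunk_rows(n_rows, n_tasks):
    """Split range(n_rows) into at most n_tasks contiguous, near-equal blocks."""
    n_tasks = max(1, min(n_tasks, n_rows))
    base, extra = divmod(n_rows, n_tasks)
    blocks, start = [], 0
    for t in range(n_tasks):
        size = base + (1 if t < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


def convolve(image, kernel, n_tasks, n_workers=4):
    """Convolve `image` using `n_tasks` coarse tasks instead of one per row."""
    out_h = len(image) - len(kernel) + 1

    def run_block(rows):
        # One coarse task: all rows of a block are processed sequentially.
        return [convolve_row(image, kernel, r) for r in rows]

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        parts = pool.map(run_block, chunk_rows(out_h, n_tasks))
        return [row for part in parts for row in part]
```

With `n_tasks` equal to the number of output rows, every row is its own task; with a smaller `n_tasks`, the same work is distributed as fewer, larger tasks, which is the kind of granularity control the abstract describes. The sketch says nothing about actual speedups on the Xeon Phi or the TILEPro64.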
Abstract:
It is well known that self-generated stimuli are processed differently from externally generated stimuli. For example, many people have noticed since childhood that it is very difficult to tickle oneself. In the auditory domain, self-generated sounds elicit smaller brain responses than externally generated sounds, a phenomenon known as the sensory attenuation (SA) effect. SA is manifested in reduced amplitudes of evoked responses as measured through M/EEG, decreased firing rates of neurons and a lower level of perceived loudness for self-generated sounds. The predominant explanation for SA is based on the idea that self-generated stimuli are predicted (e.g., the forward model account). It is the nature of their predictability that is crucial for SA. In contrast, the sensory gating account emphasizes a general suppressive effect of actions on sensory processing, regardless of the predictability of the stimuli. Both accounts have received empirical support, which suggests that both mechanisms may exist. In chapter 2, three behavioural studies concerning the influence of motor activation on auditory perception were presented. Study 1 compared the effect of SA and attention in an auditory detection task and showed that SA was present even when substantial attention was paid to unpredictable stimuli. Study 2 compared the loudness perception of tones generated by others between Chinese and British participants. Compared to externally generated tones, tones generated by others were perceived as less loud by Chinese but not by British participants. In study 3, partial evidence was found that auditory detection performance was impaired even when participants merely read words related to action. In chapter 3, the classic SA effect of M100 suppression was replicated with MEG in study 4. With time-frequency analysis, a potential neural information processing sequence was found in auditory cortex. 
Prior to the onset of self-generated tones, there was an increase of oscillatory power in the alpha band. After the stimulus onset, reduced gamma power and alpha/beta phase locking were found. The three temporally segregated oscillatory events correlated with each other and with the SA effect, and may be the underlying neural implementation of SA. In chapter 4, a TMS-MEG study was presented investigating the role of the cerebellum in adapting to delayed presentation of self-generated tones (study 5). It demonstrated that in the sham stimulation condition, the brain can adapt to the delay (about 100 ms) within 300 trials of learning, as shown by a significant increase of the SA effect in the suppression of the M100, but not the M200, component. In contrast, after stimulating the cerebellum with a suppressive TMS protocol, the adaptation in M100 suppression disappeared and the pattern of M200 suppression reversed to M200 enhancement. These data support the idea that the suppressive effect of actions on auditory processing is a consequence of both motor-driven sensory predictions and general sensory gating. The results also demonstrate the importance of neural oscillations in implementing the SA effect and the critical role of the cerebellum in learning sensory predictions under sensory perturbation.