361 results for accelerators
Abstract:
Graphics Processing Units (GPUs) are becoming popular accelerators in modern High-Performance Computing (HPC) clusters. Installing GPUs on every node of the cluster is not efficient, resulting in high costs and power consumption as well as underutilisation of the accelerators. The research reported in this paper is motivated by the use of a few physical GPUs, providing cluster nodes with on-demand access to remote GPUs for a financial risk application. We hypothesise that sharing GPUs between several nodes, referred to as multi-tenancy, reduces the execution time and energy consumed by an application. Two data transfer modes between the CPU and the GPUs, namely concurrent and sequential, are explored. The key result from the experiments is that multi-tenancy with a few physical GPUs using sequential data transfers lowers the execution time and the energy consumed, thereby improving the overall performance of the application.
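The two transfer modes can be illustrated with a short sketch. The snippet below is a minimal illustration, not the paper's implementation: it assumes CuPy is available, takes "sequential" to mean one host-to-device copy after another on the default stream, and "concurrent" to mean copies issued on independent CUDA streams (true overlap additionally requires pinned host memory).

# Minimal sketch of sequential vs. concurrent CPU-to-GPU transfers.
# Hypothetical example using CuPy; not the implementation from the paper.
import numpy as np
import cupy as cp

chunks = [np.random.rand(1 << 20) for _ in range(4)]  # four host buffers

# Sequential mode: each transfer completes before the next one starts.
seq = [cp.asarray(c) for c in chunks]
cp.cuda.Device().synchronize()

# Concurrent mode: transfers are issued on independent streams so the
# driver may overlap them (pinned host memory is needed for real overlap).
streams = [cp.cuda.Stream(non_blocking=True) for _ in chunks]
conc = []
for c, s in zip(chunks, streams):
    with s:
        conc.append(cp.asarray(c))
for s in streams:
    s.synchronize()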
Abstract:
We report on the first demonstration of passive all-optical plasma lensing using a two-stage setup. An intense femtosecond laser accelerates electrons in a laser wakefield accelerator (LWFA) to 100 MeV over millimeter length scales. By adding a second gas target behind the initial LWFA stage we introduce a robust and independently tunable plasma lens. We observe a density-dependent reduction of the LWFA electron beam divergence from an initial value of 2.3 mrad down to 1.4 mrad (rms) when the plasma lens is in operation. Such a plasma lens provides a simple and compact approach for divergence reduction, well matched to the mm-scale length of the LWFA. The focusing forces are provided solely by the plasma and are driven by the bunch itself, making this a conceptually new and highly useful approach to electron beam focusing. Possible applications of this lens are not limited to laser plasma accelerators. Since no active driver is needed, the passive plasma lens is also suited for high-repetition-rate focusing of electron bunches. Understanding it is also required for modeling the evolution of the driving particle bunch in particle-driven wakefield acceleration.
Abstract:
Thesis (Master's)--University of Washington, 2016-08
Abstract:
In this paper, we develop a fast implementation of a hyperspectral coded aperture (HYCA) algorithm on different platforms using OpenCL, an open standard for parallel programming on heterogeneous systems, which encompasses a wide variety of devices, from dense multicore systems by major manufacturers such as Intel or ARM to new accelerators such as graphics processing units (GPUs), field-programmable gate arrays (FPGAs), the Intel Xeon Phi, and other custom devices. Our proposed implementation of HYCA significantly reduces its computational cost. Our experiments, conducted using simulated data, reveal considerable acceleration factors. Implementations of this kind, written in the same descriptive language for different architectures, are very important in order to realistically assess the potential of heterogeneous platforms for efficient hyperspectral image processing in real remote sensing missions.
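The portability idea can be sketched with a few lines of host code. The example below is a hypothetical illustration using pyopencl, not the HYCA implementation: the trivial "scale" kernel stands in for the HYCA inner loops, and the point is only that one kernel source runs unchanged on every detected device (CPU, GPU, or accelerator).

# Sketch of running one OpenCL kernel source unchanged on every available
# device. The kernel is a stand-in, not the HYCA kernels from the paper.
import numpy as np
import pyopencl as cl

SRC = """
__kernel void scale(__global const float *x, __global float *y, float a) {
    int i = get_global_id(0);
    y[i] = a * x[i];
}
"""

x = np.arange(1024, dtype=np.float32)
for platform in cl.get_platforms():
    for dev in platform.get_devices():
        ctx = cl.Context(devices=[dev])
        queue = cl.CommandQueue(ctx)
        prg = cl.Program(ctx, SRC).build()
        mf = cl.mem_flags
        xb = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=x)
        yb = cl.Buffer(ctx, mf.WRITE_ONLY, x.nbytes)
        prg.scale(queue, x.shape, None, xb, yb, np.float32(2.0))
        y = np.empty_like(x)
        cl.enqueue_copy(queue, y, yb)
        print(dev.name, np.allclose(y, 2.0 * x))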
Abstract:
Observations of H3+ in the Galactic diffuse interstellar medium (ISM) have led to various surprising results, including the conclusion that the cosmic-ray ionization rate (zeta_2) is about 1 order of magnitude larger than previously thought. The present survey expands the sample of diffuse cloud sight lines with H3+ observations to 50, with detections in 21 of those. Ionization rates inferred from these detections are in the range (1.7+-1.0)x10^-16 s^-1 < zeta_2 < (10.6+-6.8)x10^-16 s^-1, with a mean value of zeta_2=(3.3+-0.4)x10^-16 s^-1. Upper limits (3sigma) derived from non-detections of H3+ are as low as zeta_2 < 0.4x10^-16 s^-1. These low upper limits, in combination with the wide range of inferred cosmic-ray ionization rates, indicate variations in zeta_2 between different diffuse cloud sight lines. Calculations of the cosmic-ray ionization rate from theoretical cosmic-ray spectra require a large flux of low-energy (MeV) particles to reproduce the values inferred from observations. Given the relatively short range of low-energy cosmic rays (those most efficient at ionization), the proximity of a cloud to a site of particle acceleration may set its ionization rate. Variations in zeta_2 are thus likely due to variations in the cosmic-ray spectrum at low energies resulting from the effects of particle propagation. To test this theory, H3+ was observed in sight lines passing through diffuse molecular clouds known to be interacting with the supernova remnant IC 443, a probable site of particle acceleration. Where H3+ is detected, ionization rates of zeta_2=(20+-10)x10^-16 s^-1 are inferred, higher than for any other diffuse cloud. These results support both the concept that supernova remnants act as particle accelerators and the hypothesis that propagation effects are responsible for spatial variations in the cosmic-ray spectrum and ionization rate. Future observations of H3+ near other supernova remnants, and in sight lines where complementary ionization tracers (OH+, H2O+, H3O+) have been observed, will further our understanding of the subject.
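For context, ionization rates of this kind are usually inferred from the steady-state balance for H3+ in diffuse clouds: cosmic-ray ionization of H2 (which rapidly produces H3+) is balanced by dissociative recombination with electrons. The relation below is a schematic of that standard argument, with k_e the H3+ dissociative recombination rate coefficient and x_e = n(e)/n_H the electron fraction; the survey's detailed treatment may differ.

\zeta_2\, n(\mathrm{H}_2) = k_e\, n(e)\, n(\mathrm{H}_3^+)
\quad\Longrightarrow\quad
\zeta_2 = k_e\, x_e\, n_{\mathrm{H}}\, \frac{N(\mathrm{H}_3^+)}{N(\mathrm{H}_2)}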
Abstract:
Reconfigurable platforms are a promising technology that offers an interesting trade-off between flexibility and performance, which many recent embedded system applications demand, especially in fields such as multimedia processing. These applications typically involve multiple ad-hoc tasks for hardware acceleration, which are usually represented using formalisms such as Data Flow Diagrams (DFDs), Data Flow Graphs (DFGs), Control and Data Flow Graphs (CDFGs) or Petri Nets. However, none of these models is able to capture at the same time the pipeline behavior between tasks (which can therefore coexist in order to minimize the application execution time), their communication patterns, and their data dependencies. This paper shows that knowledge of all this information can be effectively exploited to reduce the resource requirements and improve the timing performance of modern reconfigurable systems, where a set of hardware accelerators is used to support the computation. For this purpose, this paper proposes a novel task representation model, named Temporal Constrained Data Flow Diagram (TCDFD), which includes all this information. This paper also presents a mapping-scheduling algorithm that takes advantage of the new TCDFD model. It aims to minimize the dynamic reconfiguration overhead while meeting the communication requirements among the tasks. Experimental results show that the presented approach achieves resource savings of up to 75% and reconfiguration overhead reductions of up to 89% with respect to other state-of-the-art techniques for reconfigurable platforms.
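The abstract does not spell out what a TCDFD node contains, but the kind of per-task information it describes (pipelining, communication patterns, data dependencies, reconfiguration cost) can be sketched as a data structure. All field and task names below are invented for illustration and are not taken from the paper.

# Hypothetical sketch of the information a TCDFD-style task node might
# carry; names are illustrative only, not the paper's model.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Task:
    name: str
    exec_time: float                        # worst-case execution time
    reconfig_time: float                    # bitstream reconfiguration cost
    deps: List[str] = field(default_factory=list)             # data dependencies
    pipelined_with: List[str] = field(default_factory=list)   # tasks it may overlap
    comm_bytes: Dict[str, int] = field(default_factory=dict)  # traffic per dependency

# A toy two-stage multimedia pipeline: "decode" feeds "filter", and the
# two stages may run concurrently on different reconfigurable regions.
decode = Task("decode", exec_time=4.0, reconfig_time=1.5)
filt = Task("filter", exec_time=3.0, reconfig_time=1.2,
            deps=["decode"], pipelined_with=["decode"],
            comm_bytes={"decode": 1 << 20})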
Abstract:
At the HL-LHC, proton bunches will cross each other every 25 ns, producing an average of 140 pp collisions per bunch crossing. To operate in such an environment, the CMS experiment will need an L1 hardware trigger able to identify interesting events within a latency of 12.5 μs. The future L1 trigger will also make use of data coming from the silicon tracker to control the trigger rate. The architecture that will be used in the future to process tracker data is still under discussion. One interesting proposal makes use of the Time Multiplexed Trigger concept, already implemented in the CMS calorimeter trigger for the Phase I trigger upgrade. The proposed track finding algorithm is based on the Hough Transform method. The algorithm has been tested using simulated pp-collision data, and the results show very good tracking efficiency. The algorithm will be demonstrated in hardware in the coming months using the MP7, a μTCA board with a powerful FPGA capable of handling data rates approaching 1 Tb/s.
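As a rough, generic illustration of Hough-transform track finding (a numpy sketch, not the MP7 firmware): each tracker hit, or stub, at radius r and azimuth phi votes for all (phi0, kappa) track hypotheses consistent with a linearised track model phi ≈ phi0 + kappa*r, and real tracks appear as peaks in the vote accumulator. Binnings, parameter ranges, and the toy stubs are arbitrary choices for this sketch.

# Generic Hough-transform track-finding sketch; not the CMS firmware.
import numpy as np

def hough_tracks(stubs, n_phi0=64, n_kappa=21,
                 phi0_range=(-0.1, 0.1), kappa_range=(-0.01, 0.01)):
    acc = np.zeros((n_phi0, n_kappa), dtype=np.int32)
    kappas = np.linspace(kappa_range[0], kappa_range[1], n_kappa)
    for r, phi in stubs:
        phi0 = phi - kappas * r          # phi0 implied by each kappa bin
        bins = np.floor((phi0 - phi0_range[0]) /
                        (phi0_range[1] - phi0_range[0]) * n_phi0).astype(int)
        ok = (bins >= 0) & (bins < n_phi0)
        acc[bins[ok], np.nonzero(ok)[0]] += 1   # one vote per kappa bin
    return acc

# Six toy stubs from one track with phi0 = 0.02 and kappa = 0.003: all six
# votes pile up in a single accumulator cell, which argmax recovers.
stubs = [(r, 0.02 + 0.003 * r) for r in (25., 35., 50., 68., 88., 108.)]
acc = hough_tracks(stubs)
print(np.unravel_index(acc.argmax(), acc.shape), acc.max())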