857 results for Federal High Performance Computing Program (U.S.)
Abstract:
The objective of this PhD research program is to investigate numerical methods for simulating variably-saturated flow and sea water intrusion in coastal aquifers in a high-performance computing environment. The work is divided into three overlapping tasks: to develop an accurate and stable finite volume discretisation and numerical solution strategy for the variably-saturated flow and salt transport equations; to implement the chosen approach in a high-performance computing environment that may have multiple GPUs or CPU cores; and to verify and test the implementation. The geological description of aquifers is often complex, with porous materials possessing highly variable properties that are best described using unstructured meshes. The finite volume method is a popular method for the solution of the conservation laws that describe sea water intrusion, and is well-suited to unstructured meshes. In this work we apply a control volume-finite element (CV-FE) method to an extension of a recently proposed formulation (Kees and Miller, 2002) for variably-saturated groundwater flow. The CV-FE method evaluates fluxes at points where material properties and gradients in pressure and concentration are consistently defined, making it both suitable for heterogeneous media and mass conservative. Using the method of lines, the CV-FE discretisation gives a set of differential algebraic equations (DAEs) amenable to solution using higher-order implicit solvers. Heterogeneous computer systems that use a combination of computational hardware, such as CPUs and GPUs, are attractive for scientific computing due to the potential advantages offered by GPUs for accelerating data-parallel operations. We present a C++ library that implements data-parallel methods on both CPUs and GPUs. The finite volume discretisation is expressed in terms of these data-parallel operations, which gives an efficient implementation of the nonlinear residual function.
This makes the implicit solution of the DAE system possible on the GPU, because the inexact Newton-Krylov method used by the implicit time stepping scheme can approximate the action of a matrix on a vector using residual evaluations. We also propose preconditioning strategies that are amenable to GPU implementation, so that all computationally intensive aspects of the implicit time stepping scheme are implemented on the GPU. Results are presented that demonstrate the efficiency and accuracy of the proposed numerical methods and formulation. The formulation offers excellent conservation of mass, and higher-order temporal integration increases both the numerical efficiency and the accuracy of the solutions. Flux limiting produces accurate, oscillation-free solutions on coarse meshes, whereas much finer meshes are required to obtain solutions of equivalent accuracy using upstream weighting. The computational efficiency of the software is investigated using CPUs and GPUs on a high-performance workstation. The GPU version offers considerable speedup over the CPU version, with one GPU giving a speedup factor of 3 over the eight-core CPU implementation.
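The statement that a Newton-Krylov method can approximate the action of a matrix on a vector using residual evaluations can be illustrated with a forward difference, J(u)v ≈ (F(u + εv) − F(u))/ε. The sketch below is a minimal Python illustration, not the thesis's C++ library: the two-unknown `residual` function, the name `jacobian_free_matvec`, and the choice of ε are all assumptions made for the example.

```python
import math


def residual(u):
    """Hypothetical nonlinear residual F(u) for a tiny two-unknown system."""
    x, y = u
    return [x * x + y - 3.0, x + y * y - 5.0]


def jacobian_free_matvec(F, u, v, eps=None):
    """Approximate J(u) @ v using only two residual evaluations.

    Uses the forward difference (F(u + eps*v) - F(u)) / eps, so the
    Jacobian matrix J is never assembled or stored.
    """
    norm_v = math.sqrt(sum(vi * vi for vi in v))
    if norm_v == 0.0:
        return [0.0] * len(u)
    if eps is None:
        # A common scaling (an assumption here): sqrt(machine epsilon),
        # adjusted by the sizes of u and v.
        norm_u = math.sqrt(sum(ui * ui for ui in u))
        eps = math.sqrt(2.2e-16) * (1.0 + norm_u) / norm_v
    base = F(u)
    perturbed = F([ui + eps * vi for ui, vi in zip(u, v)])
    return [(p - b) / eps for p, b in zip(perturbed, base)]


if __name__ == "__main__":
    # The exact Jacobian at (1, 2) is [[2, 1], [1, 4]], so J @ (1, -1) = (1, -3).
    print(jacobian_free_matvec(residual, [1.0, 2.0], [1.0, -1.0]))
```

A Krylov solver such as GMRES needs only this product, which is why an efficient residual evaluation can be enough to run the whole implicit solve on one device.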
Abstract:
The amount of data contained in electroencephalogram (EEG) recordings is massive, which places constraints on bandwidth and storage. Online transmission of data requires a scheme that achieves higher performance with less computation. Single-channel algorithms, when applied to multichannel EEG data, fail to meet this requirement. While many methods have been proposed for multichannel ECG compression, not much work appears to have been done in the area of multichannel EEG compression. In this paper, we present an EEG compression algorithm based on a multichannel model, which gives higher performance compared to other algorithms. Simulations have been performed on both normal and pathological EEG data, and it is observed that a high compression ratio with very large SNR is obtained in both cases. The reconstructed signals are found to match the original signals very closely, thus confirming that diagnostic information is preserved during transmission.
Abstract:
Energy consumption has become a major constraint in providing increased functionality for devices with small form factors. Dynamic voltage and frequency scaling has been identified as an effective approach for reducing the energy consumption of embedded systems. Earlier works on dynamic voltage scaling focused mainly on performing voltage scaling when the CPU is waiting for the memory subsystem, or concentrated chiefly on loop nests and/or subroutine calls having a sufficient number of dynamic instructions. This paper concentrates on coarser program regions and for the first time uses program phase behavior for performing dynamic voltage scaling. Program phases are annotated at compile time with mode switch instructions. Further, we relate the Dynamic Voltage Scaling Problem to the Multiple Choice Knapsack Problem, and use well-known heuristics to solve it efficiently. We also develop a simple integer linear program (ILP) formulation for this problem. Experimental evaluation on a set of media applications reveals that our heuristic method obtains a 38% reduction in energy consumption on average with a performance degradation of 1%, and up to a 45% reduction in energy with a performance degradation of 5%. Further, the energy consumed by the heuristic solution is within 1% of the optimal solution obtained from the ILP approach.
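The mapping to the Multiple Choice Knapsack Problem can be pictured as follows: each program phase is a class, each voltage/frequency mode is an item with a (time, energy) cost, exactly one mode is chosen per phase, and total time must stay within a budget. The Python sketch below uses a simple greedy heuristic chosen for illustration; the abstract does not say which heuristics the authors used, and the function name, data layout, and numbers are all hypothetical.

```python
def greedy_mode_assignment(phases, time_budget):
    """Pick one (time, energy) option per phase, greedily reducing energy.

    phases: list of option lists, one per program phase; each option is a
            (time, energy) pair for one hypothetical voltage/frequency mode.
    Returns the chosen option index for every phase.
    """
    # Start from the fastest mode in every phase.
    choice = [min(range(len(opts)), key=lambda i: opts[i][0]) for opts in phases]
    total_time = sum(phases[p][c][0] for p, c in enumerate(choice))
    if total_time > time_budget:
        raise ValueError("time budget is infeasible even at the fastest modes")

    while True:
        best = None  # (energy saved per unit of added time, phase, option)
        for p, opts in enumerate(phases):
            cur_time, cur_energy = opts[choice[p]]
            for i, (t, e) in enumerate(opts):
                added_time = t - cur_time
                saved_energy = cur_energy - e
                if saved_energy <= 0 or total_time + added_time > time_budget:
                    continue  # no energy gain, or the switch breaks the budget
                ratio = float("inf") if added_time <= 0 else saved_energy / added_time
                if best is None or ratio > best[0]:
                    best = (ratio, p, i)
        if best is None:
            return choice  # no remaining switch saves energy within the budget
        _, p, i = best
        total_time += phases[p][i][0] - phases[p][choice[p]][0]
        choice[p] = i


if __name__ == "__main__":
    # Two phases, three hypothetical modes each: (time, energy).
    phases = [
        [(10, 100), (12, 70), (15, 55)],
        [(20, 200), (25, 140), (30, 120)],
    ]
    picked = greedy_mode_assignment(phases, time_budget=37)
    print(picked, sum(phases[p][c][1] for p, c in enumerate(picked)))
```

A greedy pass like this carries no optimality guarantee in general, which is one reason a comparison against an exact ILP solution, as the abstract reports, is informative.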
Abstract:
Based on dynamic inversion, a relatively straightforward approach is presented in this paper for nonlinear flight control design of high-performance aircraft, which does not require the normal and lateral acceleration commands to be first transferred to body rates before computing the required control inputs. This leads to substantial improvement of the tracking response. Promising results are obtained from six-degree-of-freedom simulation studies of the F-16 aircraft, which are found to be superior to those of an existing approach (also based on dynamic inversion). The new approach has two potential benefits, namely reduced oscillatory response (including elimination of non-minimum phase behavior) and reduced control magnitude. Next, the nominal design is augmented with a model-following neuron-adaptive design in order to ensure robust performance in the presence of parameter inaccuracies in the model. Note that in this approach the model update takes place adaptively online, and hence it is philosophically similar to indirect adaptive control. However, unlike a typical indirect adaptive control approach, there is no need to update the individual parameters explicitly. Instead, the inaccuracy in the system output dynamics is captured directly and then used in modifying the control. This leads to faster adaptation, which helps in stabilizing the unstable plant more quickly. The robustness study, based on a large number of simulations, shows that the adaptive design has a good amount of robustness with respect to the expected parameter inaccuracies in the model.
Abstract:
XII, 116 p.
Abstract:
Quality control is considered from the simulator's perspective through comparative simulation of an ultra energy-efficient building with EE4-DOE2.1E and EnergyPlus. The University of Calgary's Leadership in Energy and Environmental Design Platinum Child Development Centre (CDC), with a 66% certified energy cost reduction rating, was the case study building. A Natural Resources Canada incentive program required use of the EE4 interface with the DOE2.1E simulation engine for energy modelling. As DOE2.1E lacks specific features to simulate advanced systems such as the radiant cooling in the CDC, an EnergyPlus model was developed to further evaluate these features. The EE4-DOE2.1E model was used for quality control during development of the base EnergyPlus model, and simulation results were compared. Advanced energy systems then added to the EnergyPlus model generated a small difference in estimated total annual energy use. The comparative simulation process helped identify the main input errors in the draft EnergyPlus model. The comparative use of less complex simulation programs is recommended for quality control when producing more complex models. © 2009 International Building Performance Simulation Association (IBPSA).
Abstract:
We present a system for keyword search on Cantonese conversational telephony audio, collected for the IARPA Babel program, that achieves good performance by combining postings lists produced by diverse speech recognition systems from three different research groups. We describe the keyword search task, the data on which the work was done, four different speech recognition systems, and our approach to system combination for keyword search. We show that the combination of four systems outperforms the best single system by 7%, achieving an actual term-weighted value of 0.517. © 2013 IEEE.
Abstract:
A new evanescently-coupled uni-traveling-carrier photodiode (EC-UTC PD) based on a multimode diluted waveguide (MDW) structure is fabricated, analysed and characterized. Optical and electrical characteristics of the device are investigated. Excellent characteristics are demonstrated, including a responsivity of 0.36 A/W, a bandwidth of 11.5 GHz and a small-signal 1-dB compression current greater than 18 mA at 10 GHz. The saturation current is significantly improved compared with those of similar evanescently-coupled pin photodiodes. The radio frequency (RF) bandwidth can be further improved by eliminating RF losses induced by the cables, the probe and the bias tee between the photodiode and the spectrum analyzer.
Abstract:
A 100-μm-long electroabsorption modulator monolithically integrated with passive waveguides at the input and output ports is fabricated through ion-implantation-induced quantum well intermixing, using only a two-step low-pressure metal-organic vapor phase epitaxial process. An InGaAsP/InGaAsP intra-step quantum well is introduced into the active region to improve the modulation properties. In the experiment, high modulation speed and a high extinction ratio are obtained simultaneously: the electrical-to-optical frequency response (E/O response) without any load termination reaches 22 GHz, and the extinction ratio is as high as 16 dB.
Abstract:
Parallelizing compilers have difficulty analysing and optimising complex code. To address this, some analysis may be delayed until run-time, and techniques such as speculative execution may be used. Furthermore, to enhance performance, a feedback loop may be set up between the compile-time and run-time analysis systems, as in iterative compilation. To extend this, it is proposed that the run-time analysis collect information about the values of variables not already determined, and estimate a probability measure for the sampled values. These measures may be used to guide optimisations in further analyses of the program. To address the problem of variables with measures as values, this paper also presents an outline of a novel combination of previous probabilistic denotational semantics models, applied to a simple imperative language.
Abstract:
A high-sample-rate 3D median filtering processor architecture is proposed, based on a novel 3D median filtering algorithm that reduces computational complexity in comparison with the traditional bubble sorting algorithm. A 3 x 3 x 3 filter processor is implemented in VHDL, and simulation verifies that the processor can process a 128 x 128 x 96 MRI image in 0.03 seconds while running at 50 MHz.
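For orientation, the operation being accelerated and the quoted throughput can be sketched in a few lines of Python. This is a plain sort-based software reference, not the novel algorithm or the VHDL architecture from the abstract; the function names and the border handling (clamping indices at the volume edges) are assumptions.

```python
from statistics import median


def clamp(i, n):
    """Clamp index i into the valid range [0, n - 1]."""
    return max(0, min(n - 1, i))


def median_filter_3d(vol):
    """Reference 3 x 3 x 3 median filter over a nested-list volume vol[z][y][x].

    Border voxels reuse the nearest in-range neighbours (an assumed policy).
    """
    nz, ny, nx = len(vol), len(vol[0]), len(vol[0][0])
    out = [[[0] * nx for _ in range(ny)] for _ in range(nz)]
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                window = [
                    vol[clamp(z + dz, nz)][clamp(y + dy, ny)][clamp(x + dx, nx)]
                    for dz in (-1, 0, 1)
                    for dy in (-1, 0, 1)
                    for dx in (-1, 0, 1)
                ]
                # 27 samples, so the median is the 14th smallest value.
                out[z][y][x] = median(window)
    return out


def cycles_per_voxel(shape, seconds, clock_hz):
    """Clock cycles available per output voxel for a given run time."""
    voxels = shape[0] * shape[1] * shape[2]
    return seconds * clock_hz / voxels


if __name__ == "__main__":
    # A single bright voxel in an otherwise dark volume is removed by the filter.
    vol = [[[0] * 3 for _ in range(3)] for _ in range(3)]
    vol[1][1][1] = 100
    print(median_filter_3d(vol)[1][1][1])
    # Figures from the abstract: 128 x 128 x 96 voxels, 0.03 s, 50 MHz.
    print(cycles_per_voxel((128, 128, 96), 0.03, 50e6))
```

With the abstract's figures, 128 x 128 x 96 = 1,572,864 voxels against about 1.5 million clock cycles (0.03 s at 50 MHz) works out to slightly under one cycle per voxel. Allowing for rounding in the reported time, that points to a throughput of roughly one output voxel per clock cycle.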