113 results for Federal High Performance Computing Program (U.S.)
Abstract:
The design cycle for complex special-purpose computing systems is extremely costly and time-consuming. It involves a multiparametric design-space exploration for optimization, followed by design verification. Designers of special-purpose VLSI implementations often need to explore parameters, such as optimal bitwidth and data representation, through time-consuming Monte Carlo simulations. A prominent example of this simulation-based exploration process is the design of decoders for error-correcting systems, such as the Low-Density Parity-Check (LDPC) codes adopted by modern communication standards, which involves thousands of Monte Carlo runs for each design point. Currently, high-performance computing offers a wide set of acceleration options, ranging from multicore CPUs to Graphics Processing Units (GPUs) and Field Programmable Gate Arrays (FPGAs). Exploiting such diverse target architectures typically means developing multiple code versions, often using distinct programming paradigms. In this context, we evaluate the concept of retargeting a single OpenCL program to multiple platforms, thereby significantly reducing design time. A single OpenCL-based parallel kernel is used without modifications or code tuning on multicore CPUs, GPUs, and FPGAs. We use SOpenCL (Silicon to OpenCL), a tool that automatically converts OpenCL kernels to RTL, to introduce FPGAs as a potential platform for efficiently executing simulations coded in OpenCL. We use LDPC decoding simulations as a case study. Experimental results were obtained by testing a variety of regular and irregular LDPC codes, ranging from short/medium-length (e.g., 8,000-bit) to long-length (e.g., 64,800-bit) DVB-S2 codes. We observe that, depending on the design parameters to be simulated and on the dimension and phase of the design, either the GPU or the FPGA may be the more suitable platform, each providing different acceleration factors over conventional multicore CPUs.
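As a rough illustration of the kind of Monte Carlo kernel such an exploration repeatedly evaluates, the Python/NumPy sketch below runs min-sum LDPC decoding trials over an AWGN channel with a quantised message datapath standing in for a candidate bitwidth. The toy parity-check matrix, quantiser step, and all names are illustrative assumptions; this is not the OpenCL kernel evaluated in the paper.

# Hypothetical sketch: min-sum LDPC decoding trials with quantised messages,
# the sort of workload a bitwidth design-space exploration would sweep.
import numpy as np

H = np.array([[1, 1, 0, 1, 0, 0],          # toy (6, 3) parity-check matrix
              [0, 1, 1, 0, 1, 0],
              [1, 0, 0, 0, 1, 1],
              [0, 0, 1, 1, 0, 1]])

def quantise(x, bits, step=0.25):
    """Emulate a fixed-point datapath: round and clip LLRs to `bits` bits."""
    lim = step * (2 ** (bits - 1) - 1)
    return np.clip(np.round(x / step) * step, -lim, lim)

def min_sum_decode(llr, iters=20, bits=6):
    m, n = H.shape
    msg = np.zeros((m, n))                  # check-to-variable messages
    hard = np.zeros(n, dtype=int)
    for _ in range(iters):
        total = llr + msg.sum(axis=0)       # a-posteriori LLR per bit
        v2c = (total - msg) * H             # variable-to-check messages on edges
        for i in range(m):
            idx = np.flatnonzero(H[i])
            vals = v2c[i, idx]
            for k, j in enumerate(idx):     # min-sum check-node update
                others = np.delete(vals, k)
                msg[i, j] = np.prod(np.sign(others)) * np.abs(others).min()
        msg = quantise(msg, bits)
        hard = ((llr + msg.sum(axis=0)) < 0).astype(int)
        if not ((H @ hard) % 2).any():      # all parity checks satisfied
            break
    return hard

def ber_trial(snr_db, bits, rng):
    """One trial: all-zero codeword, BPSK over AWGN, decode, count bit errors."""
    n = H.shape[1]
    sigma = np.sqrt(0.5 / 10 ** (snr_db / 10))
    rx = 1.0 + sigma * rng.standard_normal(n)   # all-zero codeword -> +1 symbols
    llr = 2.0 * rx / sigma ** 2
    return int(min_sum_decode(llr, bits=bits).sum())

rng = np.random.default_rng(0)
errors = sum(ber_trial(2.0, bits=6, rng=rng) for _ in range(2000))
print("bit errors over 2000 trials:", errors)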
Abstract:
Traditional Chinese Medicines (TCMs) derived from animal horns are one of the most important types of Chinese medicine. In the present study, a fast and sensitive analytical method was established for the qualitative and quantitative determination of 14 nucleosides and nucleobases in animal horns using hydrophilic interaction ultra-high performance liquid chromatography coupled with triple-quadrupole tandem mass spectrometry (HILIC-UPLC-QQQ-MS/MS) in selected reaction monitoring (SRM) mode. The method was optimized and validated, and showed good linearity, precision, repeatability, and accuracy. It was successfully used to determine the contents of the 14 nucleosides and nucleobases in 25 animal horn samples. Hierarchical clustering analysis (HCA) and principal component analysis (PCA) divided the 25 samples into two groups, in agreement with their taxonomy. The method may enable a quick and effective search for substitutes for precious horns.
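As a hedged sketch of the chemometric step mentioned above, the snippet below autoscales a samples-by-analytes content matrix, computes PCA scores, and performs Ward hierarchical clustering. The data are synthetic, and the scikit-learn/SciPy calls are stand-ins for whatever software the authors actually used.

# Hypothetical sketch: PCA and hierarchical clustering on a 25 x 14 content matrix.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(1)
# 25 samples x 14 analytes (nucleosides/nucleobases), two simulated groups
contents = np.vstack([rng.normal(5.0, 1.0, (13, 14)),
                      rng.normal(8.0, 1.0, (12, 14))])

X = StandardScaler().fit_transform(contents)      # autoscale each analyte

scores = PCA(n_components=2).fit_transform(X)     # PCA score-plot coordinates
labels = fcluster(linkage(X, method="ward"), t=2, criterion="maxclust")  # HCA groups

print("PC1/PC2 scores of first sample:", scores[0])
print("HCA group assignments:", labels)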
Abstract:
The design of a high-performance IIR (infinite impulse response) digital filter is described. The chip architecture operates on 11-b parallel two's complement input data with 12-b parallel two's complement coefficients to produce a 14-b two's complement output. The chip is implemented in 1.5-µm, double-layer-metal CMOS technology, consumes 0.5 W, and can operate at up to 15 Msample/s. The main component of the system is a fine-grained systolic array that is internally based on a signed binary number representation (SBNR). Issues addressed include testing, clock distribution, and circuitry for conversion between two's complement and SBNR.
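To make the quoted word widths concrete, the sketch below runs a single fixed-point biquad section with 11-bit integer inputs, 12-bit coefficients holding 10 fraction bits, and 14-bit saturated outputs. The filter structure and coefficient values are invented for illustration; the chip itself uses an SBNR-based systolic array rather than this direct-form arithmetic.

# Hypothetical sketch of the fixed-point word widths quoted above.
def to_q(c, bits=12, frac=10):
    """Quantise a coefficient to a `bits`-bit two's complement word with `frac` fraction bits."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, int(round(c * (1 << frac)))))

b = [to_q(c) for c in (0.0675, 0.1349, 0.0675)]   # feed-forward taps (illustrative)
a = [to_q(c) for c in (-1.1430, 0.4128)]          # feedback taps a1, a2 (illustrative)

def biquad(samples):
    """Filter 11-bit integer samples; return 14-bit saturated integer outputs."""
    x1 = x2 = y1 = y2 = 0
    out = []
    for x in samples:
        acc = b[0] * x + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        y = acc >> 10                                  # drop the 10 fraction bits
        y = max(-(1 << 13), min((1 << 13) - 1, y))     # saturate to the 14-bit range
        x2, x1, y2, y1 = x1, x, y1, y
        out.append(y)
    return out

print(biquad([512] * 8))   # step response to a half-scale 11-bit input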
Abstract:
In this work, an economical route based on hydrothermal and layer-by-layer (LBL) self-assembly processes has been developed to synthesize unique Al2O3-modified LiV3O8 nanosheets, comprising a core of LiV3O8 nanosheets and a thin Al2O3 nanolayer. The thickness of the Al2O3 nanolayer can be tuned by altering the number of LBL cycles. When evaluated for their lithium-storage properties, the 1-LBL Al2O3-modified LiV3O8 nanosheets exhibit a high discharge capacity of 191 mA h g⁻¹ at 300 mA g⁻¹ (1 C) over 200 cycles and excellent rate capability, demonstrating that enhanced physical and/or chemical properties can be achieved through proper surface modification.
Abstract:
In this paper, results are presented for a simple yet highly sensitive transceiver for phase-modulated RFID applications. This is an advance on other simple RFID readers, which can only operate with amplitude-shift-keyed (ASK) signals. Simple circuitry is achieved by using a novel injection-locked PLL configuration in place of the standard superheterodyne architecture normally used. The transceiver is shown to operate with a number of phase-modulation modes that offer certain advantages relating to the distance to the target. The paper concludes with practical results obtained for the transceiver when operated within a backscatter RFID application. A unique advantage of this transceiver is its complete immunity to the problem of TX/RX isolation, allowing long ranges, estimated to be in the region of 80 m at 1 GHz, to be achieved even with a simple backscatter target.
Abstract:
This paper reports on the accuracy of new test methods developed to measure the air and water permeability of high-performance concretes (HPCs). Five representative HPC mixtures and one normal concrete (NC) mixture were tested to estimate both the repeatability and the reliability of the proposed methods. Repeatability acceptance was judged using values of the signal-to-noise ratio (SNR) and the discrimination ratio (DR), and reliability was investigated by comparison against standard laboratory-based test methods (i.e., the RILEM gas permeability test and the BS EN water penetration test). With SNR and DR values satisfying the recommended criteria, it was concluded that test repeatability error has no significant influence on the results. In addition, the research confirmed strong positive relationships between the proposed test methods and existing standard permeability assessment techniques. Based on these findings, the proposed test methods show strong potential to become recognized as international methods for determining the permeability of HPCs.
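As a hedged illustration of the repeatability criteria mentioned above, the sketch below estimates between-mix and repeatability variance components from synthetic permeability readings and derives SNR and DR using standard measurement-system-analysis definitions; the formulas, data, and thresholds shown are common conventions, not necessarily those adopted in the paper.

# Hypothetical sketch: SNR and DR from repeated permeability measurements.
import numpy as np

rng = np.random.default_rng(2)
true_perm = rng.normal(2.0e-17, 0.6e-17, size=6)                 # 6 concrete mixes (m^2), synthetic
readings = true_perm[:, None] + rng.normal(0, 0.1e-17, (6, 3))   # 3 repeat readings per mix

var_means = readings.mean(axis=1).var(ddof=1)                # variance of mix means
var_rep = readings.var(axis=1, ddof=1).mean()                # repeatability (within-mix) variance
var_mix = max(var_means - var_rep / readings.shape[1], 0.0)  # ANOVA-style between-mix component

snr = np.sqrt(2 * var_mix / var_rep)    # signal-to-noise ratio (MSA convention)
dr = snr ** 2 + 1                       # discrimination ratio

print(f"SNR = {snr:.1f} (often required > 5), DR = {dr:.1f} (often required > 4)")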
Abstract:
Pre-processing (PP) of the received symbol vector and channel matrices is an essential prerequisite for Sphere Decoder (SD)-based detection in Multiple-Input Multiple-Output (MIMO) wireless systems. PP is a highly complex operation, yet it represents only a small fraction of the overall computational cost of detecting an OFDM MIMO frame in standards such as 802.11n. Despite this, real-time PP architectures are highly inefficient and dominate the resource cost of real-time SD architectures. This paper resolves this issue. By reorganising the ordering and QR decomposition sub-operations of PP, we describe a Field Programmable Gate Array (FPGA)-based PP architecture for the Fixed Complexity Sphere Decoder (FSD) applied to 4 × 4 802.11n MIMO that reduces resource cost by 50% compared to state-of-the-art solutions whilst maintaining real-time performance.
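The sketch below illustrates the two preprocessing sub-operations in floating-point NumPy: a column ordering followed by QR decomposition of a 4 × 4 channel matrix. The simple norm-based ordering is a generic stand-in, not the specific FSD ordering rule or the fixed-point FPGA datapath developed in the paper.

# Hypothetical sketch: ordering + QR decomposition for 4x4 MIMO preprocessing.
import numpy as np

def ordered_qr(H):
    """Sort columns of H by norm, then QR-decompose the permuted channel matrix."""
    order = np.argsort(np.linalg.norm(H, axis=0))[::-1]   # a simple norm-based ordering
    Q, R = np.linalg.qr(H[:, order])
    return order, Q, R

rng = np.random.default_rng(3)
H = (rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))) / np.sqrt(2)
s = rng.choice([-1, 1], 4) + 1j * rng.choice([-1, 1], 4)          # QPSK symbol vector
y = H @ s + 0.05 * (rng.standard_normal(4) + 1j * rng.standard_normal(4))

order, Q, R = ordered_qr(H)
z = Q.conj().T @ y            # rotated receive vector handed on to the sphere search
print("detection order:", order)
print("R is upper triangular:", np.allclose(np.tril(R, -1), 0))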
Abstract:
In the reinsurance market, the risks that natural catastrophes pose to portfolios of properties must be quantified so that they can be priced and insurance offered. The analysis of such risks at a portfolio level requires a simulation of up to 800,000 trials with an average of 1,000 catastrophic events per trial. This is sufficient to capture risk for a global multi-peril reinsurance portfolio covering a range of perils including earthquake, hurricane, tornado, hail, severe thunderstorm, wind storm, storm surge and riverine flooding, and wildfire. Such simulations are both computation- and data-intensive, making the application of high-performance computing techniques desirable.
In this paper, we explore the design and implementation of portfolio risk analysis on both multi-core and many-core computing platforms. Given a portfolio of property catastrophe insurance treaties, key risk measures, such as probable maximum loss, are computed by taking both primary and secondary uncertainties into account. Primary uncertainty is associated with whether or not an event occurs in a simulated year, while secondary uncertainty captures the uncertainty in the level of loss due to the use of simplified physical models and limitations in the available data. A combination of fast lookup structures, multi-threading, and careful hand-tuning of numerical operations is required to achieve good performance. Experimental results are reported for multi-core processors and for systems using NVIDIA graphics processing units and Intel Xeon Phi many-core accelerators.
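As a hedged illustration of the simulation structure described above, the sketch below treats each trial as a simulated year: primary uncertainty decides which events occur, secondary uncertainty draws each occurring event's loss, and probable maximum loss is read off as a high quantile of the annual losses. Event rates, loss distributions, and trial counts are invented for illustration only.

# Hypothetical sketch: Monte Carlo probable maximum loss with primary and
# secondary uncertainty over a made-up event-loss table.
import numpy as np

rng = np.random.default_rng(4)

n_events = 50
rate = rng.uniform(0.01, 0.2, n_events)          # annual occurrence rate per event
mean_loss = rng.uniform(1e6, 5e7, n_events)      # expected loss if the event occurs

def simulate_year():
    occurs = rng.random(n_events) < 1 - np.exp(-rate)                  # primary uncertainty
    loss = rng.beta(2.0, 5.0, n_events) * mean_loss / (2.0 / 7.0)      # secondary uncertainty,
    return float((occurs * loss).sum())                                # scaled to mean_loss on average

trials = 100_000
annual_losses = np.array([simulate_year() for _ in range(trials)])

pml_250 = np.quantile(annual_losses, 1 - 1 / 250)   # 250-year probable maximum loss
print(f"250-year PML = {pml_250:,.0f}")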