961 results for graphics processing unit (GPU)


Relevance: 20.00%

Abstract:

Adaptive Mesh Refinement (AMR) is a method that dynamically varies the spatio-temporal resolution of localized mesh regions in numerical simulations, based on the strength of the solution features. In-situ visualization plays an important role in analyzing the time-evolving characteristics of the domain structures. Continuous visualization of the output data for various timesteps results in a better study of the underlying domain and the model used for simulating the domain. In this paper, we develop strategies for continuous online visualization of time-evolving data for AMR applications executed on GPUs. We reorder the meshes for computations on the GPU based on the user's input about the subdomain that they want to visualize. This makes the data available for visualization at a faster rate. We then perform asynchronous executions of the visualization steps and fix-up operations on the CPUs while the GPU advances the solution. By performing experiments on Tesla S1070 and Fermi C2070 clusters, we found that our strategies result in a 60% improvement in response time and a 16% improvement in the rate of visualization of frames over the existing strategy of performing fix-ups and visualization at the end of the timesteps.

Relevance: 20.00%

Abstract:

Phase-locked loops (PLLs) are necessary in applications that require grid synchronization. The presence of unbalance or harmonics in the grid voltage creates errors in the frequency and angle estimated by a PLL. The error in the estimated angle distorts the unit vectors generated by the PLL. In this paper, analytical expressions are derived that determine the error in the phase angle estimated by a PLL when there is unbalance and harmonics in the grid voltage. Using the derived expressions, the total harmonic distortion (THD) and the fundamental phase error of the unit vectors can be determined for a given PLL topology and a given level of unbalance and distortion in the grid voltage. The accuracy of the results obtained from the analytical expressions is validated against simulation and experimental results for the synchronous reference frame PLL (SRF-PLL). Based on these expressions, a new tuning method for the SRF-PLL is proposed that quantifies the tradeoff between the unit vector THD and the bandwidth of the SRF-PLL. Using this method, the exact value of the SRF-PLL bandwidth can be obtained that keeps the unit vector THD at an acceptable level for a given worst-case grid voltage unbalance and distortion. The tuning method for the SRF-PLL is also validated experimentally.
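The bandwidth versus distortion tradeoff described in this abstract can be illustrated with a small simulation. The following is only a minimal sketch of a textbook SRF-PLL, not the paper's analytical expressions or tuning method: the function name `simulate_srf_pll`, the PI gain choice (`kp = 2*zeta*w_n`, `ki = w_n**2`), the use of the natural frequency as a stand-in for "bandwidth", and the modelling of unbalance as a negative-sequence component are all assumptions made for illustration.

```python
import math

def simulate_srf_pll(unbalance=0.0, f_grid=50.0, bw_hz=20.0, zeta=0.707,
                     dt=1e-4, t_end=0.5, theta0_err=0.5):
    """Return the peak |phase error| in radians over the final grid cycle.

    Hypothetical sketch: grid voltages are built directly in the alpha-beta
    frame as a unit positive sequence plus a negative sequence of relative
    amplitude `unbalance`.
    """
    w_grid = 2 * math.pi * f_grid
    w_n = 2 * math.pi * bw_hz            # assumed loop natural frequency
    kp, ki = 2 * zeta * w_n, w_n ** 2    # assumed second-order PI tuning
    theta = theta0_err                   # true grid angle (initial offset)
    theta_hat = 0.0                      # angle estimated by the PLL
    integ = 0.0                          # integrator state of the PI loop
    steps = int(round(t_end / dt))
    last_cycle = int(round(1.0 / (f_grid * dt)))
    peak = 0.0
    for i in range(steps):
        # Positive sequence rotates with +theta, negative sequence with -theta.
        v_alpha = math.cos(theta) + unbalance * math.cos(theta)
        v_beta = math.sin(theta) - unbalance * math.sin(theta)
        # Park transform, q-axis only: equals sin(theta - theta_hat) if balanced.
        v_q = -v_alpha * math.sin(theta_hat) + v_beta * math.cos(theta_hat)
        # PI controller drives v_q to zero; w_grid acts as a feed-forward term.
        integ += ki * v_q * dt
        w_hat = w_grid + kp * v_q + integ
        theta_hat += w_hat * dt
        theta += w_grid * dt
        if i >= steps - last_cycle:
            err = theta - theta_hat
            err = math.atan2(math.sin(err), math.cos(err))  # wrap to [-pi, pi]
            peak = max(peak, abs(err))
    return peak
```

Under these assumptions, a balanced grid gives a phase error that decays to essentially zero, while a 10% negative sequence leaves a residual ripple at twice the grid frequency. Calling `simulate_srf_pll(unbalance=0.1, bw_hz=10.0)` and `simulate_srf_pll(unbalance=0.1, bw_hz=40.0)` shows the ripple growing with loop bandwidth, which is the kind of tradeoff a tuning method has to quantify.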

Relevance: 20.00%

Abstract:

In this paper, we consider inference for the component and system lifetime distributions of a k-unit parallel system with independent components, based on system data. The components are assumed to have identical Weibull distributions. We obtain the maximum likelihood estimates of the unknown parameters based on system data. The Fisher information matrix is derived. We propose a β-expectation tolerance interval and a β-content γ-level tolerance interval for the life distribution of the system. The performance of the estimators and tolerance intervals is investigated via a simulation study. A simulated dataset is analyzed for illustration.
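As a worked note on the setup in this abstract: a parallel system fails only when its last component fails, so the system lifetime is the maximum of the k component lifetimes. The derivation below is a sketch under an assumed two-parameter Weibull form with shape $\alpha$ and scale $\lambda$ (the abstract does not state the parameterization), with $t_1,\dots,t_n$ denoting n observed system lifetimes.

```latex
% Assumed parameterization: shape \alpha > 0, scale \lambda > 0.
\begin{align}
F(t) &= 1 - \exp\!\left[-(t/\lambda)^{\alpha}\right], \qquad t > 0 \\
F_{\mathrm{sys}}(t) &= \Pr\{\max(T_1,\dots,T_k) \le t\} = \left[F(t)\right]^{k} \\
f_{\mathrm{sys}}(t) &= k\left[F(t)\right]^{k-1} f(t), \qquad
f(t) = \frac{\alpha}{\lambda}\left(\frac{t}{\lambda}\right)^{\alpha-1}
       \exp\!\left[-(t/\lambda)^{\alpha}\right] \\
\ell(\alpha,\lambda) &= n\log k + \sum_{i=1}^{n}\log f(t_i)
       + (k-1)\sum_{i=1}^{n}\log F(t_i)
\end{align}
```

Maximizing $\ell(\alpha,\lambda)$ numerically would give the maximum likelihood estimates from system-level data alone; the paper's own estimation procedure may differ in detail.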

Relevance: 20.00%

Abstract:

To reduce motion artifacts in DSA, non-rigid image registration is commonly used before subtracting the mask from the contrast image. Since DSA registration requires a set of spatially non-uniform control points, a conventional MRF model is not very efficient. In this paper, we introduce the concept of pivotal and non-pivotal control points to address this, and propose a non-uniform MRF for DSA registration. We use quad-trees in a novel way to generate the non-uniform grid of control points. Our MRF formulation produces a smooth displacement field and therefore results in better artifact reduction than registering the control points independently. We achieve improved computational performance using pivotal control points without compromising on the artifact reduction. We have tested our approach on several clinical data sets, and present the results of quantitative analysis, clinical assessment and performance improvement on a GPU. (C) 2013 Elsevier Ltd. All rights reserved.

Relevance: 20.00%

Abstract:

The design and development of a Bottom Pressure Recorder for a Tsunami Early Warning System is described here. The special requirements it must satisfy for deployment on the ocean bed and pressure monitoring of the water column above are dealt with; high-resolution data digitization and low circuit power consumption are typical examples. The implementation details of the data sensing and acquisition part that meet these requirements are also brought out. The data processing part typically encompasses a tsunami detection algorithm that should detect an event of significance against a background of a variety of periodic and aperiodic noise signals. Such an algorithm and its simulation are presented. Further, the results of sea trials carried out on the system off the Chennai coast are presented. The high quality and fidelity of the data show that the system design is robust despite its low cost and, with suitable augmentations, is ready for full-fledged deployment on the ocean bed. (C) 2013 Elsevier Ltd. All rights reserved.

Relevance: 20.00%

Abstract:

Energy research is to a large extent materials research, encompassing the physics and chemistry of materials, including their synthesis, processing toward components and design toward architectures, allowing for their functionality as energy devices, extending toward their operation parameters and environment, including also their degradation, limited life, ultimate failure and potential recycling. In all these stages, X-ray and electron spectroscopy are helpful methods for analysis, characterization and diagnostics for the engineer and for the researcher working in basic science. This paper gives a short overview of experiments with X-ray and electron spectroscopy for solar energy and water splitting materials and addresses also the issue of solar fuel, a relatively new topic in energy research. The featured systems are iron oxide and tungsten oxide as photoanodes, and hydrogenases as molecular systems. We present surface and subsurface studies with ambient pressure XPS and hard X-ray XPS, resonant photoemission, light induced effects in resonant photoemission experiments and a photo-electrochemical in situ/operando NEXAFS experiment in a liquid cell, and nuclear resonant vibrational spectroscopy (NRVS). (C) 2012 Elsevier B.V. All rights reserved.

Relevance: 20.00%

Abstract:

Among the armoury of photovoltaic materials, thin film heterojunction photovoltaics continue to be a promising candidate for solar energy conversion delivering a vast scope in terms of device design and fabrication. Their production does not require expensive semiconductor substrates and high temperature device processing, which allows reduced cost per unit area while maintaining reasonable efficiency. In this regard, superstrate CdTe/CdS solar cells are extensively investigated because of their suitable bandgap alignments, cost effective methods of production at large scales and stability against proton/electron irradiation. The conversion efficiencies in the range of 6-20% are achieved by structuring the device by varying the absorber/window layer thickness, junction activation/annealing steps, with more suitable front/back contacts, preparation techniques, doping with foreign ions, etc. This review focuses on fundamental and critical aspects like: (a) choice of CdS window layer and CdTe absorber layer; (b) drawbacks associated with the device including environmental problems, optical absorption losses and back contact barriers; (c) structural dynamics at CdS-CdTe interface; (d) influence of junction activation process by CdCl2 or HCF2Cl treatment; (e) interface and grain boundary passivation effects; (f) device degradation due to impurity diffusion and stress; (g) fabrication with suitable front and back contacts; (h) chemical processes occurring at various interfaces; (i) strategies and modifications developed to improve their efficiency. The complexity involved in understanding the multiple aspects of tuning the solar cell efficiency is reviewed in detail by considering the individual contribution from each component of the device. It is expected that this review article will enrich the materials aspects of CdTe/CdS devices for solar energy conversion and stimulate further innovative research interest on this intriguing topic.

Relevance: 20.00%

Abstract:

Presented in this paper is an improvement over a spring-steel dual-axis accelerometer that we had reported earlier. The fabrication process (which entails wire-cut electro discharge machining of easily accessible and inexpensive spring-steel foil) and the sensing of the displacement (which is done using off-the-shelf Hall-effect sensors) remain the same. The improvements reported here are twofold: (i) the footprint of the packaged accelerometer is reduced from 80 mm square to 40 mm square, and (ii) almost perfect de-coupling and symmetry are achieved between the two in-plane axes of the packaged device as opposed to the previous embodiment where this was not the case. Good linearity with about 40 mV/g was measured along both the in-plane axes over a range of 0.1 to 1 g. The first two natural frequencies of the devices are at 30 Hz and 100 Hz, respectively, as per the experiment. The highlights of this work are cost-effective processing, easy integration of the Hall-effect sensing capability on a customised printed circuit board, and inexpensive packaging without overly compromising either the overall size or the sensitivity of the accelerometer. Through this work, we have reaffirmed the practicability of spring-steel accelerometers towards the eventual goal of making them compete with micromachined silicon accelerometers in terms of size and performance. The cost is likely to be much lower for the spring-steel accelerometers than that of silicon accelerometers, especially when the volume of production is low and the sensor is to be used as a single packaged unit.

Relevance: 20.00%

Abstract:

Phase-locked loops (PLLs) are necessary in grid-connected systems to obtain information about the frequency, amplitude and phase of the grid voltage. In stationary reference frame control, the unit vectors of PLLs are used for reference generation. It is important that the PLL performance is not affected significantly when the grid voltage undergoes amplitude and frequency variations. In this paper, a novel design is proposed for a popular single-phase PLL topology, namely the second-order generalized integrator (SOGI) based PLL, which achieves minimum settling time during grid voltage amplitude and frequency variations. The proposed design achieves a settling time of less than 27.7 ms. This design also ensures that the unit vectors generated by this PLL have a steady-state THD of less than 1% during frequency variations of the grid voltage. The design of the SOGI-PLL based on the theoretical analysis is validated by experimental results.

Relevance: 20.00%

Abstract:

The paper analyses the results of experiments on the propagation rate in a fuel bed under gasification conditions in a co-current reactor configuration. Experiments using wood chips with different values of moisture content have been conducted under gasification conditions. The influence of air mass flux on the propagation rate, peak temperature and gas quality is investigated. It is observed from the experiments that the flame front propagation rate initially increases as the air mass flux increases, reaching a peak propagation rate, and that a further increase in the air mass flux results in a decrease in the propagation rate. However, the bed movement increases with the increase in air mass flux. The experimental results provide an understanding of the influence of the fuel properties on the propagation front. The surface area per unit volume of the particles in the packed bed plays an important role in the propagation rate. It has been argued that flaming pyrolysis contributes towards the flame propagation, as opposed to the overall combustion process in a packed bed. The calorific value of the producer gas generated is nearly the same over the entire range of air mass flux for bone-dry and 10% moist wood. (C) 2014 Elsevier B.V. All rights reserved.

Relevance: 20.00%

Abstract:

Pyridoxal kinase (PdxK; EC 2.7.1.35) belongs to the phosphotransferase family of enzymes and catalyzes the conversion of the three active forms of vitamin B6, pyridoxine, pyridoxal and pyridoxamine, to their phosphorylated forms and thereby plays a key role in pyridoxal 5'-phosphate salvage. In the present study, pyridoxal kinase from Salmonella typhimurium was cloned and overexpressed in Escherichia coli, purified using Ni-NTA affinity chromatography and crystallized. X-ray diffraction data were collected to 2.6 Å resolution at 100 K. The crystal belonged to the primitive orthorhombic space group P2₁2₁2₁, with unit-cell parameters a = 65.11, b = 72.89, c = 107.52 Å. The data quality obtained by routine processing was poor owing to the presence of strong diffraction rings caused by a polycrystalline material of an unknown small molecule in all oscillation images. Excluding the reflections close to powder/polycrystalline rings provided data of sufficient quality for structure determination. A preliminary structure solution has been obtained by molecular replacement with the Phaser program in the CCP4 suite using E. coli pyridoxal kinase (PDB entry 2ddm) as the phasing model. Further refinement and analysis of the structure are likely to provide valuable insights into catalysis by pyridoxal kinases.

Relevance: 20.00%

Abstract:

We develop a communication-theoretic framework for modeling 2-D magnetic recording channels. Using the model, we define the signal-to-noise ratio (SNR) for the channel considering several physical parameters, such as the channel bit density, code rate, bit aspect ratio, and noise parameters. We analyze the problem of optimizing the bit aspect ratio for maximizing SNR. The read channel architecture comprises a novel 2-D joint self-iterating equalizer and detection system with noise prediction capability. We evaluate the system performance based on our channel model through simulations. The coded performance with the 2-D equalizer detector indicates approximately 5.5 dB of SNR gain over uncoded data.

Relevance: 20.00%

Abstract:

This study is aimed at obtaining near-spherical microstructural features in rheocast A380 aluminum alloy. The cooling slope (CS) technique has been used to generate semisolid slurry from the superheated alloy melt. Spheroidization of primary grains is at the heart of semisolid processing, and it significantly improves the mechanical properties of parts cast from the semisolid state compared to conventional casting processes. Keeping in view the desired microstructural morphology, i.e., a rosette or spherical shape of the primary alpha-Al phase, successive slurry samples have been collected during melt flow and oil quenched to investigate the microstructure evolution mechanism. The conventionally cast A380 Al alloy sample shows dendritic grains surrounded by a large eutectic phase, whereas finer, near-spherical grains have been observed within the cooling slope processed slurry and also in the solidified castings, which confirms the effectiveness of semisolid processing of the alloy using the cooling slope technique. Grain refiner addition to the alloy melt is found to have a favorable effect, leading to the generation of finer primary grains within the slurry with a higher degree of sphericity.

Relevance: 20.00%

Abstract:

Multi-GPU machines are being increasingly used in high-performance computing. Each GPU in such a machine has its own memory and does not share the address space either with the host CPU or other GPUs. Hence, applications utilizing multiple GPUs have to manually allocate and manage data on each GPU. Existing works that propose to automate data allocations for GPUs have limitations and inefficiencies in terms of allocation sizes, exploiting reuse, transfer costs, and scalability. We propose a scalable and fully automatic data allocation and buffer management scheme for affine loop nests on multi-GPU machines. We call it the Bounding-Box-based Memory Manager (BBMM). At runtime, BBMM can perform standard set operations such as union, intersection, and difference, and can find subset and superset relations, on hyperrectangular regions of array data (bounding boxes). It uses these operations along with some compiler assistance to identify, allocate, and manage data required by applications in terms of disjoint bounding boxes. This allows it to (1) allocate exactly or nearly as much data as is required by computations running on each GPU, (2) efficiently track buffer allocations and hence maximize data reuse across tiles and minimize data transfer overhead, and (3) as a result, maximize utilization of the combined memory on multi-GPU machines. BBMM can work with any choice of parallelizing transformations, computation placement, and scheduling schemes, whether static or dynamic. Experiments run on a four-GPU machine with various scientific programs showed that BBMM reduces data allocations on each GPU by up to 75% compared to current allocation schemes, yields performance of at least 88% of manually written code, and allows excellent weak scaling.
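The set operations on bounding boxes mentioned in this abstract can be sketched in a few lines. This is a hypothetical illustration, not BBMM's implementation: the `Box` type, the inclusive integer bounds, and all function names are assumptions. Note that the difference of two boxes is generally not a box, so the sketch returns it as a list of disjoint boxes, in the spirit of managing data as disjoint bounding boxes.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass(frozen=True)
class Box:
    """Hyperrectangle with inclusive integer bounds per dimension (hypothetical)."""
    lo: Tuple[int, ...]
    hi: Tuple[int, ...]

    def is_empty(self) -> bool:
        return any(l > h for l, h in zip(self.lo, self.hi))

    def size(self) -> int:
        """Number of array elements covered by the box."""
        if self.is_empty():
            return 0
        n = 1
        for l, h in zip(self.lo, self.hi):
            n *= h - l + 1
        return n

def intersect(a: Box, b: Box) -> Optional[Box]:
    """Overlap of two boxes, or None when they are disjoint."""
    box = Box(tuple(max(x, y) for x, y in zip(a.lo, b.lo)),
              tuple(min(x, y) for x, y in zip(a.hi, b.hi)))
    return None if box.is_empty() else box

def bounding_union(a: Box, b: Box) -> Box:
    """Smallest single box that covers both inputs (may over-approximate)."""
    return Box(tuple(min(x, y) for x, y in zip(a.lo, b.lo)),
               tuple(max(x, y) for x, y in zip(a.hi, b.hi)))

def is_subset(a: Box, b: Box) -> bool:
    """True when every element of a lies inside b."""
    return all(bl <= al and ah <= bh
               for al, ah, bl, bh in zip(a.lo, a.hi, b.lo, b.hi))

def difference(a: Box, b: Box) -> List[Box]:
    """Elements of a not in b, returned as disjoint boxes."""
    inter = intersect(a, b)
    if inter is None:
        return [a]
    pieces = []
    lo, hi = list(a.lo), list(a.hi)
    for d in range(len(lo)):
        # Slice off the slab below the overlap in dimension d, then shrink.
        if lo[d] < inter.lo[d]:
            piece_hi = hi.copy()
            piece_hi[d] = inter.lo[d] - 1
            pieces.append(Box(tuple(lo), tuple(piece_hi)))
            lo[d] = inter.lo[d]
        # Slice off the slab above the overlap in dimension d, then shrink.
        if hi[d] > inter.hi[d]:
            piece_lo = lo.copy()
            piece_lo[d] = inter.hi[d] + 1
            pieces.append(Box(tuple(piece_lo), tuple(hi)))
            hi[d] = inter.hi[d]
    return pieces
```

For example, if one tile needs `Box((0, 0), (9, 9))` and a buffer already holds `Box((5, 5), (14, 14))`, then `difference` yields only the elements that still have to be transferred, while `intersect` identifies the part that could be reused.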

Relevance: 20.00%

Abstract:

In this paper we propose a fully parallel 64K-point radix-4^4 FFT processor. The radix-4^4 parallel unrolled architecture uses a novel radix-4 butterfly unit which takes all four inputs in parallel and can selectively produce one out of the four outputs. The radix-4^4 block can take all 256 inputs in parallel and can use the select control signals to generate one out of the 256 outputs. The resultant 64K-point FFT processor shows significant reduction in intermediate memory but with increased hardware complexity. Compared to the state-of-the-art implementation [5], our architecture shows reduced latency with comparable throughput and area. The 64K-point FFT architecture was synthesized using a 130 nm CMOS technology, which resulted in a throughput of 1.4 GSPS and latency of 47.7 µs with a maximum clock frequency of 350 MHz. When compared to [5], the latency is reduced by 303 µs with 50.8% reduction in area.
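The idea of a butterfly that takes four inputs and selectively produces one of the four outputs can be sketched in software. This is a behavioural illustration only, assuming the standard 4-point DFT definition; the function names are hypothetical and nothing here reflects the paper's hardware design. For orientation, 4^4 = 256 and 256^2 = 65,536, which matches the 256-input block and the 64K-point size quoted in the abstract.

```python
import cmath

def radix4_butterfly_select(x0, x1, x2, x3, k):
    """Return output k (0..3) of a 4-point DFT butterfly.

    X_k = sum over n of x_n * W4^(n*k), where W4 = exp(-2*pi*j/4) = -j,
    so every twiddle factor is one of 1, -j, -1, j (no real multipliers).
    """
    w = (1, -1j, -1, 1j)  # W4^0, W4^1, W4^2, W4^3
    inputs = (x0, x1, x2, x3)
    return sum(x * w[(n * k) % 4] for n, x in enumerate(inputs))

def dft4_reference(xs):
    """Direct 4-point DFT, used only to check the butterfly."""
    return [sum(x * cmath.exp(-2j * cmath.pi * n * k / 4)
                for n, x in enumerate(xs))
            for k in range(4)]
```

Calling `radix4_butterfly_select(1, 2, 3, 4, k)` for each `k` in `range(4)` reproduces the four entries of `dft4_reference([1, 2, 3, 4])` one at a time, which mirrors the "select one output" behaviour described above.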