70 results for Parallel design patterns
Abstract:
The design cycle for complex special-purpose computing systems is extremely costly and time-consuming. It involves a multiparametric design space exploration for optimization, followed by design verification. Designers of special purpose VLSI implementations often need to explore parameters, such as optimal bitwidth and data representation, through time-consuming Monte Carlo simulations. A prominent example of this simulation-based exploration process is the design of decoders for error correcting systems, such as the Low-Density Parity-Check (LDPC) codes adopted by modern communication standards, which involves thousands of Monte Carlo runs for each design point. Currently, high-performance computing offers a wide set of acceleration options that range from multicore CPUs to Graphics Processing Units (GPUs) and Field Programmable Gate Arrays (FPGAs). The exploitation of diverse target architectures is typically associated with developing multiple code versions, often using distinct programming paradigms. In this context, we evaluate the concept of retargeting a single OpenCL program to multiple platforms, thereby significantly reducing design time. A single OpenCL-based parallel kernel is used without modifications or code tuning on multicore CPUs, GPUs, and FPGAs. We use SOpenCL (Silicon to OpenCL), a tool that automatically converts OpenCL kernels to RTL in order to introduce FPGAs as a potential platform to efficiently execute simulations coded in OpenCL. We use LDPC decoding simulations as a case study. Experimental results were obtained by testing a variety of regular and irregular LDPC codes that range from short/medium (e.g., 8,000 bit) to long length (e.g., 64,800 bit) DVB-S2 codes. We observe that, depending on the design parameters to be simulated, on the dimension and phase of the design, the GPU or FPGA may suit different purposes more conveniently, thus providing different acceleration factors over conventional multicore CPUs.
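The simulation-based exploration described above can be illustrated with a minimal Python sketch. It is not the authors' OpenCL kernel and it does not implement LDPC decoding: as a stated assumption, a toy repetition code over a BPSK/AWGN channel stands in for the decoder, and the names `quantize`, `estimate_ber`, and `sweep_bitwidths` are hypothetical. It only shows the shape of the loop, one Monte Carlo run per design point (here, per bitwidth).

```python
import random

def quantize(value, bits, max_abs=4.0):
    """Clip to [-max_abs, max_abs] and round onto a signed grid of `bits` bits."""
    if bits < 2:
        raise ValueError("need at least 2 bits (sign plus magnitude)")
    levels = 2 ** (bits - 1) - 1          # largest positive integer code
    step = max_abs / levels
    code = round(max(-max_abs, min(max_abs, value)) / step)
    return code * step

def estimate_ber(bits, sigma, repeats=3, n_frames=10_000, seed=0):
    """Monte Carlo bit error rate for one design point (assumed toy setup)."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(n_frames):
        bit = rng.getrandbits(1)
        symbol = 1.0 - 2.0 * bit          # BPSK mapping: 0 -> +1, 1 -> -1
        # Each repetition is received with Gaussian noise, then quantized.
        soft_sum = sum(
            quantize(symbol + rng.gauss(0.0, sigma), bits)
            for _ in range(repeats)
        )
        decided = 0 if soft_sum >= 0 else 1
        errors += decided != bit
    return errors / n_frames

def sweep_bitwidths(bitwidths, sigma, **kwargs):
    """Run one Monte Carlo estimate per candidate bitwidth."""
    return {bits: estimate_ber(bits, sigma, **kwargs) for bits in bitwidths}
```

A call such as `sweep_bitwidths([3, 4, 6], sigma=0.8)` returns one estimated error rate per bitwidth. Every frame is independent of the others, which is the property that makes this kind of workload a candidate for multicore CPUs, GPUs, or FPGAs.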
Abstract:
The design, fabrication, and measured results are presented for a reconfigurable reflectarray antenna based on liquid crystals (LCs) which operates above 100 GHz. The antenna has been designed to provide beam scanning capabilities over a wide angular range, a large bandwidth, and reduced side-lobe level (SLL). Measured radiation patterns are in good agreement with simulations, and show that the antenna generates an electronically steerable beam in one plane over an angular range of 55° in the frequency band from 96 to 104 GHz. The SLL is lower than −13 dB for all the scan angles and −18 dB is obtained over 16% of the scan range. The measured performance is significantly better than previously published results for this class of electronically tunable antenna, and moreover, verifies the accuracy of the proposed procedure for LC modeling and antenna design.
Abstract:
This paper presents a new variant of broadband Doherty power amplifier that employs a novel output combiner. A new parameter α is introduced to permit a generalized analysis of the recently reported Parallel Doherty power amplifier (PDPA), and hence offer design flexibility. The circuit prototype of the new DPA fabricated using GaN devices exhibits maximum drain efficiency of 85% at 43-dBm peak power and 63% at 6-dB backoff power (BOP). Measured drain efficiencies of >60% at peak power across a 500-MHz frequency range and >50% at 6-dB BOP across a 480-MHz frequency range were achieved, confirming the theoretical wideband characteristics of the new DPA.
Abstract:
As a newly invented parallel kinematic machine (PKM), Exechon has attracted intensive attention from both academic and industrial fields due to its conceptual high performance. Nevertheless, the dynamic behaviors of Exechon PKM have not been thoroughly investigated because of its structural and kinematic complexities. To identify the dynamic characteristics of Exechon PKM, an elastodynamic model is proposed with the substructure synthesis technique in this paper. The Exechon PKM is divided into a moving platform subsystem, a fixed base subsystem and three limb subsystems according to its structural features. Differential equations of motion for the limb subsystem are derived through finite element (FE) formulations by modeling the complex limb structure as a spatial beam with corresponding geometric cross sections. Meanwhile, revolute, universal, and spherical joints are simplified into virtual lumped springs associated with equivalent stiffnesses and mass at their geometric centers. Differential equations of motion for the moving platform are derived with Newton's second law after treating the platform as a rigid body due to its comparatively high rigidity. After introducing the deformation compatibility conditions between the platform and the limbs, governing differential equations of motion for Exechon PKM are derived. The solution to characteristic equations leads to natural frequencies and corresponding modal shapes of the PKM at any typical configuration. In order to predict the dynamic behaviors in a quick manner, an algorithm is proposed to numerically compute the distributions of natural frequencies throughout the workspace. Simulation results reveal that the lower natural frequencies are strongly position-dependent and distributed axial-symmetrically due to the structure symmetry of the limbs. 
At the last stage, a parametric analysis is carried out to identify the effects of structural, dimensional, and stiffness parameters on the system's dynamic characteristics with the purpose of providing useful information for optimal design and performance improvement of the Exechon PKM. The elastodynamic modeling methodology and dynamic analysis procedure can be well extended to other overconstrained PKMs with minor modifications.
Abstract:
Importance: Limited information exists about the epidemiology, recognition, management, and outcomes of patients with the acute respiratory distress syndrome (ARDS).
Objectives: To evaluate intensive care unit (ICU) incidence and outcome of ARDS and to assess clinician recognition, ventilation management, and use of adjuncts, such as prone positioning, in routine clinical practice for patients fulfilling the ARDS Berlin Definition.
Design, Setting, and Participants: The Large Observational Study to Understand the Global Impact of Severe Acute Respiratory Failure (LUNG SAFE) was an international, multicenter, prospective cohort study of patients undergoing invasive or noninvasive ventilation, conducted during 4 consecutive weeks in the winter of 2014 in a convenience sample of 459 ICUs from 50 countries across 5 continents.
Exposures: Acute respiratory distress syndrome.
Main Outcomes and Measures: The primary outcome was ICU incidence of ARDS. Secondary outcomes included assessment of clinician recognition of ARDS, the application of ventilatory management, the use of adjunctive interventions in routine clinical practice, and clinical outcomes from ARDS.
Results: Of 29 144 patients admitted to participating ICUs, 3022 (10.4%) fulfilled ARDS criteria. Of these, 2377 patients developed ARDS in the first 48 hours and had their respiratory failure managed with invasive mechanical ventilation. The period prevalence of mild ARDS was 30.0% (95% CI, 28.2%-31.9%); of moderate ARDS, 46.6% (95% CI, 44.5%-48.6%); and of severe ARDS, 23.4% (95% CI, 21.7%-25.2%). ARDS represented 0.42 cases per ICU bed over 4 weeks and represented 10.4% (95% CI, 10.0%-10.7%) of ICU admissions and 23.4% of patients requiring mechanical ventilation. Clinical recognition of ARDS ranged from 51.3% (95% CI, 47.5%-55.0%) in mild to 78.5% (95% CI, 74.8%-81.8%) in severe ARDS. Less than two-thirds of patients with ARDS received a tidal volume of 8 mL/kg or less of predicted body weight. Plateau pressure was measured in 40.1% (95% CI, 38.2%-42.1%), whereas 82.6% (95% CI, 81.0%-84.1%) received a positive end-expiratory pressure (PEEP) of less than 12 cm H2O. Prone positioning was used in 16.3% (95% CI, 13.7%-19.2%) of patients with severe ARDS. Clinician recognition of ARDS was associated with higher PEEP, greater use of neuromuscular blockade, and prone positioning. Hospital mortality was 34.9% (95% CI, 31.4%-38.5%) for those with mild, 40.3% (95% CI, 37.4%-43.3%) for those with moderate, and 46.1% (95% CI, 41.9%-50.4%) for those with severe ARDS.
Conclusions and Relevance: Among ICUs in 50 countries, the period prevalence of ARDS was 10.4% of ICU admissions. This syndrome appeared to be underrecognized and undertreated and associated with a high mortality rate. These findings indicate the potential for improvement in the management of patients with ARDS.
Abstract:
In this paper, a novel nanolens with super resolution, based on the photon nanojet effect through dielectric nanostructures in visible wavelengths, is proposed. The nanolens is made from plastic SU-8, consisting of parallel semi-cylinders in an array. This paper focuses on the lens designed by numerical simulation with the finite-difference time domain method and nanofabrication of the lens by grayscale electron beam lithography combined with a casting/bonding/lift-off transfer process. Monte Carlo simulation for injected charge distribution and development modeling was applied to define the resultant 3D profile in PMMA as the template for the lens shape. After the casting/bonding/lift-off process, the fabricated nanolens in SU-8 has the desired lens shape, very close to that of PMMA, indicating that the pattern transfer process developed in this work can be reliably applied not only for the fabrication of the lens but also for other 3D nanopatterns in general. The light distribution through the lens near its surface was initially characterized by a scanning near-field optical microscope, showing a well defined focusing image of designed grating lines. Such focusing function supports the great prospects of developing a novel nanolithography based on the photon nanojet effect.
Abstract:
An analysis of the operation of a new series-L/parallel-tuned Class-E amplifier and its equivalence to the classic shunt-C/series-tuned Class-E amplifier are presented. The first reported closed-form design equations for the series-L/parallel-tuned topology operating under ideal switching conditions are given, including the switch current and voltage in steady state, the circuit component values, the peak values of switch current and voltage, and the power-output capability. Theoretical analysis is confirmed by numerical simulation of a 500 mW (27 dBm), 10% bandwidth, 5 V Class-E power amplifier operating at 2.5 GHz, first in the series-L/parallel-tuned and then in the shunt-C/series-tuned configuration. Excellent agreement between theory and simulation results is achieved.
Abstract:
In the highly competitive world of modern finance, new derivatives are continually required to take advantage of changes in financial markets, and to hedge businesses against new risks. The research described in this paper aims to accelerate the development and pricing of new derivatives in two different ways. Firstly, new derivatives can be specified mathematically within a general framework, enabling new mathematical formulae to be specified rather than just new parameter settings. This Generic Pricing Engine (GPE) is expressively powerful enough to specify a wide range of standard pricing engines. Secondly, the associated price simulation using the Monte Carlo method is accelerated using GPU or multicore hardware. The parallel implementation (in OpenCL) is automatically derived from the mathematical description of the derivative. As a test, for a Basket Option Pricing Engine (BOPE) generated using the GPE, on the largest problem size, an NVidia GPU runs the generated pricing engine at 45 times the speed of a sequential, specific hand-coded implementation of the same BOPE. Thus a user can more rapidly devise, simulate and experiment with new derivatives without actual programming.
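For orientation, a sequential Monte Carlo basket option pricer can be sketched in a few lines of Python. This is not the code generated by the GPE, and the abstract does not state the model used. The sketch assumes a European call on a weighted basket, uncorrelated assets following geometric Brownian motion, and a hypothetical function name `basket_call_price`.

```python
import math
import random

def basket_call_price(spots, weights, vols, strike, rate, maturity,
                      n_paths=20_000, seed=0):
    """Monte Carlo price of a European basket call (assumed, simplified model)."""
    rng = random.Random(seed)
    discount = math.exp(-rate * maturity)
    total = 0.0
    for _ in range(n_paths):
        basket = 0.0
        for spot, weight, vol in zip(spots, weights, vols):
            z = rng.gauss(0.0, 1.0)       # independent shock per asset (assumption)
            drift = (rate - 0.5 * vol * vol) * maturity
            terminal = spot * math.exp(drift + vol * math.sqrt(maturity) * z)
            basket += weight * terminal
        total += max(basket - strike, 0.0)  # call payoff on the basket value
    # Discounted average payoff over all simulated paths.
    return discount * total / n_paths
```

Each path is computed independently of the others, so the outer loop is the natural unit of parallel work. That independence is what a GPU or multicore implementation of a Monte Carlo pricer would typically exploit.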
Abstract:
In this paper we advocate the Loop-of-stencil-reduce pattern as a way to simplify the parallel programming of heterogeneous platforms (multicore+GPUs). Loop-of-Stencil-reduce is general enough to subsume map, reduce, map-reduce, stencil, stencil-reduce, and, crucially, their usage in a loop. It transparently targets (by using OpenCL) combinations of CPU cores and GPUs, and it makes it possible to simplify the deployment of a single stencil computation kernel on different GPUs. The paper discusses the implementation of Loop-of-stencil-reduce within the FastFlow parallel framework, considering a simple iterative data-parallel application as running example (Game of Life) and a highly effective parallel filter for visual data restoration to assess performance. Thanks to the high-level design of the Loop-of-stencil-reduce, it was possible to run the filter seamlessly on a multicore machine, on multi-GPUs, and on both.
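The structure of the pattern can be sketched sequentially in Python, using the Game of Life as in the abstract. This is not the FastFlow API and does not use OpenCL. The function names, the argument list, and the toroidal boundary are assumptions made for illustration only.

```python
from functools import reduce

def loop_of_stencil_reduce(grid, stencil, combine, init, keep_going, max_iters=100):
    """Repeat: apply `stencil` to every cell, reduce the new grid with `combine`,
    and continue while `keep_going(reduced, iteration)` holds."""
    reduced = init
    for iteration in range(max_iters):
        rows, cols = len(grid), len(grid[0])
        # Stencil phase: each output cell reads only the old grid, so this
        # double loop is the part a parallel backend could distribute.
        grid = [[stencil(grid, r, c) for c in range(cols)] for r in range(rows)]
        # Reduce phase: fold all cells of the new grid into a single value.
        reduced = reduce(combine, (cell for row in grid for cell in row), init)
        if not keep_going(reduced, iteration):
            break
    return grid, reduced

def life_stencil(grid, r, c):
    """Game of Life rule on a toroidal grid (wrap-around is an assumption)."""
    rows, cols = len(grid), len(grid[0])
    neighbours = sum(
        grid[(r + dr) % rows][(c + dc) % cols]
        for dr in (-1, 0, 1) for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    return 1 if neighbours == 3 or (grid[r][c] == 1 and neighbours == 2) else 0

# A horizontal "blinker" on a 5x5 board.
blinker = [[0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0],
           [0, 1, 1, 1, 0],
           [0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0]]

final_grid, alive = loop_of_stencil_reduce(
    blinker, life_stencil, combine=lambda a, b: a + b, init=0,
    keep_going=lambda alive_count, _: alive_count > 0, max_iters=2)
```

Here the reduction counts live cells and the loop stops early if the board dies out. Map, reduce, and plain stencil fall out as special cases by choosing a stencil that ignores neighbours, a single iteration, or a trivial reduction.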
Abstract:
We advocate the Loop-of-stencil-reduce pattern as a means of simplifying the implementation of data-parallel programs on heterogeneous multi-core platforms. Loop-of-stencil-reduce is general enough to subsume map, reduce, map-reduce, stencil, stencil-reduce, and, crucially, their usage in a loop in both data-parallel and streaming applications, or a combination of both. The pattern makes it possible to deploy a single stencil computation kernel on different GPUs. We discuss the implementation of Loop-of-stencil-reduce in FastFlow, a framework for the implementation of applications based on the parallel patterns. Experiments are presented to illustrate the use of Loop-of-stencil-reduce in developing data-parallel kernels running on heterogeneous systems.