274 results for parallel processing
Abstract:
We study the problem of minimizing total completion time on single and parallel batch processing machines. A batch processing machine is one which can process up to B jobs simultaneously. The processing time of a batch is equal to the largest processing time among all jobs in the batch. This problem is motivated by burn-in operations in the final testing stage of semiconductor manufacturing and is expected to occur in other production environments. We provide an exact solution procedure for the single-machine problem and heuristic algorithms for both single and parallel machine problems. While the exact algorithms have limited applicability due to high computational requirements, extensive experiments show that the heuristics are capable of consistently obtaining near-optimal solutions in very reasonable CPU times.
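As a rough illustration of the objective described above (not the exact procedure or the heuristics of the paper), the following sketch computes the total completion time on a single batch processing machine for a given sequence of batches. It also applies a simple greedy rule, assumed here purely for illustration: sort jobs by processing time and fill batches of up to B jobs in that order. The function names are hypothetical.

```python
from typing import List


def total_completion_time(batches: List[List[float]]) -> float:
    """Sum of job completion times when batches run in the given order.

    Every job in a batch completes when the batch completes, and a batch
    takes as long as its longest job.
    """
    clock = 0.0
    total = 0.0
    for batch in batches:
        clock += max(batch)          # batch time = largest job time in it
        total += clock * len(batch)  # each job in the batch finishes now
    return total


def greedy_spt_batches(times: List[float], capacity: int) -> List[List[float]]:
    """Hypothetical heuristic: sort by processing time, fill batches in order."""
    ordered = sorted(times)
    return [ordered[i:i + capacity] for i in range(0, len(ordered), capacity)]


jobs = [3.0, 1.0, 4.0, 2.0, 5.0]
schedule = greedy_spt_batches(jobs, capacity=2)
print(schedule, total_completion_time(schedule))
```

Filling every batch is not always best: for two jobs with processing times 1 and 10 and B = 2, running them in separate batches gives a total completion time of 12, while a single batch gives 20. This is one reason the problem calls for more careful procedures than a greedy fill.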
Abstract:
Kinetic data on the inhibition of protein synthesis in thymocytes by three abrins and ricin have been obtained. The intrinsic efficiencies of the A chains of the four toxins in inactivating ribosomes, as analyzed by k1-versus-concentration plots, were abrin II, III > ricin > abrin I. The lag times were 90, 66, 75 and 105 min at a 0.0744 nM concentration of each of abrin I, II, III and ricin, respectively. To account for the observed differences in the dose-dependent lag time, functional and structural variables of the toxins, such as the binding efficiency of the B chains to receptors and low-pH-induced structural alterations, have been analyzed. The association constants obtained by stopped-flow studies showed that abrin I (4.13 × 10⁵ M⁻¹ s⁻¹) associates with the putative receptor (4-methylumbelliferyl-α-D-galactoside) nearly twice as fast as abrin III (2.6 × 10⁵ M⁻¹ s⁻¹) at 20°C. Equilibrium binding constants of abrin I and II to thymocytes at 37°C were 2.26 × 10⁷ M⁻¹ and 2.8 × 10⁷ M⁻¹, respectively. pH-induced structural alterations, as studied by a parallel enhancement in 8-anilino-1-naphthalene sulfonate fluorescence, revealed a high degree of qualitative similarity. These results, taken with a nearly identical concentration-independent lag time (minimum lag of 41–42 min), indicated that the binding efficiencies and internalization efficiencies of these toxins are the same and that the observed difference in the dose-dependent lag time is causally related to the proposed processing event. The rates of reduction of the inter-subunit disulfide bond, an obligatory step in the intoxication process, have been measured and compared under a variety of conditions. Inter-subunit disulfide reduction of abrin I is fourfold faster than that of abrin II at pH 7.2. The rate of disulfide reduction in abrin I could be decreased 11-fold by adding lactose, compared to that without lactose.
The observed differences in the efficiencies of A chains, the dose-dependent lag period, the modulating effect of lactose on the rates of disulfide reduction and similarity in binding properties make the variants a valuable tool to probe the processing events in toxin transport in detail.
Abstract:
Workstation clusters equipped with high-performance interconnects that have programmable network processors offer interesting opportunities to enhance the performance of parallel applications run on them. In this paper, we propose schemes where certain application-level processing in parallel database query execution is performed on the network processor. We evaluate the performance of TPC-H queries executing on a high-end cluster where all tuple processing is done on the host processor, using a timed Petri net model, and find that tuple processing costs on the host processor dominate the execution time. These results are validated using a small cluster. We therefore propose 4 schemes where certain tuple processing activity is offloaded to the network processor. The first 2 schemes offload the tuple splitting activity (the computation that identifies the node on which to process each tuple), resulting in an execution time speedup of 1.09 relative to the base scheme, but with the I/O bus becoming the bottleneck resource. In the 3rd scheme, in addition to offloading tuple processing activity, the disk and network interface are combined to avoid the I/O bus bottleneck, which results in speedups of up to 1.16, but with high host processor utilization. Our 4th scheme, where the network processor also performs a part of the join operation along with the host processor, gives a speedup of 1.47 along with balanced system resource utilization. Further, we observe that the proposed schemes perform equally well even in a scaled architecture, i.e., when the number of processors is increased from 2 to 64.
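To make the tuple splitting step concrete, here is a minimal sketch of how rows might be assigned to nodes by hashing a join key. The abstract does not specify the partitioning function, so the CRC32-based hash, the row layout, and the function names below are assumptions for illustration only. The sketch runs on the host and says nothing about how the work would be placed on a network processor.

```python
import zlib
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Assumed row layout: (join_key, payload)
Row = Tuple[int, str]


def target_node(join_key: int, num_nodes: int) -> int:
    """Map a join key to a node index with a deterministic hash (assumed scheme)."""
    return zlib.crc32(str(join_key).encode("utf-8")) % num_nodes


def split_tuples(rows: Iterable[Row], num_nodes: int) -> Dict[int, List[Row]]:
    """Group rows by the node that should process them."""
    parts: Dict[int, List[Row]] = defaultdict(list)
    for row in rows:
        parts[target_node(row[0], num_nodes)].append(row)
    return dict(parts)


rows = [(1, "a"), (2, "b"), (1, "c"), (7, "d")]
for node, part in sorted(split_tuples(rows, num_nodes=4).items()):
    print(node, part)
```

Because the hash is deterministic, rows with equal join keys always land on the same node, which is the property a partitioned join relies on.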
Abstract:
Parallel sub-word recognition (PSWR) is a new model that has been proposed for language identification (LID) and that does not need elaborate phonetic labeling of the speech data in a foreign language. The new approach performs a front-end tokenization in terms of sub-word units, which are designed by automatic segmentation, segment clustering and segment HMM modeling. We develop PSWR-based LID in a framework similar to the parallel phone recognition (PPR) approach in the literature. This includes a front-end tokenizer and a back-end language model for each language to be identified. Considering various combinations of the statistical evaluation scores, it is found that PSWR can perform as well as PPR, even with broad acoustic sub-word tokenization, thus making it an efficient alternative to the PPR system.
Abstract:
Many common activities, like reading, scanning scenes, or searching for an inconspicuous item in a cluttered environment, entail serial movements of the eyes that shift the gaze from one object to another. Previous studies have shown that the primate brain is capable of programming sequential saccadic eye movements in parallel. Given that the onset of saccades directed to a target is unpredictable in individual trials, it remains unclear what prevents a saccade during parallel programming from being executed in the direction of the second target before another saccade is executed in the direction of the first target. Using a computational model, here we demonstrate that sequential saccades inhibit each other and share the brain's limited processing resources (capacity) so that the planning of a saccade in the direction of the first target always finishes first. In this framework, the latency of a saccade increases linearly with the fraction of capacity allocated to the other saccade in the sequence, and exponentially with the duration of capacity sharing. Our study establishes a link between the dual-task paradigm and the ramp-to-threshold model of response time to identify a physiologically viable mechanism that preserves the serial order of saccades without compromising the speed of performance.
Abstract:
With the proliferation of chip multicores (CMPs) on desktops and embedded platforms, multi-threaded programs have become ubiquitous. The existence of multiple threads may cause resource contention, for example in the on-chip shared cache and interconnects, depending upon how the threads access resources. Hence, we propose a tool, Thread Contention Predictor (TCP), to help quantify the number of threads sharing data and their sharing pattern. We demonstrate its use to predict a more profitable shared, last-level on-chip cache (LLC) access policy on CMPs. Our cache configuration predictor is 2.2 times faster than cycle-accurate simulations. We also demonstrate its use for identifying hot data structures in a program which may cause performance degradation due to false data sharing. We fix the layout of such data structures and show up to 10% and 18% improvement in execution time and energy-delay product (EDP), respectively.
Abstract:
An area-efficient, wideband RF frequency synthesizer, which simultaneously generates multiple local oscillator (LO) signals, is designed. It is suitable for parallel wideband RF spectrum sensing in cognitive radios. The frequency synthesizer consists of an injection locked oscillator cascade (ILOC) where all the LO signals are derived from a single reference oscillator. The ILOC is implemented in a 130-nm technology with an active area of . It generates 4 uniformly spaced LO carrier frequencies from 500 MHz to 2 GHz. This design is the first known implementation of a CMOS based ILOC for wide-band RF spectrum sensing applications.
Abstract:
In this paper, we propose a new load distribution strategy called 'send-and-receive' for scheduling divisible loads in a linear network of processors with communication delay. This strategy is designed to optimally utilize the network resources and thereby minimize the processing time of the entire processing load. Closed-form expressions for the optimal size of the load fractions and the processing time are derived for the cases where the processing load originates at a processor located at the boundary or in the interior of the network. A condition on processor and link speed is also derived to ensure that the processors are continuously engaged in load distribution. This paper also presents a parallel implementation of the 'digital watermarking problem' on a personal computer-based Pentium Linear Network (PLN) topology. Experiments are carried out to study the performance of the proposed strategy, and the results are compared with other strategies found in the literature.
Abstract:
The unsteady incompressible viscous fluid flow between two parallel infinite disks which are located at a distance $h(t^*)$ at time $t^*$ has been studied. The upper disk moves towards the lower disk with velocity $h'(t^*)$. The lower disk is porous and rotates with angular velocity $\Omega(t^*)$. A magnetic field $B(t^*)$ is applied perpendicular to the two disks. It has been found that the governing Navier-Stokes equations reduce to a set of ordinary differential equations if $h(t^*)$, $\Omega(t^*)$ and $B(t^*)$ vary with time $t^*$ in a particular manner, i.e. $h(t^*) = H(1 - \alpha t^*)^{1/2}$, $\Omega(t^*) = \Omega_0 (1 - \alpha t^*)^{-1}$, $B(t^*) = B_0 (1 - \alpha t^*)^{-1/2}$. These ordinary differential equations have been solved numerically using a shooting method. For small Reynolds numbers, analytical solutions have been obtained using a regular perturbation technique. The effects of the squeeze Reynolds number, the Hartmann number and the rotation of the disk on the flow pattern, the normal force or load, and the torque have been studied in detail.
Abstract:
The development of a microstructure in 304L stainless steel during industrial hot-forming operations, including press forging (mean strain rate of 0.15 s⁻¹), rolling/extrusion (2-5 s⁻¹), and hammer forging (100 s⁻¹) at different temperatures in the range 600-1200 °C, was studied with a view to validating the predictions of the processing map. The results have shown that excellent correlation exists between the regimes exhibited by the map and the product microstructures. 304L stainless steel exhibits instability bands when hammer forged at temperatures below 1100 °C, rolled/extruded below 1000 °C, or press forged below 800 °C. All of these conditions must be avoided in mechanical processing of the material. On the other hand, ideally, the material may be rolled, extruded, or press forged at 1200 °C to obtain a defect-free microstructure.
Abstract:
In this paper, we first recast the generalized symmetric eigenvalue problem, where the underlying matrix pencil consists of symmetric positive definite matrices, into an unconstrained minimization problem by constructing an appropriate cost function. We then extend it to the case of multiple eigenvectors using an inflation technique. Based on this asymptotic formulation, we derive a quasi-Newton-based adaptive algorithm for estimating the required generalized eigenvectors in the data case. The resulting algorithm is modular and parallel, and it is globally convergent with probability one. We also analyze the effect of inexact inflation on the convergence of this algorithm and that of inexact knowledge of one of the matrices (in the pencil) on the resulting eigenstructure. Simulation results demonstrate that the performance of this algorithm is almost identical to that of the rank-one updating algorithm of Karasalo. Further, the performance of the proposed algorithm has been found to remain stable even over 1 million updates without suffering from any error accumulation problems.
Diffraction Of Elastic Waves By Two Parallel Rigid Strips Embedded In An Infinite Orthotropic Medium
Abstract:
The elastodynamic response of a pair of parallel rigid strips embedded in an infinite orthotropic medium due to elastic waves incident normally on the strips has been investigated. The mixed boundary value problem has been solved by the Integral Equation method. The normal stress and the vertical displacement have been derived in closed form. Numerical values of stress intensity factors at inner and outer edges of the strips and vertical displacement at points in the plane of the strips for several orthotropic materials have been calculated and plotted graphically to show the effect of material orthotropy.
Abstract:
The hot deformation behavior of hot isostatically pressed (HIPd) P/M IN-100 superalloy has been studied in the temperature range 1000-1200 °C and strain rate range 0.0003-10 s⁻¹ using hot compression testing. A processing map has been developed on the basis of these data and using the principles of dynamic materials modelling. The map exhibited three domains: one at 1050 °C and 0.01 s⁻¹, with a peak efficiency of power dissipation of approximately 32%, the second at 1150 °C and 10 s⁻¹, with a peak efficiency of approximately 36%, and the third at 1200 °C and 0.1 s⁻¹, with a similar efficiency. On the basis of optical and electron microscopic observations, the first domain was interpreted to represent dynamic recovery of the γ phase, the second domain represents dynamic recrystallization (DRX) of γ in the presence of softer γ′, while the third domain represents DRX of the γ phase only. The γ′ phase is stable up to 1150 °C and is deformed below this temperature, and the chunky γ′ accumulates dislocations, which at larger strains cause cracking of this phase. At temperatures lower than 1080 °C and strain rates higher than 0.1 s⁻¹, the material exhibits flow instability, manifested in the form of adiabatic shear bands. The material may be subjected to mechanical processing without cracking or instabilities at 1200 °C and 0.1 s⁻¹, which are the conditions for DRX of the γ phase.
Abstract:
An overview of the synthesis of materials under microwave irradiation is presented, based on recently performed work. A variety of reactions such as direct combination, carbothermal reduction, carbidation and nitridation are described. Examples of microwave preparation of glasses are also presented. The great advantages of microwave methods, namely fast, clean reactions at reduced reaction temperatures, are emphasized. The example of ZrO2-CeO2 ceramics has been used to show the extraordinarily fast and effective sintering which occurs under microwave irradiation.
Abstract:
Power dissipation maps have been generated in the temperature range of 900 °C to 1150 °C and strain rate range of 10⁻³ to 10 s⁻¹ for a cast aluminide alloy Ti-24Al-20Nb using the dynamic material model. The results define two distinct regimes of temperature and strain rate in which the efficiency of power dissipation is maximum. The first region, centered around 975 °C/0.1 s⁻¹, is shown to correspond to dynamic recrystallization of the α₂ phase and the second, centered around 1150 °C/0.001 s⁻¹, corresponds to dynamic recovery and superplastic deformation of the β phase. Thermal activation analysis using the power law creep equation yielded apparent activation energies of 854 and 627 kJ/mol for the first and second regimes, respectively. Reanalyzing the data by alternate methods yielded activation energies in the range of 170 to 220 kJ/mol and 220 to 270 kJ/mol for the first and second regimes, respectively. Cross slip was shown to constitute the activation barrier in both cases. Two distinct regimes of processing instability, one at high strain rates and the other at low strain rates in the lower temperature regions, have been identified, within which shear bands are formed.