Digital Library

205 results for parallel applications

Symmetrizing a Hessenberg matrix: Designs for VLSI parallel processor arrays

Relevance:

30.00%

Publisher:

Abstract:

A symmetrizer of a nonsymmetric matrix A is a symmetric matrix X that satisfies the equation $XA = A^{t}X$, where t denotes the transpose. A symmetrizer is useful for converting a nonsymmetric eigenvalue problem into a symmetric one, which is relatively easy to solve, and it finds applications in stability problems in control theory and in the study of general matrices. Three designs based on VLSI parallel processor arrays are presented for computing a symmetrizer of a lower Hessenberg matrix, and their scope is discussed. The first is the Leiserson systolic design; the remaining two, the double pipe design and the fitted diagonal design, are versions derived from the first with improved performance.
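As a small illustration of the defining equation, the following Python sketch checks whether a candidate X is a symmetrizer of A. The 2x2 matrices and the helper names are illustrative assumptions, not taken from the paper, and the sketch does not reproduce any of the three VLSI designs.

```python
def matmul(P, Q):
    """Multiply two square matrices given as lists of rows."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(P):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*P)]


def is_symmetrizer(X, A, tol=1e-12):
    """Return True if X is symmetric and X A = A^t X (within tol)."""
    n = len(A)
    # X itself must be symmetric.
    if any(abs(X[i][j] - X[j][i]) > tol for i in range(n) for j in range(n)):
        return False
    left = matmul(X, A)
    right = matmul(transpose(A), X)
    return all(abs(left[i][j] - right[i][j]) <= tol
               for i in range(n) for j in range(n))


# Hypothetical example: a nonsymmetric A and one of its symmetrizers.
A = [[1, 2], [3, 4]]
X = [[3, 0], [0, 2]]
print(is_symmetrizer(X, A))  # True
print(matmul(X, A))          # [[3, 6], [6, 8]], a symmetric product
```

For a symmetric X, the condition is equivalent to the product XA being symmetric, which is what makes the transformed eigenvalue problem symmetric.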

Modeling of the cooling process on the runout table of a hot strip mill - A parallel approach

Relevance:

30.00%

Publisher:

Abstract:

This paper deals with the development of a new model for the cooling process on the runout table of hot strip mills. The suitability of different numerical methods for solving the proposed model equation is studied from the point of view of accuracy and computation time. Parallel solutions for the model equation are proposed.

A new parallel overlapped domain decomposition method for nonlinear dynamic finite element analysis

Relevance:

30.00%

Publisher:

Abstract:

In this paper, a new parallel algorithm for nonlinear transient dynamic analysis of large structures is presented. An unconditionally stable Newmark-beta method (constant average acceleration technique) is employed for time integration. The proposed parallel algorithm has been devised within the broad framework of domain decomposition techniques. However, unlike most existing parallel algorithms for structural dynamic applications, which are derived using nonoverlapped domains, the proposed algorithm uses overlapped domains. It is formulated by splitting the mass, damping and stiffness matrices arising from the finite element discretisation of a given structure. A predictor-corrector scheme has been formulated for iteratively improving the solution in each step. A computer program based on the proposed algorithm has been developed and implemented with the message passing interface as the software development environment. A PARAM-10000 MIMD parallel computer has been used to evaluate the performance. Numerical experiments have been conducted to validate the proposed parallel algorithm and to evaluate its performance. Comparisons have been made with conventional nonoverlapped domain decomposition algorithms. Numerical studies indicate that the proposed algorithm is superior in performance to the conventional domain decomposition algorithms. (C) 2003 Elsevier Ltd. All rights reserved.
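For readers unfamiliar with the time integrator named above, here is a minimal Python sketch of one Newmark-beta step with the constant average acceleration parameters (beta = 1/4, gamma = 1/2). It assumes a linear single-degree-of-freedom system m*a + c*v + k*u = f; the function names and parameter values are hypothetical, and the sketch does not attempt the paper's nonlinear, overlapped domain decomposition algorithm.

```python
def newmark_step(u, v, a, f_next, m, c, k, dt, beta=0.25, gamma=0.5):
    """Advance one step of m*a + c*v + k*u = f with the Newmark-beta method."""
    # Predictors: the parts of u and v that do not depend on the new acceleration.
    u_pred = u + dt * v + dt * dt * (0.5 - beta) * a
    v_pred = v + dt * (1.0 - gamma) * a
    # Enforce equilibrium at the new time level to obtain the new acceleration.
    a_next = (f_next - c * v_pred - k * u_pred) / (m + gamma * dt * c + beta * dt * dt * k)
    # Correctors: add the contribution of the new acceleration.
    u_next = u_pred + beta * dt * dt * a_next
    v_next = v_pred + gamma * dt * a_next
    return u_next, v_next, a_next


def simulate(steps, dt=0.01, m=1.0, c=0.0, k=1.0, u0=1.0, v0=0.0):
    """Free vibration (no external load) starting from u0, v0."""
    u, v = u0, v0
    a = (0.0 - c * v - k * u) / m  # initial acceleration from equilibrium
    for _ in range(steps):
        u, v, a = newmark_step(u, v, a, 0.0, m, c, k, dt)
    return u, v


u, v = simulate(1000)
energy = 0.5 * v * v + 0.5 * u * u  # kinetic + strain energy for m = k = 1
print(u, v, energy)
```

With these parameters the scheme preserves the energy of a linear undamped oscillator up to round-off, which is one way to observe its unconditional stability in practice.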

A Solution Framework for Discrete Parallel Processors Scheduling Problem with Weighted Flow time Minimization

Relevance:

30.00%

Publisher:

Folded cube-connected cycles: A new interconnection topology for massively parallel systems

Relevance:

30.00%

Publisher:

Integrated parallelization of computations and visualization for large-scale applications

Relevance:

30.00%

Publisher:

Abstract:

Critical applications like cyclone tracking and earthquake modeling require simultaneous high-performance simulations and online visualization for timely analysis. Faster simulations and simultaneous visualization enable scientists to provide real-time guidance to decision makers. In this work, we have developed an integrated user-driven and automated steering framework that simultaneously performs numerical simulations and efficient online remote visualization of critical weather applications in resource-constrained environments. It considers application dynamics, such as the criticality of the application, and resource dynamics, such as the storage space, network bandwidth and number of available processors, to adapt application and resource parameters such as simulation resolution, simulation rate and frequency of visualization. We formulate the problem of finding an optimal set of simulation parameters as a linear programming problem. This leads to a 30% higher simulation rate and 25-50% lower storage consumption than a naive greedy approach. The framework also gives the user control over application parameters such as region of interest and simulation resolution. We have also devised an adaptive algorithm to reduce the lag between the simulation and visualization times. Using experiments with different network bandwidths, we find that our adaptive algorithm is able to reduce the lag as well as visualize the most representative frames.

MPI-based parallel synchronous vector evaluated particle swarm optimization for multi-objective design optimization of composite structures

Relevance:

30.00%

Publisher:

Abstract:

This paper presents a decentralized, peer-to-peer architecture-based parallel version of the vector evaluated particle swarm optimization (VEPSO) algorithm for multi-objective design optimization of laminated composite plates using the message passing interface (MPI). The design optimization of laminated composite plates is a combinatorially explosive constrained non-linear optimization problem (CNOP) with many design variables and a vast solution space, which warrants the use of non-parametric and heuristic optimization algorithms like PSO. Optimization requires minimizing both the weight and the cost of these composite plates simultaneously, which renders the problem multi-objective. Hence VEPSO, a multi-objective variant of the PSO algorithm, is used. Despite the use of such a heuristic, the application problem is computationally intensive and suffers from long execution times under sequential computation. Hence, a parallel version of the PSO algorithm for the problem has been developed to run on several nodes of an IBM P720 cluster. The proposed parallel algorithm uses MPI's collective communication directives to establish a peer-to-peer relationship between the constituent parallel processes, deviating from the more common master-slave approach, and reduces computation time by a factor of up to 10. Finally, we show the effectiveness of the proposed parallel algorithm by comparing it with a serial implementation of VEPSO and a parallel implementation of the vector evaluated genetic algorithm (VEGA) for the same design problem. (c) 2012 Elsevier Ltd. All rights reserved.

A dynamic bandwidth allocation scheme for interactive multimedia applications over cellular networks

Relevance:

30.00%

Publisher:

Abstract:

Cellular networks have played a key role in providing users with a high level of bandwidth by employing traditional methods such as guaranteed QoS based on application category at the radio access stratum level for various QoS classes. In addition, newer multimode phones (e.g., phones that support LTE (Long Term Evolution standard), UMTS, GSM and WIFI all at once) are capable of using multiple access methods simultaneously and can perform seamless handover among the supported technologies to remain connected. Various types of applications (including interactive ones) run on these devices, each with different QoS requirements. This work discusses how QoS (measured in terms of user-level response time, delay, jitter and transmission rate) can be achieved for interactive applications using dynamic bandwidth allocation schemes over cellular networks. We propose a dynamic bandwidth allocation scheme for interactive multimedia applications with or without background load in cellular networks. The system has been simulated for many application types running in parallel, and it has been observed that, if interactive applications are to be provided with a decent response time, the admission control policy has to be overhauled periodically, taking into account the history and criticality of applications. The results demonstrate that interactive applications can be provided with good service if the policy database at admission control is reviewed dynamically.

Minimization of grid current distortion in parallel-connected converters through carrier interleaving

Relevance:

30.00%

Publisher:

Abstract:

Identical parallel-connected converters with unequal load sharing have unequal terminal voltages. The difference in terminal voltages is more pronounced in the case of back-to-back connected converters operated in power-circulation mode for the purpose of endurance tests. In this paper, a synchronous reference frame based analysis is presented to estimate the grid current distortion in interleaved, grid-connected converters with unequal terminal voltages. The influence of the carrier interleaving angle on the rms grid current ripple is studied theoretically as well as experimentally. The optimum interleaving angle that minimizes the rms grid current ripple is investigated for different applications of parallel converters. The applications include unity power factor rectifiers, inverters for renewable energy sources, reactive power compensators, and the circulating-power test set-up used for thermal testing of high-power converters. The optimum interleaving angle is shown to be a strong function of the average of the modulation indices of the two converters, irrespective of the application. The findings are verified experimentally on two parallel-connected converters circulating reactive power of up to 150 kVA between them.

Design of a Low Power 64 Point FFT Architecture for WLAN Applications

Relevance:

30.00%

Publisher:

Abstract:

This paper presents a radix-$4^3$ based FFT architecture suitable for OFDM based WLAN applications. The radix-$4^3$ parallel unrolled architecture presented here uses a radix-4 butterfly unit which takes all four inputs in parallel and can selectively produce one of the four outputs. A 64 point FFT processor based on the proposed architecture has been implemented in a UMC 130 nm 1P8M CMOS process with a maximum clock frequency of 100 MHz and an area of 0.83 mm$^2$. The proposed processor provides a throughput of four times the clock rate and can finish one 64 point FFT computation in 16 clock cycles. For IEEE 802.11a/g WLAN, the processor needs to be operated at a clock rate of 5 MHz, with a power consumption of 2.27 mW, which is 27% less than that of previously reported low power implementations.
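The butterfly behaviour and the cycle count quoted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the butterfly is a plain 4-point DFT without the inter-stage twiddle factors of the real architecture, and the 20 MHz sample rate for 802.11a/g OFDM is an assumption that does not appear in the abstract.

```python
import cmath


def radix4_butterfly_output(x, k):
    """Return output k (0..3) of a 4-point DFT butterfly for inputs x[0..3]."""
    w = -1j  # exp(-2*pi*j/4), the 4-point twiddle factor
    return sum(x[n] * w ** (n * k) for n in range(4))


def dft_output(x, k):
    """Reference: output k of a direct N-point DFT."""
    n_points = len(x)
    return sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / n_points)
               for n in range(n_points))


FFT_POINTS = 64
SAMPLES_PER_CLOCK = 4  # throughput of four samples per clock, as stated in the abstract
cycles_per_fft = FFT_POINTS // SAMPLES_PER_CLOCK  # 16 clock cycles

# Assumption not stated in the abstract: 802.11a/g OFDM samples arrive at 20 MHz.
ASSUMED_SAMPLE_RATE_HZ = 20_000_000
required_clock_hz = ASSUMED_SAMPLE_RATE_HZ // SAMPLES_PER_CLOCK  # 5 MHz

x = [1 + 2j, -3 + 0.5j, 0.25 - 1j, 2 + 0j]
print(radix4_butterfly_output(x, 1), dft_output(x, 1))
print(cycles_per_fft, required_clock_hz)
```

Under the assumed sample rate, four samples per clock gives the 5 MHz operating clock mentioned in the abstract, and 64 points at four per clock gives the 16-cycle figure.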

PocketMatch (version 2.0): A parallel algorithm for the detection of structural similarities between protein ligand binding-sites

Relevance:

30.00%

Publisher:

Abstract:

Knowledge of protein-ligand interactions is essential for understanding several biological processes and is important for applications ranging from understanding protein function to drug discovery and protein engineering. Here, we describe an algorithm for the comparison of three-dimensional ligand-binding sites in protein structures. A previously described algorithm, PocketMatch (version 1.0), is optimised, expanded, and MPI-enabled for parallel execution. PocketMatch (version 2.0) rapidly quantifies binding-site similarity based on structural descriptors such as residue nature and interatomic distances. Atomic-scale alignments may also be obtained from the amino acid residue pairings generated. It allows an end-user to compute database-wide, all-to-all comparisons in a matter of hours. An application of the algorithm to a sample dataset, a performance analysis, and annotated source code are also included.

An open source massively parallel solver for Richards equation: Mechanistic modelling of water fluxes at the watershed scale

Relevance:

30.00%

Publisher:

Abstract:

In this paper we present a massively parallel open source solver for the Richards equation, named the RichardsFOAM solver. This solver has been developed in the framework of the open source generalist computational fluid dynamics toolbox OpenFOAM(R) and is capable of dealing with problems that are large in both space and time. The source code for RichardsFOAM may be downloaded from the CPC program library website. It exhibits good parallel performance (up to about 90% parallel efficiency with 1024 processors in both strong and weak scaling), and the conditions required for obtaining such performance are analysed and discussed. This performance enables the mechanistic modelling of water fluxes at the scale of experimental watersheds (up to a few square kilometres of surface area) and on time scales of decades to a century. Such a solver can be useful in various applications, such as environmental engineering for long-term transport of pollutants in soils, water engineering for assessing the impact of land settlement on water resources, or the study of weathering processes in watersheds. (C) 2014 Elsevier B.V. All rights reserved.
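The two efficiency figures mentioned above are usually computed differently. The Python sketch below uses the common textbook definitions, which may differ from the baseline chosen in the paper; the function names and all timings are hypothetical and are not results from the paper.

```python
def strong_scaling_efficiency(t_serial, t_parallel, n_procs):
    """Fixed total problem size: the ideal parallel time is t_serial / n_procs."""
    return t_serial / (n_procs * t_parallel)


def weak_scaling_efficiency(t_one_proc, t_n_procs):
    """Fixed problem size per processor: the ideal time stays at t_one_proc."""
    return t_one_proc / t_n_procs


# Hypothetical timings in seconds, chosen only to illustrate a 90% efficiency.
print(strong_scaling_efficiency(9216.0, 10.0, 1024))
print(weak_scaling_efficiency(90.0, 100.0))
```

In strong scaling the same problem is split across more processors, while in weak scaling the problem grows with the processor count, so a solver can score well on one measure and poorly on the other.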

A 0.5-2.0 GHz injection locked oscillator cascade for parallel wideband RF spectrum sensing

Relevance:

30.00%

Publisher:

Abstract:

An area-efficient, wideband RF frequency synthesizer, which simultaneously generates multiple local oscillator (LO) signals, is designed. It is suitable for parallel wideband RF spectrum sensing in cognitive radios. The frequency synthesizer consists of an injection locked oscillator cascade (ILOC) where all the LO signals are derived from a single reference oscillator. The ILOC is implemented in a 130-nm technology with an active area of . It generates 4 uniformly spaced LO carrier frequencies from 500 MHz to 2 GHz. This design is the first known implementation of a CMOS based ILOC for wide-band RF spectrum sensing applications.

Unsteady flow and heat transfer between rotating coaxial disks

Relevance:

20.00%

Publisher:

Abstract:

A study is made on the flow and heat transfer of a viscous fluid confined between two parallel disks. The disks are allowed to rotate with different time dependent angular velocities, and the upper disk is made to approach the lower one with a constant speed. Numerical solutions of the governing parabolic partial differential equations are obtained through a fourth-order accurate compact finite difference scheme. The normal forces and torques that the fluid exerts on the rotating surfaces are obtained at different nondimensional times for different values of the rate of squeezing and disk angular velocities. The temperature distribution and heat transfer are also investigated in the present analysis.

Unsteady-Flow Of A Viscous-Fluid Between 2 Parallel Disks With A Time-Varying Gap Width And A Magnetic-Field

Relevance:

20.00%

Publisher:

Abstract:

The unsteady flow of an incompressible viscous fluid between two parallel infinite disks separated by a distance $h(t^*)$ at time $t^*$ has been studied. The upper disk moves towards the lower disk with velocity $h'(t^*)$. The lower disk is porous and rotates with angular velocity $\Omega(t^*)$. A magnetic field $B(t^*)$ is applied perpendicular to the two disks. It has been found that the governing Navier-Stokes equations reduce to a set of ordinary differential equations if $h(t^*)$, $\Omega(t^*)$ and $B(t^*)$ vary with time $t^*$ in a particular manner, i.e. $h(t^*) = H(1 - \alpha t^*)^{1/2}$, $\Omega(t^*) = \Omega_0 (1 - \alpha t^*)^{-1}$, $B(t^*) = B_0 (1 - \alpha t^*)^{-1/2}$. These ordinary differential equations have been solved numerically using a shooting method. For small Reynolds numbers, analytical solutions have been obtained using a regular perturbation technique. The effects of the squeeze Reynolds number, the Hartmann number and the rotation of the disk on the flow pattern, the normal force or load, and the torque have been studied in detail.
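A plausible reading of these time dependences, not stated in the abstract, is that they keep the products h h', Omega h^2 and B h constant in time, so that dimensionless groups built from them (for constant fluid properties) do not vary. The Python sketch below checks this numerically; the function names and parameter values are hypothetical, and the formulas are valid only while alpha * t < 1.

```python
def gap_width(t, H, alpha):
    """h(t) = H (1 - alpha t)^(1/2)."""
    return H * (1.0 - alpha * t) ** 0.5


def gap_velocity(t, H, alpha):
    """h'(t), the time derivative of gap_width."""
    return -0.5 * alpha * H * (1.0 - alpha * t) ** -0.5


def angular_velocity(t, omega0, alpha):
    """Omega(t) = Omega_0 (1 - alpha t)^(-1)."""
    return omega0 * (1.0 - alpha * t) ** -1


def magnetic_field(t, B0, alpha):
    """B(t) = B_0 (1 - alpha t)^(-1/2)."""
    return B0 * (1.0 - alpha * t) ** -0.5


def invariants(t, H, omega0, B0, alpha):
    """Return (h h', Omega h^2, B h), which should not depend on t."""
    h = gap_width(t, H, alpha)
    return (h * gap_velocity(t, H, alpha),
            angular_velocity(t, omega0, alpha) * h * h,
            magnetic_field(t, B0, alpha) * h)


# Hypothetical parameter values, for illustration only.
print(invariants(0.0, 2.0, 3.0, 1.5, 0.5))
print(invariants(0.3, 2.0, 3.0, 1.5, 0.5))
```

Both calls should print the same triple up to round-off, namely -alpha H^2 / 2, Omega_0 H^2 and B_0 H.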
