354 results for parallel algorithm


Relevance:

20.00%

Publisher:

Abstract:

Workstation clusters equipped with high-performance interconnects that have programmable network processors offer interesting opportunities to enhance the performance of parallel applications run on them. In this paper, we propose schemes in which certain application-level processing in parallel database query execution is performed on the network processor. Using a timed Petri net model, we evaluate the performance of TPC-H queries executing on a high-end cluster where all tuple processing is done on the host processor, and find that tuple processing costs on the host processor dominate the execution time. These results are validated using a small cluster. We therefore propose four schemes in which certain tuple processing activity is offloaded to the network processor. The first two schemes offload the tuple splitting activity, that is, the computation that identifies the node on which to process each tuple, resulting in an execution time speedup of 1.09 relative to the base scheme, but with the I/O bus becoming the bottleneck resource. In the third scheme, in addition to offloading tuple processing activity, the disk and network interface are combined to avoid the I/O bus bottleneck, which results in speedups of up to 1.16, but with high host processor utilization. Our fourth scheme, in which the network processor also performs a part of the join operation along with the host processor, gives a speedup of 1.47 along with balanced system resource utilization. Further, we observe that the proposed schemes perform equally well even in a scaled architecture, i.e., when the number of processors is increased from 2 to 64.
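The tuple splitting step described in this abstract (identifying the node on which each tuple should be processed) is commonly done by hashing the join key. The sketch below is only an illustration under that assumption; the abstract does not say which partitioning function the authors use, and the names `target_node` and `split_tuples` are hypothetical.

```python
import zlib


def target_node(join_key, num_nodes):
    """Map a join key to a node index (assumed hash partitioning)."""
    # CRC32 is deterministic across runs, unlike Python's built-in hash() for strings.
    return zlib.crc32(str(join_key).encode("utf-8")) % num_nodes


def split_tuples(tuples, key_index, num_nodes):
    """Group tuples into one bucket per node according to their join key."""
    buckets = [[] for _ in range(num_nodes)]
    for row in tuples:
        buckets[target_node(row[key_index], num_nodes)].append(row)
    return buckets


rows = [(1, "a"), (2, "b"), (1, "c"), (3, "d")]
buckets = split_tuples(rows, key_index=0, num_nodes=4)
```

Because tuples with equal keys always land in the same bucket, each node can join its own partition independently of the others.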

Relevance:

20.00%

Publisher:

Abstract:

We study the problem of optimal bandwidth allocation in communication networks. We consider a queueing model with two queues to which traffic from different competing flows arrives. The queue length at the buffers is observed every T instants of time, on the basis of which a decision is made on the amount of bandwidth to be allocated to each buffer for the next T instants. We consider a class of closed-loop feedback policies for the system and use a two-timescale simultaneous perturbation stochastic approximation (SPSA) algorithm to find an optimal policy within the prescribed class. We study the performance of the proposed algorithm in a numerical setting. Our algorithm is found to exhibit good performance.
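For readers unfamiliar with SPSA, the sketch below shows the basic one-timescale form of the gradient estimate: all parameters are perturbed simultaneously by random ±1 values, and two loss evaluations yield an estimate of the whole gradient. It is not the two-timescale algorithm of the abstract, and the quadratic loss, the gain constants, and the function name `spsa_minimize` are assumptions chosen purely for illustration.

```python
import random


def spsa_minimize(loss, theta, iterations=2000, a=0.1, c=0.1, seed=0):
    """Minimize `loss` with basic SPSA (illustrative one-timescale sketch)."""
    rng = random.Random(seed)
    theta = list(theta)
    for k in range(1, iterations + 1):
        a_k = a / k ** 0.602  # decaying step size (commonly used exponent)
        c_k = c / k ** 0.101  # decaying perturbation size (commonly used exponent)
        # Perturb every coordinate at once with independent +/-1 draws.
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + c_k * d for t, d in zip(theta, delta)]
        minus = [t - c_k * d for t, d in zip(theta, delta)]
        diff = loss(plus) - loss(minus)
        # Two loss evaluations give an estimate of all gradient components.
        theta = [t - a_k * diff / (2.0 * c_k * d) for t, d in zip(theta, delta)]
    return theta


def quadratic(v):
    # Hypothetical test loss with its minimum at (1, -2).
    return (v[0] - 1.0) ** 2 + (v[1] + 2.0) ** 2


result = spsa_minimize(quadratic, [0.0, 0.0])
```

The appeal of the method is that the cost per iteration stays at two loss evaluations regardless of the number of parameters, which matters when each evaluation is a simulation run.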

Relevance:

20.00%

Publisher:

Abstract:

We develop a simulation based algorithm for finite horizon Markov decision processes with finite state and finite action space. Illustrative numerical experiments with the proposed algorithm are shown for problems in flow control of communication networks and capacity switching in semiconductor fabrication.

Relevance:

20.00%

Publisher:

Abstract:

We develop a simulation-based, two-timescale actor-critic algorithm for infinite horizon Markov decision processes with finite state and action spaces, with a discounted reward criterion. The algorithm is of the gradient ascent type and performs a search in the space of stationary randomized policies. The algorithm uses certain simultaneous deterministic perturbation stochastic approximation (SDPSA) gradient estimates for enhanced performance. We show an application of our algorithm to a problem of mortgage refinancing. Our algorithm obtains the optimal refinancing strategies in a computationally efficient manner.

Relevance:

20.00%

Publisher:

Abstract:

We address the problem of estimating the fundamental frequency of voiced speech. We present a novel solution motivated by the importance of amplitude modulation in sound processing and speech perception. The new algorithm is based on a cumulative spectrum computed from the temporal envelope of various subbands. We provide theoretical analysis to derive the new pitch estimator based on the temporal envelope of the bandpass speech signal. We report extensive experimental results for synthetic as well as natural vowels, for both real-world noisy and noise-free data. Experimental results show that the new technique performs accurate pitch estimation and is robust to noise. We also show that the technique is superior to the autocorrelation technique for pitch estimation.
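The autocorrelation technique that this abstract uses as its point of comparison can be sketched in a few lines: the pitch period is taken as the lag, within a plausible range, at which the signal correlates most strongly with a shifted copy of itself. This is only the baseline, not the envelope-based estimator the abstract proposes; the search range of 80 to 400 Hz and the synthetic 200 Hz tone are assumptions made for the example.

```python
import math


def autocorrelation_pitch(signal, sample_rate, f_min=80.0, f_max=400.0):
    """Estimate the fundamental frequency from the autocorrelation peak."""
    min_lag = int(sample_rate / f_max)  # shortest period considered
    max_lag = int(sample_rate / f_min)  # longest period considered
    best_lag, best_value = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation of the signal at this lag.
        value = sum(signal[n] * signal[n + lag] for n in range(len(signal) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag


sample_rate = 8000
# Synthetic, noise-free 200 Hz tone lasting 0.1 s.
tone = [math.sin(2 * math.pi * 200.0 * n / sample_rate) for n in range(800)]
estimate = autocorrelation_pitch(tone, sample_rate)
```

On clean periodic input the peak lag matches the period; the abstract's claim concerns how such estimators compare once noise is present.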

Relevance:

20.00%

Publisher:

Abstract:

This paper presents a novel approach for designing a fixed-gain robust power system stabilizer (PSS), with particular emphasis on achieving a minimum closed-loop performance over a wide range of operating and system conditions. The minimum performance requirements of the controller have been decided a priori and obtained by using a genetic algorithm (GA) based power system stabilizer. The proposed PSS is robust to changes in the plant parameters brought about by changes in system and operating conditions, guaranteeing a minimum performance. The efficacy of the proposed method has been tested on a multimachine system. The proposed method of tuning the PSS is an attractive alternative to conventional fixed-gain stabilizer design, as it retains the simplicity of the conventional PSS and still guarantees a robust, acceptable performance over a wider range of operating and system conditions.

Relevance:

20.00%

Publisher:

Abstract:

Modeling the performance behavior of parallel applications to predict their execution times for larger problem sizes and numbers of processors has been an active area of research for several years. The existing curve-fitting strategies for performance modeling utilize data from experiments that are conducted under uniform loading conditions. Hence the accuracy of these models degrades when the load conditions on the machines and network change. In this paper, we analyze a curve-fitting model that attempts to predict execution times for any load conditions that may exist on the systems during application execution. Based on the experiments conducted with the model for a parallel eigenvalue problem, we propose a multi-dimensional curve-fitting model based on rational polynomials for performance predictions of parallel applications in non-dedicated environments. We used the rational polynomial based model to predict execution times for two other parallel applications on systems with large load dynamics. In all cases, the model gave good predictions of execution times, with average percentage prediction errors of less than 20%.

Relevance:

20.00%

Publisher:

Abstract:

In the context of SPH-based simulations of impact dynamics, an optimised and automated form of the acceleration correction algorithm (Shaw and Reid, 2009a) is developed so as to remove spurious high frequency oscillations in computed responses whilst retaining the stabilizing characteristics of the artificial viscosity in the presence of shocks and layers with sharp gradients. A rational framework for an insightful characterisation of the erstwhile acceleration correction method is first set up. This is followed by the proposal of an optimised version of the method, wherein the strength of the correction term in the momentum balance and energy equations is optimised. For the first time, this leads to an automated procedure to arrive at the artificial viscosity term. In particular, this is achieved by taking a spatially varying response-dependent support size for the kernel function through which the correction term is computed. The optimum value of the support size is deduced by minimising the (spatially localised) total variation of the high oscillation in the acceleration term with respect to its (local) mean. The derivation of the method, its advantages over the heuristic method and issues related to its numerical implementation are discussed in detail. (C) 2011 Elsevier Ltd. All rights reserved.

Relevance:

20.00%

Publisher:

Abstract:

Long running multi-physics coupled parallel applications have gained prominence in recent years. The high computational requirements and long durations of simulations of these applications necessitate the use of multiple systems of a Grid for execution. In this paper, we have built an adaptive middleware framework for execution of long running multi-physics coupled applications across multiple batch systems of a Grid. Our framework, apart from coordinating the executions of the component jobs of an application on different batch systems, also automatically resubmits the jobs multiple times to the batch queues to continue and sustain long running executions. As the set of active batch systems available for execution changes, our framework performs migration and rescheduling of components using a robust rescheduling decision algorithm. We have used our framework for improving the application throughput of a foremost long running multi-component application for climate modeling, the Community Climate System Model (CCSM). Our real multi-site experiments with CCSM indicate that Grid executions can lead to improved application throughput for climate models.

Relevance:

20.00%

Publisher:

Abstract:

This paper presents image reconstruction using the fan-beam filtered backprojection (FBP) algorithm with no backprojection weight, applied to truncated projection data completed by windowed linear prediction (WLP). Image reconstruction from truncated projections aims to reconstruct the object accurately from the available limited projection data. Because the projection data are incomplete, the reconstructed image contains truncation artifacts that extend into the region of interest (ROI), making the reconstructed image unsuitable for further use. Data completion techniques have been shown to be effective in such situations. We use the windowed linear prediction technique for projection completion and then use the fan-beam FBP algorithm with no backprojection weight for the 2-D image reconstruction. We evaluate the quality of the image reconstructed using the fan-beam FBP algorithm with no backprojection weight after WLP completion.

Relevance:

20.00%

Publisher:

Abstract:

Simple algorithms have been developed to generate pairs of minterms forming a given 2-sum and thereby to test 2-asummability of switching functions. The 2-asummability testing procedure can be easily implemented on a computer. Since 2-asummability is a necessary and sufficient condition for a switching function of up to eight variables to be linearly separable (LS), it can be used for testing LS switching functions of up to eight variables.
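As an illustration of the property being tested, a switching function is 2-summable when some pair of true input vectors and some pair of false input vectors have the same componentwise sum, and 2-asummable otherwise. The brute-force check below enumerates all such pairs directly; it is not the minterm-pair generation procedure the abstract describes, and the function name `is_2_asummable` is hypothetical.

```python
from itertools import combinations_with_replacement, product


def is_2_asummable(f, n):
    """Return True if no pair of true points and pair of false points share a sum."""
    points = list(product((0, 1), repeat=n))
    true_pts = [p for p in points if f(p)]
    false_pts = [p for p in points if not f(p)]

    def pair_sums(pts):
        # Componentwise sums of all pairs, repetition allowed.
        return {
            tuple(a + b for a, b in zip(p, q))
            for p, q in combinations_with_replacement(pts, 2)
        }

    return pair_sums(true_pts).isdisjoint(pair_sums(false_pts))


xor_ok = is_2_asummable(lambda x: x[0] ^ x[1], 2)          # XOR: 01 + 10 == 00 + 11
and_ok = is_2_asummable(lambda x: x[0] & x[1], 2)          # AND is a threshold function
majority_ok = is_2_asummable(lambda x: sum(x) >= 2, 3)     # majority of three inputs
```

XOR fails the test because its two true points and two false points both sum to (1, 1), which matches the familiar fact that XOR is not linearly separable.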

Relevance:

20.00%

Publisher:

Abstract:

A new and efficient approach to construct a 3D wire-frame of an object from its orthographic projections is described. There can be two or more input projections, and they can include regular and complete auxiliary views. Each view may contain linear, circular and other conic sections. The output is a 3D wire-frame that is consistent with the input views. The approach can handle auxiliary views containing curved edges. This generality derives from a new technique to construct 3D vertices from the input 2D vertices (as opposed to the coordinate matching that is prevalent in current art). 3D vertices are constructed by projecting the 2D vertices in a pair of views on the common line of the two views. The construction of 3D edges also does not require the addition of silhouette and tangential vertices and the subsequent splitting of edges in the views. The concepts of complete edges and n-tuples are introduced to obviate this need. Entities corresponding to the 3D edge in each view are first identified, and the 3D edges are then constructed from the information available with the matching 2D edges. This allows the algorithm to handle conic sections that are not parallel to any of the viewing directions. The localization of effort in constructing 3D edges is the source of efficiency of the construction algorithm, as it does not process all potential 3D edges. The working of the algorithm on typical drawings is illustrated. (C) 2011 Elsevier Ltd. All rights reserved.