Digital Library

135 results for Parallel execution

Power Efficient Redundant Execution for Chip Multiprocessor

Relevance:

20.00%

Abstract:

This paper describes the design of a power-efficient microarchitecture for transient fault detection in chip multiprocessors (CMPs). We introduce a new per-core dynamic voltage and frequency scaling (DVFS) algorithm for our architecture that significantly reduces power dissipation for redundant execution with minimal performance overhead. Using cycle-accurate simulation combined with a simple first-order power model, we estimate that our architecture reduces dynamic power dissipation in the redundant core by a mean value of 79% and a maximum of 85%, with an associated mean performance overhead of only 1.2%.
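The abstract refers to a simple first-order power model without stating it. As a rough illustration only, the sketch below assumes the textbook relation P = a · C · V² · f for dynamic power, which may differ from the model the authors used. The function names and the 60% operating point are hypothetical and are not taken from the paper.

```python
def dynamic_power(capacitance, voltage, frequency, activity=1.0):
    """Textbook first-order dynamic power: P = activity * C * V^2 * f."""
    return activity * capacitance * voltage ** 2 * frequency


def power_reduction(voltage_scale, frequency_scale):
    """Fractional reduction in dynamic power after scaling V and f.

    Capacitance and activity cancel out, so only the two scale
    factors matter.
    """
    baseline = dynamic_power(1.0, 1.0, 1.0)
    scaled = dynamic_power(1.0, voltage_scale, frequency_scale)
    return 1.0 - scaled / baseline


# Hypothetical operating point: voltage and frequency both lowered to 60%.
reduction = power_reduction(0.6, 0.6)
print(f"{reduction:.1%}")
```

Under this assumed model, scaling voltage and frequency together gives a cubic reduction in dynamic power, which is why moderate slowdowns of a redundant core can yield large savings.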

Middleware for Long-running Applications on Batch Grids. Student Research Symposium poster

Relevance:

20.00%

Abstract:

Computational grids with multiple batch systems (batch grids) can be powerful infrastructures for executing long-running multi-component parallel applications. In this paper, we have constructed a middleware framework for executing such long-running applications spanning multiple submissions to the queues on multiple batch systems. We have used our framework to execute a prominent long-running multi-component application for climate modeling, the Community Climate System Model (CCSM). Our framework coordinates the distribution, execution, migration and restart of the components of CCSM on the multiple queues, where the component jobs of the different queues can have different queue waiting and startup times.

On the throughput, DMT and optimal code construction of the K-parallel-path cooperative wireless fading network

Relevance:

20.00%

An algebraic formulation of exact force-, moment-isotropy in spatial parallel manipulators

Relevance:

20.00%

Abstract:

In this paper, we present an algebraic method to study and design spatial parallel manipulators that demonstrate isotropy in the force and moment distributions. We use the force and moment transformation matrices separately, and derive conditions for their isotropy individually as well as in combination. The isotropy conditions are derived in closed form in terms of the invariants of the quadratic forms associated with these matrices. The formulation has been applied to a class of Stewart platform manipulators. We obtain multi-parameter families of isotropic manipulators analytically. In addition to computing the isotropic configurations of an existing manipulator, we demonstrate a procedure for designing the manipulator for isotropy at a given configuration.

Exact and Near Optimal Heuristics for the Parallel Batch Problem to Maximize Capacity Utilization

Relevance:

20.00%

Scheduling Parallel Batch Machines with Incompatible Jobs

Relevance:

20.00%

A Solution Framework for Job Scheduling with Incompatible Job Families on Parallel Batch Processors for Minimizing the Total Weighted Tardiness

Relevance:

20.00%

A Solution Framework for Discrete Parallel Processors Scheduling Problem with Weighted Flow time Minimization

Relevance:

20.00%

Energy-efficient redundant execution for chip multiprocessors

Relevance:

20.00%

Abstract:

Relentless CMOS scaling coupled with lower design tolerances is making ICs increasingly susceptible to wear-out-related permanent faults and transient faults, necessitating on-chip fault tolerance in future chip multiprocessors (CMPs). In this paper, we describe a power-efficient architecture for redundant execution on CMPs which, when coupled with our per-core dynamic voltage and frequency scaling (DVFS) algorithm, significantly reduces the energy overhead of redundant execution without sacrificing performance. Our evaluation shows that this architecture has a performance overhead of only 0.3% and consumes only 1.48 times the energy of a non-fault-tolerant baseline.

Adaptive Executions of Multi-Physics Coupled Applications on Batch Grids

Relevance:

20.00%

Abstract:

Long-running multi-physics coupled parallel applications have gained prominence in recent years. The high computational requirements and long simulation durations of these applications necessitate the use of multiple systems of a Grid for execution. In this paper, we have built an adaptive middleware framework for executing long-running multi-physics coupled applications across multiple batch systems of a Grid. Our framework, apart from coordinating the executions of the component jobs of an application on different batch systems, also automatically resubmits the jobs multiple times to the batch queues to continue and sustain long-running executions. As the set of active batch systems available for execution changes, our framework performs migration and rescheduling of components using a robust rescheduling decision algorithm. We have used our framework to improve the application throughput of a prominent long-running multi-component application for climate modeling, the Community Climate System Model (CCSM). Our real multi-site experiments with CCSM indicate that Grid executions can lead to improved application throughput for climate models.
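The resubmission behaviour described in the abstract can be pictured with a small simulation. This is a minimal sketch, not the framework's actual interface. It assumes a fixed wall-time limit per submission and a checkpoint/restart mechanism that loses no work between runs, and both function names are hypothetical.

```python
import math


def submissions_needed(total_hours, walltime_limit_hours):
    """Number of queue submissions needed when each run is cut off at the
    wall-time limit and the next run restarts from the last checkpoint."""
    if total_hours <= 0:
        return 0
    return math.ceil(total_hours / walltime_limit_hours)


def run_with_resubmission(total_hours, walltime_limit_hours):
    """Simulate the resubmit loop; returns hours executed per submission."""
    remaining = total_hours
    runs = []
    while remaining > 0:
        # The job runs until it finishes or hits the wall-time limit.
        chunk = min(remaining, walltime_limit_hours)
        runs.append(chunk)
        # Assumed: restart state carries over, so no work is repeated.
        remaining -= chunk
    return runs


# Hypothetical example: 100 hours of simulation, 24-hour queue limit.
print(run_with_resubmission(100, 24))
print(submissions_needed(100, 24))
```

A real batch grid would also incur queue waiting and startup time on every resubmission, which this sketch deliberately ignores.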

Parallel Computation of 2D Morse-Smale Complexes

Relevance:

20.00%

Abstract:

The Morse-Smale complex is a useful topological data structure for the analysis and visualization of scalar data. This paper describes an algorithm that processes all mesh elements of the domain in parallel to compute the Morse-Smale complex of large two-dimensional data sets at interactive speeds. We employ a reformulation of the Morse-Smale complex using Forman's Discrete Morse Theory and achieve scalability by computing the discrete gradient using local accesses only. We also introduce a novel approach to merge gradient paths that ensures accurate geometry of the computed complex. We demonstrate that our algorithm performs well on both multicore environments and on massively parallel architectures such as the GPU.

An operator-splitting finite element method for the efficient parallel solution of multidimensional population balance systems

Relevance:

20.00%

Abstract:

A finite element method for solving multidimensional population balance systems is proposed, where the balance of fluid velocity, temperature and solute partial density is considered as a two-dimensional system and the balance of particle size distribution as a three-dimensional one. The method is based on a dimensional splitting into physical space and internal property variables. In addition, the operator splitting allows the equations for temperature, solute partial density and particle size distribution to be decoupled. Further, a nodal-point-based parallel finite element algorithm for multidimensional population balance systems is presented. The method is applied to study a crystallization process assuming, for simplicity, a size-independent growth rate and neglecting agglomeration and breakage of particles. Simulations for different wall temperatures are performed to show the effect of cooling on the crystal growth. Although the method is described in detail only for the case of d=2 space and s=1 internal property variables, it can potentially be extended to d+s variables, with d=2, 3 and s >= 1. © 2011 Elsevier Ltd. All rights reserved.

Language identification using parallel sub-word recognition

Relevance:

20.00%

Abstract:

Parallel sub-word recognition (PSWR) is a new model that has been proposed for language identification (LID) which does not need elaborate phonetic labeling of the speech data in a foreign language. The new approach performs a front-end tokenization in terms of sub-word units which are designed by automatic segmentation, segment clustering and segment HMM modeling. We develop PSWR based LID in a framework similar to the parallel phone recognition (PPR) approach in the literature. This includes a front-end tokenizer and a back-end language model, for each language to be identified. Considering various combinations of the statistical evaluation scores, it is found that PSWR can perform as well as PPR, even with broad acoustic sub-word tokenization, thus making it an efficient alternative to the PPR system.

A Parallel Algorithm for Learning Logic Expressions under Noise

Relevance:

20.00%

A Parallel Algorithm for Concept Learning

Relevance:

20.00%
