227 resultados para Iterative Optimization
Resumo:
Fault-tolerance is due to the semiconductor technology development important, not only for safety-critical systems but also for general-purpose (non-safety critical) systems. However, instead of guaranteeing that deadlines always are met, it is for general-purpose systems important to minimize the average execution time (AET) while ensuring fault-tolerance. For a given job and a soft (transient) error probability, we define mathematical formulas for AET that includes bus communication overhead for both voting (active replication) and rollback-recovery with checkpointing (RRC). And, for a given multi-processor system-on-chip (MPSoC), we define integer linear programming (ILP) models that minimize AET including bus communication overhead when: (1) selecting the number of checkpoints when using RRC, (2) finding the number of processors and job-to-processor assignment when using voting, and (3) defining fault-tolerance scheme (voting or RRC) per job and defining its usage for each job. Experiments demonstrate significant savings in AET.
Resumo:
As the gap between processor and memory continues to grow Memory performance becomes a key performance bottleneck for many applications. Compilers therefore increasingly seek to modify an application’s data layout to improve cache locality and cache reuse. Whole program Structure Layout [WPSL] transformations can significantly increase the spatial locality of data and reduce the runtime of programs that use link-based data structures, by increasing the cache line utilization. However, in production compilers WPSL transformations do not realize the entire performance potential possible due to a number of factors. Structure layout decisions made on the basis of whole program aggregated affinity/hotness of structure fields, can be sub optimal for local code regions. WPSL is also restricted in applicability in production compilers for type unsafe languages like C/C++ due to the extensive legality checks and field sensitive pointer analysis required over the entire application. In order to overcome the issues associated with WPSL, we propose Region Based Structure Layout (RBSL) optimization framework, using selective data copying. We describe our RBSL framework, implemented in the production compiler for C/C++ on HP-UX IA-64. We show that acting in complement to the existing and mature WPSL transformation framework in our compiler, RBSL improves application performance in pointer intensive SPEC benchmarks ranging from 3% to 28% over WPSL
Resumo:
Non-linear precoding for the downlink of a multiuser MISO (multiple-input single-output) communication system in the presence of imperfect channel state information (CSI) is considered.The base station is equipped with multiple transmit antennas and each user terminal is equipped with a single receive antenna. The CSI at the transmitter is assumed to be perturbed by an estimation error. We propose a robust minimum mean square error (MMSE) Tomlinson-Harashima precoder (THP)design, which can be formulated as an optimization problem that can be solved efficiently by the method of alternating optimization(AO). In this method of optimization, the entire set of optimization variables is partitioned into non-overlapping subsets,and an iterative sequence of optimizations on these subsets is carried out, which is often simpler compared to simultaneous optimization over all variables. In our problem, the application of the AO method results in a second-order cone program which can be numerically solved efficiently. The proposed precoder is shown to be less sensitive to imperfect channel knowledge. Simulation results illustrate the improvement in performance compared to other robust linear and non-linear precoders in the literature.
Resumo:
This paper addresses the problem of maximum margin classification given the moments of class conditional densities and the false positive and false negative error rates. Using Chebyshev inequalities, the problem can be posed as a second order cone programming problem. The dual of the formulation leads to a geometric optimization problem, that of computing the distance between two ellipsoids, which is solved by an iterative algorithm. The formulation is extended to non-linear classifiers using kernel methods. The resultant classifiers are applied to the case of classification of unbalanced datasets with asymmetric costs for misclassification. Experimental results on benchmark datasets show the efficacy of the proposed method.
Resumo:
Frequent accesses to the register file make it one of the major sources of energy consumption in ILP architectures. The large number of functional units connected to a large unified register file in VLIW architectures make power dissipation in the register file even worse because of the need for a large number of ports. High power dissipation in a relatively smaller area occupied by a register file leads to a high power density in the register file and makes it one of the prime hot-spots. This makes it highly susceptible to the possibility of a catastrophic heatstroke. This in turn impacts the performance and cost because of the need for periodic cool down and sophisticated packaging and cooling techniques respectively. Clustered VLIW architectures partition the register file among clusters of functional units and reduce the number of ports required thereby reducing the power dissipation. However, we observe that the aggregate accesses to register files in clustered VLIW architectures (and associated energy consumption) become very high compared to the centralized VLIW architectures and this can be attributed to a large number of explicit inter-cluster communications. Snooping based clustered VLIW architectures provide very limited but very fast way of inter-cluster communication by allowing some of the functional units to directly read some of the operands from the register file of some of the other clusters. In this paper, we propose instruction scheduling algorithms that exploit the limited snooping capability to reduce the register file energy consumption on an average by 12% and 18% and improve the overall performance by 5% and 11% for a 2-clustered and a 4-clustered machine respectively, over an earlier state-of-the-art clustered scheduling algorithm when evaluated in the context of snooping based clustered VLIW architectures.
Resumo:
In this paper the use of probability theory in reliability based optimum design of reinforced gravity retaining wall is described. The formulation for computing system reliability index is presented. A parametric study is conducted using advanced first order second moment method (AFOSM) developed by Hasofer-Lind and Rackwitz-Fiessler (HL-RF) to asses the effect of uncertainties in design parameters on the probability of failure of reinforced gravity retaining wall. Totally 8 modes of failure are considered, viz overturning, sliding, eccentricity, bearing capacity failure, shear and moment failure in the toe slab and heel slab. The analysis is performed by treating back fill soil properties, foundation soil properties, geometric properties of wall, reinforcement properties and concrete properties as random variables. These results are used to investigate optimum wall proportions for different coefficients of variation of φ (5% and 10%) and targeting system reliability index (βt) in the range of 3 – 3.2.