965 resultados para EFFICIENCY OPTIMIZATION


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Clustered architecture processors are preferred for embedded systems because centralized register file architectures scale poorly in terms of clock rate, chip area, and power consumption. Although clustering helps by improving the clock speed, reducing the energy consumption of the logic, and making the design simpler, it introduces extra overheads by way of inter-cluster communication. This communication happens over long global wires having high load capacitance which leads to delay in execution and significantly high energy consumption. Inter-cluster communication also introduces many short idle cycles, thereby significantly increasing the overall leakage energy consumption in the functional units. The trend towards miniaturization of devices (and associated reduction in threshold voltage) makes energy consumption in interconnects and functional units even worse, and limits the usability of clustered architectures in smaller technologies. However, technological advancements now permit the design of interconnects and functional units with varying performance and power modes. In this paper, we propose scheduling algorithms that aggregate the scheduling slack of instructions and communication slack of data values to exploit the low-power modes of functional units and interconnects. Finally, we present a synergistic combination of these algorithms that simultaneously saves energy in functional units and interconnects to improves the usability of clustered architectures by achieving better overall energy-performance trade-offs. Even with conservative estimates of the contribution of the functional units and interconnects to the overall processor energy consumption, the proposed combined scheme obtains on average 8% and 10% improvement in overall energy-delay product with 3.5% and 2% performance degradation for a 2-clustered and a 4-clustered machine, respectively. We present a detailed experimental evaluation of the proposed schemes. Our test bed uses the Trimaran compiler infrastructure. (C) 2012 Elsevier Inc. All rights reserved.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper we study constrained maximum entropy and minimum divergence optimization problems, in the cases where integer valued sufficient statistics exists, using tools from computational commutative algebra. We show that the estimation of parametric statistical models in this case can be transformed to solving a system of polynomial equations. We give an implicit description of maximum entropy models by embedding them in algebraic varieties for which we give a Grobner basis method to compute it. In the cases of minimum KL-divergence models we show that implicitization preserves specialization of prior distribution. This result leads us to a Grobner basis method to embed minimum KL-divergence models in algebraic varieties. (C) 2012 Elsevier Inc. All rights reserved.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The diffusion equation-based modeling of near infrared light propagation in tissue is achieved by using finite-element mesh for imaging real-tissue types, such as breast and brain. The finite-element mesh size (number of nodes) dictates the parameter space in the optical tomographic imaging. Most commonly used finite-element meshing algorithms do not provide the flexibility of distinct nodal spacing in different regions of imaging domain to take the sensitivity of the problem into consideration. This study aims to present a computationally efficient mesh simplification method that can be used as a preprocessing step to iterative image reconstruction, where the finite-element mesh is simplified by using an edge collapsing algorithm to reduce the parameter space at regions where the sensitivity of the problem is relatively low. It is shown, using simulations and experimental phantom data for simple meshes/domains, that a significant reduction in parameter space could be achieved without compromising on the reconstructed image quality. The maximum errors observed by using the simplified meshes were less than 0.27% in the forward problem and 5% for inverse problem.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Purpose: To optimize the data-collection strategy for diffuse optical tomography and to obtain a set of independent measurements among the total measurements using the model based data-resolution matrix characteristics. Methods: The data-resolution matrix is computed based on the sensitivity matrix and the regularization scheme used in the reconstruction procedure by matching the predicted data with the actual one. The diagonal values of data-resolution matrix show the importance of a particular measurement and the magnitude of off-diagonal entries shows the dependence among measurements. Based on the closeness of diagonal value magnitude to off-diagonal entries, the independent measurements choice is made. The reconstruction results obtained using all measurements were compared to the ones obtained using only independent measurements in both numerical and experimental phantom cases. The traditional singular value analysis was also performed to compare the results obtained using the proposed method. Results: The results indicate that choosing only independent measurements based on data-resolution matrix characteristics for the image reconstruction does not compromise the reconstructed image quality significantly, in turn reduces the data-collection time associated with the procedure. When the same number of measurements (equivalent to independent ones) are chosen at random, the reconstruction results were having poor quality with major boundary artifacts. The number of independent measurements obtained using data-resolution matrix analysis is much higher compared to that obtained using the singular value analysis. Conclusions: The data-resolution matrix analysis is able to provide the high level of optimization needed for effective data-collection in diffuse optical imaging. The analysis itself is independent of noise characteristics in the data, resulting in an universal framework to characterize and optimize a given data-collection strategy. (C) 2012 American Association of Physicists in Medicine. http://dx.doi.org/10.1118/1.4736820]

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Thermoacoustic engines are energy conversion devices that convert thermal energy from a high-temperature heat source into useful work in the form of acoustic power while diverting waste heat into a cold sink; it can be used as a drive for cryocoolers and refrigerators. Though the devices are simple to fabricate, it is very challenging to design an optimized thermoacoustic primemover with better performance. The study presented here aims to optimize the thermoacoustic primemover using response surface methodology. The influence of stack position and its length, resonator length, plate thickness, and plate spacing on pressure amplitude and frequency in a thermoacoustic primemover is investigated in this study. For the desired frequency of 207 Hz, the optimized value of the above parameters suggested by the response surface methodology has been conducted experimentally, and simulations are also performed using DeltaEC. The experimental and simulation results showed similar output performance.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper presents a decentralized/peer-to-peer architecture-based parallel version of the vector evaluated particle swarm optimization (VEPSO) algorithm for multi-objective design optimization of laminated composite plates using message passing interface (MPI). The design optimization of laminated composite plates being a combinatorially explosive constrained non-linear optimization problem (CNOP), with many design variables and a vast solution space, warrants the use of non-parametric and heuristic optimization algorithms like PSO. Optimization requires minimizing both the weight and cost of these composite plates, simultaneously, which renders the problem multi-objective. Hence VEPSO, a multi-objective variant of the PSO algorithm, is used. Despite the use of such a heuristic, the application problem, being computationally intensive, suffers from long execution times due to sequential computation. Hence, a parallel version of the PSO algorithm for the problem has been developed to run on several nodes of an IBM P720 cluster. The proposed parallel algorithm, using MPI's collective communication directives, establishes a peer-to-peer relationship between the constituent parallel processes, deviating from the more common master-slave approach, in achieving reduction of computation time by factor of up to 10. Finally we show the effectiveness of the proposed parallel algorithm by comparing it with a serial implementation of VEPSO and a parallel implementation of the vector evaluated genetic algorithm (VEGA) for the same design problem. (c) 2012 Elsevier Ltd. All rights reserved.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The q-Gaussian distribution results from maximizing certain generalizations of Shannon entropy under some constraints. The importance of q-Gaussian distributions stems from the fact that they exhibit power-law behavior, and also generalize Gaussian distributions. In this paper, we propose a Smoothed Functional (SF) scheme for gradient estimation using q-Gaussian distribution, and also propose an algorithm for optimization based on the above scheme. Convergence results of the algorithm are presented. Performance of the proposed algorithm is shown by simulation results on a queuing model.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Automated image segmentation techniques are useful tools in biological image analysis and are an essential step in tracking applications. Typically, snakes or active contours are used for segmentation and they evolve under the influence of certain internal and external forces. Recently, a new class of shape-specific active contours have been introduced, which are known as Snakuscules and Ovuscules. These contours are based on a pair of concentric circles and ellipses as the shape templates, and the optimization is carried out by maximizing a contrast function between the outer and inner templates. In this paper, we present a unified approach to the formulation and optimization of Snakuscules and Ovuscules by considering a specific form of affine transformations acting on a pair of concentric circles. We show how the parameters of the affine transformation may be optimized for, to generate either Snakuscules or Ovuscules. Our approach allows for a unified formulation and relies only on generic regularization terms and not shape-specific regularization functions. We show how the calculations of the partial derivatives may be made efficient thanks to the Green's theorem. Results on synthesized as well as real data are presented.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Ground management problems are typically solved by the simulation-optimization approach where complex numerical models are used to simulate the groundwater flow and/or contamination transport. These numerical models take a lot of time to solve the management problems and hence become computationally expensive. In this study, Artificial Neural Network (ANN) and Particle Swarm Optimization (PSO) models were developed and coupled for the management of groundwater of Dore river basin in France. The Analytic Element Method (AEM) based flow model was developed and used to generate the dataset for the training and testing of the ANN model. This developed ANN-PSO model was applied to minimize the pumping cost of the wells, including cost of the pipe line. The discharge and location of the pumping wells were taken as the decision variable and the ANN-PSO model was applied to find out the optimal location of the wells. The results of the ANN-PSO model are found similar to the results obtained by AEM-PSO model. The results show that the ANN model can reduce the computational burden significantly as it is able to analyze different scenarios, and the ANN-PSO model is capable of identifying the optimal location of wells efficiently.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Service systems are labor intensive. Further, the workload tends to vary greatly with time. Adapting the staffing levels to the workloads in such systems is nontrivial due to a large number of parameters and operational variations, but crucial for business objectives such as minimal labor inventory. One of the central challenges is to optimize the staffing while maintaining system steady-state and compliance to aggregate SLA constraints. We formulate this problem as a parametrized constrained Markov process and propose a novel stochastic optimization algorithm for solving it. Our algorithm is a multi-timescale stochastic approximation scheme that incorporates a SPSA based algorithm for ‘primal descent' and couples it with a ‘dual ascent' scheme for the Lagrange multipliers. We validate this optimization scheme on five real-life service systems and compare it with a state-of-the-art optimization tool-kit OptQuest. Being two orders of magnitude faster than OptQuest, our scheme is particularly suitable for adaptive labor staffing. Also, we observe that it guarantees convergence and finds better solutions than OptQuest in many cases.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

High-level loop transformations are a key instrument in mapping computational kernels to effectively exploit the resources in modern processor architectures. Nevertheless, selecting required compositions of loop transformations to achieve this remains a significantly challenging task; current compilers may be off by orders of magnitude in performance compared to hand-optimized programs. To address this fundamental challenge, we first present a convex characterization of all distinct, semantics-preserving, multidimensional affine transformations. We then bring together algebraic, algorithmic, and performance analysis results to design a tractable optimization algorithm over this highly expressive space. Our framework has been implemented and validated experimentally on a representative set of benchmarks running on state-of-the-art multi-core platforms.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Advances in technology have increased the number of cores and size of caches present on chip multicore platforms(CMPs). As a result, leakage power consumption of on-chip caches has already become a major power consuming component of the memory subsystem. We propose to reduce leakage power consumption in static nonuniform cache architecture(SNUCA) on a tiled CMP by dynamically varying the number of cache slices used and switching off unused cache slices. A cache slice in a tile includes all cache banks present in that tile. Switched-off cache slices are remapped considering the communication costs to reduce cache usage with minimal impact on execution time. This saves leakage power consumption in switched-off L2 cache slices. On an average, there map policy achieves 41% and 49% higher EDP savings compared to static and dynamic NUCA (DNUCA) cache policies on a scalable tiled CMP, respectively.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We consider precoding strategies at the secondary base station (SBS) in a cognitive radio network with interference constraints at the primary users (PUs). Precoding strategies at the SBS which satisfy interference constraints at the PUs in cognitive radio networks have not been adequately addressed in the literature so far. In this paper, we consider two scenarios: i) when the primary base station (PBS) data is not available at SBS, and ii) when the PBS data is made available at the SBS. We derive the optimum MMSE and Tomlinson-Harashima precoding (THP) matrix Alters at the SBS which satisfy the interference constraints at the PUs for the former case. For the latter case, we propose a precoding scheme at the SBS which performs pre-cancellation of the PBS data, followed by THP on the pre-cancelled data. The optimum precoding matrix filters are computed through an iterative search. To illustrate the robustness of the proposed approach against imperfect CSI at the SBS, we then derive robust precoding filters under imperfect CSI for the latter case. Simulation results show that the proposed optimum precoders achieve good bit error performance at the secondary users while meeting the interference constraints at the PUs.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A new multi-sensor image registration technique is proposed based on detecting the feature corner points using modified Harris Corner Detector (HDC). These feature points are matched using multi-objective optimization (distance condition and angle criterion) based on Discrete Particle Swarm Optimization (DPSO). This optimization process is more efficient as it considers both the distance and angle criteria to incorporate multi-objective switching in the fitness function. This optimization process helps in picking up three corresponding corner points detected in the sensed and base image and thereby using the affine transformation, the sensed image is aligned with the base image. Further, the results show that the new approach can provide a new dimension in solving multi-sensor image registration problems. From the obtained results, the performance of image registration is evaluated and is concluded that the proposed approach is efficient.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

During the motion of one dimensional flexible objects such as ropes, chains, etc., the assumption of constant length is realistic. Moreover,their motion appears to be naturally minimizing some abstract distance measure, wherein the disturbance at one end gradually dies down along the curve defining the object. This paper presents purely kinematic strategies for deriving length-preserving transformations of flexible objects that minimize appropriate ‘motion’. The strategies involve sequential and overall optimization of the motion derived using variational calculus. Numerical simulations are performed for the motion of a planar curve and results show stable converging behavior for single-step infinitesimal and finite perturbations 1 as well as multi-step perturbations. Additionally, our generalized approach provides different intuitive motions for various problem-specific measures of motion, one of which is shown to converge to the conventional tractrix-based solution. Simulation results for arbitrary shapes and excitations are also included.