Biblioteca Digital

928 resultados para BENCHMARK

Toward Effective Scalar Hardware for Highly Vectorizable Applications

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The performance of a program will ultimately be limited by its serial (scalar) portion, as pointed out by Amdahl′s Law. Reported studies thus far of instruction-level parallelism have mixed data-parallel program portions with scalar program portions, often leading to contradictory and controversial results. We report an instruction-level behavioral characterization of scalar code containing minimal data-parallelism, extracted from highly vectorized programs of the PERFECT benchmark suite running on a Cray Y-MP system. We classify scalar basic blocks according to their instruction mix, characterize the data dependencies seen in each class, and, as a first step, measure the maximum intrablock instruction-level parallelism available. We observe skewed rather than balanced instruction distributions in scalar code and in individual basic block classes of scalar code; nonuniform distribution of parallelism across instruction classes; and, as expected, limited available intrablock parallelism. We identify frequently occurring data-dependence patterns and discuss new instructions to reduce latency. Toward effective scalar hardware, we study latency-pipelining trade-offs and restricted multiple instruction issue mechanisms.

Points-to Analysis as a System of Linear Equations

Relevância:

10.00% 10.00%

Publicador:

Resumo:

We propose a novel formulation of the points-to analysis as a system of linear equations. With this, the efficiency of the points-to analysis can be significantly improved by leveraging the advances in solution procedures for solving the systems of linear equations. However, such a formulation is non-trivial and becomes challenging due to various facts, namely, multiple pointer indirections, address-of operators and multiple assignments to the same variable. Further, the problem is exacerbated by the need to keep the transformed equations linear. Despite this, we successfully model all the pointer operations. We propose a novel inclusion-based context-sensitive points-to analysis algorithm based on prime factorization, which can model all the pointer operations. Experimental evaluation on SPEC 2000 benchmarks and two large open source programs reveals that our approach is competitive to the state-of-the-art algorithms. With an average memory requirement of mere 21MB, our context-sensitive points-to analysis algorithm analyzes each benchmark in 55 seconds on an average.

Test Application Time Minimization for RAS using Basis Optimization of Column Decoder

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Random Access Scan, which addresses individual flip-flops in a design using a memory array like row and column decoder architecture, has recently attracted widespread attention, due to its potential for lower test application time, test data volume and test power dissipation when compared to traditional Serial Scan. This is because typically only a very limited number of random ``care'' bits in a test response need be modified to create the next test vector. Unlike traditional scan, most flip-flops need not be updated. Test application efficiency can be further improved by organizing the access by word instead of by bit. In this paper we present a new decoder structure that takes advantage of basis vectors and linear algebra to further significantly optimize test application in RAS by performing the write operations on multiple bits consecutively. Simulations performed on benchmark circuits show an average of 2-3 times speed up in test write time compared to conventional RAS.

A Flat Concurrent Prolog compiler for PARAM

Relevância:

10.00% 10.00%

Publicador:

Resumo:

We describe a compiler for the Flat Concurrent Prolog language on a message passing multiprocessor architecture. This compiler permits symbolic and declarative programming in the syntax of Guarded Horn Rules, The implementation has been verified and tested on the 64-node PARAM parallel computer developed by C-DAC (Centre for the Development of Advanced Computing, India), Flat Concurrent Prolog (FCP) is a logic programming language designed for concurrent programming and parallel execution, It is a process oriented language, which embodies dataflow synchronization and guarded-command as its basic control mechanisms. An identical algorithm is executed on every processor in the network, We assume regular network topologies like mesh, ring, etc, Each node has a local memory, The algorithm comprises of two important parts: reduction and communication, The most difficult task is to integrate the solutions of problems that arise in the implementation in a coherent and efficient manner. We have tested the efficacy of the compiler on various benchmark problems of the ICOT project that have been reported in the recent book by Evan Tick, These problems include Quicksort, 8-queens, and Prime Number Generation, The results of the preliminary tests are favourable, We are currently examining issues like indexing and load balancing to further optimize our compiler.

On generating optimal signal probabilities for random tests: A genetic approach

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Genetic Algorithms are robust search and optimization techniques. A Genetic Algorithm based approach for determining the optimal input distributions for generating random test vectors is proposed in the paper. A cost function based on the COP testability measure for determining the efficacy of the input distributions is discussed, A brief overview of Genetic Algorithms (GAs) and the specific details of our implementation are described. Experimental results based on ISCAS-85 benchmark circuits are presented. The performance pf our GA-based approach is compared with previous results. While the GA generates more efficient input distributions than the previous methods which are based on gradient descent search, the overheads of the GA in computing the input distributions are larger. To account for the relatively quick convergence of the gradient descent methods, we analyze the landscape of the COP-based cost function. We prove that the cost function is unimodal in the search space. This feature makes the cost function amenable to optimization by gradient-descent techniques as compared to random search methods such as Genetic Algorithms.

Analytical solutions for heat transfer during cyclic melting and freezing of a phase change material used in electronic or electrical packaging

Relevância:

10.00% 10.00%

Publicador:

Resumo:

In this paper we develop an analytical heat transfer model, which is capable of analyzing cyclic melting and solidification processes of a phase change material used in the context of electronics cooling systems. The model is essentially based on conduction heat transfer, with treatments for convection and radiation embedded inside. The whole solution domain is first divided into two main sub-domains, namely, the melting sub-domain and the solidification sub-domain. Each sub-domain is then analyzed for a number of temporal regimes. Accordingly, analytical solutions for temperature distribution within each subdomain are formulated either using a semi-infinity consideration, or employing a method of quasi-steady state, depending on the applicability. The solution modules are subsequently united, leading to a closed-form solution for the entire problem. The analytical solutions are then compared with experimental and numerical solutions for a benchmark problem quoted in the literature, and excellent agreements can be observed.

A fast quasi-Newton method for semi-supervised SVM

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Due to its wide applicability, semi-supervised learning is an attractive method for using unlabeled data in classification. In this work, we present a semi-supervised support vector classifier that is designed using quasi-Newton method for nonsmooth convex functions. The proposed algorithm is suitable in dealing with very large number of examples and features. Numerical experiments on various benchmark datasets showed that the proposed algorithm is fast and gives improved generalization performance over the existing methods. Further, a non-linear semi-supervised SVM has been proposed based on a multiple label switching scheme. This non-linear semi-supervised SVM is found to converge faster and it is found to improve generalization performance on several benchmark datasets. (C) 2010 Elsevier Ltd. All rights reserved.

A Comparative Study of the Formulations for Topology Optimization of Compliant Mechanisms

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The topology optimization problem for the synthesis of compliant mechanisms has been formulated in many different ways in the last 15 years, but there is not yet a definitive formulation that is universally accepted. Furthermore, there are two unresolved issues in this problem. In this paper, we present a comparative study of five distinctly different formulations that are reported in the literature. Three benchmark examples are solved with these formulations using the same input and output specifications and the same numerical optimization algorithm. A total of 35 different synthesis examples are implemented. The examples are limited to desired instantaneous output direction for prescribed input force direction. Hence, this study is limited to linear elastic modeling with small deformations. Two design parameterizations, namely, the frame element based ground structure and the density approach using continuum elements, are used. The obtained designs are evaluated with all other objective functions and are compared with each other. The checkerboard patterns, point flexures, the ability to converge from an unbiased uniform initial guess, and the computation time are analyzed. Some observations are noted based on the extensive implementation done in this study. Complete details of the benchmark problems and the results are included. The computer codes related to this study are made available on the internet for ready access.

Bounds on Defect Level and Fault Coverage in Linear Analog Circuit Testing

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Transfer function coefficients (TFC) are widely used to test linear analog circuits for parametric and catastrophic faults. This paper presents closed form expressions for an upper bound on the defect level (DL) and a lower bound on fault coverage (FC) achievable in TFC based test method. The computed bounds have been tested and validated on several benchmark circuits. Further, application of these bounds to scalable RC ladder networks reveal a number of interesting characteristics. The approach adopted here is general and can be extended to find bounds of DL and FC of other parametric test methods for linear and non-linear circuits.

V-Transform: “An Enhanced Polynomial Coefficient Based DC Test for Non-linear Analog Circuits

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Abstract—DC testing of parametric faults in non-linear analog circuits based on a new transformation, entitled, V-Transform acting on polynomial coefficient expansion of the circuit function is presented. V-Transform serves the dual purpose of monotonizing polynomial coefficients of circuit function expansion and increasing the sensitivity of these coefficients to circuit parameters. The sensitivity of V-Transform Coefficients (VTC) to circuit parameters is up to 3x-5x more than sensitivity of polynomial coefficients. As a case study, we consider a benchmark elliptic filter to validate our method. The technique is shown to uncover hitherto untestable parametric faults whose sizes are smaller than 10 % of the nominal values. I.

On Optimal Performance in Mobile Ad-Hoc Networks

Relevância:

10.00% 10.00%

Publicador:

Resumo:

In this paper we are concerned with finding the maximum throughput that a mobile ad hoc network can support. Even when nodes are stationary, the problem of determining the capacity region has long been known to be NP-hard. Mobility introduces an additional dimension of complexity because nodes now also have to decide when they should initiate route discovery. Since route discovery involves communication and computation overhead, it should not be invoked very often. On the other hand, mobility implies that routes are bound to become stale resulting in sub-optimal performance if routes are not updated. We attempt to gain some understanding of these effects by considering a simple one-dimensional network model. The simplicity of our model allows us to use stochastic dynamic programming (SDP) to find the maximum possible network throughput with ideal routing and medium access control (MAC) scheduling. Using the optimal value as a benchmark, we also propose and evaluate the performance of a simple threshold-based heuristic. Unlike the optimal policy which requires considerable state information, the heuristic is very simple to implement and is not overly sensitive to the threshold value used. We find empirical conditions for our heuristic to be near-optimal as well as network scenarios when our simple heuristic does not perform very well. We provide extensive numerical and simulation results for different parameter settings of our model.

Intruder detection over sensor placement in a hexagonal lattice

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The problem of intrusion detection and location identification in the presence of clutter is considered for a hexagonal sensor-node geometry. It is noted that in any practical application,for a given fixed intruder or clutter location, only a small number of neighboring sensor nodes will register a significant reading. Thus sensing may be regarded as a local phenomenon and performance is strongly dependent on the local geometry of the sensor nodes. We focus on the case when the sensor nodes form a hexagonal lattice. The optimality of the hexagonal lattice with respect to density of packing and covering and largeness of the kissing number suggest that this is the best possible arrangement from a sensor network viewpoint. The results presented here are clearly relevant when the particular sensing application permits a deterministic placement of sensors. The results also serve as a performance benchmark for the case of a random deployment of sensors. A novel feature of our analysis of the hexagonal sensor grid is a signal-space viewpoint which sheds light on achievable performance.Under this viewpoint, the problem of intruder detection is reduced to one of determining in a distributed manner, the optimal decision boundary that separates the signal spaces SI and SC associated to intruder and clutter respectively. Given the difficulty of implementing the optimal detector, we present a low-complexity distributive algorithm under which the surfaces SI and SC are separated by a wellchosen hyperplane. The algorithm is designed to be efficient in terms of communication cost by minimizing the expected number of bits transmitted by a sensor.

Maximum Margin Classifiers with Specified False Positive and False Negative Error Rates

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper addresses the problem of maximum margin classification given the moments of class conditional densities and the false positive and false negative error rates. Using Chebyshev inequalities, the problem can be posed as a second order cone programming problem. The dual of the formulation leads to a geometric optimization problem, that of computing the distance between two ellipsoids, which is solved by an iterative algorithm. The formulation is extended to non-linear classifiers using kernel methods. The resultant classifiers are applied to the case of classification of unbalanced datasets with asymmetric costs for misclassification. Experimental results on benchmark datasets show the efficacy of the proposed method.

Focussed Crawling with large scale Ordinal Regression Solvers

Relevância:

10.00% 10.00%

Publicador:

Resumo:

In this paper we propose a novel, scalable, clustering based Ordinal Regression formulation, which is an instance of a Second Order Cone Program (SOCP) with one Second Order Cone (SOC) constraint. The main contribution of the paper is a fast algorithm, CB-OR, which solves the proposed formulation more eficiently than general purpose solvers. Another main contribution of the paper is to pose the problem of focused crawling as a large scale Ordinal Regression problem and solve using the proposed CB-OR. Focused crawling is an efficient mechanism for discovering resources of interest on the web. Posing the problem of focused crawling as an Ordinal Regression problem avoids the need for a negative class and topic hierarchy, which are the main drawbacks of the existing focused crawling methods. Experiments on large synthetic and benchmark datasets show the scalability of CB-OR. Experiments also show that the proposed focused crawler outperforms the state-of-the-art.

Closed form solutions for element equilibrium and flexibility matrices of eight node rectangular plate bending element using integrated force method

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Closed form solutions for equilibrium and flexibility matrices of the Mindlin-Reissner theory based eight-node rectangular plate bending element (MRP8) using integrated Force Method (IFM) are presented in this paper. Though these closed form solutions of equilibrium and flexibility matrices are applicable to plate bending problems with square/rectangular boundaries, they reduce the computational time significantly and give more exact solutions. Presented closed form solutions are validated by solving large number of standard square/rectangular plate bending benchmark problems for deflections and moments and the results are compared with those of similar displacement-based eight-node quadrilateral plate bending elements available in the literature. The results are also compared with the exact solutions.

«
1
2
...
24
25
26
27
28
29
30
...
61
62
»