67 resultados para proposed program
Resumo:
Most of the existing WCET estimation methods directly estimate execution time, ET, in cycles. We propose to study ET as a product of two factors, ET = IC * CPI, where IC is instruction count and CPI is cycles per instruction. Considering directly the estimation of ET may lead to a highly pessimistic estimate since implicitly these methods may be using worst case IC and worst case CPI. We hypothesize that there exists a functional relationship between CPI and IC such that CPI=f(IC). This is ascertained by computing the covariance matrix and studying the scatter plots of CPI versus IC. IC and CPI values are obtained by running benchmarks with a large number of inputs using the cycle accurate architectural simulator, Simplescalar on two different architectures. It is shown that the benchmarks can be grouped into different classes based on the CPI versus IC relationship. For some benchmarks like FFT, FIR etc., both IC and CPI are almost a constant irrespective of the input. There are other benchmarks that exhibit a direct or an inverse relationship between CPI and IC. In such a case, one can predict CPI for a given IC as CPI=f(IC). We derive the theoretical worst case IC for a program, denoted as SWIC, using integer linear programming(ILP) and estimate WCET as SWIC*f(SWIC). However, if CPI decreases sharply with IC then measured maximum cycles is observed to be a better estimate. For certain other benchmarks, it is observed that the CPI versus IC relationship is either random or CPI remains constant with varying IC. In such cases, WCET is estimated as the product of SWIC and measured maximum CPI. It is observed that use of the proposed method results in tighter WCET estimates than Chronos, a static WCET analyzer, for most benchmarks for the two architectures considered in this paper.
Resumo:
In many real world prediction problems the output is a structured object like a sequence or a tree or a graph. Such problems range from natural language processing to compu- tational biology or computer vision and have been tackled using algorithms, referred to as structured output learning algorithms. We consider the problem of structured classifi- cation. In the last few years, large margin classifiers like sup-port vector machines (SVMs) have shown much promise for structured output learning. The related optimization prob -lem is a convex quadratic program (QP) with a large num-ber of constraints, which makes the problem intractable for large data sets. This paper proposes a fast sequential dual method (SDM) for structural SVMs. The method makes re-peated passes over the training set and optimizes the dual variables associated with one example at a time. The use of additional heuristics makes the proposed method more efficient. We present an extensive empirical evaluation of the proposed method on several sequence learning problems.Our experiments on large data sets demonstrate that the proposed method is an order of magnitude faster than state of the art methods like cutting-plane method and stochastic gradient descent method (SGD). Further, SDM reaches steady state generalization performance faster than the SGD method. The proposed SDM is thus a useful alternative for large scale structured output learning.
Resumo:
Accurate supersymmetric spectra are required to confront data from direct and indirect searches of supersymmetry. SuSeFLAV is a numerical tool capable of computing supersymmetric spectra precisely for various supersymmetric breaking scenarios applicable even in the presence of flavor violation. The program solves MSSM RGEs with complete 3 x 3 flavor mixing at 2-loop level and one loop finite threshold corrections to all MSSM parameters by incorporating radiative electroweak symmetry breaking conditions. The program also incorporates the Type-I seesaw mechanism with three massive right handed neutrinos at user defined mass scales and mixing. It also computes branching ratios of flavor violating processes such as l(j) -> l(i)gamma, l(j) -> 3 l(i), b -> s gamma and supersymmetric contributions to flavor conserving quantities such as (g(mu) - 2). A large choice of executables suitable for various operations of the program are provided. Program summary Program title: SuSeFLAV Catalogue identifier: AEOD_v1_0 Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEOD_v1_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: GNU General Public License No. of lines in distributed program, including test data, etc.: 76552 No. of bytes in distributed program, including test data, etc.: 582787 Distribution format: tar.gz Programming language: Fortran 95. Computer: Personal Computer, Work-Station. Operating system: Linux, Unix. Classification: 11.6. Nature of problem: Determination of masses and mixing of supersymmetric particles within the context of MSSM with conserved R-parity with and without the presence of Type-I seesaw. Inter-generational mixing is considered while calculating the mass spectrum. Supersymmetry breaking parameters are taken as inputs at a high scale specified by the mechanism of supersymmetry breaking. RG equations including full inter-generational mixing are then used to evolve these parameters up to the electroweak breaking scale. The low energy supersymmetric spectrum is calculated at the scale where successful radiative electroweak symmetry breaking occurs. At weak scale standard model fermion masses, gauge couplings are determined including the supersymmetric radiative corrections. Once the spectrum is computed, the program proceeds to various lepton flavor violating observables (e.g., BR(mu -> e gamma), BR(tau -> mu gamma) etc.) at the weak scale. Solution method: Two loop RGEs with full 3 x 3 flavor mixing for all supersymmetry breaking parameters are used to compute the low energy supersymmetric mass spectrum. An adaptive step size Runge-Kutta method is used to solve the RGEs numerically between the high scale and the electroweak breaking scale. Iterative procedure is employed to get the consistent radiative electroweak symmetry breaking condition. The masses of the supersymmetric particles are computed at 1-loop order. The third generation SM particles and the gauge couplings are evaluated at the 1-loop order including supersymmetric corrections. A further iteration of the full program is employed such that the SM masses and couplings are consistent with the supersymmetric particle spectrum. Additional comments: Several executables are presented for the user. Running time: 0.2 s on a Intel(R) Core(TM) i5 CPU 650 with 3.20 GHz. (c) 2012 Elsevier B.V. All rights reserved.
Resumo:
This paper presents comparative evaluation of the distance relay characteristics for UHV and EHV transmission lines. Distance protection relay characteristics for the EHV and UHV systems are developed using Electromagnetic Transients (EMT) program. The variation of ideal trip boundaries for both the systems are presented. Unlike the conventional distance protection relay which uses a lumped parameter model, this paper uses the distributed parameter model. The effect of larger shunt susceptance on the trip boundaries is highlighted. Performance of distance relay with ideal trip boundaries for EHV and UHV lines have been tested for various fault locations and fault resistances. Electromagnetic Transients (EMT) program has been developed considering distributed parameter line model for simulating the test systems. The voltage and current phasors are computed from the signals using an improved full cycle DFT algorithm taking 20 samples per cycle. Two practical transmission systems of Indian power grid, namely 765 kV UHV transmission line and SREB 24-bus 400kV EHV system are used to test the performance of the proposed approach.
Resumo:
MATLAB is an array language, initially popular for rapid prototyping, but is now being increasingly used to develop production code for numerical and scientific applications. Typical MATLAB programs have abundant data parallelism. These programs also have control flow dominated scalar regions that have an impact on the program's execution time. Today's computer systems have tremendous computing power in the form of traditional CPU cores and throughput oriented accelerators such as graphics processing units(GPUs). Thus, an approach that maps the control flow dominated regions to the CPU and the data parallel regions to the GPU can significantly improve program performance. In this paper, we present the design and implementation of MEGHA, a compiler that automatically compiles MATLAB programs to enable synergistic execution on heterogeneous processors. Our solution is fully automated and does not require programmer input for identifying data parallel regions. We propose a set of compiler optimizations tailored for MATLAB. Our compiler identifies data parallel regions of the program and composes them into kernels. The problem of combining statements into kernels is formulated as a constrained graph clustering problem. Heuristics are presented to map identified kernels to either the CPU or GPU so that kernel execution on the CPU and the GPU happens synergistically and the amount of data transfer needed is minimized. In order to ensure required data movement for dependencies across basic blocks, we propose a data flow analysis and edge splitting strategy. Thus our compiler automatically handles composition of kernels, mapping of kernels to CPU and GPU, scheduling and insertion of required data transfer. The proposed compiler was implemented and experimental evaluation using a set of MATLAB benchmarks shows that our approach achieves a geometric mean speedup of 19.8X for data parallel benchmarks over native execution of MATLAB.
Resumo:
This paper presents a fast and accurate relaying technique for a long 765kv UHV transmission line based on support vector machine. For a long EHV/UHV transmission line with large distributed capacitance, a traditional distance relay which uses a lumped parameter model of the transmission line can cause malfunction of the relay. With a frequency of 1kHz, 1/4th cycle of instantaneous values of currents and voltages of all phases at the relying end are fed to Support Vector Machine(SVM). The SVM detects fault type accurately using 3 milliseconds of post-fault data and reduces the fault clearing time which improves the system stability and power transfer capability. The performance of relaying scheme has been checked with a typical 765kV Indian transmission System which is simulated using the Electromagnetic Transients Program(EMTP) developed by authors in which the distributed parameter line model is used. More than 15,000 different short circuit fault cases are simulated by varying fault location, fault impedance, fault incidence angle and fault type to train the SVM for high speed accurate relaying. Simulation studies have shown that the proposed relay provides fast and accurate protection irrespective of fault location, fault impedance, incidence time of fault and fault type. And also the proposed scheme can be used as augmentation for the existing relaying, particularly for Zone-2, Zone-3 protection.
Resumo:
Seismic site classifications are used to represent site effects for estimating hazard parameters (response spectral ordinates) at the soil surface. Seismic site classifications have generally been carried out using average shear wave velocity and/or standard penetration test n-values of top 30-m soil layers, according to the recommendations of the National Earthquake Hazards Reduction Program (NEHRP) or the International Building Code (IBC). The site classification system in the NEHRP and the IBC is based on the studies carried out in the United States where soil layers extend up to several hundred meters before reaching any distinct soil-bedrock interface and may not be directly applicable to other regions, especially in regions having shallow geological deposits. This paper investigates the influence of rock depth on site classes based on the recommendations of the NEHRP and the IBC. For this study, soil sites having a wide range of average shear wave velocities (or standard penetration test n-values) have been collected from different parts of Australia, China, and India. Shear wave velocities of rock layers underneath soil layers have also been collected at depths from a few meters to 180 m. It is shown that a site classification system based on the top 30-m soil layers often represents stiffer site classes for soil sites having shallow rock depths (rock depths less than 25 m from the soil surface). A new site classification system based on average soil thickness up to engineering bedrock has been proposed herein, which is considered more representative for soil sites in shallow bedrock regions. It has been observed that response spectral ordinates, amplification factors, and site periods estimated using one-dimensional shear wave analysis considering the depth of engineering bedrock are different from those obtained considering top 30-m soil layers.
Resumo:
Chebyshev-inequality-based convex relaxations of Chance-Constrained Programs (CCPs) are shown to be useful for learning classifiers on massive datasets. In particular, an algorithm that integrates efficient clustering procedures and CCP approaches for computing classifiers on large datasets is proposed. The key idea is to identify high density regions or clusters from individual class conditional densities and then use a CCP formulation to learn a classifier on the clusters. The CCP formulation ensures that most of the data points in a cluster are correctly classified by employing a Chebyshev-inequality-based convex relaxation. This relaxation is heavily dependent on the second-order statistics. However, this formulation and in general such relaxations that depend on the second-order moments are susceptible to moment estimation errors. One of the contributions of the paper is to propose several formulations that are robust to such errors. In particular a generic way of making such formulations robust to moment estimation errors is illustrated using two novel confidence sets. An important contribution is to show that when either of the confidence sets is employed, for the special case of a spherical normal distribution of clusters, the robust variant of the formulation can be posed as a second-order cone program. Empirical results show that the robust formulations achieve accuracies comparable to that with true moments, even when moment estimates are erroneous. Results also illustrate the benefits of employing the proposed methodology for robust classification of large-scale datasets.
Resumo:
The problem addressed in this paper is concerned with an important issue faced by any green aware global company to keep its emissions within a prescribed cap. The specific problem is to allocate carbon reductions to its different divisions and supply chain partners in achieving a required target of reductions in its carbon reduction program. The problem becomes a challenging one since the divisions and supply chain partners, being autonomous, may exhibit strategic behavior. We use a standard mechanism design approach to solve this problem. While designing a mechanism for the emission reduction allocation problem, the key properties that need to be satisfied are dominant strategy incentive compatibility (DSIC) (also called strategy-proofness), strict budget balance (SBB), and allocative efficiency (AE). Mechanism design theory has shown that it is not possible to achieve the above three properties simultaneously. In the literature, a mechanism that satisfies DSIC and AE has recently been proposed in this context, keeping the budget imbalance minimal. Motivated by the observation that SBB is an important requirement, in this paper, we propose a mechanism that satisfies DSIC and SBB with slight compromise in allocative efficiency. Our experimentation with a stylized case study shows that the proposed mechanism performs satisfactorily and provides an attractive alternative mechanism for carbon footprint reduction by global companies.
Resumo:
The Lovasz θ function of a graph, is a fundamental tool in combinatorial optimization and approximation algorithms. Computing θ involves solving a SDP and is extremely expensive even for moderately sized graphs. In this paper we establish that the Lovasz θ function is equivalent to a kernel learning problem related to one class SVM. This interesting connection opens up many opportunities bridging graph theoretic algorithms and machine learning. We show that there exist graphs, which we call SVM−θ graphs, on which the Lovasz θ function can be approximated well by a one-class SVM. This leads to a novel use of SVM techniques to solve algorithmic problems in large graphs e.g. identifying a planted clique of size Θ(n√) in a random graph G(n,12). A classic approach for this problem involves computing the θ function, however it is not scalable due to SDP computation. We show that the random graph with a planted clique is an example of SVM−θ graph, and as a consequence a SVM based approach easily identifies the clique in large graphs and is competitive with the state-of-the-art. Further, we introduce the notion of a ''common orthogonal labeling'' which extends the notion of a ''orthogonal labelling of a single graph (used in defining the θ function) to multiple graphs. The problem of finding the optimal common orthogonal labelling is cast as a Multiple Kernel Learning problem and is used to identify a large common dense region in multiple graphs. The proposed algorithm achieves an order of magnitude scalability compared to the state of the art.
Resumo:
The twin demands of energy-efficiency and higher performance on DRAM are highly emphasized in multicore architectures. A variety of schemes have been proposed to address either the latency or the energy consumption of DRAMs. These schemes typically require non-trivial hardware changes and end up improving latency at the cost of energy or vice-versa. One specific DRAM performance problem in multicores is that interleaved accesses from different cores can potentially degrade row-buffer locality. In this paper, based on the temporal and spatial locality characteristics of memory accesses, we propose a reorganization of the existing single large row-buffer in a DRAM bank into multiple sub-row buffers (MSRB). This re-organization not only improves row hit rates, and hence the average memory latency, but also brings down the energy consumed by the DRAM. The first major contribution of this work is proposing such a reorganization without requiring any significant changes to the existing widely accepted DRAM specifications. Our proposed reorganization improves weighted speedup by 35.8%, 14.5% and 21.6% in quad, eight and sixteen core workloads along with a 42%, 28% and 31% reduction in DRAM energy. The proposed MSRB organization enables opportunities for the management of multiple row-buffers at the memory controller level. As the memory controller is aware of the behaviour of individual cores it allows us to implement coordinated buffer allocation schemes for different cores that take into account program behaviour. We demonstrate two such schemes, namely Fairness Oriented Allocation and Performance Oriented Allocation, which show the flexibility that memory controllers can now exploit in our MSRB organization to improve overall performance and/or fairness. Further, the MSRB organization enables additional opportunities for DRAM intra-bank parallelism and selective early precharging of the LRU row-buffer to further improve memory access latencies. These two optimizations together provide an additional 5.9% performance improvement.
Resumo:
Electrical Impedance Tomography (EIT) is a computerized medical imaging technique which reconstructs the electrical impedance images of a domain under test from the boundary voltage-current data measured by an EIT electronic instrumentation using an image reconstruction algorithm. Being a computed tomography technique, EIT injects a constant current to the patient's body through the surface electrodes surrounding the domain to be imaged (Omega) and tries to calculate the spatial distribution of electrical conductivity or resistivity of the closed conducting domain using the potentials developed at the domain boundary (partial derivative Omega). Practical phantoms are essentially required to study, test and calibrate a medical EIT system for certifying the system before applying it on patients for diagnostic imaging. Therefore, the EIT phantoms are essentially required to generate boundary data for studying and assessing the instrumentation and inverse solvers a in EIT. For proper assessment of an inverse solver of a 2D EIT system, a perfect 2D practical phantom is required. As the practical phantoms are the assemblies of the objects with 3D geometries, the developing of a practical 2D-phantom is a great challenge and therefore, the boundary data generated from the practical phantoms with 3D geometry are found inappropriate for assessing a 2D inverse solver. Furthermore, the boundary data errors contributed by the instrumentation are also difficult to separate from the errors developed by the 3D phantoms. Hence, the errorless boundary data are found essential to assess the inverse solver in 2D EIT. In this direction, a MatLAB-based Virtual Phantom for 2D EIT (MatVP2DEIT) is developed to generate accurate boundary data for assessing the 2D-EIT inverse solvers and the image reconstruction accuracy. MatVP2DEIT is a MatLAB-based computer program which simulates a phantom in computer and generates the boundary potential data as the outputs by using the combinations of different phantom parameters as the inputs to the program. Phantom diameter, inhomogeneity geometry (shape, size and position), number of inhomogeneities, applied current magnitude, background resistivity, inhomogeneity resistivity all are set as the phantom variables which are provided as the input parameters to the MatVP2DEIT for simulating different phantom configurations. A constant current injection is simulated at the phantom boundary with different current injection protocols and boundary potential data are calculated. Boundary data sets are generated with different phantom configurations obtained with the different combinations of the phantom variables and the resistivity images are reconstructed using EIDORS. Boundary data of the virtual phantoms, containing inhomogeneities with complex geometries, are also generated for different current injection patterns using MatVP2DEIT and the resistivity imaging is studied. The effect of regularization method on the image reconstruction is also studied with the data generated by MatVP2DEIT. Resistivity images are evaluated by studying the resistivity parameters and contrast parameters estimated from the elemental resistivity profiles of the reconstructed phantom domain. Results show that the MatVP2DEIT generates accurate boundary data for different types of single or multiple objects which are efficient and accurate enough to reconstruct the resistivity images in EIDORS. The spatial resolution studies show that, the resistivity imaging conducted with the boundary data generated by MatVP2DEIT with 2048 elements, can reconstruct two circular inhomogeneities placed with a minimum distance (boundary to boundary) of 2 mm. It is also observed that, in MatVP2DEIT with 2048 elements, the boundary data generated for a phantom with a circular inhomogeneity of a diameter less than 7% of that of the phantom domain can produce resistivity images in EIDORS with a 1968 element mesh. Results also show that the MatVP2DEIT accurately generates the boundary data for neighbouring, opposite reference and trigonometric current patterns which are very suitable for resistivity reconstruction studies. MatVP2DEIT generated data are also found suitable for studying the effect of the different regularization methods on reconstruction process. Comparing the reconstructed image with an original geometry made in MatVP2DEIT, it would be easier to study the resistivity imaging procedures as well as the inverse solver performance. Using the proposed MatVP2DEIT software with modified domains, the cross sectional anatomy of a number of body parts can be simulated in PC and the impedance image reconstruction of human anatomy can be studied.
Resumo:
Precise pointer analysis is a problem of interest to both the compiler and the program verification community. Flow-sensitivity is an important dimension of pointer analysis that affects the precision of the final result computed. Scaling flow-sensitive pointer analysis to millions of lines of code is a major challenge. Recently, staged flow-sensitive pointer analysis has been proposed, which exploits a sparse representation of program code created by staged analysis. In this paper we formulate the staged flow-sensitive pointer analysis as a graph-rewriting problem. Graph-rewriting has already been used for flow-insensitive analysis. However, formulating flow-sensitive pointer analysis as a graph-rewriting problem adds additional challenges due to the nature of flow-sensitivity. We implement our parallel algorithm using Intel Threading Building Blocks and demonstrate considerable scaling (upto 2.6x) for 8 threads on a set of 10 benchmarks. Compared to the sequential implementation of staged flow-sensitive analysis, a single threaded execution of our implementation performs better in 8 of the benchmarks.
Resumo:
Identification and analysis of nonbonded interactions within a molecule and with the surrounding molecules are an essential part of structural studies, given the importance of these interactions in defining the structure and function of any supramolecular entity. MolBridge is an easy to use algorithm based purely on geometric criteria that can identify all possible nonbonded interactions, such as hydrogen bond, halogen bond, cation-pi, pi-pi and van der Waals, in small molecules as well as biomolecules. The user can either upload three-dimensional coordinate files or enter the molecular ID corresponding to the relevant database. The program is available in a standalone form and as an interactive web server with Jmol and JME incorporated into it. The program is freely downloadable and the web server version is also available at http://nucleix.mbu.iisc.ernet.in/molbridge/index.php.