942 results for 291605 Processor Architectures
Abstract:
We present a distributed algorithm that finds a maximal edge packing in O(Δ + log* W) synchronous communication rounds in a weighted graph, independent of the number of nodes in the network; here Δ is the maximum degree of the graph and W is the maximum weight. As a direct application, we have a distributed 2-approximation algorithm for minimum-weight vertex cover, with the same running time. We also show how to find an f-approximation of minimum-weight set cover in O(f²k² + fk log* W) rounds; here k is the maximum size of a subset in the set cover instance, f is the maximum frequency of an element, and W is the maximum weight of a subset. The algorithms are deterministic, and they can be applied in anonymous networks.
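To make the link between edge packings and vertex covers concrete, here is a minimal sequential sketch. It is not the distributed algorithm of the abstract; it is the familiar centralized greedy, and the function names and the sample graph are illustrative assumptions. An edge packing gives every edge a value y(e) ≥ 0 so that the values around each node sum to at most the node's weight; it is maximal when every edge has a saturated endpoint, and the saturated nodes then form a vertex cover of weight at most twice the total packing value.

```python
from fractions import Fraction


def greedy_edge_packing(weights, edges):
    """Build a maximal edge packing one edge at a time (centralized sketch).

    weights: dict mapping node -> positive weight
    edges:   list of (u, v) pairs
    Returns (y, cover), where y maps each edge to its packing value and
    cover is the set of saturated nodes.
    """
    residual = {v: Fraction(w) for v, w in weights.items()}
    y = {}
    for u, v in edges:
        # Raise y(e) until one endpoint runs out of residual weight.
        inc = min(residual[u], residual[v])
        y[(u, v)] = inc
        residual[u] -= inc
        residual[v] -= inc
    cover = {v for v, r in residual.items() if r == 0}
    return y, cover


def is_vertex_cover(cover, edges):
    """True if every edge has at least one endpoint in cover."""
    return all(u in cover or v in cover for u, v in edges)


# Hypothetical 4-cycle with made-up weights.
WEIGHTS = {"a": 3, "b": 1, "c": 2, "d": 2}
EDGES = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "d")]
PACKING, COVER = greedy_edge_packing(WEIGHTS, EDGES)
```

On this sample the saturated nodes are b, c and d (weight 5), while the packing values sum to 3, so the cover stays within the factor-2 bound of 6.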
Abstract:
In a max-min LP, the objective is to maximise ω subject to Ax ≤ 1, Cx ≥ ω1, and x ≥ 0 for nonnegative matrices A and C. We present a local algorithm (constant-time distributed algorithm) for approximating max-min LPs. The approximation ratio of our algorithm is the best possible for any local algorithm; there is a matching unconditional lower bound.
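Written out as an optimisation problem, and reading the symbol 1 in the abstract as the all-ones vector (an assumption about the notation), the max-min LP is:

```latex
\begin{aligned}
\text{maximise}\quad   & \omega \\
\text{subject to}\quad & A x \le \mathbf{1}, \\
                       & C x \ge \omega \mathbf{1}, \\
                       & x \ge 0 .
\end{aligned}
```

Each row of A is a packing constraint and each row of C is a covering constraint whose value must reach at least ω.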
Abstract:
Determining the sequence of amino acid residues in a heteropolymer chain of a protein with a given conformation is a discrete combinatorial problem that is not generally amenable to gradient-based continuous optimization algorithms. In this paper we present a new approach to this problem using continuous models. In this modeling, continuous "state functions" are proposed to designate the type of each residue in the chain. Such a continuous model helps define a continuous sequence space in which a chosen criterion is optimized to find the most appropriate sequence. Searching a continuous sequence space using a deterministic optimization algorithm makes it possible to find the optimal sequences with much less computation than many other approaches. The computational efficiency of this method is further improved by combining it with a graph spectral method, which explicitly takes into account the topology of the desired conformation and also helps make the combined method more robust. The continuous modeling used here appears to have additional advantages in mimicking the folding pathways and in creating the energy landscapes that help find sequences with high stability and kinetic accessibility. To illustrate the new approach, a widely used simplifying assumption is made by considering only two types of residues: hydrophobic (H) and polar (P). Self-avoiding compact lattice models are used to validate the method with known results in the literature and data that can be practically obtained by exhaustive enumeration on a desktop computer. We also present examples of sequence design for the HP models of some real proteins, which are solved in less than five minutes on a single-processor desktop computer. Some open issues and future extensions are noted.
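As background on the HP simplification, the sketch below scores a sequence on a fixed 2D square-lattice conformation using the common convention that each pair of H residues that are lattice neighbours but not chain neighbours contributes -1. The abstract does not state its scoring criterion, so this convention, the function name, and the sample conformation are assumptions for illustration only.

```python
def hp_energy(sequence, coords):
    """Energy of an HP sequence on a self-avoiding 2D lattice conformation.

    sequence: string of 'H' and 'P', one character per residue
    coords:   list of (x, y) lattice points, one per residue, in chain order
    Returns minus the number of non-consecutive H-H lattice contacts.
    """
    if len(sequence) != len(coords):
        raise ValueError("sequence and conformation lengths differ")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")

    index_at = {point: i for i, point in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Looking only right and up counts each neighbouring pair once.
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = index_at.get(neighbour)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return -contacts


# Hypothetical 4-residue chain folded into a unit square.
SQUARE = [(0, 0), (1, 0), (1, 1), (0, 1)]
ENERGY_HPPH = hp_energy("HPPH", SQUARE)  # the two end residues touch
ENERGY_PHHP = hp_energy("PHHP", SQUARE)  # the H residues are chain neighbours
```

For a small target conformation, sequence design can then be brute-forced by evaluating `hp_energy` over all H/P strings, which is the kind of exhaustive enumeration the abstract uses for validation.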
Abstract:
Catalytic combustion of H2 was carried out over combustion-synthesized noble metal (Pd or Pt) ion-substituted CeO2-based catalysts using a feed stream that simulated exhaust gases from a fuel cell processor. The catalysts showed a high activity for H2 combustion, and complete conversion was achieved below 200 °C over all the catalysts when O2 was used in a stoichiometric amount. With higher amounts of O2, the reaction rates increased and complete conversions were possible below 100 °C. The reaction was also carried out over Pd-impregnated CeO2. The conversions of H2 with a stoichiometric amount of O2 were found to be higher over the Pd-substituted compound. The mechanism of the reaction over noble metal-substituted compounds was proposed on the basis of X-ray photoelectron spectroscopy studies. The redox couples between Ce and metal ions were established, and a dual-site redox mechanism was proposed for the reaction. (C) 2010 Elsevier B.V. All rights reserved.
Abstract:
Even research models of helicopter dynamics often lead to a large number of equations of motion with periodic coefficients; and Floquet theory is a widely used mathematical tool for dynamic analysis. Presently, three approaches are used in generating the equations of motion. These are (1) general-purpose symbolic processors such as REDUCE and MACSYMA, (2) a special-purpose symbolic processor, DEHIM (Dynamic Equations for Helicopter Interpretive Models), and (3) completely numerical approaches. In this paper, comparative aspects of the first two purely algebraic approaches are studied by applying REDUCE and DEHIM to the same set of problems. These problems range from a linear model with one degree of freedom to a mildly non-linear multi-bladed rotor model with several degrees of freedom. Further, computational issues in applying Floquet theory are also studied, which refer to (1) the equilibrium solution for periodic forced response together with the transition matrix for perturbations about that response and (2) a small number of eigenvalues and eigenvectors of the unsymmetric transition matrix. The study showed the following: (1) compared to REDUCE, DEHIM is far more portable and economical, but it is also less user-friendly, particularly during learning phases; (2) the problems of finding the periodic response and eigenvalues are well conditioned.
Abstract:
In this paper we propose a novel technique to model and analyze the performability of parallel and distributed architectures using GSPN-reward models.
Abstract:
The design of a dual-DSP microprocessor system and its application for parallel FFT and two-dimensional convolution are explained. The system is based on a master-slave configuration. Two ADSP-2101s are configured as slave processors and a PC/AT serves as the master. The master serves as a control processor to transfer the program code and data to the DSPs. The system architecture and the algorithms for the two applications, viz. FFT and two-dimensional convolution, are discussed.
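The abstract does not say how the FFT is divided between the two DSPs. One common way to split a transform across two workers is a single radix-2 decimation-in-time step: each worker transforms half of the samples, and a short combine pass applies the twiddle factors. The sketch below illustrates that idea sequentially; the function names and the worker assignment in the comments are hypothetical.

```python
import cmath


def dft(samples):
    """Direct O(n^2) discrete Fourier transform, used as the per-worker kernel."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def two_way_fft(samples):
    """Transform an even-length sequence via two independent half-size transforms."""
    n = len(samples)
    if n % 2:
        raise ValueError("length must be even")
    half = n // 2
    even = dft(samples[0::2])  # work that could go to a hypothetical worker 0
    odd = dft(samples[1::2])   # work that could go to a hypothetical worker 1
    out = [0j] * n
    for k in range(half):
        # Butterfly: combine the two half-size spectra with a twiddle factor.
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + half] = even[k] - twiddle
    return out


SAMPLES = [1, 2, 3, 4, 0, -1, -2, -3]
SPECTRUM = two_way_fft(SAMPLES)
```

The two half-size transforms share no data, so in a master-slave arrangement only the combine loop needs results from both sides.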
Abstract:
In lake-rich regions, the gathering of information about water quality is challenging because only a small proportion of the lakes can be assessed each year by conventional methods. One of the techniques for improving the spatial and temporal representativeness of lake monitoring is remote sensing from satellites and aircraft. The experimental material included detailed optical measurements in 11 lakes, air- and spaceborne remote sensing measurements with concurrent field sampling, automatic raft measurements and a national dataset of routine water quality measurements from over 1100 lakes. The analyses of the spatially high-resolution airborne remote sensing data from eutrophic and mesotrophic lakes showed that one or a few discrete water quality observations using conventional monitoring can yield a clear over- or underestimation of the overall water quality in a lake. The use of TM-type satellite instruments in addition to routine monitoring results substantially increases the number of lakes for which water quality information can be obtained. The preliminary results indicated that coloured dissolved organic matter (CDOM) can be estimated with TM-type satellite instruments, which could possibly be utilised as an aid in estimating the role of lakes in global carbon budgets. Based on the results of reflectance modelling and experimental data, the MERIS satellite instrument has optimal or near-optimal channels for the estimation of turbidity, chlorophyll a and CDOM in Finnish lakes. MERIS images with a 300 m spatial resolution can provide water quality information in different parts of large and medium-sized lakes, and can help fill in the gaps resulting from conventional monitoring. Algorithms that would not require simultaneous field data for algorithm training would increase the amount of remote sensing-based information available for lake monitoring.
The MERIS Boreal Lakes processor, trained with the optical data and concentration ranges provided by this study, enabled turbidity estimations with good accuracy without the need for algorithm correction with field measurements, while chlorophyll a and CDOM estimations require further development of the processor. The accuracy of interpreting chlorophyll a via semi-empirical algorithms can be improved by classifying lakes prior to interpretation according to their CDOM level and trophic status. Optical modelling indicated that the spectral diffuse attenuation coefficient can be estimated with reasonable accuracy from the measured water quality concentrations. This provides more detailed information on light attenuation from routine monitoring measurements than is available through the Secchi disk transparency. The results of this study improve the interpretation of lake water quality by remote sensing and encourage the use of remote sensing in lake monitoring.
Analyzing Cache Performance Bottlenecks of STM Applications and addressing them with Compiler's help
Abstract:
Software transactional memory (STM) is a promising programming paradigm for shared-memory multithreaded programs as an alternative to traditional lock-based synchronization. However, adoption of STM in mainstream software has been quite low due to its considerable overheads and its poor cache/memory performance. In this paper, we perform a detailed study of the cache behavior of STM applications and quantify the impact of different STM factors on the cache misses experienced by the applications. Based on our analysis, we propose a compiler-driven Lock-Data Colocation (LDC), targeted at reducing the cache overheads of STM. We show that LDC is effective in improving the cache behavior of STM applications by reducing the dcache miss latency and improving execution time.
Abstract:
In this work, we propose a new organization for the last-level shared cache of a multicore system. Our design is based on the observation that the Next-Use distance, measured in terms of intervening misses between the eviction of a line and its next use, for lines brought in by a given delinquent PC falls within a predictable range of values. We exploit this correlation to improve the performance of shared caches in multi-core architectures by proposing the NUcache organization.
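The Next-Use distance defined above can be measured on an address trace with a few lines of simulation. The sketch below is a simplification under stated assumptions: it models a single fully associative LRU cache, ignores which PC brought each line in, and counts only the misses that fall strictly between the miss that evicted a line and the miss that brings it back. It does not model the NUcache organization itself, and all names are hypothetical.

```python
from collections import OrderedDict


def next_use_distances(trace, capacity):
    """Return (line, distance) pairs for every line re-referenced after eviction.

    trace:    iterable of cache-line identifiers, in access order
    capacity: number of lines held by the (assumed) fully associative LRU cache
    """
    cache = OrderedDict()  # keys ordered from least to most recently used
    evicted_at = {}        # line -> miss count just after the evicting miss
    misses = 0
    distances = []
    for line in trace:
        if line in cache:
            cache.move_to_end(line)  # hit: refresh recency only
            continue
        if line in evicted_at:
            # Misses seen since eviction, not counting the current one.
            distances.append((line, misses - evicted_at.pop(line)))
        misses += 1
        if len(cache) >= capacity:
            victim, _ = cache.popitem(last=False)  # evict the LRU line
            evicted_at[victim] = misses
        cache[line] = None
    return distances


# Hypothetical trace on a 2-line cache: A is evicted by C and returns after D.
DISTANCES = next_use_distances(["A", "B", "C", "D", "A"], capacity=2)
```

Here line A is evicted when C misses, one further miss (D) intervenes, and A is then used again, so its measured distance is 1. Grouping such distances by the instruction that allocated the line would be the next step toward the per-PC statistic the abstract describes.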
Abstract:
High-performance video standards use prediction techniques to achieve high picture quality at low bit rates. The type of prediction determines the bit rate and the image quality. Intra prediction achieves high video quality with a significant reduction in bit rate. This paper presents an area-optimized architecture for intra prediction in H.264 decoding at HDTV resolution, with a target of 60 fps. The architecture was validated on a Virtex-5 FPGA-based platform, where it achieves a frame rate of 64 fps. The architecture is based on a multi-level memory hierarchy to reduce latency and ensure optimal resource utilization. It removes redundancy by reusing the same functional blocks across different modes. The proposed architecture uses only 13% of the total LUTs available on the Xilinx FPGA XC5VLX50T.
Abstract:
In this paper we present a cache coherence protocol for multistage interconnection network (MIN)-based multiprocessors with two distinct private caches: private-blocks caches (PCache) containing blocks private to a process and shared-blocks caches (SCache) containing data accessible by all processes. The architecture is extended by a coherence control bus connecting all shared-block cache controllers. Timing problems due to variable transit delays through the MIN are dealt with by introducing Transient states in the proposed cache coherence protocol. The impact of the coherence protocol on system performance is evaluated through a performance study of three phases. Assuming homogeneity of all nodes, a single-node queuing model (phase 3) is developed to analyze system performance. This model is solved for processor and coherence bus utilizations using the mean value analysis (MVA) technique with shared-blocks steady state probabilities (phase 1) and communication delays (phase 2) as input parameters. The performance of our system is compared to that of a system with an equivalent-sized unified cache and with a multiprocessor implementing a directory-based coherence protocol. System performance measures are verified through simulation.
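For readers unfamiliar with mean value analysis, the sketch below shows the textbook exact MVA recursion for a closed network of single-server queueing stations. It is a generic illustration, not the three-phase model of the abstract: the station list, the service demands, and the function name are made-up assumptions.

```python
def mean_value_analysis(demands, population, think_time=0.0):
    """Exact MVA for a closed, single-class network of queueing stations.

    demands:    service demand per station (time units per job)
    population: number of circulating jobs
    think_time: optional delay spent outside all stations
    Returns (throughput, utilisations, mean_queue_lengths).
    """
    queue = [0.0] * len(demands)
    throughput = 0.0
    for n in range(1, population + 1):
        # An arriving job sees the queue lengths of the network with n - 1 jobs.
        residence = [d * (1.0 + q) for d, q in zip(demands, queue)]
        throughput = n / (think_time + sum(residence))
        queue = [throughput * r for r in residence]  # Little's law per station
    utilisations = [throughput * d for d in demands]
    return throughput, utilisations, queue


# Hypothetical demands for two stations, e.g. a processor and a coherence bus.
DEMANDS = [0.2, 0.1]
X, UTIL, QUEUE = mean_value_analysis(DEMANDS, population=5)
```

With no think time, the mean queue lengths always sum to the population, which is a convenient sanity check on any implementation.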
Abstract:
A number of companies are trying to migrate large monolithic software systems to Service Oriented Architectures. A common approach to do this is to first identify and describe desired services (i.e., create a model), and then to locate portions of code within the existing system that implement the described services. In this paper we describe a detailed case study we undertook to match a model to an open-source business application. We describe the systematic methodology we used, the results of the exercise, as well as several observations that throw light on the nature of this problem. We also suggest and validate heuristics that are likely to be useful in partially automating the process of matching service descriptions to implementations.
Abstract:
The bipolar point spread function (PSF) corresponding to the Wiener filter for correcting linear-motion-blurred pictures is implemented in a noncoherent optical processor. The following two approaches are taken for this implementation: (1) the PSF is modulated and biased so that the resulting function is non-negative and (2) the PSF is split into its positive and sign-reversed negative parts, and these two parts are dealt with separately. The phase problem associated with arriving at the pupil function from these modified PSFs is solved using both analytical and combined analytical-iterative techniques available in the literature. The designed pupil functions are experimentally implemented, and deblurring in a noncoherent processor is demonstrated. The postprocessing required (i.e., demodulation in the first approach to modulating the PSF and intensity subtraction in the second approach) is carried out either in a coherent processor or with the help of a PC-based vision system. The deblurred outputs are presented.
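The second approach rests on the linearity of convolution: a bipolar kernel h can be written as h = h+ - h-, with both parts non-negative, so filtering with h equals filtering with each part separately and subtracting the results. The one-dimensional sketch below checks that identity numerically; the kernel values are made up and are not a real Wiener-filter PSF, and the function names are hypothetical.

```python
def split_bipolar(psf):
    """Split a bipolar kernel into non-negative positive and sign-reversed negative parts."""
    positive = [max(v, 0.0) for v in psf]
    negative = [max(-v, 0.0) for v in psf]
    return positive, negative


def convolve(signal, kernel):
    """Full discrete 1D convolution."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


# Made-up bipolar kernel and intensity signal.
PSF = [-0.5, 2.0, -0.5]
SIGNAL = [1.0, 1.0, 0.0, 3.0]

POSITIVE, NEGATIVE = split_bipolar(PSF)
DIRECT = convolve(SIGNAL, PSF)
# Two non-negative passes followed by an intensity subtraction.
RECOMBINED = [
    p - n for p, n in zip(convolve(SIGNAL, POSITIVE), convolve(SIGNAL, NEGATIVE))
]
```

Both intermediate outputs are non-negative, which is what allows each pass to be realised with intensities; only the final subtraction reintroduces negative values.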
Abstract:
We describe a compiler for the Flat Concurrent Prolog language on a message-passing multiprocessor architecture. This compiler permits symbolic and declarative programming in the syntax of Guarded Horn Rules. The implementation has been verified and tested on the 64-node PARAM parallel computer developed by C-DAC (Centre for the Development of Advanced Computing, India). Flat Concurrent Prolog (FCP) is a logic programming language designed for concurrent programming and parallel execution. It is a process-oriented language, which embodies dataflow synchronization and guarded-command as its basic control mechanisms. An identical algorithm is executed on every processor in the network. We assume regular network topologies like mesh, ring, etc. Each node has a local memory. The algorithm comprises two important parts: reduction and communication. The most difficult task is to integrate the solutions of problems that arise in the implementation in a coherent and efficient manner. We have tested the efficacy of the compiler on various benchmark problems of the ICOT project that have been reported in the recent book by Evan Tick. These problems include Quicksort, 8-queens, and Prime Number Generation. The results of the preliminary tests are favourable. We are currently examining issues like indexing and load balancing to further optimize our compiler.