26 results for General-purpose computing


Relevance:

80.00%

Publisher:

Abstract:

Packet forwarding is a memory-intensive application requiring multiple accesses through a trie structure. The efficiency of a cache for this application critically depends on the placement function to reduce conflict misses. Traditional placement functions use a one-level mapping that naively partitions trie nodes into cache sets. However, as a significant percentage of trie nodes are not useful, these schemes suffer from a non-uniform distribution of useful nodes to sets. This in turn results in increased conflict misses. Newer organizations such as variable associativity caches achieve flexibility in placement at the expense of increased hit latency. This makes them unsuitable for L1 caches. We propose a novel two-level mapping framework that retains the hit latency of one-level mapping yet incurs fewer conflict misses. This is achieved by introducing a second-level mapping which reorganizes the nodes in the naive initial partitions into refined partitions with a near-uniform distribution of nodes. Further, as this remapping is accomplished by simply adapting the index bits to a given routing table, the hit latency is not affected. We propose three new schemes which result in up to a 16% reduction in the number of misses and a 13% speedup in memory access time. In comparison, an XOR-based placement scheme, known to perform extremely well for general-purpose architectures, can obtain up to a 2% speedup in memory access time.
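To make the role of the placement function concrete, the following is a minimal sketch that compares a one-level low-order-bit mapping with a simple XOR-based mapping. It is not the two-level scheme proposed in the abstract; the cache geometry (64 sets), the function names, and the strided workload are all assumptions chosen for illustration.

```python
from collections import Counter

INDEX_BITS = 6                 # assumed geometry: 6 index bits
NUM_SETS = 1 << INDEX_BITS     # ... which gives 64 cache sets


def modulo_index(block_addr: int) -> int:
    """One-level mapping: the low-order bits select the set."""
    return block_addr & (NUM_SETS - 1)


def xor_index(block_addr: int) -> int:
    """XOR-based mapping: fold the next-higher bit field into the index."""
    low = block_addr & (NUM_SETS - 1)
    high = (block_addr >> INDEX_BITS) & (NUM_SETS - 1)
    return low ^ high


def max_set_load(addresses, index_fn) -> int:
    """Number of blocks that land in the most heavily loaded set."""
    loads = Counter(index_fn(a) for a in addresses)
    return max(loads.values())


# Hypothetical workload: 64 "useful" blocks whose addresses differ only
# above the index bits, the worst case for the low-order-bit mapping.
useful_nodes = [i * NUM_SETS for i in range(64)]

modulo_load = max_set_load(useful_nodes, modulo_index)
xor_load = max_set_load(useful_nodes, xor_index)
print(modulo_load, xor_load)
```

With this workload every block falls into one set under the low-order-bit mapping, while the XOR mapping spreads the blocks over all sets. A non-uniform distribution of this kind is what the abstract identifies as the source of conflict misses.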

Relevance:

80.00%

Publisher:

Abstract:

In this paper we present a framework for realizing arbitrary instruction set extensions (IEs) that are identified post-silicon. The proposed framework has two components, viz., an IE synthesis methodology and the architecture of a reconfigurable data-path for realization of such IEs. The IE synthesis methodology ensures maximal utilization of resources on the reconfigurable data-path. In this context we present the techniques used to realize IEs for applications that demand high throughput or those that must process data streams. The reconfigurable hardware, called HyperCell, comprises a reconfigurable execution fabric. The fabric is a collection of interconnected compute units. A typical use case of HyperCell is where it acts as a co-processor with a host and accelerates execution of IEs that are defined post-silicon. We demonstrate the effectiveness of our approach by evaluating the performance of some well-known integer kernels that are realized as IEs on HyperCell. Our methodology for realizing IEs through HyperCells permits overlapping of potentially all memory transactions with computations. By fully pipelining the data-path, we show significant improvement in performance for streaming applications over solutions based on general-purpose processors. (C) 2014 Elsevier B.V. All rights reserved.

Relevance:

80.00%

Publisher:

Abstract:

This paper presents a GPU implementation of normalized cuts for the road extraction problem using panchromatic satellite imagery. The roads are extracted in three stages, namely pre-processing, image segmentation and post-processing. Initially, the image is pre-processed to improve the tolerance by reducing the clutter (which mostly represents buildings, vegetation, and fallow regions). The road regions are then extracted using the normalized cuts algorithm. The normalized cuts algorithm is a graph-based partitioning approach whose focus lies in extracting the global impression (perceptual grouping) of an image rather than local features. For the segmented image, post-processing is carried out using morphological operations (erosion and dilation). Finally, the road-extracted image is overlaid on the original image. Here, a GPGPU (General Purpose Graphical Processing Unit) approach has been adopted to implement the same algorithm on the GPU for fast processing. A performance comparison of the proposed GPU implementation of the normalized cuts algorithm with the earlier algorithm (CPU implementation) is presented. From the results, we conclude that the computational improvement in terms of time grows as the size of the image increases for the proposed GPU implementation of normalized cuts. A qualitative and quantitative assessment of the segmentation results is also presented.
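The post-processing stage mentioned above relies on erosion and dilation. As a rough illustration only, the sketch below applies both to a small binary mask with a 3x3 structuring element in plain Python. The structuring element, the border handling (windows are clipped at the image edge), and the toy mask are assumptions; the abstract does not specify them, and this is a CPU sketch rather than the GPU implementation it describes.

```python
def _filter_3x3(image, combine):
    """Apply `combine` (min or max) over each pixel's 3x3 neighbourhood.

    Neighbourhoods are clipped at the border, so pixels outside the
    image are ignored (an assumption made for this sketch).
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            out[r][c] = combine(window)
    return out


def erode(image):
    """Binary erosion: a pixel survives only if its whole window is set."""
    return _filter_3x3(image, min)


def dilate(image):
    """Binary dilation: a pixel is set if any pixel in its window is set."""
    return _filter_3x3(image, max)


# Toy mask: a 3x3 "road" block plus an isolated clutter pixel at (0, 0).
mask = [
    [1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

# Erosion followed by dilation (a morphological opening) removes the
# isolated pixel while restoring the block to its original extent.
opened = dilate(erode(mask))
for row in opened:
    print(row)
```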

Relevance:

80.00%

Publisher:

Abstract:

Elettra is one of the first 3rd-generation storage rings, recently upgraded to routinely operate in top-up mode at both 2.0 and 2.4 GeV. The facility hosts four dedicated beamlines for crystallography, two open to the users and two under construction, and expected to be ready for public use in 2015. In service since 1994, XRD1 is a general-purpose diffraction beamline. The light source for this wide (4-21 keV) energy range beamline is a permanent magnet wiggler. XRD1 covers experiments ranging from grazing incidence X-ray diffraction to macromolecular crystallography, from industrial applications of powder diffraction to X-ray phasing with long wavelengths. The bending magnet powder diffraction beamline MCX has been open to users since 2009, with a focus on microstructural investigations and studies under non-ambient conditions. A superconducting wiggler delivers a high photon flux to a new fully automated beamline dedicated to macromolecular crystallography and to a branch beamline hosting a high-pressure powder X-ray diffraction station (both currently under construction). Users of the latter experimental station will have access to a specialized sample preparation laboratory, shared with the SISSI infrared beamline. A high throughput crystallization platform equipped with an imaging system for the remote viewing, evaluation and scoring of the macromolecular crystallization experiments has also been established and is open to the user community.

Relevance:

80.00%

Publisher:

Abstract:

A new class of dendrimers, the poly(propyl ether imine) (PETIM) dendrimer, has been shown to be a novel hyperbranched polymer having potential applications as a drug delivery vehicle. The structure and dynamics of the amine-terminated PETIM dendrimer and their changes with respect to the dendrimer generation are poorly understood. Since most drugs are hydrophobic in nature, the extent of hydrophobicity of the dendrimer core is related to its drug encapsulation and retention efficacy. In this study, we carry out fully atomistic molecular dynamics (MD) simulations to characterize the structure of PETIM (G2-G6) dendrimers in salt solution as a function of dendrimer generation at different protonation levels. Structural properties such as radius of gyration (R_g), radial density distribution, aspect ratio, and asphericity are calculated. In order to assess the hydrophilicity of the dendrimer, we compute the number of bound water molecules in the interior of the dendrimer as well as the number of dendrimer-water hydrogen bonds. We conclude that PETIM dendrimers have relatively greater hydrophobicity and flexibility when compared with their extensively investigated PAMAM counterparts. Hence PETIM dendrimers are expected to have stronger interactions with lipid membranes as well as improved drug encapsulation and retention properties when compared with PAMAM dendrimers. We compute the root-mean-square fluctuation of the dendrimers as well as their entropy to quantify the flexibility of the dendrimer. Finally, we note that structural and solvation properties computed using force field parameters derived from the CHARMM general-purpose force field were in good quantitative agreement with those obtained using the generalized Amber force field (GAFF).
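Of the structural properties listed above, the radius of gyration is the simplest to state. The sketch below uses the standard mass-weighted definition, R_g^2 = sum_i m_i |r_i - r_cm|^2 / sum_i m_i, for a single configuration. The function name and the two-particle example are illustrative assumptions and do not come from the study, which averages such quantities over MD trajectories.

```python
import math


def radius_of_gyration(positions, masses):
    """Mass-weighted radius of gyration of one configuration.

    positions: list of (x, y, z) tuples; masses: list of floats.
    """
    total_mass = sum(masses)
    # Centre of mass, computed component by component.
    com = [
        sum(m * p[k] for p, m in zip(positions, masses)) / total_mass
        for k in range(3)
    ]
    # Mass-weighted mean squared distance from the centre of mass.
    weighted_sq = sum(
        m * sum((p[k] - com[k]) ** 2 for k in range(3))
        for p, m in zip(positions, masses)
    )
    return math.sqrt(weighted_sq / total_mass)


# Hypothetical example: two equal masses one unit either side of the origin.
rg = radius_of_gyration([(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [12.0, 12.0])
print(rg)
```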

Relevance:

40.00%

Publisher:

Abstract:

A Geodesic Constant Method (GCM) is outlined which provides a common approach to ray tracing on quadric cylinders in general and yields, in closed form, all the surface ray-geometric parameters required in the UTD mutual coupling analysis of conformal antenna arrays. The approach allows the incorporation of a shaping parameter, which permits the modeling of quadric cylindrical surfaces of desired sharpness/flatness with a common set of equations. The mutual admittance between the slots on a general parabolic cylinder is obtained as an illustration of the applicability of the GCM.

Relevance:

30.00%

Publisher:

Abstract:

Conformance testing focuses on checking whether an implementation under test (IUT) behaves according to its specification. Typically, testers are interested in performing targeted tests that exercise certain features of the IUT. This intention is formalized as a test purpose. The tester needs a "strategy" to reach the goal specified by the test purpose. Also, for a particular test case, the strategy should tell the tester whether the IUT has passed, failed, or deviated from the test purpose. In [8] Jeron and Morel show how to compute, for a given finite state machine specification and a test purpose automaton, a complete test graph (CTG) which represents all test strategies. In this paper, we consider the case when the specification is a hierarchical state machine and show how to compute a hierarchical CTG which preserves the hierarchical structure of the specification. We also propose an algorithm for an online test oracle which avoids the space overhead associated with the CTG.

Relevance:

30.00%

Publisher:

Abstract:

A symmetric solution X satisfying the matrix equation XA = A^t X, where A^t denotes the transpose of A, is called a symmetrizer of the matrix A. A general algorithm to compute a matrix symmetrizer is obtained. A new multiple-modulus residue arithmetic called floating-point modular arithmetic is described and implemented on the algorithm to compute an error-free matrix symmetrizer.
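The definition above can be checked directly on a small case. The sketch below is only a verifier for the defining equation, using Python's exact integer arithmetic; it is neither the general algorithm nor the floating-point modular arithmetic described in the abstract. The helper names and the 2x2 example are assumptions. For a 2x2 matrix A = [[a, b], [c, d]], the diagonal matrix X = diag(c, b) satisfies XA = A^t X, which can be confirmed by multiplying out both sides.

```python
def transpose(m):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [
        [sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
        for row in a
    ]


def is_symmetrizer(x, a):
    """True if x is symmetric and satisfies x a = a^t x exactly."""
    return x == transpose(x) and matmul(x, a) == matmul(transpose(a), x)


# Hypothetical example: A is not symmetric, X = diag(c, b) = diag(3, 2).
A = [[1, 2],
     [3, 4]]
X = [[3, 0],
     [0, 2]]

XA = matmul(X, A)
print(XA, is_symmetrizer(X, A))
```

Because X is symmetric and XA = A^t X, the product XA equals its own transpose, which is why such an X is said to symmetrize A.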

Relevance:

30.00%

Publisher:

Abstract:

Under the project `Seasonal Prediction of the Indian Monsoon' (SPIM), the prediction of Indian summer monsoon rainfall by five atmospheric general circulation models (AGCMs) during 1985-2004 was assessed. The project was a collaborative effort of the coordinators and scientists from the different modelling groups across the country. All the runs were made at the Centre for Development of Advanced Computing (CDAC) at Bangalore on the PARAM Padma supercomputing system. Two sets of simulations were made for this purpose. In the first set, the AGCMs were forced by the observed sea surface temperature (SST) for May-September during 1985-2004. In the second set, runs were made for 1987, 1988, 1994, 1997 and 2002 forced by SST which was obtained by assuming that the April anomalies persist during May-September. The results of the first set of runs show, as expected from earlier studies, that none of the models were able to simulate the correct sign of the anomaly of the Indian summer monsoon rainfall for all the years. However, among the five models, one simulated the correct sign in the largest number of years and the second model showed maximum skill in the simulation of the extremes (i.e. droughts or excess rainfall years). The first set of runs showed some common bias which could arise either from an excessive sensitivity of the models to El Nino Southern Oscillation (ENSO) or an inability of the models to simulate the link of the Indian monsoon rainfall to Equatorial Indian Ocean Oscillation (EQUINOO), or both. Analysis of the second set of runs showed that with a weaker ENSO forcing, some models could simulate the link with EQUINOO, suggesting that the errors in the monsoon simulations with observed SST by these models could be attributed to unrealistically high sensitivity to ENSO.

Relevance:

30.00%

Publisher:

Abstract:

The general procedure for synthesizing the rack and pinion mechanism up to seven precision conditions is developed. To illustrate the method, the mechanism has been synthesized in closed form for three precision conditions of path generation, two positions of function generation, and a velocity condition at one of the precision points. This mechanism has a number of advantages over conventional four bar mechanisms. First, since the rack is always tangent to the pinion, the transmission angle is always 90 deg minus the pressure angle of the rack. Second, with both translation and rotation of the rack occurring, multiple outputs are available. Other advantages include the generation of monotonic functions for a wide variety of motion and nonmonotonic functions for a full range of motion as well as nonlinear amplified motions. In this work the mechanism is made to satisfy a number of practical design requirements such as completely rotatable input crank and others. By including the velocity specification, the designer has considerably more control of the output motion. The method of solution developed in this work uses the complex number method of mechanism synthesis. A numerical example is included.

Relevance:

30.00%

Publisher:

Abstract:

Computing the maximum of sensor readings arises in several environmental, health, and industrial monitoring applications of wireless sensor networks (WSNs). We characterize the several novel design trade-offs that arise when green energy harvesting (EH) WSNs, which promise perpetual lifetimes, are deployed for this purpose. The nodes harvest renewable energy from the environment for communicating their readings to a fusion node, which then periodically estimates the maximum. For a randomized transmission schedule in which a pre-specified number of randomly selected nodes transmit in a sensor data collection round, we analyze the mean absolute error (MAE), which is defined as the mean of the absolute difference between the maximum and that estimated by the fusion node in each round. We optimize the transmit power and the number of scheduled nodes to minimize the MAE, both when the nodes have channel state information (CSI) and when they do not. Our results highlight how the optimal system operation depends on the EH rate, availability and cost of acquiring CSI, quantization, and size of the scheduled subset. Our analysis applies to a general class of sensor reading and EH random processes.
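The mean absolute error defined above can be estimated by simulation in a heavily simplified setting. The sketch below assumes that readings are independent and uniform on [0, 1] and that every scheduled node's reading reaches the fusion node perfectly. Energy harvesting, fading channels, CSI, and quantization, which drive the trade-offs in the abstract, are deliberately left out. The function name and all parameter values are hypothetical.

```python
import random


def estimate_mae(num_nodes, num_scheduled, rounds, seed=0):
    """Monte Carlo estimate of the MAE of the fusion node's maximum.

    In each round every node draws a reading, a random subset of
    `num_scheduled` nodes transmits, and the fusion node takes the
    maximum of what it received as its estimate.
    """
    rng = random.Random(seed)
    total_error = 0.0
    for _ in range(rounds):
        readings = [rng.random() for _ in range(num_nodes)]
        received = rng.sample(readings, num_scheduled)
        # The estimate can never exceed the true maximum, so the
        # absolute error is simply the difference.
        total_error += max(readings) - max(received)
    return total_error / rounds


mae_few = estimate_mae(num_nodes=20, num_scheduled=2, rounds=2000)
mae_more = estimate_mae(num_nodes=20, num_scheduled=5, rounds=2000)
mae_all = estimate_mae(num_nodes=20, num_scheduled=20, rounds=2000)
print(mae_few, mae_more, mae_all)
```

In this idealized model, scheduling more nodes can only reduce the error. In the setting of the abstract, scheduling more nodes also spends harvested energy, which is one reason the size of the scheduled subset becomes a quantity to optimize.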