Digital Library

163 results for structured parallel computations

TCP: thread contention predictor for parallel programs

Relevance: 20.00%

Abstract:

With the proliferation of chip multicores (CMPs) on desktops and embedded platforms, multi-threaded programs have become ubiquitous. The existence of multiple threads may cause resource contention, for example in the on-chip shared cache and interconnects, depending on how the threads access resources. Hence, we propose a tool, Thread Contention Predictor (TCP), to help quantify the number of threads sharing data and their sharing pattern. We demonstrate its use to predict a more profitable shared last-level on-chip cache (LLC) access policy on CMPs. Our cache configuration predictor is 2.2 times faster than cycle-accurate simulation. We also demonstrate its use for identifying hot data structures in a program that may cause performance degradation due to false data sharing. We fix the layout of such data structures and show improvements of up to 10% in execution time and 18% in energy-delay product (EDP).
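As a rough illustration of the kind of sharing analysis this abstract describes, the sketch below groups a memory-access trace by cache line and flags lines that several threads touch at non-overlapping addresses, a common signature of false sharing. The trace format, the 64-byte line size, and the name `find_false_sharing` are assumptions made for this example; it is not the TCP tool itself.

```python
from collections import defaultdict

LINE_SIZE = 64  # assumed cache-line size in bytes


def find_false_sharing(trace, line_size=LINE_SIZE):
    """Return {cache_line: sorted thread ids} for lines that look falsely shared.

    `trace` is an iterable of (thread_id, byte_address) pairs (hypothetical format).
    A line is flagged when two or more threads touch it but no single address
    on that line is touched by more than one thread.
    """
    # cache line -> address -> set of threads that touched that address
    lines = defaultdict(lambda: defaultdict(set))
    for thread_id, address in trace:
        lines[address // line_size][address].add(thread_id)

    flagged = {}
    for line, by_address in lines.items():
        threads = set().union(*by_address.values())
        truly_shared = any(len(t) > 1 for t in by_address.values())
        if len(threads) > 1 and not truly_shared:
            flagged[line] = sorted(threads)
    return flagged


# Two threads update adjacent 8-byte counters on the same line (line 0),
# while address 128 (line 2) is genuinely shared by both threads.
trace = [(0, 0), (1, 8), (0, 0), (1, 8), (0, 128), (1, 128)]
report = find_false_sharing(trace)
print(report)
```

In this toy trace only line 0 is reported. Moving the second counter to its own line (for example, to address 64) would clear the report, which mirrors the layout fix mentioned in the abstract.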

Analysis of the degrees-of-freedom of spatial parallel manipulators in regular and singular configurations

Relevance: 20.00%

Abstract:

This paper presents a study of the nature of the degrees-of-freedom of spatial manipulators based on the concept of partition of degrees-of-freedom. In particular, the partitioning of degrees-of-freedom is studied in five lower-mobility spatial parallel manipulators possessing different combinations of degrees-of-freedom. An extension of the existing theory is introduced so as to analyse the nature of the gained degree(s)-of-freedom at a gain-type singularity. The gain of one- and two-degrees-of-freedom is analysed in several well-studied, as well as newly developed manipulators. The formulations also present a basis for the analysis of the velocity kinematics of manipulators of any architecture. © 2013 Elsevier Ltd. All rights reserved.

CUDA-for-clusters: a system for efficient execution of CUDA kernels on multi-core clusters

Relevance: 20.00%

Abstract:

Rapid advancements in multi-core processor architectures coupled with low-cost, low-latency, high-bandwidth interconnects have made clusters of multi-core machines a common computing resource. Unfortunately, writing good parallel programs that efficiently utilize all the resources in such a cluster is still a major challenge. Various programming languages have been proposed as a solution to this problem, but are yet to be adopted widely to run performance-critical code mainly due to the relatively immature software framework and the effort involved in re-writing existing code in the new language. In this paper, we motivate and describe our initial study in exploring CUDA as a programming language for a cluster of multi-cores. We develop CUDA-For-Clusters (CFC), a framework that transparently orchestrates execution of CUDA kernels on a cluster of multi-core machines. The well-structured nature of a CUDA kernel, the growing popularity, support and stability of the CUDA software stack collectively make CUDA a good candidate to be considered as a programming language for a cluster. CFC uses a mixture of source-to-source compiler transformations, a work distribution runtime and a light-weight software distributed shared memory to manage parallel executions. Initial results on running several standard CUDA benchmark programs achieve impressive speedups of up to 7.5X on a cluster with 8 nodes, thereby opening up an interesting direction of research for further investigation.

D-A-D-structured conducting polymer-modified electrodes for detection of lead(II) ions in water

Relevance: 20.00%

Abstract:

A donor-acceptor-donor-structured, thiophene derivative-based conducting polymer, poly(7,9-dithiophene-2yl-8H-cyclopenta[a]acenaphthalene-8-one), was chemically synthesized. This polymer was used to modify both glassy-carbon and carbon-paste electrodes, which were used to detect lead(II) ions present in water in the range of 1 mM to 0.1 µM. Cyclic voltammetry confirms the formation of the coordination complex between the soft segment of the polymer and the dissolved lead ion. Anodic stripping voltammetry was carried out with the modified electrode to determine the lower limit of detection of dissolved lead(II) species in solution. Differential adsorptive stripping and impedance measurements were also conducted to find the lowest possible response of the as-synthesized polymer to lead(II) ions in water. The electrochemical performance of the modified electrodes in different pH environments (4, 7 and 9) was evaluated by stripping voltammetry to find the optimum sensitivity and stability under these conditions. Finally, interference analysis was carried out to determine the modified electrode's sensitivity towards lead ions in water.

A Donor-Acceptor-Donor Structured Organic Conductor with S···S Chalcogen Bonding

Relevance: 20.00%

Abstract:

A novel thiophene derivative, 7,9-di(thiophen-2-yl)-8H-cyclopenta[a]acenaphthylen-8-one (DTCPA), is shown to exhibit high electrical conductivity (1.97 × 10⁻² ± 0.0018 S/cm at RT) in the crystalline state. The material shows an increase in conductivity of two orders of magnitude from the normal solid to the single-crystalline state. The crystal structure has S···S chalcogen bonding, C-H···O hydrogen bonding, and π···π stacking as the major intermolecular interactions. The nature and strength of the S···S interactions in this structure have been evaluated by theoretical charge density analysis, and their contribution to the crystal packing quantified by Hirshfeld surface analysis. Further, thermal and morphological characterizations have been carried out, and the second harmonic generation (SHG) efficiency has been measured using the Kurtz-Perry method.

Simulation of inhomogeneous distributions of ultracold atoms in an optical lattice via a massively parallel implementation of nonequilibrium strong-coupling perturbation theory

Relevance: 20.00%

Abstract:

We present a nonequilibrium strong-coupling approach to inhomogeneous systems of ultracold atoms in optical lattices. We demonstrate its application to the Mott-insulating phase of a two-dimensional Fermi-Hubbard model in the presence of a trap potential. Since the theory is formulated self-consistently, the numerical implementation relies on a massively parallel evaluation of the self-energy and the Green's function at each lattice site, employing thousands of CPUs. While the computation of the self-energy is straightforward to parallelize, the evaluation of the Green's function requires the inversion of a large sparse 10^d × 10^d matrix, with d > 6. As a crucial ingredient, our solution heavily relies on the smallness of the hopping as compared to the interaction strength and yields a widely scalable realization of a rapidly converging iterative algorithm which evaluates all elements of the Green's function. Results are validated by comparing with the homogeneous case via the local-density approximation. These calculations also show that the local-density approximation is valid in nonequilibrium setups without mass transport.

PocketMatch (version 2.0): A parallel algorithm for the detection of structural similarities between protein ligand binding-sites

Relevance: 20.00%

Abstract:

Knowledge of protein-ligand interactions is essential to understand several biological processes and important for applications ranging from understanding protein function to drug discovery and protein engineering. Here, we describe an algorithm for the comparison of three-dimensional ligand-binding sites in protein structures. A previously described algorithm, PocketMatch (version 1.0), is optimised, expanded, and MPI-enabled for parallel execution. PocketMatch (version 2.0) rapidly quantifies binding-site similarity based on structural descriptors such as residue nature and interatomic distances. Atomic-scale alignments may also be obtained from the amino acid residue pairings generated. It allows an end-user to compute database-wide, all-to-all comparisons in a matter of hours. The use of our algorithm on a sample dataset, a performance analysis, and annotated source code are also included.

Parallel Flow-Sensitive Pointer Analysis by Graph-Rewriting

Relevance: 20.00%

Abstract:

Precise pointer analysis is a problem of interest to both the compiler and the program verification community. Flow-sensitivity is an important dimension of pointer analysis that affects the precision of the final result computed. Scaling flow-sensitive pointer analysis to millions of lines of code is a major challenge. Recently, staged flow-sensitive pointer analysis has been proposed, which exploits a sparse representation of program code created by staged analysis. In this paper we formulate the staged flow-sensitive pointer analysis as a graph-rewriting problem. Graph-rewriting has already been used for flow-insensitive analysis. However, formulating flow-sensitive pointer analysis as a graph-rewriting problem poses additional challenges due to the nature of flow-sensitivity. We implement our parallel algorithm using Intel Threading Building Blocks and demonstrate considerable scaling (up to 2.6x) for 8 threads on a set of 10 benchmarks. Compared to the sequential implementation of staged flow-sensitive analysis, a single-threaded execution of our implementation performs better in 8 of the benchmarks.

Synthesis of self-light-scattering wrinkle structured ZnO photoanode by sol-gel method for dye-sensitized solar cells

Relevance: 20.00%

Abstract:

A zinc oxide (ZnO) photoanode with a special morphology for dye-sensitized solar cells was fabricated by a simple sol-gel drop-casting technique. The film shows a wrinkled structure resembling the roots of a banyan tree, which acts as an effective self-scattering layer for harvesting more visible light and offers an easy transport path for photo-injected electrons. This ZnO electrode of low thickness (~5 µm) achieved an enhanced short-circuit current density of 6.15 mA/cm², an open-circuit voltage of 0.67 V, a fill factor of 0.47 and an overall conversion efficiency of 1.97% under 1 sun illumination. This represents a higher conversion efficiency and superior performance compared with a thicker (~8 µm) ZnO nanoparticle-based photoanode (η ~ 1.13%).

Isotropic finite volume discretization

Relevance: 20.00%

Abstract:

Finite volume methods traditionally employ dimension-by-dimension extension of the one-dimensional reconstruction and averaging procedures to achieve spatial discretization of the governing partial differential equations on a structured Cartesian mesh in multiple dimensions. This simple approach based on tensor product stencils introduces an undesirable grid orientation dependence in the computed solution. The resulting anisotropic errors lead to a disparity in the calculations that is most prominent between directions parallel and diagonal to the grid lines. In this work we develop isotropic finite volume discretization schemes which minimize such grid orientation effects in multidimensional calculations by eliminating the directional bias in the lowest order term in the truncation error. Explicit isotropic expressions that relate the cell face averaged line and surface integrals of a function and its derivatives to the given cell area and volume averages are derived in two and three dimensions, respectively. It is found that a family of isotropic approximations with a free parameter can be derived by combining isotropic schemes based on next-nearest and next-next-nearest neighbors in three dimensions. Use of these isotropic expressions alone in a standard finite volume framework, however, is found to be insufficient in enforcing rotational invariance when the flux vector is nonlinear and/or spatially non-uniform. The rotationally variant terms which lead to a loss of isotropy in such cases are explicitly identified and recast in a differential form. Various forms of flux correction terms which allow for a full recovery of rotational invariance in the lowest order truncation error terms, while preserving the formal order of accuracy and discrete conservation of the original finite volume method, are developed. Numerical tests in two and three dimensions attest to the superior directional attributes of the proposed isotropic finite volume method.
Prominent anisotropic errors, such as spurious asymmetric distortions on a circular reaction-diffusion wave that feature in the conventional finite volume implementation, are effectively suppressed through isotropic finite volume discretization. Furthermore, for a given spatial resolution, a striking improvement in the prediction of kinetic energy decay rate corresponding to a general two-dimensional incompressible flow field is observed with the use of an isotropic finite volume method instead of the conventional discretization. © 2014 Elsevier Inc. All rights reserved.
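The grid-orientation effect described in this abstract can be seen in a much simpler setting than the paper's finite volume schemes. The sketch below is a finite-difference stand-in, not the authors' method: it compares the standard 5-point Laplacian with the well-known 9-point stencil with weights (1, 4, -20)/6, whose leading truncation error is rotationally invariant. The test functions and the step size are choices made for this example.

```python
import math


def laplacian_5pt(u, x, y, h):
    """Standard 5-point (dimension-by-dimension) Laplacian."""
    return (u(x + h, y) + u(x - h, y) + u(x, y + h) + u(x, y - h) - 4 * u(x, y)) / h**2


def laplacian_9pt(u, x, y, h):
    """9-point stencil with weights (1, 4, -20) / 6; its leading error is isotropic."""
    edges = u(x + h, y) + u(x - h, y) + u(x, y + h) + u(x, y - h)
    corners = u(x + h, y + h) + u(x + h, y - h) + u(x - h, y + h) + u(x - h, y - h)
    return (4 * edges + corners - 20 * u(x, y)) / (6 * h**2)


def axis_aligned(x, y):
    # Quartic profile varying along the x grid direction.
    return x**4


def diagonal(x, y):
    # The same profile rotated by 45 degrees.
    return ((x + y) / math.sqrt(2)) ** 4


h = 0.1
# Both test functions have an exact Laplacian of 0 at the origin,
# so the stencil output at (0, 0) is the truncation error itself.
err5 = (laplacian_5pt(axis_aligned, 0.0, 0.0, h), laplacian_5pt(diagonal, 0.0, 0.0, h))
err9 = (laplacian_9pt(axis_aligned, 0.0, 0.0, h), laplacian_9pt(diagonal, 0.0, 0.0, h))
print("5-point errors (axis, diagonal):", err5)
print("9-point errors (axis, diagonal):", err9)
```

For these quartic profiles the 5-point error is 2h² along the grid line but only h² along the diagonal, whereas the 9-point stencil gives 2h² in both orientations. The error does not vanish; it simply no longer depends on the direction.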

Laterally structured ripple and square phases with one and two dimensional thickness modulations in a model bilayer system

Relevance: 20.00%

Abstract:

Molecular dynamics simulations of bilayers in a surfactant/co-surfactant/water system with explicit solvent molecules show formation of topologically distinct gel phases depending upon the bilayer composition. At low temperatures, the bilayers transform from the tilted gel phase, Lβ', to the one dimensional (1D) rippled Pβ' phase as the surfactant concentration is increased. More interestingly, we observe a two dimensional (2D) square phase at higher surfactant concentration which, upon heating, transforms to the gel Lβ' phase. The thickness modulations in the 1D rippled and square phases are asymmetric in the two surfactant leaflets and the bilayer thickness varies by a factor of ~2 between maximum and minimum. The 1D ripple consists of a thinner interdigitated region of smaller extent alternating with a thicker non-interdigitated region. The 2D ripple phase is made up of two superimposed square lattices of maximum and minimum thicknesses with molecules of high tilt forming a square lattice translated from the lattice formed with the thickness minima. Using Voronoi diagrams we analyze the intricate interplay between the area-per-head-group, height modulations and chain tilt for the different ripple symmetries. Our simulations indicate that composition plays an important role in controlling the formation of low temperature gel phase symmetries and rippling accommodates the increased area-per-head-group of the surfactant molecules.

REPRESENTING A CUBIC GRAPH AS THE INTERSECTION GRAPH OF AXIS-PARALLEL BOXES IN THREE DIMENSIONS

Relevance: 20.00%

Abstract:

We show that every graph of maximum degree 3 can be represented as the intersection graph of axis parallel boxes in three dimensions, that is, every vertex can be mapped to an axis parallel box such that two boxes intersect if and only if their corresponding vertices are adjacent. In fact, we construct a representation in which any two intersecting boxes touch just at their boundaries.
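To make the definition in this abstract concrete, the sketch below builds the intersection graph of a set of closed axis-parallel boxes and checks it against a target graph. The box coordinates are a hand-made representation of the complete graph K4 (a cubic graph) in which intersecting boxes touch only at their boundaries; they are an illustration chosen for this example, not the construction from the paper, and the function names are hypothetical.

```python
from itertools import combinations


def boxes_intersect(a, b):
    """Closed axis-parallel boxes intersect iff their intervals overlap on every axis.

    A box is a tuple of (low, high) intervals, one per axis.
    """
    return all(lo1 <= hi2 and lo2 <= hi1 for (lo1, hi1), (lo2, hi2) in zip(a, b))


def intersection_graph(boxes):
    """Return the edge set of the intersection graph of a {vertex: box} mapping."""
    return {
        frozenset((u, v))
        for u, v in combinations(boxes, 2)
        if boxes_intersect(boxes[u], boxes[v])
    }


# K4: a bottom slab "a" with three boxes resting on top of it.
# Every pair shares part of a face, and no two interiors overlap.
k4_boxes = {
    "a": ((0, 2), (0, 2), (0, 1)),
    "b": ((0, 1), (0, 2), (1, 2)),
    "c": ((1, 2), (0, 1), (1, 2)),
    "d": ((1, 2), (1, 2), (1, 2)),
}
k4_edges = {frozenset(pair) for pair in combinations("abcd", 2)}
print(intersection_graph(k4_boxes) == k4_edges)
```

Pulling one box away from the others (for example, shifting "d" to x in [3, 4]) removes exactly the edges incident to it, which is a quick way to check the "if and only if" in the definition.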

Optical and structural properties of highly porous shell structured Fe doped TiO2 thin films

Relevance: 20.00%

Abstract:

TiO2 thin films with 0.2 wt%, 0.4 wt%, 0.6 wt%, and 0.8 wt% Fe were prepared on glass and silicon substrates using a sol-gel spin coating technique. As the Fe content increases, the optical cut-off points are increasingly red-shifted and the absorption edge shifts towards longer wavelengths. The optical band gap decreases from 3.03 to 2.48 eV, whereas the tail width increases from 0.26 to 1.43 eV. The X-ray diffraction (XRD) patterns for doped films at 0.2 wt% and 0.8 wt% Fe reveal no characteristic peaks, indicating that the films are amorphous, whereas undoped TiO2 exhibits (101) orientation with anatase phase. Thin films of higher Fe content exhibit a homogeneous, uniform, and nano-structured highly porous shell morphology.

Mining Unit Tests for Discovery and Migration of Math APIs

Relevance: 20.00%

Abstract:

Today's programming languages are supported by powerful third-party APIs. For a given application domain, it is common to have many competing APIs that provide similar functionality. Programmer productivity therefore depends heavily on the programmer's ability to discover suitable APIs both during an initial coding phase, as well as during software maintenance. The aim of this work is to support the discovery and migration of math APIs. Math APIs are at the heart of many application domains ranging from machine learning to scientific computations. Our approach, called MATHFINDER, combines executable specifications of mathematical computations with unit tests (operational specifications) of API methods. Given a math expression, MATHFINDER synthesizes pseudo-code comprised of API methods to compute the expression by mining unit tests of the API methods. We present a sequential version of our unit test mining algorithm and also design a more scalable data-parallel version. We perform extensive evaluation of MATHFINDER (1) for API discovery, where math algorithms are to be implemented from scratch and (2) for API migration, where client programs utilizing a math API are to be migrated to another API. We evaluated the precision and recall of MATHFINDER on a diverse collection of math expressions, culled from algorithms used in a wide range of application areas such as control systems and structural dynamics. In a user study to evaluate the productivity gains obtained by using MATHFINDER for API discovery, the programmers who used MATHFINDER finished their programming tasks twice as fast as their counterparts who used the usual techniques like web and code search, IDE code completion, and manual inspection of library documentation. For the problem of API migration, as a case study, we used MATHFINDER to migrate Weka, a popular machine learning library. 
Overall, our evaluation shows that MATHFINDER is easy to use, provides highly precise results across several math APIs and application domains even with a small number of unit tests per method, and scales to large collections of unit tests.

An open source massively parallel solver for Richards equation: Mechanistic modelling of water fluxes at the watershed scale

Relevance: 20.00%

Abstract:

In this paper we present a massively parallel open source solver for the Richards equation, named RichardsFOAM. This solver has been developed in the framework of the open source general-purpose computational fluid dynamics toolbox OpenFOAM® and is capable of dealing with problems that are large scale in both space and time. The source code for RichardsFOAM may be downloaded from the CPC program library website. It exhibits good parallel performance (up to ~90% parallel efficiency with 1024 processors in both strong and weak scaling), and the conditions required for obtaining such performance are analysed and discussed. This performance enables the mechanistic modelling of water fluxes at the scale of experimental watersheds (up to a few square kilometres of surface area) and on time scales of decades to a century. Such a solver can be useful in various applications, such as environmental engineering for the long-term transport of pollutants in soils, water engineering for assessing the impact of land settlement on water resources, or the study of weathering processes in watersheds. © 2014 Elsevier B.V. All rights reserved.
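For readers unfamiliar with the two scaling measures quoted in this abstract, the sketch below shows the usual textbook definitions of strong-scaling and weak-scaling parallel efficiency. The function names and the timings are hypothetical and chosen only to give round numbers; they are not measurements from RichardsFOAM.

```python
def strong_scaling_efficiency(t_ref, t_p, p, p_ref=1):
    """Fixed total problem size: the ideal runtime falls in proportion to p_ref / p."""
    return (t_ref * p_ref) / (t_p * p)


def weak_scaling_efficiency(t_ref, t_p):
    """Fixed problem size per processor: the ideal runtime stays constant."""
    return t_ref / t_p


# Hypothetical timings in seconds, for illustration only.
strong = strong_scaling_efficiency(t_ref=921.6, t_p=1.0, p=1024)
weak = weak_scaling_efficiency(t_ref=90.0, t_p=100.0)
print(f"strong scaling: {strong:.0%}, weak scaling: {weak:.0%}")
```

An efficiency of about 90% on 1024 processors in the strong-scaling sense corresponds to a speedup of roughly 920 over the single-processor reference.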
