Biblioteca Digital

143 resultados para Parallel Architectures

Representing a cubic graph as the intersection graph of axis-parallel boxes in three dimensions

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We show that every graph of maximum degree 3 can be represented as the intersection graph of axis parallel boxes in three dimensions, that is, every vertex can be mapped to an axis parallel box such that two boxes intersect if and only if their corresponding vertices are adjacent. In fact, we construct a representation in which any two intersecting boxes just touch at their boundaries. Further, this construction can be realized in linear time.

DMT of parallel-path and layered networks under the half-duplex constraint

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper, we study the diversity-multiplexing-gain tradeoff (DMT) of wireless relay networks under the half-duplex constraint. It is often unclear what penalty if any, is imposed by the half-duplex constraint on the DMT of such networks. We study two classes of networks; the first class, called KPP(I) networks, is the class of networks with the relays organized in K parallel paths between the source and the destination. While we assume that there is no direct source-destination path, the K relaying paths can interfere with each other. The second class, termed as layered networks, is comprised of relays organized in layers, where links exist only between adjacent layers. We present a communication scheme based on static schedules and amplify-and-forward relaying for these networks. We also show that for KPP(I) networks with K >= 3, the proposed schemes can achieve full-duplex DMT performance, thus demonstrating that there is no performance hit on the DMT due to the half-duplex constraint. We also show that, for layered networks, a linear DMT of d(max)(1 - r)(+) between the maximum diversity d(max) and the maximum MG, r(max) = 1 is achievable. We adapt existing DMT optimal coding schemes to these networks, thus specifying the end-to-end communication strategy explicitly.

A Hybrid Parallel Algorithm for Computing and Tracking Level Set Topology

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The contour tree is a topological abstraction of a scalar field that captures evolution in level set connectivity. It is an effective representation for visual exploration and analysis of scientific data. We describe a work-efficient, output sensitive, and scalable parallel algorithm for computing the contour tree of a scalar field defined on a domain that is represented using either an unstructured mesh or a structured grid. A hybrid implementation of the algorithm using the GPU and multi-core CPU can compute the contour tree of an input containing 16 million vertices in less than ten seconds with a speedup factor of upto 13. Experiments based on an implementation in a multi-core CPU environment show near-linear speedup for large data sets.

Inference for the Component and System Lifetime Distribution of a k-unit Parallel System Based on System Data

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper, we consider the inference for the component and system lifetime distribution of a k-unit parallel system with independent components based on system data. The components are assumed to have identical Weibull distribution. We obtain the maximum likelihood estimates of the unknown parameters based on system data. The Fisher information matrix has been derived. We propose -expectation tolerance interval and -content -level tolerance interval for the life distribution of the system. Performance of the estimators and tolerance intervals is investigated via simulation study. A simulated dataset is analyzed for illustration.

Carbon nanotube-ZnO nanowire hybrid architectures as multifunctional devices

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We report on multifunctional devices based on CNT arrays-ZnO nanowires hybrid architectures. The hybrid structure exhibit excellent high current Schottky like behavior with ZnO as p-type and an ideality factor close to the ideal value. Further the CNT-ZnO hybrid structures can be used as high current p-type field effect transistors that can deliver currents of the order of milliamperes and also can be used as ultraviolet detectors with controllable current on-off ratio and response time. The p-type nature of ZnO and possible mechanism for the rectifying characteristics of CNT-ZnO has been presented.

TCP: thread contention predictor for parallel programs

Relevância:

20.00% 20.00%

Publicador:

Resumo:

With proliferation of chip multicores (CMPs) on desktops and embedded platforms, multi-threaded programs have become ubiquitous. Existence of multiple threads may cause resource contention, such as, in on-chip shared cache and interconnects, depending upon how they access resources. Hence, we propose a tool - Thread Contention Predictor (TCP) to help quantify the number of threads sharing data and their sharing pattern. We demonstrate its use to predict a more profitable shared, last level on-chip cache (LLC) access policy on CMPs. Our cache configuration predictor is 2.2 times faster compared to the cycle-accurate simulations. We also demonstrate its use for identifying hot data structures in a program which may cause performance degradation due to false data sharing. We fix layout of such data structures and show up-to 10% and 18% improvement in execution time and energy-delay product (EDP), respectively.

Analysis of the degrees-of-freedom of spatial parallel manipulators in regular and singular configurations

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper presents a study of the nature of the degrees-of-freedom of spatial manipulators based on the concept of partition of degrees-of-freedom. In particular, the partitioning of degrees-of-freedom is studied in five lower-mobility spatial parallel manipulators possessing different combinations of degrees-of-freedom. An extension of the existing theory is introduced so as to analyse the nature of the gained degree(s)-of-freedom at a gain-type singularity. The gain of one- and two-degrees-of-freedom is analysed in several well-studied, as well as newly developed manipulators. The formulations also present a basis for the analysis of the velocity kinematics of manipulators of any architecture. (C) 2013 Elsevier Ltd. All rights reserved.

CUDA-for-clusters: a system for efficient execution of CUDA kernels on multi-core clusters

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Rapid advancements in multi-core processor architectures coupled with low-cost, low-latency, high-bandwidth interconnects have made clusters of multi-core machines a common computing resource. Unfortunately, writing good parallel programs that efficiently utilize all the resources in such a cluster is still a major challenge. Various programming languages have been proposed as a solution to this problem, but are yet to be adopted widely to run performance-critical code mainly due to the relatively immature software framework and the effort involved in re-writing existing code in the new language. In this paper, we motivate and describe our initial study in exploring CUDA as a programming language for a cluster of multi-cores. We develop CUDA-For-Clusters (CFC), a framework that transparently orchestrates execution of CUDA kernels on a cluster of multi-core machines. The well-structured nature of a CUDA kernel, the growing popularity, support and stability of the CUDA software stack collectively make CUDA a good candidate to be considered as a programming language for a cluster. CFC uses a mixture of source-to-source compiler transformations, a work distribution runtime and a light-weight software distributed shared memory to manage parallel executions. Initial results on running several standard CUDA benchmark programs achieve impressive speedups of up to 7.5X on a cluster with 8 nodes, thereby opening up an interesting direction of research for further investigation.

Tripodal Bile Acid Architectures Based on a Triarylphosphine Oxide Core Obtained by Copper-Catalysed 1,3]-Dipolar Cycloaddition: Synthesis and Preliminary Aggregation Studies

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We report the synthesis and aggregation behaviour of new water-soluble, bile acid derived tripodal architectures based on a core derived from triphenylphosphine oxide. We employed the well-established copper-catalysed 1,3]-dipolar cycloaddition (CuAAC) for the construction of these tripodal molecules. The aggregation behaviour of these molecules in aqueous media was studied by different analytical methods such as dye solubilisation, dynamic light scattering, NMR and AFM. These molecular architectures also offer an additional advantage in aiding understanding of the influence of the nature of the bile acid backbone and of the configuration at the steroid C-3 position in these architectures; to the best of our knowledge this has not been reported in the literature. The unique gelation properties of the -derivatives were explained through molecular modelling studies and the mechanical behaviour of these gels was studied by rheology experiments.

Simulation of inhomogeneous distributions of ultracold atoms in an optical lattice via a massively parallel implementation of nonequilibrium strong-coupling perturbation theory

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We present a nonequilibrium strong-coupling approach to inhomogeneous systems of ultracold atoms in optical lattices. We demonstrate its application to the Mott-insulating phase of a two-dimensional Fermi-Hubbard model in the presence of a trap potential. Since the theory is formulated self-consistently, the numerical implementation relies on a massively parallel evaluation of the self-energy and the Green's function at each lattice site, employing thousands of CPUs. While the computation of the self-energy is straightforward to parallelize, the evaluation of the Green's function requires the inversion of a large sparse 10(d) x 10(d) matrix, with d > 6. As a crucial ingredient, our solution heavily relies on the smallness of the hopping as compared to the interaction strength and yields a widely scalable realization of a rapidly converging iterative algorithm which evaluates all elements of the Green's function. Results are validated by comparing with the homogeneous case via the local-density approximation. These calculations also show that the local-density approximation is valid in nonequilibrium setups without mass transport.

PocketMatch (version 2.0): A parallel algorithm for the detection of structural similarities between protein ligand binding-sites

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Knowledge of protein-ligand interactions is essential to understand several biological processes and important for applications ranging from understanding protein function to drug discovery and protein engineering. Here, we describe an algorithm for the comparison of three-dimensional ligand-binding sites in protein structures. A previously described algorithm, PocketMatch (version 1.0) is optimised, expanded, and MPI-enabled for parallel execution. PocketMatch (version 2.0) rapidly quantifies binding-site similarity based on structural descriptors such as residue nature and interatomic distances. Atomic-scale alignments may also be obtained from amino acid residue pairings generated. It allows an end-user to compute database-wide, all-to-all comparisons in a matter of hours. The use of our algorithm on a sample dataset, performance-analysis, and annotated source code is also included.

REPRESENTING A CUBIC GRAPH AS THE INTERSECTION GRAPH OF AXIS-PARALLEL BOXES IN THREE DIMENSIONS

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We show that every graph of maximum degree 3 can be represented as the intersection graph of axis parallel boxes in three dimensions, that is, every vertex can be mapped to an axis parallel box such that two boxes intersect if and only if their corresponding vertices are adjacent. In fact, we construct a representation in which any two intersecting boxes touch just at their boundaries.

An open source massively parallel solver for Richards equation: Mechanistic modelling of water fluxes at the watershed scale

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper we present a massively parallel open source solver for Richards equation, named the RichardsFOAM solver. This solver has been developed in the framework of the open source generalist computational fluid dynamics tool box OpenFOAM (R) and is capable to deal with large scale problems in both space and time. The source code for RichardsFOAM may be downloaded from the CPC program library website. It exhibits good parallel performances (up to similar to 90% parallel efficiency with 1024 processors both in strong and weak scaling), and the conditions required for obtaining such performances are analysed and discussed. These performances enable the mechanistic modelling of water fluxes at the scale of experimental watersheds (up to few square kilometres of surface area), and on time scales of decades to a century. Such a solver can be useful in various applications, such as environmental engineering for long term transport of pollutants in soils, water engineering for assessing the impact of land settlement on water resources, or in the study of weathering processes on the watersheds. (C) 2014 Elsevier B.V. All rights reserved.

Ligand 5,10,15,20-Tetra(N-methyl-4-pyridyl)porphine (TMPyP4) Prefers the Parallel Propeller-Type Human Telomeric G-Quadruplex DNA over Its Other Polymorphs

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The binding of ligand 5,10,15,20-tetra(N-methyl-4-pyridyl)porphine (TMPyP4) with telomeric and genomic G-quadruplex DNA has been extensively studied. However, a comparative study of interactions of TMPyP4 with different conformations of human telomeric G-quadruplex DNA, namely, parallel propeller-type (PP), antiparallel basket-type (AB), and mixed hybrid-type (MH) G-quadruplex DNA, has not been done. We considered all the possible binding sites in each of the G-quadruplex DNA structures and docked TMPyP4 to each one of them. The resultant most potent sites for binding were analyzed from the mean binding free energy of the complexes. Molecular dynamics simulations were then carried out, and analysis of the binding free energy of the TMPyP4-G-quadruplex complex showed that the binding of TMPyP4 with parallel propeller-type G-quadruplex DNA is preferred over the other two G-quadruplex DNA conformations. The results obtained from the change in solvent excluded surface area (SESA) and solvent accessible surface area (SASA) also support the more pronounced binding of the ligand with the parallel propeller-type G-quadruplex DNA.

Common-Mode Injection PWM for Parallel Converters

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The ac-side terminal voltages of parallel-connected converters are different if the line reactive drops of the individual converters are different. This could result either from differences in per-phase inductances or from differences in the line currents of the converters. In such cases, the modulating signals are different for the converters. Hence, the common-mode (CM) voltages for the converters, injected by conventional space vector pulsewidth modulation (CSVPWM) to increase dc-bus utilization, are different. Consequently, significant low-frequency zero-sequence circulating currents result. This paper proposes a new modulation method for parallel-connected converters with unequal terminal voltages. This method does not cause low-frequency zero-sequence circulating currents and is comparable with CSVPWM in terms of dc-bus utilization and device power loss. Experimental results are presented at a power level of 150 kVA from a circulating-power test setup, where the differences in converter terminal voltages are quite significant.

«
1
2
3
4
5
6
7
8
9
10
»