990 results for Efficient elliptic curve arithmetic
Abstract:
An efficient parallelization algorithm for the Fast Multipole Method is presented, which aims to alleviate the parallelization bottleneck arising from the lower job count near the root levels. An electrostatic problem of 12 million non-uniformly distributed mesh elements is solved with 80-85% parallel efficiency in matrix setup and matrix-vector product, using 60 GB and 16 threads on a shared-memory architecture.
Abstract:
Bentonite clays have proven attractive as buffer and backfill material in high-level nuclear waste repositories around the world. A quick estimation of the swelling pressures of compacted bentonites for different clay-water-electrolyte interactions is essential in the design of buffer and backfill materials. Theoretical studies of the swelling behavior of bentonites are based on diffuse double layer (DDL) theory. Establishing a theoretical relationship between void ratio and swelling pressure (e versus P) ordinarily requires evaluation of an elliptic integral and an inverse analysis. In this paper, a novel procedure is presented to establish the theoretical relationship of e versus P based on the Gouy-Chapman method. The proposed procedure establishes a unique relationship between the electric potentials of interacting and non-interacting diffuse clay-water-electrolyte systems. A procedure is then proposed to deduce the relation between swelling pressure and void ratio from the established relation between electric potentials. This approach is simple and removes the need for both the elliptic integral evaluation and the inverse analysis. Further, application of the proposed approach to estimate the swelling pressures of four compacted bentonites, namely MX 80, Febex, Montigel and Kunigel V1, at different dry densities shows that the method is very simple and predicts solutions with very good accuracy. Moreover, the proposed procedure provides continuous distributions of e versus P and is thus computationally efficient compared with existing techniques.
Abstract:
A computationally efficient approach that computes the optimal regularization parameter for the Tikhonov-minimization scheme is developed for photoacoustic imaging. This approach is based on the least squares-QR decomposition which is a well-known dimensionality reduction technique for a large system of equations. It is shown that the proposed framework is effective in terms of quantitative and qualitative reconstructions of initial pressure distribution enabled via finding an optimal regularization parameter. The computational efficiency and performance of the proposed method are shown using a test case of numerical blood vessel phantom, where the initial pressure is exactly known for quantitative comparison. (C) 2013 Society of Photo-Optical Instrumentation Engineers (SPIE)
Abstract:
Exploiting the performance potential of GPUs requires managing the data transfers to and from them efficiently, which is an error-prone and tedious task. In this paper, we develop a software coherence mechanism to fully automate all data transfers between the CPU and GPU without any assistance from the programmer. Our mechanism uses compiler analysis to identify potential stale accesses and uses a runtime to initiate transfers as necessary. This allows us to avoid redundant transfers that are exhibited by all other existing automatic memory management proposals. We integrate our automatic memory manager into the X10 compiler and runtime, and find that it not only results in smaller and simpler programs, but also eliminates redundant memory transfers. Tested on eight programs ported from the Rodinia benchmark suite, it achieves (i) a 1.06x speedup over hand-tuned manual memory management, and (ii) a 1.29x speedup over another recently proposed compiler-runtime automatic memory management system. Compared to other existing runtime-only and compiler-only proposals, it also transfers 2.2x to 13.3x less data on average.
Abstract:
Rapid advancements in multi-core processor architectures coupled with low-cost, low-latency, high-bandwidth interconnects have made clusters of multi-core machines a common computing resource. Unfortunately, writing good parallel programs that efficiently utilize all the resources in such a cluster is still a major challenge. Various programming languages have been proposed as a solution to this problem, but are yet to be adopted widely to run performance-critical code mainly due to the relatively immature software framework and the effort involved in re-writing existing code in the new language. In this paper, we motivate and describe our initial study in exploring CUDA as a programming language for a cluster of multi-cores. We develop CUDA-For-Clusters (CFC), a framework that transparently orchestrates execution of CUDA kernels on a cluster of multi-core machines. The well-structured nature of a CUDA kernel, the growing popularity, support and stability of the CUDA software stack collectively make CUDA a good candidate to be considered as a programming language for a cluster. CFC uses a mixture of source-to-source compiler transformations, a work distribution runtime and a light-weight software distributed shared memory to manage parallel executions. Initial results on running several standard CUDA benchmark programs achieve impressive speedups of up to 7.5X on a cluster with 8 nodes, thereby opening up an interesting direction of research for further investigation.
Abstract:
The First Order Reversal Curve (FORC) method has been utilised to understand the magnetization reversal and the extent of the irreversible magnetization of the soft CoFe2O4-hard SrFe12O19 nanocomposite in the nonexchange spring and the exchange spring regime. The single peak switching behaviour in the FORC distribution of the exchange spring composite confirms the coherent reversal of the soft and hard phases. The onset of the nucleation field and the magnetization reversal by domain wall movement are also evident from the FORC measurements. (C) 2013 AIP Publishing LLC.
Abstract:
A computationally efficient Li-ion battery model has been proposed in this paper. The battery model utilizes the features of both analytical and electrical circuit modeling techniques. The model is simple as it does not involve a look-up table technique and fast as it does not include a polynomial function during computation. The internal voltage of the battery is modeled as a linear function of the state-of-charge of the battery. The internal resistance is experimentally determined and the optimal value of resistance is considered for modeling. Experimental and simulated data are compared to validate the accuracy of the model.
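The structure described above (internal voltage linear in state-of-charge, a single fixed internal resistance, no look-up table) can be sketched as follows. This is only an illustration under stated assumptions: the class name, the parameter values, and the Coulomb-counting update of the state-of-charge are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class LinearBatteryModel:
    """Illustrative cell model; all default values are assumed, not measured."""
    capacity_ah: float = 2.0    # assumed rated capacity in ampere-hours
    v_empty: float = 3.0        # assumed internal voltage at SOC = 0
    v_full: float = 4.2         # assumed internal voltage at SOC = 1
    r_internal: float = 0.05    # assumed constant internal resistance in ohms
    soc: float = 1.0            # state-of-charge, between 0 and 1

    def internal_voltage(self) -> float:
        # Internal voltage modeled as a linear function of state-of-charge.
        return self.v_empty + (self.v_full - self.v_empty) * self.soc

    def terminal_voltage(self, current_a: float) -> float:
        # Positive current means discharge, so the resistive drop is subtracted.
        return self.internal_voltage() - current_a * self.r_internal

    def step(self, current_a: float, dt_s: float) -> float:
        # Coulomb counting (an assumption of this sketch): integrate the
        # current over dt_s seconds and clamp the result to [0, 1].
        delta = current_a * dt_s / (self.capacity_ah * 3600.0)
        self.soc = min(1.0, max(0.0, self.soc - delta))
        return self.terminal_voltage(current_a)
```

Each time step costs one multiplication and one addition for the voltage, which is the kind of saving a model without look-up tables or polynomial fits is aiming for.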
Abstract:
This paper presents a computationally efficient model for a dc-dc boost converter, which is valid for continuous and discontinuous conduction modes; the model also incorporates significant non-idealities of the converter. Simulation of the dc-dc boost converter using an average model provides practically all the details that are available from simulation using the switching (instantaneous) model, except for the magnitude of the ripple in currents and voltages. A harmonic model of the converter can be used to evaluate the ripple quantities. This paper proposes a combined (average-cum-harmonic) model of the boost converter. The accuracy of the combined model is validated through extensive simulations and experiments. A quantitative comparison of the computation times of the average, combined and switching models is presented. The combined model is shown to be more computationally efficient than the switching model for simulation of transient and steady-state responses of the converter under various conditions.
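For orientation, the textbook state-space averaged model of an ideal boost converter in continuous conduction can be sketched as below. It is not the paper's combined model: it ignores the non-idealities, the discontinuous mode, and the harmonic (ripple) part, and the function name and all component values are assumptions chosen for illustration.

```python
def simulate_boost_average(v_in=12.0, duty=0.5, inductance=100e-6,
                           capacitance=100e-6, load_ohm=10.0,
                           dt=1e-6, t_end=0.05):
    """Forward-Euler integration of the ideal averaged boost converter.

    Averaged equations (continuous conduction, lossless components):
        L di/dt = v_in - (1 - d) * v_c
        C dv/dt = (1 - d) * i_l - v_c / R
    Returns the inductor current and capacitor voltage at t_end.
    """
    i_l = 0.0   # averaged inductor current in amperes
    v_c = 0.0   # averaged output (capacitor) voltage in volts
    steps = int(round(t_end / dt))
    for _ in range(steps):
        di = (v_in - (1.0 - duty) * v_c) / inductance
        dv = ((1.0 - duty) * i_l - v_c / load_ohm) / capacitance
        i_l += di * dt
        v_c += dv * dt
    return i_l, v_c
```

Because the averaged equations contain no switching events, the step size is limited only by the converter's own dynamics rather than by the switching period, which is where the speed advantage over an instantaneous model comes from. With the assumed values the output settles near v_in / (1 - duty).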
Abstract:
In this article, we derive an a posteriori error estimator for various discontinuous Galerkin (DG) methods that are proposed in (Wang, Han and Cheng, SIAM J. Numer. Anal., 48: 708-733, 2010) for an elliptic obstacle problem. Using a key property of DG methods, we perform the analysis in a general framework. The error estimator we have obtained for DG methods is comparable with the estimator for the conforming Galerkin (CG) finite element method. In the analysis, we construct a non-linear smoothing function mapping DG finite element space to CG finite element space and use it as a key tool. The error estimator consists of a discrete Lagrange multiplier associated with the obstacle constraint. It is shown for non-over-penalized DG methods that the discrete Lagrange multiplier is uniformly stable on non-uniform meshes. Finally, numerical results demonstrating the performance of the error estimator are presented.
Abstract:
In this paper, we propose a novel authentication protocol for MANETs requiring stronger security. The protocol works on a two-tier network architecture with client nodes and authentication server nodes, and supports dynamic membership. We use an external membership granting server (MGS) to provide stronger security with dynamic membership. However, the external MGS in our protocol is semi-online instead of being online, i.e., the MGS cannot initiate a connection with a network node but any network node can communicate with the MGS whenever required. To ensure efficiency, the protocol uses symmetric key cryptography to implement the authentication service. However, to achieve storage scalability, the protocol uses a pseudo random function (PRF) to bind the secret key of a client to its identity using the secret key of its server. In addition, the protocol possesses an efficient server revocation mechanism along with an efficient server re-assignment mechanism, which makes the protocol robust against server node compromise.
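The storage-scalability idea, binding a client's secret key to its identity through a PRF keyed with the server's secret, can be sketched as follows. The abstract does not name the PRF or the message format, so the use of HMAC-SHA256, the function names, and the tag-based check are all assumptions made for illustration.

```python
import hashlib
import hmac


def derive_client_key(server_key: bytes, client_id: str) -> bytes:
    # Assumed PRF: HMAC-SHA256 keyed with the server's secret key.
    # The server can recompute this on demand instead of storing one
    # key per client.
    return hmac.new(server_key, client_id.encode("utf-8"),
                    hashlib.sha256).digest()


def make_tag(client_key: bytes, message: bytes) -> bytes:
    # Hypothetical client-side step: authenticate a message with the
    # symmetric client key.
    return hmac.new(client_key, message, hashlib.sha256).digest()


def verify_tag(server_key: bytes, client_id: str,
               message: bytes, tag: bytes) -> bool:
    # Hypothetical server-side step: re-derive the client key from the
    # claimed identity, then compare tags in constant time.
    client_key = derive_client_key(server_key, client_id)
    return hmac.compare_digest(make_tag(client_key, message), tag)
```

In this sketch the server holds only its own key regardless of how many clients it serves, and a tag produced under one identity does not verify under another.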
Abstract:
The increasing number of available protein structures requires efficient tools for multiple structure comparison. Indeed, multiple structural alignments are essential for the analysis of function, evolution and architecture of protein structures. For this purpose, we propose a new web server called multiple Protein Block Alignment (mulPBA). This server implements a method based on a structural alphabet to describe the backbone conformation of a protein chain in terms of dihedral angles. This 'sequence-like' representation enables the use of powerful sequence alignment methods for primary structure comparison, followed by an iterative refinement of the structural superposition. This approach yields alignments superior to most of the rigid-body alignment methods and highly comparable with the flexible structure comparison approaches. We implement this method in a web server designed to perform multiple structure superimpositions from a set of structures given by the user. Outputs are given as both sequence alignments and superposed 3D structures, visualized directly by static images generated by PyMol or through a Jmol applet allowing dynamic interaction. Multiple global quality measures are given. Relatedness between structures is indicated by a distance dendrogram. Superimposed structures in PDB format can also be downloaded, and the results are quickly obtained. The mulPBA server can be accessed at www.dsimb.inserm.fr/dsimb_tools/mulpba/.
Abstract:
Selective detection of nitro-aromatic compounds (NACs) at nanomolar concentration is achieved for the first time in multiple media, including water, micelles and organogels, as well as on test strips. The mechanism of interaction of NACs with highly fluorescent p-phenylenevinylene-based molecules is described as electron transfer from the electron-rich chromophoric probe to the electron-deficient NACs. The selectivity in sensing is guided by the pKa of the probes as well as of the NACs under consideration. A TNP-induced selective gel-to-sol transition in THF medium is also observed through the reorganization of the molecular self-assembly, and portable test strips are successfully made for rapid on-site detection.
Abstract:
Dendrimers have been established as vectors for gene delivery, so far primarily by utilizing a few prominent dendrimer types. We report herein studies of the DNA complexation efficacies and gene delivery vector properties of a nitrogen-core poly(propyl ether imine) (PETIM) dendrimer, constituted with 22 tertiary amine internal branches and 24 primary amines at the periphery. The interaction of the dendrimer with pEGFP DNA was evaluated through UV-vis and circular dichroism (CD) spectral studies, ethidium bromide fluorescence emission quenching, thermal melting, and gel retardation assays, from which most changes to DNA structure during complexation were found to occur at a dendrimer:DNA weight ratio of ~2:1. The zeta potential measurements further confirmed this stoichiometry at electroneutrality. The structure of a DNA oligomer upon dendrimer complexation was simulated through molecular modeling, and the simulation showed that the dendrimer enfolded the DNA oligomer along both major and minor grooves, without causing DNA deformation, in 1:1 and 2:1 dendrimer-to-DNA complexes. Atomic force microscopy (AFM) studies on the dendrimer-pEGFP DNA complex showed an increase in the average z-height as a result of dendrimers decorating the DNA, without causing a distortion of the DNA structure. Cytotoxicity studies involving five different mammalian cell lines, using the 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyl-tetrazolium bromide (MTT) assay, reveal dendrimer toxicity profile (IC50) values of ~400-1000 µg mL⁻¹, depending on the cell line tested. Quantitative estimation, using a luciferase assay, showed that the gene transfection was at least 100 times higher when compared to branched poly(ethylene imine) polymer having a similar number of cationic sites as the dendrimer. The present study establishes the physicochemical behavior of the new nitrogen-core PETIM dendrimer-DNA complexes, their lower toxicities, and their efficient gene delivery vector properties.
Abstract:
In this paper, we propose a quantum method for generation of random numbers based on bosonic stimulation. Randomness arises through the path-dependent indeterministic amplification of two competing bosonic modes. We show that the process provides an efficient method for macroscopic extraction of microscopic randomness.