984 results for ACCELERATING FRONTS


Relevance:

20.00% 20.00%

Publisher:

Abstract:

FastFlow is a programming framework specifically targeting cache-coherent shared-memory multi-cores. It is implemented as a stack of C++ template libraries built on top of lock-free (and memory fence free) synchronization mechanisms. Its philosophy is to combine programmability with performance. In this paper a new FastFlow programming methodology aimed at supporting parallelization of existing sequential code via offloading onto a dynamically created software accelerator is presented. The new methodology has been validated using a set of simple micro-benchmarks and some real applications. © 2011 Springer-Verlag.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The R-matrix method, when applied to the study of intermediate energy electron scattering by the hydrogen atom, gives rise to a large number of two-electron integrals between numerical basis functions. Each integral is evaluated independently of the others, thereby rendering this a prime candidate for a parallel implementation. In this paper, we present a parallel implementation of this routine which uses a Graphical Processing Unit as a co-processor, giving a speedup of approximately 20 times when compared with a sequential version. We briefly consider properties of this calculation which make a GPU implementation appropriate, with a view to identifying other calculations which might similarly benefit.
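Because each integral is independent, the work can be expressed as a plain map over integral indices. The following is a minimal Python sketch of that structure only, not the authors' GPU routine: the integrand, the midpoint-rule quadrature, and the names `integrand`, `integrate_2d`, and `integrate_all` are hypothetical stand-ins. A thread pool is used to show the shape of the decomposition; in CPython it would not deliver the speedup reported above.

```python
from concurrent.futures import ThreadPoolExecutor
import math


def integrand(k, x, y):
    # Hypothetical stand-in for an integrand between two basis functions.
    return math.exp(-(k + 1) * (x + y)) * x * y


def integrate_2d(k, n=200, upper=10.0):
    """Midpoint-rule estimate of the k-th integral over [0, upper]^2."""
    h = upper / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            y = (j + 0.5) * h
            total += integrand(k, x, y)
    return total * h * h


def integrate_all(indices, workers=4):
    # No integral reads another's result, so a map over a worker pool
    # needs no synchronization beyond collecting the outputs in order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(integrate_2d, indices))


results = integrate_all(range(4))
```

On a GPU the same decomposition would typically assign one integral (or one tile of quadrature points) to each thread or thread block.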

Relevance:

20.00% 20.00%

Publisher:

Abstract:

How can we correlate the neural activity in the human brain as it responds to typed words, with properties of these terms (like ‘edible’, ‘fits in hand’)? In short, we want to find latent variables, that jointly explain both the brain activity, as well as the behavioral responses. This is one of many settings of the Coupled Matrix-Tensor Factorization (CMTF) problem.

Can we accelerate any CMTF solver, so that it runs within a few minutes instead of tens of hours to a day, while maintaining good accuracy? We introduce Turbo-SMT, a meta-method capable of doing exactly that: it boosts the performance of any CMTF algorithm, by up to 200x, along with an up to 65 fold increase in sparsity, with comparable accuracy to the baseline.

We apply Turbo-SMT to BrainQ, a dataset consisting of a (nouns, brain voxels, human subjects) tensor and a (nouns, properties) matrix, with coupling along the nouns dimension. Turbo-SMT is able to find meaningful latent variables, as well as to predict brain activity with competitive accuracy.




Relevance:

20.00% 20.00%

Publisher:

Abstract:

Fully Homomorphic Encryption (FHE) is a recently developed cryptographic technique which allows computations on encrypted data. There are many interesting applications for this encryption method, especially within cloud computing. However, the computational complexity is such that it is not yet practical for real-time applications. This work proposes optimised hardware architectures of the encryption step of an integer-based FHE scheme with the aim of improving its practicality. A low-area design and a high-speed parallel design are proposed and implemented on a Xilinx Virtex-7 FPGA, targeting the available DSP slices, which offer high-speed multiplication and accumulation. Both use the Comba multiplication scheduling method to manage the large multiplications required with uneven sized multiplicands and to minimise the number of read and write operations to RAM. Results show that speed up factors of 3.6 and 10.4 can be achieved for the encryption step with medium-sized security parameters for the low-area and parallel designs respectively, compared to the benchmark software implementation on an Intel Core2 Duo E8400 platform running at 3 GHz.
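Comba (product-scanning) multiplication computes the result one output column at a time, so each result limb is written exactly once. The sketch below illustrates that scheduling in Python for operands of unequal length. It is not the hardware design described above: the 16-bit limb size and the helper names `to_limbs`, `from_limbs`, and `comba_multiply` are assumptions made for illustration.

```python
BASE_BITS = 16                      # assumed limb width for this sketch
MASK = (1 << BASE_BITS) - 1


def to_limbs(n):
    """Split a non-negative integer into little-endian limbs."""
    limbs = []
    while n:
        limbs.append(n & MASK)
        n >>= BASE_BITS
    return limbs or [0]


def from_limbs(limbs):
    """Recombine little-endian limbs into an integer."""
    n = 0
    for i, limb in enumerate(limbs):
        n |= limb << (BASE_BITS * i)
    return n


def comba_multiply(a, b):
    """Product scanning: accumulate every partial product of a column,
    store the low limb once, and carry the rest into the next column."""
    result = [0] * (len(a) + len(b))
    acc = 0
    for col in range(len(a) + len(b) - 1):
        # Index bounds handle multiplicands of different lengths.
        lo = max(0, col - len(b) + 1)
        hi = min(col, len(a) - 1)
        for i in range(lo, hi + 1):
            acc += a[i] * b[col - i]
        result[col] = acc & MASK    # single write per result limb
        acc >>= BASE_BITS
    result[-1] = acc                # final carry fits in one limb
    return result
```

For example, `from_limbs(comba_multiply(to_limbs(x), to_limbs(y)))` should equal `x * y` for any non-negative integers `x` and `y`.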

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Large integer multiplication is a major performance bottleneck in fully homomorphic encryption (FHE) schemes over the integers. In this paper two optimised multiplier architectures for large integer multiplication are proposed. The first of these is a low-latency hardware architecture of an integer-FFT multiplier. Secondly, the use of low Hamming weight (LHW) parameters is applied to create a novel hardware architecture for large integer multiplication in integer-based FHE schemes. The proposed architectures are implemented, verified and compared on the Xilinx Virtex-7 FPGA platform. Finally, the proposed implementations are employed to evaluate the large multiplication in the encryption step of FHE over the integers. The analysis shows a speed improvement factor of up to 26.2 for the low-latency design compared to the corresponding original integer-based FHE software implementation. When the proposed LHW architecture is combined with the low-latency integer-FFT accelerator to evaluate a single FHE encryption operation, the performance results show that a speed improvement by a factor of approximately 130 is possible.
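The general idea behind low Hamming weight operands is that multiplying by a number with only a few set bits reduces to a few shifted additions. The following Python sketch shows only that arithmetic identity, not the proposed hardware architecture. The function name `lhw_multiply` and the representation of the LHW operand as a list of distinct bit positions are assumptions.

```python
def lhw_multiply(x, set_bits):
    """Multiply x by the integer whose binary form has 1s exactly at the
    (distinct) positions in set_bits, using one shift-and-add per set bit."""
    total = 0
    for pos in set_bits:
        total += x << pos
    return total


# Illustrative operand: a 1001-bit multiplier with Hamming weight 3.
bits = [0, 5, 1000]
multiplier = sum(1 << p for p in bits)
product = lhw_multiply(123456789, bits)
```

The number of additions depends on the Hamming weight rather than on the bit length of the multiplier, which is why such parameters are attractive when one multiplicand is very large.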

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Most traditional data mining algorithms struggle to cope efficiently with the sheer scale of data. In this paper, we propose a general framework to accelerate existing clustering algorithms so that they can cluster large-scale datasets which contain large numbers of attributes, items, and clusters. Our framework makes use of locality sensitive hashing (LSH) to significantly reduce the cluster search space. We also theoretically prove that our framework has a guaranteed error bound in terms of the clustering quality. This framework can be applied to a set of centroid-based clustering algorithms that assign an object to the most similar cluster, and we adopt the popular K-Modes categorical clustering algorithm to present how the framework can be applied. We validated our framework with five synthetic datasets and a real-world Yahoo! Answers dataset. The experimental results demonstrate that our framework is able to speed up the existing clustering algorithm by a factor of between 2 and 6, while maintaining comparable cluster purity.
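To illustrate how hashing can shrink the cluster search space in a K-Modes style assignment step, here is a minimal Python sketch. It uses a simple attribute-banding hash, which is an assumption chosen for brevity and may differ from the hash family used in the paper. The names `band_keys`, `build_index`, `mismatches`, and `assign` are hypothetical, and the sketch carries no error-bound guarantee.

```python
from collections import defaultdict


def band_keys(item, band_size):
    # Each band is a contiguous slice of attributes; two records that agree
    # on every attribute of some band collide in the index.
    return [(start, tuple(item[start:start + band_size]))
            for start in range(0, len(item), band_size)]


def build_index(modes, band_size):
    """Map every band key to the set of cluster ids whose mode has that band."""
    index = defaultdict(set)
    for cid, mode in enumerate(modes):
        for key in band_keys(mode, band_size):
            index[key].add(cid)
    return index


def mismatches(a, b):
    """K-Modes dissimilarity: number of attributes on which a and b differ."""
    return sum(x != y for x, y in zip(a, b))


def assign(item, modes, index, band_size):
    # Shortlist only the clusters that share at least one band with the item.
    candidates = set()
    for key in band_keys(item, band_size):
        candidates |= index.get(key, set())
    if not candidates:              # no collision: fall back to a full scan
        candidates = range(len(modes))
    return min(candidates, key=lambda cid: (mismatches(item, modes[cid]), cid))


modes = [("a", "b", "c", "d"), ("x", "y", "z", "w"), ("a", "q", "z", "w")]
index = build_index(modes, band_size=2)
```

With many clusters, most items compare against only the shortlisted modes instead of all of them, which is where the saving comes from.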

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Use of bridge deck overlays is important in maximizing bridge service life. Overlays can replace the deteriorated part of the deck, thus extending the bridge life. Even though overlay construction avoids the construction of a whole new bridge deck, it still takes significant time before the bridge can be re-opened to traffic. Current processes and practices are time-consuming, and multiple opportunities may exist to reduce overall construction time by modifying construction requirements and/or the materials utilized. Reducing the construction time could reduce the socioeconomic costs associated with bridge deck rehabilitation and the inconvenience caused to travelers. This work included three major tasks: literature review, field investigation, and laboratory testing. The overlay concrete mix used in present construction requires long curing times, so an investigation was carried out to find fast-curing concrete mixes that could reduce construction time. Several fast-curing concrete mixes were found and suggested for further evaluation. An on-going overlay construction project was observed and documented. Through these observations, several opportunities were suggested where small modifications in the process could lead to significant time savings. Under current Iowa standards for the removal depth of substrate concrete, the removal process takes many hours. Four different laboratory tests were performed with different loading conditions to determine the necessary substrate concrete removal depth for a proper bond between the substrate concrete and the new overlay concrete. Several parameters, such as failure load, bond stress, and stiffness, were compared for four different concrete removal depths. From the results and observations of this investigation, several conclusions were drawn that could reduce bridge deck overlay construction time.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

PowerPoint presentation that showcases: • Research Objectives • Strategic Value of the Lean Enterprise • Multi-Stakeholder Value Optimization • Lean Enterprise Self-Assessment Tool (LESAT) • Leading and Lagging Indicators of Lean Enterprise Transformation • Empirical Results in the Aerospace Industry • Accelerating the Lean Transformation - Linking LESAT to Strategic Objectives • Summary and Questions

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We generalize a previous model of time-delayed reaction–diffusion fronts (Fort and Méndez 1999 Phys. Rev. Lett. 82 867) to allow for a bias in the microscopic random walk of particles or individuals. We also present a second model which takes the time order of events (diffusion and reproduction) into account. As an example, we apply them to the human invasion front across the USA in the 19th century. The corrections relative to the previous model are substantial. Our results are relevant to physical and biological systems with anisotropic fronts, including particle diffusion in disordered lattices, population invasions, the spread of epidemics, etc.
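For orientation, the isotropic baseline that this work generalizes can be sketched numerically. The snippet below compares the classical Fisher front speed, v = 2 sqrt(a D), with the time-delayed speed usually quoted for the cited Fort and Méndez model, v = 2 sqrt(a D) / (1 + a T / 2), where a is the growth rate, D the diffusion coefficient, and T the delay time. The biased-walk and time-ordered corrections introduced in the abstract are not reproduced here, and the parameter values are illustrative assumptions only.

```python
import math


def fisher_speed(a, D):
    """Classical Fisher front speed for growth rate a and diffusivity D."""
    return 2.0 * math.sqrt(a * D)


def delayed_speed(a, D, T):
    """Front speed with delay time T, in the form usually quoted for the
    isotropic time-delayed model; it reduces to fisher_speed when T = 0."""
    return fisher_speed(a, D) / (1.0 + a * T / 2.0)


# Illustrative values (assumed): a in 1/yr, D in km^2/yr, T in yr.
a, D, T = 0.032, 15.44, 25.0
v_fisher = fisher_speed(a, D)
v_delayed = delayed_speed(a, D, T)
```

With these inputs the delay lowers the predicted speed by the factor 1 + a T / 2 = 1.4, which shows how strongly a finite delay time can change the front speed.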

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We extend a previous model of the Neolithic transition in Europe [J. Fort and V. Méndez, Phys. Rev. Lett. 82, 867 (1999)] by taking two effects into account: (i) we do not use the diffusion approximation (which corresponds to second-order Taylor expansions), and (ii) we take proper care of the fact that parents do not migrate away from their children (we refer to this as a time-order effect, in the sense that it implies that children grow up with their parents before they become adults and can survive and migrate). We also derive a time-ordered, second-order equation, which we call the sequential reaction-diffusion equation, and use it to show that effect (ii) is the most important one, and that both of them should in general be taken into account to derive accurate results. As an example, we consider the Neolithic transition: the model predictions agree with the observed front speed, and the corrections relative to previous models are important (up to 70%).

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Realistic rendering of animations is known to be an expensive processing task when physically-based global illumination methods are used to improve illumination details. This paper presents an acceleration technique to compute animations in radiosity environments. The technique is based on an interpolated approach that exploits temporal coherence in radiosity. A fast global Monte Carlo pre-processing step is introduced into the computation of the animated sequence to select important frames. These are fully computed and used as a base for the interpolation of the whole sequence. The approach is completely view-independent. Once the illumination is computed, it can be visualized by any animated camera. Results show significant speed-ups, indicating that the technique could be an interesting alternative to deterministic methods for computing non-interactive radiosity animations for moderately complex scenarios.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Bimodal dispersal probability distributions with characteristic distances differing by several orders of magnitude have been derived and favorably compared to observations by Nathan [Nature (London) 418, 409 (2002)]. For such bimodal kernels, we show that two-dimensional molecular dynamics computer simulations are unable to yield accurate front speeds. Analytically, the usual continuous-space random walks (CSRWs) are applied to two dimensions. We also introduce discrete-space random walks and use them to check the CSRW results (because of the inefficiency of the numerical simulations). The physical results reported are shown to predict front speeds high enough to possibly explain Reid's paradox of rapid tree migration. We also show that, for a time-ordered evolution equation, fronts are always slower in two dimensions than in one dimension and that this difference is important both for unimodal and for bimodal kernels.