396 results for Supercomputer


Relevance:

10.00% 10.00%

Publisher:

Abstract:

A finite-element scheme based on a coupled arbitrary Lagrangian-Eulerian and Lagrangian approach is developed for the computation of interface flows with soluble surfactants. The numerical scheme is designed to solve the time-dependent Navier-Stokes equations and an evolution equation for the surfactant concentration in the bulk phase, and simultaneously, an evolution equation for the surfactant concentration on the interface. Second-order isoparametric finite elements on moving meshes and second-order isoparametric surface finite elements are used to solve these equations. The interface-resolved moving meshes allow the accurate incorporation of surface forces, Marangoni forces and jumps in the material parameters. The lower-dimensional finite-element meshes for solving the surface evolution equation are part of the interface-resolved moving meshes. The numerical scheme is validated for problems with known analytical solutions. A number of computations to study the influence of the surfactants in 3D-axisymmetric rising bubbles have been performed. The proposed scheme shows excellent conservation of fluid mass and of the total mass of the surfactant. (C) 2012 Elsevier Inc. All rights reserved.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Fragment Finder 2.0 is a web-based interactive computing server which can be used to retrieve structurally similar protein fragments from 25 and 90% nonredundant data sets. The computing server identifies structurally similar fragments using the protein backbone C alpha angles. In addition, the identified fragments can be superimposed using either of the two structural superposition programs, STAMP and PROFIT, provided in the server. The freely available Java plug-in Jmol has been interfaced with the server for the visualization of the query and superposed fragments. The server is the updated version of a previously developed search engine and employs an in-house-developed fast pattern matching algorithm. This server can be accessed freely over the World Wide Web through the URL http://cluster.physics.iisc.ernet.in/ff/.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Today's SoCs are complex designs with multiple embedded processors, memory subsystems, and application-specific peripherals. The memory architecture of embedded SoCs strongly influences the power and performance of the entire system. Further, the memory subsystem constitutes a major part (typically up to 70%) of the silicon area of a current-day SoC. In this article, we address on-chip memory architecture exploration for DSP processors whose memory is organized as multiple banks, where banks can be single- or dual-ported with non-uniform bank sizes. We propose two different methods for physical memory architecture exploration and identify the strengths and applicability of these methods in a systematic way. Both methods address memory architecture exploration for a given target application by considering the application's data access characteristics, and both generate a set of Pareto-optimal design points that are interesting from a power, performance and VLSI area perspective. To the best of our knowledge, this is the first comprehensive work on memory space exploration at the physical memory level that integrates data layout and memory exploration to address the system objectives from both the hardware design and the application software development perspectives. Further, we propose an automatic framework that explores the design space, identifying hundreds of Pareto-optimal design points within a few hours of running on a standard desktop configuration.
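As a rough illustration of what "Pareto-optimal design points" means in the abstract above, the sketch below filters a handful of candidate designs down to the non-dominated ones. The tuple layout (power, cycles, area), the sample values, and the function names are assumptions made for illustration; the paper's own exploration methods are not reproduced here.

```python
from typing import List, Tuple

# Hypothetical design point: (power, cycles, area); lower is better for all three.
Design = Tuple[float, float, float]


def dominates(a: Design, b: Design) -> bool:
    """True if a is no worse than b in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points: List[Design]) -> List[Design]:
    """Keep only the points that no other point dominates (simple O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Made-up candidate memory architectures, purely for demonstration.
candidates: List[Design] = [
    (1.0, 9.0, 4.0),
    (2.0, 7.0, 4.0),
    (2.5, 7.5, 5.0),  # dominated by (2.0, 7.0, 4.0)
    (3.0, 3.0, 6.0),
    (3.0, 3.0, 6.5),  # dominated by (3.0, 3.0, 6.0)
]
front = pareto_front(candidates)
print(front)
```

A quadratic scan is adequate for the hundreds of points mentioned in the abstract; a much larger design space would call for a sort-based filter.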

Relevance:

10.00% 10.00%

Publisher:

Abstract:

In this paper, we use optical-flow-based complex-valued features extracted from video sequences to recognize human actions. The optical flow features between two image planes can be appropriately represented in the complex plane. Therefore, we argue that the motion information used to model human actions should be represented as complex-valued features, and we propose a fast learning fully complex-valued neural classifier to solve the action recognition task. The classifier, termed the "fast learning fully complex-valued neural (FLFCN) classifier", is a single-hidden-layer fully complex-valued neural network. The neurons in the hidden layer employ a fully complex-valued activation function of the hyperbolic secant type. The parameters of the hidden layer are chosen randomly, and the output weights are estimated as the minimum-norm least-squares solution to a set of linear equations. The results indicate the superior performance of the FLFCN classifier in recognizing actions compared to real-valued support vector machines and other existing results in the literature. The complex-valued representation of 2D motion and orthogonal decision boundaries boost the classification performance of the FLFCN classifier. (c) 2012 Elsevier B.V. All rights reserved.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

We present a heterogeneous finite element method for the solution of a high-dimensional population balance equation, which depends on both the physical and the internal property coordinates. The proposed scheme tackles the two main difficulties in the finite element solution of the population balance equation: (i) spatial discretization with standard finite elements when the dimension of the equation is more than three, and (ii) spurious oscillations in the solution induced by the standard Galerkin approximation due to pure advection in the internal property coordinates. The key idea is to split the high-dimensional population balance equation into two low-dimensional equations and to discretize the low-dimensional equations separately. In the proposed splitting scheme, the shape of the physical domain can be arbitrary, and different discretizations can be applied to the low-dimensional equations. In particular, we discretize the physical and internal spaces with the standard Galerkin and Streamline Upwind Petrov Galerkin (SUPG) finite elements, respectively. The stability and error estimates of the Galerkin/SUPG finite element discretization of the population balance equation are derived. It is shown that slightly more regularity, i.e. boundedness of the mixed partial derivatives of the solution, is necessary for the optimal order of convergence. Numerical results are presented to support the analysis.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The altered spontaneous emission of an emitter near an arbitrary body can be elucidated using an energy balance of the electromagnetic field. From a classical point of view it is trivial to show that the field scattered back from any body should alter the emission of the source. But it is not at all apparent that the total radiative and non-radiative decay in an arbitrary body can add to the vacuum decay rate of the emitter, i.e., that the increase in emission is just as much as the body absorbs and radiates in all directions. This gives us an opportunity to revisit two other elegant classical ideas of the past, the optical theorem and the Wheeler-Feynman absorber theory of radiation. It also provides alternative perspectives on the Purcell effect and generalizes many of its manifestations, both enhancement and inhibition of emission. When the optical density of states of a body or a material is difficult to resolve (in a complex geometry or a highly inhomogeneous volume), such a generalization offers new directions to solutions. (c) 2012 Elsevier Ltd. All rights reserved.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The Morse-Smale complex is a topological structure that captures the behavior of the gradient of a scalar function on a manifold. This paper discusses scalable techniques to compute the Morse-Smale complex of scalar functions defined on large three-dimensional structured grids. Computing the Morse-Smale complex of three-dimensional domains is more challenging than for two-dimensional domains because of the non-trivial structure introduced by the two types of saddle criticalities. We present a parallel shared-memory algorithm to compute the Morse-Smale complex based on Forman's discrete Morse theory. The algorithm achieves scalability via synergistic use of the CPU and the GPU. We first prove that the discrete gradient on the domain can be computed independently for each cell and hence can be implemented on the GPU. Second, we describe a two-step graph traversal algorithm to compute the 1-saddle-2-saddle connections efficiently and in parallel on the CPU. Simultaneously, the extrema-saddle connections are computed using a tree traversal algorithm on the GPU.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Online remote visualization and steering of critical weather applications like cyclone tracking are essential for effective and timely analysis by a geographically distributed climate science community. A steering framework for controlling the high-performance simulations of critical weather events needs to take into account both the steering inputs of the scientists and the criticality needs of the application, including the minimum progress rate of simulations and continuous visualization of significant events. In this work, we have developed INST, an integrated user-driven and automated steering framework for simulations, online remote visualization, and analysis for critical weather applications. INST gives the user control over various application parameters, including the region of interest, the resolution of the simulation, and the frequency of data for visualization. Unlike existing efforts, our framework considers both the steering inputs and the criticality of the application, namely, the minimum progress rate needed for the application, and various resource constraints, including storage space and network bandwidth, to decide the best possible parameter values for simulations and visualization.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Over the past two decades, many ingenious efforts have been made in protein remote homology detection. Because homologous proteins often diversify extensively in sequence, it is challenging to demonstrate such relatedness through entirely sequence-driven searches. Here, we describe a computational method for the generation of 'protein-like' sequences that serves to bridge gaps in protein sequence space. Sequence profile information, as embodied in a position-specific scoring matrix of multiply aligned sequences of bona fide family members, serves as the starting point in this algorithm. The observed amino acid propensity and the selection of a random number dictate the selection of a residue for each position in the sequence. In a systematic manner, and by applying a 'roulette-wheel' selection approach at each position, we generate parent family-like sequences and thus facilitate an enlargement of sequence space around the family. When generated for a large number of families, we demonstrate that they expand the utility of natural intermediately related sequences in linking distant proteins. In 91% of the assessed examples, inclusion of designed sequences improved fold coverage by 5-10% over searches made in their absence. Furthermore, with several examples from proteins adopting folds such as TIM, globin, lipocalin and others, we demonstrate that the success of including designed sequences in a database positively sensitized methods such as PSI-BLAST and Cascade PSI-BLAST and is a promising opportunity for enormously improved remote homology recognition using sequence information alone.
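The roulette-wheel step described in this abstract can be sketched as follows. This is a minimal illustration, assuming the profile is given as per-position residue propensities (non-negative weights) rather than the log-odds scores a real position-specific scoring matrix usually stores; the toy profile and the function names are hypothetical and not taken from the paper.

```python
import random
from bisect import bisect_right
from itertools import accumulate
from typing import Dict, List


def roulette_pick(propensities: Dict[str, float], rng: random.Random) -> str:
    """Pick one residue with probability proportional to its propensity."""
    residues = list(propensities)
    cumulative = list(accumulate(propensities[r] for r in residues))
    spin = rng.random() * cumulative[-1]  # where the wheel stops, in [0, total)
    index = bisect_right(cumulative, spin)  # first slice whose upper edge exceeds spin
    return residues[min(index, len(residues) - 1)]  # guard against float round-off


def generate_sequence(profile: List[Dict[str, float]], rng: random.Random) -> str:
    """Draw one residue per profile position, independently of the others."""
    return "".join(roulette_pick(column, rng) for column in profile)


# Toy three-position profile (assumed values, for demonstration only).
toy_profile = [
    {"M": 1.0},
    {"A": 0.7, "G": 0.3},
    {"L": 0.5, "I": 0.4, "V": 0.1},
]
sequence = generate_sequence(toy_profile, random.Random(0))
print(sequence)
```

Passing an explicit `random.Random` instance keeps runs reproducible, which helps when many family-like sequences are generated and later compared.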

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Software transactional memory (STM) is a promising programming paradigm for shared-memory multithreaded programs. For STMs to be adopted widely for performance-critical software, understanding and improving the cache performance of applications running on STM becomes increasingly crucial, as the performance gap between processor and memory continues to grow. In this paper, we present the most detailed experimental evaluation to date of the cache behavior of STM applications and quantify the impact of the different STM factors on the cache misses experienced by the applications. We find that STMs are not cache friendly, with data cache stall cycles contributing more than 50% of the execution cycles in a majority of the benchmarks. We find that, on average, misses occurring inside the STM account for 62% of the total data cache miss latency cycles experienced by the applications, and that cache performance is adversely affected by certain inherent characteristics of the STM itself. These observations motivate us to propose a set of specific compiler transformations targeted at making STMs cache friendly. We find that the STM's fine-grained and application-unaware locking is a major contributor to its poor cache behavior. Hence we propose selective Lock Data Co-location (LDC) and Redundant Lock Access Removal (RLAR) to address the lock access misses. We find that even transactions that are completely disjoint-access parallel suffer from costly coherence misses caused by the centralized global time stamp updates, and hence we propose the Selective Per-Partition Time Stamp (SPTS) transformation to address this. We show that our transformations are effective in improving the cache behavior of STM applications, reducing the data cache miss latency by 20.15% to 37.14% and improving execution time by 18.32% to 33.12% in five of the eight STAMP applications.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The spatial search problem on regular lattice structures in integer number of dimensions d >= 2 has been studied extensively, using both coined and coinless quantum walks. The relativistic Dirac operator has been a crucial ingredient in these studies. Here, we investigate the spatial search problem on fractals of noninteger dimensions. Although the Dirac operator cannot be defined on a fractal, we construct the quantum walk on a fractal using the flip-flop operator that incorporates a Klein-Gordon mode. We find that the scaling behavior of the spatial search is determined by the spectral (and not the fractal) dimension. Our numerical results have been obtained on the well-known Sierpinski gaskets in two and three dimensions.
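To make the distinction between the spectral and the fractal dimension concrete, the sketch below evaluates both for a Sierpinski gasket embedded in d dimensions, using the standard closed forms d_f = ln(d+1)/ln 2 and d_s = 2 ln(d+1)/ln(d+3). These formulas are textbook results for the gasket and are not quoted from the abstract; the function names are illustrative.

```python
import math


def fractal_dimension(d: int) -> float:
    """Hausdorff (fractal) dimension of the Sierpinski gasket embedded in d dimensions."""
    return math.log(d + 1) / math.log(2)


def spectral_dimension(d: int) -> float:
    """Spectral dimension of the same gasket; per the abstract, this governs search scaling."""
    return 2 * math.log(d + 1) / math.log(d + 3)


for d in (2, 3):
    print(f"d={d}: fractal={fractal_dimension(d):.4f}, spectral={spectral_dimension(d):.4f}")
```

In both embeddings the spectral dimension comes out below the fractal dimension, so the two quantities predict different scaling exponents for the search.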

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Experiments have shown strong effects of some substrates on the localized plasmons of metallic nanoparticles, but they are inconclusive on the affecting parameters. Here, we have used the discrete dipole approximation in conjunction with Sommerfeld integral relations to explain the effect of the substrates as a function of the parameters of the incident radiation. The radiative coupling can both quench and enhance the resonance, and its dependence on the angle and polarization of the incident radiation with respect to the surface is shown. Non-radiative interaction with the substrate enhances the plasmon resonance of the particles and can shift the resonances significantly from their free-space energies. The non-radiative interaction with the substrate is sensitive to the shape of the particles and to the polarization of the incident radiation with respect to the substrate. Our results show that the plasmon resonances in coupled and single particles can be significantly altered from their free-space resonances and are quenched or enhanced by the choice of substrate and the polarization of the incident radiation. (C) 2012 American Institute of Physics. [http://dx.doi.org/10.1063/1.4736544]

Relevance:

10.00% 10.00%

Publisher:

Abstract:

We investigate the evolution of quantum correlations in ensembles of two-qubit nuclear spin systems via nuclear magnetic resonance techniques. We use discord as a measure of quantum correlations and the Werner state as an explicit example. We first introduce different ways of measuring discord and geometric discord in two-qubit systems and then describe the following experimental studies: (a) We quantitatively measure discord for Werner-like states prepared using an entangling pulse sequence. An initial thermal state with zero discord is gradually and periodically transformed into a mixed state with maximum discord. The experimental and simulated rise and fall of discord agree fairly well. (b) We examine the efficiency of dynamical decoupling sequences in preserving quantum correlations. In our experimental setup, the dynamical decoupling sequences preserved the traceless parts of the density matrices at high fidelity, but they could not maintain the purity of the quantum states and so were unable to keep the discord from decaying. (c) We observe the evolution of discord for a singlet-triplet mixed state during a radio-frequency spin-lock. A simple relaxation model describes the evolution of discord, and the accompanying evolution of the fidelity of the long-lived singlet state, reasonably well.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Diffusion-equation-based modeling of near-infrared light propagation in tissue is achieved by using a finite-element mesh for imaging real tissue types, such as breast and brain. The finite-element mesh size (number of nodes) dictates the parameter space in optical tomographic imaging. The most commonly used finite-element meshing algorithms do not provide the flexibility of distinct nodal spacing in different regions of the imaging domain to take the sensitivity of the problem into consideration. This study presents a computationally efficient mesh simplification method that can be used as a preprocessing step to iterative image reconstruction, where the finite-element mesh is simplified by using an edge-collapsing algorithm to reduce the parameter space in regions where the sensitivity of the problem is relatively low. It is shown, using simulations and experimental phantom data for simple meshes/domains, that a significant reduction in parameter space can be achieved without compromising the reconstructed image quality. The maximum errors observed with the simplified meshes were less than 0.27% in the forward problem and 5% in the inverse problem.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Exascale systems of the future are predicted to have a mean time between failures (MTBF) of less than one hour. Malleable applications, where the number of processors on which the application executes can be changed during execution, can make use of their malleability to better tolerate high failure rates. We present AdFT, an adaptive fault tolerance framework for long-running malleable applications that maximizes application performance in the presence of failures. The AdFT framework includes cost models for evaluating the benefits of various fault tolerance actions, including checkpointing, live migration and rescheduling, and runtime decisions for dynamically selecting the fault tolerance actions at different points of application execution to maximize performance. Simulations with real and synthetic failure traces show that our approach outperforms existing fault tolerance mechanisms for malleable applications, yielding up to 23% improvement in application performance, and is effective even for petascale systems and beyond.