955 results for Quadratic, sieve, CUDA, OpenMP, SoC, Tegra K1
Abstract:
Darken's quadratic formalism is extended to multicomponent solutions. Equations are developed for the representation of the integral and partial excess free energies, entropies, and enthalpies in dilute multicomponent solutions. The quadratic formalism applied to multicomponent solutions is thermodynamically consistent. The formalism is compared with the conventional second-order Maclaurin series (interaction parameter) representation, and the relations between them are derived. Advantages of the quadratic formalism are discussed.
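For concreteness, one consistent way to write such a quadratic representation in interaction-parameter notation (a sketch; the paper's exact symbols and conventions may differ) is:

```latex
% Dilute solution: solvent 1, solutes 2..n, mole fractions x_j,
% symmetric interaction parameters \varepsilon_{jk}:
\ln \gamma_1 = -\tfrac{1}{2}\sum_{j=2}^{n}\sum_{k=2}^{n}
               \varepsilon_{jk}\, x_j x_k ,
\qquad
\ln \gamma_i = \ln \gamma_i^{\circ} + \ln \gamma_1
             + \sum_{j=2}^{n}\varepsilon_{ij}\, x_j \quad (i \ge 2).
```

The quadratic solvent term is what restores the Gibbs-Duhem consistency that a truncated first-order interaction parameter expansion lacks.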
Abstract:
High-speed evaluation of a large number of linear, quadratic, and cubic expressions is very important for the modeling and real-time display of objects in computer graphics. Using VLSI techniques, chips called pixel planes have actually been built by H. Fuchs and his group to evaluate linear expressions. In this paper, we describe a topological variant of Fuchs' pixel planes which can evaluate linear, quadratic, cubic, and higher-order polynomials. In our design, we make use of local interconnections only, i.e., interconnections between neighboring processing cells. This leads to the concept of tiling the processing cells for VLSI implementation.
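As a software analogue of the local-interconnection idea (a minimal sketch, not the authors' VLSI design), a quadratic can be evaluated across a scanline using forward differences, each value derived from its left neighbour by additions alone:

```c
#include <stdio.h>

/* Evaluate q(x) = a*x^2 + b*x + c along a scanline with forward
 * differences: after initialization, each cell needs only two
 * additions and its neighbour's values -- no multiplications. */
int main(void) {
    const double a = 1.0, b = -3.0, c = 2.0;  /* illustrative coefficients */
    const int n = 8;

    double q  = c;        /* q(0) */
    double d1 = a + b;    /* first difference  q(1) - q(0) */
    double d2 = 2.0 * a;  /* second difference, constant for a quadratic */

    for (int x = 0; x < n; x++) {
        printf("q(%d) = %g\n", x, q);
        q  += d1;         /* neighbour-to-neighbour update */
        d1 += d2;
    }
    return 0;
}
```

Higher-degree polynomials extend the same scheme with one extra running difference per degree, which is why a purely local cell array suffices.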
Abstract:
We report large quadratic nonlinearity in a series of 1:1 molecular complexes between methyl-substituted benzene donors and quinone acceptors in solution. The first hyperpolarizability, $\beta_{HRS}$, which is very small for the individual components, becomes large through intermolecular charge transfer (CT) interaction between the donor and the acceptor in the complex. In addition, we have investigated the geometry of these CT complexes in solution using polarization-resolved hyper-Rayleigh scattering (HRS). Using linearly (electric field vector along the X direction) and circularly polarized incident light, respectively, we have measured two macroscopic depolarization ratios, $D = I^{2\omega}_{X,X}/I^{2\omega}_{Z,X}$ and $D' = I^{2\omega}_{X,C}/I^{2\omega}_{Z,C}$, in the laboratory-fixed XYZ frame by detecting the second harmonic scattered light in a polarization-resolved fashion. The experimentally obtained first hyperpolarizability, $\beta_{HRS}$, and the macroscopic depolarization ratios, $D$ and $D'$, are then matched with values deduced theoretically from single and double configuration interaction calculations performed using Zerner's intermediate neglect of differential overlap self-consistent reaction field technique. Since several geometries are possible in solution, we have carried out calculations by rotating the acceptor moiety around three different axes while keeping the donor molecule fixed at an optimized geometry. These rotations give the theoretical $\beta_{HRS}$, $D$, and $D'$ values as a function of the geometry of the complex. The calculated $\beta_{HRS}$, $D$, and $D'$ values that most closely match the experimental values identify the dominant equilibrium geometry in solution. All the CT complexes between methyl benzenes and chloranil or 1,2-dichloro-4,5-dicyano-p-benzoquinone investigated here are found to have a slipped parallel stacking of the donors and the acceptors. Furthermore, the geometries are staggered, and in some pairs a twist angle as high as 30 degrees is observed. Thus, we have demonstrated in this paper that the polarization-resolved HRS technique, along with theoretical calculations, can unravel the geometry of CT complexes in solution. (C) 2011 American Institute of Physics. [doi:10.1063/1.3514922]
Abstract:
In this paper, we have computed the quadratic nonlinear optical (NLO) properties of a class of weak charge transfer (CT) complexes. These weak complexes are formed when methyl-substituted benzenes (donors) are added to strong acceptors like chloranil (CHL) or dichloro-dicyano-benzoquinone (DDQ) in chloroform or dichloromethane. The formation of such complexes is manifested by the presence of a broad absorption maximum in the visible range of the spectrum, where neither the donor nor the acceptor absorbs. The appearance of this visible band is due to CT interactions, which result in strong NLO responses. We have employed the semiempirical intermediate neglect of differential overlap (INDO/S) Hamiltonian to calculate the energy levels of these CT complexes using single and double configuration interaction (SDCI). Solvent effects are taken into account using the self-consistent reaction field (SCRF) scheme. The geometry of the complex is obtained by exploring different relative molecular geometries, rotating the acceptor with respect to the fixed donor about three different axes. The theoretical geometry that best fits the experimental energy gaps, $\beta_{HRS}$, and macroscopic depolarization ratios is taken to be the most probable geometry of the complex. Our studies show that the most probable geometry of these complexes in solution is the parallel displaced structure, with a significant twist in some cases. (C) 2011 American Institute of Physics. [doi:10.1063/1.3526748]
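Computationally, the geometry determination just described is an outer-loop scan over relative orientations with a best-fit selection. Below is a minimal sketch of that loop; compute_properties() is a hypothetical stand-in for the INDO/S-SDCI/SCRF calculation, and all names and numbers are illustrative assumptions, not the paper's code.

```c
#include <math.h>
#include <stdio.h>

#define DEG2RAD (3.14159265358979323846 / 180.0)

/* Hypothetical stand-in for the INDO/S-SDCI/SCRF calculation: given the
 * acceptor's rotation angles (degrees) about three axes, produce the
 * predicted beta_HRS and depolarization ratios D and D'.  This toy body
 * exists only so the sketch runs. */
static void compute_properties(double ax, double ay, double az,
                               double *beta, double *D, double *Dp) {
    *beta = 100.0 + 50.0 * cos(ax * DEG2RAD);
    *D    = 0.40  + 0.20 * sin(ay * DEG2RAD);
    *Dp   = 0.60  + 0.10 * sin(az * DEG2RAD);
}

int main(void) {
    /* Experimental targets (illustrative numbers, not from the paper). */
    const double beta_exp = 120.0, D_exp = 0.50, Dp_exp = 0.65;
    double best[3] = {0}, best_err = INFINITY;

    /* Scan relative orientations; keep the geometry whose predicted
     * (beta_HRS, D, D') best matches experiment. */
    for (double ax = 0; ax < 360; ax += 10)
        for (double ay = 0; ay < 360; ay += 10)
            for (double az = 0; az < 360; az += 10) {
                double beta, D, Dp;
                compute_properties(ax, ay, az, &beta, &D, &Dp);
                double err = fabs(beta - beta_exp) / beta_exp
                           + fabs(D - D_exp) + fabs(Dp - Dp_exp);
                if (err < best_err) {
                    best_err = err;
                    best[0] = ax; best[1] = ay; best[2] = az;
                }
            }
    printf("best-fit rotation: %g %g %g deg (err %g)\n",
           best[0], best[1], best[2], best_err);
    return 0;
}
```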
Abstract:
By using the strain smoothing technique proposed by Chen et al. (Comput. Mech. 2000; 25:137-156) for meshless methods in the context of the finite element method (FEM), Liu et al. (Comput. Mech. 2007; 39(6):859-877) developed the smoothed FEM (SFEM). Although the SFEM is not yet well understood mathematically, numerical experiments point to potentially useful features of this particularly simple modification of the FEM. To date, the SFEM has only been investigated for bilinear and Wachspress approximations and is limited to linear reproducing conditions. The goal of this paper is to extend strain smoothing to higher-order elements and to investigate numerically under which conditions strain smoothing is beneficial to the accuracy and convergence of enriched finite element approximations. We focus on three widely used enrichment schemes, namely: (a) weak discontinuities; (b) strong discontinuities; (c) near-tip linear elastic fracture mechanics functions. The main conclusion is that strain smoothing in enriched approximations is only beneficial when the enrichment functions are polynomial (cases (a) and (b)); non-polynomial enrichment of type (c) leads to methods inferior to the standard enriched FEM (e.g. XFEM). Copyright (C) 2011 John Wiley & Sons, Ltd.
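For reference, the smoothing operation at the heart of the SFEM replaces the compatible strain by its average over a smoothing cell, which the divergence theorem turns into a boundary integral (standard statement, written in my notation; the paper's conventions may differ):

```latex
% Smoothed strain over a smoothing cell \Omega_C of area A_C with
% boundary \Gamma_C and outward normal n:
\tilde{\varepsilon}_{ij}(\mathbf{x}_C)
  = \frac{1}{A_C}\int_{\Omega_C} \varepsilon_{ij}(\mathbf{x})\,\mathrm{d}\Omega
  = \frac{1}{2A_C}\oint_{\Gamma_C} \left( u_i n_j + u_j n_i \right)\mathrm{d}\Gamma .
```

The boundary form is one reason smoothing behaves differently for the two enrichment families: boundary quadrature is exact for polynomial integrands but not for the near-tip functions.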
Abstract:
Let $K$ be any quadratic field with $\mathcal{O}_K$ its ring of integers. We study the solutions of cubic equations, which represent elliptic curves defined over $\mathbb{Q}$, in quadratic fields, and prove some interesting results regarding the solutions using elementary tools. As an application we consider the Diophantine equation $r + s + t = rst = 1$ in $\mathcal{O}_K$. This Diophantine equation gives an elliptic curve defined over $\mathbb{Q}$ with finite Mordell-Weil group. Using our study of the solutions of cubic equations in quadratic fields, we present a simple proof of the fact that, except for the rings of integers of $\mathbb{Q}(i)$ and $\mathbb{Q}(\sqrt{2})$, this Diophantine equation is not solvable in the ring of integers of any other quadratic field, which is already proved in [4].
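The two exceptional rings are easy to certify by hand; worked examples (mine, straightforward to verify):

```latex
\text{In } \mathbb{Z}[i]:\qquad (r,s,t) = (i,\,-i,\,1),
\qquad r+s+t = 1, \qquad rst = i\,(-i)(1) = 1 .
\\[4pt]
\text{In } \mathbb{Z}[\sqrt{2}]:\qquad (r,s,t) = (1+\sqrt{2},\,1-\sqrt{2},\,-1),
\qquad r+s+t = 1, \qquad rst = (1-2)(-1) = 1 .
```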
Abstract:
The paper deals with the existence of a quadratic Lyapunov function $V = x'P(t)x$ for an exponentially stable linear system with varying coefficients described by the vector differential equation $\dot{x} = A(t)x$. The derivative $dV/dt$ is allowed to be merely negative semi-definite, provided the locus $dV/dt = 0$ does not contain any arc of a system trajectory. It is then shown that the coefficient matrix $A(t)$ of the exponentially stable system…
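For reference, the standard computation behind such results (a textbook identity, not taken from this paper): differentiating $V$ along trajectories of $\dot{x} = A(t)x$ gives

```latex
\frac{dV}{dt} = \dot{x}'Px + x'\dot{P}x + x'P\dot{x}
             = x'\!\left( \dot{P}(t) + A'(t)P(t) + P(t)A(t) \right)\! x ,
```

so $V$ qualifies when $P(t)$ is positive definite and bounded and the bracketed matrix is negative semi-definite, with the trajectory condition ruling out solutions that remain in the locus $dV/dt = 0$.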
Abstract:
In this work, we evaluate the performance of a real-world image processing application that uses a cross-correlation algorithm to compare a given image with a reference one. The algorithm processes individual images represented as 2-dimensional matrices of single-precision floating-point values using O(n^4) operations involving dot-products and additions. We implement this algorithm on an nVidia GTX 285 GPU using CUDA, and also parallelize it for the Intel Xeon (Nehalem) and IBM Power7 processors, using both manual and automatic techniques. Pthreads and OpenMP with SSE and VSX vector intrinsics are used for the manually parallelized version, while a state-of-the-art optimization framework based on the polyhedral model is used for automatic compiler parallelization and optimization. The performance of this algorithm on the nVidia GPU suffers from: (1) a smaller shared memory, (2) unaligned device memory access patterns, (3) expensive atomic operations, and (4) weaker single-thread performance. On commodity multi-core processors, the application dataset is small enough to fit in caches, and when parallelized using a combination of task and short-vector data parallelism (via SSE/VSX) or through fully automatic optimization from the compiler, the application matches or beats the performance of the GPU version. The primary reasons for better multi-core performance include larger and faster caches, higher clock frequency, higher on-chip memory bandwidth, and better compiler optimization and support for parallelization. The best performing versions on the Power7, Nehalem, and GTX 285 run in 1.02 s, 1.82 s, and 1.75 s, respectively. These results conclusively demonstrate that, under certain conditions, it is possible for a FLOP-intensive structured application running on a multi-core processor to match or even beat the performance of an equivalent GPU version.
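To make the workload concrete, here is a minimal sketch of the multicore variant: a plain 2-D cross-correlation parallelized with OpenMP. Array sizes, names, and the unvectorized inner loop are illustrative assumptions, not the paper's implementation (which additionally uses SSE/VSX intrinsics).

```c
#include <omp.h>

#define N 512   /* image side (illustrative)     */
#define M 64    /* reference side (illustrative) */

/* 2-D cross-correlation of an N x N image against an M x M reference at
 * every offset -- O(n^4) dot-products and additions, parallelized across
 * output offsets with OpenMP. */
void cross_correlate(const float img[N][N],
                     const float ref[M][M],
                     float out[N - M + 1][N - M + 1]) {
    #pragma omp parallel for collapse(2)
    for (int dy = 0; dy <= N - M; dy++) {
        for (int dx = 0; dx <= N - M; dx++) {
            float acc = 0.0f;
            for (int y = 0; y < M; y++)
                for (int x = 0; x < M; x++)
                    acc += img[dy + y][dx + x] * ref[y][x];
            out[dy][dx] = acc;  /* correlation score at this offset */
        }
    }
}
```

The manually tuned versions in the paper would replace the inner dot-product with SSE or VSX intrinsics; the structure above is just the task-parallel skeleton.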
Abstract:
Verification is one of the important stages in designing an SoC (system-on-chip) and consumes up to 70% of the design time. In this work, we present a methodology to automatically generate verification test-cases for a class of SoCs, and to enable re-use of the verification resources created for one SoC on another. A prototype implementation for generating the test-cases is also presented.
Abstract:
Today's feature-rich multimedia products require embedded system solutions with complex System-on-Chip (SoC) designs to meet market expectations of high performance at low cost and low energy consumption. The memory architecture of the embedded system strongly influences critical system design objectives like area, power, and performance. Hence the embedded system designer performs a complete memory architecture exploration to custom-design a memory architecture for a given set of applications. Further, the designer is interested in multiple optimal design points to address various market segments. However, tight time-to-market constraints enforce short design cycle times. In this paper we address the multi-level, multi-objective memory architecture exploration problem through a combination of exhaustive-search-based memory exploration at the outer level and a two-step integrated data layout for SPRAM-Cache based architectures at the inner level. In this two-step integrated approach for data layout on SPRAM-Cache based hybrid architectures, the first step partitions data between SPRAM and cache, and the second step performs a cache-conscious data layout. We formulate the cache-conscious data layout as a graph partitioning problem and show that our approach gives up to 34% improvement over an existing approach while also optimizing the off-chip memory address space. We evaluated our approach on 3 embedded multimedia applications; it explores several hundred memory configurations for each application, yielding several optimal design points in a few hours of computation on a standard desktop.
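To illustrate the graph-partitioning view of data layout (a generic greedy heuristic under assumed names and sizes, not the algorithm from the paper): data objects are nodes, edge weights count accesses that occur close together in time, and each object is clustered into the cache-sized region it has the highest affinity with.

```c
#define NOBJ  16   /* data objects (illustrative)        */
#define NPART 4    /* cache-sized regions (illustrative) */

/* Greedy sketch of cache-conscious layout as graph partitioning: nodes
 * are data objects, w[i][j] counts how often objects i and j are
 * accessed close together in time.  Each object joins the non-full
 * region holding the objects it has the highest total affinity with. */
void layout(const int w[NOBJ][NOBJ], int region_of[NOBJ]) {
    int load[NPART] = {0};
    const int capacity = NOBJ / NPART;

    for (int i = 0; i < NOBJ; i++) {
        int best = -1, best_gain = -1;
        for (int p = 0; p < NPART; p++) {
            if (load[p] >= capacity) continue;
            int gain = 0;  /* affinity of object i to region p */
            for (int j = 0; j < i; j++)
                if (region_of[j] == p) gain += w[i][j];
            if (gain > best_gain) { best_gain = gain; best = p; }
        }
        region_of[i] = best;
        load[best]++;
    }
}
```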
Abstract:
In this paper we propose the architecture of an SoC fabric onto which applications described in an HLL are synthesized. The fabric is a homogeneous layout of computation, storage, and communication resources on silicon. Through a process of composition of resources (as opposed to decomposition of applications), application-specific computational structures are defined on the fabric at runtime to realize different modules of the applications in hardware. Applications synthesized on this fabric offer performance comparable to ASICs while retaining the programmability of processing cores. We outline the application synthesis methodology through examples, and compare our results with software implementations on traditional platforms with unbounded resources.
Abstract:
Continuous advances in VLSI technology have made the implementation of very complicated systems possible. Modern Systems-on-Chip (SoCs) have many processors, IP cores, and other functional units. As a result, complete verification of whole systems before implementation is becoming infeasible; hence it is likely that these systems may have some errors after manufacturing. This increases the need to find design errors in chips after fabrication. The main challenge for post-silicon debug is the observability of internal signals. Post-silicon debug is the problem of determining what is wrong when the fabricated chip of a new design behaves incorrectly. This problem now consumes over half of the overall verification effort on large designs, and it is growing worse. Traditional post-silicon debug methods concentrate on the functional parts of systems and provide mechanisms to increase the observability of the internal state of systems. Those methods may not be sufficient, as modern SoCs have many blocks (processors, IP cores, etc.) that communicate with one another, and communication is another source of design errors. This tutorial will provide insight into various observability enhancement techniques, on-chip instrumentation techniques, and the use of high-level models to support the debug process, targeting both the insides of blocks and the communication among them. It will also cover the use of formal methods to help the debug process.