955 results for Quadratic, sieve, CUDA, OpenMP, SOC, Tegrak1
Abstract:
We have investigated the quadratic nonlinearity ($\beta_{\mathrm{HRS}}$) and the linear and circular depolarization ratios (D and D', respectively) of a series of 1:1 complexes of tropylium tetrafluoroborate as the cation and methyl-substituted benzenes as π-donors by making polarization-resolved hyper-Rayleigh scattering measurements in solution. The measured D and D' values are much lower than the values expected from a typical sandwich or T-shaped geometry of a complex. In the cation-π complexes studied here, the D value varies from 1.36 to 1.46 and D' from 1.62 to 1.72 depending on the number of methyl substitutions on the benzene ring. To probe this further, $\beta$, D and D' were computed using the Zerner intermediate neglect of differential overlap-correction vector self-consistent reaction field technique including single and double configuration interactions in the absence and presence of the BF$_4^-$ anion. In the absence of the anion, the calculated value of D varies from 4.20 to 4.60 and that of D' from 2.45 to 2.72, which disagree with the experimental values. However, by arranging three cation-π BF$_4^-$ complexes in a trigonal symmetry, the computed values are brought into agreement with experiment. When such an arrangement was not considered, the calculated $\beta$ values were lower than the experimental values by more than a factor of two. This unprecedented influence of the otherwise "unimportant" anion in solution on the $\beta$ value and depolarization ratios of these cation-π complexes is highlighted in this paper. (C) 2012 American Institute of Physics. [http://dx.doi.org/10.1063/1.4716020]
Abstract:
We have developed an efficient fully three-dimensional (3D) reconstruction algorithm for diffuse optical tomography (DOT). The 3D DOT, a severely ill-posed problem, is tackled through a pseudodynamic (PD) approach wherein an ordinary differential equation representing the evolution of the solution in pseudotime is integrated, bypassing an explicit inversion of the associated, ill-conditioned system matrix. One of the most computationally expensive parts of the iterative DOT algorithm, the reevaluation of the Jacobian in each of the iterations, is avoided by using the adjoint-Broyden update formula to provide low-rank updates to the Jacobian. In addition, wherever feasible, we have also made the algorithm efficient by integrating along the quadratic path provided by the perturbation equation containing the Hessian. These algorithms are then validated by reconstructions from simulated and experimental data, with the PD results verified against those from the popular Gauss-Newton scheme. The major findings of this work are as follows: (i) the PD reconstructions are comparatively artifact free, providing superior absorption coefficient maps in terms of quantitative accuracy and contrast recovery; (ii) the scaling of computation time with the dimension of the measurement set is much less steep with the Jacobian update formula in place than without it; and (iii) an increase in the data dimension, even though it renders the reconstruction problem less ill-conditioned and thus provides relatively artifact-free reconstructions, does not necessarily provide better contrast property recovery. For the latter, one should also take care to uniformly distribute the measurement points, avoiding regions close to the source so that the relative strength of the derivatives for measurements away from the source does not become insignificant. (c) 2012 Optical Society of America
Abstract:
We present an extensive study of Mott insulator (MI) and superfluid (SF) shells in Bose-Hubbard (BH) models for bosons in optical lattices with harmonic traps. For this we apply the inhomogeneous mean-field theory developed by Sheshadri et al. [Phys. Rev. Lett. 75, 4075 (1995)]. Our results for the BH model with one type of spinless bosons agree quantitatively with quantum Monte Carlo simulations. Our approach is numerically less intensive than such simulations, so we are able to perform calculations on experimentally realistic, large three-dimensional systems, explore a wide range of parameter values, and make direct contact with a variety of experimental measurements. We also extend our inhomogeneous mean-field theory to study BH models with harmonic traps and (a) two species of bosons or (b) spin-1 bosons. With two species of bosons, we obtain rich phase diagrams with a variety of SF and MI phases and associated shells when we include a quadratic confining potential. For the spin-1 BH model, we show, in a representative case, that the system can display alternating shells of polar SF and MI phases, and we make interesting predictions for experiments in such systems.
Abstract:
Linear quadratic stabilizers are well known for their superior control capabilities when compared to conventional lead-lag power system stabilizers. However, they have seen little practical use because the state variables are generally not measurable; in particular, the generator rotor angle measurement is not available in most power plants. Full state feedback controllers require feedback of other machine states in a multi-machine power system and necessitate block diagonal structure constraints for decentralized implementation. This paper investigates the design of Linear Quadratic Power System Stabilizers using a recently proposed modified Heffron-Phillip's model. This model is derived by taking the secondary bus voltage of the step-up transformer as reference instead of the infinite bus. The state variables of this model can be obtained by local measurements. This model allows a coordinated linear quadratic control design in multi-machine systems. The performance of the proposed controller has been evaluated on two widely used multi-machine power systems: the 4-generator, 10-bus system and the 10-generator, 39-bus system. It has been observed that the performance of the proposed controller is superior to that of conventional Power System Stabilizers (PSS) over a wide range of operating and system conditions.
Abstract:
We develop a quadratic $C^0$ interior penalty method for linear fourth order boundary value problems with essential and natural boundary conditions of the Cahn-Hilliard type. Both a priori and a posteriori error estimates are derived. The performance of the method is illustrated by numerical experiments.
Abstract:
A new approach that can easily incorporate any generic penalty function into diffuse optical tomographic image reconstruction is introduced to show the utility of nonquadratic penalty functions. The penalty functions that were used include quadratic ($\ell_2$), absolute ($\ell_1$), Cauchy, and Geman-McClure. The regularization parameter in each of these cases was obtained automatically by using the generalized cross-validation method. The reconstruction results were systematically compared with each other using quantitative metrics, such as relative error and Pearson correlation. The reconstruction results indicate that, while the quadratic penalty may be able to provide better separation between two closely spaced targets, its contrast recovery capability is limited, and the sparseness-promoting penalties, such as $\ell_1$, Cauchy, and Geman-McClure, have better utility in reconstructing high-contrast and complex-shaped targets, with the Geman-McClure penalty performing best. (C) 2013 Optical Society of America
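For orientation, the four penalty functions named in this abstract can be sketched as below. The abstract does not give their exact forms, so the Cauchy and Geman-McClure expressions and the scale parameter `c` are assumptions based on commonly used textbook definitions, not necessarily the ones used in the paper.

```python
import math

# Commonly used textbook forms; the scale parameter c is an assumption.

def quadratic(r):
    """Quadratic (l2) penalty: grows without bound, penalizes large values heavily."""
    return r * r

def absolute(r):
    """Absolute (l1) penalty: grows linearly, promotes sparseness."""
    return abs(r)

def cauchy(r, c=1.0):
    """Cauchy penalty (assumed form): grows only logarithmically for large |r|."""
    return math.log(1.0 + (r / c) ** 2)

def geman_mcclure(r, c=1.0):
    """Geman-McClure penalty (assumed form): saturates below 1 for large |r|."""
    return r * r / (r * r + c * c)

if __name__ == "__main__":
    for r in (0.1, 1.0, 10.0):
        print(r, quadratic(r), absolute(r), cauchy(r), geman_mcclure(r))
```

The saturating behavior of the last two functions is what makes them tolerant of sharp, high-contrast features, which matches the trend the abstract reports.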
Abstract:
We propose an eigenvalue-based technique to solve the Homogeneous Quadratic Constrained Quadratic Programming problem (HQCQP) with at most three constraints, which arises in many signal processing problems. Semi-Definite Relaxation (SDR) is the only known approach and is computationally intensive. We study the performance of the proposed fast eigen approach through simulations in the context of MIMO relays and show that the solution converges to the solution obtained using the SDR approach with a significant reduction in complexity.
Abstract:
Rapid advancements in multi-core processor architectures coupled with low-cost, low-latency, high-bandwidth interconnects have made clusters of multi-core machines a common computing resource. Unfortunately, writing good parallel programs that efficiently utilize all the resources in such a cluster is still a major challenge. Various programming languages have been proposed as a solution to this problem, but are yet to be adopted widely to run performance-critical code mainly due to the relatively immature software framework and the effort involved in re-writing existing code in the new language. In this paper, we motivate and describe our initial study in exploring CUDA as a programming language for a cluster of multi-cores. We develop CUDA-For-Clusters (CFC), a framework that transparently orchestrates execution of CUDA kernels on a cluster of multi-core machines. The well-structured nature of a CUDA kernel, the growing popularity, support and stability of the CUDA software stack collectively make CUDA a good candidate to be considered as a programming language for a cluster. CFC uses a mixture of source-to-source compiler transformations, a work distribution runtime and a light-weight software distributed shared memory to manage parallel executions. Initial results on running several standard CUDA benchmark programs achieve impressive speedups of up to 7.5X on a cluster with 8 nodes, thereby opening up an interesting direction of research for further investigation.
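As an illustration of what a work distribution runtime might do, the sketch below splits the thread blocks of one kernel launch into contiguous ranges, one per cluster node. The abstract does not state CFC's actual distribution policy, so the even contiguous split and the name `partition_blocks` are hypothetical.

```python
def partition_blocks(num_blocks, num_nodes):
    """Split block indices 0..num_blocks-1 into contiguous half-open ranges.

    Hypothetical policy: each node gets an equal share, and the first
    (num_blocks % num_nodes) nodes take one extra block.
    """
    base, extra = divmod(num_blocks, num_nodes)
    ranges = []
    start = 0
    for node in range(num_nodes):
        count = base + (1 if node < extra else 0)
        ranges.append((start, start + count))
        start += count
    return ranges

if __name__ == "__main__":
    # 10 thread blocks spread over 4 nodes
    print(partition_blocks(10, 4))  # [(0, 3), (3, 6), (6, 8), (8, 10)]
```

Contiguous ranges keep neighboring blocks on the same node, which would tend to reduce the traffic a software distributed shared memory has to handle when adjacent blocks touch adjacent data.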
Abstract:
As System-on-Chip (SoC) designs migrate to the 28nm process node and beyond, the electromagnetic (EM) co-interactions of the Chip-Package-Printed Circuit Board (PCB) become critical and require accurate and efficient characterization and verification. In this paper a fast, scalable, and parallelized boundary element based integral EM solution to Maxwell's equations is presented. The accuracy of the full-wave formulation, for complete EM characterization, has been validated on both canonical structures and a real-world 3-D system (viz. Chip + Package + PCB). Good correlation between numerical simulation and measurement has been achieved. A few examples of the applicability of the formulation to high speed digital and analog serial interfaces on a 45nm SoC are also presented.
Abstract:
CuIn$_{1-x}$Al$_x$Se$_2$ (CIASe) thin films were grown by a simple sol-gel route followed by annealing under vacuum. Parameters related to the spin-orbit ($\Delta_{\mathrm{SO}}$) and crystal field ($\Delta_{\mathrm{CF}}$) were determined using a quasi-cubic model. Highly oriented (002) aluminum doped (2%) ZnO, 100 nm thin films, were co-sputtered for CuIn$_{1-x}$Al$_x$Se$_2$/AZnO based solar cells. Barrier height and ideality factor varied from 0.63 eV to 0.51 eV and 1.3186 to 2.095 in the dark and under 1.38 AM 1.5 solar illumination respectively. Current-voltage characteristics carried out at 300 K were confined to a triangle, exhibiting three limiting conduction mechanisms: Ohm's law, trap-filled limit curve and SCLC, with 0.2 V being the cross-over voltage, for a quadratic transition from Ohm's to Child's law. Visible photodetection was demonstrated with a CIASe/AZO photodiode configuration. Photocurrent was enhanced by one order from $3 \times 10^{-3}$ A in the dark at 1 V to $3 \times 10^{-2}$ A upon 1.38 sun illumination. The optimized photodiode exhibits an external quantum efficiency of over 32% to 10% from 350 to 1100 nm at high intensity 17.99 mW cm$^{-2}$ solar illumination. High responsivity $R_\lambda \sim 920$ A W$^{-1}$, sensitivity $S \sim 9.0$, and specific detectivity $D^* \sim 3 \times 10^{14}$ Jones make CIASe a potential absorber for enhancing the forthcoming technological applications of photodetection.
Abstract:
We discuss the computational bottlenecks in molecular dynamics (MD) and describe the challenges in parallelizing the computation-intensive tasks. We present a hybrid algorithm using MPI (Message Passing Interface) with OpenMP threads for parallelizing a generalized MD computation scheme for systems with short range interatomic interactions. The algorithm is discussed in the context of nano-indentation of Chromium films with carbon indenters using the Embedded Atom Method potential for Cr-Cr interaction and the Morse potential for Cr-C interactions. We study the performance of our algorithm for a range of MPI-thread combinations and find the performance to depend strongly on the computational task and load sharing in the multi-core processor. The algorithm scaled poorly with MPI and our hybrid schemes were observed to outperform the pure message passing scheme, despite utilizing the same number of processors or cores in the cluster. Speed-up achieved by our algorithm compared favorably with that achieved by standard MD packages. (C) 2013 Elsevier Inc. All rights reserved.
Abstract:
The Cubic Sieve Method for solving the Discrete Logarithm Problem in prime fields requires a nontrivial solution to the Cubic Sieve Congruence (CSC) $x^3 \equiv y^2 z \pmod{p}$, where $p$ is a given prime number. A nontrivial solution must also satisfy $x^3 \neq y^2 z$ and $1 \le x, y, z < p^{\alpha}$, where $\alpha$ is a given real number such that $1/3 < \alpha \le 1/2$. The CSC problem is to find an efficient algorithm to obtain a nontrivial solution to CSC. CSC can be parametrized as $x \equiv v^2 z \pmod{p}$ and $y \equiv v^3 z \pmod{p}$. In this paper, we give a deterministic polynomial-time ($O(\ln^3 p)$ bit-operations) algorithm to determine, for a given $v$, a nontrivial solution to CSC, if one exists. Previously it took $\tilde{O}(p^{\alpha})$ time in the worst case to determine this. We relate the CSC problem to the gap problem of fractional part sequences, where we need to determine the non-negative integers $N$ satisfying the fractional part inequality $\{\theta N\} < \phi$ ($\theta$ and $\phi$ are given real numbers). The correspondence between the CSC problem and the gap problem is that determining the parameter $z$ in the former problem corresponds to determining $N$ in the latter problem. We also show in the $\alpha = 1/2$ case of CSC that for a certain class of primes the CSC problem can be solved deterministically in $\tilde{O}(p^{1/3})$ time compared to the previous best of $\tilde{O}(p^{1/2})$. It is empirically observed that about one out of three primes is covered by the above class. (C) 2013 Elsevier B.V. All rights reserved.
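To make the parametrization concrete, the sketch below scans z for a given v and returns the first triple that meets the size bound and the nontriviality condition. This is a naive brute-force search whose cost grows with the bound on z. It is not the polynomial-time algorithm of the paper, and the function name `find_csc_solution` is hypothetical.

```python
def find_csc_solution(p, v, alpha=0.5):
    """Brute-force search for a nontrivial CSC solution (x, y, z) for a given v.

    Uses x = v^2 z mod p and y = v^3 z mod p, so that x^3 = y^2 z (mod p)
    holds automatically; only the bound 1 <= x, y, z < p^alpha and the
    nontriviality condition x^3 != y^2 z need to be checked.
    """
    bound = p ** alpha
    z = 1
    while z < bound:
        x = (v * v * z) % p
        y = (v * v * v * z) % p
        if 1 <= x < bound and 1 <= y < bound and x ** 3 != y * y * z:
            return x, y, z
        z += 1
    return None  # no nontrivial solution exists for this v

if __name__ == "__main__":
    # p = 101, v = 61: 5^3 = 125 and 2^2 * 6 = 24 differ by exactly 101
    print(find_csc_solution(101, 61))  # (5, 2, 6)
    # v = 1 only yields x = y = z, which is always trivial
    print(find_csc_solution(101, 1))   # None
```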
Abstract:
A residual based a posteriori error estimator is derived for a quadratic finite element method (FEM) for the elliptic obstacle problem. The error estimator involves various residuals consisting of the data of the problem, the discrete solution and a Lagrange multiplier related to the obstacle constraint. The choice of the discrete Lagrange multiplier yields an error estimator that is comparable with the error estimator in the case of linear FEM. Further, an a priori error estimate is derived to show that the discrete Lagrange multiplier converges at the same rate as the discrete solution of the obstacle problem. The numerical experiments with adaptive FEM show optimal order convergence. This demonstrates that the quadratic FEM for the obstacle problem exhibits optimal performance.
Abstract:
In this paper, an alternative a priori and a posteriori formulation has been derived for the discrete linear quadratic regulator (DLQR) in a manner analogous to that used in the discrete Kalman filter. It has been shown that the formulation fits seamlessly into the available formulation of the DLQR, and the equivalent terms in the existing formulation and the proposed formulation have been identified. Thereafter, the significance of this alternative formulation has been interpreted in terms of the sensitivity of the controller performance to changes in the states or in the control inputs. The implications of this alternative formulation for adaptive controller tuning have also been discussed.
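As background, the standard DLQR that this abstract builds on can be sketched for a scalar system x[k+1] = a x[k] + b u[k] with cost sum(q x^2 + r u^2). The code iterates the textbook discrete Riccati recursion to a fixed point. It does not implement the alternative a priori / a posteriori formulation proposed in the paper, and the name `dlqr_scalar` is hypothetical.

```python
def dlqr_scalar(a, b, q, r, iters=1000, tol=1e-12):
    """Iterate the scalar discrete Riccati recursion to a fixed point.

    Returns (k, p): the feedback gain for u = -k * x and the cost-to-go
    weight p. Textbook formulation, not the paper's alternative one.
    """
    p = q
    for _ in range(iters):
        k = a * b * p / (r + b * b * p)          # gain for the current p
        p_next = q + a * a * p - a * b * p * k   # Riccati update
        converged = abs(p_next - p) < tol
        p = p_next
        if converged:
            break
    k = a * b * p / (r + b * b * p)
    return k, p

if __name__ == "__main__":
    k, p = dlqr_scalar(a=1.0, b=1.0, q=1.0, r=1.0)
    print(k, p)  # closed-loop pole is a - b * k
```

For a = b = q = r = 1 the fixed point satisfies p^2 - p - 1 = 0, so p is the golden ratio and the closed-loop pole a - b k lies inside the unit circle.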
Abstract:
Rapid reconstruction of multidimensional images is crucial for enabling real-time 3D fluorescence imaging. This becomes a key factor for imaging rapidly occurring events in the cellular environment. To facilitate real-time imaging, we have developed a graphics processing unit (GPU) based real-time maximum a posteriori (MAP) image reconstruction system. The parallel processing capability of the GPU device, which consists of a large number of tiny processing cores, and the adaptability of the image reconstruction algorithm to parallel processing (which employs multiple independent computing modules called threads) result in high temporal resolution. Moreover, the proposed quadratic potential based MAP algorithm effectively deconvolves the images as well as suppresses the noise. The multi-node multi-threaded GPU and the Compute Unified Device Architecture (CUDA) efficiently execute the iterative image reconstruction algorithm approximately 200-fold faster (for large datasets) than existing CPU based systems. (C) 2015 Author(s). All article content, except where otherwise noted, is licensed under a Creative Commons Attribution 3.0 Unported License.