872 results for Sparse Matrix
Abstract:
Krylov subspace techniques have been shown to yield robust methods for the numerical computation of large sparse matrix exponentials and especially the transient solutions of Markov chains. The attractiveness of these methods results from the fact that they allow us to compute the action of a matrix exponential operator on an operand vector without having to compute, explicitly, the matrix exponential in isolation. In this paper we compare a Krylov-based method with some of the current approaches used for computing transient solutions of Markov chains. After a brief synthesis of the features of the methods used, wide-ranging numerical comparisons are performed on a Power Challenge Array supercomputer on three different models. (C) 1999 Elsevier Science B.V. All rights reserved. AMS Classification: 65F99; 65L05; 65U05.
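The abstract does not name the competing approaches. As an illustration only, the sketch below uses uniformization, one widely used technique for transient solutions of continuous-time Markov chains; whether it is among the methods the paper compares is an assumption. The function name and the dense list-of-lists generator are hypothetical choices made for brevity.

```python
import math

def transient_uniformization(Q, p0, t, tol=1e-12, max_terms=10000):
    """Approximate p(t) = p0 * exp(Q t) for a CTMC generator Q (rows sum to 0).

    Assumption: Q is a small dense list of lists; a real code would keep Q sparse.
    For large lam * t the starting weight exp(-lam * t) underflows; this sketch
    does not handle that case.
    """
    n = len(Q)
    lam = max(-Q[i][i] for i in range(n))  # uniformization rate
    if lam == 0.0:
        return list(p0)
    # P = I + Q / lam is a stochastic matrix.
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / lam for j in range(n)]
         for i in range(n)]
    weight = math.exp(-lam * t)            # Poisson probability of 0 jumps
    term = list(p0)                        # p0 * P^k, starting with k = 0
    result = [weight * x for x in term]
    accumulated = weight
    k = 0
    while 1.0 - accumulated > tol and k < max_terms:
        k += 1
        term = [sum(term[i] * P[i][j] for i in range(n)) for j in range(n)]
        weight *= lam * t / k              # Poisson probability of k jumps
        accumulated += weight
        result = [r + weight * x for r, x in zip(result, term)]
    return result

# Two-state chain with rates 2 (state 0 -> 1) and 1 (state 1 -> 0).
Q = [[-2.0, 2.0], [1.0, -1.0]]
p = transient_uniformization(Q, [1.0, 0.0], 0.5)
```

Only products of a vector with P are needed, so the matrix exponential itself is never formed, which is the same property the abstract highlights for Krylov methods.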
Abstract:
Sparse matrix-vector multiplication (SMVM) is a fundamental operation in many scientific and engineering applications. In many cases sparse matrices have thousands of rows and columns where most of the entries are zero, while the non-zero data is spread over the matrix. This lack of data locality reduces the effectiveness of the data cache in general-purpose processors, considerably reducing their performance efficiency compared with what is achieved with dense matrix multiplication. In this paper, we propose a parallel processing solution for SMVM in a many-core architecture. The architecture is tested with known benchmarks using a ZYNQ-7020 FPGA. The architecture is scalable in the number of core elements and limited only by the available memory bandwidth. It achieves performance efficiencies of up to almost 70% and better performance than previous FPGA designs.
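To make the locality problem concrete, here is a minimal serial sketch of SMVM. It assumes compressed sparse row (CSR) storage, which this abstract does not specify; the function names are hypothetical.

```python
def dense_to_csr(A):
    """Convert a dense list-of-lists matrix to CSR arrays (values, col_idx, row_ptr)."""
    values, col_idx, row_ptr = [], [], [0]
    for row in A:
        for j, a in enumerate(row):
            if a != 0:
                values.append(a)
                col_idx.append(j)
        row_ptr.append(len(values))  # end of this row's non-zeros
    return values, col_idx, row_ptr

def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = A x, touching only the stored non-zeros."""
    y = []
    for i in range(len(row_ptr) - 1):
        s = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # x is read at scattered positions col_idx[k]: this indirect
            # access is what defeats the data cache.
            s += values[k] * x[col_idx[k]]
        y.append(s)
    return y

A = [[4, 0, 0],
     [0, 0, 2],
     [1, 0, 3]]
y = csr_matvec(*dense_to_csr(A), [1, 2, 3])
```

Each row's dot product is independent of the others, which is why the operation distributes naturally over many cores until memory bandwidth becomes the limit.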
Abstract:
Sparse-matrix sampling using commercially available crystallization screen kits has become the most popular way of determining the preliminary crystallization conditions for macromolecules. In this study, the efficiency of three commercial screening kits, Crystal Screen and Crystal Screen 2 (Hampton Research), Wizard Screens I and II (Emerald BioStructures) and Personal Structure Screens 1 and 2 (Molecular Dimensions), has been compared using a set of 19 diverse proteins. 18 proteins yielded crystals using at least one crystallization screen. Surprisingly, Crystal Screens and Personal Structure Screens showed dramatically different results, although most of the crystallization formulations are identical as listed by the manufacturers. Higher molecular weight polyethylene glycols and mixed precipitants were found to be the most effective precipitants in this study.
Abstract:
Expokit provides a set of routines aimed at computing matrix exponentials. More precisely, it computes either a small matrix exponential in full, the action of a large sparse matrix exponential on an operand vector, or the solution of a system of linear ODEs with constant inhomogeneity. The backbone of the sparse routines consists of matrix-free Krylov subspace projection methods (Arnoldi and Lanczos processes), which is why the toolkit is capable of coping with sparse matrices of large dimension. The software handles real and complex matrices and provides specific routines for symmetric and Hermitian matrices. The computation of matrix exponentials is a numerical issue of critical importance in the area of Markov chains; furthermore, the computed solution is subject to probabilistic constraints. In addition to addressing general matrix exponentials, particular attention is given to the computation of transient states of Markov chains.
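The Krylov projection idea can be sketched as follows. This is not Expokit's code or interface: the function names are hypothetical, there is no error control or time stepping, and the small projected exponential is computed with a plain Taylor series plus scaling and squaring. The approximation used is exp(tA)v ≈ beta * V_m * exp(t H_m) * e_1, where V_m and H_m come from the Arnoldi process and beta is the norm of v.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def matmul(A, B):
    cols = list(zip(*B))
    return [[dot(row, col) for col in cols] for row in A]

def expm_dense(H, terms=20):
    """exp(H) for a small dense matrix: Taylor series with scaling and squaring."""
    m = len(H)
    norm = max(sum(abs(v) for v in row) for row in H)
    s = max(0, math.ceil(math.log2(norm)) + 1) if norm > 0 else 0
    B = [[v / 2.0 ** s for v in row] for row in H]   # scaled so its norm is <= 1/2
    E = [[float(i == j) for j in range(m)] for i in range(m)]
    T = [row[:] for row in E]
    for k in range(1, terms + 1):
        T = [[v / k for v in row] for row in matmul(T, B)]   # B^k / k!
        E = [[e + c for e, c in zip(re, rt)] for re, rt in zip(E, T)]
    for _ in range(s):
        E = matmul(E, E)                                     # undo the scaling
    return E

def krylov_expv(a_mul, v, t, m):
    """Approximate exp(t A) v using only products a_mul(x) = A x (matrix-free)."""
    beta = math.sqrt(dot(v, v))
    if beta == 0.0:
        return [0.0] * len(v)
    V = [[x / beta for x in v]]
    H = [[0.0] * m for _ in range(m + 1)]
    k = m
    for j in range(m):                       # Arnoldi with modified Gram-Schmidt
        w = a_mul(V[j])
        for i in range(j + 1):
            H[i][j] = dot(V[i], w)
            w = [wi - H[i][j] * vi for wi, vi in zip(w, V[i])]
        H[j + 1][j] = math.sqrt(dot(w, w))
        if H[j + 1][j] < 1e-12:              # invariant subspace: result is exact
            k = j + 1
            break
        V.append([wi / H[j + 1][j] for wi in w])
    E = expm_dense([[t * H[i][j] for j in range(k)] for i in range(k)])
    return [beta * sum(V[j][i] * E[j][0] for j in range(k)) for i in range(len(v))]

# Transposed generator of a two-state Markov chain, so p(t) = exp(t A) p(0).
A = [[-2.0, 1.0], [2.0, -1.0]]
p = krylov_expv(lambda x: [dot(row, x) for row in A], [1.0, 0.0], 0.5, 2)
```

The matrix enters only through `a_mul`, so a sparse product can be substituted without changing the rest, and the exponential that is actually computed has the small dimension m rather than the dimension of A.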
Abstract:
Modern GPUs are well suited to computationally intensive tasks and massively parallel computation. Sparse matrix multiplication and the linear triangular solver are among the most important and heavily used kernels in scientific computation, and several challenges in developing a high-performance kernel with the two modules are investigated. The main interest is to solve linear systems derived from elliptic equations with triangular elements. The resulting linear system has a symmetric positive definite matrix. The sparse matrix is stored in the compressed sparse row (CSR) format. A CUDA algorithm is proposed to execute the matrix-vector multiplication directly on the CSR format. A dependence tree algorithm is used to determine which variables the linear triangular solver can compute in parallel. To increase the number of parallel threads, a graph coloring algorithm is implemented to reorder the mesh numbering in a pre-processing phase. The proposed method is compared with available parallel and serial libraries. The results show that the proposed method reduces the computational cost of the matrix-vector multiplication. The pre-processing associated with the triangular solver needs to be executed just once in the proposed method. The conjugate gradient method was implemented and showed a similar convergence rate for all the compared methods. The proposed method showed significantly smaller execution time.
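One common way to find which unknowns of a sparse triangular solve can be computed in parallel is level scheduling, sketched below on the CSR sparsity pattern of a lower-triangular matrix. This is an assumed stand-in for illustration; the abstract's dependence tree algorithm may differ in detail, and the function names are hypothetical.

```python
def triangular_levels(col_idx, row_ptr):
    """Dependency level of each unknown of L x = b, with L lower triangular in CSR.

    Unknown i depends on every j < i with L[i][j] != 0, so
    level[i] = 1 + max(level[j]) over those j, or 0 if there are none.
    """
    n = len(row_ptr) - 1
    level = [0] * n
    for i in range(n):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            j = col_idx[k]
            if j < i:  # off-diagonal entry: x[i] needs x[j] first
                level[i] = max(level[i], level[j] + 1)
    return level

def group_by_level(level):
    """Unknowns sharing a level are mutually independent and can be solved together."""
    groups = {}
    for i, lv in enumerate(level):
        groups.setdefault(lv, []).append(i)
    return [groups[lv] for lv in sorted(groups)]

# Pattern of a 4x4 lower-triangular matrix:
# row 0: {0}   row 1: {1}   row 2: {0, 2}   row 3: {1, 2, 3}
col_idx = [0, 1, 0, 2, 1, 2, 3]
row_ptr = [0, 1, 2, 4, 7]
levels = triangular_levels(col_idx, row_ptr)
groups = group_by_level(levels)
```

The levels depend only on the sparsity pattern, so they can be computed once in pre-processing and reused for every solve with the same matrix. Fewer, larger groups mean more parallel work per step, which is what a reordering such as graph coloring aims to achieve.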
Abstract:
A fully 3D iterative image reconstruction algorithm has been developed for high-resolution PET cameras composed of pixelated scintillator crystal arrays and rotating planar detectors, based on the ordered subsets approach. The associated system matrix is precalculated with Monte Carlo methods that incorporate physical effects not included in analytical models, such as positron range effects and interaction of the incident gammas with the scintillator material. Custom Monte Carlo methodologies have been developed and optimized for modelling of system matrices for fast iterative image reconstruction adapted to specific scanner geometries, without redundant calculations. According to the methodology proposed here, only one-eighth of the voxels within two central transaxial slices need to be modelled in detail. The rest of the system matrix elements can be obtained with the aid of axial symmetries and redundancies, as well as in-plane symmetries within transaxial slices. Sparse matrix techniques for the non-zero system matrix elements are employed, allowing for fast execution of the image reconstruction process. This 3D image reconstruction scheme has been compared in terms of image quality to a 2D fast implementation of the OSEM algorithm combined with Fourier rebinning approaches. This work confirms the superiority of fully 3D OSEM in terms of spatial resolution, contrast recovery and noise reduction as compared to conventional 2D approaches based on rebinning schemes. At the same time it demonstrates that fully 3D methodologies can be efficiently applied to the image reconstruction problem for high-resolution rotational PET cameras by applying accurate pre-calculated system models and taking advantage of the system's symmetries.
Abstract:
Numerical methods related to Krylov subspaces are widely used in large sparse numerical linear algebra. Vectors in these subspaces are manipulated via their representation onto orthonormal bases. Nowadays, on serial computers, the method of Arnoldi is considered as a reliable technique for constructing such bases. However, although easily parallelizable, this technique is not as scalable as expected for communications. In this work we examine alternative methods aimed at overcoming this drawback. Since they retrieve upon completion the same information as Arnoldi's algorithm does, they enable us to design a wide family of stable and scalable Krylov approximation methods for various parallel environments. We present timing results obtained from their implementation on two distributed-memory multiprocessor supercomputers: the Intel Paragon and the IBM Scalable POWERparallel SP2. (C) 1997 by John Wiley & Sons, Ltd.
Abstract:
Owing to the large number of transistors per mm2 found in today's conventional GPUs, in recent years they have been used for general-purpose computing because they offer higher performance for parallel computation. This project implements the sparse matrix-vector product in OpenCL. The first chapters review the theoretical background needed to understand the problem. We then cover the fundamentals of OpenCL and of the hardware on which the developed libraries will run. The following chapter describes the kernel code and its data flow. Finally, the software is evaluated through comparisons with the CPU.
Abstract:
The present work reports our successful experience concerning the crystallization of four fish hemoglobins from three Brazilian species of Teleosts: Liposarcus anisitsi, Brycon cephalus and Piaractus mesopotamicus. The data shown here are part of a systematic functional and structural study of fish hemoglobins with the aim of better understanding the outstanding range of functional and structural properties exhibited by these proteins. We also present a reduced sparse-matrix method for crystallization of fish hemoglobins, which can reduce the amount of hemoglobin initially used in the crystallization experiments.
Abstract:
This dissertation describes an approach for developing a real-time simulation for working mobile vehicles based on multibody modeling. The use of multibody modeling allows a comprehensive description of the constrained motion of the mechanical systems involved and permits real-time solving of the equations of motion. By carefully selecting the multibody formulation method to be used, it is possible to increase the accuracy of the multibody model while at the same time solving the equations of motion in real time. In this study, a multibody procedure based on semi-recursive and augmented Lagrangian methods for real-time dynamic simulation is studied in detail. In the semi-recursive approach, a velocity transformation matrix is introduced to express the dependent coordinates in terms of relative (joint) coordinates, which reduces the number of generalized coordinates. The augmented Lagrangian method is based on the use of global coordinates and, in that method, constraints are accounted for using an iterative process. A multibody system can be modelled using either rigid or flexible bodies. When using flexible bodies, the system can be described using a floating frame of reference formulation. In this method, the deformation modes needed can be obtained from the finite element model. As the finite element model typically involves a large number of degrees of freedom, a reduced number of deformation modes can be obtained by employing a model order reduction method such as Guyan reduction, the Craig-Bampton method or Krylov subspace methods, as shown in this study. The constrained motion of the working mobile vehicles is actuated by the force from the hydraulic actuator. In this study, the hydraulic system is modeled using lumped fluid theory, in which the hydraulic circuit is divided into volumes. In this approach, the pressure wave propagation in the hoses and pipes is neglected. The contact modeling is divided into two stages: contact detection and contact response. Contact detection determines when and where the contact occurs, and contact response provides the force acting at the collision point. The friction between tire and ground is modelled using the LuGre friction model, which describes the frictional force between two surfaces. Typically, the equations of motion are solved in full-matrix format, where the sparsity of the matrices is not considered. Increasing the number of bodies and constraint equations leads to the system matrices becoming large and sparse in structure. To increase the computational efficiency, a technique for the solution of sparse matrices is proposed in this dissertation and its implementation is demonstrated. To assess the computing efficiency, the augmented Lagrangian and semi-recursive methods are implemented employing a sparse matrix technique. In the numerical example, the results show that the proposed approach is applicable and produces appropriate results within the real-time period.
Abstract:
For many networks in nature, science and technology, it is possible to order the nodes so that most links are short-range, connecting near-neighbours, and relatively few long-range links, or shortcuts, are present. Given a network as a set of observed links (interactions), the task of finding an ordering of the nodes that reveals such a range-dependent structure is closely related to some sparse matrix reordering problems arising in scientific computation. The spectral, or Fiedler vector, approach for sparse matrix reordering has successfully been applied to biological data sets, revealing useful structures and subpatterns. In this work we argue that a periodic analogue of the standard reordering task is also highly relevant. Here, rather than encouraging nonzeros only to lie close to the diagonal of a suitably ordered adjacency matrix, we also allow them to inhabit the off-diagonal corners. Indeed, for the classic small-world model of Watts & Strogatz (1998, Collective dynamics of ‘small-world’ networks. Nature, 393, 440–442) this type of periodic structure is inherent. We therefore devise and test a new spectral algorithm for periodic reordering. By generalizing the range-dependent random graph class of Grindrod (2002, Range-dependent random graphs and their application to modeling large small-world proteome datasets. Phys. Rev. E, 66, 066702-1–066702-7) to the periodic case, we can also construct a computable likelihood ratio that suggests whether a given network is inherently linear or periodic. Tests on synthetic data show that the new algorithm can detect periodic structure, even in the presence of noise. Further experiments on real biological data sets then show that some networks are better regarded as periodic than linear. Hence, we find both qualitative (reordered network plots) and quantitative (likelihood ratios) evidence of periodicity in biological networks.
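The standard (linear) Fiedler-vector reordering mentioned in the abstract can be sketched as below; the periodic algorithm the paper introduces is not reproduced here. The function name is hypothetical, and the eigenvector is obtained by shifted power iteration, chosen for brevity rather than efficiency.

```python
import math
import random

def fiedler_order(edges, n, iters=2000, seed=0):
    """Order the nodes of a connected undirected graph by their Fiedler vector entries.

    The Fiedler vector is the eigenvector of the graph Laplacian L = D - A for
    its second-smallest eigenvalue. It is found by power iteration on
    shift * I - L, restricted to vectors orthogonal to the constant vector.
    """
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    deg = [len(a) for a in adj]
    shift = 2.0 * max(deg)  # upper bound on the largest eigenvalue of L
    rng = random.Random(seed)
    x = [rng.random() for _ in range(n)]
    for _ in range(iters):
        mean = sum(x) / n
        x = [xi - mean for xi in x]  # remove the constant eigenvector
        y = [(shift - deg[i]) * x[i] + sum(x[j] for j in adj[i])
             for i in range(n)]
        nrm = math.sqrt(sum(v * v for v in y))
        x = [v / nrm for v in y]
    return sorted(range(n), key=lambda i: x[i])

# A path graph on 8 nodes whose labels have been scrambled.
perm = [3, 0, 6, 1, 7, 4, 2, 5]
edges = [(perm[i], perm[i + 1]) for i in range(7)]
order = fiedler_order(edges, 8)
pos = {node: k for k, node in enumerate(order)}
bandwidth = max(abs(pos[u] - pos[v]) for u, v in edges)
```

Sorting by the Fiedler vector places linked nodes close together, so the reordered adjacency matrix has its nonzeros near the diagonal. For the scrambled path, the ordering recovers the path itself (possibly reversed).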
Abstract:
This paper deals with approaches for sparse matrix substitutions using vector processing. Many publications have used the W-matrix method to solve the forward/backward substitutions on vector computers. Recently, a different approach has been presented using a dependency-based substitution algorithm (DBSA). In this paper the focus is on new algorithms able to exploit the sparsity of the vectors. The efficiency is tested using linear systems from power systems with 118, 320, 725 and 1729 buses. The tests were performed on a CRAY Y MP2E/232. The speedups for fast-forward/fast-backward substitution using the 1729-bus system are near 19 and 14 for real and complex arithmetic operations, respectively. When forward/backward substitution is employed, the speedups are about 8 and 6 for the same simulations.
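The benefit of exploiting vector sparsity can be seen in a minimal serial sketch of column-oriented forward substitution that skips columns whose solution entry is zero. This is an illustration only, not the W-matrix method, the DBSA, or the paper's vectorized algorithms; the storage layout and names are assumptions.

```python
def sparse_forward_substitution(cols, diag, b):
    """Solve L x = b for lower-triangular L, skipping work for zero entries of x.

    cols[j] lists the below-diagonal non-zeros of column j as (i, L_ij) pairs
    and diag[j] holds L_jj (hypothetical layout). Returns x and the number of
    columns skipped.
    """
    n = len(diag)
    x = list(b)
    skipped = 0
    for j in range(n):
        if x[j] == 0:
            skipped += 1      # column j cannot change any later entry
            continue
        x[j] /= diag[j]
        for i, lij in cols[j]:
            x[i] -= lij * x[j]
    return x, skipped

# L = [[2, 0, 0, 0],
#      [1, 1, 0, 0],
#      [0, 0, 4, 0],
#      [0, 3, 0, 1]]
diag = [2.0, 1.0, 4.0, 1.0]
cols = [[(1, 1.0)], [(3, 3.0)], [], []]
x, skipped = sparse_forward_substitution(cols, diag, [0.0, 2.0, 0.0, 1.0])
```

With a sparse right-hand side, only the columns reached from its non-zero entries are processed. Fast-forward schemes typically identify that set of columns in advance from the structure of L rather than testing each entry at run time as this sketch does.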
Abstract:
SMase I, a 32 kDa sphingomyelinase found in Loxosceles laeta venom, is responsible for the major pathological effects of spider envenomation. This toxin has been cloned and functionally expressed as a fusion protein containing a 6×His tag at its N-terminus to yield a 33 kDa protein [Fernandes-Pedrosa et al. (2002), Biochem. Biophys. Res. Commun. 298, 638-645]. The recombinant protein possesses all the biological properties ascribed to the whole L. laeta venom, including dermonecrotic and complement-dependent haemolytic activities. Dynamic light-scattering experiments conducted at 291 K demonstrate that the sample possesses a monomodal distribution, with a hydrodynamic radius of 3.57 nm. L. laeta SMase I was crystallized by the hanging-drop vapour-diffusion technique using the sparse-matrix method. Single crystals were obtained using a buffer solution consisting of 0.08 M HEPES and 0.9 M trisodium citrate, which was titrated to pH 7.5 using 0.25 M sodium hydroxide. Complete three-dimensional diffraction data were collected to 1.8 Å at the Laboratorio Nacional de Luz Sincrotron (LNLS, Campinas, Brazil). The crystals belong to the hexagonal system (space group P6(1) or P6(5)), with unit-cell parameters a = b = 140.6, c = 113.6 Å. A search for heavy-atom derivatives has been initiated and elucidation of the crystal structure is currently in progress.
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)