943 resultados para Sparse matrices


Relevância:

100.00% 100.00%

Publicador:

Resumo:

"Vegeu el resum a l'inici del document del fitxer adjunt."

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Sparse matrix-vector multiplication (SMVM) is a fundamental operation in many scientific and engineering applications. In many cases sparse matrices have thousands of rows and columns where most of the entries are zero, while non-zero data is spread over the matrix. This sparsity of data locality reduces the effectiveness of data cache in general-purpose processors quite reducing their performance efficiency when compared to what is achieved with dense matrix multiplication. In this paper, we propose a parallel processing solution for SMVM in a many-core architecture. The architecture is tested with known benchmarks using a ZYNQ-7020 FPGA. The architecture is scalable in the number of core elements and limited only by the available memory bandwidth. It achieves performance efficiencies up to almost 70% and better performances than previous FPGA designs.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

In this paper we introduce a new algorithm, based on the successful work of Fathi and Alexandrov, on hybrid Monte Carlo algorithms for matrix inversion and solving systems of linear algebraic equations. This algorithm consists of two parts, approximate inversion by Monte Carlo and iterative refinement using a deterministic method. Here we present a parallel hybrid Monte Carlo algorithm, which uses Monte Carlo to generate an approximate inverse and that improves the accuracy of the inverse with an iterative refinement. The new algorithm is applied efficiently to sparse non-singular matrices. When we are solving a system of linear algebraic equations, Bx = b, the inverse matrix is used to compute the solution vector x = B(-1)b. We present results that show the efficiency of the parallel hybrid Monte Carlo algorithm in the case of sparse matrices.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Computing the weighted geometric mean of large sparse matrices is an operation that tends to become rapidly intractable, when the size of the matrices involved grows. However, if we are not interested in the computation of the matrix function itself, but just in that of its product times a vector, the problem turns simpler and there is a chance to solve it even when the matrix mean would actually be impossible to compute. Our interest is motivated by the fact that this calculation has some practical applications, related to the preconditioning of some operators arising in domain decomposition of elliptic problems. In this thesis, we explore how such a computation can be efficiently performed. First, we exploit the properties of the weighted geometric mean and find several equivalent ways to express it through real powers of a matrix. Hence, we focus our attention on matrix powers and examine how well-known techniques can be adapted to the solution of the problem at hand. In particular, we consider two broad families of approaches for the computation of f(A) v, namely quadrature formulae and Krylov subspace methods, and generalize them to the pencil case f(A\B) v. Finally, we provide an extensive experimental evaluation of the proposed algorithms and also try to assess how convergence speed and execution time are influenced by some characteristics of the input matrices. Our results suggest that a few elements have some bearing on the performance and that, although there is no best choice in general, knowing the conditioning and the sparsity of the arguments beforehand can considerably help in choosing the best strategy to tackle the problem.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

The stable similarity reduction of a nonsymmetric square matrix to tridiagonal form has been a long-standing problem in numerical linear algebra. The biorthogonal Lanczos process is in principle a candidate method for this task, but in practice it is confined to sparse matrices and is restarted periodically because roundoff errors affect its three-term recurrence scheme and degrade the biorthogonality after a few steps. This adds to its vulnerability to serious breakdowns or near-breakdowns, the handling of which involves recovery strategies such as the look-ahead technique, which needs a careful implementation to produce a block-tridiagonal form with unpredictable block sizes. Other candidate methods, geared generally towards full matrices, rely on elementary similarity transformations that are prone to numerical instabilities. Such concomitant difficulties have hampered finding a satisfactory solution to the problem for either sparse or full matrices. This study focuses primarily on full matrices. After outlining earlier tridiagonalization algorithms from within a general framework, we present a new elimination technique combining orthogonal similarity transformations that are stable. We also discuss heuristics to circumvent breakdowns. Applications of this study include eigenvalue calculation and the approximation of matrix functions.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Expokit provides a set of routines aimed at computing matrix exponentials. More precisely, it computes either a small matrix exponential in full, the action of a large sparse matrix exponential on an operand vector, or the solution of a system of linear ODEs with constant inhomogeneity. The backbone of the sparse routines consists of matrix-free Krylov subspace projection methods (Arnoldi and Lanczos processes), and that is why the toolkit is capable of coping with sparse matrices of large dimension. The software handles real and complex matrices and provides specific routines for symmetric and Hermitian matrices. The computation of matrix exponentials is a numerical issue of critical importance in the area of Markov chains and furthermore, the computed solution is subject to probabilistic constraints. In addition to addressing general matrix exponentials, a distinct attention is assigned to the computation of transient states of Markov chains.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This paper gives a detailed presentation of the Substitution-Newton-Raphson method, suitable for large sparse non-linear systems. It combines the Successive Substitution method and the Newton-Raphson method in such way as to take the best advantages of both, keeping the convergence features of the Newton-Raphson with the low requirements of memory and time of the Successive Substitution schemes. The large system is solved employing few effective variables, using the greatest possible part of the model equations in substitution fashion to fix the remaining variables, but maintaining the convergence characteristics of the Newton-Raphson. The methodology is exemplified through a simple algebraic system, and applied to a simple thermodynamic, mechanical and heat transfer modeling of a single-stage vapor compression refrigeration system. Three distinct approaches for reproducing the thermodynamic properties of the refrigerant R-134a are compared: the linear interpolation from tabulated data, the use of polynomial fitted curves and the use of functions derived from the Helmholtz free energy.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This dissertation describes an approach for developing a real-time simulation for working mobile vehicles based on multibody modeling. The use of multibody modeling allows comprehensive description of the constrained motion of the mechanical systems involved and permits real-time solving of the equations of motion. By carefully selecting the multibody formulation method to be used, it is possible to increase the accuracy of the multibody model while at the same time solving equations of motion in real-time. In this study, a multibody procedure based on semi-recursive and augmented Lagrangian methods for real-time dynamic simulation application is studied in detail. In the semirecursive approach, a velocity transformation matrix is introduced to describe the dependent coordinates into relative (joint) coordinates, which reduces the size of the generalized coordinates. The augmented Lagrangian method is based on usage of global coordinates and, in that method, constraints are accounted using an iterative process. A multibody system can be modelled as either rigid or flexible bodies. When using flexible bodies, the system can be described using a floating frame of reference formulation. In this method, the deformation mode needed can be obtained from the finite element model. As the finite element model typically involves large number of degrees of freedom, reduced number of deformation modes can be obtained by employing model order reduction method such as Guyan reduction, Craig-Bampton method and Krylov subspace as shown in this study The constrained motion of the working mobile vehicles is actuated by the force from the hydraulic actuator. In this study, the hydraulic system is modeled using lumped fluid theory, in which the hydraulic circuit is divided into volumes. In this approach, the pressure wave propagation in the hoses and pipes is neglected. The contact modeling is divided into two stages: contact detection and contact response. Contact detection determines when and where the contact occurs, and contact response provides the force acting at the collision point. The friction between tire and ground is modelled using the LuGre friction model, which describes the frictional force between two surfaces. Typically, the equations of motion are solved in the full matrices format, where the sparsity of the matrices is not considered. Increasing the number of bodies and constraint equations leads to the system matrices becoming large and sparse in structure. To increase the computational efficiency, a technique for solution of sparse matrices is proposed in this dissertation and its implementation demonstrated. To assess the computing efficiency, augmented Lagrangian and semi-recursive methods are implemented employing a sparse matrix technique. From the numerical example, the results show that the proposed approach is applicable and produced appropriate results within the real-time period.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

In this paper we deal with performance analysis of Monte Carlo algorithm for large linear algebra problems. We consider applicability and efficiency of the Markov chain Monte Carlo for large problems, i.e., problems involving matrices with a number of non-zero elements ranging between one million and one billion. We are concentrating on analysis of the almost Optimal Monte Carlo (MAO) algorithm for evaluating bilinear forms of matrix powers since they form the so-called Krylov subspaces. Results are presented comparing the performance of the Robust and Non-robust Monte Carlo algorithms. The algorithms are tested on large dense matrices as well as on large unstructured sparse matrices.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

El objetivo de esta Tesis ha sido la consecución de simulaciones en tiempo real de vehículos industriales modelizados como sistemas multicuerpo complejos formados por sólidos rígidos. Para el desarrollo de un programa de simulación deben considerarse cuatro aspectos fundamentales: la modelización del sistema multicuerpo (tipos de coordenadas, pares ideales o impuestos mediante fuerzas), la formulación a utilizar para plantear las ecuaciones diferenciales del movimiento (coordenadas dependientes o independientes, métodos globales o topológicos, forma de imponer las ecuaciones de restricción), el método de integración numérica para resolver estas ecuaciones en el tiempo (integradores explícitos o implícitos) y finalmente los detalles de la implementación realizada (lenguaje de programación, librerías matemáticas, técnicas de paralelización). Estas cuatro etapas están interrelacionadas entre sí y todas han formado parte de este trabajo. Desde la generación de modelos de una furgoneta y de camión con semirremolque, el uso de tres formulaciones dinámicas diferentes, la integración de las ecuaciones diferenciales del movimiento mediante métodos explícitos e implícitos, hasta el uso de funciones BLAS, de técnicas de matrices sparse y la introducción de paralelización para utilizar los distintos núcleos del procesador. El trabajo presentado en esta Tesis ha sido organizado en 8 capítulos, dedicándose el primero de ellos a la Introducción. En el Capítulo 2 se presentan dos formulaciones semirrecursivas diferentes, de las cuales la primera está basada en una doble transformación de velocidades, obteniéndose las ecuaciones diferenciales del movimiento en función de las aceleraciones relativas independientes. La integración numérica de estas ecuaciones se ha realizado con el método de Runge-Kutta explícito de cuarto orden. La segunda formulación está basada en coordenadas relativas dependientes, imponiendo las restricciones por medio de penalizadores en posición y corrigiendo las velocidades y aceleraciones mediante métodos de proyección. En este segundo caso la integración de las ecuaciones del movimiento se ha llevado a cabo mediante el integrador implícito HHT (Hilber, Hughes and Taylor), perteneciente a la familia de integradores estructurales de Newmark. En el Capítulo 3 se introduce la tercera formulación utilizada en esta Tesis. En este caso las uniones entre los sólidos del sistema se ha realizado mediante uniones flexibles, lo que obliga a imponer los pares por medio de fuerzas. Este tipo de uniones impide trabajar con coordenadas relativas, por lo que la posición del sistema y el planteamiento de las ecuaciones del movimiento se ha realizado utilizando coordenadas Cartesianas y parámetros de Euler. En esta formulación global se introducen las restricciones mediante fuerzas (con un planteamiento similar al de los penalizadores) y la estabilización del proceso de integración numérica se realiza también mediante proyecciones de velocidades y aceleraciones. En el Capítulo 4 se presenta una revisión de las principales herramientas y estrategias utilizadas para aumentar la eficiencia de las implementaciones de los distintos algoritmos. En primer lugar se incluye una serie de consideraciones básicas para aumentar la eficiencia numérica de las implementaciones. A continuación se mencionan las principales características de los analizadores de códigos utilizados y también las librerías matemáticas utilizadas para resolver los problemas de álgebra lineal tanto con matrices densas como sparse. Por último se desarrolla con un cierto detalle el tema de la paralelización en los actuales procesadores de varios núcleos, describiendo para ello el patrón empleado y las características más importantes de las dos herramientas propuestas, OpenMP y las TBB de Intel. Hay que señalar que las características de los sistemas multicuerpo problemas de pequeño tamaño, frecuente uso de la recursividad, y repetición intensiva en el tiempo de los cálculos con fuerte dependencia de los resultados anteriores dificultan extraordinariamente el uso de técnicas de paralelización frente a otras áreas de la mecánica computacional, tales como por ejemplo el cálculo por elementos finitos. Basándose en los conceptos mencionados en el Capítulo 4, el Capítulo 5 está dividido en tres secciones, una para cada formulación propuesta en esta Tesis. En cada una de estas secciones se describen los detalles de cómo se han realizado las distintas implementaciones propuestas para cada algoritmo y qué herramientas se han utilizado para ello. En la primera sección se muestra el uso de librerías numéricas para matrices densas y sparse en la formulación topológica semirrecursiva basada en la doble transformación de velocidades. En la segunda se describe la utilización de paralelización mediante OpenMP y TBB en la formulación semirrecursiva con penalizadores y proyecciones. Por último, se describe el uso de técnicas de matrices sparse y paralelización en la formulación global con uniones flexibles y parámetros de Euler. El Capítulo 6 describe los resultados alcanzados mediante las formulaciones e implementaciones descritas previamente. Este capítulo comienza con una descripción de la modelización y topología de los dos vehículos estudiados. El primer modelo es un vehículo de dos ejes del tipo chasis-cabina o furgoneta, perteneciente a la gama de vehículos de carga medianos. El segundo es un vehículo de cinco ejes que responde al modelo de un camión o cabina con semirremolque, perteneciente a la categoría de vehículos industriales pesados. En este capítulo además se realiza un estudio comparativo entre las simulaciones de estos vehículos con cada una de las formulaciones utilizadas y se presentan de modo cuantitativo los efectos de las mejoras alcanzadas con las distintas estrategias propuestas en esta Tesis. Con objeto de extraer conclusiones más fácilmente y para evaluar de un modo más objetivo las mejoras introducidas en la Tesis, todos los resultados de este capítulo se han obtenido con el mismo computador, que era el top de la gama Intel Xeon en 2007, pero que hoy día está ya algo obsoleto. Por último los Capítulos 7 y 8 están dedicados a las conclusiones finales y las futuras líneas de investigación que pueden derivar del trabajo realizado en esta Tesis. Los objetivos de realizar simulaciones en tiempo real de vehículos industriales de gran complejidad han sido alcanzados con varias de las formulaciones e implementaciones desarrolladas. ABSTRACT The objective of this Dissertation has been the achievement of real time simulations of industrial vehicles modeled as complex multibody systems made up by rigid bodies. For the development of a simulation program, four main aspects must be considered: the modeling of the multibody system (types of coordinates, ideal joints or imposed by means of forces), the formulation to be used to set the differential equations of motion (dependent or independent coordinates, global or topological methods, ways to impose constraints equations), the method of numerical integration to solve these equations in time (explicit or implicit integrators) and the details of the implementation carried out (programming language, mathematical libraries, parallelization techniques). These four stages are interrelated and all of them are part of this work. They involve the generation of models for a van and a semitrailer truck, the use of three different dynamic formulations, the integration of differential equations of motion through explicit and implicit methods, the use of BLAS functions and sparse matrix techniques, and the introduction of parallelization to use the different processor cores. The work presented in this Dissertation has been structured in eight chapters, the first of them being the Introduction. In Chapter 2, two different semi-recursive formulations are shown, of which the first one is based on a double velocity transformation, thus getting the differential equations of motion as a function of the independent relative accelerations. The numerical integration of these equations has been made with the Runge-Kutta explicit method of fourth order. The second formulation is based on dependent relative coordinates, imposing the constraints by means of position penalty coefficients and correcting the velocities and accelerations by projection methods. In this second case, the integration of the motion equations has been carried out by means of the HHT implicit integrator (Hilber, Hughes and Taylor), which belongs to the Newmark structural integrators family. In Chapter 3, the third formulation used in this Dissertation is presented. In this case, the joints between the bodies of the system have been considered as flexible joints, with forces used to impose the joint conditions. This kind of union hinders to work with relative coordinates, so the position of the system bodies and the setting of the equations of motion have been carried out using Cartesian coordinates and Euler parameters. In this global formulation, constraints are introduced through forces (with a similar approach to the penalty coefficients) are presented. The stabilization of the numerical integration is carried out also by velocity and accelerations projections. In Chapter 4, a revision of the main computer tools and strategies used to increase the efficiency of the implementations of the algorithms is presented. First of all, some basic considerations to increase the numerical efficiency of the implementations are included. Then the main characteristics of the code’ analyzers used and also the mathematical libraries used to solve linear algebra problems (both with dense and sparse matrices) are mentioned. Finally, the topic of parallelization in current multicore processors is developed thoroughly. For that, the pattern used and the most important characteristics of the tools proposed, OpenMP and Intel TBB, are described. It needs to be highlighted that the characteristics of multibody systems small size problems, frequent recursion use and intensive repetition along the time of the calculation with high dependencies of the previous results complicate extraordinarily the use of parallelization techniques against other computational mechanics areas, as the finite elements computation. Based on the concepts mentioned in Chapter 4, Chapter 5 is divided into three sections, one for each formulation proposed in this Dissertation. In each one of these sections, the details of how these different proposed implementations have been made for each algorithm and which tools have been used are described. In the first section, it is shown the use of numerical libraries for dense and sparse matrices in the semirecursive topological formulation based in the double velocity transformation. In the second one, the use of parallelization by means OpenMP and TBB is depicted in the semi-recursive formulation with penalization and projections. Lastly, the use of sparse matrices and parallelization techniques is described in the global formulation with flexible joints and Euler parameters. Chapter 6 depicts the achieved results through the formulations and implementations previously described. This chapter starts with a description of the modeling and topology of the two vehicles studied. The first model is a two-axle chassis-cabin or van like vehicle, which belongs to the range of medium charge vehicles. The second one is a five-axle vehicle belonging to the truck or cabin semi-trailer model, belonging to the heavy industrial vehicles category. In this chapter, a comparative study is done between the simulations of these vehicles with each one of the formulations used and the improvements achieved are presented in a quantitative way with the different strategies proposed in this Dissertation. With the aim of deducing the conclusions more easily and to evaluate in a more objective way the improvements introduced in the Dissertation, all the results of this chapter have been obtained with the same computer, which was the top one among the Intel Xeon range in 2007, but which is rather obsolete today. Finally, Chapters 7 and 8 are dedicated to the final conclusions and the future research projects that can be derived from the work presented in this Dissertation. The objectives of doing real time simulations in high complex industrial vehicles have been achieved with the formulations and implementations developed.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The performance of Gallager's error-correcting code is investigated via methods of statistical physics. In this method, the transmitted codeword comprises products of the original message bits selected by two randomly-constructed sparse matrices; the number of non-zero row/column elements in these matrices constitutes a family of codes. We show that Shannon's channel capacity is saturated for many of the codes while slightly lower performance is obtained for others which may be of higher practical relevance. Decoding aspects are considered by employing the TAP approach which is identical to the commonly used belief-propagation-based decoding.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

We employ the methods of statistical physics to study the performance of Gallager type error-correcting codes. In this approach, the transmitted codeword comprises Boolean sums of the original message bits selected by two randomly-constructed sparse matrices. We show that a broad range of these codes potentially saturate Shannon's bound but are limited due to the decoding dynamics used. Other codes show sub-optimal performance but are not restricted by the decoding dynamics. We show how these codes may also be employed as a practical public-key cryptosystem and are of competitive performance to modern cyptographical methods.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

We study the performance of Low Density Parity Check (LDPC) error-correcting codes using the methods of statistical physics. LDPC codes are based on the generation of codewords using Boolean sums of the original message bits by employing two randomly-constructed sparse matrices. These codes can be mapped onto Ising spin models and studied using common methods of statistical physics. We examine various regular constructions and obtain insight into their theoretical and practical limitations. We also briefly report on results obtained for irregular code constructions, for codes with non-binary alphabet, and on how a finite system size effects the error probability.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The replica method, developed in statistical physics, is employed in conjunction with Gallager's methodology to accurately evaluate zero error noise thresholds for Gallager code ensembles. Our approach generally provides more optimistic evaluations than those reported in the information theory literature for sparse matrices; the difference vanishes as the parity check matrix becomes dense.