922 resultados para Parallel numerical algorithms
Resumo:
Variations in different types of genomes have been found to be responsible for a large degree of physical diversity such as appearance and susceptibility to disease. Identification of genomic variations is difficult and can be facilitated through computational analysis of DNA sequences. Newly available technologies are able to sequence billions of DNA base pairs relatively quickly. These sequences can be used to identify variations within their specific genome but must be mapped to a reference sequence first. In order to align these sequences to a reference sequence, we require mapping algorithms that make use of approximate string matching and string indexing methods. To date, few mapping algorithms have been tailored to handle the massive amounts of output generated by newly available sequencing technologies. In otrder to handle this large amount of data, we modified the popular mapping software BWA to run in parallel using OpenMPI. Parallel BWA matches the efficiency of multithreaded BWA functions while providing efficient parallelism for BWA functions that do not currently support multithreading. Parallel BWA shows significant wall time speedup in comparison to multithreaded BWA on high-performance computing clusters, and will thus facilitate the analysis of genome sequencing data.
Resumo:
The KCube interconnection topology was rst introduced in 2010. The KCube graph is a compound graph of a Kautz digraph and hypercubes. Compared with the at- tractive Kautz digraph and well known hypercube graph, the KCube graph could accommodate as many nodes as possible for a given indegree (and outdegree) and the diameter of interconnection networks. However, there are few algorithms designed for the KCube graph. In this thesis, we will concentrate on nding graph theoretical properties of the KCube graph and designing parallel algorithms that run on this network. We will explore several topological properties, such as bipartiteness, Hamiltonianicity, and symmetry property. These properties for the KCube graph are very useful to develop efficient algorithms on this network. We will then study the KCube network from the algorithmic point of view, and will give an improved routing algorithm. In addition, we will present two optimal broadcasting algorithms. They are fundamental algorithms to many applications. A literature review of the state of the art network designs in relation to the KCube network as well as some open problems in this field will also be given.
Resumo:
The KCube interconnection network was first introduced in 2010 in order to exploit the good characteristics of two well-known interconnection networks, the hypercube and the Kautz graph. KCube links up multiple processors in a communication network with high density for a fixed degree. Since the KCube network is newly proposed, much study is required to demonstrate its potential properties and algorithms that can be designed to solve parallel computation problems. In this thesis we introduce a new methodology to construct the KCube graph. Also, with regard to this new approach, we will prove its Hamiltonicity in the general KC(m; k). Moreover, we will find its connectivity followed by an optimal broadcasting scheme in which a source node containing a message is to communicate it with all other processors. In addition to KCube networks, we have studied a version of the routing problem in the traditional hypercube, investigating this problem: whether there exists a shortest path in a Qn between two nodes 0n and 1n, when the network is experiencing failed components. We first conditionally discuss this problem when there is a constraint on the number of faulty nodes, and subsequently introduce an algorithm to tackle the problem without restrictions on the number of nodes.
Resumo:
Le problème de tarification qui nous intéresse ici consiste à maximiser le revenu généré par les usagers d'un réseau de transport. Pour se rendre à leurs destinations, les usagers font un choix de route et utilisent des arcs sur lesquels nous imposons des tarifs. Chaque route est caractérisée (aux yeux de l'usager) par sa "désutilité", une mesure de longueur généralisée tenant compte à la fois des tarifs et des autres coûts associés à son utilisation. Ce problème a surtout été abordé sous une modélisation déterministe de la demande selon laquelle seules des routes de désutilité minimale se voient attribuer une mesure positive de flot. Le modèle déterministe se prête bien à une résolution globale, mais pèche par manque de réalisme. Nous considérons ici une extension probabiliste de ce modèle, selon laquelle les usagers d'un réseau sont alloués aux routes d'après un modèle de choix discret logit. Bien que le problème de tarification qui en résulte est non linéaire et non convexe, il conserve néanmoins une forte composante combinatoire que nous exploitons à des fins algorithmiques. Notre contribution se répartit en trois articles. Dans le premier, nous abordons le problème d'un point de vue théorique pour le cas avec une paire origine-destination. Nous développons une analyse de premier ordre qui exploite les propriétés analytiques de l'affectation logit et démontrons la validité de règles de simplification de la topologie du réseau qui permettent de réduire la dimension du problème sans en modifier la solution. Nous établissons ensuite l'unimodalité du problème pour une vaste gamme de topologies et nous généralisons certains de nos résultats au problème de la tarification d'une ligne de produits. Dans le deuxième article, nous abordons le problème d'un point de vue numérique pour le cas avec plusieurs paires origine-destination. Nous développons des algorithmes qui exploitent l'information locale et la parenté des formulations probabilistes et déterministes. Un des résultats de notre analyse est l'obtention de bornes sur l'erreur commise par les modèles combinatoires dans l'approximation du revenu logit. Nos essais numériques montrent qu'une approximation combinatoire rudimentaire permet souvent d'identifier des solutions quasi-optimales. Dans le troisième article, nous considérons l'extension du problème à une demande hétérogène. L'affectation de la demande y est donnée par un modèle de choix discret logit mixte où la sensibilité au prix d'un usager est aléatoire. Sous cette modélisation, l'expression du revenu n'est pas analytique et ne peut être évaluée de façon exacte. Cependant, nous démontrons que l'utilisation d'approximations non linéaires et combinatoires permet d'identifier des solutions quasi-optimales. Finalement, nous en profitons pour illustrer la richesse du modèle, par le biais d'une interprétation économique, et examinons plus particulièrement la contribution au revenu des différents groupes d'usagers.
Resumo:
Naïvement perçu, le processus d’évolution est une succession d’événements de duplication et de mutations graduelles dans le génome qui mènent à des changements dans les fonctions et les interactions du protéome. La famille des hydrolases de guanosine triphosphate (GTPases) similaire à Ras constitue un bon modèle de travail afin de comprendre ce phénomène fondamental, car cette famille de protéines contient un nombre limité d’éléments qui diffèrent en fonctionnalité et en interactions. Globalement, nous désirons comprendre comment les mutations singulières au niveau des GTPases affectent la morphologie des cellules ainsi que leur degré d’impact sur les populations asynchrones. Mon travail de maîtrise vise à classifier de manière significative différents phénotypes de la levure Saccaromyces cerevisiae via l’analyse de plusieurs critères morphologiques de souches exprimant des GTPases mutées et natives. Notre approche à base de microscopie et d’analyses bioinformatique des images DIC (microscopie d’interférence différentielle de contraste) permet de distinguer les phénotypes propres aux cellules natives et aux mutants. L’emploi de cette méthode a permis une détection automatisée et une caractérisation des phénotypes mutants associés à la sur-expression de GTPases constitutivement actives. Les mutants de GTPases constitutivement actifs Cdc42 Q61L, Rho5 Q91H, Ras1 Q68L et Rsr1 G12V ont été analysés avec succès. En effet, l’implémentation de différents algorithmes de partitionnement, permet d’analyser des données qui combinent les mesures morphologiques de population native et mutantes. Nos résultats démontrent que l’algorithme Fuzzy C-Means performe un partitionnement efficace des cellules natives ou mutantes, où les différents types de cellules sont classifiés en fonction de plusieurs facteurs de formes cellulaires obtenus à partir des images DIC. Cette analyse démontre que les mutations Cdc42 Q61L, Rho5 Q91H, Ras1 Q68L et Rsr1 G12V induisent respectivement des phénotypes amorphe, allongé, rond et large qui sont représentés par des vecteurs de facteurs de forme distincts. Ces distinctions sont observées avec différentes proportions (morphologie mutante / morphologie native) dans les populations de mutants. Le développement de nouvelles méthodes automatisées d’analyse morphologique des cellules natives et mutantes s’avère extrêmement utile pour l’étude de la famille des GTPases ainsi que des résidus spécifiques qui dictent leurs fonctions et réseau d’interaction. Nous pouvons maintenant envisager de produire des mutants de GTPases qui inversent leur fonction en ciblant des résidus divergents. La substitution fonctionnelle est ensuite détectée au niveau morphologique grâce à notre nouvelle stratégie quantitative. Ce type d’analyse peut également être transposé à d’autres familles de protéines et contribuer de manière significative au domaine de la biologie évolutive.
Resumo:
Parmi les méthodes d’estimation de paramètres de loi de probabilité en statistique, le maximum de vraisemblance est une des techniques les plus populaires, comme, sous des conditions l´egères, les estimateurs ainsi produits sont consistants et asymptotiquement efficaces. Les problèmes de maximum de vraisemblance peuvent être traités comme des problèmes de programmation non linéaires, éventuellement non convexe, pour lesquels deux grandes classes de méthodes de résolution sont les techniques de région de confiance et les méthodes de recherche linéaire. En outre, il est possible d’exploiter la structure de ces problèmes pour tenter d’accélerer la convergence de ces méthodes, sous certaines hypothèses. Dans ce travail, nous revisitons certaines approches classiques ou récemment d´eveloppées en optimisation non linéaire, dans le contexte particulier de l’estimation de maximum de vraisemblance. Nous développons également de nouveaux algorithmes pour résoudre ce problème, reconsidérant différentes techniques d’approximation de hessiens, et proposons de nouvelles méthodes de calcul de pas, en particulier dans le cadre des algorithmes de recherche linéaire. Il s’agit notamment d’algorithmes nous permettant de changer d’approximation de hessien et d’adapter la longueur du pas dans une direction de recherche fixée. Finalement, nous évaluons l’efficacité numérique des méthodes proposées dans le cadre de l’estimation de modèles de choix discrets, en particulier les modèles logit mélangés.
Resumo:
Dynamics of Nd:YAG laser with intracavity KTP crystal operating in two parallel polarized modes is investigated analytically and numerically. System equilibrium points were found out and the stability of each of them was checked using Routh–Hurwitz criteria and also by calculating the eigen values of the Jacobian. It is found that the system possesses three equilibrium points for (Ij, Gj), where j = 1, 2. One of these equilibrium points undergoes Hopf bifurcation in output dynamics as the control parameter is increased. The other two remain unstable throughout the entire region of the parameter space. Our numerical analysis of the Hopf bifurcation phenomena is found to be in good agreement with the analytical results. Nature of energy transfer between the two modes is also studied numerically.
Resumo:
The motion instability is an important issue that occurs during the operation of towed underwater vehicles (TUV), which considerably affects the accuracy of high precision acoustic instrumentations housed inside the same. Out of the various parameters responsible for this, the disturbances from the tow-ship are the most significant one. The present study focus on the motion dynamics of an underwater towing system with ship induced disturbances as the input. The study focus on an innovative system called two-part towing. The methodology involves numerical modeling of the tow system, which consists of modeling of the tow-cables and vehicles formulation. Previous study in this direction used a segmental approach for the modeling of the cable. Even though, the model was successful in predicting the heave response of the tow-body, instabilities were observed in the numerical solution. The present study devises a simple approach called lumped mass spring model (LMSM) for the cable formulation. In this work, the traditional LMSM has been modified in two ways. First, by implementing advanced time integration procedures and secondly, use of a modified beam model which uses only translational degrees of freedoms for solving beam equation. A number of time integration procedures, such as Euler, Houbolt, Newmark and HHT-α were implemented in the traditional LMSM and the strength and weakness of each scheme were numerically estimated. In most of the previous studies, hydrodynamic forces acting on the tow-system such as drag and lift etc. are approximated as analytical expression of velocities. This approach restricts these models to use simple cylindrical shaped towed bodies and may not be applicable modern tow systems which are diversed in shape and complexity. Hence, this particular study, hydrodynamic parameters such as drag and lift of the tow-system are estimated using CFD techniques. To achieve this, a RANS based CFD code has been developed. Further, a new convection interpolation scheme for CFD simulation, called BNCUS, which is blend of cell based and node based formulation, was proposed in the study and numerically tested. To account for the fact that simulation takes considerable time in solving fluid dynamic equations, a dedicated parallel computing setup has been developed. Two types of computational parallelisms are explored in the current study, viz; the model for shared memory processors and distributed memory processors. In the present study, shared memory model was used for structural dynamic analysis of towing system, distributed memory one was devised in solving fluid dynamic equations.
Resumo:
Distributed systems are one of the most vital components of the economy. The most prominent example is probably the internet, a constituent element of our knowledge society. During the recent years, the number of novel network types has steadily increased. Amongst others, sensor networks, distributed systems composed of tiny computational devices with scarce resources, have emerged. The further development and heterogeneous connection of such systems imposes new requirements on the software development process. Mobile and wireless networks, for instance, have to organize themselves autonomously and must be able to react to changes in the environment and to failing nodes alike. Researching new approaches for the design of distributed algorithms may lead to methods with which these requirements can be met efficiently. In this thesis, one such method is developed, tested, and discussed in respect of its practical utility. Our new design approach for distributed algorithms is based on Genetic Programming, a member of the family of evolutionary algorithms. Evolutionary algorithms are metaheuristic optimization methods which copy principles from natural evolution. They use a population of solution candidates which they try to refine step by step in order to attain optimal values for predefined objective functions. The synthesis of an algorithm with our approach starts with an analysis step in which the wanted global behavior of the distributed system is specified. From this specification, objective functions are derived which steer a Genetic Programming process where the solution candidates are distributed programs. The objective functions rate how close these programs approximate the goal behavior in multiple randomized network simulations. The evolutionary process step by step selects the most promising solution candidates and modifies and combines them with mutation and crossover operators. This way, a description of the global behavior of a distributed system is translated automatically to programs which, if executed locally on the nodes of the system, exhibit this behavior. In our work, we test six different ways for representing distributed programs, comprising adaptations and extensions of well-known Genetic Programming methods (SGP, eSGP, and LGP), one bio-inspired approach (Fraglets), and two new program representations called Rule-based Genetic Programming (RBGP, eRBGP) designed by us. We breed programs in these representations for three well-known example problems in distributed systems: election algorithms, the distributed mutual exclusion at a critical section, and the distributed computation of the greatest common divisor of a set of numbers. Synthesizing distributed programs the evolutionary way does not necessarily lead to the envisaged results. In a detailed analysis, we discuss the problematic features which make this form of Genetic Programming particularly hard. The two Rule-based Genetic Programming approaches have been developed especially in order to mitigate these difficulties. In our experiments, at least one of them (eRBGP) turned out to be a very efficient approach and in most cases, was superior to the other representations.
Resumo:
The Scheme86 and the HP Precision Architectures represent different trends in computer processor design. The former uses wide micro-instructions, parallel hardware, and a low latency memory interface. The latter encourages pipelined implementation and visible interlocks. To compare the merits of these approaches, algorithms frequently encountered in numerical and symbolic computation were hand-coded for each architecture. Timings were done in simulators and the results were evaluated to determine the speed of each design. Based on these measurements, conclusions were drawn as to which aspects of each architecture are suitable for a high- performance computer.
Resumo:
This work demonstrates how partial evaluation can be put to practical use in the domain of high-performance numerical computation. I have developed a technique for performing partial evaluation by using placeholders to propagate intermediate results. For an important class of numerical programs, a compiler based on this technique improves performance by an order of magnitude over conventional compilation techniques. I show that by eliminating inherently sequential data-structure references, partial evaluation exposes the low-level parallelism inherent in a computation. I have implemented several parallel scheduling and analysis programs that study the tradeoffs involved in the design of an architecture that can effectively utilize this parallelism. I present these results using the 9- body gravitational attraction problem as an example.
Resumo:
The Kineticist's Workbench is a program that simulates chemical reaction mechanisms by predicting, generating, and interpreting numerical data. Prior to simulation, it analyzes a given mechanism to predict that mechanism's behavior; it then simulates the mechanism numerically; and afterward, it interprets and summarizes the data it has generated. In performing these tasks, the Workbench uses a variety of techniques: graph- theoretic algorithms (for analyzing mechanisms), traditional numerical simulation methods, and algorithms that examine simulation results and reinterpret them in qualitative terms. The Workbench thus serves as a prototype for a new class of scientific computational tools---tools that provide symbiotic collaborations between qualitative and quantitative methods.
Resumo:
A key capability of data-race detectors is to determine whether one thread executes logically in parallel with another or whether the threads must operate in series. This paper provides two algorithms, one serial and one parallel, to maintain series-parallel (SP) relationships "on the fly" for fork-join multithreaded programs. The serial SP-order algorithm runs in O(1) amortized time per operation. In contrast, the previously best algorithm requires a time per operation that is proportional to Tarjan’s functional inverse of Ackermann’s function. SP-order employs an order-maintenance data structure that allows us to implement a more efficient "English-Hebrew" labeling scheme than was used in earlier race detectors, which immediately yields an improved determinacy-race detector. In particular, any fork-join program running in T₁ time on a single processor can be checked on the fly for determinacy races in O(T₁) time. Corresponding improved bounds can also be obtained for more sophisticated data-race detectors, for example, those that use locks. By combining SP-order with Feng and Leiserson’s serial SP-bags algorithm, we obtain a parallel SP-maintenance algorithm, called SP-hybrid. Suppose that a fork-join program has n threads, T₁ work, and a critical-path length of T[subscript â]. When executed on P processors, we prove that SP-hybrid runs in O((T₁/P + PT[subscript â]) lg n) expected time. To understand this bound, consider that the original program obtains linear speed-up over a 1-processor execution when P = O(T₁/T[subscript â]). In contrast, SP-hybrid obtains linear speed-up when P = O(√T₁/T[subscript â]), but the work is increased by a factor of O(lg n).
Resumo:
This paper proposes a parallel architecture for estimation of the motion of an underwater robot. It is well known that image processing requires a huge amount of computation, mainly at low-level processing where the algorithms are dealing with a great number of data. In a motion estimation algorithm, correspondences between two images have to be solved at the low level. In the underwater imaging, normalised correlation can be a solution in the presence of non-uniform illumination. Due to its regular processing scheme, parallel implementation of the correspondence problem can be an adequate approach to reduce the computation time. Taking into consideration the complexity of the normalised correlation criteria, a new approach using parallel organisation of every processor from the architecture is proposed
Resumo:
Virtual tools are commonly used nowadays to optimize product design and manufacturing process of fibre reinforced composite materials. The present work focuses on two areas of interest to forecast the part performance and the production process particularities. The first part proposes a multi-physical optimization tool to support the concept stage of a composite part. The strategy is based on the strategic handling of information and, through a single control parameter, is able to evaluate the effects of design variations throughout all these steps in parallel. The second part targets the resin infusion process and the impact of thermal effects. The numerical and experimental approach allowed the identificationof improvement opportunities regarding the implementation of algorithms in commercially available simulation software.