981 results for SPANNING TREE PROBLEM


Relevance: 80.00%

Abstract:

This paper introduces a probability model, the mixture of trees, that can account for sparse, dynamically changing dependence relationships. We present a family of efficient algorithms that use EM and the Minimum Spanning Tree algorithm to find the maximum-likelihood (ML) and maximum a posteriori (MAP) mixture of trees for a variety of priors, including the Dirichlet and the MDL priors. We also show that the single-tree classifier acts as an implicit feature selector, making the classification performance insensitive to irrelevant attributes. Experimental results demonstrate the excellent performance of the new model in both density estimation and classification.
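
The tree-fitting step inside each EM iteration is, in essence, the classical Chow-Liu construction: a maximum-weight spanning tree over pairwise mutual information. The sketch below illustrates that step only, not the authors' full EM algorithm; the data and function names are hypothetical, and numpy/scipy are assumed available.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

def mutual_information(x, y):
    """Empirical mutual information (in nats) between two discrete sequences."""
    joint, _, _ = np.histogram2d(x, y, bins=(len(set(x)), len(set(y))))
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (px @ py)[nz])).sum())

def chow_liu_edges(data):
    """Edges of the maximum-mutual-information spanning tree over the columns."""
    d = data.shape[1]
    w = np.zeros((d, d))
    for i in range(d):
        for j in range(i + 1, d):
            w[i, j] = mutual_information(data[:, i], data[:, j])
    # scipy computes a *minimum* spanning tree, so negate the MI weights.
    mst = minimum_spanning_tree(-w)
    return list(zip(*mst.nonzero()))

rng = np.random.default_rng(0)
x0 = rng.integers(0, 2, 500)
noisy = x0 ^ (rng.random(500) < 0.1).astype(np.int64)   # column 1 tracks column 0
data = np.column_stack([x0, noisy, rng.integers(0, 2, 500)])
print(chow_liu_edges(data))  # the strongest edge links columns 0 and 1
```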

Relevance: 80.00%

Abstract:

Early American crania show a different morphological pattern from the one shared by late Native Americans. Although the origin of the diachronic morphological diversity seen on the continents is still debated, the distinct morphology of early Americans is well documented and widely dispersed. This morphology has been described extensively for South America, where larger samples are available. Here we test the hypotheses that the morphology of Early Americans results from retention of the morphological pattern of Late Pleistocene modern humans and that the occupation of the New World precedes the morphological differentiation that gave rise to recent Eurasian and American morphology. We compare Early American samples with European Upper Paleolithic skulls, the East Asian Zhoukoudian Upper Cave specimens and a series of 20 modern human reference crania. Canonical Analysis and Minimum Spanning Tree were used to assess the morphological affinities among the series, while Mantel and Dow-Cheverud tests based on Mahalanobis Squared Distances were used to test different evolutionary scenarios. Our results show strong morphological affinities among the early series irrespective of geographical origin, which together with the matrix analyses results favor the scenario of a late morphological differentiation of modern humans. We conclude that the geographic differentiation of modern human morphology is a late phenomenon that occurred after the initial settlement of the Americas. Am J Phys Anthropol 144:442-453, 2011.
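
To illustrate the kind of minimum-spanning-tree analysis used here, the sketch below builds an MST over a matrix of Mahalanobis squared distances with scipy. The labels and distance values are invented stand-ins, not the study's data.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

# Hypothetical Mahalanobis D^2 matrix between five cranial series
# (real values would come from the pooled within-group covariance).
labels = ["EarlySA", "EarlyNA", "UpperPaleo", "UpperCave", "ModernRef"]
d2 = np.array([
    [0.0, 1.1, 1.4, 1.6, 3.2],
    [1.1, 0.0, 1.5, 1.7, 3.0],
    [1.4, 1.5, 0.0, 1.2, 2.8],
    [1.6, 1.7, 1.2, 0.0, 2.9],
    [3.2, 3.0, 2.8, 2.9, 0.0],
])

mst = minimum_spanning_tree(np.triu(d2))  # upper triangle: one weight per pair
for i, j in zip(*mst.nonzero()):
    print(f"{labels[i]} -- {labels[j]}: D^2 = {d2[i, j]}")
```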

Relevance: 80.00%

Abstract:

Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)

Relevance: 80.00%

Abstract:

In general, pattern recognition techniques require a high computational burden to learn the discriminating functions that separate samples from distinct classes. Consequently, several studies have sought to apply machine learning algorithms to big data classification problems. Research in this area ranges from Graphics Processing Unit-based implementations to mathematical optimizations, the main drawback of the former being their dependence on the graphics card. Here, we propose an architecture-independent optimization of the optimum-path forest (OPF) classifier, designed using a theoretical formulation that relates the minimum spanning tree to the minimum spanning forest generated by OPF over the training dataset. Experiments on five public datasets have shown that the proposed approach can be faster than the traditional one while remaining as accurate as the original OPF.
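
The MST/forest relation behind the optimization can be illustrated as follows: if one computes the minimum spanning tree of the complete graph over the training samples, the endpoints of edges joining different classes are exactly the prototypes from which OPF grows its minimum spanning forest. A minimal sketch assuming Euclidean edge weights and scipy; it is not the authors' optimized implementation.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform

def opf_prototypes(X, y):
    """Endpoints of MST edges joining different classes: these become the
    roots (prototypes) of the minimum spanning forest used by OPF."""
    d = squareform(pdist(X))
    mst = minimum_spanning_tree(np.triu(d))
    protos = set()
    for i, j in zip(*mst.nonzero()):
        if y[i] != y[j]:
            protos.update((int(i), int(j)))
    return sorted(protos)

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(4, 1, (20, 2))])
y = np.array([0] * 20 + [1] * 20)
print(opf_prototypes(X, y))  # samples on the boundary between the two classes
```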

Relevance: 80.00%

Abstract:

In this thesis we study three combinatorial optimization problems, belonging to the classes of Network Design and Vehicle Routing problems, that are strongly linked in the context of the design and management of transportation networks: the Non-Bifurcated Capacitated Network Design Problem (NBP), the Period Vehicle Routing Problem (PVRP) and the Pickup and Delivery Problem with Time Windows (PDPTW). These problems are NP-hard and contain as special cases some well-known difficult problems such as the Traveling Salesman Problem and the Steiner Tree Problem. Moreover, they model the core structure of many practical problems arising in logistics and telecommunications.

The NBP is the problem of designing the optimum network to satisfy a given set of traffic demands. Given a set of nodes, a set of potential links and a set of point-to-point demands called commodities, the objective is to select the links to install and dimension their capacities so that all the demands can be routed between their respective endpoints, and the sum of link fixed costs and commodity routing costs is minimized. The problem is called non-bifurcated because the solution network must allow each demand to follow a single path, i.e., the flow of each demand cannot be split. Although this is the case in many real applications, the NBP has received significantly less attention in the literature than other capacitated network design problems that allow bifurcation. We describe an exact algorithm for the NBP, based on solving with an integer programming solver a formulation of the problem strengthened by simple valid inequalities, and four new heuristic algorithms. One of these heuristics is an adaptive memory metaheuristic, based on partial enumeration, that could be applied to a wider class of structured combinatorial optimization problems.

In the PVRP a fleet of vehicles of identical capacity must be used to service a set of customers over a planning period of several days. Each customer specifies a service frequency, a set of allowable day-combinations and a quantity of product that the customer must receive at every visit. For example, a customer may require to be visited twice during a 5-day period, imposing that these visits take place on Monday-Thursday, Monday-Friday or Tuesday-Friday. The problem consists in simultaneously assigning a day-combination to each customer and designing the vehicle routes for each day so that each customer is visited the required number of times, the number of routes on each day does not exceed the number of vehicles available, and the total cost of the routes over the period is minimized. We also consider a tactical variant of this problem, called the Tactical Planning Vehicle Routing Problem, where customers require to be visited on a specific day of the period but a penalty, called the service cost, can be paid to postpone the visit to a later day. To our knowledge, all the algorithms proposed in the literature for the PVRP are heuristics. In this thesis we present the first exact algorithm for the PVRP, based on different relaxations of a set partitioning-like formulation. The effectiveness of the proposed algorithm is tested on a set of instances from the literature and on a new set of instances.

Finally, the PDPTW consists of servicing a set of transportation requests using a fleet of identical vehicles of limited capacity located at a central depot. Each request specifies a pickup location and a delivery location, and requires that a given quantity of load be transported from the pickup location to the delivery location. Moreover, each location can be visited only within an associated time window. Each vehicle can perform at most one route, and the problem is to satisfy all the requests using the available vehicles so that each request is serviced by a single vehicle, the load on each vehicle never exceeds its capacity, and all locations are visited within their time windows. We formulate the PDPTW as a set partitioning-like problem with additional cuts, and we propose an exact algorithm based on different relaxations of the mathematical formulation together with a branch-and-cut-and-price algorithm. The new algorithm is tested on two classes of problems from the literature and compared with a recent branch-and-cut-and-price algorithm from the literature.
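
To make the NBP formulation concrete, here is a toy single-path (non-bifurcated) network design model with binary arc-installation and arc-commodity routing variables. The instance data are invented, and PuLP with its bundled CBC solver is an assumption; the thesis' strengthened formulation and valid inequalities are not reproduced.

```python
from pulp import LpBinary, LpMinimize, LpProblem, LpVariable, lpSum

# Toy instance: arcs mapped to (capacity, fixed cost, unit routing cost),
# commodities mapped to (origin, destination, demand). Hypothetical data.
arcs = {("s", "a"): (10, 5, 1), ("s", "b"): (10, 5, 1),
        ("a", "t"): (10, 5, 1), ("b", "t"): (10, 5, 1)}
commodities = {"k1": ("s", "t", 6), "k2": ("s", "t", 3)}
nodes = {n for arc in arcs for n in arc}

m = LpProblem("toy_NBP", LpMinimize)
y = {a: LpVariable(f"open_{a[0]}{a[1]}", cat=LpBinary) for a in arcs}
x = {(a, k): LpVariable(f"use_{a[0]}{a[1]}_{k}", cat=LpBinary)
     for a in arcs for k in commodities}

# Objective: fixed installation costs plus per-unit routing costs.
m += (lpSum(arcs[a][1] * y[a] for a in arcs)
      + lpSum(arcs[a][2] * commodities[k][2] * x[a, k]
              for a in arcs for k in commodities))

# Single-path (non-bifurcated) flow conservation for each commodity.
for k, (orig, dest, dem) in commodities.items():
    for n in nodes:
        rhs = 1 if n == orig else (-1 if n == dest else 0)
        m += (lpSum(x[a, k] for a in arcs if a[0] == n)
              - lpSum(x[a, k] for a in arcs if a[1] == n)) == rhs

# Capacity is available only on installed arcs.
for a, (cap, _, _) in arcs.items():
    m += lpSum(commodities[k][2] * x[a, k] for k in commodities) <= cap * y[a]

m.solve()
print("installed:", [a for a in arcs if y[a].value() > 0.5])
```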

Relevance: 80.00%

Abstract:

Non-equilibrium statistical mechanics is a broad subject. Broadly speaking, it deals with systems that have not yet relaxed to an equilibrium state, with systems in a steady non-equilibrium state, or with more general situations. Such systems are characterized by external forcing and internal fluxes, resulting in a net production of entropy that quantifies dissipation and the extent to which, by the Second Law of Thermodynamics, time-reversal invariance is broken. In this thesis we discuss some of the mathematical structures involved with generic discrete-state-space non-equilibrium systems, which we depict with networks that are in every respect analogous to electrical networks. We define suitable observables and derive their linear-regime relationships; we discuss a duality between external and internal observables that reverses the roles of the system and of the environment; and we show that network observables serve as constraints for a derivation of the minimum entropy production principle. We dwell on deep combinatorial aspects regarding linear response determinants, which are related to spanning tree polynomials in graph theory, and we give a geometrical interpretation of observables in terms of Wilson loops of a connection and gauge degrees of freedom. We specialize the formalism to continuous-time Markov chains, give a physical interpretation for observables in terms of locally detailed balanced rates, prove many variants of the fluctuation theorem, and show that a well-known expression for the entropy production due to Schnakenberg descends from considerations of gauge invariance, where the gauge symmetry is related to the freedom in the choice of a prior probability distribution. As an additional topic of geometrical flavor related to continuous-time Markov chains, we discuss the Fisher-Rao geometry of nonequilibrium decay modes, showing that the Fisher matrix contains information about many aspects of non-equilibrium behavior, including non-equilibrium phase transitions and superposition of modes. We establish a sort of statistical equivalence principle and discuss the behavior of the Fisher matrix under time reversal. To conclude, we propose that geometry and combinatorics might greatly increase our understanding of nonequilibrium phenomena.
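
The spanning-tree polynomials that appear in such linear-response determinants are the ones in Kirchhoff's matrix-tree theorem: any cofactor of the weighted graph Laplacian equals the sum, over spanning trees, of the product of edge weights. A quick numerical check with numpy on an invented 4-node cycle:

```python
import numpy as np

# Weighted adjacency of a 4-node cycle (symmetric rates / conductances).
W = np.array([[0, 2, 0, 1],
              [2, 0, 3, 0],
              [0, 3, 0, 4],
              [1, 0, 4, 0]], dtype=float)
L = np.diag(W.sum(axis=1)) - W  # weighted graph Laplacian

# Matrix-tree theorem: any cofactor of L equals the spanning-tree polynomial.
cofactor = np.linalg.det(np.delete(np.delete(L, 0, axis=0), 0, axis=1))
print(cofactor)  # 3*4*1 + 2*4*1 + 2*3*1 + 2*3*4 = 50 (drop one cycle edge each)
```

For this cycle the four spanning trees are obtained by deleting one edge each, and the sum of their weight products is 50, matching the cofactor.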

Relevance: 80.00%

Abstract:

Atmospheric aerosol particles affect people and the environment in many ways. A precise characterization of the particles helps in understanding their effects and assessing the consequences. Particles can be characterized by their size, their shape and their chemical composition. Laser ablation mass spectrometry makes it possible to determine the size and the chemical composition of individual aerosol particles. In this work, the SPLAT (Single Particle Laser Ablation Time-of-flight mass spectrometer) was further developed for improved analysis of atmospheric aerosol particles in particular. The aerosol inlet was optimized to transfer the widest possible particle size range (80 nm - 3 µm) into the SPLAT and to focus the particles into a narrow beam. A new description of the relationship between particle size and particle velocity in vacuum was found. The alignment of the inlet was automated using stepper motors. The optical detection of the particles was improved so that particles smaller than 100 nm can be detected. Building on the optical detection and the automatic tilting of the inlet, a new method for characterizing the particle beam was developed. The control electronics of the SPLAT were improved so that the maximum analysis frequency is limited only by the ablation laser, which can ablate at a rate of at most about 10 Hz. Optimization of the vacuum system reduced the ion loss in the mass spectrometer by a factor of 4.

Besides these hardware developments of the SPLAT, a major part of this work consisted of designing and implementing a software solution for analyzing the raw data obtained with the SPLAT. CRISP (Concise Retrieval of Information from Single Particles) is a software package built on IGOR PRO (Wavemetrics, USA) that allows efficient evaluation of the single-particle raw data. CRISP contains a newly developed algorithm for automatic mass calibration of each individual mass spectrum, including the suppression of noise and of problems with signals that exhibit intense tailing. CRISP provides methods for automatic classification of the particles: k-means, fuzzy-c-means, and a form of hierarchical clustering based on a minimum spanning tree are implemented. CRISP offers the possibility to preprocess the data so that the automatic classification of the particles runs faster and yields higher-quality results. In addition, CRISP can easily sort particles according to predefined criteria. The data structures and infrastructure underlying CRISP were designed with maintenance and extensibility in mind.

In the course of this work the SPLAT was successfully deployed in several campaigns, and the capabilities of CRISP were demonstrated on the datasets obtained.

The SPLAT is now capable of operating efficiently in the field to characterize the atmospheric aerosol, while CRISP enables fast and targeted evaluation of the data.
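
The MST-based hierarchical grouping mentioned above is, in spirit, single-linkage clustering: build the minimum spanning tree over the particles and cut the k-1 heaviest edges to obtain k clusters. A minimal sketch with scipy, using random feature vectors in place of real mass spectra; this illustrates the idea only and is not CRISP's IGOR PRO code.

```python
import numpy as np
from scipy.sparse.csgraph import connected_components, minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform

def mst_clusters(features, k):
    """Single-linkage-style clustering: build the MST over the samples,
    then delete the k-1 heaviest edges; the components are the clusters."""
    mst = minimum_spanning_tree(np.triu(squareform(pdist(features)))).toarray()
    cutoff = np.sort(mst[mst > 0])[-(k - 1)]   # weight of the (k-1)-th heaviest edge
    mst[mst >= cutoff] = 0                     # cut those edges
    return connected_components(mst, directed=False)[1]

rng = np.random.default_rng(2)
spectra = np.vstack([rng.normal(c, 0.3, (15, 8)) for c in (0, 3, 6)])
print(mst_clusters(spectra, 3))  # three groups of 15 identical labels each
```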

Relevance: 80.00%

Abstract:

Camels are the most valuable livestock species in the Horn of Africa and play a pivotal role in the nutritional sustainability for millions of people. Their health status is therefore of utmost importance for the people living in this region. Streptococcus agalactiae, a Group B Streptococcus (GBS), is an important camel pathogen. Here we present the first epidemiological study based on genetic and phenotypic data from African camel derived GBS. Ninety-two GBS were characterized using multilocus sequence typing (MLST), capsular polysaccharide typing and in vitro antimicrobial susceptibility testing. We analysed the GBS using Bayesian linkage, phylogenetic and minimum spanning tree analyses and compared them with human GBS from East Africa in order to investigate the level of genetic exchange between GBS populations in the region. Camel GBS sequence types (STs) were distinct from other STs reported so far. We mapped specific STs and capsular types to major disease complexes caused by GBS. Widespread resistance (34%) to tetracycline was associated with acquisition of the tetM gene that is carried on a Tn916-like element, and observed primarily among GBS isolated from mastitis. The presence of tetM within different MLST clades suggests acquisition on multiple occasions. Wound infections and mastitis in camels associated with GBS are widespread and should ideally be treated with antimicrobials other than tetracycline in East Africa.
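
Minimum-spanning-tree analyses of MLST data are typically run on Hamming distances between allelic profiles, i.e. the number of the seven loci at which two isolates carry different alleles. A small illustration with invented profiles (not the study's isolates), assuming numpy/scipy:

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

# Hypothetical 7-locus MLST allelic profiles for five isolates.
profiles = np.array([[1, 3, 2, 1, 5, 2, 4],
                     [1, 3, 2, 1, 5, 2, 7],
                     [1, 3, 2, 6, 5, 2, 7],
                     [9, 8, 2, 6, 5, 1, 7],
                     [9, 8, 2, 6, 5, 1, 4]])

# Hamming distance: number of loci at which two profiles differ.
ham = (profiles[:, None, :] != profiles[None, :, :]).sum(axis=2)
mst = minimum_spanning_tree(np.triu(ham))
for i, j in zip(*mst.nonzero()):
    print(f"isolate {i} -- isolate {j}: {ham[i, j]} differing loci")
```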

Relevance: 80.00%

Abstract:

We propose a weakly supervised method to arrange images of a given category based on the relative pose between the camera and the object in the scene. Relative poses are points on a sphere centered at the object in a given canonical pose, which we call object viewpoints. Our method builds a graph on this sphere by assigning images with similar viewpoints to the same node and by connecting nodes that are related by a small rotation. The key idea is to exploit a large unlabeled dataset to validate the likelihood of dominant 3D planes of the object geometry. A number of 3D plane hypotheses are evaluated by applying small 3D rotations to each hypothesis and by measuring how well the deformed images match other images in the dataset. Correct hypotheses will result in deformed images that correspond to plausible views of the object, and will thus likely match other images in the same category well. The identified 3D planes are then used to compute affinities between images related by a change of viewpoint. We then use the affinities to build a view graph via a greedy method and the maximum spanning tree.
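
The final step, building a maximum spanning tree over image affinities, reduces to a minimum spanning tree on negated weights. A minimal sketch with scipy and a made-up affinity matrix; the paper's greedy graph construction is not reproduced.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

# Hypothetical symmetric affinities between six image viewpoints.
rng = np.random.default_rng(3)
A = rng.random((6, 6))
A = (A + A.T) / 2
np.fill_diagonal(A, 0)

# A *maximum* spanning tree is the minimum spanning tree of the negated
# affinities; its edges connect the most mutually consistent viewpoints.
max_st = minimum_spanning_tree(-np.triu(A))
print(sorted(zip(*max_st.nonzero())))
```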

Relevance: 80.00%

Abstract:

Sampling a network with a given probability distribution has been identified as a useful operation. In this paper we propose distributed algorithms for sampling networks, so that nodes are selected by a special node, called the source, with a given probability distribution. All these algorithms are based on a new class of random walks, which we call Random Centrifugal Walks (RCW). A RCW is a random walk that starts at the source and always moves away from it. First, an algorithm to sample any connected network using RCW is proposed. The algorithm assumes that each node has a weight, so that the sampling process must select a node with probability proportional to its weight. This algorithm requires a preprocessing phase before the sampling of nodes: a minimum diameter spanning tree (MDST) is created in the network, and node weights are then efficiently aggregated using the tree. The good news is that the preprocessing is done only once, regardless of the number of sources and the number of samples taken from the network; after that, every sample is obtained with a RCW whose length is bounded by the network diameter. Second, RCW algorithms that do not require preprocessing are proposed for grids and for networks with regular concentric connectivity, for the case where the probability of selecting a node is a function of its distance to the source. The key features of the RCW algorithms (unlike previous Markovian approaches) are that (1) they do not need to warm up (stabilize), (2) the sampling always finishes in a number of hops bounded by the network diameter, and (3) nodes are selected with the exact probability distribution.
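
The walk itself is simple once subtree weights have been aggregated: at each node it stops with probability (own weight)/(subtree weight) and otherwise moves to a child chosen with probability proportional to the child's aggregated weight, which makes the probability of stopping at any node exactly proportional to that node's weight. A sketch on a hand-built rooted tree; the MDST construction and the distributed aggregation are omitted, and all names are hypothetical.

```python
import random

# A spanning tree rooted at the source, as a children map, plus node weights.
children = {"s": ["a", "b"], "a": ["c", "d"], "b": [], "c": [], "d": []}
weight = {"s": 1, "a": 2, "b": 3, "c": 2, "d": 2}
subtree = {}

def aggregate(node):
    """Preprocessing: total weight of the subtree rooted at each node."""
    subtree[node] = weight[node] + sum(aggregate(c) for c in children[node])
    return subtree[node]

aggregate("s")

def rcw(node="s"):
    """One Random Centrifugal Walk: always moves away from the source and
    returns a node with probability proportional to its weight."""
    if random.random() < weight[node] / subtree[node]:
        return node
    kids = children[node]
    nxt = random.choices(kids, weights=[subtree[c] for c in kids])[0]
    return rcw(nxt)

samples = [rcw() for _ in range(10000)]
print({n: samples.count(n) / 10000 for n in weight})  # approx weight[n] / 10
```

The telescoping check: the probability of reaching node v is subtree(v)/subtree(root), so the probability of stopping there is weight(v)/subtree(root), as required.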

Relevance: 80.00%

Abstract:

Evolutionary trees are often estimated from DNA or RNA sequence data. How much confidence should we have in the estimated trees? In 1985, Felsenstein [Felsenstein, J. (1985) Evolution 39, 783–791] suggested the use of the bootstrap to answer this question. Felsenstein’s method, which in concept is a straightforward application of the bootstrap, is widely used, but has been criticized as biased in the genetics literature. This paper concerns the use of the bootstrap in the tree problem. We show that Felsenstein’s method is not biased, but that it can be corrected to better agree with standard ideas of confidence levels and hypothesis testing. These corrections can be made by using the more elaborate bootstrap method presented here, at the expense of considerably more computation.
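
Schematically, Felsenstein's bootstrap resamples alignment columns (sites) with replacement and records how often the clade of interest reappears in the re-estimated tree. The toy version below replaces tree estimation with a stand-in test on pairwise distances, and the alignment is simulated; it illustrates the resampling scheme only, not the corrected method of this paper.

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy alignment: 4 sequences x 60 sites, coded 0-3 (A,C,G,T). Simulated data.
alignment = rng.integers(0, 4, (4, 60))
alignment[1] = alignment[0]                      # make sequences 0 and 1 similar
alignment[1, rng.choice(60, 8, replace=False)] = rng.integers(0, 4, 8)

def clade_01(aln):
    """Stand-in for tree estimation: do sequences 0 and 1 pair together,
    i.e. is d(0,1) smaller than every distance from {0,1} to {2,3}?"""
    d = lambda i, j: (aln[i] != aln[j]).mean()
    return d(0, 1) < min(d(i, j) for i in (0, 1) for j in (2, 3))

# Felsenstein's bootstrap: resample columns (sites) with replacement.
B = 1000
support = np.mean([clade_01(alignment[:, rng.integers(0, 60, 60)])
                   for _ in range(B)])
print(f"bootstrap support for clade (0,1): {support:.2f}")
```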

Relevance: 80.00%

Abstract:

Genetic diversity and population structure were investigated across the core range of Tasmanian devils (Sarcophilus laniarius; Dasyuridae), a wide-ranging marsupial carnivore restricted to the island of Tasmania. Heterozygosity (0.386-0.467) and allelic diversity (2.7-3.3) were low in all subpopulations and allelic size ranges were small and almost continuous, consistent with a founder effect. Island effects and repeated periods of low population density may also have contributed to the low variation. Within continuous habitat, gene flow appears extensive up to 50 km (high assignment rates to source or close neighbour populations; nonsignificant values of pairwise F-ST), in agreement with movement data. At larger scales (150-250 km), gene flow is reduced (significant pairwise F-ST) but there is no evidence for isolation by distance. The most substantial genetic structuring was observed for comparisons spanning unsuitable habitat, implying limited dispersal of devils between the well-connected, eastern populations and a smaller northwestern population. The genetic distinctiveness of the northwestern population was reflected in all analyses: unique alleles; multivariate analyses of gene frequency (multidimensional scaling, minimum spanning tree, nearest neighbour); high self-assignment (95%); two distinct populations for Tasmania were detected in isolation by distance and in Bayesian model-based clustering analyses. Marsupial carnivores appear to have stronger population subdivisions than their placental counterparts.

Relevance: 80.00%

Abstract:

Healthy brain functioning depends on efficient communication of information between brain regions, forming complex networks. By quantifying synchronisation between brain regions, a functionally connected brain network can be articulated. In neurodevelopmental disorders, where diagnosis is based on measures of behaviour and tasks, a measure of the underlying biological mechanisms holds promise as a potential clinical tool. Graph theory provides a tool for investigating the neural correlates of neuropsychiatric disorders, where there is disruption of efficient communication within and between brain networks. This research aimed to use recent conceptualisations of graph theory, along with measures of behaviour and cognitive functioning, to increase understanding of the neurobiological risk factors of atypical development. Using magnetoencephalography (MEG) to investigate frequency-specific temporal dynamics at rest, the research aimed to identify potential biological markers derived from sensor-level whole-brain functional connectivity. Whilst graph theory has proved valuable for insight into network efficiency, its application is hampered by two limitations: first, its measures have hardly been validated in MEG studies, and second, graph measures have been shown to depend on methodological assumptions that restrict direct network comparisons. The first experimental study (Chapter 3) addressed the first limitation by examining the reproducibility of graph-based functional connectivity and network parameters in healthy adult volunteers. Subsequent chapters addressed the second limitation through an adapted minimum spanning tree analysis (a network approach that allows unbiased group comparisons), along with the graph network tools shown in Chapter 3 to be highly reproducible. Network topologies were modelled in healthy development (Chapter 4) and in atypical neurodevelopment (Chapters 5 and 6). The results support the proposition that measures of network organisation derived from sensor-space MEG data offer insights that help unravel the biological basis of typical brain maturation and neurodevelopmental conditions, with the possibility of future clinical utility.
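
A typical sensor-space MST pipeline converts a synchronisation matrix into distances (strong coupling = short edge), extracts the N-1 tree links, and summarises the tree with size-independent metrics such as the leaf fraction. A rough sketch with scipy on a random stand-in connectivity matrix; the specific MEG coupling measure (e.g. a phase lag index) is an assumption here, not taken from the thesis.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

rng = np.random.default_rng(5)

# Stand-in for a sensor-space synchronisation matrix, values in (0, 1].
n = 12
C = rng.uniform(0.05, 1.0, (n, n))
C = (C + C.T) / 2
np.fill_diagonal(C, 0)

# Strong synchronisation = short distance, so the MST keeps the strongest links.
D = np.triu(C, k=1)
D[D > 0] = 1.0 / D[D > 0]
mst = minimum_spanning_tree(D).toarray()
adj = (mst > 0) | (mst > 0).T                  # symmetrise the tree edges

degree = adj.sum(axis=0)
leaf_fraction = (degree == 1).mean()           # size-independent MST metric
print(f"edges: {int(adj.sum() // 2)} (= n-1), leaf fraction: {leaf_fraction:.2f}")
```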

Relevance: 80.00%

Abstract:

An extended formulation of a polyhedron P is a linear description of a polyhedron Q together with a linear map π such that π(Q)=P. These objects are of fundamental importance in polyhedral combinatorics and optimization theory, and the subject of a number of studies. Yannakakis' factorization theorem (Yannakakis in J Comput Syst Sci 43(3):441–466, 1991) provides a surprising connection between extended formulations and communication complexity, showing that the smallest size of an extended formulation of P equals the nonnegative rank of its slack matrix S. Moreover, Yannakakis also shows that the nonnegative rank of S is at most 2^c, where c is the complexity of any deterministic protocol computing S. In this paper, we show that the latter result can be strengthened when we allow protocols to be randomized. In particular, we prove that the base-2 logarithm of the nonnegative rank of any nonnegative matrix equals the minimum complexity of a randomized communication protocol computing the matrix in expectation. Using Yannakakis' factorization theorem, this implies that the base-2 logarithm of the smallest size of an extended formulation of a polytope P equals the minimum complexity of a randomized communication protocol computing the slack matrix of P in expectation. We show that allowing randomization in the protocol can be crucial for obtaining small extended formulations. Specifically, we prove that for the spanning tree and perfect matching polytopes, small variance in the protocol forces large size in the extended formulation.
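
Restating the abstract's two central identities in symbols (S a nonnegative matrix, S_P the slack matrix of the polytope P, c(Π) the complexity of a protocol Π):

```latex
% For any nonnegative matrix S:
\log_2 \operatorname{rank}_+(S)
  \;=\; \min\bigl\{\, c(\Pi) \;:\; \Pi \text{ randomized protocol with }
        \mathbb{E}[\Pi(x,y)] = S_{x,y} \,\bigr\}

% Combined with Yannakakis' factorization theorem, xc(P) = rank_+(S_P):
\log_2 \mathrm{xc}(P)
  \;=\; \min\bigl\{\, c(\Pi) \;:\; \Pi \text{ computes } S_P
        \text{ in expectation} \,\bigr\}
```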