17 results for "Graph generators"

in Helda - Digital Repository of University of Helsinki


Relevance: 20.00%

Abstract:

A distributed system is a collection of networked autonomous processing units which must work in a cooperative manner. Currently, large-scale distributed systems, such as various telecommunication and computer networks, are abundant and used in a multitude of tasks. The field of distributed computing studies what can be computed efficiently in such systems. Distributed systems are usually modelled as graphs where nodes represent the processors and edges denote communication links between processors. This thesis concentrates on the computational complexity of the distributed graph colouring problem. The objective of the graph colouring problem is to assign a colour to each node in such a way that no two nodes connected by an edge share the same colour. In particular, it is often desirable to use only a small number of colours. This task is a fundamental symmetry-breaking primitive in various distributed algorithms. A graph that has been coloured in this manner using at most k different colours is said to be k-coloured. This work examines the synchronous message-passing model of distributed computation: every node runs the same algorithm, and the system operates in discrete synchronous communication rounds. During each round, a node can communicate with its neighbours and perform local computation. In this model, the time complexity of a problem is the number of synchronous communication rounds required to solve the problem. It is known that 3-colouring any k-coloured directed cycle requires at least ½(log* k - 3) communication rounds and is possible in ½(log* k + 7) communication rounds for all k ≥ 3. This work shows that for any k ≥ 3, colouring a k-coloured directed cycle with at most three colours is possible in ½(log* k + 3) rounds. In contrast, it is also shown that for some values of k, colouring a directed cycle with at most three colours requires at least ½(log* k + 1) communication rounds. Furthermore, in the case of directed rooted trees, reducing a k-colouring to a 3-colouring requires at least log* k + 1 rounds for some k and is possible in log* k + 3 rounds for all k ≥ 3. The new positive and negative results are derived using computational methods, as the existence of distributed colouring algorithms corresponds to the colourability of so-called neighbourhood graphs. The colourability of these graphs is analysed using Boolean satisfiability (SAT) solvers. Finally, this thesis shows that similar methods are applicable in capturing the existence of distributed algorithms for other graph problems, such as the maximal matching problem.
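
The abstract does not spell out how colour reduction on a directed cycle works. As background, here is a minimal Python sketch of one round of the classic Cole–Vishkin colour-reduction step (a standard textbook technique, not necessarily the thesis's own algorithm): every node uses only its own colour and its predecessor's colour, the colouring stays proper, and the number of colours drops to roughly 2·log₂ k.

```python
# One synchronous round of Cole-Vishkin colour reduction on a directed cycle.
# Each node sees only its own colour and its predecessor's colour; the new
# colouring remains proper and uses at most about 2*ceil(log2 k) colours.

def cole_vishkin_round(colours):
    """colours[i] is the colour of node i; node i's predecessor is node i-1."""
    n = len(colours)
    new_colours = []
    for i in range(n):
        own = colours[i]
        pred = colours[(i - 1) % n]
        # lowest bit position where the node's colour differs from its predecessor's
        diff = own ^ pred
        bit = (diff & -diff).bit_length() - 1
        # the new colour encodes that position and the node's own bit value there
        new_colours.append(2 * bit + ((own >> bit) & 1))
    return new_colours

def is_proper(colours):
    n = len(colours)
    return all(colours[i] != colours[(i + 1) % n] for i in range(n))

if __name__ == "__main__":
    c = [5, 12, 3, 9, 0, 7]          # a proper colouring of a directed 6-cycle
    assert is_proper(c)
    c2 = cole_vishkin_round(c)
    assert is_proper(c2)
    print(c, "->", c2)
```

Iterating this step a number of times proportional to log* k brings the palette down to a constant, which is the regime where the ½(log* k + O(1)) bounds quoted above apply.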

Relevance: 20.00%

Abstract:

Gene mapping is a systematic search for genes that affect observable characteristics of an organism. In this thesis we offer computational tools to improve the efficiency of (disease) gene-mapping efforts. In the first part of the thesis we propose an efficient simulation procedure for generating realistic genetic data from isolated populations. Simulated data is useful for evaluating hypothesised gene-mapping study designs and computational analysis tools. As an example of such evaluation, we demonstrate how a population-based study design can be a powerful alternative to traditional family-based designs in association-based gene-mapping projects. In the second part of the thesis we consider the prioritisation of a (typically large) set of putative disease-associated genes acquired from an initial gene-mapping analysis. Prioritisation is necessary to be able to focus on the most promising candidates. We show how to harness current biomedical knowledge for the prioritisation task by integrating various publicly available biological databases into a weighted biological graph. We then demonstrate how to find and evaluate connections between entities, such as genes and diseases, in this unified schema using graph mining techniques. Finally, in the last part of the thesis, we define the concept of a reliable subgraph and the corresponding subgraph extraction problem. Reliable subgraphs concisely describe strong and independent connections between two given vertices in a random graph, and hence they are especially useful for visualising such connections. We propose novel algorithms for extracting reliable subgraphs from large random graphs. The efficiency and scalability of the proposed graph mining methods are backed by extensive experiments on real data. While our application focus is in genetics, the concepts and algorithms can be applied to other domains as well. We demonstrate this generality by considering coauthor graphs in addition to biological graphs in the experiments.
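
The reliable-subgraph concept builds on connectivity in a random graph where each edge exists with a given probability. As a toy illustration only (not the thesis's extraction algorithms), the sketch below estimates two-terminal reliability by Monte Carlo sampling; the graph and probabilities are made up.

```python
import random
from collections import defaultdict, deque

# Toy two-terminal reliability estimator: each edge exists independently with
# its own probability; estimate Pr[source and target are connected].
# The example data below is hypothetical.

def connected(edges, source, target):
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen, queue = {source}, deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return target in seen

def reliability(prob_edges, source, target, samples=20000, seed=0):
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        surviving = [(u, v) for (u, v, p) in prob_edges if rng.random() < p]
        hits += connected(surviving, source, target)
    return hits / samples

if __name__ == "__main__":
    # e.g. gene (0) -- intermediate entities -- disease (3), with edge probabilities
    edges = [(0, 1, 0.9), (1, 3, 0.8), (0, 2, 0.5), (2, 3, 0.5), (0, 3, 0.1)]
    print(reliability(edges, source=0, target=3))
```

A reliable subgraph, in this spirit, is a small subgraph that preserves as much of this connection probability as possible, which makes the strongest independent paths easy to visualise.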

Relevance: 10.00%

Abstract:

The topic of this dissertation lies in the intersection of harmonic analysis and fractal geometry. We particularly consider singular integrals in Euclidean spaces with respect to general measures, and we study how the geometric structure of the measures affects certain analytic properties of the operators. The thesis consists of three research articles and an overview. In the first article we construct singular integral operators on lower dimensional Sierpinski gaskets associated with homogeneous Calderón-Zygmund kernels. While these operators are bounded, their principal values fail to exist almost everywhere. Conformal iterated function systems generate a broad range of fractal sets. In the second article we prove that many of these limit sets are porous in a very strong sense, by showing that they contain holes spread in every direction. We then connect these results with singular integrals: we exploit the fractal structure of these limit sets in order to establish that singular integrals associated with very general kernels converge weakly. Boundedness questions constitute a central topic of investigation in the theory of singular integrals. In the third article we study singular integrals of different measures. We prove a very general boundedness result in the case where the two underlying measures are separated by a Lipschitz graph. As a consequence we show that a certain weak convergence holds for a large class of singular integrals.
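
For orientation, the objects discussed above can be written in one display. This is the standard formulation of a truncated singular integral with respect to a measure and its principal value, not a definition copied from the thesis:

```latex
% Standard setup (not specific to the thesis): a singular integral with
% respect to a measure \mu and its principal value.
\[
  T_{\varepsilon}f(x) \;=\; \int_{|x-y|>\varepsilon} K(x-y)\, f(y)\, d\mu(y),
  \qquad
  \operatorname{p.v.}\, Tf(x) \;=\; \lim_{\varepsilon \to 0} T_{\varepsilon}f(x),
\]
where $K$ is a Calder\'on--Zygmund kernel, typically homogeneous of degree $-s$
when $\mu$ is an $s$-dimensional measure.
```

The first article's result says that the truncated operators stay bounded while the limit above fails to exist at almost every point of the gasket.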

Relevance: 10.00%

Abstract:

Metabolism is the cellular subsystem responsible for generation of energy from nutrients and production of building blocks for larger macromolecules. Computational and statistical modeling of metabolism is vital to many disciplines including bioengineering, the study of diseases, drug target identification, and understanding the evolution of metabolism. In this thesis, we propose efficient computational methods for metabolic modeling. The techniques presented are targeted particularly at the analysis of large metabolic models encompassing the whole metabolism of one or several organisms. We concentrate on three major themes of metabolic modeling: metabolic pathway analysis, metabolic reconstruction and the study of evolution of metabolism. In the first part of this thesis, we study metabolic pathway analysis. We propose a novel modeling framework called gapless modeling to study biochemically viable metabolic networks and pathways. In addition, we investigate the utilization of atom-level information on metabolism to improve the quality of pathway analyses. We describe efficient algorithms for discovering both gapless and atom-level metabolic pathways, and conduct experiments with large-scale metabolic networks. The presented gapless approach offers a compromise in terms of complexity and feasibility between the previous graph-theoretic and stoichiometric approaches to metabolic modeling. Gapless pathway analysis shows that microbial metabolic networks are not as robust to random damage as suggested by previous studies. Furthermore, the amino acid biosynthesis pathways of the fungal species Trichoderma reesei discovered from atom-level data are shown to closely correspond to those of Saccharomyces cerevisiae. In the second part, we propose computational methods for metabolic reconstruction in the gapless modeling framework. We study the task of reconstructing a metabolic network that does not suffer from connectivity problems. Such problems often limit the usability of reconstructed models, and typically require a significant amount of manual postprocessing. We formulate gapless metabolic reconstruction as an optimization problem and propose an efficient divide-and-conquer strategy to solve it with real-world instances. We also describe computational techniques for solving problems stemming from ambiguities in metabolite naming. These techniques have been implemented in ReMatch, a web-based software tool intended for the reconstruction of models for 13C metabolic flux analysis. In the third part, we extend our scope from single to multiple metabolic networks and propose an algorithm for inferring gapless metabolic networks of ancestral species from phylogenetic data. Experimenting with 16 fungal species, we show that the method is able to generate results that are easily interpretable and that provide hypotheses about the evolution of metabolism.
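
The abstract uses "gapless" without a formal definition. As a simplified illustration of the underlying idea (a reaction is usable only once all of its substrates are producible from the given nutrients), the following sketch computes the producible metabolites and usable reactions with a fixed-point iteration; the toy network and metabolite names are hypothetical and not taken from the thesis.

```python
# Simplified illustration of a "gapless" criterion: starting from seed
# nutrients, repeatedly add every reaction whose substrates are all already
# producible; its products then become producible as well.

def producible(reactions, seeds):
    """reactions: {name: (substrates, products)}; seeds: initially available metabolites."""
    available = set(seeds)
    usable = set()
    changed = True
    while changed:
        changed = False
        for name, (subs, prods) in reactions.items():
            if name not in usable and set(subs) <= available:
                usable.add(name)
                available |= set(prods)
                changed = True
    return available, usable

if __name__ == "__main__":
    toy = {
        "r1": (["glucose"], ["g6p"]),
        "r2": (["g6p"], ["pyruvate"]),
        "r3": (["pyruvate", "nh3"], ["alanine"]),
        "r4": (["mystery"], ["pyruvate"]),   # has a gap: its substrate is never producible
    }
    mets, rxns = producible(toy, seeds=["glucose", "nh3"])
    print(sorted(mets))    # 'mystery' never appears
    print(sorted(rxns))    # r4 stays unusable, i.e. the model has a gap there
```

A reconstruction is gapless, in this simplified sense, when every included reaction ends up in the usable set; the thesis formulates finding such a network as an optimization problem.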

Relevance: 10.00%

Abstract:

This thesis studies optimisation problems related to modern large-scale distributed systems, such as wireless sensor networks and wireless ad-hoc networks. The concrete tasks that we use as motivating examples are the following: (i) maximising the lifetime of a battery-powered wireless sensor network, (ii) maximising the capacity of a wireless communication network, and (iii) minimising the number of sensors in a surveillance application. A sensor node consumes energy both when it is transmitting or forwarding data, and when it is performing measurements. Hence task (i), lifetime maximisation, can be approached from two different perspectives. First, we can seek optimal data flows that make the most out of the energy resources available in the network; such optimisation problems are examples of so-called max-min linear programs. Second, we can conserve energy by putting redundant sensors into sleep mode; we arrive at the sleep scheduling problem, in which the objective is to find an optimal schedule that determines when each sensor node is asleep and when it is awake. In a wireless network, simultaneous radio transmissions may interfere with each other. Task (ii), capacity maximisation, therefore gives rise to another scheduling problem, the activity scheduling problem, in which the objective is to find a minimum-length conflict-free schedule that satisfies the data transmission requirements of all wireless communication links. Task (iii), minimising the number of sensors, is related to the classical graph problem of finding a minimum dominating set. However, if we are not only interested in detecting an intruder but also locating the intruder, it is not sufficient to solve the dominating set problem; formulations such as minimum-size identifying codes and locating–dominating codes are more appropriate. This thesis presents approximation algorithms for each of these optimisation problems, i.e., for max-min linear programs, sleep scheduling, activity scheduling, identifying codes, and locating–dominating codes. Two complementary approaches are taken. The main focus is on local algorithms, which are constant-time distributed algorithms. The contributions include local approximation algorithms for max-min linear programs, sleep scheduling, and activity scheduling. In the case of max-min linear programs, tight upper and lower bounds are proved for the best possible approximation ratio that can be achieved by any local algorithm. The second approach is the study of centralised polynomial-time algorithms in local graphs – these are geometric graphs whose structure exhibits spatial locality. Among other contributions, it is shown that while identifying codes and locating–dominating codes are hard to approximate in general graphs, they admit a polynomial-time approximation scheme in local graphs.
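
Max-min linear programs are named above but not written out. A common centralised formulation is to maximise ω subject to Ax ≤ 1, Cx ≥ ω·1 and x ≥ 0; the sketch below solves a made-up two-variable instance with SciPy (assuming scipy is available). It only illustrates the problem form, not the local distributed algorithms studied in the thesis.

```python
import numpy as np
from scipy.optimize import linprog

# Centralised max-min LP:  maximise omega  s.t.  A x <= 1,  C x >= omega * 1,  x >= 0.
# Variables are z = (x_1, ..., x_n, omega); linprog minimises, so we minimise -omega.
# A and C below form a hypothetical toy instance, not data from the thesis.

A = np.array([[1.0, 2.0],
              [2.0, 1.0]])      # packing constraints (e.g. per-node energy budgets)
C = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])      # covering rows whose minimum value we maximise

n = A.shape[1]
c = np.zeros(n + 1)
c[-1] = -1.0                                   # minimise -omega

# A x <= 1            ->  [ A | 0] z <= 1
A_ub_top = np.hstack([A, np.zeros((A.shape[0], 1))])
b_ub_top = np.ones(A.shape[0])
# C x >= omega * 1    ->  [-C | 1] z <= 0
A_ub_bot = np.hstack([-C, np.ones((C.shape[0], 1))])
b_ub_bot = np.zeros(C.shape[0])

res = linprog(c,
              A_ub=np.vstack([A_ub_top, A_ub_bot]),
              b_ub=np.concatenate([b_ub_top, b_ub_bot]),
              bounds=[(0, None)] * n + [(None, None)])
print("x =", res.x[:n], "omega =", res.x[-1])   # here the optimum is omega = 1/3
```

In the lifetime-maximisation reading, the rows of A encode energy budgets, the rows of C encode the utility each data sink receives, and ω is the worst-off sink's utility.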

Relevance: 10.00%

Abstract:

MEG directly measures neuronal events and has greater temporal resolution than fMRI, which has limited temporal resolution mainly due to the larger timescale of the hemodynamic response. On the other hand, fMRI has advantages in spatial resolution, while the localization results with MEG can be ambiguous due to the non-uniqueness of the electromagnetic inverse problem. Thus, these methods could provide complementary information and could be used to create both spatially and temporally accurate models of brain function. We investigated the degree of overlap, revealed by the two imaging methods, in areas involved in sensory or motor processing in healthy subjects and neurosurgical patients. Furthermore, we used the spatial information from fMRI to construct a spatiotemporal model of the MEG data in order to investigate the sensorimotor system and to create a spatiotemporal model of its function. We compared the localization results from MEG and fMRI with invasive electrophysiological cortical mapping. We used a recently introduced method, contextual clustering, for hypothesis testing of fMRI data and assessed the effect of using neighbourhood information on the reproducibility of fMRI results. Using MEG, we identified the ipsilateral primary sensorimotor cortex (SMI) as a novel source area contributing to the somatosensory evoked fields (SEF) to median nerve stimulation. Using combined MEG and fMRI measurements, we found that two separate areas in the lateral fissure may be the generators for the SEF responses from the secondary somatosensory cortex region. The two imaging methods indicated activation in corresponding locations. By using complementary information from MEG and fMRI we established a spatiotemporal model of somatosensory cortical processing. This spatiotemporal model of cerebral activity was in good agreement with results from several studies using invasive electrophysiological measurements and with anatomical studies in monkey and man concerning the connections between somatosensory areas. In neurosurgical patients, the MEG dipole model turned out to be more reliable than fMRI in the identification of the central sulcus. This was due to prominent activation in non-primary areas in fMRI, which in some cases led to erroneous or ambiguous localization of the central sulcus.

Relevance: 10.00%

Abstract:

We report a set of measurements of particle production in inelastic p̄p collisions collected with a minimum-bias trigger at the Tevatron Collider with the CDF II experiment. The inclusive charged particle transverse momentum differential cross section is measured, with improved precision, over a range about ten times wider than in previous measurements. Previous modeling of the spectrum appears to be incompatible with the high particle momenta observed. The dependence of the charged particle transverse momentum on the event particle multiplicity is analyzed to study the various components of hadron interactions. This is one of the observable variables most poorly reproduced by the available Monte Carlo generators. A first measurement of the event transverse energy sum differential cross section is also reported. A comparison with a Pythia prediction at the hadron level is performed. The inclusive charged particle differential production cross section is fairly well reproduced only in the transverse momentum range available from previous measurements. At higher momentum the agreement is poor. The transverse energy sum is poorly reproduced over the whole spectrum. The dependence of the charged particle transverse momentum on the particle multiplicity needs the introduction of more sophisticated particle production mechanisms, such as multiple parton interactions, in order to be better explained.

Relevance: 10.00%

Abstract:

We report a set of measurements of particle production in inelastic p̄p collisions collected with a minimum-bias trigger at the Tevatron Collider with the CDF II experiment. The inclusive charged particle transverse momentum differential cross section is measured, with improved precision, over a range about ten times wider than in previous measurements. Previous modeling of the spectrum appears to be incompatible with the high particle momenta observed. The dependence of the charged particle transverse momentum on the event particle multiplicity is analyzed to study the various components of hadron interactions. This is one of the observable variables most poorly reproduced by the available Monte Carlo generators. A first measurement of the event transverse energy sum differential cross section is also reported. A comparison with a Pythia prediction at the hadron level is performed. The inclusive charged particle differential production cross section is fairly well reproduced only in the transverse momentum range available from previous measurements. At higher momentum the agreement is poor. The transverse energy sum is poorly reproduced over the whole spectrum. The dependence of the charged particle transverse momentum on the particle multiplicity needs the introduction of more sophisticated particle production mechanisms, such as multiple parton interactions, in order to be better explained.

Relevance: 10.00%

Abstract:

An edge dominating set for a graph G is a set D of edges such that each edge of G is in D or adjacent to at least one edge in D. This work studies deterministic distributed approximation algorithms for finding minimum-size edge dominating sets. The focus is on anonymous port-numbered networks: there are no unique identifiers, but a node of degree d can refer to its neighbours by integers 1, 2, ..., d. The present work shows that in the port-numbering model, edge dominating sets can be approximated as follows: in d-regular graphs, to within 4 − 6/(d + 1) for an odd d and to within 4 − 2/d for an even d; and in graphs with maximum degree Δ, to within 4 − 2/(Δ − 1) for an odd Δ and to within 4 − 2/Δ for an even Δ. These approximation ratios are tight for all values of d and Δ: there are matching lower bounds.
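
The distributed port-numbering algorithms themselves are not reproduced here; the sketch below only illustrates the classical centralised fact behind approximation ratios of this flavour: any maximal matching is an edge dominating set, and its size is at most twice the minimum. The example graph is hypothetical.

```python
# Classical centralised background fact (not the distributed algorithms of the
# abstract): any maximal matching is an edge dominating set, and its size is at
# most twice that of a minimum edge dominating set.

def greedy_maximal_matching(edges):
    matched = set()
    matching = []
    for u, v in edges:
        if u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching

def is_edge_dominating(edges, dom):
    dom_nodes = {x for e in dom for x in e}
    return all(u in dom_nodes or v in dom_nodes for u, v in edges)

if __name__ == "__main__":
    # a small hypothetical graph given as an edge list
    edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (1, 3)]
    m = greedy_maximal_matching(edges)
    print(m, is_edge_dominating(edges, m))
```

The abstract's contribution is to pin down how close to this quality a deterministic algorithm can get when nodes have no identifiers and only port numbers.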

Relevance: 10.00%

Abstract:

We present a distributed algorithm that finds a maximal edge packing in O(Δ + log* W) synchronous communication rounds in a weighted graph, independent of the number of nodes in the network; here Δ is the maximum degree of the graph and W is the maximum weight. As a direct application, we have a distributed 2-approximation algorithm for minimum-weight vertex cover, with the same running time. We also show how to find an f-approximation of minimum-weight set cover in O(f²k² + fk log* W) rounds; here k is the maximum size of a subset in the set cover instance, f is the maximum frequency of an element, and W is the maximum weight of a subset. The algorithms are deterministic, and they can be applied in anonymous networks.
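
The step from a maximal edge packing to a 2-approximate vertex cover rests on a standard primal-dual argument. The sequential sketch below (not the distributed O(Δ + log* W)-round algorithm itself) shows it on a made-up weighted graph: once the packing is maximal, every edge has a tight endpoint, and the tight vertices form a cover of weight at most twice the packing value, hence at most twice the optimum.

```python
# Sequential sketch of the primal-dual idea: grow an edge packing y
# (y_e >= 0, sum of y_e incident to each node v at most w(v)) until it is
# maximal; the "tight" vertices then form a vertex cover whose weight is at
# most 2 * sum(y) <= 2 * OPT by LP duality.

def edge_packing_vertex_cover(weights, edges):
    residual = dict(weights)          # remaining capacity w(v) minus incident y_e
    y = {}
    for u, v in edges:
        y[(u, v)] = min(residual[u], residual[v])
        residual[u] -= y[(u, v)]
        residual[v] -= y[(u, v)]
    tight = {v for v, r in residual.items() if r == 0}
    # every edge has a tight endpoint, otherwise its y value could still grow
    assert all(u in tight or v in tight for u, v in edges)
    return tight, sum(y.values())

if __name__ == "__main__":
    # a small hypothetical weighted graph
    weights = {"a": 2, "b": 1, "c": 3, "d": 1}
    edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "c")]
    cover, packing_value = edge_packing_vertex_cover(weights, edges)
    print(cover, sum(weights[v] for v in cover), 2 * packing_value)
```

The distributed result in the abstract achieves the same guarantee while computing the packing in a weighted graph without unique identifiers.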

Relevance: 10.00%

Abstract:

This thesis studies optimisation problems related to modern large-scale distributed systems, such as wireless sensor networks and wireless ad-hoc networks. The concrete tasks that we use as motivating examples are the following: (i) maximising the lifetime of a battery-powered wireless sensor network, (ii) maximising the capacity of a wireless communication network, and (iii) minimising the number of sensors in a surveillance application. A sensor node consumes energy both when it is transmitting or forwarding data, and when it is performing measurements. Hence task (i), lifetime maximisation, can be approached from two different perspectives. First, we can seek optimal data flows that make the most out of the energy resources available in the network; such optimisation problems are examples of so-called max-min linear programs. Second, we can conserve energy by putting redundant sensors into sleep mode; we arrive at the sleep scheduling problem, in which the objective is to find an optimal schedule that determines when each sensor node is asleep and when it is awake. In a wireless network, simultaneous radio transmissions may interfere with each other. Task (ii), capacity maximisation, therefore gives rise to another scheduling problem, the activity scheduling problem, in which the objective is to find a minimum-length conflict-free schedule that satisfies the data transmission requirements of all wireless communication links. Task (iii), minimising the number of sensors, is related to the classical graph problem of finding a minimum dominating set. However, if we are not only interested in detecting an intruder but also locating the intruder, it is not sufficient to solve the dominating set problem; formulations such as minimum-size identifying codes and locating–dominating codes are more appropriate. This thesis presents approximation algorithms for each of these optimisation problems, i.e., for max-min linear programs, sleep scheduling, activity scheduling, identifying codes, and locating–dominating codes. Two complementary approaches are taken. The main focus is on local algorithms, which are constant-time distributed algorithms. The contributions include local approximation algorithms for max-min linear programs, sleep scheduling, and activity scheduling. In the case of max-min linear programs, tight upper and lower bounds are proved for the best possible approximation ratio that can be achieved by any local algorithm. The second approach is the study of centralised polynomial-time algorithms in local graphs – these are geometric graphs whose structure exhibits spatial locality. Among other contributions, it is shown that while identifying codes and locating–dominating codes are hard to approximate in general graphs, they admit a polynomial-time approximation scheme in local graphs.

Relevance: 10.00%

Abstract:

We present a distributed 2-approximation algorithm for the minimum vertex cover problem. The algorithm is deterministic, and it runs in (Δ + 1)² synchronous communication rounds, where Δ is the maximum degree of the graph. For Δ = 3, we give a 2-approximation algorithm also for the weighted version of the problem.

Relevance: 10.00%

Abstract:

We present a distributed 2-approximation algorithm for the minimum vertex cover problem. The algorithm is deterministic, and it runs in (Δ + 1)² synchronous communication rounds, where Δ is the maximum degree of the graph. For Δ = 3, we give a 2-approximation algorithm also for the weighted version of the problem.

Relevance: 10.00%

Abstract:

A local algorithm with local horizon r is a distributed algorithm that runs in r synchronous communication rounds; here r is a constant that does not depend on the size of the network. As a consequence, the output of a node in a local algorithm only depends on the input within r hops from the node. We give tight bounds on the local horizon for a class of local algorithms for combinatorial problems on unit-disk graphs (UDGs). Most of our bounds are due to a refined analysis of existing approaches, while others are obtained by suggesting new algorithms. The algorithms we consider are based on network decompositions guided by a rectangular tiling of the plane. The algorithms are applied to matching, independent set, graph colouring, vertex cover, and dominating set. We also study local algorithms on quasi-UDGs, which are a popular generalisation of UDGs, aimed at more realistic modelling of communication between the network nodes. Analysing the local algorithms on quasi-UDGs allows one to assume that the nodes know their coordinates only approximately, up to an additive error. Despite the localisation error, the quality of the solution to problems on quasi-UDGs remains the same as for the case of UDGs with perfect location awareness. We analyse the increase in the local horizon that comes along with moving from UDGs to quasi-UDGs.
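
The tiling-based decomposition can be illustrated with a generic example; the tile side and class period below are illustrative choices, not the parameters used in the thesis. With square tiles of side 2 and tile classes taken modulo 2 in each coordinate, same-class nodes lying in different tiles are more than unit distance apart, so the subgraph induced by one class splits into per-tile pieces that a local algorithm can process independently.

```python
import math
import random

# Generic illustration of tiling-based decomposition on a unit-disk graph
# (illustrative parameters only): tiles of side 2, tile classes modulo 2.

def unit_disk_graph(points, radius=1.0):
    edges = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= radius:
                edges.append((i, j))
    return edges

def tile(p, side=2.0):
    return (math.floor(p[0] / side), math.floor(p[1] / side))

def tile_class(p, side=2.0, period=2):
    tx, ty = tile(p, side)
    return (tx % period, ty % period)

if __name__ == "__main__":
    rng = random.Random(1)
    pts = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(200)]
    edges = unit_disk_graph(pts)
    # no UDG edge joins two same-class nodes that sit in different tiles
    assert all(not (tile_class(pts[i]) == tile_class(pts[j]) and
                    tile(pts[i]) != tile(pts[j]))
               for i, j in edges)
    print(len(edges), "UDG edges; tiling property verified")
```

For quasi-UDGs and approximately known coordinates, the same scheme still works once the tile side is enlarged to absorb the localisation error, which is the trade-off the abstract refers to.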

Relevance: 10.00%

Abstract:

This work studies decision problems from the perspective of nondeterministic distributed algorithms. For a yes-instance there must exist a proof that can be verified with a distributed algorithm: all nodes must accept a valid proof, and at least one node must reject an invalid proof. We focus on locally checkable proofs that can be verified with a constant-time distributed algorithm. For example, it is easy to prove that a graph is bipartite: the locally checkable proof gives a 2-colouring of the graph, which only takes 1 bit per node. However, it is more difficult to prove that a graph is not bipartite—it turns out that any locally checkable proof requires Ω(log n) bits per node. In this work we classify graph problems according to their local proof complexity, i.e., how many bits per node are needed in a locally checkable proof. We establish tight or near-tight results for classical graph properties such as the chromatic number. We show that the proof complexities form a natural hierarchy of complexity classes: for many classical graph problems, the proof complexity is either 0, Θ(1), Θ(log n), or poly(n) bits per node. Among the most difficult graph properties are symmetric graphs, which require Ω(n²) bits per node, and non-3-colourable graphs, which require Ω(n²/log n) bits per node—any pure graph property admits a trivial proof of size O(n²).
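
The bipartiteness example from the abstract can be made concrete: the proof is one bit per node (its side of the bipartition), and the verifier at each node simply compares its own bit with its neighbours' bits. The sketch below is an illustrative simulation, not code from the thesis.

```python
from itertools import product

# Locally checkable proof of bipartiteness: the proof labels every node with
# one bit; a node accepts iff its bit differs from all of its neighbours'.
# Some labelling makes all nodes accept iff the graph really is bipartite.

def node_accepts(v, labels, adj):
    return all(labels[v] != labels[u] for u in adj[v])

def all_accept(labels, adj):
    return all(node_accepts(v, labels, adj) for v in adj)

def exists_valid_proof(adj):
    nodes = sorted(adj)
    return any(all_accept(dict(zip(nodes, bits)), adj)
               for bits in product([0, 1], repeat=len(nodes)))

if __name__ == "__main__":
    even_cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}   # bipartite
    odd_cycle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}               # not bipartite
    print(exists_valid_proof(even_cycle))   # True: a 1-bit-per-node proof exists
    print(exists_valid_proof(odd_cycle))    # False: no labelling convinces every node
```

Certifying the complement property, non-bipartiteness, is exactly where the Ω(log n) bits-per-node lower bound quoted above kicks in.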