3 resultados para Graph partitioning
em Helda - Digital Repository of University of Helsinki
Resumo:
A distributed system is a collection of networked autonomous processing units which must work in a cooperative manner. Currently, large-scale distributed systems, such as various telecommunication and computer networks, are abundant and used in a multitude of tasks. The field of distributed computing studies what can be computed efficiently in such systems. Distributed systems are usually modelled as graphs where nodes represent the processors and edges denote communication links between processors. This thesis concentrates on the computational complexity of the distributed graph colouring problem. The objective of the graph colouring problem is to assign a colour to each node in such a way that no two nodes connected by an edge share the same colour. In particular, it is often desirable to use only a small number of colours. This task is a fundamental symmetry-breaking primitive in various distributed algorithms. A graph that has been coloured in this manner using at most k different colours is said to be k-coloured. This work examines the synchronous message-passing model of distributed computation: every node runs the same algorithm, and the system operates in discrete synchronous communication rounds. During each round, a node can communicate with its neighbours and perform local computation. In this model, the time complexity of a problem is the number of synchronous communication rounds required to solve the problem. It is known that 3-colouring any k-coloured directed cycle requires at least ½(log* k - 3) communication rounds and is possible in ½(log* k + 7) communication rounds for all k ≥ 3. This work shows that for any k ≥ 3, colouring a k-coloured directed cycle with at most three colours is possible in ½(log* k + 3) rounds. In contrast, it is also shown that for some values of k, colouring a directed cycle with at most three colours requires at least ½(log* k + 1) communication rounds. Furthermore, in the case of directed rooted trees, reducing a k-colouring into a 3-colouring requires at least log* k + 1 rounds for some k and possible in log* k + 3 rounds for all k ≥ 3. The new positive and negative results are derived using computational methods, as the existence of distributed colouring algorithms corresponds to the colourability of so-called neighbourhood graphs. The colourability of these graphs is analysed using Boolean satisfiability (SAT) solvers. Finally, this thesis shows that similar methods are applicable in capturing the existence of distributed algorithms for other graph problems, such as the maximal matching problem.
Resumo:
Questions of the small size of non-industrial private forest (NIPF) holdings in Finland are considered and factors affecting their partitioning are analyzed. This work arises out of Finnish forest policy statements in which the small average size of holdings has been seen to have a negative influence on the economics of forestry. A survey of the literature indicates that the size of holdings is an important factor determining the costs of logging and silvicultural operations, while its influence on the timber supply is slight. The empirical data are based on a sample of 314 holdings collected by interviewing forest owners in the years 1980-86. In 1990-91 the same holdings were resurveyed by means of a postal inquiry and partly by interviewing forest owners. The principal objective in compiling the data is to assist in quantifying ownership factors that influence partitioning among different kinds of NIPF holdings. Thus the mechanism of partitioning were described and a maximum likelihood logistic regression model was constructed using seven independent holding and ownership variables. One out of four holdings had undergone partitioning in conjunction with a change in ownership, one fifth among family owned holdings and nearly a half among jointly owned holdings. The results of the logistic regression model indicate, for instance, that the odds on partitioning is about three times greater for jointly owned holdings than for family owned ones. Also, the probabilities of partitioning were estimated and the impact of independent dichotomous variables on the probability of partitioning ranged between 0.02 and 0.10. The low value of the Hosmer-Lemeshow test statistic indicates a good fit of the model and the rate of correct classification was estimated to be 88 per cent with a cutoff point of 0.5. The average size of holdings undergoing ownership changes decreased from 29.9 ha to 28.7 ha over the approximate interval 1983-90. In addition, the transition probability matrix showed that the trends towards smaller size categories mostly involved in the small size categories, less than 20 ha. The results of the study can be used in considering the effects of the small size of holdings for forestry and if the purpose is to influence partitioning through forest or rural policy.
Resumo:
Gene mapping is a systematic search for genes that affect observable characteristics of an organism. In this thesis we offer computational tools to improve the efficiency of (disease) gene-mapping efforts. In the first part of the thesis we propose an efficient simulation procedure for generating realistic genetical data from isolated populations. Simulated data is useful for evaluating hypothesised gene-mapping study designs and computational analysis tools. As an example of such evaluation, we demonstrate how a population-based study design can be a powerful alternative to traditional family-based designs in association-based gene-mapping projects. In the second part of the thesis we consider a prioritisation of a (typically large) set of putative disease-associated genes acquired from an initial gene-mapping analysis. Prioritisation is necessary to be able to focus on the most promising candidates. We show how to harness the current biomedical knowledge for the prioritisation task by integrating various publicly available biological databases into a weighted biological graph. We then demonstrate how to find and evaluate connections between entities, such as genes and diseases, from this unified schema by graph mining techniques. Finally, in the last part of the thesis, we define the concept of reliable subgraph and the corresponding subgraph extraction problem. Reliable subgraphs concisely describe strong and independent connections between two given vertices in a random graph, and hence they are especially useful for visualising such connections. We propose novel algorithms for extracting reliable subgraphs from large random graphs. The efficiency and scalability of the proposed graph mining methods are backed by extensive experiments on real data. While our application focus is in genetics, the concepts and algorithms can be applied to other domains as well. We demonstrate this generality by considering coauthor graphs in addition to biological graphs in the experiments.