13 results for Graph matching
in Helda - Digital Repository of University of Helsinki
Abstract:
A distributed system is a collection of networked autonomous processing units which must work in a cooperative manner. Currently, large-scale distributed systems, such as various telecommunication and computer networks, are abundant and used in a multitude of tasks. The field of distributed computing studies what can be computed efficiently in such systems. Distributed systems are usually modelled as graphs where nodes represent the processors and edges denote communication links between processors. This thesis concentrates on the computational complexity of the distributed graph colouring problem. The objective of the graph colouring problem is to assign a colour to each node in such a way that no two nodes connected by an edge share the same colour. In particular, it is often desirable to use only a small number of colours. This task is a fundamental symmetry-breaking primitive in various distributed algorithms. A graph that has been coloured in this manner using at most k different colours is said to be k-coloured. This work examines the synchronous message-passing model of distributed computation: every node runs the same algorithm, and the system operates in discrete synchronous communication rounds. During each round, a node can communicate with its neighbours and perform local computation. In this model, the time complexity of a problem is the number of synchronous communication rounds required to solve the problem. It is known that 3-colouring any k-coloured directed cycle requires at least ½(log* k - 3) communication rounds and is possible in ½(log* k + 7) communication rounds for all k ≥ 3. This work shows that for any k ≥ 3, colouring a k-coloured directed cycle with at most three colours is possible in ½(log* k + 3) rounds. In contrast, it is also shown that for some values of k, colouring a directed cycle with at most three colours requires at least ½(log* k + 1) communication rounds. 
Furthermore, in the case of directed rooted trees, reducing a k-colouring into a 3-colouring requires at least log* k + 1 rounds for some k and is possible in log* k + 3 rounds for all k ≥ 3. The new positive and negative results are derived using computational methods, as the existence of distributed colouring algorithms corresponds to the colourability of so-called neighbourhood graphs. The colourability of these graphs is analysed using Boolean satisfiability (SAT) solvers. Finally, this thesis shows that similar methods are applicable in capturing the existence of distributed algorithms for other graph problems, such as the maximal matching problem.
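As a rough illustration of colour reduction on a directed cycle, the sketch below simulates the classic Cole-Vishkin bit trick followed by a standard 6-to-3 reduction. This is not the SAT-based method of the thesis and does not attain its round bounds; the function names are hypothetical, and the synchronous rounds are simulated centrally, assuming a properly coloured cycle of at least three nodes in which node v has successor v + 1.

```python
def cv_step(own, succ):
    """One Cole-Vishkin round: encode the lowest bit where own and succ differ."""
    diff = own ^ succ                      # nonzero, since the colouring is proper
    i = (diff & -diff).bit_length() - 1    # index of the lowest differing bit
    return 2 * i + ((own >> i) & 1)        # new colour = (index, own bit at index)


def is_proper(colours):
    """Check that no two consecutive nodes of the cycle share a colour."""
    n = len(colours)
    return all(colours[v] != colours[(v + 1) % n] for v in range(n))


def reduce_to_six(colours):
    """Repeat cv_step on every node until all colours lie in {0, ..., 5}."""
    n = len(colours)
    rounds = 0
    while max(colours) >= 6:
        colours = [cv_step(colours[v], colours[(v + 1) % n]) for v in range(n)]
        rounds += 1
    return colours, rounds


def six_to_three(colours):
    """Three further rounds: colour classes 5, 4, 3 recolour into {0, 1, 2}."""
    n = len(colours)
    colours = list(colours)
    for c in (5, 4, 3):
        new = list(colours)
        for v in range(n):
            if colours[v] == c:
                # Nodes of colour c are pairwise non-adjacent, so they can all
                # pick a colour unused by their two neighbours simultaneously.
                used = {colours[(v - 1) % n], colours[(v + 1) % n]}
                new[v] = min({0, 1, 2} - used)
        colours = new
    return colours


if __name__ == "__main__":
    ids = list(range(1000))                # unique identifiers form a 1000-colouring
    six, rounds = reduce_to_six(ids)
    three = six_to_three(six)
    print(rounds, is_proper(three), sorted(set(three)))
```

Each round of `cv_step` shrinks a palette of b-bit colours to at most 2b colours, which is where the log* k behaviour in the abstract comes from.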
Abstract:
The present thesis discusses relevant issues in education: 1) learning disabilities including the role of comorbidity in LDs, and 2) the use of research-based interventions. This thesis consists of a series of four studies (three articles), which deepens the knowledge of the field of special education. Intervention studies (N=242) aimed to examine whether training using a nonverbal auditory-visual matching computer program had a remedial effect in different learning disabilities, such as developmental dyslexia, Attention Deficit Disorder (ADD) and Specific Language Impairment (SLI). These studies were conducted in both Finland and Sweden. The intervention’s non-verbal character made an international perspective possible. The results of the intervention studies confirmed that the auditory-visual matching computer program, called Audilex, had positive intervention effects. In Study I of children with developmental dyslexia there were also improvements in reading skills, specifically in reading nonsense words and reading speed. These improvements in tasks, which are thought to rely on phonological processing, suggest that such reading difficulties in dyslexia may stem in part from more basic perceptual difficulties, including those required to manage the visual and auditory components of the decoding task. In Study II the intervention had a positive effect on children with dyslexia; older students with dyslexia and, surprisingly, students with ADD also benefited from this intervention. In conclusion, the role of comorbidity was apparent. An intervention effect was evident also in students’ school behavior. Study III showed that children with SLI experience difficulties very similar to those of children with dyslexia in auditory-visual matching. Children with language-based learning disabilities, such as dyslexia and SLI, benefited from the auditory-visual matching intervention. 
Comorbidity was also evident among these children; in addition to formal diagnoses, comorbidity was explored with an assessment inventory, which was developed for this thesis. Interestingly, an overview of the data of this thesis shows positive intervention effects in all studies regardless of learning disability, language, gender or age. These findings have been described by the concept of inter-modal transpose. Clearly, these issues need further study. In learning disabilities the aim in the future will also be to identify individuals by risk rather than by deficit; this aim can be achieved by using research-based interventions, intensified support in general education and inclusive special education. Keywords: learning disabilities, developmental dyslexia, attention deficit disorder, specific language impairment, language-based learning disabilities, comorbidity, auditory-visual matching, research-based interventions, inter-modal transpose
Abstract:
Event-based systems are seen as good candidates for supporting distributed applications in dynamic and ubiquitous environments because they support decoupled and asynchronous many-to-many information dissemination. Event systems are widely used, because asynchronous messaging provides a flexible alternative to RPC (Remote Procedure Call). They are typically implemented using an overlay network of routers. A content-based router forwards event messages based on filters that are installed by subscribers and other routers. The filters are organized into a routing table in order to forward incoming events to proper subscribers and neighbouring routers. This thesis addresses the optimization of content-based routing tables organized using the covering relation and presents novel data structures and configurations for improving local and distributed operation. Data structures are needed for organizing filters into a routing table that supports efficient matching and runtime operation. We present novel results on dynamic filter merging and the integration of filter merging with content-based routing tables. In addition, the thesis examines the cost of client mobility using different protocols and routing topologies. We also present a new matching technique called temporal subspace matching. The technique combines two new features. The first feature, temporal operation, supports notifications, or content profiles, that persist in time. The second feature, subspace matching, allows more expressive semantics, because notifications may contain intervals and be defined as subspaces of the content space. We also present an application of temporal subspace matching pertaining to metadata-based continuous collection and object tracking.
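To make the covering relation concrete, here is a minimal sketch under a simplifying assumption: a filter is a conjunction of closed interval constraints, stored as a dict from attribute name to a (low, high) pair, and an event is a dict from attribute name to value. The representation and the names `matches`, `covers` and `prune_covered` are hypothetical and are not taken from the thesis, whose data structures are more elaborate.

```python
def matches(filt, event):
    """An event matches if it satisfies every interval constraint of the filter."""
    return all(
        attr in event and low <= event[attr] <= high
        for attr, (low, high) in filt.items()
    )


def covers(f1, f2):
    """f1 covers f2 if every event matching f2 also matches f1.

    For conjunctive interval filters with non-empty intervals this holds
    exactly when each constraint of f1 is implied by a constraint of f2
    on the same attribute.
    """
    return all(
        attr in f2 and low <= f2[attr][0] and f2[attr][1] <= high
        for attr, (low, high) in f1.items()
    )


def prune_covered(filters):
    """Keep only filters that are not covered by another kept filter."""
    kept = []
    for f in filters:
        if any(covers(k, f) for k in kept):
            continue                                   # f adds nothing new
        kept = [k for k in kept if not covers(f, k)]   # f supersedes these
        kept.append(f)
    return kept


if __name__ == "__main__":
    wide = {"price": (0, 100)}
    narrow = {"price": (10, 20), "volume": (5, 9)}
    other = {"volume": (0, 3)}
    print(covers(wide, narrow))                  # True
    print(prune_covered([narrow, wide, other]))  # narrow is dropped
```

A router that forwards only the pruned set to its neighbours still receives every event its subscribers want, since any event matching a dropped filter also matches the filter that covers it.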
Abstract:
The core aim of machine learning is to make a computer program learn from experience. Learning from data is usually defined as a task of learning regularities or patterns in data in order to extract useful information, or to learn the underlying concept. An important sub-field of machine learning is called multi-view learning where the task is to learn from multiple data sets or views describing the same underlying concept. A typical example of such a scenario would be to study a biological concept using several biological measurements like gene expression, protein expression and metabolic profiles, or to classify web pages based on their content and the contents of their hyperlinks. In this thesis, novel problem formulations and methods for multi-view learning are presented. The contributions include a linear data fusion approach during exploratory data analysis, a new measure to evaluate different kinds of representations for textual data, and an extension of multi-view learning for novel scenarios where the correspondence of samples in the different views or data sets is not known in advance. In order to infer the one-to-one correspondence of samples between two views, a novel concept of multi-view matching is proposed. The matching algorithm is completely data-driven and is demonstrated in several applications such as matching of metabolites between humans and mice, and matching of sentences between documents in two languages.
Abstract:
This thesis analyzes how matching takes place in the Finnish labor market from three different angles. The Finnish labor market has undergone severe structural changes following the economic crisis in the early 1990s. The labor market has had problems adjusting to these changes, and hence high and persistent unemployment has followed. In this thesis I analyze whether matching problems, and in particular changes in matching, can explain some of this persistence. The thesis consists of three essays. In the first essay, Finnish Evidence of Changes in the Labor Market Matching Process, the matching process in the Finnish labor market is analyzed. The key finding is that the matching process has changed thoroughly between the booming 1980s and the post-crisis period. The importance of the number of unemployed, and in particular long-term unemployed, for the matching process has vanished. More unemployed do not increase matching as theory predicts but rather the opposite. In the second essay, The Aggregate Matching Function and Directed Search -Finnish Evidence, stock-flow matching as a potential micro foundation of the aggregate matching function is studied. In the essay I show that newly unemployed match mainly with the stock of vacancies while longer term unemployed match with the inflow of vacancies. When aggregating I still find evidence of the traditional aggregate matching function. This could explain the huge support the aggregate matching function has received despite its odd randomness assumption. The third essay, How do Registered Job Seekers really match? -Finnish occupational level Evidence, studies matching for nine occupational groups and finds that very different matching problems exist for different occupations. This essay also deals with misspecification stemming from non-corresponding variables by introducing a completely new set of variables. 
The new outflow measure used is vacancies filled with registered job seekers, and it is matched by the supply-side measure, registered job seekers.
Abstract:
Gene mapping is a systematic search for genes that affect observable characteristics of an organism. In this thesis we offer computational tools to improve the efficiency of (disease) gene-mapping efforts. In the first part of the thesis we propose an efficient simulation procedure for generating realistic genetic data from isolated populations. Simulated data is useful for evaluating hypothesised gene-mapping study designs and computational analysis tools. As an example of such evaluation, we demonstrate how a population-based study design can be a powerful alternative to traditional family-based designs in association-based gene-mapping projects. In the second part of the thesis we consider a prioritisation of a (typically large) set of putative disease-associated genes acquired from an initial gene-mapping analysis. Prioritisation is necessary to be able to focus on the most promising candidates. We show how to harness the current biomedical knowledge for the prioritisation task by integrating various publicly available biological databases into a weighted biological graph. We then demonstrate how to find and evaluate connections between entities, such as genes and diseases, from this unified schema by graph mining techniques. Finally, in the last part of the thesis, we define the concept of reliable subgraph and the corresponding subgraph extraction problem. Reliable subgraphs concisely describe strong and independent connections between two given vertices in a random graph, and hence they are especially useful for visualising such connections. We propose novel algorithms for extracting reliable subgraphs from large random graphs. The efficiency and scalability of the proposed graph mining methods are backed by extensive experiments on real data. While our application focus is on genetics, the concepts and algorithms can be applied to other domains as well. 
We demonstrate this generality by considering coauthor graphs in addition to biological graphs in the experiments.
Abstract:
An edge dominating set for a graph G is a set D of edges such that each edge of G is in D or adjacent to at least one edge in D. This work studies deterministic distributed approximation algorithms for finding minimum-size edge dominating sets. The focus is on anonymous port-numbered networks: there are no unique identifiers, but a node of degree d can refer to its neighbours by integers 1, 2, ..., d. The present work shows that in the port-numbering model, edge dominating sets can be approximated as follows: in d-regular graphs, to within 4 − 6/(d + 1) for an odd d and to within 4 − 2/d for an even d; and in graphs with maximum degree Δ, to within 4 − 2/(Δ − 1) for an odd Δ and to within 4 − 2/Δ for an even Δ. These approximation ratios are tight for all values of d and Δ: there are matching lower bounds.
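The definition of an edge dominating set can be checked directly in code. The sketch below is centralised and is not the port-numbering algorithm of this work: it verifies the definition and builds a greedy maximal matching, which is always an edge dominating set and, by a well-known argument, at most twice the minimum size. The function names are hypothetical, and the graph is assumed to be simple and given as a list of vertex pairs.

```python
def is_edge_dominating(edges, dom):
    """Check that every edge is in dom or shares an endpoint with an edge in dom."""
    edge_set = {frozenset(e) for e in edges}
    if not all(frozenset(e) in edge_set for e in dom):
        return False                       # dom must consist of edges of the graph
    covered = {v for e in dom for v in e}  # endpoints of the chosen edges
    return all(u in covered or v in covered for (u, v) in edges)


def greedy_maximal_matching(edges):
    """Scan the edges once, taking each edge whose endpoints are both still free."""
    matched = set()
    matching = []
    for (u, v) in edges:
        if u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching


if __name__ == "__main__":
    path = [(0, 1), (1, 2), (2, 3), (3, 4)]
    m = greedy_maximal_matching(path)
    print(m, is_edge_dominating(path, m))
    print(is_edge_dominating(path, [(1, 2)]))  # edge (3, 4) is left undominated
```

The greedy scan relies on a global edge order, which anonymous port-numbered nodes do not have; that is one reason the distributed setting of the abstract is harder and yields the weaker ratios stated there.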
Abstract:
A local algorithm with local horizon r is a distributed algorithm that runs in r synchronous communication rounds; here r is a constant that does not depend on the size of the network. As a consequence, the output of a node in a local algorithm only depends on the input within r hops from the node. We give tight bounds on the local horizon for a class of local algorithms for combinatorial problems on unit-disk graphs (UDGs). Most of our bounds are due to a refined analysis of existing approaches, while others are obtained by suggesting new algorithms. The algorithms we consider are based on network decompositions guided by a rectangular tiling of the plane. The algorithms are applied to matching, independent set, graph colouring, vertex cover, and dominating set. We also study local algorithms on quasi-UDGs, which are a popular generalisation of UDGs, aimed at more realistic modelling of communication between the network nodes. Analysing the local algorithms on quasi-UDGs allows one to assume that the nodes know their coordinates only approximately, up to an additive error. Despite the localisation error, the quality of the solution to problems on quasi-UDGs remains the same as for the case of UDGs with perfect location awareness. We analyse the increase in the local horizon that comes along with moving from UDGs to quasi-UDGs.