45 resultados para SUBGRAPHS
Resumo:
We study the following problem: given a geometric graph G and an integer k, determine if G has a planar spanning subgraph (with the original embedding and straight-line edges) such that all nodes have degree at least k. If G is a unit disk graph, the problem is trivial to solve for k = 1. We show that even the slightest deviation from the trivial case (e.g., quasi unit disk graphs or k = 1) leads to NP-hard problems.
Resumo:
We establish conditions for the existence, in a chordal graph, of subgraphs homeomorphic to K-n (n greater than or equal to 3), K-m,K-n (m,n greater than or equal to 2), and wheels W-r (r greater than or equal to 3). Using these results, we develop a simple linear time algorithm for testing planarity of chordal graphs. We also show how these results lead to simple polynomial time algorithms for the Fixed Subgraph Homeomorphism problem on chordal graphs for some special classes of pattern graphs.
Resumo:
The Lovasz θ function of a graph, is a fundamental tool in combinatorial optimization and approximation algorithms. Computing θ involves solving a SDP and is extremely expensive even for moderately sized graphs. In this paper we establish that the Lovasz θ function is equivalent to a kernel learning problem related to one class SVM. This interesting connection opens up many opportunities bridging graph theoretic algorithms and machine learning. We show that there exist graphs, which we call SVM−θ graphs, on which the Lovasz θ function can be approximated well by a one-class SVM. This leads to a novel use of SVM techniques to solve algorithmic problems in large graphs e.g. identifying a planted clique of size Θ(n√) in a random graph G(n,12). A classic approach for this problem involves computing the θ function, however it is not scalable due to SDP computation. We show that the random graph with a planted clique is an example of SVM−θ graph, and as a consequence a SVM based approach easily identifies the clique in large graphs and is competitive with the state-of-the-art. Further, we introduce the notion of a ''common orthogonal labeling'' which extends the notion of a ''orthogonal labelling of a single graph (used in defining the θ function) to multiple graphs. The problem of finding the optimal common orthogonal labelling is cast as a Multiple Kernel Learning problem and is used to identify a large common dense region in multiple graphs. The proposed algorithm achieves an order of magnitude scalability compared to the state of the art.
Resumo:
In this paper we establish that the Lovasz theta function on a graph can be restated as a kernel learning problem. We introduce the notion of SVM-theta graphs, on which Lovasz theta function can be approximated well by a Support vector machine (SVM). We show that Erdos-Renyi random G(n, p) graphs are SVM-theta graphs for log(4)n/n <= p < 1. Even if we embed a large clique of size Theta(root np/1-p) in a G(n, p) graph the resultant graph still remains a SVM-theta graph. This immediately suggests an SVM based algorithm for recovering a large planted clique in random graphs. Associated with the theta function is the notion of orthogonal labellings. We introduce common orthogonal labellings which extends the idea of orthogonal labellings to multiple graphs. This allows us to propose a Multiple Kernel learning (MKL) based solution which is capable of identifying a large common dense subgraph in multiple graphs. Both in the planted clique case and common subgraph detection problem the proposed solutions beat the state of the art by an order of magnitude.
Resumo:
In distributed networks, it is often useful for the nodes to be aware of dense subgraphs, e.g., such a dense subgraph could reveal dense substructures in otherwise sparse graphs (e.g. the World Wide Web or social networks); these might reveal community clusters or dense regions for possibly maintaining good communication infrastructure. In this work, we address the problem of self-awareness of nodes in a dynamic network with regards to graph density, i.e., we give distributed algorithms for maintaining dense subgraphs that the member nodes are aware of. The only knowledge that the nodes need is that of the dynamic diameter D, i.e., the maximum number of rounds it takes for a message to traverse the dynamic network. For our work, we consider a model where the number of nodes are fixed, but a powerful adversary can add or remove a limited number of edges from the network at each time step. The communication is by broadcast only and follows the CONGEST model. Our algorithms are continuously executed on the network, and at any time (after some initialization) each node will be aware if it is part (or not) of a particular dense subgraph. We give algorithms that (2 + e)-approximate the densest subgraph and (3 + e)-approximate the at-least-k-densest subgraph (for a given parameter k). Our algorithms work for a wide range of parameter values and run in O(D log n) time. Further, a special case of our results also gives the first fully decentralized approximation algorithms for densest and at-least-k-densest subgraph problems for static distributed graphs. © 2012 Springer-Verlag.
Resumo:
In distributed networks, some groups of nodes may have more inter-connections, perhaps due to their larger bandwidth availability or communication requirements. In many scenarios, it may be useful for the nodes to know if they form part of a dense subgraph, e.g., such a dense subgraph could form a high bandwidth backbone for the network. In this work, we address the problem of self-awareness of nodes in a dynamic network with regards to graph density, i.e., we give distributed algorithms for maintaining dense subgraphs (subgraphs that the member nodes are aware of). The only knowledge that the nodes need is that of the dynamic diameter D, i.e., the maximum number of rounds it takes for a message to traverse the dynamic network. For our work, we consider a model where the number of nodes are fixed, but a powerful adversary can add or remove a limited number of edges from the network at each time step. The communication is by broadcast only and follows the CONGEST model in the sense that only messages of O(log n) size are permitted, where n is the number of nodes in the network. Our algorithms are continuously executed on the network, and at any time (after some initialization) each node will be aware if it is part (or not) of a particular dense subgraph. We give algorithms that approximate both the densest subgraph, i.e., the subgraph of the highest density in the network, and the at-least-k-densest subgraph (for a given parameter k), i.e., the densest subgraph of size at least k. We give a (2 + e)-approximation algorithm for the densest subgraph problem. The at-least-k-densest subgraph is known to be NP-hard for the general case in the centralized setting and the best known algorithm gives a 2-approximation. We present an algorithm that maintains a (3+e)-approximation in our distributed, dynamic setting. Our algorithms run in O(Dlog n) time. © 2012 Authors.
Resumo:
The distance DG(v) of a vertex v in an undirected graph G is the sum of the distances between v and all other vertices of G. The set of vertices in G with maximum (minimum) distance is the antimedian (median) set of a graph G. It is proved that for arbitrary graphs G and J and a positive integer r 2, there exists a connected graph H such that G is the antimedian and J the median subgraphs of H, respectively, and that dH(G, J) = r. When both G and J are connected, G and J can in addition be made convex subgraphs of H.
Resumo:
This paper describes a new technique referred to as watched subgraphs which improves the performance of BBMC, a leading state of the art exact maximum clique solver (MCP). It is based on watched literals employed by modern SAT solvers for boolean constraint propagation. In efficient SAT algorithms, a list of clauses is kept for each literal (it is said that the clauses watch the literal) so that only those in the list are checked for constraint propagation when a (watched) literal is assigned during search. BBMC encodes vertex sets as bit strings, a bit block representing a subset of vertices (and the corresponding induced subgraph) the size of the CPU register word. The paper proposes to watch two subgraphs of critical sets during MCP search to efficiently compute a number of basic operations. Reported results validate the approach as the size and density of problem instances rise, while achieving comparable performance in the general case.
Resumo:
Includes bibliographical references.
Resumo:
In recent years, considerable research efforts have been directed to micro-array technologies and their role in providing simultaneous information on expression profiles for thousands of genes. These data, when subjected to clustering and classification procedures, can assist in identifying patterns and providing insight on biological processes. To understand the properties of complex gene expression datasets, graphical representations can be used. Intuitively, the data can be represented in terms of a bipartite graph, with weighted edges corresponding to gene-sample node couples in the dataset. Biologically meaningful subgraphs can be sought, but performance can be influenced both by the search algorithm, and, by the graph-weighting scheme and both merit rigorous investigation. In this paper, we focus on edge-weighting schemes for bipartite graphical representation of gene expression. Two novel methods are presented: the first is based on empirical evidence; the second on a geometric distribution. The schemes are compared for several real datasets, assessing efficiency of performance based on four essential properties: robustness to noise and missing values, discrimination, parameter influence on scheme efficiency and reusability. Recommendations and limitations are briefly discussed. Keywords: Edge-weighting; weighted graphs; gene expression; bi-clustering
Resumo:
In this paper we develop compilation techniques for the realization of applications described in a High Level Language (HLL) onto a Runtime Reconfigurable Architecture. The compiler determines Hyper Operations (HyperOps) that are subgraphs of a data flow graph (of an application) and comprise elementary operations that have strong producer-consumer relationship. These HyperOps are hosted on computation structures that are provisioned on demand at runtime. We also report compiler optimizations that collectively reduce the overheads of data-driven computations in runtime reconfigurable architectures. On an average, HyperOps offer a 44% reduction in total execution time and a 18% reduction in management overheads as compared to using basic blocks as coarse grained operations. We show that HyperOps formed using our compiler are suitable to support data flow software pipelining.
Resumo:
A k-dimensional box is the Cartesian product R-1 X R-2 X ... X R-k where each R-i is a closed interval on the real line. The boxicity of a graph G, denoted as box(G), is the minimum integer k such that G can be represented as the intersection graph of a collection of k-dimensional boxes. A unit cube in k-dimensional space or a k-cube is defined as the Cartesian product R-1 X R-2 X ... X R-k where each R-i is a closed interval oil the real line of the form a(i), a(i) + 1]. The cubicity of G, denoted as cub(G), is the minimum integer k such that G can be represented as the intersection graph of a collection of k-cubes. The threshold dimension of a graph G(V, E) is the smallest integer k such that E can be covered by k threshold spanning subgraphs of G. In this paper we will show that there exists no polynomial-time algorithm for approximating the threshold dimension of a graph on n vertices with a factor of O(n(0.5-epsilon)) for any epsilon > 0 unless NP = ZPP. From this result we will show that there exists no polynomial-time algorithm for approximating the boxicity and the cubicity of a graph on n vertices with factor O(n(0.5-epsilon)) for any epsilon > 0 unless NP = ZPP. In fact all these hardness results hold even for a highly structured class of graphs, namely the split graphs. We will also show that it is NP-complete to determine whether a given split graph has boxicity at most 3. (C) 2010 Elsevier B.V. All rights reserved.
Resumo:
Gene mapping is a systematic search for genes that affect observable characteristics of an organism. In this thesis we offer computational tools to improve the efficiency of (disease) gene-mapping efforts. In the first part of the thesis we propose an efficient simulation procedure for generating realistic genetical data from isolated populations. Simulated data is useful for evaluating hypothesised gene-mapping study designs and computational analysis tools. As an example of such evaluation, we demonstrate how a population-based study design can be a powerful alternative to traditional family-based designs in association-based gene-mapping projects. In the second part of the thesis we consider a prioritisation of a (typically large) set of putative disease-associated genes acquired from an initial gene-mapping analysis. Prioritisation is necessary to be able to focus on the most promising candidates. We show how to harness the current biomedical knowledge for the prioritisation task by integrating various publicly available biological databases into a weighted biological graph. We then demonstrate how to find and evaluate connections between entities, such as genes and diseases, from this unified schema by graph mining techniques. Finally, in the last part of the thesis, we define the concept of reliable subgraph and the corresponding subgraph extraction problem. Reliable subgraphs concisely describe strong and independent connections between two given vertices in a random graph, and hence they are especially useful for visualising such connections. We propose novel algorithms for extracting reliable subgraphs from large random graphs. The efficiency and scalability of the proposed graph mining methods are backed by extensive experiments on real data. While our application focus is in genetics, the concepts and algorithms can be applied to other domains as well. We demonstrate this generality by considering coauthor graphs in addition to biological graphs in the experiments.