250 resultados para graph matching algorithms


Relevância:

90.00% 90.00%

Publicador:

Resumo:

Parallel execution of computational mechanics codes requires efficient mesh-partitioning techniques. These mesh-partitioning techniques divide the mesh into specified number of submeshes of approximately the same size and at the same time, minimise the interface nodes of the submeshes. This paper describes a new mesh partitioning technique, employing Genetic Algorithms. The proposed algorithm operates on the deduced graph (dual or nodal graph) of the given finite element mesh rather than directly on the mesh itself. The algorithm works by first constructing a coarse graph approximation using an automatic graph coarsening method. The coarse graph is partitioned and the results are interpolated onto the original graph to initialise an optimisation of the graph partition problem. In practice, hierarchy of (usually more than two) graphs are used to obtain the final graph partition. The proposed partitioning algorithm is applied to graphs derived from unstructured finite element meshes describing practical engineering problems and also several example graphs related to finite element meshes given in the literature. The test results indicate that the proposed GA based graph partitioning algorithm generates high quality partitions and are superior to spectral and multilevel graph partitioning algorithms.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Recognizing similarities and deriving relationships among protein molecules is a fundamental requirement in present-day biology. Similarities can be present at various levels which can be detected through comparison of protein sequences or their structural folds. In some cases similarities obscure at these levels could be present merely in the substructures at their binding sites. Inferring functional similarities between protein molecules by comparing their binding sites is still largely exploratory and not as yet a routine protocol. One of the main reasons for this is the limitation in the choice of appropriate analytical tools that can compare binding sites with high sensitivity. To benefit from the enormous amount of structural data that is being rapidly accumulated, it is essential to have high throughput tools that enable large scale binding site comparison. Results: Here we present a new algorithm PocketMatch for comparison of binding sites in a frame invariant manner. Each binding site is represented by 90 lists of sorted distances capturing shape and chemical nature of the site. The sorted arrays are then aligned using an incremental alignment method and scored to obtain PMScores for pairs of sites. A comprehensive sensitivity analysis and an extensive validation of the algorithm have been carried out. A comparison with other site matching algorithms is also presented. Perturbation studies where the geometry of a given site was retained but the residue types were changed randomly, indicated that chance similarities were virtually non-existent. Our analysis also demonstrates that shape information alone is insufficient to discriminate between diverse binding sites, unless combined with chemical nature of amino acids. Conclusion: A new algorithm has been developed to compare binding sites in accurate, efficient and high-throughput manner. Though the representation used is conceptually simplistic, we demonstrate that along with the new alignment strategy used, it is sufficient to enable binding comparison with high sensitivity. Novel methodology has also been presented for validating the algorithm for accuracy and sensitivity with respect to geometry and chemical nature of the site. The method is also fast and takes about 1/250(th) second for one comparison on a single processor. A parallel version on BlueGene has also been implemented.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Structural alignments are the most widely used tools for comparing proteins with low sequence similarity. The main contribution of this paper is to derive various kernels on proteins from structural alignments, which do not use sequence information. Central to the kernels is a novel alignment algorithm which matches substructures of fixed size using spectral graph matching techniques. We derive positive semi-definite kernels which capture the notion of similarity between substructures. Using these as base more sophisticated kernels on protein structures are proposed. To empirically evaluate the kernels we used a 40% sequence non-redundant structures from 15 different SCOP superfamilies. The kernels when used with SVMs show competitive performance with CE, a state of the art structure comparison program.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The Lovasz θ function of a graph, is a fundamental tool in combinatorial optimization and approximation algorithms. Computing θ involves solving a SDP and is extremely expensive even for moderately sized graphs. In this paper we establish that the Lovasz θ function is equivalent to a kernel learning problem related to one class SVM. This interesting connection opens up many opportunities bridging graph theoretic algorithms and machine learning. We show that there exist graphs, which we call SVM−θ graphs, on which the Lovasz θ function can be approximated well by a one-class SVM. This leads to a novel use of SVM techniques to solve algorithmic problems in large graphs e.g. identifying a planted clique of size Θ(n√) in a random graph G(n,12). A classic approach for this problem involves computing the θ function, however it is not scalable due to SDP computation. We show that the random graph with a planted clique is an example of SVM−θ graph, and as a consequence a SVM based approach easily identifies the clique in large graphs and is competitive with the state-of-the-art. Further, we introduce the notion of a ''common orthogonal labeling'' which extends the notion of a ''orthogonal labelling of a single graph (used in defining the θ function) to multiple graphs. The problem of finding the optimal common orthogonal labelling is cast as a Multiple Kernel Learning problem and is used to identify a large common dense region in multiple graphs. The proposed algorithm achieves an order of magnitude scalability compared to the state of the art.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The problem of scaling up data integration, such that new sources can be quickly utilized as they are discovered, remains elusive: Global schemas for integrated data are difficult to develop and expand, and schema and record matching techniques are limited by the fact that data and metadata are often under-specified and must be disambiguated by data experts. One promising approach is to avoid using a global schema, and instead to develop keyword search-based data integration-where the system lazily discovers associations enabling it to join together matches to keywords, and return ranked results. The user is expected to understand the data domain and provide feedback about answers' quality. The system generalizes such feedback to learn how to correctly integrate data. A major open challenge is that under this model, the user only sees and offers feedback on a few ``top-'' results: This result set must be carefully selected to include answers of high relevance and answers that are highly informative when feedback is given on them. Existing systems merely focus on predicting relevance, by composing the scores of various schema and record matching algorithms. In this paper, we show how to predict the uncertainty associated with a query result's score, as well as how informative feedback is on a given result. We build upon these foundations to develop an active learning approach to keyword search-based data integration, and we validate the effectiveness of our solution over real data from several very different domains.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Wireless sensor networks can often be viewed in terms of a uniform deployment of a large number of nodes in a region of Euclidean space. Following deployment, the nodes self-organize into a mesh topology with a key aspect being self-localization. Having obtained a mesh topology in a dense, homogeneous deployment, a frequently used approximation is to take the hop distance between nodes to be proportional to the Euclidean distance between them. In this work, we analyze this approximation through two complementary analyses. We assume that the mesh topology is a random geometric graph on the nodes; and that some nodes are designated as anchors with known locations. First, we obtain high probability bounds on the Euclidean distances of all nodes that are h hops away from a fixed anchor node. In the second analysis, we provide a heuristic argument that leads to a direct approximation for the density function of the Euclidean distance between two nodes that are separated by a hop distance h. This approximation is shown, through simulation, to very closely match the true density function. Localization algorithms that draw upon the preceding analyses are then proposed and shown to perform better than some of the well-known algorithms present in the literature. Belief-propagation-based message-passing is then used to further enhance the performance of the proposed localization algorithms. To our knowledge, this is the first usage of message-passing for hop-count-based self-localization.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

In this work, we study the well-known r-DIMENSIONAL k-MATCHING ((r, k)-DM), and r-SET k-PACKING ((r, k)-SP) problems. Given a universe U := U-1 ... U-r and an r-uniform family F subset of U-1 x ... x U-r, the (r, k)-DM problem asks if F admits a collection of k mutually disjoint sets. Given a universe U and an r-uniform family F subset of 2(U), the (r, k)-SP problem asks if F admits a collection of k mutually disjoint sets. We employ techniques based on dynamic programming and representative families. This leads to a deterministic algorithm with running time O(2.851((r-1)k) .vertical bar F vertical bar. n log(2)n . logW) for the weighted version of (r, k)-DM, where W is the maximum weight in the input, and a deterministic algorithm with running time O(2.851((r-0.5501)k).vertical bar F vertical bar.n log(2) n . logW) for the weighted version of (r, k)-SP. Thus, we significantly improve the previous best known deterministic running times for (r, k)-DM and (r, k)-SP and the previous best known running times for their weighted versions. We rely on structural properties of (r, k)-DM and (r, k)-SP to develop algorithms that are faster than those that can be obtained by a standard use of representative sets. Incorporating the principles of iterative expansion, we obtain a better algorithm for (3, k)-DM, running in time O(2.004(3k).vertical bar F vertical bar . n log(2)n). We believe that this algorithm demonstrates an interesting application of representative families in conjunction with more traditional techniques. Furthermore, we present kernels of size O(e(r)r(k-1)(r) logW) for the weighted versions of (r, k)-DM and (r, k)-SP, improving the previous best known kernels of size O(r!r(k-1)(r) logW) for these problems.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The Reeb graph tracks topology changes in level sets of a scalar function and finds applications in scientific visualization and geometric modeling. We describe an algorithm that constructs the Reeb graph of a Morse function defined on a 3-manifold. Our algorithm maintains connected components of the two dimensional levels sets as a dynamic graph and constructs the Reeb graph in O(nlogn+nlogg(loglogg)3) time, where n is the number of triangles in the tetrahedral mesh representing the 3-manifold and g is the maximum genus over all level sets of the function. We extend this algorithm to construct Reeb graphs of d-manifolds in O(nlogn(loglogn)3) time, where n is the number of triangles in the simplicial complex that represents the d-manifold. Our result is a significant improvement over the previously known O(n2) algorithm. Finally, we present experimental results of our implementation and demonstrate that our algorithm for 3-manifolds performs efficiently in practice.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

An algorithm is described for developing a hierarchy among a set of elements having certain precedence relations. This algorithm, which is based on tracing a path through the graph, is easily implemented by a computer.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

An algorithm is described for developing a hierarchy among a set of elements having certain precedence relations. This algorithm, which is based on tracing a path through the graph, is easily implemented by a computer.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We study the problem of matching applicants to jobs under one-sided preferences: that is, each applicant ranks a non-empty subset of jobs under an order of preference, possibly involving ties. A matching M is said to be rnore popular than T if the applicants that prefer M to T outnumber those that prefer T to M. A matching is said to be popular if there is no matching more popular than it. Equivalently, a matching M is popular if phi(M,T) >= phi(T, M) for all matchings T, where phi(X, Y) is the number of applicants that prefer X to Y. Previously studied solution concepts based oil the popularity criterion are either not guaranteed to exist for every instance (e.g., popular matchings) or are NP-hard to compute (e.g., least unpopular matchings). This paper addresses this issue by considering mixed matchings. A mixed matching is simply a probability distributions over matchings in the input graph. The function phi that compares two matchings generalizes in a natural manner to mixed matchings by taking expectation. A mixed matching P is popular if phi(P,Q) >= phi(Q,P) for all mixed matchings Q. We show that popular mixed matchings always exist. and we design polynomial time algorithms for finding them. Then we study their efficiency and give tight bounds on the price of anarchy and price of stability of the popular matching problem.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We consider the problem of computing an approximate minimum cycle basis of an undirected edge-weighted graph G with m edges and n vertices; the extension to directed graphs is also discussed. In this problem, a {0,1} incidence vector is associated with each cycle and the vector space over F-2 generated by these vectors is the cycle space of G. A set of cycles is called a cycle basis of G if it forms a basis for its cycle space. A cycle basis where the sum of the weights of the cycles is minimum is called a minimum cycle basis of G. Cycle bases of low weight are useful in a number of contexts, e.g. the analysis of electrical networks, structural engineering, chemistry, and surface reconstruction. We present two new algorithms to compute an approximate minimum cycle basis. For any integer k >= 1, we give (2k - 1)-approximation algorithms with expected running time 0(kmn(1+2/k) + mn((1+1/k)(omega-1))) and deterministic running time 0(n(3+2/k)), respectively. Here omega is the best exponent of matrix multiplication. It is presently known that omega < 2.376. Both algorithms are o(m(omega)) for dense graphs. This is the first time that any algorithm which computes sparse cycle bases with a guarantee drops below the Theta(m(omega)) bound. We also present a 2-approximation algorithm with O(m(omega) root n log n) expected running time, a linear time 2-approximation algorithm for planar graphs and an O(n(3)) time 2.42-approximation algorithm for the complete Euclidean graph in the plane.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this article we study the one-dimensional random geometric (random interval) graph when the location of the nodes are independent and exponentially distributed. We derive exact results and limit theorems for the connectivity and other properties associated with this random graph. We show that the asymptotic properties of a graph with a truncated exponential distribution can be obtained using the exponential random geometric graph. © 2007 Wiley Periodicals, Inc. Random Struct. Alg., 2008.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The max-coloring problem is to compute a legal coloring of the vertices of a graph G = (V, E) with a non-negative weight function w on V such that Sigma(k)(i=1) max(v epsilon Ci) w(v(i)) is minimized, where C-1, ... , C-k are the various color classes. Max-coloring general graphs is as hard as the classical vertex coloring problem, a special case where vertices have unit weight. In fact, in some cases it can even be harder: for example, no polynomial time algorithm is known for max-coloring trees. In this paper we consider the problem of max-coloring paths and its generalization, max-coloring abroad class of trees and show it can be solved in time O(vertical bar V vertical bar+time for sorting the vertex weights). When vertex weights belong to R, we show a matching lower bound of Omega(vertical bar V vertical bar log vertical bar V vertical bar) in the algebraic computation tree model.