851 results for Graph mining
Abstract:
Mycobacterium tuberculosis owes its high pathogenic potential to its ability to evade host immune responses and thrive inside the macrophage. The outcome of infection is largely determined by the cellular response, which comprises a multitude of molecular events. The complexity and inter-relatedness of these processes make it essential to adopt systems approaches to study them. In this work, we construct a comprehensive network of infection-related processes in a human macrophage comprising 1888 proteins and 14,016 interactions. We then compute response networks based on available gene expression profiles corresponding to states of health, disease and drug treatment. We use a novel formulation for mining response networks that has led to identifying the highest activities in the cell. The highest-activity paths provide mechanistic insights into pathogenesis and response to treatment. The approach used here serves as a generic framework for mining dynamic changes in genome-scale protein interaction networks.
Abstract:
Entropy is a fundamental thermodynamic property that has attracted wide attention across domains, including chemistry. Inference of the entropy of chemical compounds using various approaches has been a widely studied topic. However, many aspects of entropy in chemical compounds remain unexplained. In the present work, we propose two new information-theoretical molecular descriptors for the prediction of the gas phase thermal entropy of organic compounds. The descriptors reflect the bulk and size of the compounds as well as the gross topological symmetry in their structures, all of which are believed to determine entropy. A high correlation () between the entropy values and our information-theoretical indices has been found, and the predicted entropy values, obtained from the corresponding statistically significant regression model, have been found to be within acceptable approximation. We provide an additional mathematical result, in the form of a theorem and proof, that might further help in assessing changes in gas phase thermal entropy values with changes in molecular structure. The proposed information-theoretical molecular descriptors, regression model and mathematical result are expected to augment predictions of gas phase thermal entropy for a large number of chemical compounds.
Abstract:
Precise pointer analysis is a problem of interest to both the compiler and the program verification communities. Flow-sensitivity is an important dimension of pointer analysis that affects the precision of the final result. Scaling flow-sensitive pointer analysis to millions of lines of code is a major challenge. Recently, staged flow-sensitive pointer analysis has been proposed, which exploits a sparse representation of program code created by staged analysis. In this paper we formulate staged flow-sensitive pointer analysis as a graph-rewriting problem. Graph-rewriting has already been used for flow-insensitive analysis. However, formulating flow-sensitive pointer analysis as a graph-rewriting problem poses additional challenges due to the nature of flow-sensitivity. We implement our parallel algorithm using Intel Threading Building Blocks and demonstrate considerable scaling (up to 2.6x) for 8 threads on a set of 10 benchmarks. Compared to the sequential implementation of staged flow-sensitive analysis, a single-threaded execution of our implementation performs better on 8 of the benchmarks.
Abstract:
In today's API-rich world, programmer productivity depends heavily on the programmer's ability to discover the required APIs. In this paper, we present a technique and tool, called MATHFINDER, to discover APIs for mathematical computations by mining unit tests of API methods. Given a math expression, MATHFINDER synthesizes pseudo-code to compute the expression by mapping its subexpressions to API method calls. For each subexpression, MATHFINDER searches for a method such that there is a mapping between method inputs and variables of the subexpression. The subexpression, when evaluated on the test inputs of the method under this mapping, should produce results that match the method output on a large number of tests. We implemented MATHFINDER as an Eclipse plugin for discovery of third-party Java APIs and performed a user study to evaluate its effectiveness. In the study, the use of MATHFINDER resulted in a 2x improvement in programmer productivity. In 96% of the subexpressions queried for in the study, MATHFINDER retrieved the desired API methods as the top-most result. The top-most pseudo-code snippet to implement the entire expression was correct in 93% of the cases. Since the number of methods and unit tests to mine could be large in practice, we also implement MATHFINDER in a MapReduce framework and evaluate its scalability and response time.
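The matching step described above can be illustrated with a minimal sketch. It assumes a hypothetical table of unit tests (`UNIT_TESTS`, with made-up method names under `MathLib`) and represents a subexpression as a Python callable; none of these names come from MATHFINDER itself. For each method, the sketch tries every mapping of method inputs to expression variables and keeps the methods whose test outputs the expression reproduces on a large fraction of tests.

```python
import math
from itertools import permutations

# Hypothetical unit-test records (not from the paper):
# method name -> list of (test inputs, expected output).
UNIT_TESTS = {
    "MathLib.hypot": [((3.0, 4.0), 5.0), ((5.0, 12.0), 13.0),
                      ((1.0, 1.0), math.sqrt(2.0))],
    "MathLib.pow": [((2.0, 3.0), 8.0), ((3.0, 2.0), 9.0),
                    ((2.0, 0.5), math.sqrt(2.0))],
    "MathLib.atan2": [((1.0, 1.0), math.pi / 4), ((1.0, 0.0), math.pi / 2)],
}


def match_score(expr, arity, tests, mapping):
    """Fraction of tests on which expr reproduces the method's output.

    mapping[i] is the index of the method input bound to the i-th variable.
    """
    hits = 0
    for inputs, expected in tests:
        if len(inputs) != arity:
            continue  # arity mismatch counts as a miss
        args = [inputs[i] for i in mapping]
        try:
            value = expr(*args)
        except (ValueError, ZeroDivisionError, OverflowError):
            continue  # expression undefined on this test input
        if math.isclose(value, expected, rel_tol=1e-9, abs_tol=1e-12):
            hits += 1
    return hits / len(tests) if tests else 0.0


def find_methods(expr, arity, unit_tests, threshold=0.8):
    """Return (score, method, mapping) triples, best score first."""
    results = []
    for name, tests in unit_tests.items():
        score, mapping = max(
            (match_score(expr, arity, tests, m), m)
            for m in permutations(range(arity))
        )
        if score >= threshold:
            results.append((score, name, mapping))
    results.sort(key=lambda r: (-r[0], r[1]))
    return results


if __name__ == "__main__":
    # Query the subexpression x^y; only the power method should match.
    print(find_methods(lambda x, y: x ** y, 2, UNIT_TESTS))
```

The threshold of 0.8 is an arbitrary choice for this sketch; the abstract only says the results should match "on a large number of tests".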
Abstract:
We show that every graph of maximum degree 3 can be represented as the intersection graph of axis parallel boxes in three dimensions, that is, every vertex can be mapped to an axis parallel box such that two boxes intersect if and only if their corresponding vertices are adjacent. In fact, we construct a representation in which any two intersecting boxes touch just at their boundaries.
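To make the definition concrete, the following sketch checks a hand-made box representation of one small graph of maximum degree 3, the claw K_{1,3}. The coordinates in `CLAW_BOXES` are illustrative assumptions chosen for this example and are not the construction from the paper. Boxes are closed, so two boxes intersect exactly when their intervals overlap on every axis.

```python
from itertools import combinations

# Each box is a tuple of three closed intervals (lo, hi), one per axis.
# Hypothetical representation of the claw: centre "c" with leaves x, y, z.
CLAW_BOXES = {
    "c": ((0.0, 1.0), (0.0, 1.0), (0.0, 1.0)),
    "x": ((1.0, 2.0), (0.25, 0.75), (0.25, 0.75)),
    "y": ((0.25, 0.75), (1.0, 2.0), (0.25, 0.75)),
    "z": ((0.25, 0.75), (0.25, 0.75), (1.0, 2.0)),
}


def boxes_intersect(a, b):
    """Closed boxes intersect iff their intervals overlap on every axis."""
    return all(lo1 <= hi2 and lo2 <= hi1
               for (lo1, hi1), (lo2, hi2) in zip(a, b))


def touch_only_at_boundary(a, b):
    """True if the boxes intersect but their interiors are disjoint."""
    interiors_disjoint = any(hi1 <= lo2 or hi2 <= lo1
                             for (lo1, hi1), (lo2, hi2) in zip(a, b))
    return boxes_intersect(a, b) and interiors_disjoint


def intersection_graph(boxes):
    """Edge set of the intersection graph of a dict of named boxes."""
    return {frozenset((u, v)) for u, v in combinations(boxes, 2)
            if boxes_intersect(boxes[u], boxes[v])}


if __name__ == "__main__":
    print(sorted(tuple(sorted(e)) for e in intersection_graph(CLAW_BOXES)))
```

The leaves are shrunk to the middle of each face so that no two of them meet along an edge of the central cube.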
Abstract:
We apply the objective method of Aldous to the problem of finding the minimum-cost edge cover of the complete graph with random independent and identically distributed edge costs. The limit, as the number of vertices goes to infinity, of the expected minimum cost for this problem is known via a combinatorial approach of Hessler and Wastlund. We provide a proof of this result using the machinery of the objective method and local weak convergence, which was used to prove the zeta(2) limit of the random assignment problem. A proof via the objective method is useful because it provides us with more information on the nature of the edges incident on a typical root in the minimum-cost edge cover. We further show that a belief propagation algorithm converges asymptotically to the optimal solution. This can be applied to a computational linguistics problem of semantic projection. The belief propagation algorithm yields a near-optimal solution with lower complexity than the best known algorithms designed for optimality in worst-case settings.
Abstract:
The rainbow connection number, rc(G), of a connected graph G is the minimum number of colors needed to color its edges so that every pair of vertices is connected by at least one path in which no two edges are colored the same (note that the coloring need not be proper). In this paper we study the rainbow connection number with respect to three important graph product operations (namely the Cartesian product, the lexicographic product and the strong product) and the operation of taking the power of a graph. In this direction, we show that if G is a graph obtained by applying any of the operations mentioned above on non-trivial graphs, then rc(G) <= 2r(G) + c, where r(G) denotes the radius of G and c is a constant. In general, the rainbow connection number of a bridgeless graph can be as high as the square of its radius [1]. This is an attempt to identify some graph classes which have rainbow connection number very close to the obvious lower bound of the diameter (and thus the radius). The bounds reported are tight up to additive constants. The proofs are constructive and hence yield polynomial-time approximation algorithms.
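The definition of rc(G) can be checked by brute force on very small graphs. The sketch below is only an exhaustive illustration of the definition, with exponential running time; it is not the constructive algorithm from the paper, and the function names are made up for this example. It assumes a connected graph with at least one edge.

```python
from itertools import combinations, product


def has_rainbow_path(adj, color, src, dst):
    """Depth-first search for a path from src to dst with no repeated color."""
    def dfs(v, visited, used):
        if v == dst:
            return True
        for w in adj[v]:
            c = color[frozenset((v, w))]
            if w not in visited and c not in used:
                if dfs(w, visited | {w}, used | {c}):
                    return True
        return False
    return dfs(src, {src}, set())


def is_rainbow_connected(edges, coloring):
    """coloring[i] is the color of edges[i]; check every pair of vertices."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    color = {frozenset(e): c for e, c in zip(edges, coloring)}
    return all(has_rainbow_path(adj, color, s, t)
               for s, t in combinations(adj, 2))


def rainbow_connection_number(edges):
    """Smallest k such that some k-coloring of the edges is rainbow connected."""
    for k in range(1, len(edges) + 1):
        for coloring in product(range(k), repeat=len(edges)):
            if is_rainbow_connected(edges, coloring):
                return k


if __name__ == "__main__":
    cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
    path4 = [(0, 1), (1, 2), (2, 3)]
    print(rainbow_connection_number(cycle4), rainbow_connection_number(path4))
```

On the 4-cycle two colors suffice, while the path on four vertices needs one color per edge, since its two end vertices are joined only by the path that uses all three edges.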
Abstract:
Today's programming languages are supported by powerful third-party APIs. For a given application domain, it is common to have many competing APIs that provide similar functionality. Programmer productivity therefore depends heavily on the programmer's ability to discover suitable APIs, both during an initial coding phase and during software maintenance. The aim of this work is to support the discovery and migration of math APIs. Math APIs are at the heart of many application domains ranging from machine learning to scientific computations. Our approach, called MATHFINDER, combines executable specifications of mathematical computations with unit tests (operational specifications) of API methods. Given a math expression, MATHFINDER synthesizes pseudo-code comprised of API methods to compute the expression by mining unit tests of the API methods. We present a sequential version of our unit test mining algorithm and also design a more scalable data-parallel version. We perform extensive evaluation of MATHFINDER (1) for API discovery, where math algorithms are to be implemented from scratch and (2) for API migration, where client programs utilizing a math API are to be migrated to another API. We evaluated the precision and recall of MATHFINDER on a diverse collection of math expressions, culled from algorithms used in a wide range of application areas such as control systems and structural dynamics. In a user study to evaluate the productivity gains obtained by using MATHFINDER for API discovery, the programmers who used MATHFINDER finished their programming tasks twice as fast as their counterparts who used the usual techniques like web and code search, IDE code completion, and manual inspection of library documentation. For the problem of API migration, as a case study, we used MATHFINDER to migrate Weka, a popular machine learning library.
Overall, our evaluation shows that MATHFINDER is easy to use, provides highly precise results across several math APIs and application domains even with a small number of unit tests per method, and scales to large collections of unit tests.
Abstract:
In a complete bipartite graph with vertex sets of cardinalities n and n', assign random weights from the exponential distribution with mean 1, independently to each edge. We show that, as n -> infinity, with n' = [n/alpha] for any fixed alpha > 1, the minimum weight of many-to-one matchings converges to a constant (depending on alpha). Many-to-one matching arises as an optimization step in an algorithm for genome sequencing and as a measure of distance between finite sets. We prove that a belief propagation (BP) algorithm converges asymptotically to the optimal solution. We use the objective method of Aldous to prove our results. We build on previous works on minimum weight matching and minimum weight edge cover problems to extend the objective method and to further the applicability of belief propagation to random combinatorial optimization problems.
Abstract:
The disclosure of information and its misuse in Privacy Preserving Data Mining (PPDM) systems is a concern to the parties involved. In PPDM systems, data is available among multiple parties collaborating to achieve cumulative mining accuracy. The vertically partitioned data held by each party cannot, on its own, provide mining results as accurate as the collaborative mining results. To overcome the privacy issue in data disclosure, this paper describes a Key Distribution-Less Privacy Preserving Data Mining (KDLPPDM) system in which the local association rules generated by the parties are published. The association rules are securely combined to form the combined rule set using the Commutative RSA algorithm. The combined rule sets are then used to classify or mine the data. The results discussed in this paper compare the accuracy of the rules generated using the C4.5-based KDLPPDM system and the C5.0-based KDLPPDM system using receiver operating characteristic (ROC) curves.
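The abstract does not spell out the protocol, so the following is only a sketch of the commutative property that such schemes rely on: with a shared modulus, RSA-style exponentiation by two different keys gives the same result in either order. The primes, exponents and the integer standing in for an encoded rule are toy values assumed for illustration and are far too small to be secure.

```python
# Toy parameters (assumptions for illustration only, not secure).
P, Q = 61, 53
N = P * Q                     # shared modulus
PHI = (P - 1) * (Q - 1)       # Euler's totient of N

# Each party picks its own exponent coprime to PHI and keeps the inverse.
E_A, E_B = 17, 7
D_A = pow(E_A, -1, PHI)       # modular inverse, Python 3.8+
D_B = pow(E_B, -1, PHI)


def encrypt(message, exponent):
    """RSA-style modular exponentiation with the shared modulus."""
    return pow(message, exponent, N)


def decrypt(cipher, inverse_exponent):
    """Undo one layer of encryption."""
    return pow(cipher, inverse_exponent, N)


if __name__ == "__main__":
    rule_id = 65  # hypothetical integer encoding of an association rule
    ab = encrypt(encrypt(rule_id, E_A), E_B)
    ba = encrypt(encrypt(rule_id, E_B), E_A)
    print(ab == ba)  # the two encryption orders agree
    print(decrypt(decrypt(ab, D_B), D_A) == rule_id)
```

Because the doubly encrypted values agree regardless of order, two parties can compare or merge encrypted rules without either one seeing the other's plaintext rules.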
Abstract:
We consider a continuum percolation model consisting of two types of nodes, namely legitimate and eavesdropper nodes, distributed according to independent Poisson point processes in R^2 of intensities lambda and lambda_E, respectively. A directed edge from one legitimate node A to another legitimate node B exists provided that the strength of the signal transmitted from node A that is received at node B is higher than that received at any eavesdropper node. The strength of the signal received at a node from a legitimate node depends not only on the distance between these nodes, but also on the location of the other legitimate nodes and an interference suppression parameter gamma. The graph is said to percolate when there exists an infinite connected component. We show that for any finite intensity lambda_E of eavesdropper nodes, there exists a critical intensity lambda_c < infinity such that for all lambda > lambda_c the graph percolates for sufficiently small values of the interference parameter. Furthermore, for the subcritical regime, we show that there exists a lambda_0 such that for all lambda < lambda_0 <= lambda_c a suitable graph defined over eavesdropper node connections percolates, which precludes percolation in the graphs formed by the legitimate nodes.
Abstract:
Despite significant advances in recent years, structure-from-motion (SfM) pipelines suffer from two important drawbacks. Apart from requiring significant computational power to solve the large-scale computations involved, such pipelines sometimes fail to reconstruct correctly when the accumulated error in incremental reconstruction is large or when the number of 3D to 2D correspondences is insufficient. In this paper we present a novel approach to mitigate the above-mentioned drawbacks. Using an image match graph based on matching features, we partition the image data set into smaller sets or components which are reconstructed independently. Following such reconstructions, we utilise the available epipolar relationships that connect images across components to correctly align the individual reconstructions in a global frame of reference. This both yields a significant speed-up of at least one order of magnitude and mitigates the problem of reconstruction failures, with a marginal loss in accuracy. The effectiveness of our approach is demonstrated on some large-scale real-world data sets.
Abstract:
We give an overview of recent results and techniques in parameterized algorithms for graph modification problems.
Abstract:
The Jansen mechanism is a one degree-of-freedom, planar, 12-link leg mechanism that can be used in mobile robotic applications and in gait analysis. This paper presents the kinematics and dynamics of the Jansen leg mechanism. The forward kinematics, accomplished using the circle intersection method, determines the trajectories of various points on the mechanism in the chassis (stationary link) reference frame. From the foot point trajectory, the step length is shown to vary linearly while the step height varies non-linearly with change in crank radius. A dynamic model for the Jansen leg mechanism is proposed using the bond graph approach with modulated multiport transformers. For a given ground reaction force pattern and crank angular speed, this model helps determine the motor torque profile as well as the link and joint stresses. The model can therefore be used to rate the actuator torque and in the design of the hardware and controller for such a system. The kinematics of the mechanism can also be obtained from this dynamic model. The proposed model is thus a useful tool for analysis and design of systems based on the Jansen leg mechanism. (C) 2015 Elsevier B.V. All rights reserved.
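The circle intersection step used in planar forward kinematics can be sketched as follows. A joint connected by two rigid links to two known points lies on both circles centred at those points with the link lengths as radii. The function name and the example numbers are assumptions for illustration and are not the link dimensions of the Jansen mechanism.

```python
import math


def circle_intersections(c1, r1, c2, r2):
    """Return the two intersection points of circles (c1, r1) and (c2, r2).

    Raises ValueError when the circles do not intersect or are concentric.
    The two points coincide when the circles are tangent.
    """
    dx, dy = c2[0] - c1[0], c2[1] - c1[1]
    d = math.hypot(dx, dy)
    if d == 0 or d > r1 + r2 or d < abs(r1 - r2):
        raise ValueError("circles do not intersect in isolated points")
    # Distance from c1 to the foot of the common chord along the centre line.
    a = (r1 * r1 - r2 * r2 + d * d) / (2 * d)
    # Half-length of the chord; clamp to guard against rounding below zero.
    h = math.sqrt(max(r1 * r1 - a * a, 0.0))
    px, py = c1[0] + a * dx / d, c1[1] + a * dy / d
    # Offset perpendicular to the centre line, in both directions.
    ox, oy = -dy * h / d, dx * h / d
    return (px + ox, py + oy), (px - ox, py - oy)


if __name__ == "__main__":
    # Hypothetical two-link joint: pivots at (0, 0) and (6, 0), links of length 5.
    upper, lower = circle_intersections((0.0, 0.0), 5.0, (6.0, 0.0), 5.0)
    print(upper, lower)
```

In a linkage solver, one of the two returned points is selected by the assembly configuration, and the step is repeated joint by joint as the crank angle advances.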
Abstract:
Graph algorithms have been shown to possess enough parallelism to keep several computing resources busy, even hundreds of cores on a GPU. Unfortunately, tuning their implementation for efficient execution on a particular hardware configuration of heterogeneous systems consisting of multicore CPUs and GPUs is challenging, time consuming, and error prone. To address these issues, we propose a domain-specific language (DSL), Falcon, for implementing graph algorithms that (i) abstracts the hardware, (ii) provides constructs to write explicitly parallel programs at a higher level, and (iii) can work with general algorithms that may change the graph structure (morph algorithms). We illustrate the usage of our DSL to implement local computation algorithms (that do not change the graph structure) and morph algorithms such as Delaunay mesh refinement, survey propagation, and dynamic SSSP on GPU and multicore CPUs. Using a set of benchmark graphs, we illustrate that the generated code performs close to the state-of-the-art hand-tuned implementations.