3 resultados para Random graph
em DRUM (Digital Repository at the University of Maryland)
Resumo:
In this dissertation, we apply mathematical programming techniques (i.e., integer programming and polyhedral combinatorics) to develop exact approaches for influence maximization on social networks. We study four combinatorial optimization problems that deal with maximizing influence at minimum cost over a social network. To our knowl- edge, all previous work to date involving influence maximization problems has focused on heuristics and approximation. We start with the following viral marketing problem that has attracted a significant amount of interest from the computer science literature. Given a social network, find a target set of customers to seed with a product. Then, a cascade will be caused by these initial adopters and other people start to adopt this product due to the influence they re- ceive from earlier adopters. The idea is to find the minimum cost that results in the entire network adopting the product. We first study a problem called the Weighted Target Set Selection (WTSS) Prob- lem. In the WTSS problem, the diffusion can take place over as many time periods as needed and a free product is given out to the individuals in the target set. Restricting the number of time periods that the diffusion takes place over to be one, we obtain a problem called the Positive Influence Dominating Set (PIDS) problem. Next, incorporating partial incentives, we consider a problem called the Least Cost Influence Problem (LCIP). The fourth problem studied is the One Time Period Least Cost Influence Problem (1TPLCIP) which is identical to the LCIP except that we restrict the number of time periods that the diffusion takes place over to be one. We apply a common research paradigm to each of these four problems. First, we work on special graphs: trees and cycles. Based on the insights we obtain from special graphs, we develop efficient methods for general graphs. On trees, first, we propose a polynomial time algorithm. More importantly, we present a tight and compact extended formulation. We also project the extended formulation onto the space of the natural vari- ables that gives the polytope on trees. Next, building upon the result for trees---we derive the polytope on cycles for the WTSS problem; as well as a polynomial time algorithm on cycles. This leads to our contribution on general graphs. For the WTSS problem and the LCIP, using the observation that the influence propagation network must be a directed acyclic graph (DAG), the strong formulation for trees can be embedded into a formulation on general graphs. We use this to design and implement a branch-and-cut approach for the WTSS problem and the LCIP. In our computational study, we are able to obtain high quality solutions for random graph instances with up to 10,000 nodes and 20,000 edges (40,000 arcs) within a reasonable amount of time.
Resumo:
In the past decade, systems that extract information from millions of Internet documents have become commonplace. Knowledge graphs -- structured knowledge bases that describe entities, their attributes and the relationships between them -- are a powerful tool for understanding and organizing this vast amount of information. However, a significant obstacle to knowledge graph construction is the unreliability of the extracted information, due to noise and ambiguity in the underlying data or errors made by the extraction system and the complexity of reasoning about the dependencies between these noisy extractions. My dissertation addresses these challenges by exploiting the interdependencies between facts to improve the quality of the knowledge graph in a scalable framework. I introduce a new approach called knowledge graph identification (KGI), which resolves the entities, attributes and relationships in the knowledge graph by incorporating uncertain extractions from multiple sources, entity co-references, and ontological constraints. I define a probability distribution over possible knowledge graphs and infer the most probable knowledge graph using a combination of probabilistic and logical reasoning. Such probabilistic models are frequently dismissed due to scalability concerns, but my implementation of KGI maintains tractable performance on large problems through the use of hinge-loss Markov random fields, which have a convex inference objective. This allows the inference of large knowledge graphs using 4M facts and 20M ground constraints in 2 hours. To further scale the solution, I develop a distributed approach to the KGI problem which runs in parallel across multiple machines, reducing inference time by 90%. Finally, I extend my model to the streaming setting, where a knowledge graph is continuously updated by incorporating newly extracted facts. I devise a general approach for approximately updating inference in convex probabilistic models, and quantify the approximation error by defining and bounding inference regret for online models. Together, my work retains the attractive features of probabilistic models while providing the scalability necessary for large-scale knowledge graph construction. These models have been applied on a number of real-world knowledge graph projects, including the NELL project at Carnegie Mellon and the Google Knowledge Graph.
Resumo:
Graphs are powerful tools to describe social, technological and biological networks, with nodes representing agents (people, websites, gene, etc.) and edges (or links) representing relations (or interactions) between agents. Examples of real-world networks include social networks, the World Wide Web, collaboration networks, protein networks, etc. Researchers often model these networks as random graphs. In this dissertation, we study a recently introduced social network model, named the Multiplicative Attribute Graph model (MAG), which takes into account the randomness of nodal attributes in the process of link formation (i.e., the probability of a link existing between two nodes depends on their attributes). Kim and Lesckovec, who defined the model, have claimed that this model exhibit some of the properties a real world social network is expected to have. Focusing on a homogeneous version of this model, we investigate the existence of zero-one laws for graph properties, e.g., the absence of isolated nodes, graph connectivity and the emergence of triangles. We obtain conditions on the parameters of the model, so that these properties occur with high or vanishingly probability as the number of nodes becomes unboundedly large. In that regime, we also investigate the property of triadic closure and the nodal degree distribution.