3 resultados para Most probable number

em DRUM (Digital Repository at the University of Maryland)


Relevância:

90.00% 90.00%

Publicador:

Resumo:

In the past decade, systems that extract information from millions of Internet documents have become commonplace. Knowledge graphs -- structured knowledge bases that describe entities, their attributes and the relationships between them -- are a powerful tool for understanding and organizing this vast amount of information. However, a significant obstacle to knowledge graph construction is the unreliability of the extracted information, due to noise and ambiguity in the underlying data or errors made by the extraction system and the complexity of reasoning about the dependencies between these noisy extractions. My dissertation addresses these challenges by exploiting the interdependencies between facts to improve the quality of the knowledge graph in a scalable framework. I introduce a new approach called knowledge graph identification (KGI), which resolves the entities, attributes and relationships in the knowledge graph by incorporating uncertain extractions from multiple sources, entity co-references, and ontological constraints. I define a probability distribution over possible knowledge graphs and infer the most probable knowledge graph using a combination of probabilistic and logical reasoning. Such probabilistic models are frequently dismissed due to scalability concerns, but my implementation of KGI maintains tractable performance on large problems through the use of hinge-loss Markov random fields, which have a convex inference objective. This allows the inference of large knowledge graphs using 4M facts and 20M ground constraints in 2 hours. To further scale the solution, I develop a distributed approach to the KGI problem which runs in parallel across multiple machines, reducing inference time by 90%. Finally, I extend my model to the streaming setting, where a knowledge graph is continuously updated by incorporating newly extracted facts. I devise a general approach for approximately updating inference in convex probabilistic models, and quantify the approximation error by defining and bounding inference regret for online models. Together, my work retains the attractive features of probabilistic models while providing the scalability necessary for large-scale knowledge graph construction. These models have been applied on a number of real-world knowledge graph projects, including the NELL project at Carnegie Mellon and the Google Knowledge Graph.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Edge-labeled graphs have proliferated rapidly over the last decade due to the increased popularity of social networks and the Semantic Web. In social networks, relationships between people are represented by edges and each edge is labeled with a semantic annotation. Hence, a huge single graph can express many different relationships between entities. The Semantic Web represents each single fragment of knowledge as a triple (subject, predicate, object), which is conceptually identical to an edge from subject to object labeled with predicates. A set of triples constitutes an edge-labeled graph on which knowledge inference is performed. Subgraph matching has been extensively used as a query language for patterns in the context of edge-labeled graphs. For example, in social networks, users can specify a subgraph matching query to find all people that have certain neighborhood relationships. Heavily used fragments of the SPARQL query language for the Semantic Web and graph queries of other graph DBMS can also be viewed as subgraph matching over large graphs. Though subgraph matching has been extensively studied as a query paradigm in the Semantic Web and in social networks, a user can get a large number of answers in response to a query. These answers can be shown to the user in accordance with an importance ranking. In this thesis proposal, we present four different scoring models along with scalable algorithms to find the top-k answers via a suite of intelligent pruning techniques. The suggested models consist of a practically important subset of the SPARQL query language augmented with some additional useful features. The first model called Substitution Importance Query (SIQ) identifies the top-k answers whose scores are calculated from matched vertices' properties in each answer in accordance with a user-specified notion of importance. The second model called Vertex Importance Query (VIQ) identifies important vertices in accordance with a user-defined scoring method that builds on top of various subgraphs articulated by the user. Approximate Importance Query (AIQ), our third model, allows partial and inexact matchings and returns top-k of them with a user-specified approximation terms and scoring functions. In the fourth model called Probabilistic Importance Query (PIQ), a query consists of several sub-blocks: one mandatory block that must be mapped and other blocks that can be opportunistically mapped. The probability is calculated from various aspects of answers such as the number of mapped blocks, vertices' properties in each block and so on and the most top-k probable answers are returned. An important distinguishing feature of our work is that we allow the user a huge amount of freedom in specifying: (i) what pattern and approximation he considers important, (ii) how to score answers - irrespective of whether they are vertices or substitution, and (iii) how to combine and aggregate scores generated by multiple patterns and/or multiple substitutions. Because so much power is given to the user, indexing is more challenging than in situations where additional restrictions are imposed on the queries the user can ask. The proposed algorithms for the first model can also be used for answering SPARQL queries with ORDER BY and LIMIT, and the method for the second model also works for SPARQL queries with GROUP BY, ORDER BY and LIMIT. We test our algorithms on multiple real-world graph databases, showing that our algorithms are far more efficient than popular triple stores.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The current study examined the frequency and quality of how 3- to 4-year-old children and their parents explore the relations between symbolic and non-symbolic quantities in the context of a playful math experience, as well as the role of both parent and child factors in this exploration. Preschool children’s numerical knowledge was assessed while parents completed a survey about the number-related experiences they share with their children at home, and their math-related beliefs. Parent-child dyads were then videotaped playing a modified version of the card game War. Results suggest that parents and children explored quantity explicitly on only half of the cards and card pairs played, and dyads of young children and those with lower number knowledge tended to be most explicit in their quantity exploration. Dyads with older children, on the other hand, often completed their turns without discussing the numbers at all, likely because they were knowledgeable enough about numbers that they could move through the game with ease. However, when dyads did explore the quantities explicitly, they focused on identifying numbers symbolically, used non-symbolic card information interchangeably with symbolic information to make the quantity comparison judgments, and in some instances, emphasized the connection between the symbolic and non-symbolic number representations on the cards. Parents reported that math experiences such as card game play and quantity comparison occurred relatively infrequently at home compared to activities geared towards more foundational practice of number, such as counting out loud and naming numbers. However, parental beliefs were important in predicting both the frequency of at-home math engagement as well as the quality of these experiences. In particular, parents’ specific beliefs about their children’s abilities and interests were associated with the frequency of home math activities, while parents’ math-related ability beliefs and values along with children’s engagement in the card game were associated with the quality of dyads’ number exploration during the card game. Taken together, these findings suggest that card games can be an engaging context for parent-preschooler exploration of numbers in multiple representations, and suggests that parents’ beliefs and children’s level of engagement are important predictors of this exploration.