994 resultados para Graph construction
Resumo:
In the past decade, systems that extract information from millions of Internet documents have become commonplace. Knowledge graphs -- structured knowledge bases that describe entities, their attributes and the relationships between them -- are a powerful tool for understanding and organizing this vast amount of information. However, a significant obstacle to knowledge graph construction is the unreliability of the extracted information, due to noise and ambiguity in the underlying data or errors made by the extraction system and the complexity of reasoning about the dependencies between these noisy extractions. My dissertation addresses these challenges by exploiting the interdependencies between facts to improve the quality of the knowledge graph in a scalable framework. I introduce a new approach called knowledge graph identification (KGI), which resolves the entities, attributes and relationships in the knowledge graph by incorporating uncertain extractions from multiple sources, entity co-references, and ontological constraints. I define a probability distribution over possible knowledge graphs and infer the most probable knowledge graph using a combination of probabilistic and logical reasoning. Such probabilistic models are frequently dismissed due to scalability concerns, but my implementation of KGI maintains tractable performance on large problems through the use of hinge-loss Markov random fields, which have a convex inference objective. This allows the inference of large knowledge graphs using 4M facts and 20M ground constraints in 2 hours. To further scale the solution, I develop a distributed approach to the KGI problem which runs in parallel across multiple machines, reducing inference time by 90%. Finally, I extend my model to the streaming setting, where a knowledge graph is continuously updated by incorporating newly extracted facts. I devise a general approach for approximately updating inference in convex probabilistic models, and quantify the approximation error by defining and bounding inference regret for online models. Together, my work retains the attractive features of probabilistic models while providing the scalability necessary for large-scale knowledge graph construction. These models have been applied on a number of real-world knowledge graph projects, including the NELL project at Carnegie Mellon and the Google Knowledge Graph.
Resumo:
The Reeb graph tracks topology changes in level sets of a scalar function and finds applications in scientific visualization and geometric modeling. This paper describes a near-optimal two-step algorithm that constructs the Reeb graph of a Morse function defined over manifolds in any dimension. The algorithm first identifies the critical points of the input manifold, and then connects these critical points in the second step to obtain the Reeb graph. A simplification mechanism based on topological persistence aids in the removal of noise and unimportant features. A radial layout scheme results in a feature-directed drawing of the Reeb graph. Experimental results demonstrate the efficiency of the Reeb graph construction in practice and its applications.
Resumo:
We present a novel approach to goal recognition based on a two-stage paradigm of graph construction and analysis. First, a graph structure called a Goal Graph is constructed to represent the observed actions, the state of the world, and the achieved goals as well as various connections between these nodes at consecutive time steps. Then, the Goal Graph is analysed at each time step to recognise those partially or fully achieved goals that are consistent with the actions observed so far. The Goal Graph analysis also reveals valid plans for the recognised goals or part of these goals. Our approach to goal recognition does not need a plan library. It does not suffer from the problems in the acquisition and hand-coding of large plan libraries, neither does it have the problems in searching the plan space of exponential size. We describe two algorithms for Goal Graph construction and analysis in this paradigm. These algorithms are both provably sound, polynomial-time, and polynomial-space. The number of goals recognised by our algorithms is usually very small after a sequence of observed actions has been processed. Thus the sequence of observed actions is well explained by the recognised goals with little ambiguity. We have evaluated these algorithms in the UNIX domain, in which excellent performance has been achieved in terms of accuracy, efficiency, and scalability.
Resumo:
Thesis (Ph.D.)--University of Washington, 2016-06
Resumo:
An elementary combinatorial Tanner graph construction for a family of near-regular low density parity check (LDPC) codes achieving high girth is presented. These codes are near regular in the sense that the degree of a left/right vertex is allowed to differ by at most one from the average. The construction yields in quadratic time complexity an asymptotic code family with provable lower bounds on the rate and the girth for a given choice of block length and average degree. The construction gives flexibility in the choice of design parameters of the code like rate, girth and average degree. Performance simulations of iterative decoding algorithm for the AWGN channel on codes designed using the method demonstrate that these codes perform better than regular PEG codes and MacKay codes of similar length for all values of Signal to noise ratio.
Resumo:
Pós-graduação em Matemática em Rede Nacional - IBILCE
Resumo:
This thesis proposes a novel graphical model for inference called the Affinity Network,which displays the closeness between pairs of variables and is an alternative to Bayesian Networks and Dependency Networks. The Affinity Network shares some similarities with Bayesian Networks and Dependency Networks but avoids their heuristic and stochastic graph construction algorithms by using a message passing scheme. A comparison with the above two instances of graphical models is given for sparse discrete and continuous medical data and data taken from the UCI machine learning repository. The experimental study reveals that the Affinity Network graphs tend to be more accurate on the basis of an exhaustive search with the small datasets. Moreover, the graph construction algorithm is faster than the other two methods with huge datasets. The Affinity Network is also applied to data produced by a synchronised system. A detailed analysis and numerical investigation into this dynamical system is provided and it is shown that the Affinity Network can be used to characterise its emergent behaviour even in the presence of noise.
Resumo:
The purpose of this paper is to describe a new decomposition construction for perfect secret sharing schemes with graph access structures. The previous decomposition construction proposed by Stinson is a recursive method that uses small secret sharing schemes as building blocks in the construction of larger schemes. When the Stinson method is applied to the graph access structures, the number of such “small” schemes is typically exponential in the number of the participants, resulting in an exponential algorithm. Our method has the same flavor as the Stinson decomposition construction; however, the linear programming problem involved in the construction is formulated in such a way that the number of “small” schemes is polynomial in the size of the participants, which in turn gives rise to a polynomial time construction. We also show that if we apply the Stinson construction to the “small” schemes arising from our new construction, both have the same information rate.
Resumo:
We show the first deterministic construction of an unconditionally secure multiparty computation (MPC) protocol in the passive adversarial model over black-box non-Abelian groups which is both optimal (secure against an adversary who possesses any t
Resumo:
In this paper, we present a random iterative graph based hyper-heuristic to produce a collection of heuristic sequences to construct solutions of different quality. These heuristic sequences can be seen as dynamic hybridisations of different graph colouring heuristics that construct solutions step by step. Based on these sequences, we statistically analyse the way in which graph colouring heuristics are automatically hybridised. This, to our knowledge, represents a new direction in hyper-heuristic research. It is observed that spending the search effort on hybridising Largest Weighted Degree with Saturation Degree at the early stage of solution construction tends to generate high quality solutions. Based on these observations, an iterative hybrid approach is developed to adaptively hybridise these two graph colouring heuristics at different stages of solution construction. The overall aim here is to automate the heuristic design process, which draws upon an emerging research theme on developing computer methods to design and adapt heuristics automatically. Experimental results on benchmark exam timetabling and graph colouring problems demonstrate the effectiveness and generality of this adaptive hybrid approach compared with previous methods on automatically generating and adapting heuristics. Indeed, we also show that the approach is competitive with the state of the art human produced methods.
Resumo:
A complex network is an abstract representation of an intricate system of interrelated elements where the patterns of connection hold significant meaning. One particular complex network is a social network whereby the vertices represent people and edges denote their daily interactions. Understanding social network dynamics can be vital to the mitigation of disease spread as these networks model the interactions, and thus avenues of spread, between individuals. To better understand complex networks, algorithms which generate graphs exhibiting observed properties of real-world networks, known as graph models, are often constructed. While various efforts to aid with the construction of graph models have been proposed using statistical and probabilistic methods, genetic programming (GP) has only recently been considered. However, determining that a graph model of a complex network accurately describes the target network(s) is not a trivial task as the graph models are often stochastic in nature and the notion of similarity is dependent upon the expected behavior of the network. This thesis examines a number of well-known network properties to determine which measures best allowed networks generated by different graph models, and thus the models themselves, to be distinguished. A proposed meta-analysis procedure was used to demonstrate how these network measures interact when used together as classifiers to determine network, and thus model, (dis)similarity. The analytical results form the basis of the fitness evaluation for a GP system used to automatically construct graph models for complex networks. The GP-based automatic inference system was used to reproduce existing, well-known graph models as well as a real-world network. Results indicated that the automatically inferred models exemplified functional similarity when compared to their respective target networks. This approach also showed promise when used to infer a model for a mammalian brain network.
Resumo:
The ability of automatic graphic user interface construction is described. It is based on the building of user interface as reflection of the data domain logical definition. The submitted approach to development of the information system user interface enables dynamic adaptation of the system during their operation. This approach is used for creation of information systems based on CASE-system METAS.
Resumo:
This dissertation introduces a new approach for assessing the effects of pediatric epilepsy on the language connectome. Two novel data-driven network construction approaches are presented. These methods rely on connecting different brain regions using either extent or intensity of language related activations as identified by independent component analysis of fMRI data. An auditory description decision task (ADDT) paradigm was used to activate the language network for 29 patients and 30 controls recruited from three major pediatric hospitals. Empirical evaluations illustrated that pediatric epilepsy can cause, or is associated with, a network efficiency reduction. Patients showed a propensity to inefficiently employ the whole brain network to perform the ADDT language task; on the contrary, controls seemed to efficiently use smaller segregated network components to achieve the same task. To explain the causes of the decreased efficiency, graph theoretical analysis was carried out. The analysis revealed no substantial global network feature differences between the patient and control groups. It also showed that for both subject groups the language network exhibited small-world characteristics; however, the patient's extent of activation network showed a tendency towards more random networks. It was also shown that the intensity of activation network displayed ipsilateral hub reorganization on the local level. The left hemispheric hubs displayed greater centrality values for patients, whereas the right hemispheric hubs displayed greater centrality values for controls. This hub hemispheric disparity was not correlated with a right atypical language laterality found in six patients. Finally it was shown that a multi-level unsupervised clustering scheme based on self-organizing maps, a type of artificial neural network, and k-means was able to fairly and blindly separate the subjects into their respective patient or control groups. The clustering was initiated using the local nodal centrality measurements only. Compared to the extent of activation network, the intensity of activation network clustering demonstrated better precision. This outcome supports the assertion that the local centrality differences presented by the intensity of activation network can be associated with focal epilepsy.^
Resumo:
This dissertation introduces a new approach for assessing the effects of pediatric epilepsy on the language connectome. Two novel data-driven network construction approaches are presented. These methods rely on connecting different brain regions using either extent or intensity of language related activations as identified by independent component analysis of fMRI data. An auditory description decision task (ADDT) paradigm was used to activate the language network for 29 patients and 30 controls recruited from three major pediatric hospitals. Empirical evaluations illustrated that pediatric epilepsy can cause, or is associated with, a network efficiency reduction. Patients showed a propensity to inefficiently employ the whole brain network to perform the ADDT language task; on the contrary, controls seemed to efficiently use smaller segregated network components to achieve the same task. To explain the causes of the decreased efficiency, graph theoretical analysis was carried out. The analysis revealed no substantial global network feature differences between the patient and control groups. It also showed that for both subject groups the language network exhibited small-world characteristics; however, the patient’s extent of activation network showed a tendency towards more random networks. It was also shown that the intensity of activation network displayed ipsilateral hub reorganization on the local level. The left hemispheric hubs displayed greater centrality values for patients, whereas the right hemispheric hubs displayed greater centrality values for controls. This hub hemispheric disparity was not correlated with a right atypical language laterality found in six patients. Finally it was shown that a multi-level unsupervised clustering scheme based on self-organizing maps, a type of artificial neural network, and k-means was able to fairly and blindly separate the subjects into their respective patient or control groups. The clustering was initiated using the local nodal centrality measurements only. Compared to the extent of activation network, the intensity of activation network clustering demonstrated better precision. This outcome supports the assertion that the local centrality differences presented by the intensity of activation network can be associated with focal epilepsy.