991 resultados para Graph Models
Resumo:
Understanding a complex network's structure holds the key to understanding its function. The physics community has contributed a multitude of methods and analyses to this cross-disciplinary endeavor. Structural features exist on both the microscopic level, resulting from differences between single node properties, and the mesoscopic level resulting from properties shared by groups of nodes. Disentangling the determinants of network structure on these different scales has remained a major, and so far unsolved, challenge. Here we show how multiscale generative probabilistic exponential random graph models combined with efficient, distributive message-passing inference techniques can be used to achieve this separation of scales, leading to improved detection accuracy of latent classes as demonstrated on benchmark problems. It sheds new light on the statistical significance of motif-distributions in neural networks and improves the link-prediction accuracy as exemplified for gene-disease associations in the highly consequential Online Mendelian Inheritance in Man database. © 2011 Reichardt et al.
Resumo:
This paper considers the problem of finding an optimal deployment of information resources on an InfoStation network in order to minimize the overhead and reduce the time needed to satisfy user requests for resources. The RG-Optimization problem and an approach for its solving are presented as well.
Resumo:
In the past decade, systems that extract information from millions of Internet documents have become commonplace. Knowledge graphs -- structured knowledge bases that describe entities, their attributes and the relationships between them -- are a powerful tool for understanding and organizing this vast amount of information. However, a significant obstacle to knowledge graph construction is the unreliability of the extracted information, due to noise and ambiguity in the underlying data or errors made by the extraction system and the complexity of reasoning about the dependencies between these noisy extractions. My dissertation addresses these challenges by exploiting the interdependencies between facts to improve the quality of the knowledge graph in a scalable framework. I introduce a new approach called knowledge graph identification (KGI), which resolves the entities, attributes and relationships in the knowledge graph by incorporating uncertain extractions from multiple sources, entity co-references, and ontological constraints. I define a probability distribution over possible knowledge graphs and infer the most probable knowledge graph using a combination of probabilistic and logical reasoning. Such probabilistic models are frequently dismissed due to scalability concerns, but my implementation of KGI maintains tractable performance on large problems through the use of hinge-loss Markov random fields, which have a convex inference objective. This allows the inference of large knowledge graphs using 4M facts and 20M ground constraints in 2 hours. To further scale the solution, I develop a distributed approach to the KGI problem which runs in parallel across multiple machines, reducing inference time by 90%. Finally, I extend my model to the streaming setting, where a knowledge graph is continuously updated by incorporating newly extracted facts. I devise a general approach for approximately updating inference in convex probabilistic models, and quantify the approximation error by defining and bounding inference regret for online models. Together, my work retains the attractive features of probabilistic models while providing the scalability necessary for large-scale knowledge graph construction. These models have been applied on a number of real-world knowledge graph projects, including the NELL project at Carnegie Mellon and the Google Knowledge Graph.
Resumo:
In questo elaborato ci siamo occupati della legge di Zipf sia da un punto di vista applicativo che teorico. Tale legge empirica afferma che il rango in frequenza (RF) delle parole di un testo seguono una legge a potenza con esponente -1. Per quanto riguarda l'approccio teorico abbiamo trattato due classi di modelli in grado di ricreare leggi a potenza nella loro distribuzione di probabilità. In particolare, abbiamo considerato delle generalizzazioni delle urne di Polya e i processi SSR (Sample Space Reducing). Di questi ultimi abbiamo dato una formalizzazione in termini di catene di Markov. Infine abbiamo proposto un modello di dinamica delle popolazioni capace di unificare e riprodurre i risultati dei tre SSR presenti in letteratura. Successivamente siamo passati all'analisi quantitativa dell'andamento del RF sulle parole di un corpus di testi. Infatti in questo caso si osserva che la RF non segue una pura legge a potenza ma ha un duplice andamento che può essere rappresentato da una legge a potenza che cambia esponente. Abbiamo cercato di capire se fosse possibile legare l'analisi dell'andamento del RF con le proprietà topologiche di un grafo. In particolare, a partire da un corpus di testi abbiamo costruito una rete di adiacenza dove ogni parola era collegata tramite un link alla parola successiva. Svolgendo un'analisi topologica della struttura del grafo abbiamo trovato alcuni risultati che sembrano confermare l'ipotesi che la sua struttura sia legata al cambiamento di pendenza della RF. Questo risultato può portare ad alcuni sviluppi nell'ambito dello studio del linguaggio e della mente umana. Inoltre, siccome la struttura del grafo presenterebbe alcune componenti che raggruppano parole in base al loro significato, un approfondimento di questo studio potrebbe condurre ad alcuni sviluppi nell'ambito della comprensione automatica del testo (text mining).
Resumo:
With each directed acyclic graph (this includes some D-dimensional lattices) one can associate some Abelian algebras that we call directed Abelian algebras (DAAs). On each site of the graph one attaches a generator of the algebra. These algebras depend on several parameters and are semisimple. Using any DAA, one can define a family of Hamiltonians which give the continuous time evolution of a stochastic process. The calculation of the spectra and ground-state wave functions (stationary state probability distributions) is an easy algebraic exercise. If one considers D-dimensional lattices and chooses Hamiltonians linear in the generators, in finite-size scaling the Hamiltonian spectrum is gapless with a critical dynamic exponent z=D. One possible application of the DAA is to sandpile models. In the paper we present this application, considering one- and two-dimensional lattices. In the one-dimensional case, when the DAA conserves the number of particles, the avalanches belong to the random walker universality class (critical exponent sigma(tau)=3/2). We study the local density of particles inside large avalanches, showing a depletion of particles at the source of the avalanche and an enrichment at its end. In two dimensions we did extensive Monte-Carlo simulations and found sigma(tau)=1.780 +/- 0.005.
Resumo:
Dissertation to obtain the degree of Doctor in Electrical and Computer Engineering, specialization of Collaborative Networks
Resumo:
Dissertação para obtenção do Grau de Mestre em Matemática e Aplicações Especialização em Actuariado, Estatística e Investigação Operacional
Resumo:
Gene-on-gene regulations are key components of every living organism. Dynamical abstract models of genetic regulatory networks help explain the genome's evolvability and robustness. These properties can be attributed to the structural topology of the graph formed by genes, as vertices, and regulatory interactions, as edges. Moreover, the actual gene interaction of each gene is believed to play a key role in the stability of the structure. With advances in biology, some effort was deployed to develop update functions in Boolean models that include recent knowledge. We combine real-life gene interaction networks with novel update functions in a Boolean model. We use two sub-networks of biological organisms, the yeast cell-cycle and the mouse embryonic stem cell, as topological support for our system. On these structures, we substitute the original random update functions by a novel threshold-based dynamic function in which the promoting and repressing effect of each interaction is considered. We use a third real-life regulatory network, along with its inferred Boolean update functions to validate the proposed update function. Results of this validation hint to increased biological plausibility of the threshold-based function. To investigate the dynamical behavior of this new model, we visualized the phase transition between order and chaos into the critical regime using Derrida plots. We complement the qualitative nature of Derrida plots with an alternative measure, the criticality distance, that also allows to discriminate between regimes in a quantitative way. Simulation on both real-life genetic regulatory networks show that there exists a set of parameters that allows the systems to operate in the critical region. This new model includes experimentally derived biological information and recent discoveries, which makes it potentially useful to guide experimental research. The update function confers additional realism to the model, while reducing the complexity and solution space, thus making it easier to investigate.
Resumo:
Functionally relevant large scale brain dynamics operates within the framework imposed by anatomical connectivity and time delays due to finite transmission speeds. To gain insight on the reliability and comparability of large scale brain network simulations, we investigate the effects of variations in the anatomical connectivity. Two different sets of detailed global connectivity structures are explored, the first extracted from the CoCoMac database and rescaled to the spatial extent of the human brain, the second derived from white-matter tractography applied to diffusion spectrum imaging (DSI) for a human subject. We use the combination of graph theoretical measures of the connection matrices and numerical simulations to explicate the importance of both connectivity strength and delays in shaping dynamic behaviour. Our results demonstrate that the brain dynamics derived from the CoCoMac database are more complex and biologically more realistic than the one based on the DSI database. We propose that the reason for this difference is the absence of directed weights in the DSI connectivity matrix.
Resumo:
The use of domain-specific languages (DSLs) has been proposed as an approach to cost-e ectively develop families of software systems in a restricted application domain. Domain-specific languages in combination with the accumulated knowledge and experience of previous implementations, can in turn be used to generate new applications with unique sets of requirements. For this reason, DSLs are considered to be an important approach for software reuse. However, the toolset supporting a particular domain-specific language is also domain-specific and is per definition not reusable. Therefore, creating and maintaining a DSL requires additional resources that could be even larger than the savings associated with using them. As a solution, di erent tool frameworks have been proposed to simplify and reduce the cost of developments of DSLs. Developers of tool support for DSLs need to instantiate, customize or configure the framework for a particular DSL. There are di erent approaches for this. An approach is to use an application programming interface (API) and to extend the basic framework using an imperative programming language. An example of a tools which is based on this approach is Eclipse GEF. Another approach is to configure the framework using declarative languages that are independent of the underlying framework implementation. We believe this second approach can bring important benefits as this brings focus to specifying what should the tool be like instead of writing a program specifying how the tool achieves this functionality. In this thesis we explore this second approach. We use graph transformation as the basic approach to customize a domain-specific modeling (DSM) tool framework. The contributions of this thesis includes a comparison of di erent approaches for defining, representing and interchanging software modeling languages and models and a tool architecture for an open domain-specific modeling framework that e ciently integrates several model transformation components and visual editors. We also present several specific algorithms and tool components for DSM framework. These include an approach for graph query based on region operators and the star operator and an approach for reconciling models and diagrams after executing model transformation programs. We exemplify our approach with two case studies MICAS and EFCO. In these studies we show how our experimental modeling tool framework has been used to define tool environments for domain-specific languages.
Resumo:
The advancement of science and technology makes it clear that no single perspective is any longer sufficient to describe the true nature of any phenomenon. That is why the interdisciplinary research is gaining more attention overtime. An excellent example of this type of research is natural computing which stands on the borderline between biology and computer science. The contribution of research done in natural computing is twofold: on one hand, it sheds light into how nature works and how it processes information and, on the other hand, it provides some guidelines on how to design bio-inspired technologies. The first direction in this thesis focuses on a nature-inspired process called gene assembly in ciliates. The second one studies reaction systems, as a modeling framework with its rationale built upon the biochemical interactions happening within a cell. The process of gene assembly in ciliates has attracted a lot of attention as a research topic in the past 15 years. Two main modelling frameworks have been initially proposed in the end of 1990s to capture ciliates’ gene assembly process, namely the intermolecular model and the intramolecular model. They were followed by other model proposals such as templatebased assembly and DNA rearrangement pathways recombination models. In this thesis we are interested in a variation of the intramolecular model called simple gene assembly model, which focuses on the simplest possible folds in the assembly process. We propose a new framework called directed overlap-inclusion (DOI) graphs to overcome the limitations that previously introduced models faced in capturing all the combinatorial details of the simple gene assembly process. We investigate a number of combinatorial properties of these graphs, including a necessary property in terms of forbidden induced subgraphs. We also introduce DOI graph-based rewriting rules that capture all the operations of the simple gene assembly model and prove that they are equivalent to the string-based formalization of the model. Reaction systems (RS) is another nature-inspired modeling framework that is studied in this thesis. Reaction systems’ rationale is based upon two main regulation mechanisms, facilitation and inhibition, which control the interactions between biochemical reactions. Reaction systems is a complementary modeling framework to traditional quantitative frameworks, focusing on explicit cause-effect relationships between reactions. The explicit formulation of facilitation and inhibition mechanisms behind reactions, as well as the focus on interactions between reactions (rather than dynamics of concentrations) makes their applicability potentially wide and useful beyond biological case studies. In this thesis, we construct a reaction system model corresponding to the heat shock response mechanism based on a novel concept of dominance graph that captures the competition on resources in the ODE model. We also introduce for RS various concepts inspired by biology, e.g., mass conservation, steady state, periodicity, etc., to do model checking of the reaction systems based models. We prove that the complexity of the decision problems related to these properties varies from P to NP- and coNP-complete to PSPACE-complete. We further focus on the mass conservation relation in an RS and introduce the conservation dependency graph to capture the relation between the species and also propose an algorithm to list the conserved sets of a given reaction system.
Resumo:
Slides and an essay on the Web Graph, search engines and how Google calculates Page Rank
Resumo:
This article presents a statistical method for detecting recombination in DNA sequence alignments, which is based on combining two probabilistic graphical models: (1) a taxon graph (phylogenetic tree) representing the relationship between the taxa, and (2) a site graph (hidden Markov model) representing interactions between different sites in the DNA sequence alignments. We adopt a Bayesian approach and sample the parameters of the model from the posterior distribution with Markov chain Monte Carlo, using a Metropolis-Hastings and Gibbs-within-Gibbs scheme. The proposed method is tested on various synthetic and real-world DNA sequence alignments, and we compare its performance with the established detection methods RECPARS, PLATO, and TOPAL, as well as with two alternative parameter estimation schemes.