7 results for Graph models
in AMS Tesi di Laurea - Alm@DL - Università di Bologna
Abstract:
In this thesis we study Zipf's law from both an applied and a theoretical point of view. This empirical law states that the rank-frequency (RF) distribution of the words in a text follows a power law with exponent -1. On the theoretical side, we treat two classes of models able to reproduce power laws in their probability distributions. In particular, we consider generalizations of Polya urns and SSR (Sample Space Reducing) processes. For the latter we give a formalization in terms of Markov chains. Finally, we propose a population dynamics model capable of unifying and reproducing the results of the three SSR processes found in the literature. We then move on to a quantitative analysis of the RF behaviour of the words in a corpus of texts. In this case the RF does not follow a pure power law but shows a two-regime behaviour, which can be described by a power law whose exponent changes. We investigate whether the RF behaviour can be linked to the topological properties of a graph. In particular, starting from a corpus of texts we build an adjacency network in which each word is connected by a link to the word that follows it. A topological analysis of the graph yields some results that appear to confirm the hypothesis that its structure is related to the change in slope of the RF. This result may lead to developments in the study of language and of the human mind. Moreover, since the graph appears to contain components that group words according to their meaning, further study could lead to developments in automatic text understanding (text mining).
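The two objects this abstract relies on, the rank-frequency list and the word adjacency network, can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names `rank_frequency` and `adjacency_edges`, the whitespace tokenization, and the toy sentence are hypothetical and are not taken from the thesis.

```python
from collections import Counter


def rank_frequency(tokens):
    """Return word counts sorted in decreasing order (rank 1 first)."""
    counts = Counter(tokens)
    return sorted(counts.values(), reverse=True)


def adjacency_edges(tokens):
    """Count directed links from each word to the word that follows it."""
    return Counter(zip(tokens, tokens[1:]))


# Toy corpus (hypothetical); a real study would use a large corpus of texts.
tokens = "the cat saw the dog and the dog saw the cat".split()
freqs = rank_frequency(tokens)
edges = adjacency_edges(tokens)

for rank, freq in enumerate(freqs, start=1):
    print(rank, freq)
```

On a large corpus, plotting `freqs` against rank on log-log axes is the usual way to inspect the slope of the RF, and the keys of `edges` give the link list of the adjacency network.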
Abstract:
Monomer-dimer models are among the statistical-mechanics models that have found application in many areas of science, ranging from biology to the social sciences. They describe a many-body system in which monoatomic and diatomic particles, subject to hard-core interactions, are deposited on a graph. In our work we extend this model to higher-order particles. The aim of our work is threefold. First, we study the thermodynamic properties of the newly introduced model: we solve some regular cases analytically and find that, unlike the original model, our extension admits phase transitions. Then we tackle the inverse problem, from both an analytical and a numerical perspective. Finally, we propose an application to aggregation phenomena in virtual messaging services.
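For the original monomer-dimer model described above, the hard-core constraint means that the dimers form a matching of the graph, so on a small graph the partition function can be computed by brute-force enumeration. The sketch below assumes the standard form in which each dimer carries a weight and each uncovered vertex (monomer) carries a weight. The name `monomer_dimer_partition` and its parameters are hypothetical, and the code does not cover the higher-order extension introduced in the thesis.

```python
from itertools import combinations


def monomer_dimer_partition(n_vertices, edges, dimer_weight=1.0, monomer_weight=1.0):
    """Sum the weights of all monomer-dimer configurations on a small graph.

    A configuration is a set of edges (dimers) in which no two edges share
    a vertex (hard-core interaction); uncovered vertices hold monomers.
    """
    total = 0.0
    for k in range(len(edges) + 1):
        for subset in combinations(edges, k):
            covered = [v for edge in subset for v in edge]
            if len(covered) == len(set(covered)):  # no vertex is used twice
                total += dimer_weight ** k * monomer_weight ** (n_vertices - 2 * k)
    return total


# Path on 3 vertices and cycle on 4 vertices, with all weights equal to 1,
# so the result simply counts the admissible configurations.
z_path = monomer_dimer_partition(3, [(0, 1), (1, 2)])
z_cycle = monomer_dimer_partition(4, [(0, 1), (1, 2), (2, 3), (3, 0)])
print(z_path, z_cycle)
```

The enumeration is exponential in the number of edges, so it is only useful as a check on very small graphs.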
Abstract:
Many real-world datasets, including textual data, can be represented using graph structures. Representing textual data as graphs has many advantages, mainly because graphs retain more information, such as the relationships between words and the types of those relationships. In recent years, many neural network architectures have been proposed to deal with tasks on graphs. Many of them consider only node features, ignoring the relationships between nodes or not giving them proper relevance. However, in many node classification tasks these relationships play a fundamental role. This thesis aims to analyze the main GNNs, evaluate their advantages and disadvantages, propose an innovative solution conceived as an extension of GAT, and apply them to a case study in the biomedical field. We present the reference GNNs, implemented with methodologies analyzed later in the thesis, and apply them to a question answering system in the biomedical field as a replacement for the pre-existing GNN. We attempt to obtain better results by using models that accept both node and edge features as input. As shown later, our proposed models outperform the original solution and set a new state of the art for the task under analysis.
Abstract:
Artificial Intelligence is reshaping the fashion industry in different ways. E-commerce retailers exploit their data through AI to enhance their search engines, make outfit suggestions and forecast the success of a specific fashion product. However, this is a challenging endeavour, as the data they possess is huge, complex and multi-modal. The most common way to search for fashion products online is by matching keywords with phrases in the product descriptions, which are often cluttered, inadequate and inconsistent across collections and sellers. A customer may also browse an online store's taxonomy, although this is time-consuming and does not guarantee relevant items. With the advent of Deep Learning architectures, particularly Vision-Language models, ad hoc solutions have been proposed that model both the product image and its description to solve these problems. However, the suggested solutions do not effectively exploit the semantic or syntactic information of these modalities, or the unique qualities and relations of clothing items. In this thesis, a novel approach is proposed to address these issues. It models and processes images and text descriptions as graphs in order to exploit the relations within and between the modalities, and employs specific techniques to extract syntactic and semantic information. The results show promising performance on different tasks when compared with current state-of-the-art deep learning architectures.
Abstract:
Injecting world or domain-specific structured knowledge into pre-trained language models (PLMs) is becoming an increasingly popular approach to problems such as biases, hallucinations, huge architectural sizes, and lack of explainability, all of which are critical for real-world natural language processing applications in sensitive fields like bioinformatics. One recent work that has garnered much attention in Neuro-symbolic AI is QA-GNN, an end-to-end model for multiple-choice open-domain question answering (MCOQA) tasks via interpretable text-graph reasoning. Unlike previous publications, QA-GNN mutually informs PLMs and graph neural networks (GNNs) on top of relevant facts retrieved from knowledge graphs (KGs). However, taking a more holistic view, existing PLM+KG contributions mainly consider commonsense benchmarks and ignore or only shallowly analyze performance on biomedical datasets. This thesis starts with an in-depth investigation of QA-GNN for biomedicine, comparing existing or brand-new PLMs, KGs, edge-aware GNNs, preprocessing techniques, and initialization strategies. By combining the insights that emerged from DISI's research, we introduce Bio-QA-GNN, which includes a KG. Working on this component led to an improvement over the state of the art for MCOQA models on biomedical/clinical text, largely outperforming the original model (+3.63% accuracy on MedQA). Our findings also contribute to a better understanding of the degree of explanation allowed by joint text-graph reasoning architectures and of their effectiveness on different medical subjects and reasoning types. Code, models, datasets, and demos to reproduce the results are freely available at: https://github.com/disi-unibo-nlp/bio-qagnn.
Abstract:
In this work, integro-differential reaction-diffusion models are presented for the description of the temporal and spatial evolution of the concentrations of the Abeta and tau proteins involved in Alzheimer's disease. Initially, a local model is analysed: it is obtained by coupling two heterodimer models through an interaction term, after modifying them with diffusion and Holling type II functional terms. We then present three nonlocal models, which differ in the type of growth (exponential, logistic or Gompertzian) assumed for healthy proteins. In these models, integral terms are introduced to account for the interaction between proteins located at different spatial points, possibly far apart. For each model, the equilibrium points and their stability are determined and the clearance inequalities are studied. In addition, since the integral terms make the models spatially nonlocal, some general features of nonlocal models are presented. Afterwards, with the aim of developing simulations, the nonlocal models are transferred to a brain graph called the connectome. After setting out the construction of such a graph, we describe the Laplacian and convolution operations on a graph. Using all these elements, we finally translate the continuous models described above into discrete models on the connectome. To conclude, the results of some simulations of these discrete models are presented.
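The step from a continuous diffusion term to a discrete model on a graph can be illustrated with the combinatorial graph Laplacian L = D - A, where A is the adjacency matrix and D the diagonal degree matrix. The sketch below is a minimal example under assumptions: it uses an unweighted three-node path instead of a connectome, an explicit Euler step for du/dt = -d L u, and hypothetical names (`graph_laplacian`, `diffusion_step`, `diffusivity`). It does not include the reaction or nonlocal convolution terms of the models in the thesis.

```python
def graph_laplacian(adjacency):
    """Build L = D - A from a symmetric adjacency matrix given as nested lists."""
    n = len(adjacency)
    return [
        [(sum(adjacency[i]) if i == j else 0.0) - adjacency[i][j] for j in range(n)]
        for i in range(n)
    ]


def diffusion_step(u, laplacian, diffusivity, dt):
    """One explicit Euler step of du/dt = -diffusivity * L u."""
    n = len(u)
    return [
        u[i] - dt * diffusivity * sum(laplacian[i][j] * u[j] for j in range(n))
        for i in range(n)
    ]


# Toy graph (hypothetical): a path 0 - 1 - 2 with all concentration at node 0.
adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
lap = graph_laplacian(adjacency)
u = [1.0, 0.0, 0.0]
for _ in range(200):
    u = diffusion_step(u, lap, diffusivity=1.0, dt=0.1)
print(u)
```

Because every row of L sums to zero and L is symmetric, the total concentration is conserved and the state relaxes towards a uniform distribution over the connected graph, provided the time step is small enough for the explicit scheme to be stable.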
Abstract:
This thesis aims to study two algorithms for anomaly detection in random graphs. For both algorithms, generative models of dynamic graphs were created in order to run synthetic tests. The thesis consists of an initial theoretical part and a second, experimental part. The second chapter introduces graph theory. The third chapter presents the community detection problem. The fourth chapter introduces possible definitions of the concept of dynamic anomaly and the problem of detecting such anomalies. The fifth chapter proposes an outlierness score associated with each node, based on a comparison between the node's dynamics and those of the community to which it belongs. The last chapter focuses on the problem of finding a description of the network in terms of groups or roles on which to base the search for dynamic anomalies.