Bio-QA-GNN: Reasoning with Language Models and Knowledge Graphs for Interpretable Biomedical Question Answering


Author(s): Gnagnarella, Enrico
Contributor(s)

Carbonaro, Antonella

Frisoni, Giacomo

Date(s)

15/12/2022

Abstract

Injecting world or domain-specific structured knowledge into pre-trained language models (PLMs) is becoming an increasingly popular approach to problems such as biases, hallucinations, huge architectural sizes, and lack of explainability, all of which are critical for real-world natural language processing applications in sensitive fields like bioinformatics. One recent work that has garnered much attention in neuro-symbolic AI is QA-GNN, an end-to-end model for multiple-choice open-domain question answering (MCOQA) tasks via interpretable text-graph reasoning. Unlike previous publications, QA-GNN lets PLMs and graph neural networks (GNNs) mutually inform each other on top of relevant facts retrieved from knowledge graphs (KGs). However, taking a more holistic view, existing PLM+KG contributions mainly consider commonsense benchmarks and either ignore or only shallowly analyze performance on biomedical datasets. This thesis starts from a proposal for an in-depth investigation of QA-GNN for biomedicine, comparing existing or brand-new PLMs, KGs, edge-aware GNNs, preprocessing techniques, and initialization strategies. By combining the insights that emerged from DISI's research, we introduce Bio-QA-GNN, which includes a KG. Working on this component has led to an improvement over the state of the art for MCOQA models on biomedical/clinical text, largely outperforming the original model (+3.63% accuracy on MedQA). Our findings also contribute to a better understanding of the degree of explanation allowed by joint text-graph reasoning architectures and of their effectiveness on different medical subjects and reasoning types. Code, models, datasets, and demos to reproduce the results are freely available at: https://github.com/disi-unibo-nlp/bio-qagnn.

Format

application/pdf

Identifier

http://amslaurea.unibo.it/27513/1/Bio-QA-GNN%20Reasoning%20with%20Language%20Models%20and%20Knowledge%20Graphs%20for%20Interpretable%20Biomedical%20Question%20Answering.pdf

Gnagnarella, Enrico (2022) Bio-QA-GNN: Reasoning with Language Models and Knowledge Graphs for Interpretable Biomedical Question Answering. [Master's degree thesis (Laurea magistrale)], Università di Bologna, Degree Programme in Ingegneria e scienze informatiche [LM-DM270] - Cesena <http://amslaurea.unibo.it/view/cds/CDS8614/>, Restricted-access document.

Language(s)

en

Publisher

Alma Mater Studiorum - Università di Bologna

Relation

http://amslaurea.unibo.it/27513/

Rights

Free to read

Keywords #Natural Language Understanding, Knowledge Graph, Subgraph Retrieval, Graph Neural Networks, Language Modeling #Ingegneria e scienze informatiche [LM-DM270] - Cesena
Type

PeerReviewed

info:eu-repo/semantics/masterThesis