Synthetic data generation for the assessment of antimicrobial resistance through machine learning


Author(s): Zaghi, Adriano
Contributor(s)

Sala, Claudia

Castellani, Gastone

Date(s)

23/09/2022

Abstract

With the spread of next-generation sequencing techniques, metagenomics databases have become one of the most promising repositories of information about the features and behavior of microorganisms. Bacterial populations are one of the subjects that can be studied from these data. Next-generation sequencing makes it possible to study the bacterial population within an environment by sampling genetic material directly from it, without the need to culture a similar population in vitro and observe its behavior. As a drawback, extracting information from these data is quite complex, and there is usually more than one way to do it; antimicrobial resistance (AMR) is no exception. In this study we discuss how quantified AMR, which concerns the genotype of the bacteria, can be related to the bacterial phenotype and its actual level of resistance to a specific substance. To obtain quantitative information about the bacterial genotype, we evaluate the resistome from the read libraries by aligning them against the CARD database. With those data, we test various machine learning algorithms for predicting the bacterial phenotype. The samples we use are meant to resemble those that could be obtained in a natural context, but are actually produced by a read library simulation tool. In this way we can design populations of bacteria with known genotypes, so that we can rely on a secure ground truth for training and testing our algorithms.

Format

application/pdf

Identifier

http://amslaurea.unibo.it/26560/1/Tesi-1.pdf

Zaghi, Adriano (2022) Synthetic data generation for the assessment of antimicrobial resistance through machine learning. [Laurea magistrale], Università di Bologna, Corso di Studio in Physics [LM-DM270] <http://amslaurea.unibo.it/view/cds/CDS9245/>

Language(s)

en

Publisher

Alma Mater Studiorum - Università di Bologna

Relation

http://amslaurea.unibo.it/26560/

Rights

cc_by_nc_sa4

Keywords #Metagenomics,DNA,Bacteria,AMR,Antibiotics,Simulated data,Machine learning,Anti Microbial Resistance,PCA,Principal Component Analysis,Random Forest,Ada Boost Classifier,Elastic Net,Logistic Regression #Physics [LM-DM270]
Type

PeerReviewed

info:eu-repo/semantics/masterThesis