43 results for Machine Learning, hepatocellular malignancies, HCC, MVI


Relevance: 100.00%

Abstract:

In this thesis, we investigate the role of applied physics in epidemiological surveillance through the application of mathematical models, network science, and machine learning. The spread of a communicable disease depends on many biological, social, and health factors. The large masses of data available make it possible, on the one hand, to monitor the evolution and spread of pathogenic organisms and, on the other, to study the behavior of people, their opinions, and their habits. We present three lines of research in which we address real epidemiological problems through data analysis and the use of statistical and mathematical models. In Chapter 1, we applied language-inspired Deep Learning models to transform influenza protein sequences into vectors encoding their information content. We then attempted to reconstruct the antigenic properties of different viral strains using regression models and to identify the mutations responsible for vaccine escape. In Chapter 2, we constructed a compartmental model to describe the spread of a bacterium within a hospital ward. The model was informed and validated on time series of clinical measurements, and a sensitivity analysis was used to assess the impact of different control measures. Finally, in Chapter 3, we reconstructed the network of retweets among COVID-19-themed Twitter users in the early months of the SARS-CoV-2 pandemic. By means of community detection algorithms and centrality measures, we characterized users' attention shifts in the network, showing that scientific communities, initially the most retweeted, lost influence over time to national political communities. In the Conclusion, we highlight the importance of this work in light of the main contemporary challenges for epidemiological surveillance. In particular, we present reflections on the importance of nowcasting and forecasting, the relationship between data and scientific research, and the need to unite the different scales of epidemiological surveillance.
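
To give a concrete feel for the kind of compartmental model mentioned for Chapter 2, the sketch below integrates a minimal susceptible–colonized system for a hospital ward. The compartments, parameter values, and ward size are illustrative assumptions, not those of the thesis.

```python
# Minimal ward-level compartmental model sketch (hypothetical parameters):
# susceptible (S) and colonized (C) patients, with cross-transmission and
# patient turnover (discharged patients replaced by susceptible admissions).
import numpy as np
from scipy.integrate import solve_ivp

beta = 0.08    # per-day cross-transmission rate (assumed)
gamma = 0.05   # per-day clearance/decolonization rate (assumed)
mu = 0.10      # per-day patient turnover (assumed)
N = 20         # ward size (assumed)

def ward_model(t, y):
    S, C = y
    new_colonizations = beta * S * C / N
    dS = -new_colonizations + gamma * C + mu * C  # discharged colonized replaced by susceptibles
    dC = new_colonizations - gamma * C - mu * C
    return [dS, dC]

sol = solve_ivp(ward_model, t_span=(0, 120), y0=[N - 1, 1],
                t_eval=np.linspace(0, 120, 121))
print(f"Colonized patients after 120 days: {sol.y[1, -1]:.1f}")
```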

Relevance: 100.00%

Abstract:

The rapid progression of biomedical research, coupled with the explosion of scientific literature, has generated an urgent need for efficient and reliable systems of knowledge extraction. This dissertation contends with this challenge through a concentrated investigation of the potential of digital health, Artificial Intelligence, and specifically Machine Learning and Natural Language Processing (NLP), to expedite systematic literature reviews and refine the knowledge extraction process. The surge of COVID-19 complicated the efforts of scientists, policymakers, and medical professionals in identifying pertinent articles and assessing their scientific validity. This thesis presents a substantial solution in the form of the COKE Project, an initiative that interlaces machine reading with the rigorous protocols of Evidence-Based Medicine to streamline knowledge extraction. In the framework of the COKE ("COVID-19 Knowledge Extraction framework for next-generation discovery science") Project, this thesis aims to underscore the capacity of machine reading to create knowledge graphs from scientific texts. The project is notable for its innovative use of NLP techniques, such as a BERT + bi-LSTM language model. This combination is employed to detect and categorize elements within medical abstracts, thereby enhancing the systematic literature review process. The COKE project's outcomes show that NLP, when used in a judiciously structured manner, can significantly reduce the time and effort required to produce medical guidelines. These findings are particularly salient during times of medical emergency, like the COVID-19 pandemic, when quick and accurate research results are critical.
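
As a rough sketch of how a BERT encoder can be combined with a bidirectional LSTM to label elements of medical abstracts, consider the model below. The pretrained model name, label count, and hyperparameters are illustrative assumptions, not the COKE configuration.

```python
# Sketch of a BERT + bi-LSTM token tagger (label set and sizes are assumed).
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class BertBiLSTMTagger(nn.Module):
    def __init__(self, model_name="bert-base-uncased", num_labels=4, lstm_hidden=256):
        super().__init__()
        self.bert = AutoModel.from_pretrained(model_name)
        self.lstm = nn.LSTM(self.bert.config.hidden_size, lstm_hidden,
                            batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * lstm_hidden, num_labels)

    def forward(self, input_ids, attention_mask):
        hidden = self.bert(input_ids=input_ids,
                           attention_mask=attention_mask).last_hidden_state
        lstm_out, _ = self.lstm(hidden)     # contextualize BERT token embeddings
        return self.classifier(lstm_out)    # one label score vector per token

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = BertBiLSTMTagger()
batch = tokenizer(["Remdesivir reduced recovery time in hospitalized adults."],
                  return_tensors="pt", padding=True)
logits = model(batch["input_ids"], batch["attention_mask"])
print(logits.shape)  # (batch, sequence_length, num_labels)
```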

Relevance: 100.00%

Abstract:

In recent decades, two prominent trends have influenced the data modeling field, namely network analysis and machine learning. This thesis explores the practical applications of these techniques within the domain of drug research, unveiling their multifaceted potential for advancing our comprehension of complex biological systems. The research undertaken during this PhD program is situated at the intersection of network theory, computational methods, and drug research. Across six projects presented herein, there is a gradual increase in model complexity. These projects traverse a diverse range of topics, with a specific emphasis on drug repurposing and safety in the context of neurological diseases. The aim of these projects is to leverage existing biomedical knowledge to develop innovative approaches that bolster drug research. The investigations have produced practical solutions, not only providing insights into the intricacies of biological systems, but also allowing the creation of valuable tools for their analysis. In short, the achievements are:

• A novel computational algorithm to identify adverse events specific to fixed-dose drug combinations.
• A web application that tracks the clinical drug research response to SARS-CoV-2.
• A Python package for differential gene expression analysis and the identification of key regulatory "switch genes".
• The identification of pivotal events causing drug-induced impulse control disorders linked to specific medications.
• An automated pipeline for discovering potential drug repurposing opportunities.
• The creation of a comprehensive knowledge graph and development of a graph machine learning model for predictions.

Collectively, these projects illustrate diverse applications of data science and network-based methodologies, highlighting the profound impact they can have in supporting drug research activities.
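
To give a flavour of the network-based reasoning behind knowledge-graph-driven drug repurposing, the toy sketch below scores candidate drug–disease links in a tiny graph with a neighbourhood-similarity heuristic. The graph content is invented for illustration; the actual projects rely on far richer data and graph machine learning models.

```python
# Toy drug-disease knowledge graph and link-prediction heuristic (illustrative data only).
import networkx as nx

G = nx.Graph()
G.add_edges_from([
    ("drugA", "geneX"), ("drugA", "geneY"),
    ("drugB", "geneY"), ("drugB", "geneZ"),
    ("parkinson", "geneX"), ("parkinson", "geneY"),
])

# Score drug-disease pairs by the Jaccard coefficient of their neighbourhoods.
candidates = [("drugA", "parkinson"), ("drugB", "parkinson")]
for u, v, score in nx.jaccard_coefficient(G, candidates):
    print(f"{u} -> {v}: {score:.2f}")
```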

Relevance: 100.00%

Abstract:

Background: There is wide variation in the recurrence risk of non-small-cell lung cancer (NSCLC) within the same Tumor Node Metastasis (TNM) stage, suggesting that other parameters are involved in determining this probability. Radiomics allows the extraction of quantitative information from images that can be used for clinical purposes. The primary objective of this study is to develop a radiomic prognostic model that predicts 3-year disease-free survival (DFS) of resected early-stage (ES) NSCLC patients. Material and Methods: 56 pre-surgery non-contrast Computed Tomography (CT) scans were retrieved from the PACS of our institution and anonymized. They were then automatically segmented with an open-access deep learning pipeline and reviewed by an experienced radiologist to obtain 3D masks of the NSCLC. Images and masks underwent resampling, normalization, and discretization. From the masks, hundreds of Radiomic Features (RF) were extracted using PyRadiomics. The RF set was then reduced to select the most representative features. The remaining RF were used in combination with clinical parameters to build a DFS prediction model using leave-one-out cross-validation (LOOCV) with Random Forest. Results and Conclusion: Poor agreement between the radiologist and the automatic segmentation algorithm (DICE score of 0.37) was found. Therefore, another experienced radiologist manually segmented the lesions and only stable and reproducible RF were kept. 50 RF showed a high correlation with DFS, but only one was confirmed when clinicopathological covariates were added: Busyness, a Neighbouring Gray Tone Difference Matrix feature (HR 9.610). 16 clinical variables (which included TNM) were used to build the LOOCV model, which showed a higher Area Under the Curve (AUC) when RF were included in the analysis (0.67 vs 0.60), but the difference was not statistically significant (p=0.5147).
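
A condensed sketch of the modelling step described above: leave-one-out cross-validation of a Random Forest on a feature table. The feature matrix, labels, and hyperparameters are synthetic placeholders rather than the study's actual data or configuration.

```python
# Sketch of LOOCV with a Random Forest, as in the DFS model described above
# (synthetic matrix standing in for selected radiomic + clinical features).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import LeaveOneOut

rng = np.random.default_rng(0)
X = rng.normal(size=(56, 20))        # 56 patients, 20 selected features (placeholder)
y = rng.integers(0, 2, size=56)      # 1 = disease-free at 3 years (synthetic labels)

probs = np.zeros(len(y))
for train_idx, test_idx in LeaveOneOut().split(X):
    clf = RandomForestClassifier(n_estimators=500, random_state=0)
    clf.fit(X[train_idx], y[train_idx])
    probs[test_idx] = clf.predict_proba(X[test_idx])[:, 1]

print("LOOCV AUC:", round(roc_auc_score(y, probs), 3))
```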

Relevance: 100.00%

Abstract:

Although the debate over what data science is has a long history and has not yet reached a complete consensus, Data Science can be summarized as the process of learning from data. Guided by this vision, this thesis presents two independent data science projects developed in the scope of multidisciplinary applied research. The first part analyzes fluorescence microscopy images typically produced in life science experiments, where the objective is to count how many marked neuronal cells are present in each image. Aiming to automate the task to support research in the area, we propose a neural network architecture tuned specifically for this use case, cell ResUnet (c-ResUnet), and discuss the impact of alternative training strategies in overcoming particular challenges of our data. The approach provides good results in terms of both detection and counting, showing performance comparable to the interpretation of human operators. As a meaningful addition, we release the pre-trained model and the Fluorescent Neuronal Cells dataset, which collects pixel-level annotations of where neuronal cells are located. In this way, we hope to help future research in the area and foster innovative methodologies for tackling similar problems. The second part deals with the problem of distributed data management in the context of LHC experiments, with a focus on supporting ATLAS operations concerning data transfer failures. In particular, we analyze error messages produced by failed transfers and propose a Machine Learning pipeline that leverages the word2vec language model and K-means clustering. This provides groups of similar errors that are presented to human operators as suggestions of potential issues to investigate. The approach is demonstrated on one full day of data, showing promising ability in understanding the message content and providing meaningful groupings, in line with incidents previously reported by human operators.
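
A minimal sketch of the error-clustering idea described for the second part: tokenize error messages, embed them with word2vec, average the word vectors per message, and group the messages with K-means. The messages, vector size, and number of clusters are illustrative assumptions, not the pipeline's actual settings.

```python
# Sketch: embed failed-transfer error messages with word2vec and cluster with K-means
# (messages and parameters are invented for illustration).
import numpy as np
from gensim.models import Word2Vec
from sklearn.cluster import KMeans

messages = [
    "connection timed out to destination storage endpoint",
    "connection timed out to source storage endpoint",
    "checksum mismatch after transfer",
    "file checksum verification failed",
]
tokenized = [m.split() for m in messages]

w2v = Word2Vec(sentences=tokenized, vector_size=50, window=5, min_count=1, seed=0)
embeddings = np.array([np.mean([w2v.wv[t] for t in toks], axis=0) for toks in tokenized])

labels = KMeans(n_clusters=2, random_state=0, n_init=10).fit_predict(embeddings)
for msg, lab in zip(messages, labels):
    print(lab, msg)
```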

Relevance: 100.00%

Abstract:

Background and rationale for the study. This study investigated whether human immunodeficiency virus (HIV) infection adversely affects the prognosis of patients diagnosed with hepatocellular carcinoma (HCC). Thirty-four HIV-positive patients with chronic liver disease, consecutively diagnosed with HCC from 1998 to 2007, were one-to-one matched with 34 HIV-negative controls for sex, liver function (Child-Turcotte-Pugh [CTP] class), cancer stage (BCLC model) and, whenever possible, age, etiology of liver disease and modality of cancer diagnosis. Survival in the two groups and independent prognostic predictors were assessed. Results. Among HIV-positive patients, 88% were receiving HAART. HIV-RNA was undetectable in 65% of cases; the median CD4+ lymphocyte count was 368.5/mm³. The etiology of liver disease was mostly related to HCV infection. CTP class was A in 38%, B in 41%, and C in 21% of cases. BCLC cancer stage was early in 50%, intermediate in 23.5%, advanced in 5.9%, and end-stage in 20.6% of cases. HCC treatments and causes of death did not differ between the two groups. Median survival did not differ, being 16 months (95% CI: 6-26) in HIV-positive and 23 months (95% CI: 5-41) in HIV-negative patients (P=0.391). BCLC cancer stage and HCC treatment proved to be independent predictors of survival both in the whole population and in HIV-positive patients. Conclusions. Survival of HIV-infected patients receiving antiretroviral therapy and diagnosed with HCC is similar to that of HIV-negative patients bearing this tumor. Prognosis is determined by the cancer bulk and its treatment.
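
A minimal sketch of the type of between-group survival comparison reported above, using the lifelines package on invented follow-up data; the numbers are placeholders and not the study cohort.

```python
# Sketch of a Kaplan-Meier comparison and log-rank test (invented data, not the study cohort).
import pandas as pd
from lifelines import KaplanMeierFitter
from lifelines.statistics import logrank_test

df = pd.DataFrame({
    "months": [16, 5, 30, 12, 23, 41, 8, 19],   # follow-up time
    "death":  [1, 1, 0, 1, 1, 0, 1, 1],         # 1 = event observed
    "hiv":    [1, 1, 1, 1, 0, 0, 0, 0],         # 1 = HIV-positive
})

kmf = KaplanMeierFitter()
for group, sub in df.groupby("hiv"):
    kmf.fit(sub["months"], event_observed=sub["death"], label=f"HIV={group}")
    print(f"HIV={group}: median survival", kmf.median_survival_time_)

result = logrank_test(df.loc[df.hiv == 1, "months"], df.loc[df.hiv == 0, "months"],
                      event_observed_A=df.loc[df.hiv == 1, "death"],
                      event_observed_B=df.loc[df.hiv == 0, "death"])
print("log-rank p-value:", result.p_value)
```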

Relevance: 100.00%

Abstract:

Notch signalling is a cellular pathway conserved from Drosophila to Homo sapiens that controls a wide range of cellular processes in development and in differentiated organs. It induces cell proliferation or differentiation, increased survival or apoptosis, and it is involved in stemness maintenance. These functions are conserved, but exerted with high tissue and cellular context specificity. Signalling activation triggers nuclear translocation of the receptor's cytoplasmic domain and activation of target gene transcription. As with many developmental pathways, Notch deregulation is involved in cancer, where it can play an oncogenic or tumour-suppressive role depending on the functions exerted in the normal tissue. Notch1 and Notch3 are aberrantly expressed in human hepatocellular carcinoma (HCC), the most frequent primary liver tumour and the sixth most common tumour worldwide. This thesis aims to investigate the role of Notch signalling in HCC, with particular attention to dissecting common and distinct regulatory pathways between Notch1 and Notch3. Notch1 and Notch3 were analysed for their regulation of the Hes1 target and their involvement in cell cycle control. They were shown to regulate CDKN1C/p57kip2 expression through Hes1. CDKN1C/p57kip2 induces not only cell cycle arrest, but also senescence in HCC cell lines. Moreover, the involvement of Notch1 in cancer progression and epithelial-to-mesenchymal transition (EMT) was investigated. Notch1 was shown to induce invasion of HCC cells, regulating EMT and E-cadherin expression. In addition, Notch3 specifically regulated p53 at the post-translational level. In vitro and ex vivo analyses of HCC samples suggest a complex role for both receptors in HCC, with oncogenic but also tumour-suppressive effects, pointing to a deep involvement of this signalling in HCC.

Relevance: 100.00%

Abstract:

In many application domains data can be naturally represented as graphs. When the application of analytical solutions for a given problem is unfeasible, machine learning techniques can be a viable way to solve the problem. Classical machine learning techniques are defined for data represented in a vectorial form. Recently some of them have been extended to deal directly with structured data. Among those techniques, kernel methods have shown promising results both in computational complexity and in predictive performance. Kernel methods make it possible to avoid an explicit mapping into a vectorial form by relying on kernel functions, which, informally, compute a similarity measure between two entities. However, the definition of good kernels for graphs is a challenging problem because of the difficulty of finding a good tradeoff between computational complexity and expressiveness. Another problem we face is learning on data streams, where a potentially unbounded sequence of data is generated by some sources. There are three main contributions in this thesis. The first contribution is the definition of a new family of kernels for graphs based on Directed Acyclic Graphs (DAGs). We analyzed two kernels from this family, achieving state-of-the-art results on real-world datasets, both computationally and in classification performance. The second contribution consists in making the application of learning algorithms to streams of graphs feasible. Moreover, we defined a principled way to manage memory. The third contribution is the application of machine learning techniques for structured data to non-coding RNA function prediction. In this setting, the secondary structure is thought to carry relevant information. However, existing methods considering the secondary structure have prohibitively high computational complexity. We propose to apply kernel methods in this domain, obtaining state-of-the-art results.
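
A compact sketch of how a precomputed graph kernel matrix plugs into a standard SVM. The kernel used here, a dot product of node-label histograms, is a deliberately naive stand-in for illustration only, far simpler than the DAG-based kernels proposed in the thesis.

```python
# Sketch: classify small graphs with an SVM over a precomputed kernel matrix.
from collections import Counter

import networkx as nx
import numpy as np
from sklearn.svm import SVC

def label_histogram_kernel(g1, g2):
    """Crude graph similarity: dot product of node-label histograms."""
    c1 = Counter(nx.get_node_attributes(g1, "label").values())
    c2 = Counter(nx.get_node_attributes(g2, "label").values())
    return sum(c1[k] * c2[k] for k in c1)

def make_graph(labels):
    g = nx.path_graph(len(labels))
    nx.set_node_attributes(g, dict(enumerate(labels)), "label")
    return g

graphs = [make_graph("AAB"), make_graph("ABB"), make_graph("CCD"), make_graph("CDD")]
y = [0, 0, 1, 1]

K = np.array([[label_histogram_kernel(a, b) for b in graphs] for a in graphs])
clf = SVC(kernel="precomputed").fit(K, y)
print(clf.predict(K))  # at test time, rows are kernel values between test and training graphs
```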

Relevance: 100.00%

Abstract:

Introduction. The main critical points in the treatment of advanced HCC are currently: 1) the lack of predictors of response to sorafenib therapy, 2) the development of resistance to sorafenib, and 3) the lack of codified second-line therapies. Aims of the thesis: 1) to search for clinical and laboratory predictors of response to sorafenib in outpatients with HCC; 2) to evaluate the impact of temporary versus definitive sorafenib discontinuation in a murine model of HCC using ultrasound techniques; 3) to evaluate the efficacy of metronomic capecitabine as a second-line treatment for HCC unresponsive to sorafenib. Results. Study 1: in 94 patients with HCC treated with sorafenib, the presence of metastases and neoplastic portal vein thrombosis did not seem to impair the efficacy of sorafenib. Baseline AFP <19 ng/ml was a predictor of longer survival, whereas the development of nausea predicted worse survival. Study 2: 14 mice with HCC xenografts: group 1 was treated with placebo, group 2 with sorafenib with temporary interruption of the drug, and group 3 with sorafenib with definitive discontinuation. VEGFR2-targeted CEUS showed higher dTE values in group 3 on day 13, confirmed by an increase in VEGFR2 on Western blot. After 2 days of re-treatment, tumours in group 2 showed an increase in tissue elasticity on ultrasound elastography. Study 3: 19 patients were treated with metronomic capecitabine after sorafenib. TTP was 5 months (95% CI 0-10), PFS 3.6 months (95% CI 2.8-4.3) and OS 6.3 months (95% CI 4-8.6). Conclusions. The development of nausea and asthenia and baseline AFP >19 ng/ml were predictive of a poorer response to sorafenib. Temporary discontinuation of sorafenib in a murine model of HCC does not prevent restoration of the tumour response, whereas definitive discontinuation tends to stimulate a "rebound effect" of angiogenesis. Metronomic capecitabine after sorafenib showed moderate anti-neoplastic activity and acceptable safety.

Relevance: 100.00%

Abstract:

Information is nowadays a key resource: machine learning and data mining techniques have been developed to extract high-level information from great amounts of data. As most data comes in the form of unstructured text in natural languages, research on text mining is currently very active and dealing with practical problems. Among these, text categorization deals with the automatic organization of large quantities of documents in predefined taxonomies of topic categories, possibly arranged in large hierarchies. In commonly proposed machine learning approaches, classifiers are automatically trained from pre-labeled documents: they can perform very accurate classification, but often require a sizeable training set and notable computational effort. Methods for cross-domain text categorization have been proposed, allowing a set of labeled documents from one domain to be leveraged to classify those of another. Most methods use advanced statistical techniques, usually involving tuning of parameters. A first contribution presented here is a method based on nearest centroid classification, where profiles of categories are generated from the known domain and then iteratively adapted to the unknown one. Despite being conceptually simple and having easily tuned parameters, this method achieves state-of-the-art accuracy on most benchmark datasets with fast running times. A second, deeper contribution involves the design of a domain-independent model to distinguish the degree and type of relatedness between arbitrary documents and topics, inferred from the different types of semantic relationships between their representative words, identified by specific search algorithms. The application of this model is tested on both flat and hierarchical text categorization, where it potentially allows the efficient addition of new categories during classification. Results show that classification accuracy still requires improvement, but models generated from one domain can effectively be reused in a different one.
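
A bare-bones sketch of the nearest-centroid idea behind the first contribution: build category centroids from labelled source-domain documents, then iteratively re-estimate them from the target-domain documents they attract. The documents, categories, and number of iterations are illustrative assumptions, not the method's actual setup.

```python
# Sketch of iterative centroid adaptation for cross-domain text categorization
# (toy documents and a fixed number of iterations; purely illustrative).
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

source_docs = ["stock market shares trading", "football match championship goal"]
source_labels = np.array([0, 1])                     # 0 = finance, 1 = sport
target_docs = ["bank shares fell on the market", "the striker scored in the final match"]

vec = TfidfVectorizer()
X = vec.fit_transform(source_docs + target_docs).toarray()
X_src, X_tgt = X[:2], X[2:]

# Initial centroids from the labelled source domain.
centroids = np.vstack([X_src[source_labels == c].mean(axis=0) for c in (0, 1)])

for _ in range(5):                                   # iterative adaptation to the target domain
    pred = cosine_similarity(X_tgt, centroids).argmax(axis=1)
    centroids = np.vstack([X_tgt[pred == c].mean(axis=0) if np.any(pred == c) else centroids[c]
                           for c in (0, 1)])

print(pred)  # predicted category per target document
```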

Relevance: 100.00%

Abstract:

Background and Aims: Hepatocellular carcinoma (HCC) represents the second leading cause of cancer death worldwide. Protein induced by vitamin K absence (PIVKA-II) has been proposed as a potential screening biomarker for HCC. This study was designed to evaluate the role of PIVKA-II as a diagnostic HCC marker through the comparison of PIVKA-II and alpha-fetoprotein (AFP) serum levels in HCC patients and in two control groups of patients with liver disease but without HCC. Methods: In an Italian prospective cohort, PIVKA-II levels were assessed on serum samples with an automated chemiluminescent immunoassay (Abbott ARCHITECT). The study population included 65 patients with HCC (both de novo and recurrent), 111 with liver cirrhosis (LC) and 111 with chronic hepatitis C (CHC). Results: PIVKA-II levels were increased in patients with HCC (median 63.75, range 12-2675 mAU/mL) compared to LC (median 30.95, range 11.70-1251 mAU/mL, Mann-Whitney test p < 0.0001) and CHC (median 24.89, range 12.98-67.68 mAU/mL, p < 0.0001). The area under the curve (AUC) for PIVKA-II was 0.817 (95% Confidence Interval [CI] 0.752-0.881). At the optimal threshold of 37 mAU/mL, identified by the Youden index, sensitivity and specificity were 79% and 76%, respectively. PIVKA-II was a better biomarker than AFP for the diagnosis of HCC, since the AUC for AFP was 0.670 (95% CI 0.585-0.754, p < 0.0001) and, at the best cutoff of 16.4 ng/mL, AFP yielded 98% specificity but only 34% sensitivity. Conclusions: These initial data suggest the potential utility of this tool in the diagnosis of HCC. PIVKA-II, alone or in combination, may contribute to an early diagnosis of HCC and a significant optimization of patient management.
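
A small sketch of how an optimal biomarker cutoff can be selected with the Youden index from a ROC curve, as described above. The marker values below are synthetic placeholders, not the study data.

```python
# Sketch: ROC analysis and Youden-index cutoff selection (synthetic marker values).
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])             # 1 = HCC
values = np.array([64, 120, 41, 35, 30, 25, 22, 38, 18, 27])  # marker level, mAU/mL

fpr, tpr, thresholds = roc_curve(y_true, values)
youden = tpr - fpr                       # Youden's J = sensitivity + specificity - 1
best = np.argmax(youden)

print("AUC:", roc_auc_score(y_true, values))
print(f"Optimal cutoff: {thresholds[best]} mAU/mL "
      f"(sensitivity {tpr[best]:.2f}, specificity {1 - fpr[best]:.2f})")
```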

Relevance: 100.00%

Abstract:

Nowadays robotic applications are widespread and most manipulation tasks are efficiently solved. However, Deformable Objects (DOs) still represent a huge limitation for robots. The main difficulty in DO manipulation is dealing with shape and dynamics uncertainties, which prevent the use of model-based approaches (since they are excessively computationally complex) and make sensory data difficult to interpret. This thesis reports the research activities aimed at addressing applications in robotic manipulation and sensing of Deformable Linear Objects (DLOs), with a particular focus on electric wires. In all of this work, significant effort went into studying effective strategies for analyzing sensory signals with various machine learning algorithms. The first part of the document focuses on wire terminals, i.e., their detection, grasping, and insertion. First, a pipeline that integrates vision and tactile sensing is developed, and further improvements are then proposed for each module. A novel procedure is proposed to gather and label massive amounts of training images for object detection with minimal human intervention. Together with this strategy, we extend a generic object detector based on Convolutional Neural Networks for orientation prediction. The insertion task is also extended by developing a closed-loop controller capable of guiding the insertion of a longer and curved segment of wire through a hole, where the contact forces are estimated by means of a Recurrent Neural Network. The second part of the thesis shifts to the DLO shape. Robotic reshaping of a DLO is addressed by means of a sequence of pick-and-place primitives, while a decision-making process driven by visual data learns the optimal grasping locations through Deep Q-learning and finds the best releasing point. The success of the solution relies on a reliable interpretation of the DLO shape. For this reason, further developments are made on the visual segmentation.

Relevance: 100.00%

Abstract:

One of the most visionary goals of Artificial Intelligence is to create a system able to mimic and eventually surpass the intelligence observed in biological systems including, ambitiously, the one observed in humans. The main distinctive strength of humans is their ability to build a deep understanding of the world by learning continuously and drawing from their experiences. This ability, which is found in various degrees in all intelligent biological beings, allows them to adapt and properly react to changes by incrementally expanding and refining their knowledge. Arguably, achieving this ability is one of the main goals of Artificial Intelligence and a cornerstone towards the creation of intelligent artificial agents. Modern Deep Learning approaches allowed researchers and industries to achieve great advancements towards the resolution of many long-standing problems in areas like Computer Vision and Natural Language Processing. However, while this current age of renewed interest in AI allowed for the creation of extremely useful applications, a concerningly limited effort is being directed towards the design of systems able to learn continuously. The biggest problem that hinders an AI system from learning incrementally is the catastrophic forgetting phenomenon. This phenomenon, which was discovered in the 90s, naturally occurs in Deep Learning architectures where classic learning paradigms are applied when learning incrementally from a stream of experiences. This dissertation revolves around the Continual Learning field, a sub-field of Machine Learning research that has recently made a comeback following the renewed interest in Deep Learning approaches. This work will focus on a comprehensive view of continual learning by considering algorithmic, benchmarking, and applicative aspects of this field. This dissertation will also touch on community aspects such as the design and creation of research tools aimed at supporting Continual Learning research, and the theoretical and practical aspects concerning public competitions in this field.

Relevance: 100.00%

Abstract:

Deep learning methods are extremely promising machine learning tools for analyzing neuroimaging data. However, their potential use in clinical settings is limited by the existing challenges of applying these methods to neuroimaging data. In this study, we first survey a type of data leakage caused by a slice-level data split introduced during training and validation of a 2D CNN, and present a quantitative assessment of the resulting overestimation of model performance. Second, an interpretable, leakage-free deep learning software package, written in Python with a wide range of options, was developed to conduct both classification and regression analyses. The software was applied to the study of mild cognitive impairment (MCI) in patients with small vessel disease (SVD) using multi-parametric MRI data, where the cognitive performance of 58 patients, measured by five neuropsychological tests, was predicted using a multi-input CNN model taking brain images and demographic data. Each of the cognitive test scores was predicted using different MRI-derived features. As MCI due to SVD has been hypothesized to be the effect of white matter damage, the DTI-derived features MD and FA produced the best prediction of the TMT-A score, which is consistent with the existing literature. In a second study, an interpretable deep learning system was developed aimed at 1) classifying Alzheimer's disease patients and healthy subjects, 2) examining the neural correlates of the disease that cause cognitive decline in AD patients using CNN visualization tools, and 3) highlighting the potential of interpretability techniques to capture a biased deep learning model. Structural magnetic resonance imaging (MRI) data from 200 subjects were used by the proposed CNN model, which was trained using a transfer learning-based approach and produced a balanced accuracy of 71.6%. Brain regions in the frontal and parietal lobes showing cerebral cortex atrophy were highlighted by the visualization tools.
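
The leakage issue described above arises when 2D slices from the same subject end up in both training and validation; a group-aware split keeps each subject's slices on one side only. Below is a minimal sketch with hypothetical array sizes, not the study's data or code.

```python
# Sketch: avoid slice-level leakage by splitting at the subject level, so all
# slices from a given patient fall entirely in train or entirely in validation.
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

rng = np.random.default_rng(0)
n_subjects, slices_per_subject = 58, 40                     # hypothetical sizes
subject_ids = np.repeat(np.arange(n_subjects), slices_per_subject)
X = rng.random((len(subject_ids), 32))                      # stand-in features, one row per slice
y = np.repeat(rng.random(n_subjects), slices_per_subject)   # one target score per subject

splitter = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
train_idx, val_idx = next(splitter.split(X, y, groups=subject_ids))

# No subject contributes slices to both sets.
assert set(subject_ids[train_idx]).isdisjoint(subject_ids[val_idx])
print(len(train_idx), "training slices,", len(val_idx), "validation slices")
```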

Relevance: 100.00%

Abstract:

Deep Neural Networks (DNNs) have revolutionized a wide range of applications beyond traditional machine learning and artificial intelligence fields, e.g., computer vision, healthcare, natural language processing and others. At the same time, edge devices have become central in our society, generating an unprecedented amount of data which could be used to train data-hungry models such as DNNs. However, the potentially sensitive or confidential nature of gathered data poses privacy concerns when storing and processing them in centralized locations. To this purpose, decentralized learning decouples model training from the need to directly access raw data, by alternating on-device training and periodic communications. The ability to distill knowledge from decentralized data, however, comes at the cost of facing more challenging learning settings, such as coping with heterogeneous hardware and network connectivity, statistical diversity of data, and ensuring verifiable privacy guarantees. This Thesis proposes an extensive overview of the decentralized learning literature, including a novel taxonomy and a detailed description of the most relevant system-level contributions in the related literature for privacy, communication efficiency, data and system heterogeneity, and poisoning defense. Next, this Thesis presents the design of an original solution to tackle communication efficiency and system heterogeneity, and empirically evaluates it in federated settings. For communication efficiency, an original method, specifically designed for Convolutional Neural Networks, is also described and evaluated against the state of the art. Furthermore, this Thesis provides an in-depth review of recently proposed methods to tackle the performance degradation introduced by data heterogeneity, followed by empirical evaluations on challenging data distributions, highlighting strengths and possible weaknesses of the considered solutions. Finally, this Thesis presents a novel perspective on the usage of Knowledge Distillation as a means of optimizing decentralized learning systems in settings characterized by data heterogeneity or system heterogeneity. Our vision of relevant future research directions closes the manuscript.
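
To make the alternation of on-device training and periodic aggregation concrete, the sketch below runs one round in the style of Federated Averaging: each client fits a local model and the server aggregates parameters weighted by local dataset size. The client data and the toy linear model are placeholders, not the thesis's systems or methods.

```python
# Sketch of one Federated-Averaging-style round: clients train locally, the server
# aggregates parameters weighted by local dataset size (toy linear model, invented data).
import numpy as np

rng = np.random.default_rng(0)

def local_train(X, y):
    """Closed-form least squares as a stand-in for on-device training."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

clients = []
for n_samples in (30, 80, 50):                      # heterogeneous client dataset sizes
    X = rng.normal(size=(n_samples, 3))
    y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=n_samples)
    clients.append((n_samples, local_train(X, y)))

total = sum(n for n, _ in clients)
global_weights = sum(n * w for n, w in clients) / total   # size-weighted aggregation step
print(global_weights)
```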