807 results for Machine Learning, hepatocellular malignancies, HCC, MVI
Abstract:
Machine Learning makes computers capable of performing tasks that typically require human intelligence. A domain where it is having a considerable impact is the life sciences, where it allows researchers to devise new biological analysis protocols, develop patient treatments more efficiently and quickly, and reduce healthcare costs. This thesis presents new Machine Learning methods and pipelines for the life sciences, focusing on the unsupervised field. At the methodological level, two methods are presented. The first, the “Ab Initio Local Principal Path”, is a revised and improved version of a pre-existing algorithm in the manifold learning realm. The second contribution improves the Import Vector Domain Description (one-class learning) through the Kullback-Leibler divergence. It hybridizes kernel methods with Deep Learning, obtaining a scalable solution, an improved probabilistic model, and state-of-the-art performance. Both methods are tested through several experiments, with a central focus on their relevance in the life sciences. Results show that they improve on the performance of their previous versions. At the applicative level, two pipelines are presented. The first is for the analysis of RNA-Seq datasets, both transcriptomic and single-cell data, and aims to identify genes that may be involved in biological processes (e.g., the transition of tissues from normal to cancer). In this project, an R package is released on CRAN to make the pipeline accessible to the bioinformatics community through high-level APIs. The second pipeline is in the drug discovery domain and is useful for identifying druggable pockets, namely regions of a protein with a high probability of accepting a small molecule (a drug). Both pipelines achieve remarkable results. Lastly, a detour application is developed to identify the strengths and limitations of the “Principal Path” algorithm by analyzing the vector spaces induced by Convolutional Neural Networks. This application is conducted in the music and visual arts domains.
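The Kullback-Leibler divergence used to improve the Import Vector Domain Description can be sketched, for two discrete distributions, as below. This is an illustrative definition only; the thesis applies the divergence inside a kernel/Deep Learning hybrid, not to toy histograms.

```python
import math

def kl_divergence(p, q):
    """D(P || Q) for two discrete distributions given as probability lists.

    Terms with p_i = 0 contribute nothing (the 0 * log 0 convention).
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# The divergence is zero iff the distributions coincide, and asymmetric otherwise.
print(kl_divergence([0.5, 0.5], [0.9, 0.1]))
```

Note that D(P||Q) is not a metric: it is asymmetric and does not satisfy the triangle inequality, which is part of why it is used as a training objective rather than a distance.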
Abstract:
Explaining to a robot what to do is a difficult undertaking, and so far only specific types of people have been able to do so, such as programmers or operators who have learned how to use controllers to communicate with a robot. The goal of my internship was to design and develop a framework that makes this easier. The system uses deep learning techniques to recognize a set of hand gestures, both static and dynamic. Then, based on the gesture, it sends a command to a robot. To be as generic as feasible, the communication is implemented using the Robot Operating System (ROS). Furthermore, users can add new recognizable gestures and link them to new robot actions; a finite state automaton enforces the validation of the users' input and the correct action sequence. Finally, users can create and use macros to describe a sequence of actions performable by a robot.
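The finite state automaton that enforces a correct action sequence can be sketched as follows. The states, gestures, and transitions here are hypothetical stand-ins, not the abstract's actual gesture set.

```python
# Minimal finite-state automaton sketch: a gesture is accepted only when it is
# a valid transition from the current state, rejecting out-of-sequence commands.

class ActionAutomaton:
    def __init__(self, transitions, start):
        self.transitions = transitions  # {(state, gesture): next_state}
        self.state = start

    def accept(self, gesture):
        """Apply a gesture if valid in the current state; return success."""
        key = (self.state, gesture)
        if key not in self.transitions:
            return False  # invalid here: the robot stays in its current state
        self.state = self.transitions[key]
        return True

# Hypothetical protocol: the robot must be "ready" before it can grab, and
# must be "holding" before it can release.
fsm = ActionAutomaton(
    transitions={
        ("idle", "wave"): "ready",
        ("ready", "fist"): "holding",   # closed fist -> grab command
        ("holding", "open"): "idle",    # open hand -> release command
    },
    start="idle",
)
fsm.accept("fist")  # rejected: "fist" is not valid from "idle"
fsm.accept("wave")  # accepted: moves the automaton to "ready"
```

In a real setup each accepted transition would publish the linked command on a ROS topic; the automaton only gates which commands are allowed to fire.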
Abstract:
The role of computer science has become key to the functioning of the modern world, which is progressively digitalizing every single aspect of an individual's life. As programs grow in complexity and size, error detection becomes an increasingly difficult activity that demands time and resources. Traditional source code analysis mechanisms have existed since the birth of computer science itself, and their role within a programming team's production chain has never been as fundamental as it is today. These analysis mechanisms, however, are not free of problems: execution time on large projects and the rate of false positives can become significant issues. For these reasons, mechanisms based on Machine Learning, and more specifically Deep Learning, have been developed in recent years. This thesis aims to explore and develop a Deep Learning model for detecting errors in any source file written in C or C++.
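A model of this kind typically consumes source code as a token sequence rather than raw text. The sketch below shows that first step on a C snippet; the regular expression is a deliberately minimal stand-in, not a real C lexer and not the thesis's actual preprocessing.

```python
import re

# Crude tokenizer for C-like source: identifiers, numbers, two-character
# operators, then single-character punctuation. Alternation order matters so
# that "==" is matched before "=".
TOKEN_RE = re.compile(
    r"[A-Za-z_]\w*|\d+|==|!=|<=|>=|->|&&|\|\||[{}()\[\];,+\-*/=<>&|!.]"
)

def tokenize(source):
    return TOKEN_RE.findall(source)

# A snippet containing a double free, the kind of defect a learned detector
# would be trained to flag from the token sequence.
snippet = "if (ptr != NULL) { free(ptr); free(ptr); }"
tokens = tokenize(snippet)
print(tokens[:6])
```

The token sequence would then be embedded and fed to the network; comments and whitespace are discarded here, which real pipelines sometimes keep as extra signal.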
Abstract:
The fashion world is in continuous and constant evolution, not only from a social standpoint but also from a technological one. This work studies the possibility of recognizing and segmenting garments in an image using deep neural networks and modern approaches. Networks such as FasterRCNN, MaskRCNN, YOLOv5, FashionPedia and Match-RCNN were analyzed. The training of deep neural networks in highly parallelized scenarios and on machines equipped with multiple GPUs was then investigated, in order to reduce training times. In addition, we experimented with building a network to predict whether a given garment will be successful in the future by simply analyzing past data and an image of the garment in question. These tasks also required an in-depth analysis of the existing fashion datasets and of the methods for using them in training. This work was carried out within the FA.RE.TRA. project, for which the University of Bologna acts as a consultant on the feasibility study of neural networks capable of performing the aforementioned tasks.
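Detection and segmentation networks like those listed above are commonly evaluated with intersection over union (IoU) between predicted and ground-truth regions. A minimal box-level sketch, with illustrative coordinates:

```python
# Intersection over union (IoU) for axis-aligned boxes (x1, y1, x2, y2).
# IoU = 1 for identical boxes, 0 for disjoint ones.

def box_iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)  # overlap area, 0 if disjoint
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# Two 10x10 boxes overlapping on half their width share 50 of 150 total units.
print(box_iou((0, 0, 10, 10), (5, 0, 15, 10)))  # -> 1/3
```

Mask-level IoU for segmentation works the same way, with pixel counts replacing box areas.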
Abstract:
This thesis aims to research, examine and implement a Machine Learning system, more precisely a Recommendation System, for the optimal recommendation of legal documents that have already been analyzed and appropriately categorized. Its purpose is to complement an already implemented Information Retrieval system, deployed as a web application, that allows users to search for the aforementioned legal documents.
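One common way to recommend documents similar to the one a user is reading is cosine similarity over TF-IDF vectors. The sketch below is a generic content-based baseline under that assumption, with placeholder texts; it is not the thesis's actual system.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Sparse TF-IDF vectors (dicts) for a list of whitespace-tokenized texts."""
    tokenized = [doc.lower().split() for doc in docs]
    df = Counter(t for toks in tokenized for t in set(toks))  # document frequency
    n = len(docs)
    vecs = []
    for toks in tokenized:
        tf = Counter(toks)
        vecs.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return vecs

def cosine(u, v):
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Placeholder "legal documents": recommend for the first one by ranking the rest.
docs = [
    "contract breach damages claim",
    "damages claim tort liability",
    "trademark registration procedure",
]
vecs = tfidf_vectors(docs)
scores = sorted(((cosine(vecs[0], vecs[i]), i) for i in range(1, 3)), reverse=True)
print(scores[0][1])  # index of the most similar document
```

Since the documents are already categorized, a production system could combine this text similarity with category overlap as an additional ranking signal.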
Abstract:
Many real-world decision-making problems are defined based on forecast parameters: for example, one may plan an urban route by relying on traffic predictions. In these cases, the conventional approach consists in training a predictor and then solving an optimization problem. This may be problematic, since mistakes made by the predictor may trick the optimizer into taking dramatically wrong decisions. Recently, the field of Decision-Focused Learning has overcome this limitation by merging the two stages at training time, so that predictions are rewarded and penalized based on their outcome in the optimization problem. There are, however, still significant challenges toward a widespread adoption of the method, mostly related to limitations in terms of generality and scalability. One possible solution for dealing with the second problem is introducing a caching-based approach to speed up the training process. This project investigates these techniques in order to reduce the number of solver calls even further. For each considered method, we designed a particular smart sampling approach based on its characteristics. In the case of the SPO method, we discovered that it is sufficient to initialize the cache with just a few solutions: those needed to filter the elements that still need to be properly learned. For the Blackbox method, we designed a smart sampling approach based on inferred solutions.
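The caching idea can be illustrated in a few lines: instead of invoking the solver for every predicted cost vector during training, select the best solution already stored in the cache. The toy problem (pick exactly one of three items, maximize predicted value) and the cached solutions are illustrative only, not the setups studied in the project.

```python
# Cache-based stand-in for a solver call in decision-focused learning:
# evaluate each cached feasible solution under the predicted costs and return
# the best one, avoiding an expensive exact solve.

def solve_from_cache(cache, predicted_costs):
    """Best cached solution under a linear objective sum(c_i * x_i)."""
    return max(cache, key=lambda sol: sum(c * x for c, x in zip(predicted_costs, sol)))

# Cache initialized with a few feasible one-hot selections.
cache = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
print(solve_from_cache(cache, [3.0, 7.0, 1.0]))  # -> (0, 1, 0)
```

The quality of the gradient signal then depends on how the cache is populated, which is exactly where the smart sampling strategies described above come in.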
Abstract:
Brain-machine interfaces (BMIs) make it possible to drive external devices using neural signals. BMIs are an important technology for attempting to restore functions lost to pathologies that interrupt the communication channel between brain and body, such as neurodegenerative diseases or spinal cord injuries. Key to the correct functioning of a BMI is decoding neural signals and transforming them into signals suitable for driving external devices. Over the years, several types of algorithms have been implemented. Among these, machine learning algorithms learn to recognize neural activation patterns, mapping with great efficiency the input, e.g. the activity of neurons, to the output, e.g. the motor commands to drive a prosthesis. Among machine learning algorithms, this work focuses on deep neural networks (DNNs). One problem with DNNs is their long training time, which involves computing the optimal network parameters that minimize the prediction error. To mitigate this problem, convolutional neural networks (CNNs) can be used: networks characterized by fewer trainable parameters than other types of DNNs, such as recurrent neural networks (RNNs). This work presents a study exploring the innovative use of CNNs to decode the activity of neurons recorded from an awake macaque performing motor tasks. The resulting CNN achieved results comparable to the state of the art with fewer trainable parameters. In the future, this characteristic could be key to the use of this type of network within BMIs thanks to reduced computation times, allowing the real-time translation of a neural signal into signals to move neuroprostheses.
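The parameter-efficiency claim can be made concrete with a back-of-the-envelope count: a convolutional layer shares its kernel across time, while a fully connected layer grows with the flattened input. The layer sizes below are illustrative, not those of the network in the thesis.

```python
# Parameter counts for a 1-D convolutional layer versus a dense layer.

def conv1d_params(in_channels, out_channels, kernel_size):
    # Each output channel has one kernel per input channel, plus a bias.
    return out_channels * (in_channels * kernel_size + 1)

def dense_params(in_features, out_features):
    # Full weight matrix plus one bias per output.
    return out_features * (in_features + 1)

# Hypothetical recording: 100 neurons over 100 time steps.
conv = conv1d_params(in_channels=100, out_channels=64, kernel_size=5)   # 32,064
dense = dense_params(in_features=100 * 100, out_features=64)            # 640,064
print(conv, dense)
```

The convolution is roughly 20x smaller here because its kernel is reused at every time step, which is the property that shortens training relative to parameter-heavy recurrent layers.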
Abstract:
Due to the imprecise nature of biological experiments, biological data are often characterized by the presence of redundant and noisy data. This may be due to errors that occurred during data collection, such as contaminations in laboratory samples. This is the case of gene expression data, where the equipment and tools currently used frequently produce noisy biological data. Machine Learning algorithms have been successfully used in gene expression data analysis. Although many Machine Learning algorithms can deal with noise, detecting and removing noisy instances from the training data set can help the induction of the target hypothesis. This paper evaluates the use of distance-based pre-processing techniques for noise detection in gene expression data classification problems. The evaluation analyzes the effectiveness of the investigated techniques in removing noisy data, measured by the accuracy obtained by different Machine Learning classifiers over the pre-processed data.
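A representative distance-based noise filter in the spirit of the techniques evaluated is Wilson's Edited Nearest Neighbor rule: an instance is flagged as noisy when the majority of its k nearest neighbors disagree with its label. The toy 1-D data set below is illustrative, not gene expression data.

```python
# Edited Nearest Neighbor (ENN) style noise filter: keep an instance only if
# the majority of its k nearest neighbors share its label.

def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def enn_filter(X, y, k=3):
    """Return the indices of instances kept after filtering."""
    kept = []
    for i, (xi, yi) in enumerate(zip(X, y)):
        dists = sorted(
            (euclidean(xi, xj), yj)
            for j, (xj, yj) in enumerate(zip(X, y)) if j != i
        )
        votes = [label for _, label in dists[:k]]
        if votes.count(yi) > k // 2:  # neighbor majority agrees with the label
            kept.append(i)
    return kept

# The last instance sits among class "A" points but is labeled "B": likely noise.
X = [(0.0,), (0.1,), (0.2,), (5.0,), (5.1,), (0.15,)]
y = ["A", "A", "A", "B", "B", "B"]
print(enn_filter(X, y))  # index 5 is removed
```

Classifier accuracy on the filtered training set versus the raw one is then the kind of measurement the paper uses to judge a filter's effectiveness.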
Abstract:
Thanks to recent advances in molecular biology, combined with an ever-increasing amount of experimental data, the functional state of thousands of genes can now be extracted simultaneously using methods such as cDNA microarrays and RNA-Seq. Particularly important related investigations are the modeling and identification of gene regulatory networks from expression data sets. Such knowledge is fundamental for many applications, such as disease treatment, therapeutic intervention strategies and drug design, as well as for planning new high-throughput experiments. Methods have been developed for gene network modeling and identification from expression profiles. However, an important open problem regards how to validate such approaches and their results. This work presents an objective approach for the validation of gene network modeling and identification which comprises the following three main aspects: (1) Artificial Gene Network (AGN) model generation through theoretical models of complex networks, which is used to simulate temporal expression data; (2) a computational method for gene network identification from the simulated data, founded on a feature selection approach in which a target gene is fixed and the expression profiles of all other genes are observed in order to identify a relevant subset of predictors; and (3) validation of the identified AGN-based network through comparison with the original network. The proposed framework allows several types of AGNs to be generated and used to simulate temporal expression data. The results of the network identification method can then be compared to the original network in order to estimate its properties and accuracy. Some of the most important theoretical models of complex networks have been assessed: the uniformly random Erdos-Renyi (ER), the small-world Watts-Strogatz (WS), the scale-free Barabasi-Albert (BA), and geographical networks (GG).
The experimental results indicate that the inference method was sensitive to variations in the average degree k, with its network recovery rate decreasing as k increases. The signal size was important for the inference method to achieve better accuracy in the network identification rate, presenting very good results with small expression profiles. However, the adopted inference method was not able to recognize distinct structures of interaction among genes, presenting similar behavior when applied to different network topologies. In summary, the proposed framework, though simple, was adequate for the validation of the inferred networks, identifying some properties of the evaluated method, and can be extended to other inference methods.
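The validation step, comparing an inferred network against the original AGN, amounts to precision and recall over edge sets; the recall corresponds to the recovery rate discussed above. The gene names and edge lists below are illustrative.

```python
# Edge-level validation of an inferred gene network against the ground truth.

def edge_recovery(true_edges, inferred_edges):
    """Precision and recall of inferred edges against the original network."""
    true_set, inf_set = set(true_edges), set(inferred_edges)
    tp = len(true_set & inf_set)                      # correctly recovered edges
    precision = tp / len(inf_set) if inf_set else 0.0
    recall = tp / len(true_set) if true_set else 0.0  # the "recovery rate"
    return precision, recall

true_net = [("g1", "g2"), ("g2", "g3"), ("g3", "g4")]
inferred = [("g1", "g2"), ("g2", "g3"), ("g1", "g4")]
print(edge_recovery(true_net, inferred))  # two of three edges recovered
```

Repeating this comparison across ER, WS, BA and GG topologies, and across values of the average degree k, yields exactly the kind of sensitivity curves the abstract reports.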
Abstract:
Hepatocellular carcinoma (HCC) ranks among the top 10 cancers worldwide in prevalence and mortality. Butyric acid (BA), a member of the histone deacetylase inhibitors (HDACi), has been proposed as an anticarcinogenic agent. However, its short half-life is a therapeutic limitation. This problem could be circumvented with tributyrin (TB), a proposed BA prodrug. To investigate TB effectiveness for chemoprevention, rats were treated with the compound during the initial phases of the "resistant hepatocyte" model of hepatocarcinogenesis, and cellular and molecular parameters were evaluated. TB inhibited (p < 0.05) the development of hepatic preneoplastic lesions (PNL), including persistent ones considered HCC progression sites. TB increased (p < 0.05) PNL remodeling, a process whereby they tend to disappear. TB did not inhibit cell proliferation in PNL, but induced (p < 0.05) apoptosis in remodeling ones. Compared to controls, rats treated with TB presented increased (p < 0.05) hepatic levels of BA, indicating its effectiveness as a prodrug. The molecular mechanisms of TB-induced chemoprevention of hepatocarcinogenesis were investigated. TB increased (p < 0.05) hepatic nuclear histone H3K9 hyperacetylation, specifically in PNL, and p21 protein expression, which could be associated with inhibitory HDAC effects. Moreover, it reduced (p < 0.05) the frequency of persistent PNL with aberrant cytoplasmic p53 accumulation, an alteration associated with increased malignancy. The original data observed in our study support the effectiveness of TB as a prodrug of BA and as an HDACi in hepatocarcinogenesis chemoprevention. Besides histone acetylation and restored p21 expression, the molecular mechanisms involved in TB's anticarcinogenic actions could also be related to modulation of p53 pathways. (C) 2008 Wiley-Liss, Inc.
Abstract:
DNA mismatch repair is an important mechanism involved in maintaining the fidelity of genomic DNA. Defective DNA mismatch repair is implicated in a variety of gastrointestinal and other tumors; however, its role in hepatocellular carcinoma (HCC) has not been assessed. Formalin-fixed, paraffin-embedded archival pathology tissues from 46 primary liver tumors were studied by microdissection and microsatellite analysis of extracted DNA to assess the degree of microsatellite instability, a marker of defective mismatch repair, and to determine the extent and timing of allelic loss of two DNA mismatch repair genes, human Mut S homologue-2 (hMSH2) and human Mut L homologue-1 (hMLH1), and the tumor suppressor genes adenomatous polyposis coli (APC), p53, and DPC4. Microsatellite instability was detected in 16 of the tumors (34.8%). Loss of heterozygosity at microsatellites linked to the DNA mismatch repair genes, hMSH2 and/or hMLH1, was found in 9 cases (19.6%), usually in association with microsatellite instability. Importantly, the pattern of allelic loss was uniform in 8 of these 9 tumors, suggesting that clonal loss had occurred. Moreover, loss at these loci also occurred in nonmalignant tissue adjacent to 4 of these tumors, where it was associated with marked allelic heterogeneity. There was relatively infrequent loss of the APC, p53, or DPC4 loci, which appeared unrelated to loss of the hMSH2 or hMLH1 gene loci. Loss of heterozygosity at the hMSH2 and/or hMLH1 gene loci, and the associated microsatellite instability in premalignant hepatic tissues, suggests a possible causal role in hepatic carcinogenesis in a subset of hepatomas.
Abstract:
Hepatocellular carcinoma (HCC) is associated with multiple risk factors and is believed to arise from pre-neoplastic lesions, usually in the background of cirrhosis. However, the genetic and epigenetic events of hepatocarcinogenesis are relatively poorly understood. HCCs display gross genomic alterations, including chromosomal instability (CIN), CpG island methylation, DNA rearrangements associated with hepatitis B virus (HBV) DNA integration, DNA hypomethylation and, to a lesser degree, microsatellite instability. Various studies have reported CIN at chromosomal regions 1p, 4q, 5q, 6q, 8p, 10q, 11p, 16p, 16q, 17p and 22q. Frequent promoter hypermethylation and subsequent loss of protein expression have also been demonstrated in HCC at the tumor suppressor genes (TSGs) p16, p14, p15, SOCS1, RIZ1, E-cadherin and 14-3-3 sigma. An interesting observation emerging from these studies is the presence of a methylator phenotype in hepatocarcinogenesis, although it does not seem advantageous to have high levels of microsatellite instability. Methylation also appears to be an early event, suggesting that it may precede cirrhosis. However, these genes have been studied in isolation, and global studies of the methylator phenotype are required to assess the significance of epigenetic silencing in hepatocarcinogenesis. Based on previous data there are obvious fundamental differences in the mechanisms of hepatic carcinogenesis, with at least two distinct mechanisms of malignant transformation in the liver, related to CIN and CpG island methylation. The reason for these differences and the relative importance of these mechanisms are not clear but likely relate to the etiopathogenesis of HCC. Defining these broad mechanisms is a necessary prelude to determining the timing of events in malignant transformation of the liver and to investigating the role of known risk factors for HCC.
Abstract:
The longest open reading frame of PKHD1 (polycystic kidney and hepatic disease 1), the autosomal recessive polycystic kidney disease (ARPKD) gene, encodes a single-pass, integral membrane protein named polyductin or fibrocystin. A fusion protein comprising its intracellular C-terminus, FP2, was previously used to raise a polyclonal antiserum shown to detect polyductin in several human tissues, including liver. In the current study, we aimed to investigate by immunohistochemistry the detailed polyductin localization pattern in normal (ductal plate [DP], remodelling ductal plate [RDP], remodelled bile ducts) and abnormal development of the primitive intrahepatic biliary system, known as ductal plate malformation (DPM). This work also included the characterization of polyductin expression profile in various histological forms of neonatal and infantile cholestasis, and in cholangiocellular carcinoma (CCC) and hepatocellular carcinoma (HCC). We detected polyductin expression in the intrahepatic biliary system during the DP and the RDP stages as well as in DPM. No specific staining was found at the stage of remodelled bile ducts. Polyductin was also detected in liver biopsies with neonatal cholestasis, including mainly biliary atresia and neonatal hepatitis with ductular reaction as well as congenital hepatic fibrosis. In addition, polyductin was present in CCC, whereas it was absent in HCC. Polyductin was also co-localized in some DP cells together with oval stem cell markers. These results represent the first systematic study of polyductin expression in human pathologies associated with abnormal development of intrahepatic biliary tree, and support the following conclusions: (i) polyductin expression mirrors developmental properties of the primitive intrahepatic biliary system; (ii) polyductin is re-expressed in pathological conditions associated with DPM and (iii) polyductin might be a potential marker to distinguish CCC from HCC.
Abstract:
Recent studies have demonstrated that spatial patterns of fMRI BOLD activity distribution over the brain may be used to classify different groups or mental states. These studies are based on the application of advanced pattern recognition approaches and multivariate statistical classifiers. Most published articles in this field are focused on improving accuracy rates, and many approaches have been proposed to accomplish this task. Nevertheless, a point inherent to most machine learning methods (and still relatively unexplored in neuroimaging) is how the discriminative information can be used to characterize groups and their differences. In this work, we introduce the Maximum Uncertainty Linear Discrimination Analysis (MLDA) and show how it can be applied to infer groups' patterns by discriminant hyperplane navigation. In addition, we show that it naturally defines a behavioral score, i.e., an index quantifying the distance of a subject's state from predefined groups. We validate and illustrate this approach using data from a motor block-design fMRI experiment with 35 subjects. (C) 2008 Elsevier Inc. All rights reserved.
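The behavioral score idea can be sketched with a plain two-feature Fisher discriminant standing in for MLDA: project a subject's activity pattern onto the discriminant direction, so its score falls between the group means. The two groups and the data are illustrative, not the fMRI data of the study.

```python
# Fisher linear discriminant in 2-D: direction w = S_w^-1 (m1 - m0), where
# S_w is the pooled within-class scatter. A subject's "score" is w . x.

def mean(vs):
    n = len(vs)
    return [sum(v[i] for v in vs) / n for i in range(len(vs[0]))]

def fisher_direction(group0, group1):
    m0, m1 = mean(group0), mean(group1)
    s = [[0.0, 0.0], [0.0, 0.0]]  # pooled within-class scatter (2x2)
    for group, m in ((group0, m0), (group1, m1)):
        for v in group:
            d = [v[0] - m[0], v[1] - m[1]]
            s[0][0] += d[0] * d[0]; s[0][1] += d[0] * d[1]
            s[1][0] += d[1] * d[0]; s[1][1] += d[1] * d[1]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    dm = [m1[0] - m0[0], m1[1] - m0[1]]
    # Explicit 2x2 inverse applied to the mean difference.
    return [(s[1][1] * dm[0] - s[0][1] * dm[1]) / det,
            (-s[1][0] * dm[0] + s[0][0] * dm[1]) / det]

def score(w, x):
    return w[0] * x[0] + w[1] * x[1]

# Hypothetical "rest" vs "motor" activity patterns (two features per subject).
rest = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2)]
motor = [(1.0, 1.1), (1.2, 0.9), (0.9, 1.0)]
w = fisher_direction(rest, motor)
print(score(w, (0.5, 0.5)))  # an intermediate pattern scores between the groups
```

MLDA itself replaces the plain scatter inverse with a maximum-uncertainty regularized estimate, which matters when features vastly outnumber subjects, as in whole-brain fMRI; the projection-and-score logic is the same.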