899 results for Multiple kernel learning
Abstract:
Whole Exome Sequencing (WES) is rapidly becoming the first-tier test in clinics, thanks both to its declining costs and to the development of new platforms that help clinicians analyze and interpret SNVs and InDels. However, we still know very little about how CNV detection could increase the diagnostic yield of WES. A plethora of exome CNV callers have been published over the years, each performing well on specific CNV classes and sizes, which suggests that a combination of multiple tools is needed to obtain good overall detection performance. Here we present TrainX, an ML-based method for calling heterozygous CNVs in WES data using EXCAVATOR2 Normalized Read Counts. We select the non-pseudo-autosomal chromosome X alignments of males and females to construct our dataset and train our model, make predictions on autosomal target regions, and use an HMM to call CNVs. We compared TrainX against a set of CNV tools that differ in detection method (GATK4 gCNV, ExomeDepth, DECoN, CNVkit and EXCAVATOR2) and found that our algorithm outperformed them in terms of stability, as we identified both deletions and duplications with good scores (F1-scores of 0.87 and 0.82, respectively) and for sizes down to the minimum resolution of 2 target regions. We also evaluated the robustness of the method using a set of WES and SNP array data (n=251), part of the Italian cohort of the Epi25 collaborative, and were able to retrieve all clinical CNVs previously identified by the SNP array. TrainX showed good accuracy in detecting heterozygous CNVs of different sizes, making it a promising tool for use in a diagnostic setting.
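The abstract above says only that per-target predictions are passed to an HMM to call CNVs; it does not give the model's states or parameters. As a rough illustration of that general idea, the following sketch runs Viterbi decoding over a sequence of per-target class labels and collapses the result into segments. The three states, the `p_correct` and `p_stay` values, and the function names `viterbi` and `call_segments` are all hypothetical and are not taken from TrainX.

```python
import math

# Hypothetical hidden states; TrainX's actual state space is not given in the abstract.
STATES = ("deletion", "normal", "duplication")


def viterbi(observations, p_correct=0.95, p_stay=0.9):
    """Return the most likely state path for a sequence of per-target labels.

    p_correct: assumed probability that a per-target label equals the true state.
    p_stay:    assumed probability that adjacent targets share the same state.
    """
    if not observations:
        return []
    n = len(STATES)

    def log_emit(state, obs):
        return math.log(p_correct if state == obs else (1 - p_correct) / (n - 1))

    def log_trans(prev, cur):
        return math.log(p_stay if prev == cur else (1 - p_stay) / (n - 1))

    # Uniform prior over states for the first target.
    score = {s: math.log(1 / n) + log_emit(s, observations[0]) for s in STATES}
    back = []
    for obs in observations[1:]:
        new_score, pointers = {}, {}
        for cur in STATES:
            best_prev = max(STATES, key=lambda prev: score[prev] + log_trans(prev, cur))
            pointers[cur] = best_prev
            new_score[cur] = score[best_prev] + log_trans(best_prev, cur) + log_emit(cur, obs)
        back.append(pointers)
        score = new_score

    # Trace the best path backwards from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def call_segments(path):
    """Collapse a state path into (start, end, state) runs of non-normal states (end inclusive)."""
    segments, start = [], 0
    for i in range(1, len(path) + 1):
        if i == len(path) or path[i] != path[start]:
            if path[start] != "normal":
                segments.append((start, i - 1, path[start]))
            start = i
    return segments
```

With these illustrative parameters, an isolated non-normal label is smoothed away, while a run of consecutive non-normal labels is reported as one segment even if a single target inside it was labelled normal.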
Abstract:
Reinforcement Learning (RL) provides a powerful framework to address sequential decision-making problems in which the transition dynamics are unknown or too complex to be represented. The RL approach is based on speculating about the best decision to make given sample estimates obtained from previous interactions, a recipe that has led to several breakthroughs in various domains, ranging from game playing to robotics. Despite their success, current RL methods hardly generalize from one task to another, and achieving the kind of generalization obtained through unsupervised pre-training in non-sequential problems seems unthinkable. Unsupervised RL has recently emerged as a way to improve the generalization of RL methods. Just like its non-sequential counterpart, the unsupervised RL framework comprises two phases: an unsupervised pre-training phase, in which the agent interacts with the environment without external feedback, and a supervised fine-tuning phase, in which the agent aims to efficiently solve a task in the same environment by exploiting the knowledge acquired during pre-training. In this thesis, we study unsupervised RL via state entropy maximization, in which the agent makes use of the unsupervised interactions to pre-train a policy that maximizes the entropy of its induced state distribution. First, we provide a theoretical characterization of the learning problem by considering a convex RL formulation that subsumes state entropy maximization. Our analysis shows that maximizing the state entropy in finite trials is inherently harder than RL. Then, we study the state entropy maximization problem from an optimization perspective. In particular, we show that the primal formulation of the corresponding optimization problem can be (approximately) addressed through tractable linear programs.
Finally, we provide the first practical methodologies for state entropy maximization in complex domains, both when pre-training takes place in a single environment and when it spans multiple environments.
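For a finite state space, the quantity being maximized is the Shannon entropy of the state distribution induced by the policy, H(d) = -sum over s of d(s) log d(s). The sketch below shows the simplest way to estimate it, a plug-in estimate from the states visited during rollouts. It is only an illustration of the objective, not the estimator or algorithm used in the thesis, and the function name `state_entropy` is an assumption.

```python
import math
from collections import Counter


def state_entropy(visited_states):
    """Plug-in estimate (in nats) of the entropy of the empirical state distribution.

    visited_states: iterable of hashable state identifiers collected from rollouts.
    """
    counts = Counter(visited_states)
    total = sum(counts.values())
    # Empirical frequency c / total stands in for the induced state distribution d(s).
    return -sum((c / total) * math.log(c / total) for c in counts.values())
```

A policy that visits every state equally often attains the maximum value, log of the number of states, while a policy that stays in one state scores zero.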
Abstract:
The recent widespread use of social media platforms and web services has led to a vast amount of behavioral data that can be used to model socio-technical systems. A significant part of this data can be represented as graphs or networks, which have become the prevalent mathematical framework for studying the structure and the dynamics of complex interacting systems. However, analyzing and understanding these data presents new challenges due to their increasing complexity and diversity. For instance, the characterization of real-world networks requires accounting for their temporal dimension and incorporating higher-order interactions beyond the traditional pairwise formalism. The ongoing growth of AI has led to the integration of traditional graph mining techniques with representation learning and low-dimensional embeddings of networks to address current challenges. These methods capture the underlying similarities and geometry of graph-shaped data, generating latent representations that enable the resolution of various tasks, such as link prediction, node classification, and graph clustering. As these techniques gain popularity, there is also growing concern about their responsible use. In particular, there has been an increased emphasis on addressing the limitations of interpretability in graph representation learning. This thesis contributes to the advancement of knowledge in the field of graph representation learning and has potential applications in a wide range of complex systems domains. We initially focus on forecasting problems related to face-to-face contact networks with time-varying graph embeddings. Then, we study hyperedge prediction and reconstruction with simplicial complex embeddings. Finally, we analyze the problem of interpreting latent dimensions in node embeddings for graphs.
The proposed models are extensively evaluated in multiple experimental settings and the results demonstrate their effectiveness and reliability, achieving state-of-the-art performances and providing valuable insights into the properties of the learned representations.
Abstract:
The Cherenkov Telescope Array (CTA) will be the next-generation ground-based observatory to study the universe in the very-high-energy domain. The observatory will rely on a Science Alert Generation (SAG) system to analyze the real-time data from the telescopes and generate science alerts. The SAG system will play a crucial role in the search and follow-up of transients from external alerts, enabling multi-wavelength and multi-messenger collaborations. It will maximize the potential for the detection of the rarest phenomena, such as gamma-ray bursts (GRBs), which are the science case for this study. This study presents an anomaly detection method based on deep learning for detecting gamma-ray burst events in real time. The performance of the proposed method is evaluated and compared against the standard Li & Ma technique in two use cases, serendipitous discoveries and follow-up observations, using short exposure times. The method shows promising results in detecting GRBs and is flexible enough to allow real-time search for transient events on multiple time scales. The method assumes neither background nor source models and does not require a minimum number of photon counts to perform the analysis, making it well-suited for real-time analysis. Future improvements involve further tests, relaxing some of the assumptions made in this study, and post-trials correction of the detection significance. Moreover, the ability to detect other transient classes in different scenarios must be investigated for completeness. The system can be integrated within the SAG system of CTA and deployed on the onsite computing clusters. This would provide valuable insights into the method's performance in a real-world setting and be another valuable tool for discovering new transient events in real time.
Overall, this study makes a significant contribution to the field of astrophysics by demonstrating the effectiveness of deep learning-based anomaly detection techniques for real-time source detection in gamma-ray astronomy.
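For context on the baseline named above, the Li & Ma technique is usually applied through the likelihood-ratio significance formula of Li & Ma (1983, Eq. 17), which compares counts in an on-source region with counts in an off-source region. The sketch below implements that standard formula; the function name and argument names are illustrative, and the abstract does not say exactly how the study configured the baseline.

```python
import math


def li_ma_significance(n_on, n_off, alpha):
    """Li & Ma (1983) Eq. 17 detection significance, returned as a magnitude.

    n_on:  counts in the on-source region
    n_off: counts in the off-source (background) region
    alpha: ratio of on-source to off-source exposure
    """
    if n_on <= 0 or n_off <= 0 or alpha <= 0:
        raise ValueError("n_on, n_off and alpha must all be positive")
    total = n_on + n_off
    term_on = n_on * math.log((1 + alpha) / alpha * n_on / total)
    term_off = n_off * math.log((1 + alpha) * n_off / total)
    # Clamp at zero to guard against tiny negative values from rounding.
    return math.sqrt(max(0.0, 2.0 * (term_on + term_off)))
```

The formula rests on a large-count approximation, which is one reason a method with no minimum photon-count requirement is attractive for short exposures.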
Abstract:
Machine Learning makes computers capable of performing tasks typically requiring human intelligence. A domain where it is having a considerable impact is the life sciences, where it makes it possible to devise new biological analysis protocols, develop patient treatments faster and more efficiently, and reduce healthcare costs. This thesis presents new Machine Learning methods and pipelines for the life sciences, focusing on the unsupervised field. At a methodological level, two methods are presented. The first is an “Ab Initio Local Principal Path”, a revised and improved version of a pre-existing algorithm in the manifold learning realm. The second contribution is an improvement over the Import Vector Domain Description (one-class learning) through the Kullback-Leibler divergence. It hybridizes kernel methods with Deep Learning, obtaining a scalable solution, an improved probabilistic model, and state-of-the-art performance. Both methods are tested through several experiments, with a central focus on their relevance in the life sciences. Results show that they improve on the performance achieved by their previous versions. At the applicative level, two pipelines are presented. The first one is for the analysis of RNA-Seq datasets, both transcriptomic and single-cell data, and is aimed at identifying genes that may be involved in biological processes (e.g., the transition of tissues from normal to cancer). In this project, an R package is released on CRAN to make the pipeline accessible to the bioinformatics community through high-level APIs. The second pipeline is in the drug discovery domain and is useful for identifying druggable pockets, namely regions of a protein with a high probability of accepting a small molecule (a drug). Both these pipelines achieve remarkable results. Lastly, a detour application is developed to identify the strengths and limitations of the “Principal Path” algorithm by analyzing vector spaces induced by Convolutional Neural Networks.
This application is conducted in the music and visual arts domains.
Abstract:
In medicine, innovation depends on better knowledge of the mechanisms of the human body, which is a complex system of multi-scale constituents. Unraveling the complexity underlying diseases proves to be challenging. A deep understanding of the inner workings requires dealing with a great deal of heterogeneous information. Exploring the molecular status and the organization of genes, proteins, and metabolites provides insights into what drives a disease, from aggressiveness to curability. Molecular constituents, however, are only the building blocks of the human body and cannot currently tell the whole story of diseases. This is why attention is growing towards the simultaneous exploitation of multi-scale information. Holistic methods are therefore drawing interest as a way to address the problem of integrating heterogeneous data. The heterogeneity may derive from the diversity across data types and from the diversity within diseases. Here, four studies conducted data integration using custom-designed workflows that implement novel methods and views to tackle the heterogeneous characterization of diseases. The first study was devoted to determining shared gene regulatory signatures for onco-hematology, and it showed partial co-regulation across blood-related diseases. The second study focused on Acute Myeloid Leukemia and refined the unsupervised integration of genomic alterations, which turned out to better resemble clinical practice. In the third study, network integration for atherosclerosis demonstrated, as a proof of concept, the impact of network intelligibility when modeling heterogeneous data, which was shown to accelerate the identification of new potential pharmaceutical targets. Lastly, the fourth study introduced a new method to integrate multiple data types in a single latent heterogeneous representation that facilitated the selection of important data types to predict the tumour stage of invasive ductal carcinoma.
The results of these four studies laid the groundwork to ease the detection of new biomarkers ultimately beneficial to medical practice and to the ever-growing field of Personalized Medicine.
Abstract:
Sales prediction plays a huge role in modern business strategies. One of its many use cases revolves around estimating the effects of promotions. While promotions generally have a positive effect on sales of the promoted product, they can also have a negative effect on those of other products. This phenomenon is called sales cannibalisation. Sales cannibalisation can pose a big problem for sales forecasting algorithms. These algorithms often focus on the sales over time of a single product in a single store (a couple). This research focuses on using knowledge of a product across multiple stores. To achieve this, we applied transfer learning to a neural model developed by Kantar Consulting to demonstrate an approach to estimating the effect of cannibalisation. Our results show a performance increase of between 10 and 14 percent. This is a very good and desired result, and Kantar will use the approach when integrating this test method into their actual systems.
Abstract:
The cerebellum is an important site for cortical demyelination in multiple sclerosis, but the functional significance of this finding is not fully understood. The objective of this study was to evaluate the clinical and cognitive impact of cerebellar grey-matter pathology in multiple sclerosis patients. Forty-two relapsing-remitting multiple sclerosis patients and 30 controls underwent clinical assessment, including the Multiple Sclerosis Functional Composite, Expanded Disability Status Scale (EDSS) and cerebellar functional system (FS) score, and cognitive evaluation, including the Paced Auditory Serial Addition Test (PASAT) and the Symbol-Digit Modalities Test (SDMT). Magnetic resonance imaging was performed with a 3T scanner, and the variables of interest were: brain white-matter and cortical lesion load, cerebellar intracortical and leukocortical lesion volumes, and brain cortical and cerebellar white-matter and grey-matter volumes. After multivariate analysis, a high burden of cerebellar intracortical lesions was the only predictor of the EDSS (p<0.001), cerebellar FS (p = 0.002), arm function (p = 0.049), and leg function (p<0.001). Patients with a high burden of cerebellar leukocortical lesions had lower PASAT scores (p = 0.013), while patients with greater volumes of cerebellar intracortical lesions had worse SDMT scores (p = 0.015). Cerebellar grey-matter pathology is widely present and contributes to clinical dysfunction in relapsing-remitting multiple sclerosis patients, independently of brain grey-matter damage.
Abstract:
Desmoid tumor (DT) is a common manifestation of Gardner's Syndrome (GS), although it is a rare condition in the general population. DT in patients with GS is usually located in the abdominal wall and/or intra-abdominal cavity. We report the case of a 32-year-old female patient with familial adenomatous polyposis (FAP) who had already undergone total colectomy and developed multiple DTs, located in the abdominal wall and in the left breast. The patient underwent several surgical procedures with a multidisciplinary team of surgeons. Wide surgical resections of the left breast and the abdominal wall tumors were performed in separate steps. Polypropylene mesh reconstruction and muscle flaps were needed to cover the defects of the thoracic and abdominal walls. After partial necrosis of the adipose-cutaneous flap in the abdomen that required a new skin graft, she had a satisfactory outcome with complete healing of the surgical incisions. DT is frequent in GS; however, breast localization is very rare, with few cases reported in the literature. Recurrence of DT is not negligible, even after a wide surgical resection. GS patients must be followed up closely, and clinical examination, associated with imaging studies, should be performed to detect any signs of tumor. DT represents one of the most significant causes of the morbidity and mortality that affect FAP patients following colectomy. In general, the surgical procedures to excise DT are highly complex, requiring a multidisciplinary team.
Abstract:
Sexual dysfunction (SD) affects up to 80% of multiple sclerosis (MS) patients, and pelvic floor muscles (PFMs) play an important role in the sexual function of these patients. The objective of this paper is to evaluate the impact of a rehabilitation program to treat lower urinary tract symptoms on SD in women with MS. Thirty women with MS were randomly allocated to one of three groups: pelvic floor muscle training (PFMT) with electromyographic (EMG) biofeedback and sham neuromuscular electrostimulation (NMES) (Group I), PFMT with EMG biofeedback and intravaginal NMES (Group II), and PFMT with EMG biofeedback and transcutaneous tibial nerve stimulation (TTNS) (Group III). Assessments, before and after the treatment, included: PFM function, PFM tone, flexibility of the vaginal opening and ability to relax the PFMs, and the Female Sexual Function Index (FSFI) questionnaire. After treatment, all groups showed improvements in all domains of the PERFECT scheme. PFM tone and flexibility of the vaginal opening were lower after the intervention only for Group II. All groups improved in the arousal, lubrication, satisfaction and total score domains of the FSFI questionnaire. This study indicates that PFMT alone or in combination with intravaginal NMES or TTNS contributes to the improvement of SD.
Abstract:
Fingolimod is a new and efficient treatment for multiple sclerosis (MS). Administration of the drug requires special attention to the first dose, since cardiovascular adverse events can be observed during the first six hours after fingolimod ingestion. The present study consisted of a review of cardiovascular data on 180 patients with MS receiving the first dose of fingolimod. The rate of bradycardia in these patients was higher than that observed in clinical trials with very strict inclusion criteria for patients. Fewer than 10% of cases required special attention, and there were no fatal cases. All but one patient continued the treatment after this initial dose. This is the first report on real-life administration of fingolimod to Brazilian patients with MS, and one of the few studies with these characteristics in the world.
Abstract:
A palpable mass is a common complaint presented to the breast surgeon. It is very uncommon for patients to report a breast mass associated with palpable masses in other superficial structures. When these masses are related to systemic granulomatous diseases, the diagnosis and initiation of specific therapy can be challenging. The purpose of this paper is to report a case initially assessed by the breast surgeon and ultimately diagnosed as a granulomatous variant of T-cell lymphoma, and to discuss the main systemic granulomatous diseases associated with palpable masses involving the breast.
Abstract:
Multiple sclerosis, which is the most common cause of chronic neurological disability in young adults, is an inflammatory, demyelinating, and neurodegenerative disease of the CNS, which leads to the formation of multiple foci of demyelinated lesions in the white matter. The diagnosis is currently based on magnetic resonance imaging and evidence of dissemination in time and space. However, diagnosis could be facilitated if biomarkers were available to rule out other disorders with similar symptoms and to avoid cerebrospinal fluid analysis, which requires an invasive collection. Additionally, the molecular mechanisms of the disease are not completely elucidated, especially those related to its neurodegenerative aspects. The identification of biomarker candidates and molecular mechanisms of multiple sclerosis may be approached by proteomics. In the last 10 years, proteomic techniques have been applied to different biological samples (CNS tissue, cerebrospinal fluid, and blood) from multiple sclerosis patients and from its experimental model. In this review, we summarize these data, presenting their value to the current knowledge of the disease mechanisms, as well as their importance in identifying biomarkers or treatment targets.
Abstract:
Ecological science contributes to solving a broad range of environmental problems. However, lack of ecological literacy in practice often limits application of this knowledge. In this paper, we highlight a critical but often overlooked demand on ecological literacy: to enable professionals of various careers to apply scientific knowledge when faced with environmental problems. Current university courses on ecology often fail to persuade students that ecological science provides important tools for environmental problem solving. We propose problem-based learning to improve the understanding of ecological science and its usefulness for real-world environmental issues that professionals in careers as diverse as engineering, public health, architecture, social sciences, or management will address. Courses should set clear learning objectives for cognitive skills they expect students to acquire. Thus, professionals in different fields will be enabled to improve environmental decision-making processes and to participate effectively in multidisciplinary work groups charged with tackling environmental issues.