7 results for Semantic workflows

in AMS Tesi di Dottorato - Alm@DL - Università di Bologna


Relevance:

20.00%

Publisher:

Abstract:

Background: WGS is increasingly used as a first-line diagnostic test for patients with rare genetic diseases such as neurodevelopmental disorders (NDD). Clinical applications require a robust infrastructure to support processing, storage and analysis of WGS data. The identification and interpretation of SVs from WGS data also need to be improved. Finally, there is a need for a prioritization system that enables downstream clinical analysis and facilitates data interpretation. Here, we present the results of a clinical application of WGS in a cohort of patients with NDD. Methods: We developed highly portable workflows for processing WGS data, including alignment, quality control, and variant calling of SNVs and SVs. A benchmark analysis of state-of-the-art SV detection tools was performed to select the most accurate combination for SV calling. A gene-based prioritization system was also implemented to support variant interpretation. Results: Using a benchmark analysis, we selected the most accurate combination of tools to improve SV detection from WGS data and built a dedicated pipeline. Our workflows were used to process WGS data from 77 NDD patient-parent families. The prioritization system supported downstream analysis and enabled molecular diagnosis in 32% of patients, 25% of which involved SVs, and suggested a potential diagnosis in 20% of patients, requiring further investigation to achieve diagnostic certainty. Conclusion: Our data suggest that the integration of SNVs and SVs is a main factor that increases the diagnostic yield of WGS and show that the adoption of a dedicated pipeline improves the process of variant detection and interpretation.

Relevance:

10.00%

Publisher:

Abstract:

Coordinating activities in a distributed system is an open research topic. Several models have been proposed to achieve this purpose, such as message passing, publish/subscribe, workflows or tuple spaces. We have focused on the latter model, trying to overcome some of its disadvantages. In particular, we have applied spatial database techniques to tuple spaces in order to increase their performance when handling a large number of tuples. Moreover, we have studied how structured peer-to-peer approaches can be applied to better distribute tuples on large networks. Using some of these results, we have developed a tuple space implementation for the Globus Toolkit that can be used by Grid applications as a coordination service. The development of such a service has been quite challenging due to the limitations imposed by XML serialization, which have heavily influenced its design. Nevertheless, we were able to complete its implementation and use it to implement two different types of test applications: a completely parallelizable one and a plasma simulation that is not completely parallelizable. Using this last application, we have compared the performance of our service against MPI. Finally, we have developed and tested a simple workflow in order to show the versatility of our service.

Relevance:

10.00%

Publisher:

Abstract:

Research on multilingual communication and knowledge management in companies has so far focused on multinationals or on SMEs undergoing globalization. This research instead concerns SMEs in historically multilingual areas, in order to investigate whether the habit of using different languages on the local market can represent a competitive advantage. The thesis presents a multi-method study conducted in 2012-2013 in South Tyrol (Alto Adige/Südtirol). The dataset consists of 443 valid responses to an online questionnaire and 23 interviews with local managers and entrepreneurs. The questions aimed to understand how South Tyrolean companies address the challenge of multilingualism, with particular attention to the following areas: multilingual communication, documentation, translation and terminology. The results outline a general picture of the multilingualism strategies applied in South Tyrol, highlighting their strengths and weaknesses. Despite the presence of multilingual staff, the potential competitive advantage that derives from it is not fully exploited: companies turn to markets where their own language is spoken (Italian-run companies to the national market, German-speaking ones to Austria and Germany). Internal communication is multilingual only where this is unavoidable. “Do-it-yourself translations” offer the illusion of managing different languages, but the quality level remains limited. Texts are often translated by in-house staff without specific skills. Cooperation with external translators also reveals an inability to obtain the maximum return on investment. The thesis proposes practical recommendations aimed at optimizing current processes and maximizing the yield of available resources in order to overcome the challenge of multilingual management and communication.
The recommendations do not require significant financial investment and are easily transferable to other multilingual or border regions, as well as to other SMEs that employ multilingual staff. They may therefore prove useful for a large number of companies.

Relevance:

10.00%

Publisher:

Abstract:

In this Ph.D. project, original and innovative approaches for the quali-quantitative analysis of substances of abuse, as well as therapeutic agents with abuse potential and related compounds, were designed, developed and validated for application to different fields such as forensic, clinical and pharmaceutical analysis. All the parameters involved in the developed analytical workflows were properly and accurately optimised, from sample collection to sample pretreatment up to the instrumental analysis. Advanced dried blood microsampling technologies have been developed, capable of bringing several advantages to the method as a whole, such as significant reduction of solvent use, feasible storage and transportation conditions and enhancement of analyte stability. At the same time, the use of capillary blood makes it possible to increase subject compliance and overall method applicability by exploiting such innovative technologies. Both biological and non-biological samples involved in this project were subjected to optimised pretreatment techniques developed ad hoc for each target analyte, also making use of advanced microextraction techniques. Finally, original and advanced instrumental analytical methods have been developed based on high and ultra-high performance liquid chromatography (HPLC, UHPLC) coupled to different detection means (mainly mass spectrometry, but also electrochemical and spectrophotometric detection for screening purposes), and on attenuated total reflectance-Fourier transform infrared spectroscopy (ATR-FTIR) for solid-state analysis. Each method has been designed to obtain highly selective, sensitive yet sustainable systems and has been validated according to international guidelines. All the methods developed herein proved to be suitable for the analysis of the compounds under investigation and may be useful tools in medicinal chemistry, pharmaceutical analysis, clinical studies and forensic investigations.

Relevance:

10.00%

Publisher:

Abstract:

With the CERN LHC program underway, there has been an acceleration of data growth in the High Energy Physics (HEP) field, and the usage of Machine Learning (ML) in HEP will be critical during the HL-LHC program, when the data produced will reach the exascale. ML techniques have been successfully used in many areas of HEP; nevertheless, the development of an ML project and its implementation for production use is a highly time-consuming task and requires specific skills. Complicating this scenario is the fact that HEP data is stored in the ROOT data format, which is mostly unknown outside of the HEP community. The work presented in this thesis is focused on the development of an ML as a Service (MLaaS) solution for HEP, aiming to provide a cloud service that allows HEP users to run ML pipelines via HTTP calls. These pipelines are executed by using the MLaaS4HEP framework, which allows reading data, processing data, and training ML models directly using ROOT files of arbitrary size from local or distributed data sources. Such a solution provides HEP users who are not experts in ML with a tool that allows them to apply ML techniques in their analyses in a streamlined manner. Over the years the MLaaS4HEP framework has been developed, validated, and tested, and new features have been added. A first MLaaS solution has been developed by automating the deployment of a platform equipped with the MLaaS4HEP framework. Then, a service with APIs has been developed, so that a user, after being authenticated and authorized, can submit MLaaS4HEP workflows producing trained ML models ready for the inference phase. A working prototype of this service is currently running on a virtual machine of INFN-Cloud and meets the requirements to be added to the INFN Cloud portfolio of services.

Relevance:

10.00%

Publisher:

Abstract:

The term Artificial Intelligence has acquired a lot of baggage since its introduction and in its current incarnation is synonymous with Deep Learning (DL). The sudden availability of data and computing resources has opened the gates to myriads of applications. Not all are created equal though, and problems might arise especially for fields not closely related to the tasks that pertain to the tech companies that spearheaded DL. The perspective of practitioners seems to be changing, however. Human-Centric AI emerged in the last few years as a new way of thinking about DL and AI applications from the ground up, with special attention to their relationship with humans. The goal is designing a system that can gracefully integrate into already established workflows, as in many real-world scenarios AI may not be good enough to completely replace humans. Often this replacement may even be unneeded or undesirable. Another important perspective comes from Andrew Ng, a DL pioneer, who recently started shifting the focus of development from “better models” towards better, and smaller, data. He named this approach Data-Centric AI. Without downplaying the importance of pushing the state of the art in DL, we must recognize that if the goal is creating a tool for humans to use, more raw performance may not align with more utility for the final user. A Human-Centric approach is compatible with a Data-Centric one, and we find that the two overlap nicely when human expertise is used as the driving force behind data quality. This thesis documents a series of case studies where these approaches were employed, to different extents, to guide the design and implementation of intelligent systems. We found that human expertise proved crucial in improving datasets and models. The last chapter includes a slight deviation, with studies on the pandemic, still preserving the human- and data-centric perspective.

Relevance:

10.00%

Publisher:

Abstract:

In medicine, innovation depends on a better knowledge of the mechanisms of the human body, which represents a complex system of multi-scale constituents. Unraveling the complexity underlying diseases proves to be challenging. A deep understanding of the inner workings requires dealing with a large amount of heterogeneous information. Exploring the molecular status and the organization of genes, proteins and metabolites provides insights into what is driving a disease, from aggressiveness to curability. Molecular constituents, however, are only the building blocks of the human body and cannot currently tell the whole story of diseases. This is why attention is now growing towards the simultaneous exploitation of multi-scale information. Holistic methods are therefore drawing interest as a way to address the problem of integrating heterogeneous data. The heterogeneity may derive from the diversity across data types and from the diversity within diseases. Here, four studies conducted data integration using custom-designed workflows that implement novel methods and views to tackle the heterogeneous characterization of diseases. The first study was devoted to determining shared gene regulatory signatures for onco-hematology, and it showed partial co-regulation across blood-related diseases. The second study focused on Acute Myeloid Leukemia and refined the unsupervised integration of genomic alterations, which turned out to better resemble clinical practice. In the third study, network integration for atherosclerosis demonstrated, as a proof of concept, the impact of network intelligibility when modeling heterogeneous data, which was shown to accelerate the identification of new potential pharmaceutical targets. Lastly, the fourth study introduced a new method to integrate multiple data types into a single latent heterogeneous representation that facilitated the selection of important data types to predict the tumour stage of invasive ductal carcinoma.
The results of these four studies laid the groundwork to ease the detection of new biomarkers, ultimately beneficial to medical practice and to the ever-growing field of Personalized Medicine.