24 results for Genomic data integration
Abstract:
Besides the traditional paradigm of "centralized" power generation, a new concept of "distributed" generation is emerging, in which the user becomes a prosumer. During this transition, Energy Storage Systems (ESS) can provide multiple services and features that are necessary for a higher quality of the electrical system and for the optimization of non-programmable Renewable Energy Source (RES) power plants. An ESS prototype was designed, developed and integrated into a renewable energy production system in order to create a smart microgrid and consequently manage the energy flow efficiently and intelligently as a function of the power demand. The produced energy can be fed into the grid, supplied directly to the load, or stored in batteries. The microgrid comprises a 7 kW wind turbine (WT) and a 17 kW photovoltaic (PV) plant. The load is given by the electrical utilities of a cheese factory. The ESS consists of two subsystems: a Battery Energy Storage System (BESS) and a Power Control System (PCS). With the aim of sizing the ESS, a Remote Grid Analyzer (RGA) was designed, built and connected to the wind turbine, the photovoltaic plant and the switchboard. Different electrochemical storage technologies were then studied and, taking into account the load requirements of the cheese factory, the most suitable solution was identified in the high-temperature salt Na-NiCl2 battery technology. Data acquisition from all electrical utilities provided a detailed load analysis, indicating an optimal storage size of a 30 kW battery system. Moreover, a container was designed and built to house the BESS and PCS, meeting all requirements and safety conditions. Furthermore, a smart control system was implemented to handle the different applications of the ESS, such as peak shaving or load levelling.
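A rough Python sketch of the peak-shaving logic such a control system implements (the threshold, battery limits and time step below are hypothetical values, not those of the actual PCS):

    # One control step of a peak-shaving policy (hypothetical parameters).
    # When net load exceeds a threshold, the battery discharges to shave the
    # peak; when renewables exceed the load, the surplus recharges it.
    def peak_shaving_step(load_kw, generation_kw, soc_kwh,
                          threshold_kw=20.0, capacity_kwh=30.0,
                          p_max_kw=30.0, dt_h=0.25):
        net_kw = load_kw - generation_kw  # power still requested from grid/battery
        if net_kw > threshold_kw and soc_kwh > 0:
            # discharge: cover the excess above the threshold, within limits
            discharge = min(net_kw - threshold_kw, p_max_kw, soc_kwh / dt_h)
            return net_kw - discharge, soc_kwh - discharge * dt_h
        if net_kw < 0 and soc_kwh < capacity_kwh:
            # charge with the renewable surplus instead of exporting it all
            charge = min(-net_kw, p_max_kw, (capacity_kwh - soc_kwh) / dt_h)
            return net_kw + charge, soc_kwh + charge * dt_h
        return net_kw, soc_kwh

    grid_kw, soc_kwh = peak_shaving_step(load_kw=28.0, generation_kw=5.0, soc_kwh=12.0)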
3D Surveying and Data Management towards the Realization of a Knowledge System for Cultural Heritage
Abstract:
The research activities involved the application of Geomatic techniques in the Cultural Heritage field, along two themes. Firstly, the application of high-precision surveying techniques for the restoration and interpretation of relevant monuments and archaeological finds. The main case concerns the activities for the generation of a high-fidelity 3D model of the Fountain of Neptune in Bologna. In this work, aimed at the restoration of the artifact, both the geometric and radiometric aspects were crucial. The final product was the basis of a 3D information system, a shared tool through which the different professionals involved in the restoration activities contributed in a multidisciplinary approach. Secondly, the arrangement of 3D databases for a Building Information Modeling (BIM) approach, in a process that involves the generation and management of digital representations of the physical and functional characteristics of historical buildings, towards a so-called Historical Building Information Model (HBIM). A first application was conducted for the San Michele in Acerboli’s church in Santarcangelo di Romagna. The survey was performed by integrating classical and modern Geomatic techniques, and the point cloud representing the church was used for the development of an HBIM model, where the relevant information connected to the building could be stored and georeferenced. A second application concerns the domus of Obellio Firmo in Pompeii, also surveyed by integrating classical and modern Geomatic techniques. A historical analysis permitted the definition of building phases and the organization of a database of materials and constructive elements. The goal is to obtain a federated model able to manage the different aspects: documental, analytic and reconstructive.
Abstract:
In this thesis we discuss in what ways computational logic (CL) and data science (DS) can jointly contribute to the management of knowledge within the scope of modern and future artificial intelligence (AI), and how technically sound software technologies can be realised along the path. An agent-oriented mindset permeates the whole discussion, stressing the pivotal role of autonomous agents in exploiting both means to reach higher degrees of intelligence. Accordingly, the goals of this thesis are manifold. First, we elicit the analogies and differences between CL and DS, looking for possible synergies and complementarities along four major knowledge-related dimensions, namely representation, acquisition (a.k.a. learning), inference (a.k.a. reasoning), and explanation. In this regard, we propose a conceptual framework through which bridges between these disciplines can be described and designed. We then survey the current state of the art of AI technologies with respect to their capability to support bridging CL and DS in practice. After detecting gaps and opportunities, we propose the notion of logic ecosystem as a new conceptual, architectural, and technological solution supporting the incremental integration of symbolic and sub-symbolic AI. Finally, we discuss how our notion of logic ecosystem can be reified into actual software technology and extended towards many DS-related directions.
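A minimal, purely illustrative sketch of the kind of CL-DS bridge discussed here (toy Python, not the thesis's actual logic-ecosystem technology): a sub-symbolic model acquires knowledge from data, its predictions are turned into symbolic facts, and hand-written rules reason over them.

    # Toy bridge between sub-symbolic learning and symbolic reasoning.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    X = np.array([[0.1, 0.9], [0.8, 0.2], [0.9, 0.1], [0.2, 0.8]])
    y = np.array([1, 0, 0, 1])
    clf = LogisticRegression().fit(X, y)      # acquisition (learning) from data

    def predict_fact(x):
        # turn the model's output into a symbolic fact
        return "risky" if clf.predict([x])[0] == 1 else "safe"

    rules = {"risky": "alert"}                # explicit knowledge representation

    def reason(facts):
        # inference (reasoning): forward-chain the rules over the facts
        derived = set(facts)
        for premise, conclusion in rules.items():
            if premise in derived:
                derived.add(conclusion)
        return derived

    print(reason({predict_fact([0.15, 0.85])}))   # e.g. {'risky', 'alert'}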
Abstract:
The advent of omic data production has opened many new perspectives in the quest for modelling complexity in biophysical systems. With the capability of characterizing a complex organism through the patterns of its molecular states, observed at different levels through various omics, a new paradigm of investigation is arising. In this thesis, we investigate the links between perturbations of the human organism, described as the ensemble of crosstalk of its molecular states, and health. Machine learning plays a key role within this picture, both in omic data analysis and in model building. We propose and discuss different frameworks developed by the author using machine learning for data reduction, integration, projection on latent features, pattern analysis, classification and clustering of omic data, with a focus on 1H NMR metabolomic spectral data. The aim is to link different levels of omic observations of molecular states, from nanoscale to macroscale, to study perturbations such as diseases and diet interpreted as changes in molecular patterns. The first part of this work focuses on the fingerprinting of diseases, linking cellular and systemic metabolomics with genomics to assess and predict the downstream effects of perturbations all the way down to the enzymatic network. The second part is a set of frameworks and models, developed with 1H NMR metabolomics at its core, to study the exposure of the human organism to diet and food intake in its full complexity, from epidemiological data analysis to the molecular characterization of food structure.
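A minimal sketch of the reduction, latent-feature projection and clustering chain mentioned above (a synthetic matrix stands in for the thesis's real 1H NMR spectra):

    # Reduction -> latent projection -> clustering on spectral-like data.
    import numpy as np
    from sklearn.preprocessing import StandardScaler
    from sklearn.decomposition import PCA
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)
    spectra = rng.normal(size=(100, 2048))     # 100 samples x 2048 spectral bins

    z = StandardScaler().fit_transform(spectra)        # per-bin standardization
    latent = PCA(n_components=10).fit_transform(z)     # projection on latent features
    labels = KMeans(n_clusters=3, n_init=10).fit_predict(latent)  # pattern grouping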
Abstract:
In the last decades, Artificial Intelligence has witnessed multiple breakthroughs in deep learning. In particular, purely data-driven approaches have opened up a wide variety of successful applications due to the large availability of data. Nonetheless, the integration of prior knowledge is still required to compensate for specific issues like lack of generalization from limited data, fairness, robustness, and biases. In this thesis, we analyze the methodology of integrating knowledge into deep learning models in the field of Natural Language Processing (NLP). We start by remarking on the importance of knowledge integration, highlight the possible shortcomings of these approaches, and investigate the implications of integrating unstructured textual knowledge. We introduce Unstructured Knowledge Integration (UKI) as the process of integrating unstructured knowledge into machine learning models. We discuss UKI in the field of NLP, where knowledge is represented in natural language format. We identify UKI as a complex process comprising multiple sub-processes, different knowledge types, and knowledge integration properties to guarantee. We remark on the challenges of integrating unstructured textual knowledge and draw connections with well-known research areas in NLP. We provide a unified vision of structured knowledge extraction (KE) and UKI by identifying KE as a sub-process of UKI. We investigate some challenging scenarios where structured knowledge is not a feasible prior assumption and formulate each task from the point of view of UKI. We adopt simple yet effective neural architectures and discuss the challenges of such an approach. Finally, we identify KE as a form of symbolic representation. From this perspective, we remark on the need to define sophisticated UKI processes to verify the validity of knowledge integration. To this end, we foresee frameworks capable of combining symbolic and sub-symbolic representations for learning as a solution.
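As a toy illustration of UKI (not one of the architectures studied in the thesis), unstructured textual knowledge can be injected by retrieving the most relevant knowledge sentence and attaching it to the input before classification; the corpus and labels below are invented:

    # Retrieve-and-attach integration of unstructured textual knowledge.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics.pairwise import cosine_similarity

    knowledge = ["aspirin thins the blood",
                 "antibiotics do not treat viral infections"]
    vec = TfidfVectorizer().fit(knowledge)
    K = vec.transform(knowledge)

    def with_knowledge(text):
        # retrieval step: attach the closest knowledge sentence to the input
        sims = cosine_similarity(vec.transform([text]), K)[0]
        return knowledge[sims.argmax()] + " [SEP] " + text

    train = ["can aspirin prevent clots", "should I take antibiotics for flu"]
    y = [1, 0]
    augmented = [with_knowledge(t) for t in train]
    clf_vec = TfidfVectorizer().fit(augmented)
    clf = LogisticRegression().fit(clf_vec.transform(augmented), y)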
Abstract:
Whole Exome Sequencing (WES) is rapidly becoming the first-tier test in clinics, thanks both to its declining costs and to the development of new platforms that help clinicians in the analysis and interpretation of SNVs and InDels. However, we still know very little about how CNV detection could increase the WES diagnostic yield. A plethora of exome CNV callers have been published over the years, all showing good performance on specific CNV classes and sizes, suggesting that a combination of multiple tools is needed to obtain overall good detection performance. Here we present TrainX, an ML-based method for calling heterozygous CNVs in WES data using EXCAVATOR2 Normalized Read Counts. We select male and female non-pseudoautosomal chromosome X alignments to construct our dataset and train our model, make predictions on autosomal target regions, and use an HMM to call CNVs. We compared TrainX against a set of CNV tools differing in detection method (GATK4 gCNV, ExomeDepth, DECoN, CNVkit and EXCAVATOR2) and found that our algorithm outperformed them in terms of stability, as we identified both deletions and duplications with good scores (0.87 and 0.82 F1-scores, respectively) and for sizes down to the minimum resolution of 2 target regions. We also evaluated the method's robustness using a set of WES and SNP-array data (n=251), part of the Italian cohort of the Epi25 collaborative, and were able to retrieve all clinical CNVs previously identified by the SNP array. TrainX showed good accuracy in detecting heterozygous CNVs of different sizes, making it a promising tool for use in a diagnostic setting.
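A minimal sketch of the HMM decoding step described above: a three-state model (deletion / diploid / duplication) decoded with Viterbi over per-target log2 read-count ratios. The parameters are illustrative, not TrainX's trained values, and the ML scoring layer is omitted.

    # 3-state HMM Viterbi decoding over log2 read-count ratios (toy parameters).
    import numpy as np
    from scipy.stats import norm

    means = {"del": -1.0, "dip": 0.0, "dup": 0.585}    # expected log2 ratios
    states = list(means)
    log_start = np.log([0.01, 0.98, 0.01])             # start mostly diploid
    log_trans = np.log(np.array([[0.90, 0.09, 0.01],
                                 [0.005, 0.99, 0.005],
                                 [0.01, 0.09, 0.90]]))

    def viterbi(log2_ratios, sd=0.3):
        e = np.array([[norm.logpdf(x, means[s], sd) for s in states]
                      for x in log2_ratios])           # emission log-likelihoods
        v = log_start + e[0]
        back = []
        for t in range(1, len(e)):
            scores = v[:, None] + log_trans            # best path into each state
            back.append(scores.argmax(axis=0))
            v = scores.max(axis=0) + e[t]
        path = [int(v.argmax())]
        for b in reversed(back):
            path.append(int(b[path[-1]]))
        return [states[i] for i in reversed(path)]

    print(viterbi([0.02, -0.05, -0.95, -1.1, 0.01]))
    # ['dip', 'dip', 'del', 'del', 'dip']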
Abstract:
Time series analysis of multispectral satellite data offers an innovative way to extract valuable information about our changing planet. This is now a real option for scientists thanks to data availability as well as innovative cloud-computing platforms such as Google Earth Engine. The integration of different missions would mitigate known issues in multispectral time series construction, such as gaps due to clouds or other atmospheric effects. To this end, harmonization among Landsat-like missions is possible through statistical analysis. This research offers an overview of the different instruments from the Landsat and Sentinel missions (TM, ETM, OLI, OLI-2 and MSI sensors) and product levels (Collection-2 Level-1 and Surface Reflectance for Landsat; Level-1C and Level-2A for Sentinel-2). Moreover, a cross-sensor comparison was performed to assess the interoperability of the sensors on board the Landsat and Sentinel-2 constellations, with a possible combined use for time series analysis in mind. Firstly, more than 20,000 pairs of images acquired almost simultaneously all over Europe were selected over a period of several years. The study performed a cross-comparison analysis on these data and provided an assessment of the calibration coefficients that can be used to minimize differences in combined use. Four of the most popular vegetation indexes were selected for the study: NDVI, EVI, SAVI and NDMI. As a result, it is possible to reconstruct a longer and denser harmonized time series going back to 1984, useful for vegetation monitoring purposes. Secondly, the spectral characteristics of the recent Landsat-9 mission were assessed for combined use with Landsat-8 and Sentinel-2. A cross-sensor analysis of the common bands of more than 3,000 almost simultaneous acquisitions verified a high consistency between the datasets. The most relevant discrepancy was observed in the blue and SWIR bands, often used in vegetation- and water-related studies. This analysis was supported by spectroradiometer ground measurements.
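A minimal sketch of the cross-sensor calibration idea: compute NDVI per sensor, then fit a linear mapping on near-simultaneous pairs (the synthetic arrays below stand in for the study's real Landsat/Sentinel-2 pairs):

    # Linear cross-sensor NDVI calibration on paired observations.
    import numpy as np

    def ndvi(nir, red):
        return (nir - red) / (nir + red)

    rng = np.random.default_rng(1)
    red_a = rng.uniform(0.02, 0.2, 500)
    nir_a = rng.uniform(0.2, 0.6, 500)
    ndvi_a = ndvi(nir_a, red_a)                        # e.g. Sentinel-2 MSI
    ndvi_b = 0.98 * ndvi_a + 0.01 + rng.normal(0, 0.01, 500)  # e.g. Landsat OLI

    slope, intercept = np.polyfit(ndvi_b, ndvi_a, 1)   # calibration coefficients
    harmonized_b = slope * ndvi_b + intercept          # sensor B on sensor A's scale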
Abstract:
The purpose of this research study is to discuss privacy and data protection-related regulatory and compliance challenges posed by digital transformation in healthcare in the wake of the COVID-19 pandemic. The public health crisis accelerated the development of patient-centred remote/hybrid healthcare delivery models that make increased use of telehealth services and related digital solutions. The large-scale uptake of IoT-enabled medical devices and wellness applications, and the offering of healthcare services via healthcare platforms (online doctor marketplaces), have catalysed these developments. However, the use of new enabling technologies (IoT, AI) and the platformisation of healthcare pose complex challenges to the protection of patients' privacy and personal data. This happens at a time when the EU is drawing up a new regulatory landscape for the use of data and digital technologies. Against this background, the study presents an interdisciplinary (normative and technology-oriented) critical assessment of how the new regulatory framework may affect privacy and data protection requirements regarding the deployment and use of Internet of Health Things (hardware) devices and interconnected software (AI systems). The study also assesses key privacy and data protection challenges that affect healthcare platforms (online doctor marketplaces) in their offering of video API-enabled teleconsultation services and their (anticipated) integration into the European Health Data Space. The overall conclusion of the study is that regulatory deficiencies may create integrity risks for the protection of privacy and personal data in telehealth due to uncertainties about the proper interplay, legal effects and effectiveness of (existing and proposed) EU legislation. The proliferation of normative measures may increase compliance costs, hinder innovation and ultimately deprive European patients of state-of-the-art digital health technologies, which is, paradoxically, the opposite of what the EU plans to achieve.
Abstract:
Pathogenic aberrations in homologous recombination DNA repair (HRR) genes occur in approximately 1 in 4 men with advanced prostate cancer (PCa). Treatment with PARP inhibitors (PARPi) has recently been introduced for metastatic castration-resistant PCa patients, increasing clinicians' interest in the molecular characterization of all PCa patients. The limitations of using old, low-quality tumor tissue for genetic analysis, which is very common in PCa, can be overcome by using liquid biopsy as an alternative biomarker source. In this study, we aimed to evaluate the detection of molecular alterations in HRR genes in liquid biopsy compared with tumor tissue from PCa patients. Secondarily, we explored the genomic instability score (GIS) and a broader range of gene alterations for in-depth characterization of the PCa cohort. Plasma samples were collected from 63 patients with PCa. Sophia Homologous Recombination Solution (targeting 16 HRR genes) and shallow whole genome sequencing (sWGS) were used for genomic analysis of tissue DNA and circulating tumor DNA (ctDNA). A total of 33 alterations (mainly in TP53, ATM, CHEK2, CDK12, and BRCA1/2) were identified in 28.5% of the PCa plasma samples. By integrating the mutational and sWGS data, the HRR status of the PCa patients was determined, and a concordance of 85.7% with tumor tissue was identified. A median GIS of 15 was obtained, reaching a score of 63 in 2 samples with double alterations in BRCA1 and TP53. We explored the PCa mutation landscape, and the most significantly enriched pathways identified were the sphingosine 1-phosphate (S1P) receptor signaling and the PI3K-AKT-mTOR pathways. HRR analysis on FFPE and liquid biopsy samples shows high concordance, demonstrating that noninvasive ctDNA-enriched plasma can be an optimal alternative source for molecular SNV and CNV analysis. In addition, the evaluation of GIS and pathway interactions should be considered for more comprehensive molecular characterization of PCa patients.
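The tissue/plasma concordance reported above is a simple percent agreement; a toy sketch with invented calls:

    # Percent agreement between tissue and plasma HRR status (invented calls;
    # the study reports 85.7% on its real cohort).
    tissue = ["pos", "neg", "neg", "pos", "neg", "neg", "pos"]
    plasma = ["pos", "neg", "neg", "neg", "neg", "neg", "pos"]
    concordance = sum(t == p for t, p in zip(tissue, plasma)) / len(tissue)
    print(f"concordance: {concordance:.1%}")           # 85.7% for these 7 samples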