5 results for Data Quality Management
in AMS Tesi di Dottorato - Alm@DL - Università di Bologna
Abstract:
The Gaia space mission is a major project for the European astronomical community. The processing and analysis of the huge data flow coming from Gaia is a challenging task, and it is the subject of thorough study and preparatory work by the DPAC (Data Processing and Analysis Consortium), which is in charge of all aspects of the Gaia data reduction. This PhD Thesis was carried out in the framework of the DPAC, within the team based in Bologna. The task of the Bologna team is to define the calibration model and to build a grid of spectro-photometric standard stars (SPSS) suitable for the absolute flux calibration of the Gaia G-band photometry and the BP/RP spectrophotometry. Such a flux calibration can be performed by repeatedly observing each SPSS during the lifetime of the Gaia mission and by comparing the observed Gaia spectra to the spectra obtained by our ground-based observations. Because of both the different observing sites involved and the large number of frames expected (≃100000), it is essential to maintain maximum homogeneity in data quality, acquisition and treatment. Particular care must be taken to test the capabilities of each telescope/instrument combination (through the “instrument familiarization plan”) and to devise methods to monitor, and where necessary correct for, the typical instrumental effects that can compromise the high precision required for the Gaia SPSS grid (a few % with respect to Vega). I contributed to the ground-based survey of Gaia SPSS in many respects: the observations, the instrument familiarization plan, the data reduction and analysis activities (both photometry and spectroscopy), and the maintenance of the data archives. However, the field I was personally responsible for was photometry, and in particular relative photometry for the production of short-term light curves.
In this context I defined and tested a semi-automated pipeline which performs the pre-reduction of imaging SPSS data and produces aperture photometry catalogues ready to be used for further analysis. A series of semi-automated quality control criteria is included in the pipeline at various levels, from pre-reduction, to aperture photometry, to light-curve production and analysis.
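As an illustration of relative photometry for short-term light curves, the following is a minimal sketch and not the pipeline described in the thesis. It assumes that each frame provides an instrumental magnitude for the target and for a set of comparison stars; the function names and the 0.01 mag scatter threshold are hypothetical choices made only for this example.

```python
from statistics import mean, pstdev


def differential_light_curve(target_mags, comparison_mags):
    """Relative magnitude per frame: target minus the mean of the comparison stars."""
    return [m_t - mean(comps) for m_t, comps in zip(target_mags, comparison_mags)]


def passes_constancy_check(curve, max_scatter_mag=0.01):
    """Hypothetical quality criterion: the scatter of the curve stays below a threshold."""
    return pstdev(curve) <= max_scatter_mag


# Three frames: instrumental magnitudes of the target and of two comparison stars.
# The second frame is 0.1 mag fainter for every star (e.g. a transparency change),
# which cancels out in the differential magnitude.
target = [12.00, 12.10, 12.00]
comparisons = [[11.00, 13.00], [11.10, 13.10], [11.00, 13.00]]

curve = differential_light_curve(target, comparisons)
print(curve)
print(passes_constancy_check(curve))
```

Subtracting the mean comparison magnitude removes variations common to all stars in the frame, so what remains is dominated by the target's own variability plus measurement noise.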
Abstract:
Human movement analysis (HMA) aims to measure the ability of a subject to stand or to walk. In the field of HMA, tests are performed daily in research laboratories, hospitals and clinics, aiming to diagnose a disease, distinguish between disease entities, monitor the progress of a treatment and predict the outcome of an intervention [Brand and Crowninshield, 1981; Brand, 1987; Baker, 2006]. To achieve these purposes, clinicians and researchers use measurement devices such as force platforms, stereophotogrammetric systems, accelerometers and baropodometric insoles. This thesis focuses on the force platform (FP) and in particular on the quality assessment of FP data. The principal objective of our work was the design and experimental validation of a portable system for the in situ calibration of FPs. The thesis is structured as follows: Chapter 1. Description of the physical principles underlying the functioning of a FP, and of how these principles are used to create force transducers, such as strain gauges and piezoelectric transducers. Then, description of the two categories of FPs, three- and six-component, of the signal acquisition (hardware structure), and of the signal calibration. Finally, a brief description of the use of FPs in HMA, for balance or gait analysis. Chapter 2. Description of inverse dynamics, the most common method used in the field of HMA. This method uses the signals measured by a FP to estimate kinetic quantities, such as joint forces and moments. These variables cannot be measured directly without very invasive techniques; consequently, they can only be estimated using indirect techniques, such as inverse dynamics. Finally, a brief description of the sources of error in gait analysis. Chapter 3. State of the art in FP calibration.
The selected literature is divided into sections, covering: systems for the periodic control of FP accuracy; systems for error reduction in FP signals; systems and procedures for the construction of a FP. In particular, a calibration system designed by our group, based on the theoretical method proposed by ?, is described in detail. This system was the “starting point” for the new system presented in this thesis. Chapter 4. Description of the new system, divided into its parts: 1) the algorithm; 2) the device; and 3) the calibration procedure, for the correct execution of the calibration process. The characteristics of the algorithm were optimized through a simulation approach, and the results are presented here. In addition, the different versions of the device are described. Chapter 5. Experimental validation of the new system, achieved by testing it on 4 commercial FPs. The effectiveness of the calibration was verified by measuring, before and after calibration, the accuracy of the FPs in measuring the center of pressure of an applied force. The new system can estimate local and global calibration matrices; using these matrices, the non-linearity of the FPs was quantified and locally compensated. Further, a non-linear calibration is proposed. This calibration compensates for the non-linear effect in the FP functioning due to the bending of its upper plate. The experimental results are presented. Chapter 6. Influence of the FP calibration on the estimation of kinetic quantities with the inverse dynamics approach. Chapter 7. The conclusions of this thesis are presented: the need for a calibration of FPs and the consequent improvement in the quality of kinetic data. Appendix: Calibration of the LC used in the presented system. Different calibration set-ups of a 3D force transducer are presented, and the optimal set-up is proposed, with particular attention to the compensation of non-linearities. The optimal set-up is verified by experimental results.
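To make the role of a calibration matrix and of the center of pressure concrete, here is a minimal sketch under stated assumptions; it is not the algorithm developed in the thesis. It assumes a six-component FP whose output is ordered as (Fx, Fy, Fz, Mx, My, Mz), a reference frame with its origin on the top surface of the plate, and a linear 6x6 calibration matrix. The function names and the numerical values are hypothetical.

```python
def apply_calibration(matrix, raw_output):
    """Multiply a 6x6 calibration matrix by the raw six-component output."""
    return [sum(c * v for c, v in zip(row, raw_output)) for row in matrix]


def center_of_pressure(load):
    """COP on the plate surface for a load (Fx, Fy, Fz, Mx, My, Mz).

    Assumes the origin lies on the top surface, so that a vertical force Fz
    applied at (x, y) produces Mx = y * Fz and My = -x * Fz.
    """
    fx, fy, fz, mx, my, mz = load
    if fz == 0:
        raise ValueError("COP is undefined when the vertical force is zero")
    return (-my / fz, mx / fz)


# Identity matrix: stands in for a platform that needs no correction.
identity = [[1.0 if i == j else 0.0 for j in range(6)] for i in range(6)]

# Hypothetical raw output: 100 N vertical force, moments in N*m.
raw = [0.0, 0.0, 100.0, 10.0, -20.0, 0.0]

calibrated = apply_calibration(identity, raw)
cop = center_of_pressure(calibrated)
print(cop)
```

With these values the force is located at roughly x = 0.2 m, y = 0.1 m. Comparing such a computed position with the known application point of a reference load, before and after applying the matrix, is one way to express the accuracy check mentioned for Chapter 5.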
Abstract:
Precision horticulture and spatial analysis applied to orchards are a growing and evolving part of precision agriculture technology. The aim of this discipline is to reduce production costs by monitoring and analysing orchard-derived information to improve crop performance in an environmentally sound manner. Georeferencing and geostatistical analysis, coupled with point-specific data mining, make it possible to devise and implement management decisions tailored to the single orchard. Potential applications include verifying, in real time throughout the season, the effectiveness of cultural practices in achieving production targets in terms of fruit size, number, yield and, in the near future, fruit quality traits. These data will affect not only the pre-harvest sector but also the post-harvest sector of the fruit chain. Chapter 1 provides an updated overview of precision horticulture, while Chapter 2 provides a preliminary spatial statistical analysis of the variability in apple orchards before and after manual thinning; an interpretation of this variability and of how it can be managed to maximize orchard performance is offered. Then, in Chapter 3, spatial data are stratified into management classes to interpret and manage spatial variation in the orchard. An inverse model approach is also applied to verify whether crop production explains environmental variation. In Chapter 4 an integration of the techniques adopted in the previous chapters is presented, offering a new key for reading the information gathered within the field. The overall goal of this Dissertation was to probe the feasibility, the desirability and the effectiveness of a precision approach to fruit growing, following the lines of other areas of agriculture that already adopt this management tool. As existing applications of precision horticulture had already shown, crop specificity is an important factor to be accounted for.
This work focused on apple because of its importance in the area where the work was carried out, and worldwide.
Abstract:
The term Artificial Intelligence has acquired a lot of baggage since its introduction, and in its current incarnation it is synonymous with Deep Learning (DL). The sudden availability of data and computing resources has opened the gates to myriads of applications. Not all are created equal though, and problems might arise especially for fields not closely related to the tasks that pertain to the tech companies that spearheaded DL. The perspective of practitioners seems to be changing, however. Human-Centric AI emerged in the last few years as a new way of thinking about DL and AI applications from the ground up, with special attention to their relationship with humans. The goal is to design a system that can gracefully integrate into already established workflows, as in many real-world scenarios AI may not be good enough to completely replace humans. Often this replacement may even be unneeded or undesirable. Another important perspective comes from Andrew Ng, a DL pioneer, who recently started shifting the focus of development from “better models” towards better, and smaller, data. He called this approach Data-Centric AI. Without downplaying the importance of pushing the state of the art in DL, we must recognize that if the goal is creating a tool for humans to use, more raw performance may not align with more utility for the final user. A Human-Centric approach is compatible with a Data-Centric one, and we find that the two overlap nicely when human expertise is used as the driving force behind data quality. This thesis documents a series of case studies where these approaches were employed, to different extents, to guide the design and implementation of intelligent systems. We found that human expertise proved crucial in improving datasets and models. The last chapter includes a slight deviation, with studies on the pandemic, still preserving the human- and data-centric perspective.
Abstract:
Artificial Intelligence (AI) and Machine Learning (ML) are novel data analysis techniques providing very accurate prediction results. They are widely adopted in a variety of industries to improve efficiency and decision-making, but they are also being used to develop intelligent systems. Their success rests on complex mathematical models, whose decisions and rationale are usually difficult for human users to comprehend, to the point of being dubbed black boxes. This is particularly relevant in sensitive and highly regulated domains. To mitigate and possibly solve this issue, the Explainable AI (XAI) field has become prominent in recent years. XAI consists of models and techniques that enable understanding of the intricate patterns discovered by black-box models. In this thesis, we consider model-agnostic XAI techniques which can be applied to tabular data, with a particular focus on the Credit Scoring domain. Special attention is dedicated to the LIME framework, for which we propose several modifications to the vanilla algorithm, in particular: a pair of complementary Stability Indices that accurately measure LIME stability, and the OptiLIME policy, which helps the practitioner find the proper balance between the stability and the reliability of explanations. We subsequently put forward GLEAMS, a model-agnostic surrogate interpretable model which needs to be trained only once, while providing both Local and Global explanations of the black-box model. GLEAMS produces feature attributions and what-if scenarios, from both the dataset and the model perspective. Finally, we argue that synthetic data are an emerging trend in AI, being used more and more to train complex models instead of original data. To be able to explain the outcomes of such models, we must guarantee that synthetic data are reliable enough for their explanations to be translated to real-world individuals. To this end we propose DAISYnt, a suite of tests to measure synthetic tabular data quality and privacy.
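The notion of explanation stability can be illustrated with a generic measure. The sketch below is not the pair of Stability Indices proposed in the thesis, whose definitions are not given in this abstract. It assumes only that an explainer, run several times on the same individual, returns a set of selected feature names each time, and it scores agreement with the mean pairwise Jaccard similarity. The function names and feature names are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two feature sets (1.0 when both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def mean_pairwise_jaccard(feature_sets):
    """Average Jaccard similarity over all pairs of repeated explanations."""
    pairs = list(combinations(feature_sets, 2))
    if not pairs:
        raise ValueError("at least two explanations are needed")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Features selected in three repeated explanations of the same prediction.
runs = [
    {"income", "age"},
    {"income", "age"},
    {"income", "debt"},
]

stability = mean_pairwise_jaccard(runs)
print(stability)
```

A value of 1.0 means that every repetition selected the same features; lower values indicate that the explanation changes from run to run, which is the kind of instability that sampling-based explainers can exhibit.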