937 resultados para Data quality control
Resumo:
There is increasing evidence that many of the mitochondrial DNA (mtDNA) databases published in the fields of forensic science and molecular anthropology are flawed. An a posteriori phylogenetic analysis of the sequences could help to eliminate most of the errors and thus greatly improve data quality. However, previously published caveats and recommendations along these lines were not yet picked up by all researchers. Here we call for stringent quality control of mtDNA data by haplogroup-directed database comparisons. We take some problematic databases of East Asian mtDNAs, published in the Journal of Forensic Sciences and Forensic Science International, as examples to demonstrate the process of pinpointing obvious errors. Our results show that data sets are not only notoriously plagued by base shifts and artificial recombination but also by lab-specific phantom mutations, especially in the second hypervariable region (HVR-II). (C) 2003 Elsevier Ireland Ltd. All rights reserved.
Resumo:
Background: The recent development of semi-automated techniques for staining and analyzing flow cytometry samples has presented new challenges. Quality control and quality assessment are critical when developing new high throughput technologies and their associated information services. Our experience suggests that significant bottlenecks remain in the development of high throughput flow cytometry methods for data analysis and display. Especially, data quality control and quality assessment are crucial steps in processing and analyzing high throughput flow cytometry data. Methods: We propose a variety of graphical exploratory data analytic tools for exploring ungated flow cytometry data. We have implemented a number of specialized functions and methods in the Bioconductor package rflowcyt. We demonstrate the use of these approaches by investigating two independent sets of high throughput flow cytometry data. Results: We found that graphical representations can reveal substantial non-biological differences in samples. Empirical Cumulative Distribution Function and summary scatterplots were especially useful in the rapid identification of problems not identified by manual review. Conclusions: Graphical exploratory data analytic tools are quick and useful means of assessing data quality. We propose that the described visualizations should be used as quality assessment tools and where possible, be used for quality control.
Resumo:
An array of Bio-Argo floats equipped with radiometric sensors has been recently deployed in various open ocean areas representative of the diversity of trophic and bio-optical conditions prevailing in the so-called Case 1 waters. Around solar noon and almost everyday, each float acquires 0-250 m vertical profiles of Photosynthetically Available Radiation and downward irradiance at three wavelengths (380, 412 and 490 nm). Up until now, more than 6500 profiles for each radiometric channel have been acquired. As these radiometric data are collected out of operator’s control and regardless of meteorological conditions, specific and automatic data processing protocols have to be developed. Here, we present a data quality-control procedure aimed at verifying profile shapes and providing near real-time data distribution. This procedure is specifically developed to: 1) identify main issues of measurements (i.e. dark signal, atmospheric clouds, spikes and wave-focusing occurrences); 2) validate the final data with a hierarchy of tests to ensure a scientific utilization. The procedure, adapted to each of the four radiometric channels, is designed to flag each profile in a way compliant with the data management procedure used by the Argo program. Main perturbations in the light field are identified by the new protocols with good performances over the whole dataset. This highlights its potential applicability at the global scale. Finally, the comparison with modeled surface irradiances allows assessing the accuracy of quality-controlled measured irradiance values and identifying any possible evolution over the float lifetime due to biofouling and instrumental drift.
Resumo:
An array of Bio-Argo floats equipped with radiometric sensors has been recently deployed in various open ocean areas representative of the diversity of trophic and bio-optical conditions prevailing in the so-called Case 1 waters. Around solar noon and almost everyday, each float acquires 0-250 m vertical profiles of Photosynthetically Available Radiation and downward irradiance at three wavelengths (380, 412 and 490 nm). Up until now, more than 6500 profiles for each radiometric channel have been acquired. As these radiometric data are collected out of operator’s control and regardless of meteorological conditions, specific and automatic data processing protocols have to be developed. Here, we present a data quality-control procedure aimed at verifying profile shapes and providing near real-time data distribution. This procedure is specifically developed to: 1) identify main issues of measurements (i.e. dark signal, atmospheric clouds, spikes and wave-focusing occurrences); 2) validate the final data with a hierarchy of tests to ensure a scientific utilization. The procedure, adapted to each of the four radiometric channels, is designed to flag each profile in a way compliant with the data management procedure used by the Argo program. Main perturbations in the light field are identified by the new protocols with good performances over the whole dataset. This highlights its potential applicability at the global scale. Finally, the comparison with modeled surface irradiances allows assessing the accuracy of quality-controlled measured irradiance values and identifying any possible evolution over the float lifetime due to biofouling and instrumental drift.
Resumo:
The Gaia space mission is a major project for the European astronomical community. As challenging as it is, the processing and analysis of the huge data-flow incoming from Gaia is the subject of thorough study and preparatory work by the DPAC (Data Processing and Analysis Consortium), in charge of all aspects of the Gaia data reduction. This PhD Thesis was carried out in the framework of the DPAC, within the team based in Bologna. The task of the Bologna team is to define the calibration model and to build a grid of spectro-photometric standard stars (SPSS) suitable for the absolute flux calibration of the Gaia G-band photometry and the BP/RP spectrophotometry. Such a flux calibration can be performed by repeatedly observing each SPSS during the life-time of the Gaia mission and by comparing the observed Gaia spectra to the spectra obtained by our ground-based observations. Due to both the different observing sites involved and the huge amount of frames expected (≃100000), it is essential to maintain the maximum homogeneity in data quality, acquisition and treatment, and a particular care has to be used to test the capabilities of each telescope/instrument combination (through the “instrument familiarization plan”), to devise methods to keep under control, and eventually to correct for, the typical instrumental effects that can affect the high precision required for the Gaia SPSS grid (a few % with respect to Vega). I contributed to the ground-based survey of Gaia SPSS in many respects: with the observations, the instrument familiarization plan, the data reduction and analysis activities (both photometry and spectroscopy), and to the maintenance of the data archives. However, the field I was personally responsible for was photometry and in particular relative photometry for the production of short-term light curves. In this context I defined and tested a semi-automated pipeline which allows for the pre-reduction of imaging SPSS data and the production of aperture photometry catalogues ready to be used for further analysis. A series of semi-automated quality control criteria are included in the pipeline at various levels, from pre-reduction, to aperture photometry, to light curves production and analysis.
Resumo:
Mode of access: Internet.
Resumo:
Cover title.
Resumo:
Cover title: Quality control of wind profiler data. Wind profiler training manual number two.
Resumo:
With the construction of operational oceanography systems, the need for real-time has become more and more important. A lot of work had been done in the past, within National Data Centres (NODC) and International Oceanographic Data and Information Exchange (IODE) to standardise delayed mode quality control procedures. Concerning such quality control procedures applicable in real-time (within hours to a maximum of a week from acquisition), which means automatically, some recommendations were set up for physical parameters but mainly within projects without consolidation with other initiatives. During the past ten years the EuroGOOS community has been working on such procedures within international programs such as Argo, OceanSites or GOSUD, or within EC projects such as Mersea, MFSTEP, FerryBox, ECOOP, and MyOcean. In collaboration with the FP7 SeaDataNet project that is standardizing the delayed mode quality control procedures in NODCs, and MyOcean GMES FP7 project that is standardizing near real time quality control procedures for operational oceanography purposes, the DATA-MEQ working group decided to put together this document to summarize the recommendations for near real-time QC procedures that they judged mature enough to be advertised and recommended to EuroGOOS.
Resumo:
A pragmatic method for assessing the accuracy and precision of a given processing pipeline required for converting computed tomography (CT) image data of bones into representative three dimensional (3D) models of bone shapes is proposed. The method is based on coprocessing a control object with known geometry which enables the assessment of the quality of resulting 3D models. At three stages of the conversion process, distance measurements were obtained and statistically evaluated. For this study, 31 CT datasets were processed. The final 3D model of the control object contained an average deviation from reference values of −1.07±0.52 mm standard deviation (SD) for edge distances and −0.647±0.43 mm SD for parallel side distances of the control object. Coprocessing a reference object enables the assessment of the accuracy and precision of a given processing pipeline for creating CTbased 3D bone models and is suitable for detecting most systematic or human errors when processing a CT-scan. Typical errors have about the same size as the scan resolution.