966 results for Data quality-aware mechanisms
Open business intelligence: on the importance of data quality awareness in user-friendly data mining
Abstract:
Citizens increasingly demand data for making decisions in their daily lives. Mechanisms that allow citizens to understand and analyze linked open data (LOD) in a user-friendly manner are therefore much needed. To this aim, this position paper introduces the concept of Open Business Intelligence (OpenBI). OpenBI enables non-expert users to (i) analyze and visualize LOD, generating actionable information by means of reporting, OLAP analysis, dashboards or data mining; and (ii) share the newly acquired information as LOD to be reused by anyone. One of the most challenging issues of OpenBI is data mining, since non-experts (such as citizens) need guidance during preprocessing and during the application of mining algorithms, owing to the complexity of the mining process and the low quality of the data sources. This is even worse when dealing with LOD, not only because of the different kinds of links among data, but also because of its high dimensionality. Consequently, we advocate that data mining for OpenBI requires data quality-aware mechanisms for guiding non-expert users in obtaining and sharing the most reliable knowledge from the available LOD.
Abstract:
This paper is a summary of the main contributions of the PhD thesis published in [1]. The main research contributions of the thesis are driven by the research question of how to design simple, yet efficient and robust, run-time adaptive resource allocation schemes within the communication stack of Wireless Sensor Network (WSN) nodes. The thesis addresses several problem domains, with contributions on different layers of the WSN communication stack. The main contributions can be summarized as follows. First, a novel run-time adaptive MAC protocol is introduced, which stepwise allocates the power-hungry radio interface in an on-demand manner when the encountered traffic load requires it. Second, the thesis outlines a methodology for robust, reliable and accurate software-based energy estimation, calculated at network run-time on the sensor node itself. Third, the thesis evaluates several Forward Error Correction (FEC) strategies to adaptively allocate the correctional power of Error Correcting Codes (ECCs) to cope with temporally and spatially variable bit error rates. Fourth, in the context of TCP-based communications in WSNs, the thesis evaluates distributed caching and local retransmission strategies to overcome the performance-degrading effects of packet corruption and transmission failures when transmitting data over multiple hops. The performance of all developed protocols is evaluated on a self-developed real-world WSN testbed; the protocols achieve superior performance over selected existing approaches, especially where traffic load and channel conditions are subject to rapid variations over time.
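As a hedged illustration of the third contribution, the following minimal Python sketch shows one way a node might step its ECC strength up or down from a run-time bit-error-rate estimate; the code names, BER budgets and overhead figures are assumptions made for this example, not the thesis's actual scheme.

    # Sketch of run-time adaptive FEC selection (illustrative, not thesis code):
    # spend correctional power only when the estimated channel quality needs it.

    # (code name, worst bit error rate it is assumed to handle, relative overhead)
    ECC_LEVELS = [
        ("none",        1e-6, 0.00),   # no FEC on a clean channel
        ("hamming_7_4", 1e-4, 0.75),   # light single-error correction
        ("bch_15_7",    1e-3, 1.14),   # stronger code for noisy periods
    ]

    def select_ecc(ber_estimate):
        """Return the cheapest code whose assumed BER budget covers the estimate."""
        for name, ber_budget, _overhead in ECC_LEVELS:
            if ber_estimate <= ber_budget:
                return name
        return ECC_LEVELS[-1][0]  # channel worse than all budgets: strongest code

    if __name__ == "__main__":
        for ber in (1e-7, 5e-5, 8e-4):
            print(f"BER {ber:.0e} -> {select_ecc(ber)}")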
Abstract:
Websites are nowadays the face of institutions, but they are often neglected, especially when it comes to content. In this paper, we present ongoing work whose final goal is the development of a model for the measurement of data quality in institutional websites of health units. To that end, we have carried out a bibliographic review of the available approaches to the evaluation of website content quality, in order to identify the most recurrent dimensions and attributes, and we are currently carrying out a Delphi method process, presently in its second stage, with the purpose of reaching an adequate set of attributes for the measurement of content quality.
Abstract:
This article presents research work whose goal was to achieve a model for the evaluation of data quality in institutional websites of health units in a broad and balanced way. We carried out a literature review of the available approaches to the evaluation of website content quality, in order to identify the most recurrent dimensions and attributes, and we also carried out a Delphi method process with experts in order to reach an adequate set of attributes, and their respective weights, for the measurement of content quality. The results revealed a high level of consensus among the experts who participated in the Delphi process. Moreover, the statistical analyses and techniques employed are robust and lend confidence to our results and to the resulting model.
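To make the shape of such a model concrete, here is a minimal sketch of a weighted-attribute content quality score; the attribute names and weights are invented for illustration and are not the set the Delphi panel produced.

    # Sketch of a weighted-attribute quality score (attributes/weights invented).
    WEIGHTS = {"accuracy": 0.40, "currency": 0.35, "readability": 0.25}

    def quality_score(ratings):
        """Weighted average of attribute ratings, each rated on a 0-1 scale."""
        return sum(WEIGHTS[attr] * ratings[attr] for attr in WEIGHTS)

    site = {"accuracy": 0.9, "currency": 0.6, "readability": 0.8}
    print(f"content quality: {quality_score(site):.2f}")  # 0.77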
Abstract:
Biofilm research is growing more diverse and more dependent on high-throughput technologies, and the large-scale production of results makes data substantiation harder. In particular, it is often the case that experimental protocols are adapted to meet the needs of a particular laboratory and no statistical validation of the modified method is provided. This paper discusses the impact of intra-laboratory adaptation and non-rigorous documentation of experimental protocols on biofilm data interchange and validation. The case study is a non-standard, but widely used, workflow for Pseudomonas aeruginosa biofilm development, considering three analysis assays: the crystal violet (CV) assay for biomass quantification, the XTT assay for respiratory activity assessment, and the colony forming units (CFU) assay for determination of cell viability. The ruggedness of the protocol was assessed by introducing small changes in the biofilm growth conditions, which simulate minor protocol adaptations and non-rigorous protocol documentation. Results show that even minor variations in the biofilm growth conditions may affect the results considerably, and that the biofilm analysis assays lack repeatability. Intra-laboratory validation of non-standard protocols is found to be critical to ensure data quality and enable the comparison of results within and among laboratories.
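As a hedged illustration of the repeatability question, assuming replicate absorbance readings from the crystal violet assay (the values below are invented, not the paper's measurements), a simple check is to compare the coefficient of variation across replicates under baseline and adapted growth conditions:

    # Repeatability check via the coefficient of variation (data invented).
    from statistics import mean, stdev

    def cv_percent(replicates):
        """Relative spread (CV%) of replicate assay readings."""
        return 100.0 * stdev(replicates) / mean(replicates)

    runs = {
        "baseline protocol": [0.82, 0.79, 0.85, 0.81],  # CV ~ 3%
        "adapted protocol":  [0.64, 0.91, 0.48, 0.77],  # CV ~ 26%
    }
    for label, reps in runs.items():
        print(f"{label}: CV = {cv_percent(reps):.1f}%")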
Abstract:
We construct estimates of educational attainment for a sample of OECD countries using previously unexploited sources. We follow a heuristic approach to obtain plausible time profiles for attainment levels by removing sharp breaks in the data that seem to reflect changes in classification criteria. We then construct indicators of the information content of our series and a number of previously available data sets and examine their performance in several growth specifications. We find a clear positive correlation between data quality and the size and significance of human capital coefficients in growth regressions. Using an extension of the classical errors in variables model, we construct a set of meta-estimates of the coefficient of years of schooling in an aggregate Cobb-Douglas production function. Our results suggest that, after correcting for measurement error bias, the value of this parameter is well above 0.50.
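As a worked illustration of the measurement-error correction (one common formulation; the notation is ours and need not match the authors' exact specification), write the aggregate production function with years of schooling S, and assume observed schooling S = S* + u carries classical measurement error u:

    \[
      Y = A\,K^{\alpha}\bigl(e^{\beta S}L\bigr)^{1-\alpha},
      \qquad
      \operatorname{plim}\hat{\beta}_{\mathrm{OLS}}
        = \beta\cdot\frac{\sigma_{S^{*}}^{2}}{\sigma_{S^{*}}^{2}+\sigma_{u}^{2}}
        \le \beta .
    \]

Because the reliability ratio multiplying beta is below one, OLS attenuates the schooling coefficient; scaling the estimate back up by the inverse of that ratio is what pushes the corrected meta-estimate above 0.50.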
Abstract:
The European Surveillance of Congenital Anomalies (EUROCAT) network of population-based congenital anomaly registries is an important source of epidemiologic information on congenital anomalies in Europe, covering live births, fetal deaths from 20 weeks' gestation, and terminations of pregnancy for fetal anomaly. EUROCAT's policy is to strive for high-quality data, while ensuring consistency and transparency across all member registries. A set of 30 data quality indicators (DQIs) was developed to assess five key elements of data quality: completeness of case ascertainment, accuracy of diagnosis, completeness of information on EUROCAT variables, timeliness of data transmission, and availability of population denominator information. This article describes each of the individual DQIs and presents the output for each registry, as well as the EUROCAT (unweighted) average, for 29 full member registries for 2004-2008. This information is also available on the EUROCAT website for previous years. The EUROCAT DQIs allow registries to evaluate their performance in relation to other registries and allow appropriate interpretations to be made of the data collected. The DQIs provide direction for improving data collection and ascertainment, and they allow annual assessment for monitoring continuous improvement. The DQIs are constantly reviewed and refined to best document registry procedures and processes regarding data collection, to ensure the appropriateness of the DQIs, and to ensure transparency, so that the data collected can make a substantial and useful contribution to epidemiologic research on congenital anomalies.
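For a concrete feel of one of the five elements, here is a minimal sketch of a completeness-of-information indicator; the field names and the exact definition are our assumptions for illustration, not EUROCAT's published formulas.

    # Sketch of a completeness-style DQI (definition and field names assumed).
    def completeness(records, variable):
        """Percentage of registry cases with a non-missing value for `variable`."""
        if not records:
            return 0.0
        filled = sum(1 for r in records if r.get(variable) not in (None, ""))
        return 100.0 * filled / len(records)

    cases = [
        {"anomaly_code": "Q21.0", "karyotype": "46,XX"},
        {"anomaly_code": "Q90.9", "karyotype": ""},
        {"anomaly_code": "Q05.4", "karyotype": None},
    ]
    print(f"karyotype completeness: {completeness(cases, 'karyotype'):.1f}%")  # 33.3%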
Abstract:
Breastfeeding has important health benefits for both mother and child. Breastfed babies are less likely to present with gastric, respiratory and urinary tract infections and allergic diseases, and they are also less likely to become obese in later childhood. Improving breastfeeding initiation has become a national priority, and a national target has been set "to deliver an increase of two percentage points per annum in breastfeeding initiation rate, focusing especially on women from disadvantaged areas". Despite improvements in data quality in previous years, it still remains difficult to construct an accurate and reliable picture of variations and trends in breastfeeding in the East Midlands. It is essential that nationally standardised data collection systems are put in place to enable effective and accurate monitoring and evaluation of breastfeeding status at both a local and national level.
Abstract:
One in a series of six data briefings based on regional-level analysis of data from the National Child Measurement Programme (NCMP) undertaken by the National Obesity Observatory (NOO). The briefings are intended to complement the headline results for the region published in January 2010. This briefing covers issues relating to the quality and completeness of the NCMP data. Detailed analysis of the NCMP at national level is available from NOO at http://www.noo.org.uk/NOO_pub. Information on the methods used to …
Abstract:
The NRS state data quality standards identify the policies, processes and materials that states and local programs should have in place to collect valid and reliable data for the National Reporting System (NRS). The Division of Adult Education and Literacy (DAEL) within the Office of Vocational and Adult Education developed the standards to define the characteristics of high-quality state and local data collection systems for the NRS. The standards provide an organized way for DAEL to understand the quality of NRS data collection within the states, and they also provide guidance to states on how to improve their systems. States are to complete this checklist, which incorporates the standards, with their annual NRS data submission, rating their level of implementation of the standards. The accompanying policy document describes DAEL's requirements for state conformance to the standards and explains how the information from this checklist is used.
Abstract:
The largest fresh meat brand names in Spain are analyzed here to study how quality is signaled in agribusiness and how the underlying quality-assurance organizations work. Results show, first, that organizational form varies according to the specialization of the brand name. Publicly-controlled brand names are grounded on market contracting with individual producers, providing stronger incentives. In contrast, private brands rely more on hierarchy, taking advantage of its superiority in solving specific coordination problems. Second, the seemingly redundant coexistence of several quality indicators for a given product is explained in efficiency terms. Multiple brands are shown to be complementary, given their specialization in guaranteeing different attributes of the product.
Abstract:
Sentinel node biopsy appears to be a promising technique for the assessment of nodal disease in early cervical cancers. Selecting a population with a low risk of nodal metastasis, minimal training, and adherence to a few simple rules keep the false negative rate low. Sentinel node biopsy also provides supplementary information, both anatomical (identifying nodes outside routine lymphadenectomy areas) and histological (revealing isolated tumor cells and, above all, micrometastases, whose prognostic value is suspected).
Abstract:
In a networked business environment, visibility requirements towards supply operations and the customer interface have become tighter. The master data of the case company is seen as an enabler for meeting those requirements; however, the current state of the master data and its quality are not considered good enough. In this thesis, the research target was to develop a process for managing master data quality as a continuous process, and to find solutions for cleansing the current customer and supplier data to meet the quality requirements defined in that process. Based on the theory of Master Data Management and data cleansing, a small amount of master data was analyzed and cleansed using a commercial data cleansing solution available on the market. This was conducted in cooperation with the vendor as a proof of concept, which demonstrated the cleansing solution's applicability to improving the quality of the current master data. Based on these findings and the theory of data management, recommendations and proposals for improving the quality of the data were given. The results also revealed that the biggest reasons for poor data quality are the lack of data governance in the company and the restrictions of the current master data solutions.
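As a hedged sketch of the kind of cleansing pass the proof of concept performed (field names and matching rules are invented here; the thesis used a commercial tool, not this code):

    # Normalize and deduplicate customer master records (illustrative rules).
    import re

    def normalize(record):
        """Trim, collapse whitespace and upper-case key fields so that
        near-identical entries compare equal."""
        out = dict(record)
        out["name"] = re.sub(r"\s+", " ", record["name"]).strip().upper()
        out["country"] = record.get("country", "").strip().upper()
        return out

    def deduplicate(records):
        """Keep the first record per normalized (name, country) key."""
        seen, unique = set(), []
        for rec in map(normalize, records):
            key = (rec["name"], rec["country"])
            if key not in seen:
                seen.add(key)
                unique.append(rec)
        return unique

    customers = [
        {"name": "Acme  Oy ", "country": "fi"},
        {"name": "ACME OY",   "country": "FI"},
        {"name": "Beta AB",   "country": "SE"},
    ]
    print(deduplicate(customers))  # the two Acme rows collapse into one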