962 results for "Data quality problems"
Abstract:
Aircraft Maintenance, Repair and Overhaul (MRO) agencies rely largely on raw-data-based quotation systems to select the best suppliers for their customers (airlines). Data quantity and quality become key issues in determining the success of an MRO job, since cost and quality benchmarks must be met. This paper introduces a data mining approach to create an MRO quotation system that enhances data quantity and data quality, and enables significantly more precise MRO job quotations. Regular expressions were used to analyse descriptive textual feedback (i.e. engineers' reports) in order to extract more referable, highly normalised data for job quotation. A text-mining-based key influencer analysis function enables the user to proactively select sub-parts, defects and possible solutions to make queries more accurate. Implementation results show that the system would improve cost quotation in 40% of MRO jobs, reducing service cost without causing a drop in service quality.
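The regular-expression extraction step described above can be sketched as follows; the field names, patterns and sample report text are illustrative assumptions, not the paper's actual schema:

```python
import re

# Hedged sketch: patterns and field names are invented for illustration.
REPORT_PATTERNS = {
    "sub_part":  re.compile(r"sub-?part[:\s]+([A-Za-z0-9\-]+)", re.IGNORECASE),
    "defect":    re.compile(r"defect[:\s]+([a-z ]+?)(?:[.;]|$)", re.IGNORECASE),
    "man_hours": re.compile(r"(\d+(?:\.\d+)?)\s*man-?hours", re.IGNORECASE),
}

def extract_fields(report: str) -> dict:
    """Pull normalised fields out of a free-text engineer report."""
    fields = {}
    for name, pattern in REPORT_PATTERNS.items():
        match = pattern.search(report)
        if match:
            fields[name] = match.group(1).strip()
    return fields

sample = ("Inspected subpart: HPT-blade-7. Defect: thermal fatigue cracking. "
          "Repair took 12.5 man-hours.")
print(extract_fields(sample))
```

Each extracted field is then a normalised, queryable value rather than free text, which is what makes the quotation queries more precise.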
Abstract:
The position of Real Estate within a multi-asset portfolio has received considerable attention recently. Previous research has concentrated on the percentage holding property would achieve given its risk/return characteristics. Such studies have invariably used Modern Portfolio Theory, and these approaches have been criticised both for the quality of the real estate data and for problems with the methodology itself. The first problem is now well understood, and the second can be addressed by the use of realistic constraints on asset holdings. This paper takes a different approach: we determine the level of return that Real Estate needs to achieve to justify an allocation within the multi-asset portfolio. In order to test the importance of data quality, we use historic appraisal-based and desmoothed returns to examine the sensitivity of the results. Consideration is also given to the holding period and to the imposition of realistic constraints on asset holdings, in order to model portfolios held by pension fund investors. We conclude, using several benchmark levels of portfolio risk and return, that with appraisal-based data the required level of return for Real Estate was less than that achieved over the period 1972-1993. The use of desmoothed series can reverse this result at the highest levels of desmoothing, although within a restricted holding period Real Estate offered returns in excess of those required to enter the portfolio and might have a role to play in the multi-asset portfolio.
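The desmoothing of appraisal-based returns mentioned above is commonly done with a first-order unsmoothing rule; a minimal sketch, assuming a single smoothing parameter alpha and invented return figures (not the 1972-1993 data):

```python
def desmooth(appraisal_returns, alpha):
    """Recover an underlying return series from appraisal-smoothed returns.

    Uses the standard first-order unsmoothing rule
        r*_t = (r_t - (1 - alpha) * r_{t-1}) / alpha,
    where alpha in (0, 1] is an assumed smoothing parameter
    (alpha = 1 leaves the series unchanged).
    """
    out = []
    for t in range(1, len(appraisal_returns)):
        out.append((appraisal_returns[t]
                    - (1 - alpha) * appraisal_returns[t - 1]) / alpha)
    return out

smoothed = [0.02, 0.03, 0.01, 0.04]   # illustrative quarterly returns
print(desmooth(smoothed, alpha=0.5))
```

Lower alpha (heavier assumed smoothing) amplifies the recovered volatility, which is why the paper's conclusion can reverse "at the highest levels of desmoothing".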
Abstract:
The catchment of the River Thames, the principal river system in southern England, provides the main water supply for London but is highly vulnerable to changes in climate, land use and population. The river is eutrophic with significant algal blooms, and phosphorus is assumed to be the primary chemical indicator of ecosystem health. In the Thames Basin, phosphorus is available from point sources such as wastewater treatment plants and from diffuse sources such as agriculture. In order to predict vulnerability to future change, the integrated catchments model for phosphorus (INCA-P) has been applied to the river basin and used to assess the cost-effectiveness of a range of mitigation and adaptation strategies. It is shown that scenarios of future climate and land-use change will exacerbate the water quality problems, but a range of mitigation measures can improve the situation. A cost-effectiveness study has been undertaken to compare the economic benefits of each mitigation measure and to assess the phosphorus reductions achieved. The most effective strategy is to reduce fertilizer use by 20% together with the treatment of effluent to a high standard. Such measures would reduce in-stream phosphorus concentrations to close to the EU Water Framework Directive target for the Thames.
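The cost-effectiveness comparison described above reduces, in its simplest form, to ranking each measure by cost per tonne of phosphorus removed; a minimal sketch with invented costs and reductions (not the INCA-P study's results):

```python
# Hedged sketch: all figures below are made-up illustrative numbers.
measures = {
    "fertilizer -20% + tertiary effluent treatment": {"annual_cost": 9.0, "p_reduction_t": 120.0},
    "fertilizer -20% only":                          {"annual_cost": 2.5, "p_reduction_t": 30.0},
    "riparian buffer strips":                        {"annual_cost": 1.2, "p_reduction_t": 8.0},
}

def rank_by_cost_effectiveness(measures):
    """Rank mitigation measures by cost per tonne of phosphorus removed (ascending)."""
    return sorted(
        measures,
        key=lambda m: measures[m]["annual_cost"] / measures[m]["p_reduction_t"],
    )

for name in rank_by_cost_effectiveness(measures):
    m = measures[name]
    print(f"{name}: {m['annual_cost'] / m['p_reduction_t']:.3f} cost units / tonne P")
```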
Abstract:
We propose a new class of neurofuzzy construction algorithms that aim to maximize generalization capability, specifically for imbalanced data classification problems, based on leave-one-out (LOO) cross validation. The algorithms proceed in two stages: first, an initial rule base is constructed by estimating a Gaussian mixture model with analysis-of-variance decomposition from the input data; the second stage carries out joint weighted least squares parameter estimation and rule selection using an orthogonal forward subspace selection (OFSS) procedure. We show how different LOO-based rule selection criteria can be incorporated into OFSS, and advocate either maximizing the leave-one-out area under the receiver operating characteristic curve, or maximizing the leave-one-out F-measure if the data sets exhibit an imbalanced class distribution. Extensive comparative simulations illustrate the effectiveness of the proposed algorithms.
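The two selection criteria advocated above, area under the ROC curve and the F-measure, can be computed without library support; a minimal sketch (the rank-sum identity for AUC and precision/recall for the F-measure — the LOO machinery itself is not reproduced here):

```python
def auc_from_scores(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity.

    labels are 1 (positive) / 0 (negative); ties get half credit.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def f_measure(predictions, labels, beta=1.0):
    """F-measure, the criterion advocated for imbalanced class distributions."""
    tp = sum(p == 1 and y == 1 for p, y in zip(predictions, labels))
    fp = sum(p == 1 and y == 0 for p, y in zip(predictions, labels))
    fn = sum(p == 0 and y == 1 for p, y in zip(predictions, labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return (1 + beta**2) * precision * recall / (beta**2 * precision + recall)

print(auc_from_scores([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0]))
print(f_measure([1, 1, 0, 0], [1, 0, 1, 0]))
```

In the LOO setting, each score or prediction would come from a model fitted with that sample held out, and the criterion is evaluated over all held-out samples.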
Abstract:
For users of climate services, the ability to quickly determine the datasets that best fit one's needs would be invaluable. The volume, variety and complexity of climate data makes this judgment difficult. The ambition of CHARMe ("Characterization of metadata to enable high-quality climate services") is to give a wider interdisciplinary community access to a range of supporting information, such as journal articles, technical reports or feedback on previous applications of the data. The capture and discovery of this "commentary" information, often created by data users rather than data providers, and currently not linked to the data themselves, has not been significantly addressed previously. CHARMe applies the principles of Linked Data and open web standards to associate, record, search and publish user-derived annotations in a way that can be read both by users and automated systems. Tools have been developed within the CHARMe project that enable annotation capability for data delivery systems already in wide use for discovering climate data. In addition, the project has developed advanced tools for exploring data and commentary in innovative ways, including an interactive data explorer and comparator ("CHARMe Maps") and a tool for correlating climate time series with external "significant events" (e.g. instrument failures or large volcanic eruptions) that affect the data quality. Although the project focuses on climate science, the concepts are general and could be applied to other fields. All CHARMe system software is open-source, released under a liberal licence, permitting future projects to re-use the source code as they wish.
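An annotation linking commentary to a dataset, in the Linked Data style described above, might look like the following; the URIs and property set are illustrative assumptions modelled on the W3C Web Annotation vocabulary, not the CHARMe schema:

```python
import json

def make_annotation(dataset_uri, body_uri, annotator):
    """Build a Linked Data annotation tying user commentary to a dataset."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",  # W3C Web Annotation context
        "type": "Annotation",
        "motivation": "linking",
        "creator": annotator,        # who wrote the commentary
        "body": body_uri,            # e.g. a journal article discussing the data
        "target": dataset_uri,       # the climate dataset being commented on
    }

anno = make_annotation(
    "http://example.org/datasets/sst-v2",       # hypothetical dataset URI
    "https://doi.org/10.0000/example-article",  # hypothetical commentary URI
    "http://example.org/users/jdoe",            # hypothetical annotator
)
print(json.dumps(anno, indent=2))
```

Because the annotation is plain JSON-LD, it can be read both by people and by automated systems, which is the point made in the abstract.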
Abstract:
This paper presents the results of investigations carried out to identify and quantify the power quality problems that result from measures taken to improve the efficiency of electric energy consumption. The efficiencies of several electric devices were evaluated, among them fluorescent bulbs, electronic ballasts, soft-starters, temperature controllers for showers, and dimmers. This evaluation made it possible to establish a cause/effect analysis of power quality.
Abstract:
This work presents software developed to process solar radiation data. The software can be used in meteorological and climatic stations, and also to support solar radiation measurements in studies of solar energy availability, allowing data quality control, statistical calculations and validation of models, as well as easy data interchange. © 1999 Elsevier B.V. Ltd. All rights reserved.
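A data quality control step of the kind mentioned above can be as simple as flagging physically impossible irradiance values; a minimal sketch, with common-sense bounds rather than the thresholds used by the software in the paper:

```python
def qc_flags(ghi_series, extraterrestrial_series):
    """Flag each global horizontal irradiance (GHI) sample in W/m^2.

    A surface value below zero, or above the corresponding top-of-atmosphere
    (extraterrestrial) value, is physically implausible and gets flagged.
    """
    flags = []
    for ghi, g0 in zip(ghi_series, extraterrestrial_series):
        if ghi < 0:
            flags.append("negative")
        elif ghi > g0:
            flags.append("exceeds_extraterrestrial")
        else:
            flags.append("ok")
    return flags

print(qc_flags([500.0, -3.0, 1500.0], [1360.0, 1360.0, 1360.0]))
```

Real quality-control suites add time-dependent limits and cross-sensor consistency checks, but this is the basic pattern.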
Abstract:
Nowadays, with the expansion of reference station networks, several positioning techniques have been developed and/or improved. Among them, the VRS (Virtual Reference Station) concept has been widely used. The goal of this paper is to generate VRS data with a modified technique. In the proposed methodology the DD (double difference) ambiguities are not computed; the network correction terms are obtained using only atmospheric (ionospheric and tropospheric) models. To carry out the experiments, data from five reference stations of the GPS Active Network of the West of São Paulo State and from an extra station were used. To evaluate the VRS data quality, three different strategies were used: PPP (Precise Point Positioning), Relative Positioning in static and kinematic modes, and DGPS (Differential GPS). Furthermore, the VRS data were generated at the position of a real reference station. The results provided by the VRS data agree quite well with those of the real data files.
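The idea of displacing a reference-station observation to the virtual station using only modelled atmospheric corrections can be sketched as follows; the correction structure and all numeric values are illustrative assumptions, not the paper's algorithm:

```python
def displace_observation(master_pseudorange, rho_master, rho_vrs,
                         trop_master, trop_vrs, iono_master, iono_vrs):
    """Displace a master-station pseudorange to a virtual reference station.

    The measured pseudorange is corrected by the geometric-range difference
    and by the differences in modelled tropospheric and ionospheric delays,
    consistent with using only atmospheric models (no DD ambiguities).
    All quantities are in metres.
    """
    return (master_pseudorange
            + (rho_vrs - rho_master)      # geometric range difference
            + (trop_vrs - trop_master)    # modelled tropospheric delay difference
            + (iono_vrs - iono_master))   # modelled ionospheric delay difference

vrs_pr = displace_observation(
    master_pseudorange=22_345_678.90,
    rho_master=22_345_670.00, rho_vrs=22_345_710.00,
    trop_master=2.40, trop_vrs=2.43,
    iono_master=3.10, iono_vrs=3.05)
print(round(vrs_pr, 2))
```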
Abstract:
In geophysics and seismology, raw data need to be processed to generate useful information that can be turned into knowledge by researchers. The number of sensors acquiring raw data is increasing rapidly. Without good data management systems, more time can be spent querying and preparing datasets for analysis than acquiring the raw data. Also, a lot of good-quality data acquired at great effort can be lost forever if they are not correctly stored. Local and international cooperation will probably be reduced, and a lot of data will never become scientific knowledge. For this reason, the Seismological Laboratory of the Institute of Astronomy, Geophysics and Atmospheric Sciences at the University of São Paulo (IAG-USP) has concentrated fully on its data management system. This report describes the efforts of the IAG-USP to set up a seismology data management system to facilitate local and international cooperation. © 2011 by the Istituto Nazionale di Geofisica e Vulcanologia. All rights reserved.
Abstract:
Wireless Sensor Networks (WSNs) can be used to monitor hazardous and inaccessible areas. In these situations, the power supply (e.g. battery) of each node cannot be easily replaced. One solution to deal with the limited capacity of current power supplies is to deploy a large number of sensor nodes, since the lifetime and dependability of the network will increase through cooperation among nodes. Applications on WSNs may also have other concerns, such as meeting temporal deadlines on message transmissions and maximizing the quality of information. Data fusion is a well-known technique that can be useful for the enhancement of data quality and for the maximization of WSN lifetime. In this paper, we propose an approach that allows the implementation of parallel data fusion techniques in IEEE 802.15.4 networks. One of the main advantages of the proposed approach is that it enables a trade-off between different user-defined metrics through the use of a genetic machine learning algorithm. Simulations and field experiments performed in different communication scenarios highlight significant improvements when compared with, for instance, the Gur Game approach or the implementation of conventional periodic communication techniques over IEEE 802.15.4 networks. © 2013 Elsevier B.V. All rights reserved.
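The trade-off between user-defined metrics via a genetic algorithm can be sketched as follows; the fitness model (network lifetime vs. quality of information as functions of per-node reporting probabilities) is an illustrative assumption, not the paper's algorithm:

```python
import random

# Hedged sketch: a minimal genetic algorithm evolving per-node reporting
# probabilities under user-defined trade-off weights.
random.seed(42)
N_NODES, POP, GENS = 8, 20, 40
W_LIFETIME, W_QUALITY = 0.5, 0.5       # user-defined trade-off weights

def fitness(genome):
    rate = sum(genome) / len(genome)   # mean reporting probability
    lifetime = 1.0 - rate              # fewer transmissions -> longer life
    quality = 1.0 - (1.0 - rate) ** 2  # diminishing returns on more data
    return W_LIFETIME * lifetime + W_QUALITY * quality

def evolve():
    pop = [[random.random() for _ in range(N_NODES)] for _ in range(POP)]
    for _ in range(GENS):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: POP // 2]                    # truncation selection
        children = []
        for _ in range(POP - len(parents)):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, N_NODES)       # one-point crossover
            child = a[:cut] + b[cut:]
            if random.random() < 0.2:                # mutation
                child[random.randrange(N_NODES)] = random.random()
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

best = evolve()
print(round(fitness(best), 3))
```

Changing the two weights moves the evolved solution along the lifetime/quality trade-off, which is the user-configurable behaviour the abstract describes.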
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Abstract:
A. Introduction. -- B. Summary of evaluation.
Abstract:
The analysis of occurrences in the electric power system is of fundamental importance for safe operation and for maintaining the quality of the electric power supplied to consumers. Electric utilities use devices called disturbance recorders (DRs) to monitor and diagnose problems in the electrical and protection systems. The waveforms normally analysed in utility operation centres are those generated by events that almost always cause line tripping due to the operation of relays commanded by the protection devices. However, a large number of stored records that may contain important information about the behaviour and performance of the electrical system are never analysed. The objective of this work is to use the data obtained by the DRs, available in the control and operation centres of the electric utilities, to automatically classify and quantify signals that characterize power quality problems involving short-duration voltage variations: sags, swells and interruptions. The proposed method uses the wavelet transform to obtain a feature vector for the phase A, B and C voltages, and a probabilistic neural network for classification. Signals classified as exhibiting short-duration variations are quantified with respect to duration and amplitude using the properties of the multiresolution analysis of the signal decomposition. These parameters then form a database on which statistical analysis procedures can be used to generate reports on the power quality characteristics. Results obtained by applying the proposed methodology to a real system are also presented.
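The magnitude bands used to label short-duration voltage variations can be sketched as follows; the thresholds follow the usual IEEE 1159 bands, and the wavelet-based feature extraction and neural network classifier themselves are not reproduced here:

```python
def classify_event(residual_voltage_pu):
    """Classify a short-duration RMS voltage variation by magnitude.

    residual_voltage_pu is the retained voltage in per-unit of nominal.
    Thresholds follow the usual IEEE 1159 magnitude bands:
    < 0.1 pu interruption, 0.1-0.9 pu sag, > 1.1 pu swell.
    """
    if residual_voltage_pu < 0.1:
        return "interruption"
    if residual_voltage_pu < 0.9:
        return "sag"
    if residual_voltage_pu > 1.1:
        return "swell"
    return "normal"

for v in (0.05, 0.6, 1.0, 1.25):
    print(v, classify_event(v))
```

In the method described, duration would also be estimated (from the multiresolution decomposition) and stored alongside the magnitude in the statistics database.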
Abstract:
During the process of knowledge extraction from databases, some problems may be encountered, such as the absence of a given value of an attribute. This problem can have harmful effects on the final results of the process, since it directly affects the quality of the data to be submitted to a machine learning algorithm. In the literature, several proposals have been presented to mitigate this damage, among them data imputation, which estimates a plausible value to replace the missing one. Following this line of solutions to the missing-value problem, several works were analysed and some observations were made, such as the limited use of synthetic datasets that simulate the main missing-data mechanisms, and a recent trend towards the use of bio-inspired algorithms to treat the problem. Against this background, this dissertation presents a data imputation method based on particle swarm optimization, little explored in the area, and applies it to the treatment of synthetically generated datasets that consider the main missing-data mechanisms: MAR, MCAR and NMAR. The results obtained when comparing different configurations of the method with two others well known in the area (KNNImpute and SVMImpute) are promising for its use in missing-value treatment, since it achieved the best values in most of the experiments performed.
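A particle-swarm imputation step of this kind can be sketched as follows; the objective (distance from the completed record to its nearest complete records) and all parameters are illustrative assumptions, not the dissertation's setup:

```python
import random

def objective(candidate, record, complete_rows, missing_idx, k=2):
    """Sum of squared distances to the k nearest complete records."""
    filled = list(record)
    filled[missing_idx] = candidate
    dists = sorted(sum((a - b) ** 2 for a, b in zip(filled, row))
                   for row in complete_rows)
    return sum(dists[:k])

def pso_impute(record, complete_rows, missing_idx,
               n_particles=15, iters=60, seed=1):
    """Estimate one missing value with a basic particle swarm optimizer."""
    rng = random.Random(seed)
    lo = min(r[missing_idx] for r in complete_rows)
    hi = max(r[missing_idx] for r in complete_rows)
    pos = [rng.uniform(lo, hi) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    pbest = pos[:]
    score = lambda c: objective(c, record, complete_rows, missing_idx)
    gbest = min(pos, key=score)
    for _ in range(iters):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            vel[i] = (0.7 * vel[i]                      # inertia
                      + 1.5 * r1 * (pbest[i] - pos[i])  # cognitive pull
                      + 1.5 * r2 * (gbest - pos[i]))    # social pull
            pos[i] += vel[i]
            if score(pos[i]) < score(pbest[i]):
                pbest[i] = pos[i]
            if score(pos[i]) < score(gbest):
                gbest = pos[i]
    return gbest

complete = [[1.0, 2.0], [1.1, 2.1], [5.0, 6.0]]
record = [1.05, 0.0]   # the value at index 1 is missing (0.0 is a placeholder)
print(round(pso_impute(record, complete, missing_idx=1), 2))
```

The swarm converges near the values held by the record's closest complete neighbours, which is the intuition behind nearest-neighbour-style imputation objectives.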
Abstract:
Human movement analysis (HMA) aims to measure the ability of a subject to stand or to walk. In the field of HMA, tests are performed daily in research laboratories, hospitals and clinics, aiming to diagnose a disease, distinguish between disease entities, monitor the progress of a treatment and predict the outcome of an intervention [Brand and Crowninshield, 1981; Brand, 1987; Baker, 2006]. To achieve these purposes, clinicians and researchers use measurement devices such as force platforms, stereophotogrammetric systems, accelerometers, baropodometric insoles, etc. This thesis focuses on the force platform (FP) and in particular on the quality assessment of FP data. The principal objective of our work was the design and experimental validation of a portable system for the in situ calibration of FPs. The thesis is structured as follows. Chapter 1: Description of the physical principles underlying the functioning of an FP and how these principles are used to create force transducers, such as strain gauges and piezoelectric transducers; description of the two categories of FPs, three- and six-component, the signal acquisition (hardware structure), and the signal calibration; finally, a brief description of the use of FPs in HMA, for balance or gait analysis. Chapter 2: Description of inverse dynamics, the most common method used in the field of HMA. This method uses the signals measured by an FP to estimate kinetic quantities, such as joint forces and moments. These variables cannot be measured directly except with very invasive techniques; consequently, they can only be estimated using indirect techniques, such as inverse dynamics. Finally, a brief description of the sources of error present in gait analysis. Chapter 3: State of the art in FP calibration.
The selected literature is divided into sections, each describing: systems for the periodic control of FP accuracy; systems for error reduction in FP signals; and systems and procedures for the construction of an FP. In particular, a calibration system designed by our group, based on the theoretical method proposed by ?, is described in detail. This system was the starting point for the new system presented in this thesis. Chapter 4: Description of the new system, divided into its parts: 1) the algorithm; 2) the device; and 3) the calibration procedure, for the correct performance of the calibration process. The characteristics of the algorithm were optimized by a simulation approach, and the results are presented. In addition, the different versions of the device are described. Chapter 5: Experimental validation of the new system, achieved by testing it on 4 commercial FPs. The effectiveness of the calibration was verified by measuring, before and after calibration, the accuracy of the FPs in measuring the centre of pressure of an applied force. The new system can estimate local and global calibration matrices; using local and global calibration matrices, the nonlinearity of the FPs was quantified and locally compensated. Further, a nonlinear calibration is proposed. This calibration compensates for the nonlinear effect in FP functioning due to the bending of the upper plate. The experimental results are presented. Chapter 6: Influence of FP calibration on the estimation of kinetic quantities with the inverse dynamics approach. Chapter 7: The conclusions of this thesis are presented: the need for calibration of FPs and the consequent enhancement in kinetic data quality. Appendix: Calibration of the LC used in the presented system. Different calibration setups of a 3D force transducer are presented, and the optimal setup is proposed, with particular attention to the compensation of nonlinearities. The optimal setup is verified by experimental results.
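Applying a calibration matrix to a raw six-component reading and computing the centre of pressure, the accuracy measure used for validation, can be sketched as follows; the matrix and readings are invented numbers, not the thesis's data:

```python
def apply_calibration(C, raw):
    """corrected = C @ raw for a six-component reading [Fx, Fy, Fz, Mx, My, Mz]."""
    return [sum(C[i][j] * raw[j] for j in range(6)) for i in range(6)]

def centre_of_pressure(load):
    """COP on the plate surface, with moments taken about the surface origin:
    x = -My / Fz, y = Mx / Fz."""
    fx, fy, fz, mx, my, mz = load
    return -my / fz, mx / fz

# identity calibration plus one small cross-talk term, as an example
C = [[1.0 if i == j else 0.0 for j in range(6)] for i in range(6)]
C[2][0] = 0.01                 # illustrative Fx-into-Fz cross-talk correction
raw = [10.0, 0.0, 500.0, 25.0, -50.0, 0.0]
corrected = apply_calibration(C, raw)
print(centre_of_pressure(corrected))
```

A local calibration, as described in Chapter 5, would select a different matrix C depending on where on the plate the load is applied, which is one way to compensate nonlinearity.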