980 results for Data errors
Abstract:
Senior thesis written for Oceanography 445
Abstract:
The relationship between spot volume and variation is reported for all protein spots observed on large-format 2D gels when utilising silver stain technology and a model system based on mammalian NSO cell extracts. By running multiple gels we have shown that the reproducibility of data generated in this way depends on individual protein spot volumes, which in turn are directly correlated with the coefficient of variation. The coefficients of variation across all observed protein spots were highest for low-abundance proteins, which are the primary contributors to process error, and lowest for more abundant proteins. Using the relationship between spot volume and coefficient of variation, we show it is necessary to calculate variation for individual protein spot volumes. The inherent limitations of silver staining therefore mean that errors in individual protein spot volumes, rather than a global error, must be considered when assessing significant changes in protein spot volume. (C) 2003 Elsevier Science (USA). All rights reserved.
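As a minimal illustration of the per-spot calculation described in this abstract, the sketch below computes a coefficient of variation for each spot from simulated replicate gel volumes and reproduces the reported trend of higher CVs for low-volume spots. The array layout, noise model and all numbers are assumptions for illustration, not the authors' protocol.

```python
import numpy as np

# Hypothetical replicate data: rows = replicate gels, columns = protein spots.
rng = np.random.default_rng(0)
n_gels, n_spots = 6, 500
true_volume = rng.lognormal(mean=3.0, sigma=1.5, size=n_spots)
# Assume higher relative noise for low-volume spots, as reported for silver stain.
noise_sd = 0.5 / np.sqrt(true_volume)
spot_volumes = true_volume * rng.lognormal(0.0, noise_sd, size=(n_gels, n_spots))

mean_volume = spot_volumes.mean(axis=0)
cv = spot_volumes.std(axis=0, ddof=1) / mean_volume   # per-spot coefficient of variation

# Bin spots into quartiles of mean volume; the median CV should fall as volume rises.
quartile = np.digitize(mean_volume, np.quantile(mean_volume, [0.25, 0.5, 0.75]))
for q in range(4):
    print(f"volume quartile {q + 1}: median CV = {np.median(cv[quartile == q]):.2f}")
```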
Abstract:
Background: Intravenous (IV) fluid administration is an integral component of clinical care. Errors in administration can cause detrimental patient outcomes and increase healthcare costs, although little is known about medication administration errors associated with continuous IV infusions. Objectives: (1) To ascertain the prevalence of medication administration errors for continuous IV infusions and identify the variables that caused them. (2) To quantify the probability of errors by fitting a logistic regression model to the data. Methods: A prospective study was conducted on three surgical wards at a teaching hospital in Australia. All study participants received continuous infusions of IV fluids. Parenteral nutrition and non-electrolyte-containing intermittent drug infusions (such as antibiotics) were excluded. Medication administration errors and contributing variables were documented using a direct observational approach. Results: Six hundred and eighty-seven observations were made, with 124 (18.0%) having at least one medication administration error. The most common error observed was wrong administration rate. The median deviation from the prescribed rate was -47 ml/h (interquartile range -75 to +33.8 ml/h). Errors were more likely to occur if an IV infusion control device was not used and as the duration of the infusion increased. Conclusions: Administration errors involving continuous IV infusions occur frequently. They could be reduced by more common use of IV infusion control devices and regular checking of administration rates.
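A hedged sketch of the kind of logistic regression described in the objectives, fitted here to simulated observations; the predictor names (device_used, duration_h), effect sizes and all values are invented for illustration and are not the study's data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical observational records, one row per infusion observation:
# device_used (1 = infusion control device in use), duration_h (hours running).
rng = np.random.default_rng(1)
n = 687
device_used = rng.integers(0, 2, size=n)
duration_h = rng.exponential(scale=8.0, size=n)

# Simulated outcome, consistent with the reported direction of effects:
# errors more likely without a control device and for longer infusions.
logit = -2.0 - 1.0 * device_used + 0.08 * duration_h
error = rng.random(n) < 1 / (1 + np.exp(-logit))

X = np.column_stack([device_used, duration_h])
model = LogisticRegression().fit(X, error)
print("coefficients (device_used, duration_h):", model.coef_[0])
print("P(error | no device, 12 h):", model.predict_proba([[0, 12.0]])[0, 1].round(3))
```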
Abstract:
The schema of an information system can significantly impact the ability of end users to efficiently and effectively retrieve the information they need. Obtaining the appropriate data quickly increases the likelihood that an organization will make good decisions and respond adeptly to challenges. This research presents and validates a methodology for evaluating, ex ante, the relative desirability of alternative instantiations of a model of data. In contrast to prior research, each instantiation is based on a different formal theory. This research theorizes that the instantiation that yields the lowest weighted average query complexity for a representative sample of information requests is the most desirable instantiation for end-user queries. The theory was validated by an experiment that compared end-user performance using an instantiation of a data structure based on the relational model of data with performance using the corresponding instantiation of the data structure based on the object-relational model of data. Complexity was measured using three different Halstead metrics: program length, difficulty, and effort. For a representative sample of queries, the average complexity using each instantiation was calculated. As theorized, end users querying the instantiation with the lower average complexity made fewer semantic errors, i.e., were more effective at composing queries. (c) 2005 Elsevier B.V. All rights reserved.
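The three Halstead metrics named here have standard definitions in terms of operator and operand counts. The sketch below computes them for two hypothetical query formulations; the counts are invented purely to illustrate the comparison and are not taken from the experiment.

```python
import math

def halstead(n1, n2, N1, N2):
    """Halstead metrics from distinct operators (n1), distinct operands (n2),
    total operators (N1) and total operands (N2)."""
    length = N1 + N2                      # program length
    vocabulary = n1 + n2
    volume = length * math.log2(vocabulary)
    difficulty = (n1 / 2) * (N2 / n2)
    effort = difficulty * volume
    return {"length": length, "difficulty": round(difficulty, 2), "effort": round(effort, 1)}

# Hypothetical operator/operand counts for the same information request written
# against two alternative schema instantiations.
relational_query = halstead(n1=9, n2=12, N1=18, N2=24)
object_relational_query = halstead(n1=7, n2=9, N1=12, N2=15)
print("relational:        ", relational_query)
print("object-relational: ", object_relational_query)
```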
Abstract:
Data on the occurrence of species are widely used to inform the design of reserve networks. These data contain commission errors (when a species is mistakenly thought to be present) and omission errors (when a species is mistakenly thought to be absent), and the rates of the two types of error are inversely related. Point locality data can minimize commission errors, but those obtained from museum collections are generally sparse, suffer from substantial spatial bias and contain large omission errors. Geographic ranges generate large commission errors because they assume homogeneous species distributions. Predicted distribution data make explicit inferences on species occurrence, and their commission and omission errors depend on model structure, on the omission of variables that determine species distribution, and on data resolution. Omission errors lead to identifying networks of areas for conservation action that are smaller than required and centred on known species occurrences, thus affecting the comprehensiveness, representativeness and efficiency of selected areas. Commission errors lead to selecting areas not relevant to conservation, thus affecting the representativeness and adequacy of reserve networks. Conservation plans should include an estimation of commission and omission errors in underlying species data and explicitly use this information to influence conservation planning outcomes.
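Where reference data on true occurrences are available, the two error types can be quantified as simple proportions. The sketch below does this on a simulated species-by-site table; the simulation parameters and error levels are assumptions for illustration only.

```python
import numpy as np

# Hypothetical species-by-site table: True = present, False = absent.
rng = np.random.default_rng(2)
true_presence = rng.random((100, 50)) < 0.3               # actual occurrences
recorded = true_presence & (rng.random((100, 50)) < 0.7)  # some true presences missed
recorded |= rng.random((100, 50)) < 0.05                  # some spurious records added

# Commission: fraction of recorded presences that are actually absences.
commission_rate = (recorded & ~true_presence).sum() / recorded.sum()
# Omission: fraction of actual presences that the dataset misses.
omission_rate = (~recorded & true_presence).sum() / true_presence.sum()
print(f"commission error rate: {commission_rate:.3f}")
print(f"omission error rate:   {omission_rate:.3f}")
```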
Abstract:
Although managers consider accurate, timely, and relevant information critical to the quality of their decisions, evidence of large variations in data quality abounds. Over a period of twelve months, the action research project reported herein attempted to investigate and track data quality initiatives undertaken by the participating organisation. The investigation focused on two types of errors: transaction input errors and processing errors. Whenever the action research initiative identified non-trivial errors, the participating organisation introduced actions to correct the errors and prevent similar errors in the future. Data quality metrics were taken quarterly to measure improvements resulting from the activities undertaken during the action research project. The results indicated that, for a mission-critical database, ensuring and maintaining data quality requires a commitment to continuous data quality improvement. Also, communication among all stakeholders is required to ensure a common understanding of data quality improvement goals. The project found that further substantial improvements in data quality sometimes require structural changes within the organisation and to its information systems. The major goal of the action research study is to increase the level of data quality awareness within all organisations and to motivate them to examine the importance of achieving and maintaining high-quality data.
Abstract:
This paper presents load profiles of electricity customers, using the knowledge discovery in databases (KDD) procedure, a data mining approach, to determine load profiles for different types of customers. Current load profiling methods are compared by analysing and evaluating the selected data mining classification techniques. The objective of this study is to determine the best load profiling methods and data mining techniques for classifying, detecting and predicting non-technical losses in the distribution sector due to faulty metering and billing errors, as well as to gather knowledge on customer behaviour and preferences so as to gain a competitive advantage in the deregulated market. This paper focuses mainly on the comparative analysis of the selected classification techniques; a forthcoming paper will focus on the detection and prediction methods.
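As one example of the kind of data mining step involved in load profiling, the sketch below clusters simulated daily load curves with k-means; this is a generic illustration under assumed data shapes, not necessarily one of the specific techniques compared in the paper.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical half-hourly load data: one row per customer-day (48 readings, kW).
rng = np.random.default_rng(3)
hours = np.linspace(0, 24, 48, endpoint=False)
residential = 1.0 + 0.8 * np.exp(-((hours - 19) ** 2) / 8)   # evening peak
commercial = 0.5 + 1.2 * ((hours > 8) & (hours < 18))        # working-hours plateau
profiles = np.vstack([
    residential + rng.normal(0, 0.1, (200, 48)),
    commercial + rng.normal(0, 0.1, (200, 48)),
])

# Normalise each curve by its daily energy so the shape, not the level, is clustered.
shapes = profiles / profiles.sum(axis=1, keepdims=True)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(shapes)
print("cluster sizes:", np.bincount(labels))
```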
Abstract:
The increasing demand for high-capacity data storage requires decreasing the head-to-tape gap and reducing the track width. A problem very often encountered is the development of adhesive debris on the heads at low humidity and high temperatures, which can increase the spacing between head and media and thus decrease the playback signal. The influence of stains on the playback signal of the read heads is studied using RAW (Read After Write) tests, and their influence on head wear is studied using an indentation technique. The playback signal has been found to vary, and the errors to increase, as stains form a patchy pattern and grow in size to form a continuous layer. The indentation technique shows that stains reduce the wear rate of the heads. In addition, the wear tends to be more pronounced at the leading edge of the head compared to the trailing one. Chemical analysis of the stains using ferrite samples in conjunction with MP (metal particulate) tapes shows that stains contain iron particles and polymeric binder transferred from the MP tape. The chemical anchors in the binder used to grip the iron particles react with the ferrite surface to create strong chemical bonds. At high humidity, a thin layer of iron oxyhydroxide forms on the surface of the ferrite. This soft material increases the wear rate and so reduces the amount of stain present on the heads. The stability of the binder under high humidity and high temperature, as well as the chemical reactions that might occur on the ferrite poles of the heads, influences the dynamic behaviour of stains. A model of stain formation is proposed that takes into account the channels of binder degradation and evolution under different environmental conditions.
Abstract:
There is a growing demand for data transmission over digital networks involving mobile terminals. An important class of data required for transmission over mobile terminals is image information such as street maps, floor plans and identikit images. This sort of transmission is of particular interest to service organisations such as the police force, fire brigade and medical services. Such images cannot be transmitted directly to mobile terminals because of the limited capacity of the mobile channels and the transmission errors caused by multipath (Rayleigh) fading. In this research, the transmission of line diagram images such as floor plans and street maps over digital networks involving mobile terminals, at transmission rates of 2400 bits/s and 4800 bits/s, has been studied. A low bit-rate source encoding technique using geometric codes is found to be suitable for representing line diagram images. In geometric encoding, the amount of data required to represent or store a line diagram image is proportional to the image detail; thus a simple line diagram image requires only a small amount of data. To study the effect of transmission errors due to mobile channels on the transmitted images, error sources (error files) representing mobile channels under different conditions have been produced using channel modelling techniques. Satisfactory models of the mobile channel have been obtained when compared with field test measurements. Subjective performance tests have been carried out to evaluate the quality and usefulness of the received line diagram images under various mobile channel conditions, and the effect of mobile transmission errors on the quality of the received images has been determined. To improve the quality of the received images under various mobile channel conditions, forward error correcting (FEC) codes with interleaving and automatic repeat request (ARQ) schemes have been proposed. The performance of these error control codes has been evaluated under various mobile channel conditions. It has been shown that an FEC code with interleaving can be used effectively to improve the quality of the received images under both normal and severe mobile channel conditions. Under normal channel conditions, similar results have been obtained when using ARQ schemes; however, under severe mobile channel conditions, the FEC code with interleaving shows better performance.
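The benefit of combining FEC with interleaving comes from spreading a burst of channel errors across many codewords. The sketch below shows a generic block interleaver doing exactly that on a simulated error burst; the block dimensions and burst length are arbitrary assumptions, and the FEC decoder itself is not implemented.

```python
import numpy as np

def interleave(symbols, rows, cols):
    """Block interleaver: write row by row, read column by column."""
    return np.asarray(symbols).reshape(rows, cols).T.reshape(-1)

def deinterleave(symbols, rows, cols):
    """Inverse of interleave."""
    return np.asarray(symbols).reshape(cols, rows).T.reshape(-1)

rows, cols = 8, 16                        # 8 codewords of 16 symbols each (assumed)
block = np.arange(rows * cols)            # stand-in for one coded block
tx = interleave(block, rows, cols)

# A burst of 8 consecutive channel errors, as produced by a Rayleigh fade.
hit_in_channel = np.zeros(rows * cols, dtype=bool)
hit_in_channel[40:48] = True

# After de-interleaving, the burst is spread out, so each codeword sees only a
# small number of errors, which a modest FEC code can correct.
hit_after_deinterleave = deinterleave(hit_in_channel, rows, cols)
errors_per_codeword = hit_after_deinterleave.reshape(rows, cols).sum(axis=1)
print("errors per codeword after de-interleaving:", errors_per_codeword)
```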
Abstract:
Satellite-borne scatterometers are used to measure backscattered microwave radiation from the ocean surface. These data may be used to infer surface wind vectors where no direct measurements exist. Inherent in these data are outliers owing to aberrations on the water surface and measurement errors within the equipment. We present two techniques for identifying outliers using neural networks; the outliers may then be removed to improve models derived from the data. First, the generative topographic mapping (GTM) is used to create a probability density model; data with low probability under the model may be classed as outliers. In the second part of the paper, a sensor model with input-dependent noise is used and outliers are identified based on their probability under this model. GTM was successfully modified to incorporate prior knowledge of the shape of the observation manifold; however, GTM could not learn the double-skinned nature of the observation manifold. Learning this double-skinned manifold necessitated the use of a sensor model which imposes strong constraints on the mapping. The results using GTM with a fixed noise level suggested that the noise level may vary as a function of wind speed. This was confirmed by experiments using a sensor model with input-dependent noise, where the variation in noise is most sensitive to the wind speed input. Both models successfully identified gross outliers, with the largest differences between models occurring at low wind speeds. © 2003 Elsevier Science Ltd. All rights reserved.
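The outlier-detection idea, flagging points with low probability under a learned density model, can be sketched with any density estimator. The example below uses a Gaussian mixture from scikit-learn as a stand-in for the GTM, on simulated data; the feature dimensions, mixture size and threshold are assumptions, not the paper's configuration.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Hypothetical scatterometer-like feature vectors (e.g. backscatter measurements),
# with a handful of gross outliers mixed in.
rng = np.random.default_rng(4)
inliers = rng.normal(0.0, 1.0, size=(1000, 3))
outliers = rng.uniform(-8.0, 8.0, size=(10, 3))
X = np.vstack([inliers, outliers])

# The density model stands in for the GTM of the paper: points with low
# probability under the fitted density are flagged as candidate outliers.
gmm = GaussianMixture(n_components=3, random_state=0).fit(X)
log_density = gmm.score_samples(X)
threshold = np.quantile(log_density, 0.01)       # flag the least probable 1%
flagged = log_density < threshold
print("flagged points:", np.where(flagged)[0])
```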
Abstract:
Recently, within the machine learning and spatial statistics communities, many papers have explored the potential of reduced-rank representations of the covariance matrix, often referred to as projected or fixed-rank approaches. In such methods the covariance function of the posterior process is represented by a reduced-rank approximation chosen such that there is minimal information loss. In this paper a sequential framework for inference in such projected processes is presented, where the observations are considered one at a time. We introduce a C++ library for carrying out such projected, sequential estimation which adds several novel features. In particular we have incorporated the ability to use a generic observation operator, or sensor model, to permit data fusion. We can also cope with a range of observation error characteristics, including non-Gaussian observation errors. Inference for the variogram parameters is based on maximum likelihood estimation. We illustrate the projected sequential method in application to synthetic and real data sets. We discuss the software implementation and suggest possible future extensions.
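Sequential, one-observation-at-a-time estimation with a generic linear observation operator can be illustrated with a standard Gaussian update on a reduced-rank coefficient vector. The sketch below is that generic update, not the C++ library described in the paper, and it assumes Gaussian observation errors rather than the more general error models the library supports; all dimensions and values are placeholders.

```python
import numpy as np

def sequential_update(mean, cov, H, y, obs_var):
    """One Bayesian update of a (reduced-rank) Gaussian state given a single
    observation y with linear observation operator H and noise variance obs_var."""
    H = np.atleast_2d(H)
    S = H @ cov @ H.T + obs_var          # innovation variance
    K = cov @ H.T / S                    # gain
    mean = mean + (K * (y - H @ mean)).ravel()
    cov = cov - K @ H @ cov
    return mean, cov

# Hypothetical reduced-rank representation: 5 basis coefficients.
rng = np.random.default_rng(5)
mean, cov = np.zeros(5), np.eye(5)
for _ in range(20):                      # observations arrive one at a time
    H = rng.normal(size=(1, 5))          # generic observation operator / sensor model
    y = H @ rng.normal(size=5) + rng.normal(0, 0.1)
    mean, cov = sequential_update(mean, cov, H, y, obs_var=0.1 ** 2)
print("posterior mean:", mean.round(2))
```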
Abstract:
Distributed Brillouin sensing of strain and temperature works by making spatially resolved measurements of the position of the measurand-dependent extremum of the resonance curve associated with the scattering process in the weakly nonlinear regime. Typically, measurements of backscattered Stokes intensity (the dependent variable) are made at a number of predetermined fixed frequencies covering the design measurand range of the apparatus and combined to yield an estimate of the position of the extremum. The measurand can then be found because its relationship to the position of the extremum is assumed known. We present analytical expressions relating the relative error in the extremum position to experimental errors in the dependent variable. This is done for two cases: (i) a simple non-parametric estimate of the mean based on moments and (ii) the case in which a least squares technique is used to fit a Lorentzian to the data. The question of statistical bias in the estimates is discussed and in the second case we go further and present for the first time a general method by which the probability density function (PDF) of errors in the fitted parameters can be obtained in closed form in terms of the PDFs of the errors in the noisy data.
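For case (ii), the least-squares step can be sketched directly with a standard curve-fitting routine: fit a Lorentzian to noisy intensity samples taken at fixed frequencies and read off the fitted extremum position. The frequencies, linewidth and noise level below are placeholders typical of Brillouin measurements, not values from the paper.

```python
import numpy as np
from scipy.optimize import curve_fit

def lorentzian(f, peak, f0, width):
    """Lorentzian resonance curve: peak intensity, centre frequency f0, half-width."""
    return peak / (1.0 + ((f - f0) / width) ** 2)

# Hypothetical Brillouin gain measurements at predetermined fixed frequencies.
rng = np.random.default_rng(6)
freq = np.linspace(10.6, 11.0, 41)               # GHz
truth = lorentzian(freq, peak=1.0, f0=10.82, width=0.03)
intensity = truth + rng.normal(0.0, 0.02, freq.size)

popt, pcov = curve_fit(lorentzian, freq, intensity, p0=(1.0, 10.8, 0.05))
f0_err = np.sqrt(pcov[1, 1])
print(f"fitted extremum position: {popt[1]:.4f} GHz (std. error {f0_err:.4f} GHz)")
```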
Abstract:
Objective - To review and summarise published data on medication errors in older people with mental health problems. Methods - A systematic review was conducted to identify studies that investigated medication errors in older people with mental health problems. MEDLINE, EMBASE, PHARMLINE, COCHRANE COLLABORATION and PsycINFO were searched electronically. Any studies identified were scrutinised for further references. The title, abstract or full text was systematically reviewed for relevance. Results - Data were extracted from eight studies. In total, information about 728 errors (459 administration, 248 prescribing, 7 dispensing, 12 transcribing, 2 unclassified) was available. The dataset related almost exclusively to inpatients, frequently involved non-psychotropics, and the majority of the errors were not serious. Conclusions - Due to methodological issues it was impossible to calculate overall error rates. Future research should concentrate on serious errors within community settings, and clarify potential risk factors.
Abstract:
We investigate two numerical procedures for the Cauchy problem in linear elasticity, involving the relaxation of either the given boundary displacements (Dirichlet data) or the prescribed boundary tractions (Neumann data) on the over-specified boundary, in the alternating iterative algorithm of Kozlov et al. (1991). The two mixed direct (well-posed) problems associated with each iteration are solved using the method of fundamental solutions (MFS), in conjunction with the Tikhonov regularization method, while the optimal value of the regularization parameter is chosen via the generalized cross-validation (GCV) criterion. An efficient regularizing stopping criterion, which terminates the iterative procedure at the point where the accumulation of noise becomes dominant and the errors in predicting the exact solution increase, is also presented. The MFS-based iterative algorithms with relaxation are tested for Cauchy problems for isotropic linear elastic materials in various geometries to confirm the numerical convergence, stability, accuracy and computational efficiency of the proposed method.
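Tikhonov regularization with a GCV-chosen parameter can be illustrated on a generic ill-conditioned linear system. The sketch below stands in for the MFS collocation system of the paper and uses the standard SVD filter-factor form of the Tikhonov solution and GCV score; all matrices, the exact solution and the noise level are synthetic assumptions.

```python
import numpy as np

# Generic ill-conditioned linear system A x = b with a noisy right-hand side.
rng = np.random.default_rng(7)
n = 60
U, _ = np.linalg.qr(rng.normal(size=(n, n)))
V, _ = np.linalg.qr(rng.normal(size=(n, n)))
s = np.logspace(0, -8, n)                         # rapidly decaying singular values
A = U @ np.diag(s) @ V.T
x_exact = np.sin(np.linspace(0, np.pi, n))
b = A @ x_exact + rng.normal(0, 1e-4, n)          # noisy data

def gcv(lam):
    """Generalized cross-validation score for Tikhonov parameter lam."""
    filt = s ** 2 / (s ** 2 + lam ** 2)           # Tikhonov filter factors
    residual = (U.T @ b) * (1 - filt)
    return np.sum(residual ** 2) / (n - np.sum(filt)) ** 2

lams = np.logspace(-10, 0, 200)
lam_opt = lams[np.argmin([gcv(lam) for lam in lams])]
x_reg = V @ ((s / (s ** 2 + lam_opt ** 2)) * (U.T @ b))   # regularized solution
rel_err = np.linalg.norm(x_reg - x_exact) / np.linalg.norm(x_exact)
print(f"GCV-optimal lambda: {lam_opt:.2e}, relative error: {rel_err:.3f}")
```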
Abstract:
In the face of global population growth and the uneven distribution of water supply, a better knowledge of the spatial and temporal distribution of surface water resources is critical. Remote sensing provides a synoptic view of ongoing processes, which addresses the intricate nature of water surfaces and allows an assessment of the pressures placed on aquatic ecosystems. However, the main challenge in identifying water surfaces from remotely sensed data is the high variability of spectral signatures, both in space and time. In the last 10 years only a few operational methods have been proposed to map or monitor surface water at continental or global scale, and each of them shows limitations. The objective of this study is to develop and demonstrate the adequacy of a generic multi-temporal and multi-spectral image analysis method to detect water surfaces automatically, and to monitor them in near-real-time. The proposed approach, based on a transformation of the RGB color space into HSV, provides dynamic information at the continental scale. The validation of the algorithm showed very few omission errors and no commission errors, demonstrating its ability to perform as effectively as human interpretation of the images. The validation of the permanent water surface product with an independent dataset derived from high resolution imagery showed an accuracy of 91.5% and few commission errors. Potential applications of the proposed method have been identified and discussed. The methodology that has been developed is generic: it can be applied to sensors with similar bands with good reliability and minimal effort. Moreover, this experiment at continental scale showed that the methodology is efficient for a large range of environmental conditions. Additional preliminary tests over other continents indicate that the proposed methodology could also be applied at the global scale without too many difficulties.
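A minimal sketch of the RGB-to-HSV step followed by a threshold-based water decision is given below; the band-to-RGB mapping, the synthetic scene and the threshold values are placeholders for illustration, not the operational rules of the published algorithm.

```python
import numpy as np
from matplotlib.colors import rgb_to_hsv

# Hypothetical 3-band composite scaled to [0, 1] (e.g. three spectral bands mapped to RGB).
rng = np.random.default_rng(8)
image = rng.random((100, 100, 3))
image[40:60, 20:80] = [0.05, 0.15, 0.30]          # a dark, bluish "water" patch

hsv = rgb_to_hsv(image)
hue, sat, val = hsv[..., 0], hsv[..., 1], hsv[..., 2]

# Illustrative decision rule in HSV space: water pixels are dark and their hue
# falls in the blue range. These thresholds are placeholders, not the
# operational values of the published algorithm.
water = (val < 0.35) & (hue > 0.5) & (hue < 0.7)
print(f"water pixels detected: {water.sum()} ({100 * water.mean():.1f}% of the scene)")
```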