994 results for imputation method
Abstract:
Imputation is commonly used to compensate for item non-response in sample surveys. If we treat the imputed values as if they are true values, and then compute the variance estimates by using standard methods, such as the jackknife, we can seriously underestimate the true variances. We propose a modified jackknife variance estimator which is defined for any without-replacement unequal probability sampling design in the presence of imputation and non-negligible sampling fraction. Mean, ratio and random-imputation methods will be considered. The practical advantage of the method proposed is its breadth of applicability.
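The underestimation described above can be illustrated with a small sketch. It is not the modified estimator proposed in the abstract: it assumes an equal-probability sample, ignores the sampling fraction, and uses mean imputation only. The function names and the simulated data are hypothetical. It compares a naive delete-one jackknife, which treats imputed values as true values, with a jackknife that redoes the imputation in every replicate.

```python
import random


def jackknife_variance(replicates):
    """Delete-one jackknife variance from a list of replicate estimates."""
    n = len(replicates)
    center = sum(replicates) / n
    return (n - 1) / n * sum((t - center) ** 2 for t in replicates)


def mean_impute(values):
    """Replace each missing value (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = sum(observed) / len(observed)
    return [fill if v is None else v for v in values]


def naive_jackknife(values):
    """Impute once, then jackknife the completed data as if it were fully observed."""
    completed = mean_impute(values)
    n = len(completed)
    total = sum(completed)
    replicates = [(total - y) / (n - 1) for y in completed]
    return jackknife_variance(replicates)


def reimputed_jackknife(values):
    """Drop one unit at a time and redo the mean imputation inside each replicate."""
    replicates = []
    for i in range(len(values)):
        rest = values[:i] + values[i + 1:]
        completed = mean_impute(rest)
        replicates.append(sum(completed) / len(completed))
    return jackknife_variance(replicates)


# Hypothetical data: 40 sampled units, every fourth item missing (25% non-response).
random.seed(0)
sample = [random.gauss(50, 10) for _ in range(40)]
for i in range(0, 40, 4):
    sample[i] = None

print("naive:     ", naive_jackknife(sample))
print("re-imputed:", reimputed_jackknife(sample))
```

Under these assumptions the naive estimate is always the smaller of the two whenever some items are missing, which is the effect the abstract warns about.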
Abstract:
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)
Abstract:
To help cattle producers transition from microsatellite (MS) to single nucleotide polymorphism (SNP) genotyping for parental verification, we previously devised an effective and inexpensive method to impute MS alleles from SNP haplotypes. While the reported method was verified with only a limited data set (N = 479) from Brown Swiss, Guernsey, Holstein, and Jersey cattle, some of the MS-SNP haplotype associations were concordant across these phylogenetically diverse breeds. This implied that some haplotypes predate modern breed formation and remain in strong linkage disequilibrium. To expand the utility of MS allele imputation across breeds, MS and SNP data from more than 8000 animals representing 39 breeds (Bos taurus and B. indicus) were used to predict 9410 SNP haplotypes, incorporating an average of 73 SNPs per haplotype, for which alleles from 12 MS markers could be accurately imputed. Approximately 25% of the MS-SNP haplotypes were present in multiple breeds (N = 2 to 36 breeds). These shared haplotypes allowed for MS imputation in breeds that were not represented in the reference population with only a small increase in Mendelian inheritance inconsistencies. Our reported reference haplotypes can be used for any cattle breed and the reported methods can be applied to any species to aid the transition from MS to SNP genetic markers. While ~91% of the animals with imputed alleles for 12 MS markers had ≤1 Mendelian inheritance conflicts with their parents' reported MS genotypes, this figure was 96% for our reference animals, indicating potential errors in the reported MS genotypes. The workflow we suggest autocorrects for genotyping errors and rare haplotypes by MS genotyping animals whose imputed MS alleles fail parentage verification and then incorporating those animals into the reference dataset.
Abstract:
We propose a new method for fitting proportional hazards models with error-prone covariates. Regression coefficients are estimated by solving an estimating equation that is the average of the partial likelihood scores based on imputed true covariates. For the purpose of imputation, a linear spline model is assumed on the baseline hazard. We discuss consistency and asymptotic normality of the resulting estimators, and propose a stochastic approximation scheme to obtain the estimates. The algorithm is easy to implement, and reduces to the ordinary Cox partial likelihood approach when the measurement error has a degenerate distribution. Simulations indicate high efficiency and robustness. We consider the special case where error-prone replicates are available on the unobserved true covariates. As expected, increasing the number of replicates for the unobserved covariates increases efficiency and reduces bias. We illustrate the practical utility of the proposed method with an Eastern Cooperative Oncology Group clinical trial where a genetic marker, c-myc expression level, is subject to measurement error.
Abstract:
The fuzzy min–max neural network classifier is a supervised learning method. This classifier takes a hybrid neural network and fuzzy systems approach. All input variables in the network are required to correspond to continuously valued variables, and this can be a significant constraint in many real-world situations where there are not only quantitative but also categorical data. The usual way of dealing with this type of variable is to replace the categorical values with numerical ones and treat them as if they were continuously valued. However, this method implicitly defines a possibly unsuitable metric for the categories. A number of different procedures have been proposed to tackle the problem. In this article, we present a new method. The procedure extends the fuzzy min–max neural network input to categorical variables by introducing new fuzzy sets, a new operation, and a new architecture. This provides for greater flexibility and wider application. The proposed method is then applied to missing data imputation in voting intention polls. The micro data of this type of poll (the set of the respondents’ individual answers to the questions) are especially suited for evaluating the method since they include a large number of numerical and categorical attributes.
Abstract:
There are many situations where input feature vectors are incomplete and methods to tackle the problem have been studied for a long time. A commonly used procedure is to replace each missing value with an imputation. This paper presents a method to perform categorical missing data imputation from numerical and categorical variables. The imputations are based on Simpson’s fuzzy min-max neural networks where the input variables for learning and classification are just numerical. The proposed method extends the input to categorical variables by introducing new fuzzy sets, a new operation and a new architecture. The procedure is tested and compared with others using opinion poll data.
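For orientation, the hyperbox membership function of Simpson's original fuzzy min-max classifier, which handles numerical inputs only, can be sketched as follows. This is a minimal sketch of the numerical case as it is commonly stated; the categorical extension proposed in the abstract is not specified there and is not attempted here. The function name, the sensitivity value `gamma=4.0`, and the example hyperbox are assumptions for illustration.

```python
def hyperbox_membership(point, v_min, w_max, gamma=4.0):
    """Membership of `point` in the hyperbox with min corner v_min and max corner w_max.

    All coordinates are assumed to be scaled to [0, 1]. `gamma` controls how fast
    membership decays outside the box (the value 4.0 is an illustrative assumption).
    """
    n = len(point)
    total = 0.0
    for a, v, w in zip(point, v_min, w_max):
        # Penalty for exceeding the max corner in this dimension.
        above = max(0.0, 1.0 - max(0.0, gamma * min(1.0, a - w)))
        # Penalty for falling below the min corner in this dimension.
        below = max(0.0, 1.0 - max(0.0, gamma * min(1.0, v - a)))
        total += above + below
    return total / (2 * n)


# Hypothetical hyperbox covering [0.2, 0.4] in both dimensions.
v_min, w_max = (0.2, 0.2), (0.4, 0.4)
print(hyperbox_membership((0.3, 0.3), v_min, w_max))  # inside the box
print(hyperbox_membership((0.5, 0.3), v_min, w_max))  # slightly outside
```

A point inside the hyperbox has full membership, and membership falls off linearly with distance from the box at a rate set by `gamma`.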
Abstract:
In large epidemiological studies missing data can be a problem, especially if information is sought on a sensitive topic or when a composite measure is calculated from several variables each affected by missing values. Multiple imputation is the method of choice for 'filling in' missing data based on associations among variables. Using an example about body mass index from the Australian Longitudinal Study on Women's Health, we identify a subset of variables that are particularly useful for imputing values for the target variables. Then we illustrate two uses of multiple imputation. The first is to examine and correct for bias when data are not missing completely at random. The second is to impute missing values for an important covariate; in this case omission from the imputation process of variables to be used in the analysis may introduce bias. We conclude with several recommendations for handling issues of missing data. Copyright (C) 2004 John Wiley & Sons, Ltd.
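Multiple imputation produces several completed datasets, and the analysis results from each are then combined. A minimal sketch of the standard combining step (Rubin's rules) is given below; the abstract does not describe its pooling procedure, so this is a general illustration rather than the authors' analysis. The function name and the example numbers are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine per-imputation estimates and their variances using Rubin's rules.

    estimates: point estimates, one per imputed dataset (at least two).
    variances: squared standard errors of those estimates.
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    pooled = mean(estimates)           # average of the m point estimates
    within = mean(variances)           # average within-imputation variance
    between = variance(estimates)      # sample variance across imputations
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical results from m = 3 imputed datasets.
estimate, total_var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(estimate, total_var)
```

The between-imputation term is what carries the extra uncertainty due to the missing data; analysing a single imputed dataset would omit it.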
Abstract:
Objectives: To estimate differences in self-rated health by mode of administration and to assess the value of multiple imputation to make self-rated health comparable for telephone and mail. Methods: In 1996, Survey 1 of the Australian Longitudinal Study on Women's Health was answered by mail. In 1998, 706 and 11,595 mid-age women answered Survey 2 by telephone and mail respectively. Self-rated health was measured by the physical and mental health scores of the SF-36. Mean changes in SF-36 scores between Surveys 1 and 2 were compared for telephone and mail respondents to Survey 2, before and after adjustment for socio-demographic and health characteristics. Missing values and SF-36 scores for telephone respondents at Survey 2 were imputed from SF-36 mail responses and telephone and mail responses to socio-demographic and health questions. Results: At Survey 2, self-rated health improved for telephone respondents but not mail respondents. After adjustment, mean changes in physical health and mental health scores remained higher (0.4 and 1.6 respectively) for telephone respondents compared with mail respondents (-1.2 and 0.1 respectively). Multiple imputation yielded adjusted changes in SF-36 scores that were similar for telephone and mail respondents. Conclusions and Implications: The effect of mode of administration on the change in mental health is important given that a difference of two points in SF-36 scores is accepted as clinically meaningful. Health evaluators should be aware of and adjust for the effects of mode of administration on self-rated health. Multiple imputation is one method that may be used to adjust SF-36 scores for mode of administration bias.
Abstract:
We have undertaken two-dimensional gel electrophoresis proteomic profiling on a series of cell lines with different recombinant antibody production rates. Due to the nature of gel-based experiments not all protein spots are detected across all samples in an experiment, and hence datasets are invariably incomplete. New approaches are therefore required for the analysis of such graduated datasets. We approached this problem in two ways. Firstly, we applied a missing value imputation technique to calculate missing data points. Secondly, we combined a singular value decomposition based hierarchical clustering with the expression variability test to identify protein spots whose expression correlates with increased antibody production. The results have shown that while imputation of missing data was a useful method to improve the statistical analysis of such data sets, this was of limited use in differentiating between the samples investigated, and highlighted a small number of candidate proteins for further investigation. (c) 2006 Elsevier B.V. All rights reserved.
Abstract:
The present paper describes a novel, simple and reliable differential pulse voltammetric method for determining amitriptyline (AMT) in pharmaceutical formulations. Many authors have described this antidepressant as electrochemically inactive at carbon electrodes. However, the procedure proposed herein consisted of electrochemically oxidizing AMT at an unmodified carbon nanotube paste electrode in the presence of 0.1 mol L(-1) sulfuric acid used as electrolyte. At this concentration, the acid facilitated the AMT electro-oxidation through one-electron transfer at 1.33 V vs. Ag/AgCl, as observed by the increase in peak current. Under optimized conditions (modulation time 5 ms, scan rate 90 mV s(-1), and pulse amplitude 120 mV) a linear calibration curve was constructed in the range of 0.0-30.0 μmol L(-1), with a correlation coefficient of 0.9991 and a limit of detection of 1.61 μmol L(-1). The procedure was successfully validated for intra- and inter-day precision and accuracy. Moreover, its feasibility was assessed through analysis of commercial pharmaceutical formulations, and it was compared with the UV-vis spectrophotometric method, the standard analytical technique recommended by the Brazilian Pharmacopoeia.
Abstract:
The present work compared the local injection of mononuclear cells to the spinal cord lateral funiculus with the alternative approach of local delivery with fibrin sealant after ventral root avulsion (VRA) and reimplantation. To this end, adult female Lewis rats were divided into the following groups: avulsion only; reimplantation with fibrin sealant; root repair with fibrin sealant associated with mononuclear cells; and repair with fibrin sealant and injected mononuclear cells. Cell therapy resulted in greater survival of spinal motoneurons up to four weeks post-surgery, especially when mononuclear cells were added to the fibrin glue. Injection of mononuclear cells into the lateral funiculus yielded results similar to those of reimplantation alone. Additionally, mononuclear cells added to the fibrin glue increased neurotrophic factor gene transcript levels in the spinal cord ventral horn. Regarding motor recovery, evaluated by the functional peroneal index as well as the paw print pressure, cell-treated rats performed as well as reimplanted-only animals, and significantly better than the avulsion-only subjects. The results herein demonstrate that mononuclear cell therapy is neuroprotective by increasing levels of brain derived neurotrophic factor (BDNF) and glial derived neurotrophic factor (GDNF). Moreover, delivering mononuclear cells with fibrin sealant gave the best and longest-lasting results.
Abstract:
It is well known that long-term use of shampoo causes damage to human hair. Although the Lowry method has been widely used to quantify hair damage, it is unsuitable for this purpose in the presence of some surfactants, and no other method has been proposed in the literature. In this work, a different method is used to investigate and compare the hair damage induced by four types of surfactants (including three commercial-grade surfactants) and water. Hair samples were immersed in aqueous solutions of surfactants under conditions that resemble a shower (38 °C, constant shaking). These solutions became colored with time of contact with hair, and their UV-vis spectra were recorded. For comparison, the amounts of protein extracted from hair by sodium dodecyl sulfate (SDS) and by water were estimated by the Lowry method. Additionally, non-pigmented vs. pigmented hair and also sepia melanin were used to understand the color of the washing solutions and their spectra. The results presented herein show that hair degradation is mostly caused by the extraction of proteins, cuticle fragments and melanin granules from the hair fiber. It was found that the intensity of solution color varies with the charge density of the surfactants. Furthermore, the intensity of solution color can be correlated with the amount of protein quantified by the Lowry method as well as with the degree of hair damage. The UV-vis spectrum of hair washing solutions offers a simple and straightforward method to quantify and compare hair damage induced by different commercial surfactants.
Abstract:
In this study, transmission-line modeling (TLM) applied to bio-thermal problems was improved by incorporating several novel computational techniques. These include graded meshes, which were 9 times faster in computation time and used only a fraction (16%) of the computational resources required by regular meshes when analyzing heat flow through heterogeneous media. Graded meshes, unlike regular meshes, allow heat sources to be modeled in all segments of the mesh. A new boundary condition that considers thermal properties, and thus results in more realistic modeling of complex problems, is introduced. Also, a new way of calculating an error parameter is introduced. The calculated temperatures between nodes were compared against results obtained from the literature and agreed to within 1%. It is reasonable, therefore, to conclude that the improved TLM model described herein has great potential for modeling heat transfer in biological systems.
Abstract:
It is well known that trichomes protect plant organs, and several studies have investigated their role in the adaptation of plants to harsh environments. Recent studies have shown that the production of hydrophilic substances by glandular trichomes and the deposition of this secretion on young organs may facilitate water retention, thus preventing desiccation and favouring organ growth until the plant develops other protective mechanisms. Lychnophora diamantinana is a species endemic to the Brazilian 'campos rupestres' (rocky fields), a region characterized by intense solar radiation and water deficits. This study sought to investigate trichomes and the origin of the substances observed on the stem apices of L. diamantinana. Samples of stem apices, young and expanded leaves were studied using standard techniques, including light microscopy and scanning and transmission electron microscopy. Histochemical tests were used to identify the major groups of metabolites present in the trichomes and the hyaline material deposited on the apices. Non-glandular trichomes and glandular trichomes were observed. The material deposited on the stem apices was hyaline, highly hydrophilic and viscous. This hyaline material primarily consists of carbohydrates that result from the partial degradation of the cell wall of uniseriate trichomes. This degradation occurs at the same time that glandular trichomes secrete terpenoids, phenolic compounds and proteins. These results suggest that the non-glandular trichomes on the leaves of L. diamantinana help protect the young organ, particularly against desiccation, by deposition of highly hydrated substances on the apices. Furthermore, the secretion of glandular trichomes probably repels herbivore and pathogen attacks.
Abstract:
To determine the most adequate number and size of tissue microarray (TMA) cores for pleomorphic adenoma immunohistochemical studies. Eighty-two pleomorphic adenoma cases were distributed in 3 TMA blocks assembled in triplicate containing 1.0-, 2.0-, and 3.0-mm cores. Immunohistochemical analyses for cytokeratin 7, Ki67, p63, and CD34 were performed and subsequently evaluated with PixelCount, nuclear, and microvessel software applications. The 1.0-mm TMA presented lower results than 2.0- and 3.0-mm TMAs versus conventional whole section slides. Possibly because of an increased amount of stromal tissue, 3.0-mm cores presented a higher microvessel density. Comparing the results obtained with one, two, and three 2.0-mm cores, there was no difference between triplicate or duplicate TMAs and a single-core TMA. Considering the possible loss of cylinders during immunohistochemical reactions, 2.0-mm TMAs in duplicate are a more reliable approach for pleomorphic adenoma immunohistochemical study.