993 resultados para multiple imputation


Relevância:

100.00% 100.00%

Publicador:

Resumo:

A new database called the World Resource Table is constructed in this study. Missing values are known to produce complications when constructing global databases. This study provides a solution for applying multiple imputation techniques and estimates the global environmental Kuznets curve (EKC) for CO2, SO2, PM10, and BOD. Policy implications for each type of emission are derived based on the results of the EKC using WRI. Finally, we predicted the future emissions trend and regional share of CO2 emissions. We found that East Asia and South Asia will be increasing their emissions share while other major CO2 emitters will still produce large shares of the total global emissions.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Sequence analysis and optimal matching are useful heuristic tools for the descriptive analysis of heterogeneous individual pathways such as educational careers, job sequences or patterns of family formation. However, to date it remains unclear how to handle the inevitable problems caused by missing values with regard to such analysis. Multiple Imputation (MI) offers a possible solution for this problem but it has not been tested in the context of sequence analysis. Against this background, we contribute to the literature by assessing the potential of MI in the context of sequence analyses using an empirical example. Methodologically, we draw upon the work of Brendan Halpin and extend it to additional types of missing value patterns. Our empirical case is a sequence analysis of panel data with substantial attrition that examines the typical patterns and the persistence of sex segregation in school-to-work transitions in Switzerland. The preliminary results indicate that MI is a valuable methodology for handling missing values due to panel mortality in the context of sequence analysis. MI is especially useful in facilitating a sound interpretation of the resulting sequence types.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The purpose of this study is to investigate the effects of predictor variable correlations and patterns of missingness with dichotomous and/or continuous data in small samples when missing data is multiply imputed. Missing data of predictor variables is multiply imputed under three different multivariate models: the multivariate normal model for continuous data, the multinomial model for dichotomous data and the general location model for mixed dichotomous and continuous data. Subsequent to the multiple imputation process, Type I error rates of the regression coefficients obtained with logistic regression analysis are estimated under various conditions of correlation structure, sample size, type of data and patterns of missing data. The distributional properties of average mean, variance and correlations among the predictor variables are assessed after the multiple imputation process. ^ For continuous predictor data under the multivariate normal model, Type I error rates are generally within the nominal values with samples of size n = 100. Smaller samples of size n = 50 resulted in more conservative estimates (i.e., lower than the nominal value). Correlation and variance estimates of the original data are retained after multiple imputation with less than 50% missing continuous predictor data. For dichotomous predictor data under the multinomial model, Type I error rates are generally conservative, which in part is due to the sparseness of the data. The correlation structure for the predictor variables is not well retained on multiply-imputed data from small samples with more than 50% missing data with this model. For mixed continuous and dichotomous predictor data, the results are similar to those found under the multivariate normal model for continuous data and under the multinomial model for dichotomous data. With all data types, a fully-observed variable included with variables subject to missingness in the multiple imputation process and subsequent statistical analysis provided liberal (larger than nominal values) Type I error rates under a specific pattern of missing data. It is suggested that future studies focus on the effects of multiple imputation in multivariate settings with more realistic data characteristics and a variety of multivariate analyses, assessing both Type I error and power. ^

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Objective: In this secondary data analysis, three statistical methodologies were implemented to handle cases with missing data in a motivational interviewing and feedback study. The aim was to evaluate the impact that these methodologies have on the data analysis. ^ Methods: We first evaluated whether the assumption of missing completely at random held for this study. We then proceeded to conduct a secondary data analysis using a mixed linear model to handle missing data with three methodologies (a) complete case analysis, (b) multiple imputation with explicit model containing outcome variables, time, and the interaction of time and treatment, and (c) multiple imputation with explicit model containing outcome variables, time, the interaction of time and treatment, and additional covariates (e.g., age, gender, smoke, years in school, marital status, housing, race/ethnicity, and if participants play on athletic team). Several comparisons were conducted including the following ones: 1) the motivation interviewing with feedback group (MIF) vs. the assessment only group (AO), the motivation interviewing group (MIO) vs. AO, and the intervention of the feedback only group (FBO) vs. AO, 2) MIF vs. FBO, and 3) MIF vs. MIO.^ Results: We first evaluated the patterns of missingness in this study, which indicated that about 13% of participants showed monotone missing patterns, and about 3.5% showed non-monotone missing patterns. Then we evaluated the assumption of missing completely at random by Little's missing completely at random (MCAR) test, in which the Chi-Square test statistic was 167.8 with 125 degrees of freedom, and its associated p-value was p=0.006, which indicated that the data could not be assumed to be missing completely at random. After that, we compared if the three different strategies reached the same results. For the comparison between MIF and AO as well as the comparison between MIF and FBO, only the multiple imputation with additional covariates by uncongenial and congenial models reached different results. For the comparison between MIF and MIO, all the methodologies for handling missing values obtained different results. ^ Discussions: The study indicated that, first, missingness was crucial in this study. Second, to understand the assumptions of the model was important since we could not identify if the data were missing at random or missing not at random. Therefore, future researches should focus on exploring more sensitivity analyses under missing not at random assumption.^

Relevância:

100.00% 100.00%

Publicador:

Resumo:

National Highway Traffic Safety Administration, Washington, D.C.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In large epidemiological studies missing data can be a problem, especially if information is sought on a sensitive topic or when a composite measure is calculated from several variables each affected by missing values. Multiple imputation is the method of choice for 'filling in' missing data based on associations among variables. Using an example about body mass index from the Australian Longitudinal Study on Women's Health, we identify a subset of variables that are particularly useful for imputing values for the target variables. Then we illustrate two uses of multiple imputation. The first is to examine and correct for bias when data are not missing completely at random. The second is to impute missing values for an important covariate; in this case omission from the imputation process of variables to be used in the analysis may introduce bias. We conclude with several recommendations for handling issues of missing data. Copyright (C) 2004 John Wiley Sons, Ltd.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Objectives: To estimate differences in self-rated health by mode of administration and to assess the value of multiple imputation to make self-rated health comparable for telephone and mail. Methods: In 1996, Survey 1 of the Australian Longitudinal Study on Women's Health was answered by mail. In 1998, 706 and 11,595 mid-age women answered Survey 2 by telephone and mail respectively. Self-rated health was measured by the physical and mental health scores of the SF-36. Mean change in SF-36 scores between Surveys 1 and 2 were compared for telephone and mail respondents to Survey 2, before and after adjustment for socio-demographic and health characteristics. Missing values and SF-36 scores for telephone respondents at Survey 2 were imputed from SF-36 mail responses and telephone and mail responses to socio-demographic and health questions. Results: At Survey 2, self-rated health improved for telephone respondents but not mail respondents. After adjustment, mean changes in physical health and mental health scores remained higher (0.4 and 1.6 respectively) for telephone respondents compared with mail respondents (-1.2 and 0.1 respectively). Multiple imputation yielded adjusted changes in SF-36 scores that were similar for telephone and mail respondents. Conclusions and Implications: The effect of mode of administration on the change in mental health is important given that a difference of two points in SF-36 scores is accepted as clinically meaningful. Health evaluators should be aware of and adjust for the effects of mode of administration on self-rated health. Multiple imputation is one method that may be used to adjust SF-36 scores for mode of administration bias.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

ABSTRACT Researchers frequently have to analyze scales in which some participants have failed to respond to some items. In this paper we focus on the exploratory factor analysis of multidimensional scales (i.e., scales that consist of a number of subscales) where each subscale is made up of a number of Likert-type items, and the aim of the analysis is to estimate participants' scores on the corresponding latent traits. We propose a new approach to deal with missing responses in such a situation that is based on (1) multiple imputation of non-responses and (2) simultaneous rotation of the imputed datasets. We applied the approach in a real dataset where missing responses were artificially introduced following a real pattern of non-responses, and a simulation study based on artificial datasets. The results show that our approach (specifically, Hot-Deck multiple imputation followed of Consensus Promin rotation) was able to successfully compute factor score estimates even for participants that have missing data.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Obtaining attribute values of non-chosen alternatives in a revealed preference context is challenging because non-chosen alternative attributes are unobserved by choosers, chooser perceptions of attribute values may not reflect reality, existing methods for imputing these values suffer from shortcomings, and obtaining non-chosen attribute values is resource intensive. This paper presents a unique Bayesian (multiple) Imputation Multinomial Logit model that imputes unobserved travel times and distances of non-chosen travel modes based on random draws from the conditional posterior distribution of missing values. The calibrated Bayesian (multiple) Imputation Multinomial Logit model imputes non-chosen time and distance values that convincingly replicate observed choice behavior. Although network skims were used for calibration, more realistic data such as supplemental geographically referenced surveys or stated preference data may be preferred. The model is ideally suited for imputing variation in intrazonal non-chosen mode attributes and for assessing the marginal impacts of travel policies, programs, or prices within traffic analysis zones.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Attrition in longitudinal studies can lead to biased results. The study is motivated by the unexpected observation that alcohol consumption decreased despite increased availability, which may be due to sample attrition of heavy drinkers. Several imputation methods have been proposed, but rarely compared in longitudinal studies of alcohol consumption. The imputation of consumption level measurements is computationally particularly challenging due to alcohol consumption being a semi-continuous variable (dichotomous drinking status and continuous volume among drinkers), and the non-normality of data in the continuous part. Data come from a longitudinal study in Denmark with four waves (2003-2006) and 1771 individuals at baseline. Five techniques for missing data are compared: Last value carried forward (LVCF) was used as a single, and Hotdeck, Heckman modelling, multivariate imputation by chained equations (MICE), and a Bayesian approach as multiple imputation methods. Predictive mean matching was used to account for non-normality, where instead of imputing regression estimates, "real" observed values from similar cases are imputed. Methods were also compared by means of a simulated dataset. The simulation showed that the Bayesian approach yielded the most unbiased estimates for imputation. The finding of no increase in consumption levels despite a higher availability remained unaltered. Copyright (C) 2011 John Wiley & Sons, Ltd.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Les logiciels utilisés sont Splus et R.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Background: Currently used Trauma and Injury Severity Score (TRISS) coefficients, which measure probability of survival (Ps), were derived from the Major Trauma Outcome Study (MTOS) in 1995 and are now unlikely to be optimal. This study aims to estimate new TRISS coefficients using a contemporary database of injured patients presenting to emergency departments in the United States; and to compare these against the MTOS coefficients.---------- Methods: Data were obtained from the National Trauma Data Bank (NTDB) and the NTDB National Sample Project (NSP). TRISS coefficients were estimated using logistic regression. Separate coefficients were derived from complete case and multistage multiple imputation analyses for each NTDB and NSP dataset. Associated Ps over Injury Severity Score values were graphed and compared by age (adult ≥ 15 years; pediatric < 15 years) and injury mechanism (blunt; penetrating) groups. Area under the Receiver Operating Characteristic curves was used to assess coefficients’ predictive performance.---------- Results: Overall 1,072,033 NTDB and 1,278,563 weighted NSP injury events were included, compared with 23,177 used in the original MTOS analyses. Large differences were seen between results from complete case and imputed analyses. For blunt mechanism and adult penetrating mechanism injuries, there were similarities between coefficients estimated on imputed samples, and marked divergences between associated Ps estimated and those from the MTOS. However, negligible differences existed between area under the receiver operating characteristic curves estimates because the overwhelming majority of patients had minor trauma and survived. For pediatric penetrating mechanism injuries, variability in coefficients was large and Ps estimates unreliable.---------- Conclusions: Imputed NTDB coefficients are recommended as the TRISS coefficients 2009 revision for blunt mechanism and adult penetrating mechanism injuries. Coefficients for pediatric penetrating mechanism injuries could not be reliably estimated.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Soil-based emissions of nitrous oxide (N2O), a well-known greenhouse gas, have been associated with changes in soil water-filled pore space (WFPS) and soil temperature in many previous studies. However, it is acknowledged that the environment-N2O relationship is complex and still relatively poorly unknown. In this article, we employed a Bayesian model selection approach (Reversible jump Markov chain Monte Carlo) to develop a data-informed model of the relationship between daily N2O emissions and daily WFPS and soil temperature measurements between March 2007 and February 2009 from a soil under pasture in Queensland, Australia, taking seasonal factors and time-lagged effects into account. The model indicates a very strong relationship between a hybrid seasonal structure and daily N2O emission, with the latter substantially increased in summer. Given the other variables in the model, daily soil WFPS, lagged by a week, had a negative influence on daily N2O; there was evidence of a nonlinear positive relationship between daily soil WFPS and daily N2O emission; and daily soil temperature tended to have a linear positive relationship with daily N2O emission when daily soil temperature was above a threshold of approximately 19°C. We suggest that this flexible Bayesian modeling approach could facilitate greater understanding of the shape of the covariate-N2O flux relation and detection of effect thresholds in the natural temporal variation of environmental variables on N2O emission.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

OBJECTIVE: To evaluate the effectiveness of a telephone-delivered behavioral weight loss and physical activity intervention targeting Australian primary care patients with type 2 diabetes. RESEARCH DESIGN AND METHODS: Pragmatic randomized controlled trial of telephone counseling (n = 151) versus usual care (n = 151). Reported here are 18-month (end-of-intervention) and 24-month (maintenance) primary outcomes of weight, moderate-to-vigorous-intensity physical activity (MVPA; via accelerometer), and HbA1c level. Secondary outcomes include dietary energy intake and diet quality, waist circumference, lipid levels, and blood pressure. Data were analyzed via adjusted linear mixed models with multiple imputation of missing data. RESULTS: Relative to usual-care participants, telephone counseling participants achieved modest, but significant, improvements in weight loss (relative rate [RR] -1.42% of baseline body weight [95% CI -2.54 to -0.30% of baseline body weight]), MVPA (RR 1.42 [95% CI 1.06-1.90]), diet quality (2.72 [95% CI 0.55-4.89]), and waist circumference (-1.84 cm [95% CI -3.16 to -0.51 cm]), but not in HbA1c level (RR 0.99 [95% CI 0.96-1.02]), or other cardio-metabolic markers. None of the outcomes showed a significant change/deterioration over the maintenance period. However, only the intervention effect for MVPA remained statistically significant at 24 months. CONCLUSIONS: The modest improvements in weight loss and behavior change, but the lack of changes in cardio-metabolic markers, may limit the utility, scalability, and sustainability of such an approach.