342 resultados para multiple imputation

em Queensland University of Technology - ePrints Archive


Relevância:

100.00% 100.00%

Publicador:

Resumo:

A new database called the World Resource Table is constructed in this study. Missing values are known to produce complications when constructing global databases. This study provides a solution for applying multiple imputation techniques and estimates the global environmental Kuznets curve (EKC) for CO2, SO2, PM10, and BOD. Policy implications for each type of emission are derived based on the results of the EKC using WRI. Finally, we predicted the future emissions trend and regional share of CO2 emissions. We found that East Asia and South Asia will be increasing their emissions share while other major CO2 emitters will still produce large shares of the total global emissions.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Obtaining attribute values of non-chosen alternatives in a revealed preference context is challenging because non-chosen alternative attributes are unobserved by choosers, chooser perceptions of attribute values may not reflect reality, existing methods for imputing these values suffer from shortcomings, and obtaining non-chosen attribute values is resource intensive. This paper presents a unique Bayesian (multiple) Imputation Multinomial Logit model that imputes unobserved travel times and distances of non-chosen travel modes based on random draws from the conditional posterior distribution of missing values. The calibrated Bayesian (multiple) Imputation Multinomial Logit model imputes non-chosen time and distance values that convincingly replicate observed choice behavior. Although network skims were used for calibration, more realistic data such as supplemental geographically referenced surveys or stated preference data may be preferred. The model is ideally suited for imputing variation in intrazonal non-chosen mode attributes and for assessing the marginal impacts of travel policies, programs, or prices within traffic analysis zones.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Background: Currently used Trauma and Injury Severity Score (TRISS) coefficients, which measure probability of survival (Ps), were derived from the Major Trauma Outcome Study (MTOS) in 1995 and are now unlikely to be optimal. This study aims to estimate new TRISS coefficients using a contemporary database of injured patients presenting to emergency departments in the United States; and to compare these against the MTOS coefficients.---------- Methods: Data were obtained from the National Trauma Data Bank (NTDB) and the NTDB National Sample Project (NSP). TRISS coefficients were estimated using logistic regression. Separate coefficients were derived from complete case and multistage multiple imputation analyses for each NTDB and NSP dataset. Associated Ps over Injury Severity Score values were graphed and compared by age (adult ≥ 15 years; pediatric < 15 years) and injury mechanism (blunt; penetrating) groups. Area under the Receiver Operating Characteristic curves was used to assess coefficients’ predictive performance.---------- Results: Overall 1,072,033 NTDB and 1,278,563 weighted NSP injury events were included, compared with 23,177 used in the original MTOS analyses. Large differences were seen between results from complete case and imputed analyses. For blunt mechanism and adult penetrating mechanism injuries, there were similarities between coefficients estimated on imputed samples, and marked divergences between associated Ps estimated and those from the MTOS. However, negligible differences existed between area under the receiver operating characteristic curves estimates because the overwhelming majority of patients had minor trauma and survived. For pediatric penetrating mechanism injuries, variability in coefficients was large and Ps estimates unreliable.---------- Conclusions: Imputed NTDB coefficients are recommended as the TRISS coefficients 2009 revision for blunt mechanism and adult penetrating mechanism injuries. Coefficients for pediatric penetrating mechanism injuries could not be reliably estimated.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Soil-based emissions of nitrous oxide (N2O), a well-known greenhouse gas, have been associated with changes in soil water-filled pore space (WFPS) and soil temperature in many previous studies. However, it is acknowledged that the environment-N2O relationship is complex and still relatively poorly unknown. In this article, we employed a Bayesian model selection approach (Reversible jump Markov chain Monte Carlo) to develop a data-informed model of the relationship between daily N2O emissions and daily WFPS and soil temperature measurements between March 2007 and February 2009 from a soil under pasture in Queensland, Australia, taking seasonal factors and time-lagged effects into account. The model indicates a very strong relationship between a hybrid seasonal structure and daily N2O emission, with the latter substantially increased in summer. Given the other variables in the model, daily soil WFPS, lagged by a week, had a negative influence on daily N2O; there was evidence of a nonlinear positive relationship between daily soil WFPS and daily N2O emission; and daily soil temperature tended to have a linear positive relationship with daily N2O emission when daily soil temperature was above a threshold of approximately 19°C. We suggest that this flexible Bayesian modeling approach could facilitate greater understanding of the shape of the covariate-N2O flux relation and detection of effect thresholds in the natural temporal variation of environmental variables on N2O emission.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

OBJECTIVE: To evaluate the effectiveness of a telephone-delivered behavioral weight loss and physical activity intervention targeting Australian primary care patients with type 2 diabetes. RESEARCH DESIGN AND METHODS: Pragmatic randomized controlled trial of telephone counseling (n = 151) versus usual care (n = 151). Reported here are 18-month (end-of-intervention) and 24-month (maintenance) primary outcomes of weight, moderate-to-vigorous-intensity physical activity (MVPA; via accelerometer), and HbA1c level. Secondary outcomes include dietary energy intake and diet quality, waist circumference, lipid levels, and blood pressure. Data were analyzed via adjusted linear mixed models with multiple imputation of missing data. RESULTS: Relative to usual-care participants, telephone counseling participants achieved modest, but significant, improvements in weight loss (relative rate [RR] -1.42% of baseline body weight [95% CI -2.54 to -0.30% of baseline body weight]), MVPA (RR 1.42 [95% CI 1.06-1.90]), diet quality (2.72 [95% CI 0.55-4.89]), and waist circumference (-1.84 cm [95% CI -3.16 to -0.51 cm]), but not in HbA1c level (RR 0.99 [95% CI 0.96-1.02]), or other cardio-metabolic markers. None of the outcomes showed a significant change/deterioration over the maintenance period. However, only the intervention effect for MVPA remained statistically significant at 24 months. CONCLUSIONS: The modest improvements in weight loss and behavior change, but the lack of changes in cardio-metabolic markers, may limit the utility, scalability, and sustainability of such an approach.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper studies the missing covariate problem which is often encountered in survival analysis. Three covariate imputation methods are employed in the study, and the effectiveness of each method is evaluated within the hazard prediction framework. Data from a typical engineering asset is used in the case study. Covariate values in some time steps are deliberately discarded to generate an incomplete covariate set. It is found that although the mean imputation method is simpler than others for solving missing covariate problems, the results calculated by it can differ largely from the real values of the missing covariates. This study also shows that in general, results obtained from the regression method are more accurate than those of the mean imputation method but at the cost of a higher computational expensive. Gaussian Mixture Model (GMM) method is found to be the most effective method within these three in terms of both computation efficiency and predication accuracy.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Associations between single nucleotide polymorphisms (SNPs) at 5p15 and multiple cancer types have been reported. We have previously shown evidence for a strong association between prostate cancer (PrCa) risk and rs2242652 at 5p15, intronic in the telomerase reverse transcriptase (TERT) gene that encodes TERT. To comprehensively evaluate the association between genetic variation across this region and PrCa, we performed a fine-mapping analysis by genotyping 134 SNPs using a custom Illumina iSelect array or Sequenom MassArray iPlex, followed by imputation of 1094 SNPs in 22 301 PrCa cases and 22 320 controls in The PRACTICAL consortium. Multiple stepwise logistic regression analysis identified four signals in the promoter or intronic regions of TERT that independently associated with PrCa risk. Gene expression analysis of normal prostate tissue showed evidence that SNPs within one of these regions also associated with TERT expression, providing a potential mechanism for predisposition to disease.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Environmental data usually include measurements, such as water quality data, which fall below detection limits, because of limitations of the instruments or of certain analytical methods used. The fact that some responses are not detected needs to be properly taken into account in statistical analysis of such data. However, it is well-known that it is challenging to analyze a data set with detection limits, and we often have to rely on the traditional parametric methods or simple imputation methods. Distributional assumptions can lead to biased inference and justification of distributions is often not possible when the data are correlated and there is a large proportion of data below detection limits. The extent of bias is usually unknown. To draw valid conclusions and hence provide useful advice for environmental management authorities, it is essential to develop and apply an appropriate statistical methodology. This paper proposes rank-based procedures for analyzing non-normally distributed data collected at different sites over a period of time in the presence of multiple detection limits. To take account of temporal correlations within each site, we propose an optimal linear combination of estimating functions and apply the induced smoothing method to reduce the computational burden. Finally, we apply the proposed method to the water quality data collected at Susquehanna River Basin in United States of America, which dearly demonstrates the advantages of the rank regression models.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Genome-wide association studies (GWAS) have identified numerous common prostate cancer (PrCa) susceptibility loci. We have fine-mapped 64 GWAS regions known at the conclusion of the iCOGS study using large-scale genotyping and imputation in 25 723 PrCa cases and 26 274 controls of European ancestry. We detected evidence for multiple independent signals at 16 regions, 12 of which contained additional newly identified significant associations. A single signal comprising a spectrum of correlated variation was observed at 39 regions; 35 of which are now described by a novel more significantly associated lead SNP, while the originally reported variant remained as the lead SNP only in 4 regions. We also confirmed two association signals in Europeans that had been previously reported only in East-Asian GWAS. Based on statistical evidence and linkage disequilibrium (LD) structure, we have curated and narrowed down the list of the most likely candidate causal variants for each region. Functional annotation using data from ENCODE filtered for PrCa cell lines and eQTL analysis demonstrated significant enrichment for overlap with bio-features within this set. By incorporating the novel risk variants identified here alongside the refined data for existing association signals, we estimate that these loci now explain ∼38.9% of the familial relative risk of PrCa, an 8.9% improvement over the previously reported GWAS tag SNPs. This suggests that a significant fraction of the heritability of PrCa may have been hidden during the discovery phase of GWAS, in particular due to the presence of multiple independent signals within the same region.