2 resultados para Error-location numbers

em DigitalCommons@The Texas Medical Center


Relevância:

30.00% 30.00%

Publicador:

Resumo:

The purpose of this study is to investigate the effects of predictor variable correlations and patterns of missingness with dichotomous and/or continuous data in small samples when missing data is multiply imputed. Missing data of predictor variables is multiply imputed under three different multivariate models: the multivariate normal model for continuous data, the multinomial model for dichotomous data and the general location model for mixed dichotomous and continuous data. Subsequent to the multiple imputation process, Type I error rates of the regression coefficients obtained with logistic regression analysis are estimated under various conditions of correlation structure, sample size, type of data and patterns of missing data. The distributional properties of average mean, variance and correlations among the predictor variables are assessed after the multiple imputation process. ^ For continuous predictor data under the multivariate normal model, Type I error rates are generally within the nominal values with samples of size n = 100. Smaller samples of size n = 50 resulted in more conservative estimates (i.e., lower than the nominal value). Correlation and variance estimates of the original data are retained after multiple imputation with less than 50% missing continuous predictor data. For dichotomous predictor data under the multinomial model, Type I error rates are generally conservative, which in part is due to the sparseness of the data. The correlation structure for the predictor variables is not well retained on multiply-imputed data from small samples with more than 50% missing data with this model. For mixed continuous and dichotomous predictor data, the results are similar to those found under the multivariate normal model for continuous data and under the multinomial model for dichotomous data. With all data types, a fully-observed variable included with variables subject to missingness in the multiple imputation process and subsequent statistical analysis provided liberal (larger than nominal values) Type I error rates under a specific pattern of missing data. It is suggested that future studies focus on the effects of multiple imputation in multivariate settings with more realistic data characteristics and a variety of multivariate analyses, assessing both Type I error and power. ^

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Research studies on the association between exposures to air contaminants and disease frequently use worn dosimeters to measure the concentration of the contaminant of interest. But investigation of exposure determinants requires additional knowledge beyond concentration, i.e., knowledge about personal activity such as whether the exposure occurred in a building or outdoors. Current studies frequently depend upon manual activity logging to record location. This study's purpose was to evaluate the use of a worn data logger recording three environmental parameters—temperature, humidity, and light intensity—as well as time of day, to determine indoor or outdoor location, with an ultimate aim of eliminating the need to manually log location or at least providing a method to verify such logs. For this study, data collection was limited to a single geographical area (Houston, Texas metropolitan area) during a single season (winter) using a HOBO H8 four-channel data logger. Data for development of a Location Model were collected using the logger for deliberate sampling of programmed activities in outdoor, building, and vehicle locations at various times of day. The Model was developed by analyzing the distributions of environmental parameters by location and time to establish a prioritized set of cut points for assessing locations. The final Model consisted of four "processors" that varied these priorities and cut points. Data to evaluate the Model were collected by wearing the logger during "typical days" while maintaining a location log. The Model was tested by feeding the typical day data into each processor and generating assessed locations for each record. These assessed locations were then compared with true locations recorded in the manual log to determine accurate versus erroneous assessments. The utility of each processor was evaluated by calculating overall error rates across all times of day, and calculating individual error rates by time of day. Unfortunately, the error rates were large, such that there would be no benefit in using the Model. Another analysis in which assessed locations were classified as either indoor (including both building and vehicle) or outdoor yielded slightly lower error rates that still precluded any benefit of the Model's use.^