910 results for Missing data
Abstract:
Providing transportation system operators and travelers with accurate travel time information allows them to make more informed decisions, yielding benefits for individual travelers and for the entire transportation system. Most existing advanced traveler information systems (ATIS) and advanced traffic management systems (ATMS) use instantaneous travel time values estimated from current measurements, assuming that traffic conditions remain constant in the near future. For more effective applications, it has been proposed that ATIS and ATMS should use travel times predicted for short-term future conditions rather than instantaneous travel times measured or estimated for current conditions. This dissertation research investigates short-term freeway travel time prediction using Dynamic Neural Networks (DNN) based on traffic detector data collected by radar traffic detectors installed along a freeway corridor. DNN comprises a class of neural networks that are particularly suitable for predicting variables like travel time, but they have not been adequately investigated for this purpose. Before this investigation, it was necessary to identify methods for data imputation to account for the missing data usually encountered when collecting data with traffic detectors. It was also necessary to identify a method to estimate the travel time on the freeway corridor based on data collected using point traffic detectors. A new travel time estimation method, referred to as the Piecewise Constant Acceleration Based (PCAB) method, was developed and compared with other methods reported in the literature. The results show that one of the simple travel time estimation methods (the average speed method) can work as well as the PCAB method, and both outperform other methods. This study also compared the travel time prediction performance of three different DNN topologies with different memory setups.
The results show that one DNN topology (the time-delay neural network) outperforms the other two DNN topologies for the investigated prediction problem. This topology also performs slightly better than the simple multilayer perceptron (MLP) neural network topology that has been used in a number of previous studies for travel time prediction.
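The average speed method mentioned above can be sketched in a few lines: the corridor is divided into segments, each covered by one point detector, and the segment travel time is its length divided by the spot speed. This is a minimal illustration, not the dissertation's implementation; the segment lengths and speeds below are hypothetical.

```python
def average_speed_travel_time(segment_lengths_km, spot_speeds_kmh):
    """Estimate corridor travel time (hours) by the average speed method:
    each segment's travel time is its length divided by the speed measured
    at the point detector assigned to that segment."""
    return sum(length / speed
               for length, speed in zip(segment_lengths_km, spot_speeds_kmh))

# Hypothetical three-segment corridor with current spot speeds
corridor_time_h = average_speed_travel_time([1.2, 0.8, 1.5],
                                            [90.0, 60.0, 100.0])
```

An instantaneous estimate like this assumes conditions stay constant while the vehicle traverses the corridor, which is exactly the limitation that motivates prediction-based approaches.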
Abstract:
Over the last two decades social vulnerability has emerged as a major area of study, with increasing attention to the study of vulnerable populations. Generally, the elderly are among the most vulnerable members of any society, and widespread population aging has led to greater focus on elderly vulnerability. However, the absence of a valid and practical measure constrains the ability of policy-makers to address this issue in a comprehensive way. This study developed a composite indicator, The Elderly Social Vulnerability Index (ESVI), and used it to undertake a comparative analysis of the availability of support for elderly Jamaicans based on their access to human, material and social resources. The results of the ESVI indicated that while the elderly are more vulnerable overall, certain segments of the population appear to be at greater risk. Females had consistently lower scores than males, and the oldest-old had the highest scores of all groups of older persons. Vulnerability scores also varied according to place of residence, with more rural parishes having higher scores than their urban counterparts. These findings support the political economy framework which locates disadvantage in old age within political and ideological structures. The findings also point to the pervasiveness and persistence of gender inequality as argued by feminist theories of aging. Based on the results of the study it is clear that there is a need for policies that target specific population segments, in addition to universal policies that could make the experience of old age less challenging for the majority of older persons. Overall, the ESVI has displayed usefulness as a tool for theoretical analysis and demonstrated its potential as a policy instrument to assist decision-makers in determining where to target their efforts as they seek to address the issue of social vulnerability in old age. 
Data for this study came from the 2001 population and housing census of Jamaica, with multiple imputation for missing data. The index was derived from the linear aggregation of three equally weighted domains, comprised of eleven unweighted indicators which were normalized using z-scores. Indicators were selected based on theoretical relevance and data availability.
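The index construction described above (z-score normalization of unweighted indicators, averaged within domains, with equal-weight linear aggregation of the domains) can be sketched as follows. This is a hedged illustration of the general recipe, not the ESVI's actual code or data; the toy values are hypothetical.

```python
import statistics

def zscore(values):
    """Normalize a raw indicator column to z-scores (population std dev)."""
    mu, sd = statistics.mean(values), statistics.pstdev(values)
    return [(v - mu) / sd for v in values]

def composite_index(domains):
    """domains: list of domains, each a list of indicator columns
    (one list of raw values per person, per indicator).  Indicators are
    z-score normalized and averaged unweighted within each domain; the
    domain scores are then linearly aggregated with equal weights."""
    n_people = len(domains[0][0])
    index = [0.0] * n_people
    for indicators in domains:
        z_cols = [zscore(col) for col in indicators]
        domain_score = [sum(vals) / len(z_cols) for vals in zip(*z_cols)]
        index = [i + d / len(domains) for i, d in zip(index, domain_score)]
    return index

# Two hypothetical domains with two and one indicators for three people
esvi = composite_index([[[1, 2, 3], [2, 4, 6]], [[10, 20, 30]]])
```

Because every indicator is z-scored, the resulting index is centered near zero, with higher scores indicating greater vulnerability relative to the sample.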
Abstract:
This paper provides a method for constructing a new historical global nitrogen fertilizer application map (0.5° × 0.5° resolution) for the period 1961-2010 based on country-specific information from Food and Agriculture Organization statistics (FAOSTAT) and various global datasets. This new map incorporates the fraction of NH4+ (and NO3-) in N fertilizer inputs by utilizing fertilizer species information in FAOSTAT, in which each species can be categorized as forming NH4+ and/or NO3-. During data processing, we applied a statistical data imputation method for the missing data (19 % of national N fertilizer consumption) in FAOSTAT. The multiple imputation method enabled us to fill gaps in the time-series data with plausible values using covariate information (year, population, GDP, and crop area). After the imputation, we downscaled the national consumption data to a gridded cropland map. We also applied the multiple imputation method to the available chemical fertilizer species consumption, allowing for the estimation of the NH4+/NO3- ratio in national fertilizer consumption. In this study, the synthetic N fertilizer inputs in 2000 showed general consistency with the existing N fertilizer map (Potter et al., 2010, doi:10.1175/2009EI288.1) in relation to the ranges of N fertilizer inputs. Globally, the estimated N fertilizer inputs based on the sum of filled data increased from 15 Tg-N to 110 Tg-N during 1961-2010. On the other hand, the global NO3- input started to decline after the late 1980s, and the fraction of NO3- in global N fertilizer decreased consistently from 35 % to 13 % over the 50-year period. NH4+-based fertilizers are dominant in most countries; however, the NH4+/NO3- ratio in N fertilizer inputs shows clear differences temporally and geographically. This new map can be utilized as input data for global model studies and can bring new insights to the assessment of historical changes in terrestrial N cycling.
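The imputation-then-downscaling workflow above can be illustrated with a deliberately simplified single imputation (the paper uses multiple imputation, which would repeat this with draws from the posterior): regress observed national consumption on covariates, fill the gaps with predictions, then spread each national total over grid cells in proportion to cropland area. All names and values here are hypothetical.

```python
import numpy as np

def impute_and_downscale(covariates, consumption, cropland_frac):
    """covariates: (n_years, k) array (e.g., year, population, GDP, crop area).
    consumption: length-n_years national N fertilizer totals with NaN gaps.
    cropland_frac: per-grid-cell cropland fractions summing to 1.
    Returns the gap-filled series and an (n_years, n_cells) gridded map."""
    X = np.column_stack([np.ones(len(covariates)), covariates])
    y = np.asarray(consumption, dtype=float)
    obs = ~np.isnan(y)
    # Fit a linear model on observed years only, predict the missing ones
    beta, *_ = np.linalg.lstsq(X[obs], y[obs], rcond=None)
    filled = np.where(obs, y, X @ beta)
    # Downscale each year's national total across the cropland map
    grid = np.outer(filled, cropland_frac / cropland_frac.sum())
    return filled, grid
```

A proper multiple imputation would produce several filled series and propagate their spread into the gridded estimates, rather than a single deterministic fill.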
Abstract:
Copyright © 2014 International Anesthesia Research Society. BACKGROUND: Goal-directed fluid therapy (GDFT) is associated with improved outcomes after surgery. The esophageal Doppler monitor (EDM) is widely used, but has several limitations. The NICOM, a completely noninvasive cardiac output monitor (Cheetah Medical), may be appropriate for guiding GDFT. No prospective studies have compared the NICOM and the EDM. We hypothesized that the NICOM is not significantly different from the EDM for monitoring during GDFT. METHODS: One hundred adult patients undergoing elective colorectal surgery participated in this study. Patients in phase I (n = 50) had intraoperative GDFT guided by the EDM while the NICOM was connected, and patients in phase II (n = 50) had intraoperative GDFT guided by the NICOM while the EDM was connected. Each patient's stroke volume was optimized using 250-mL colloid boluses. Agreement between the monitors was assessed, and patient outcomes (postoperative pain, nausea, and return of bowel function), complications (renal, pulmonary, infectious, and wound complications), and length of hospital stay (LOS) were compared. RESULTS: Using a 10% increase in stroke volume after fluid challenge, agreement between monitors was 60% at 5 minutes, 61% at 10 minutes, and 66% at 15 minutes, with no significant systematic disagreement (McNemar P > 0.05) at any time point. The EDM had significantly more missing data than the NICOM. No clinically significant differences were found in total LOS or other outcomes. The mean LOS was 6.56 ± 4.32 days in phase I and 6.07 ± 2.85 days in phase II, and the 95% confidence limits for the difference were -0.96 to +1.95 days (P = 0.5016). CONCLUSIONS: The NICOM performs similarly to the EDM in guiding GDFT, with no clinically significant differences in outcomes, and offers increased ease of use as well as fewer missing data points. The NICOM may be a viable alternative monitor to guide GDFT.
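The agreement analysis above dichotomizes each fluid challenge (did the monitor register a ≥10% stroke volume increase?) and then tests for systematic disagreement with McNemar's test on the discordant pairs. A minimal sketch of that computation, with hypothetical responder flags:

```python
def mcnemar_agreement(edm_pos, nicom_pos):
    """edm_pos, nicom_pos: parallel lists of booleans, True when that
    monitor saw a >=10% stroke volume increase after a fluid challenge.
    Returns percent agreement and the McNemar chi-square statistic
    (with continuity correction) computed on the discordant pairs."""
    agree = sum(e == n for e, n in zip(edm_pos, nicom_pos))
    b = sum(e and not n for e, n in zip(edm_pos, nicom_pos))  # EDM+ / NICOM-
    c = sum(n and not e for e, n in zip(edm_pos, nicom_pos))  # NICOM+ / EDM-
    pct_agree = 100.0 * agree / len(edm_pos)
    chi2 = (abs(b - c) - 1) ** 2 / (b + c) if b + c else 0.0
    return pct_agree, chi2
```

McNemar's test ignores the concordant pairs entirely; "no significant systematic disagreement" means the two kinds of discordance (b and c) were roughly balanced.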
Abstract:
Surveys can collect important data that inform policy decisions and drive social science research. Large government surveys collect information from the U.S. population on a wide range of topics, including demographics, education, employment, and lifestyle. Analysis of survey data presents unique challenges. In particular, one needs to account for missing data, for complex sampling designs, and for measurement error. Conceptually, a survey organization could spend lots of resources getting high-quality responses from a simple random sample, resulting in survey data that are easy to analyze. However, this scenario often is not realistic. To address these practical issues, survey organizations can leverage the information available from other sources of data. For example, in longitudinal studies that suffer from attrition, they can use the information from refreshment samples to correct for potential attrition bias. They can use information from known marginal distributions or survey design to improve inferences. They can use information from gold standard sources to correct for measurement error.
This thesis presents novel approaches to combining information from multiple sources that address the three problems described above.
The first method addresses nonignorable unit nonresponse and attrition in a panel survey with a refreshment sample. Panel surveys typically suffer from attrition, which can lead to biased inference when basing analysis only on cases that complete all waves of the panel. Unfortunately, the panel data alone cannot inform the extent of the bias due to attrition, so analysts must make strong and untestable assumptions about the missing data mechanism. Many panel studies also include refreshment samples, which are data collected from a random sample of new individuals during some later wave of the panel. Refreshment samples offer information that can be utilized to correct for biases induced by nonignorable attrition while reducing reliance on strong assumptions about the attrition process. To date, these bias correction methods have not dealt with two key practical issues in panel studies: unit nonresponse in the initial wave of the panel and in the refreshment sample itself. As we illustrate, nonignorable unit nonresponse can significantly compromise the analyst's ability to use the refreshment samples for attrition bias correction. Thus, it is crucial for analysts to assess how sensitive their inferences, corrected for panel attrition, are to different assumptions about the nature of the unit nonresponse. We present an approach that facilitates such sensitivity analyses, both for suspected nonignorable unit nonresponse in the initial wave and in the refreshment sample. We illustrate the approach using simulation studies and an analysis of data from the 2007-2008 Associated Press/Yahoo News election panel study.
The second method incorporates informative prior beliefs about marginal probabilities into Bayesian latent class models for categorical data. The basic idea is to append synthetic observations to the original data such that (i) the empirical distributions of the desired margins match those of the prior beliefs, and (ii) the values of the remaining variables are left missing. The degree of prior uncertainty is controlled by the number of augmented records. Posterior inferences can be obtained via typical MCMC algorithms for latent class models, tailored to deal efficiently with the missing values in the concatenated data. We illustrate the approach using a variety of simulations based on data from the American Community Survey, including an example of how augmented records can be used to fit latent class models to data from stratified samples.
The third method leverages the information from a gold standard survey to model reporting error. Survey data are subject to reporting error when respondents misunderstand the question or accidentally select the wrong response. Sometimes survey respondents knowingly select the wrong response, for example, by reporting a higher level of education than they actually have attained. We present an approach that allows an analyst to model reporting error by incorporating information from a gold standard survey. The analyst can specify various reporting error models and assess how sensitive their conclusions are to different assumptions about the reporting error process. We illustrate the approach using simulations based on data from the 1993 National Survey of College Graduates. We use the method to impute error-corrected educational attainments in the 2010 American Community Survey using the 2010 National Survey of College Graduates as the gold standard survey.
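The data-augmentation idea behind the second method (appending synthetic records whose margin matches the prior, with all other variables missing) is simple enough to sketch directly. This is an illustrative construction only; the variable names and prior are hypothetical, and the actual thesis embeds this inside an MCMC sampler for latent class models.

```python
def augment_with_margin_prior(data, var, prior_margin, n_aug):
    """Append n_aug synthetic records to `data` (a list of dicts) so that
    the empirical distribution of `var` among the appended records matches
    prior_margin (a dict mapping level -> probability).  All other
    variables in the synthetic records are set to None (missing).
    Larger n_aug expresses stronger prior certainty about the margin."""
    other_vars = [k for k in data[0] if k != var]
    synthetic = []
    for level, p in prior_margin.items():
        for _ in range(round(p * n_aug)):
            rec = {k: None for k in other_vars}
            rec[var] = level
            synthetic.append(rec)
    return data + synthetic
```

The MCMC machinery then treats the Nones like any other missing values, so no special-purpose sampler is needed beyond efficient handling of the concatenated data.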
Abstract:
Background: Sickle Cell Disease (SCD) is a genetic hematological disorder that affects more than 7 million people globally (NHLBI, 2009). It is estimated that 50% of adults with SCD experience pain on most days, with 1/3 experiencing chronic pain daily (Smith et al., 2008). Persons with SCD also experience higher levels of pain catastrophizing (feelings of helplessness, pain rumination and magnification) than other chronic pain conditions, which is associated with increases in pain intensity, pain behavior, analgesic consumption, frequency and duration of hospital visits, and with reduced daily activities (Sullivan, Bishop, & Pivik, 1995; Keefe et al., 2000; Gil et al., 1992 & 1993). Therefore, effective interventions are needed that can successfully be used to manage pain and pain-related outcomes (e.g., pain catastrophizing) in persons with SCD. First, a review of the literature demonstrated limited information regarding the feasibility and efficacy of non-pharmacological approaches for pain in persons with SCD, finding an average effect size of 0.33 on pain reduction across measurable non-pharmacological studies. Second, a prospective study on persons with SCD that received care for a vaso-occlusive crisis (VOC; N = 95) found: (1) high levels of patient-reported depression (29%) and anxiety (34%), and (2) that unemployment was significantly associated with increased frequency of acute care encounters and hospital admissions per person. Research suggests that one promising category of non-pharmacological interventions for managing both physical and affective components of pain are Mindfulness-based Interventions (MBIs; Thompson et al., 2010; Cox et al., 2013). The primary goal of this dissertation was thus to develop and test the feasibility, acceptability, and efficacy of a telephonic MBI for pain catastrophizing in persons with SCD and chronic pain.
Methods: First, a telephonic MBI was developed through an informal process that involved iterative feedback from patients, clinical experts in SCD and pain management, social workers, psychologists, and mindfulness clinicians. Through this process, relevant topics and skills were selected to adapt in each MBI session. Second, a pilot randomized controlled trial was conducted to test the feasibility, acceptability, and efficacy of the telephonic MBI for pain catastrophizing in persons with SCD and chronic pain. Acceptability and feasibility were determined by assessment of recruitment, attrition, dropout, and refusal rates (including refusal reasons), along with semi-structured interviews with nine randomly selected patients at the end of study. Participants completed assessments at baseline, Week 1, 3, and 6 to assess efficacy of the intervention on decreasing pain catastrophizing and other pain-related outcomes.
Results: A telephonic MBI is feasible and acceptable for persons with SCD and chronic pain. Seventy-eight patients with SCD and chronic pain were approached, and 76% (N = 60) were enrolled and randomized. The MBI attendance rate, with approximately 57% of participants completing at least four mindfulness sessions, was deemed acceptable, and participants who received the telephonic MBI described it in post-intervention interviews as acceptable and easy to access and consume. The amount of missing data was undesirable (MBI condition, 40%; control condition, 25%), but fell within the range of expected missing outcome data for an RCT with multiple follow-up assessments. Efficacy of the MBI on pain catastrophizing could not be determined due to the small sample size and degree of missing data, but trajectory analyses conducted for the MBI condition alone trended in the expected direction, and the reduction in pain catastrophizing approached statistical significance.
Conclusion: Overall, the results showed that a telephonic group-based MBI is acceptable and feasible for persons with SCD and chronic pain. Though the study was neither able to determine treatment efficacy nor powered to detect a statistically significant difference between conditions, (1) participants described the intervention as acceptable, and (2) the observed effect sizes for the MBI condition demonstrated large effects of the MBI on pain catastrophizing, mental health, and physical health. Replication of this MBI study with a larger sample size, an active control group, and additional assessments at the end of each week (e.g., Week 1 through Week 6) is needed to determine treatment efficacy. Many lessons were learned that will guide the development of future studies, including which MBI strategies were most helpful, methods to encourage continued participation, and how to improve data capture.
Abstract:
Previously developed models for predicting absolute risk of invasive epithelial ovarian cancer have included a limited number of risk factors and have had low discriminatory power (area under the receiver operating characteristic curve (AUC) < 0.60). Because of this, we developed and internally validated a relative risk prediction model that incorporates 17 established epidemiologic risk factors and 17 genome-wide significant single nucleotide polymorphisms (SNPs) using data from 11 case-control studies in the United States (5,793 cases; 9,512 controls) from the Ovarian Cancer Association Consortium (data accrued from 1992 to 2010). We developed a hierarchical logistic regression model for predicting case-control status that included imputation of missing data. We randomly divided the data into an 80% training sample and used the remaining 20% for model evaluation. The AUC for the full model was 0.664. A reduced model without SNPs performed similarly (AUC = 0.649). Both models performed better than a baseline model that included age and study site only (AUC = 0.563). The best predictive power was obtained in the full model among women younger than 50 years of age (AUC = 0.714); however, the addition of SNPs increased the AUC the most for women older than 50 years of age (AUC = 0.638 vs. 0.616). Adapting this improved model to estimate absolute risk and evaluating it in prospective data sets is warranted.
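The model comparisons above all rest on the AUC, which equals the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control (ties counted as one half). A minimal sketch of that rank-based computation, with hypothetical labels and scores:

```python
def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney identity: the
    probability that a randomly chosen case (label 1) scores above a
    randomly chosen control (label 0), counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical held-out case-control statuses and predicted risks
held_out_auc = auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

An AUC of 0.5 corresponds to random guessing, which is why the reported 0.664 for the full model is judged against the 0.563 age-and-site baseline rather than against zero.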
Abstract:
Daily records of nine meteorological variables covering the interval 1961-2013 were used in order to create a state-of-the-art homogenized climatic dataset over Romania at a spatial resolution of 0.1°. All meteorological stations with full data records, as well as stations with up to 30 % missing data, were used for the following variables: air pressure (150 stations); minimum, maximum, and average air temperature (150 stations); soil temperature (127 stations); precipitation (188 stations); sunshine hours (135 stations); cloud cover (104 stations); relative humidity (150 stations). For each parameter, the data series were first homogenized with the software MASH (Multiple Analysis of Series for Homogenization); then, the data series were gridded by means of the software MISH (Meteorological Interpolation based on Surface Homogenized Data).
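The station selection rule above (keep stations with complete records or at most 30 % missing daily values) is easy to make concrete. This is an illustrative sketch only; station names and the data layout are hypothetical, and the actual homogenization and gridding are done by MASH and MISH.

```python
def usable_stations(records, max_missing_frac=0.30):
    """records: dict mapping station name -> list of daily values,
    with None marking a missing observation.  Keep stations whose
    fraction of missing days does not exceed max_missing_frac."""
    keep = {}
    for station, series in records.items():
        missing_frac = sum(v is None for v in series) / len(series)
        if missing_frac <= max_missing_frac:
            keep[station] = series
    return keep
```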
Abstract:
BACKGROUND: Moderate-to-vigorous physical activity (MVPA) is an important determinant of children’s physical health, and is commonly measured using accelerometers. A major limitation of accelerometers is non-wear time, which is the time the participant did not wear their device. Given that non-wear time is traditionally discarded from the dataset prior to estimating MVPA, final estimates of MVPA may be biased. Therefore, alternate approaches should be explored. OBJECTIVES: The objectives of this thesis were to 1) develop and describe an imputation approach that uses the socio-demographic, time, health, and behavioural data from participants to replace non-wear time accelerometer data, 2) determine the extent to which imputation of non-wear time data influences estimates of MVPA, and 3) determine if imputation of non-wear time data influences the associations between MVPA, body mass index (BMI), and systolic blood pressure (SBP). METHODS: Seven days of accelerometer data were collected using Actical accelerometers from 332 children aged 10-13. Three methods for handling missing accelerometer data were compared: 1) the “non-imputed” method wherein non-wear time was deleted from the dataset, 2) imputation dataset I, wherein the imputation of MVPA during non-wear time was based upon socio-demographic factors of the participant (e.g., age), health information (e.g., BMI), and time characteristics of the non-wear period (e.g., season), and 3) imputation dataset II wherein the imputation of MVPA was based upon the same variables as imputation dataset I, plus organized sport information. Associations between MVPA and health outcomes in each method were assessed using linear regression. RESULTS: Non-wear time accounted for 7.5% of epochs during waking hours. The average minutes/day of MVPA was 56.8 (95% CI: 54.2, 59.5) in the non-imputed dataset, 58.4 (95% CI: 55.8, 61.0) in imputed dataset I, and 59.0 (95% CI: 56.3, 61.5) in imputed dataset II. 
Estimates between datasets were not significantly different. The strength of the relationship between MVPA with BMI and SBP were comparable between all three datasets. CONCLUSION: These findings suggest that studies that achieve high accelerometer compliance with unsystematic patterns of missing data can use the traditional approach of deleting non-wear time from the dataset to obtain MVPA measures without substantial bias.
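The contrast between the non-imputed and imputed approaches above comes down to how non-wear epochs are handled when summing MVPA minutes: deletion drops them, while imputation replaces each with an expected MVPA value predicted from covariates. A deliberately simplified sketch (one scalar prediction stands in for the covariate-based model, and each epoch is assumed to last one minute; all values are hypothetical):

```python
def mvpa_minutes(epochs, imputed_prob=None):
    """epochs: per-epoch MVPA indicators (1 = MVPA, 0 = not MVPA,
    None = non-wear).  With imputed_prob=None, non-wear epochs are
    deleted (the traditional approach); otherwise each non-wear epoch
    contributes the expected MVPA probability imputed_prob, standing in
    for a prediction from socio-demographic, health, and time covariates."""
    if imputed_prob is None:
        return float(sum(e for e in epochs if e is not None))
    return float(sum(imputed_prob if e is None else e for e in epochs))
```

With only 7.5% non-wear epochs, even a generous imputed probability shifts the daily total by little, which is consistent with the similar estimates reported across the three datasets.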
Abstract:
Background: Primary total knee replacement is a common operation that is performed to provide pain relief and restore functional ability. Inpatient physiotherapy is routinely provided after surgery to enhance recovery prior to hospital discharge. However, international variation exists in the provision of outpatient physiotherapy after hospital discharge. While evidence indicates that outpatient physiotherapy can improve short-term function, the longer term benefits are unknown. The aim of this randomised controlled trial is to evaluate the long-term clinical effectiveness and cost-effectiveness of a 6-week group-based outpatient physiotherapy intervention following knee replacement. Methods/design: Two hundred and fifty-six patients waiting for knee replacement because of osteoarthritis will be recruited from two orthopaedic centres. Participants randomised to the usual-care group (n = 128) will be given a booklet about exercise and referred for physiotherapy if deemed appropriate by the clinical care team. The intervention group (n = 128) will receive the same usual care and additionally be invited to attend a group-based outpatient physiotherapy class starting 6 weeks after surgery. The 1-hour class will be run on a weekly basis over 6 weeks and will involve task-orientated and individualised exercises. The primary outcome will be the Lower Extremity Functional Scale at 12 months post-operative. Secondary outcomes include: quality of life, knee pain and function, depression, anxiety and satisfaction. Data collection will be by questionnaire prior to surgery and 3, 6 and 12 months after surgery and will include a resource-use questionnaire to enable a trial-based economic evaluation. Trial participation and satisfaction with the classes will be evaluated through structured telephone interviews. The primary statistical and economic analyses will be conducted on an intention-to-treat basis with and without imputation of missing data. 
The primary economic result will estimate the incremental cost per quality-adjusted life year gained from this intervention from a National Health Service (NHS) and personal social services perspective. Discussion: This research aims to benefit patients and the NHS by providing evidence on the long-term effectiveness and cost-effectiveness of outpatient physiotherapy after knee replacement. If the intervention is found to be effective and cost-effective, implementation into clinical practice could lead to improvement in patients’ outcomes and improved health care resource efficiency.
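The incremental cost per quality-adjusted life year mentioned above is the standard incremental cost-effectiveness ratio (ICER): the difference in mean cost between arms divided by the difference in mean QALYs. A minimal sketch, with entirely hypothetical numbers:

```python
def icer(cost_intervention, cost_usual, qaly_intervention, qaly_usual):
    """Incremental cost-effectiveness ratio: the extra cost per
    quality-adjusted life year gained by the intervention arm over
    the usual-care arm."""
    return (cost_intervention - cost_usual) / (qaly_intervention - qaly_usual)

# Hypothetical per-patient means: £200 extra cost for 0.05 extra QALYs
ratio = icer(1200.0, 1000.0, 0.80, 0.75)
```

The trial would then compare this ratio against a willingness-to-pay threshold to judge cost-effectiveness.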
Abstract:
Thesis (Master's)--University of Washington, 2016-08
Abstract:
We present a new method for ecologically sustainable land use planning within multiple land use schemes. Our aims were (1) to develop a method that can be used to locate important areas based on their ecological values; (2) to evaluate the quality, quantity, availability, and usability of existing ecological data sets; and (3) to demonstrate the use of the method in Eastern Finland, where there are requirements for the simultaneous development of nature conservation, tourism, and recreation. We compiled all available ecological data sets from the study area, complemented the missing data using habitat suitability modeling, calculated the total ecological score (TES) for each 1 ha grid cell in the study area, and finally, demonstrated the use of TES in assessing the success of nature conservation in covering ecologically valuable areas and locating ecologically sustainable areas for tourism and recreational infrastructure. The method operated quite well at the level required for regional and local scale planning. The quality, quantity, availability, and usability of existing data sets were generally high, and they could be further complemented by modeling. There are still constraints that limit the use of the method in practical land use planning. However, as increasing data become available and open access, and modeling tools improve, the usability and applicability of the method will increase.
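The total ecological score (TES) described above is, at its core, a per-cell sum across ecological data layers, with habitat suitability modeling filling layers where observations are missing. A hedged sketch of the aggregation step only (cell identifiers and scores are hypothetical; the modeling that produces the layers is not shown):

```python
def total_ecological_score(layers):
    """layers: list of dicts, each mapping a 1-ha grid cell id to that
    layer's ecological value score (from an existing data set or a
    habitat suitability model).  The TES of a cell is the sum over all
    layers; a cell absent from a layer contributes zero for that layer."""
    cells = set().union(*layers)
    return {cell: sum(layer.get(cell, 0.0) for layer in layers)
            for cell in cells}

# Two hypothetical layers covering an overlapping set of cells
tes = total_ecological_score([{"c1": 1.0, "c2": 0.5}, {"c1": 2.0}])
```

Cells with high TES would be flagged for conservation, while low-scoring cells are candidates for siting tourism and recreation infrastructure.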
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-08