949 results for monotone missing data
Abstract:
2000 Mathematics Subject Classification: 62-04, 62H30, 62J20
Abstract:
2010 Mathematics Subject Classification: 62F12, 62M05, 62M09, 62M10, 60G42.
Abstract:
Concurrent coding is an encoding scheme with 'holographic'-type properties that are shown here to be robust against a significant amount of noise and signal loss. This single encoding scheme is able to correct for random errors and burst errors simultaneously, without relying on cyclic codes. A simple and practical scheme has been tested that displays perfect decoding when the signal-to-noise ratio is on the order of -18 dB. The same scheme also displays perfect reconstruction when a contiguous block of 40% of the transmission is missing. In addition, this scheme is 50% more efficient in terms of transmitted power requirements than equivalent cyclic codes. A simple model is presented that describes the decoding process, determines the expected computational load, and describes the critical levels of noise and missing data at which false messages begin to be generated.
Abstract:
We propose a novel template matching approach for the discrimination of handwritten and machine-printed text. We first pre-process the scanned document images by performing denoising, circle/line exclusion, and word-block level segmentation. We then align and match characters in a flexibly sized gallery with the segmented regions, using parallelised normalised cross-correlation. The experimental results on the Pattern Recognition & Image Analysis Research Lab-Natural History Museum (PRImA-NHM) dataset show remarkably high robustness of the algorithm in classifying cluttered, occluded, and noisy samples, as well as those with significantly high levels of missing data. The algorithm, which achieves an 84.0% classification rate with a false positive rate of 0.16 on the dataset, requires no training samples and produces results comparable to training-based approaches that have used the same benchmark.
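The abstract does not give implementation details for the matching step; the following is a minimal, single-threaded sketch of normalised cross-correlation (the paper parallelises this computation), with a toy image and template standing in for the segmented document regions.

```python
import numpy as np

def normalized_cross_correlation(image: np.ndarray, template: np.ndarray) -> np.ndarray:
    """Slide `template` over `image` and return the NCC score at each offset.

    Scores lie in [-1, 1]; values near 1 indicate a strong match.
    """
    th, tw = template.shape
    ih, iw = image.shape
    t = template - template.mean()
    t_norm = np.sqrt((t ** 2).sum())
    out = np.full((ih - th + 1, iw - tw + 1), -1.0)
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            patch = image[y:y + th, x:x + tw]
            p = patch - patch.mean()
            denom = t_norm * np.sqrt((p ** 2).sum())
            if denom > 0:
                out[y, x] = float((t * p).sum()) / denom
    return out

# Toy usage: locate a bright square in a noisy image.
rng = np.random.default_rng(0)
img = rng.normal(0, 0.1, (64, 64))
img[20:28, 30:38] += 1.0
tmpl = np.ones((8, 8))
scores = normalized_cross_correlation(img, tmpl)
print(np.unravel_index(scores.argmax(), scores.shape))  # expect (20, 30)
```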
Abstract:
Providing transportation system operators and travelers with accurate travel time information allows them to make more informed decisions, yielding benefits for individual travelers and for the entire transportation system. Most existing advanced traveler information systems (ATIS) and advanced traffic management systems (ATMS) use instantaneous travel time values estimated based on the current measurements, assuming that traffic conditions remain constant in the near future. For more effective applications, it has been proposed that ATIS and ATMS should use travel times predicted for short-term future conditions rather than instantaneous travel times measured or estimated for current conditions. This dissertation research investigates short-term freeway travel time prediction using Dynamic Neural Networks (DNN) based on traffic detector data collected by radar traffic detectors installed along a freeway corridor. DNN comprises a class of neural networks that are particularly suitable for predicting variables like travel time, but has not been adequately investigated for this purpose. Before this investigation, it was necessary to identify methods for data imputation to account for the missing data usually encountered when collecting data using traffic detectors. It was also necessary to identify a method to estimate the travel time on the freeway corridor based on data collected using point traffic detectors. A new travel time estimation method referred to as the Piecewise Constant Acceleration Based (PCAB) method was developed and compared with other methods reported in the literature. The results show that one of the simple travel time estimation methods (the average speed method) can work as well as the PCAB method, and both of them outperform other methods. This study also compared the travel time prediction performance of three different DNN topologies with different memory setups. The results show that one DNN topology (the time-delay neural network) outperforms the other two DNN topologies for the investigated prediction problem. This topology also performs slightly better than the simple multilayer perceptron (MLP) neural network topology that has been used in a number of previous studies for travel time prediction.
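The abstract names the average speed method without defining it; a common formulation estimates each segment's travel time as its length divided by the mean of the speeds reported by the two point detectors bounding it, summed over the corridor. A minimal sketch under that assumption (detector positions and speeds are illustrative):

```python
# Hedged sketch of an "average speed" corridor travel-time estimate:
# each segment's travel time = segment length / mean speed of its two
# bounding point detectors. Positions (km) and speeds (km/h) are made up.
detector_positions = [0.0, 1.2, 2.5, 4.0]   # locations along the corridor, km
detector_speeds = [95.0, 80.0, 60.0, 72.0]  # instantaneous speeds, km/h

total_hours = 0.0
for i in range(len(detector_positions) - 1):
    length_km = detector_positions[i + 1] - detector_positions[i]
    mean_speed = 0.5 * (detector_speeds[i] + detector_speeds[i + 1])
    total_hours += length_km / mean_speed

print(f"Estimated corridor travel time: {total_hours * 60:.1f} minutes")
```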
Abstract:
Over the last two decades social vulnerability has emerged as a major area of study, with increasing attention to the study of vulnerable populations. Generally, the elderly are among the most vulnerable members of any society, and widespread population aging has led to greater focus on elderly vulnerability. However, the absence of a valid and practical measure constrains the ability of policy-makers to address this issue in a comprehensive way. This study developed a composite indicator, the Elderly Social Vulnerability Index (ESVI), and used it to undertake a comparative analysis of the availability of support for elderly Jamaicans based on their access to human, material and social resources. The results of the ESVI indicated that while the elderly are more vulnerable overall, certain segments of the population appear to be at greater risk. Females had consistently lower scores than males, and the oldest-old had the highest scores of all groups of older persons. Vulnerability scores also varied according to place of residence, with more rural parishes having higher scores than their urban counterparts. These findings support the political economy framework, which locates disadvantage in old age within political and ideological structures. The findings also point to the pervasiveness and persistence of gender inequality, as argued by feminist theories of aging. Based on the results of the study, it is clear that there is a need for policies that target specific population segments, in addition to universal policies that could make the experience of old age less challenging for the majority of older persons. Overall, the ESVI has displayed usefulness as a tool for theoretical analysis and demonstrated its potential as a policy instrument to assist decision-makers in determining where to target their efforts as they seek to address the issue of social vulnerability in old age. Data for this study came from the 2001 population and housing census of Jamaica, with multiple imputation for missing data. The index was derived from the linear aggregation of three equally weighted domains, comprising eleven unweighted indicators normalized using z-scores. Indicators were selected based on theoretical relevance and data availability.
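The abstract fully specifies the aggregation rule (z-score normalization, unweighted indicators within three equally weighted domains); the sketch below illustrates it on synthetic data. The indicator-to-domain groupings and all values are hypothetical, not the study's actual inputs.

```python
import numpy as np

# Hedged sketch of the index construction described above: indicators are
# z-score normalized, averaged (unweighted) within each of three domains,
# and the domain scores are combined with equal weights.
rng = np.random.default_rng(1)
n_persons = 5
indicators = rng.normal(size=(n_persons, 11))        # 11 raw indicators

z = (indicators - indicators.mean(axis=0)) / indicators.std(axis=0)

# Hypothetical grouping into human, material, and social resource domains.
domains = [z[:, 0:4].mean(axis=1),    # e.g., human-resource indicators
           z[:, 4:8].mean(axis=1),    # e.g., material-resource indicators
           z[:, 8:11].mean(axis=1)]   # e.g., social-resource indicators

esvi = sum(domains) / 3.0             # equal-weight linear aggregation
print(esvi)
```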
Abstract:
This paper provides a method for constructing a new historical global nitrogen fertilizer application map (0.5° × 0.5° resolution) for the period 1961-2010 based on country-specific information from Food and Agriculture Organization statistics (FAOSTAT) and various global datasets. This new map incorporates the fraction of NH₄⁺ (and NO₃⁻) in N fertilizer inputs by utilizing fertilizer species information in FAOSTAT, in which species can be categorized as NH₄⁺- and/or NO₃⁻-forming N fertilizers. During data processing, we applied a statistical data imputation method for the missing data (19% of national N fertilizer consumption) in FAOSTAT. The multiple imputation method enabled us to fill gaps in the time-series data with plausible values using covariate information (year, population, GDP, and crop area). After the imputation, we downscaled the national consumption data to a gridded cropland map. We also applied the multiple imputation method to the available chemical fertilizer species consumption, allowing for the estimation of the NH₄⁺/NO₃⁻ ratio in national fertilizer consumption. In this study, the synthetic N fertilizer inputs in 2000 showed a general consistency with the existing N fertilizer map (Potter et al., 2010, doi:10.1175/2009EI288.1) in relation to the ranges of N fertilizer inputs. Globally, the estimated N fertilizer inputs based on the sum of filled data increased from 15 Tg-N to 110 Tg-N during 1961-2010. On the other hand, the global NO₃⁻ input started to decline after the late 1980s, and the fraction of NO₃⁻ in global N fertilizer decreased consistently from 35% to 13% over the 50-year period. NH₄⁺-based fertilizers are dominant in most countries; however, the NH₄⁺/NO₃⁻ ratio in N fertilizer inputs shows clear differences temporally and geographically. This new map can be utilized as input data for global model studies and can bring new insights to the assessment of historical terrestrial N cycling changes.
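The paper's exact imputation procedure is not specified beyond its covariates; the sketch below illustrates covariate-based multiple imputation in the same spirit, using scikit-learn's IterativeImputer on synthetic country-year data. Variable scales, the ~19% missingness, and the simple averaging of imputations are illustrative assumptions.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

# Hedged sketch of covariate-based multiple imputation: missing national
# N-fertilizer consumption is imputed from year, population, GDP, and crop
# area, repeated m times with posterior sampling and averaged. The data are
# synthetic and the paper's actual MI procedure may differ.
rng = np.random.default_rng(0)
n = 200
year = rng.uniform(1961, 2010, n)
population = rng.lognormal(16, 1, n)
gdp = rng.lognormal(24, 1, n)
crop_area = rng.lognormal(14, 1, n)
fertilizer = 0.001 * crop_area + 1e-9 * gdp + rng.normal(0, 50, n)
fertilizer[rng.random(n) < 0.19] = np.nan   # ~19% missing, as in FAOSTAT

X = np.column_stack([year, population, gdp, crop_area, fertilizer])
m = 5
imputations = [IterativeImputer(sample_posterior=True, random_state=s)
               .fit_transform(X)[:, -1] for s in range(m)]
fertilizer_filled = np.mean(imputations, axis=0)   # pooled point estimate
print(f"Total consumption after imputation: {fertilizer_filled.sum():.0f}")
```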
Abstract:
BACKGROUND: Goal-directed fluid therapy (GDFT) is associated with improved outcomes after surgery. The esophageal Doppler monitor (EDM) is widely used, but has several limitations. The NICOM, a completely noninvasive cardiac output monitor (Cheetah Medical), may be appropriate for guiding GDFT. No prospective studies have compared the NICOM and the EDM. We hypothesized that the NICOM is not significantly different from the EDM for monitoring during GDFT. METHODS: One hundred adult patients undergoing elective colorectal surgery participated in this study. Patients in phase I (n = 50) had intraoperative GDFT guided by the EDM while the NICOM was connected, and patients in phase II (n = 50) had intraoperative GDFT guided by the NICOM while the EDM was connected. Each patient's stroke volume was optimized using 250-mL colloid boluses. Agreement between the monitors was assessed, and patient outcomes (postoperative pain, nausea, and return of bowel function), complications (renal, pulmonary, infectious, and wound complications), and length of hospital stay (LOS) were compared. RESULTS: Using a 10% increase in stroke volume after fluid challenge, agreement between monitors was 60% at 5 minutes, 61% at 10 minutes, and 66% at 15 minutes, with no significant systematic disagreement (McNemar P > 0.05) at any time point. The EDM had significantly more missing data than the NICOM. No clinically significant differences were found in total LOS or other outcomes. The mean LOS was 6.56 ± 4.32 days in phase I and 6.07 ± 2.85 days in phase II, and 95% confidence limits for the difference were -0.96 to +1.95 days (P = 0.5016). CONCLUSIONS: The NICOM performs similarly to the EDM in guiding GDFT, with no clinically significant differences in outcomes, and offers increased ease of use as well as fewer missing data points. The NICOM may be a viable alternative monitor to guide GDFT.
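As a sketch of the agreement analysis, the following simulates paired responder calls (a >= 10% stroke volume increase after a bolus) from the two monitors, then computes raw agreement and McNemar's test for systematic disagreement. The data and concordance rate are simulated, not the study's.

```python
import numpy as np
from statsmodels.stats.contingency_tables import mcnemar

# Hedged sketch: each monitor flags a fluid responder when stroke volume
# rises >= 10% after a bolus; raw agreement is the share of paired readings
# with the same flag, and McNemar's exact test checks for systematic
# disagreement between monitors. All data below are simulated.
rng = np.random.default_rng(0)
n = 50
edm_responder = rng.random(n) < 0.4
nicom_responder = np.where(rng.random(n) < 0.65, edm_responder,
                           rng.random(n) < 0.4)   # imperfectly concordant

agreement = np.mean(edm_responder == nicom_responder)
table = np.array([
    [np.sum(edm_responder & nicom_responder),
     np.sum(edm_responder & ~nicom_responder)],
    [np.sum(~edm_responder & nicom_responder),
     np.sum(~edm_responder & ~nicom_responder)],
])
result = mcnemar(table, exact=True)
print(f"Agreement: {agreement:.0%}, McNemar p = {result.pvalue:.2f}")
```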
Abstract:
Surveys can collect important data that inform policy decisions and drive social science research. Large government surveys collect information from the U.S. population on a wide range of topics, including demographics, education, employment, and lifestyle. Analysis of survey data presents unique challenges. In particular, one needs to account for missing data, for complex sampling designs, and for measurement error. Conceptually, a survey organization could spend substantial resources obtaining high-quality responses from a simple random sample, resulting in survey data that are easy to analyze. However, this scenario often is not realistic. To address these practical issues, survey organizations can leverage information available from other sources of data. For example, in longitudinal studies that suffer from attrition, they can use the information from refreshment samples to correct for potential attrition bias. They can use information from known marginal distributions or the survey design to improve inferences. They can use information from gold standard sources to correct for measurement error.
This thesis presents novel approaches to combining information from multiple sources that address the three problems described above.
The first method addresses nonignorable unit nonresponse and attrition in a panel survey with a refreshment sample. Panel surveys typically suffer from attrition, which can lead to biased inference when basing analysis only on cases that complete all waves of the panel. Unfortunately, the panel data alone cannot inform the extent of the bias due to attrition, so analysts must make strong and untestable assumptions about the missing data mechanism. Many panel studies also include refreshment samples, which are data collected from a random sample of new individuals during some later wave of the panel. Refreshment samples offer information that can be utilized to correct for biases induced by nonignorable attrition while reducing reliance on strong assumptions about the attrition process. To date, these bias correction methods have not dealt with two key practical issues in panel studies: unit nonresponse in the initial wave of the panel and in the refreshment sample itself. As we illustrate, nonignorable unit nonresponse can significantly compromise the analyst's ability to use the refreshment samples for attrition bias correction. Thus, it is crucial for analysts to assess how sensitive their inferences, corrected for panel attrition, are to different assumptions about the nature of the unit nonresponse. We present an approach that facilitates such sensitivity analyses, both for suspected nonignorable unit nonresponse in the initial wave and in the refreshment sample. We illustrate the approach using simulation studies and an analysis of data from the 2007-2008 Associated Press/Yahoo News election panel study.
The second method incorporates informative prior beliefs about marginal probabilities into Bayesian latent class models for categorical data. The basic idea is to append synthetic observations to the original data such that (i) the empirical distributions of the desired margins match those of the prior beliefs, and (ii) the values of the remaining variables are left missing. The degree of prior uncertainty is controlled by the number of augmented records. Posterior inferences can be obtained via typical MCMC algorithms for latent class models, tailored to deal efficiently with the missing values in the concatenated data.
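A minimal sketch of this augmentation idea: synthetic records are drawn so that one variable's empirical margin matches a prior belief while all other variables are left missing, and the number of synthetic records sets the prior's weight. Variable names, the prior margin, and sample sizes are hypothetical.

```python
import numpy as np
import pandas as pd

# Hedged sketch of prior-as-data augmentation: append synthetic records
# whose margin for "education" matches a prior belief, leaving all other
# variables missing. n_aug controls how strongly the prior is weighted.
rng = np.random.default_rng(0)
original = pd.DataFrame({
    "education": rng.choice(["HS", "BA", "grad"], size=500, p=[0.6, 0.3, 0.1]),
    "employed": rng.choice(["yes", "no"], size=500, p=[0.7, 0.3]),
})

prior_margin = {"HS": 0.5, "BA": 0.35, "grad": 0.15}   # prior belief
n_aug = 200                                            # prior strength

synthetic = pd.DataFrame({
    "education": rng.choice(list(prior_margin), size=n_aug,
                            p=list(prior_margin.values())),
    "employed": pd.NA,                                 # left missing
})
augmented = pd.concat([original, synthetic], ignore_index=True)
print(augmented["education"].value_counts(normalize=True))
```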
We illustrate the approach using a variety of simulations based on data from the American Community Survey, including an example of how augmented records can be used to fit latent class models to data from stratified samples.
The third method leverages the information from a gold standard survey to model reporting error. Survey data are subject to reporting error when respondents misunderstand the question or accidentally select the wrong response. Sometimes survey respondents knowingly select the wrong response, for example, by reporting a higher level of education than they actually have attained. We present an approach that allows an analyst to model reporting error by incorporating information from a gold standard survey. The analyst can specify various reporting error models and assess how sensitive their conclusions are to different assumptions about the reporting error process. We illustrate the approach using simulations based on data from the 1993 National Survey of College Graduates. We use the method to impute error-corrected educational attainments in the 2010 American Community Survey using the 2010 National Survey of College Graduates as the gold standard survey.
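A heavily simplified sketch of the reporting-error idea, assuming a single binary misreport (over-reporting a degree): the gold-standard survey yields an estimated over-report probability, which is then used to impute error-corrected values in the target survey. The error model and all rates are illustrative; the thesis's models are richer.

```python
import numpy as np

# Hedged sketch: estimate the probability that a respondent over-reports
# having a degree from a (synthetic) gold-standard survey with validated
# responses, then impute error-corrected values in the larger survey by
# flipping reported values with that probability.
rng = np.random.default_rng(0)

# Gold standard: reported vs. validated degree status.
reported_gs = rng.random(2000) < 0.35
true_gs = np.where(rng.random(2000) < 0.9, reported_gs, False)
p_overreport = np.mean(reported_gs & ~true_gs) / np.mean(reported_gs)

# Impute error-corrected values in the target survey.
reported = rng.random(10000) < 0.35
corrected = reported & (rng.random(10000) >= p_overreport)
print(f"Estimated over-report rate: {p_overreport:.2f}; "
      f"reported {reported.mean():.2%} -> corrected {corrected.mean():.2%}")
```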
Abstract:
Background: Sickle Cell Disease (SCD) is a genetic hematological disorder that affects more than 7 million people globally (NHLBI, 2009). It is estimated that 50% of adults with SCD experience pain on most days, with one-third experiencing chronic pain daily (Smith et al., 2008). Persons with SCD also experience higher levels of pain catastrophizing (feelings of helplessness, pain rumination, and magnification) than persons with other chronic pain conditions, which is associated with increases in pain intensity, pain behavior, analgesic consumption, and frequency and duration of hospital visits, and with reduced daily activities (Sullivan, Bishop, & Pivik, 1995; Keefe et al., 2000; Gil et al., 1992 & 1993). Effective interventions are therefore needed to manage pain and pain-related outcomes (e.g., pain catastrophizing) in persons with SCD. A review of the literature found limited information regarding the feasibility and efficacy of non-pharmacological approaches for pain in persons with SCD, with an average effect size of 0.33 for pain reduction across the non-pharmacological studies in which an effect could be measured. Second, a prospective study of persons with SCD who received care for a vaso-occlusive crisis (VOC; N = 95) found: (1) high levels of patient-reported depression (29%) and anxiety (34%), and (2) that unemployment was significantly associated with increased frequency of acute care encounters and hospital admissions per person. Research suggests that one promising category of non-pharmacological interventions for managing both the physical and affective components of pain is Mindfulness-based Interventions (MBIs; Thompson et al., 2010; Cox et al., 2013). The primary goal of this dissertation was thus to develop and test the feasibility, acceptability, and efficacy of a telephonic MBI for pain catastrophizing in persons with SCD and chronic pain.
Methods: First, a telephonic MBI was developed through an informal process that involved iterative feedback from patients, clinical experts in SCD and pain management, social workers, psychologists, and mindfulness clinicians. Through this process, relevant topics and skills were selected and adapted for each MBI session. Second, a pilot randomized controlled trial was conducted to test the feasibility, acceptability, and efficacy of the telephonic MBI for pain catastrophizing in persons with SCD and chronic pain. Acceptability and feasibility were determined by assessment of recruitment, attrition, dropout, and refusal rates (including refusal reasons), along with semi-structured interviews with nine randomly selected patients at the end of the study. Participants completed assessments at baseline and Weeks 1, 3, and 6 to assess the efficacy of the intervention in decreasing pain catastrophizing and other pain-related outcomes.
Results: A telephonic MBI is feasible and acceptable for persons with SCD and chronic pain. Seventy-eight patients with SCD and chronic pain were approached, and 76% (N = 60) were enrolled and randomized. The MBI attendance rate (approximately 57% of participants completed at least four mindfulness sessions) was deemed acceptable, and participants who received the telephonic MBI described it in post-intervention interviews as acceptable and easy to access and use. The amount of missing data was undesirable (MBI condition, 40%; control condition, 25%), but fell within the range of expected missing outcome data for an RCT with multiple follow-up assessments. The efficacy of the MBI on pain catastrophizing could not be determined due to the small sample size and the degree of missing data, but trajectory analyses conducted for the MBI condition alone trended in the expected direction, and the reduction in pain catastrophizing approached statistical significance.
Conclusion: Overall, the results showed that a telephonic group-based MBI is acceptable and feasible for persons with SCD and chronic pain. Though the study was neither able to determine treatment efficacy nor powered to detect a statistically significant difference between conditions, (1) participants described the intervention as acceptable, and (2) the observed effect sizes for the MBI condition indicated large effects on pain catastrophizing, mental health, and physical health. Replication of this MBI study with a larger sample size, an active control group, and additional assessments at the end of each week (e.g., Week 1 through Week 6) is needed to determine treatment efficacy. Many lessons were learned that will guide the development of future studies, including which MBI strategies were most helpful, methods to encourage continued participation, and how to improve data capture.
Abstract:
Previously developed models for predicting absolute risk of invasive epithelial ovarian cancer have included a limited number of risk factors and have had low discriminatory power (area under the receiver operating characteristic curve (AUC) < 0.60). Because of this, we developed and internally validated a relative risk prediction model that incorporates 17 established epidemiologic risk factors and 17 genome-wide significant single nucleotide polymorphisms (SNPs) using data from 11 case-control studies in the United States (5,793 cases; 9,512 controls) from the Ovarian Cancer Association Consortium (data accrued from 1992 to 2010). We developed a hierarchical logistic regression model for predicting case-control status that included imputation of missing data. We randomly divided the data into an 80% training sample and used the remaining 20% for model evaluation. The AUC for the full model was 0.664. A reduced model without SNPs performed similarly (AUC = 0.649). Both models performed better than a baseline model that included age and study site only (AUC = 0.563). The best predictive power was obtained in the full model among women younger than 50 years of age (AUC = 0.714); however, the addition of SNPs increased the AUC the most for women older than 50 years of age (AUC = 0.638 vs. 0.616). Adapting this improved model to estimate absolute risk and evaluating it in prospective data sets is warranted.
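A minimal sketch of the evaluation design described above (logistic regression for case-control status on risk factors plus SNPs, an 80/20 train/evaluation split, and AUC), run on synthetic data with weak effects so the AUC lands in a similarly modest range. The study's actual model is hierarchical and includes imputation of missing data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Hedged sketch: 17 epidemiologic risk factors + 17 SNPs as predictors of
# case-control status; 80% training sample, 20% held out for evaluation.
rng = np.random.default_rng(0)
n, n_risk, n_snp = 5000, 17, 17
X = rng.normal(size=(n, n_risk + n_snp))
logit = X[:, :n_risk] @ rng.normal(0, 0.15, n_risk)  # weak effects -> modest AUC
y = rng.random(n) < 1 / (1 + np.exp(-logit))

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print(f"AUC: {roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]):.3f}")
```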
Abstract:
Daily records of nine meteorological variables covering the interval 1961-2013 were used to create a state-of-the-art homogenized climatic dataset over Romania at a spatial resolution of 0.1°. All meteorological stations with full data records, as well as stations with up to 30% missing data, were used for the following variables: air pressure (150 stations); minimum, maximum, and average air temperature (150 stations); soil temperature (127 stations); precipitation (188 stations); sunshine hours (135 stations); cloud cover (104 stations); and relative humidity (150 stations). For each parameter, the data series were first homogenized with the software MASH (Multiple Analysis of Series for Homogenization); the data series were then gridded by means of the software MISH (Meteorological Interpolation based on Surface Homogenized Data).
Abstract:
BACKGROUND: Moderate-to-vigorous physical activity (MVPA) is an important determinant of children's physical health, and is commonly measured using accelerometers. A major limitation of accelerometers is non-wear time, the time during which the participant did not wear the device. Given that non-wear time is traditionally discarded from the dataset prior to estimating MVPA, final estimates of MVPA may be biased. Therefore, alternate approaches should be explored. OBJECTIVES: The objectives of this thesis were to 1) develop and describe an imputation approach that uses the socio-demographic, time, health, and behavioural data from participants to replace non-wear time accelerometer data, 2) determine the extent to which imputation of non-wear time data influences estimates of MVPA, and 3) determine if imputation of non-wear time data influences the associations between MVPA, body mass index (BMI), and systolic blood pressure (SBP). METHODS: Seven days of accelerometer data were collected using Actical accelerometers from 332 children aged 10-13. Three methods for handling missing accelerometer data were compared: 1) the “non-imputed” method, wherein non-wear time was deleted from the dataset; 2) imputation dataset I, wherein the imputation of MVPA during non-wear time was based upon socio-demographic factors of the participant (e.g., age), health information (e.g., BMI), and time characteristics of the non-wear period (e.g., season); and 3) imputation dataset II, wherein the imputation of MVPA was based upon the same variables as imputation dataset I, plus organized sport information. Associations between MVPA and health outcomes under each method were assessed using linear regression. RESULTS: Non-wear time accounted for 7.5% of epochs during waking hours. The average minutes/day of MVPA was 56.8 (95% CI: 54.2, 59.5) in the non-imputed dataset, 58.4 (95% CI: 55.8, 61.0) in imputation dataset I, and 59.0 (95% CI: 56.3, 61.5) in imputation dataset II. Estimates did not differ significantly between datasets. The strengths of the relationships between MVPA and BMI and between MVPA and SBP were comparable across all three datasets. CONCLUSION: These findings suggest that studies that achieve high accelerometer compliance with unsystematic patterns of missing data can use the traditional approach of deleting non-wear time from the dataset to obtain MVPA measures without substantial bias.
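A minimal sketch of the contrast evaluated above: daily MVPA estimated with non-wear time simply discarded versus with non-wear time filled in from a covariate-based prediction. The covariates, data, and simple rate model are illustrative stand-ins for the thesis's imputation approach.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hedged sketch: MVPA accrued during non-wear time is predicted from
# participant covariates, then daily MVPA is re-estimated. Variables and
# data are illustrative; the thesis's models used more covariates
# (e.g., BMI, season) and epoch-level data.
rng = np.random.default_rng(0)
n = 332
age = rng.uniform(10, 13, n)
sport = rng.integers(0, 2, n)                   # organized sport (yes/no)
true_mvpa = 40 + 3 * sport + 1.5 * (13 - age) + rng.normal(0, 8, n)
wear_frac = 1 - rng.uniform(0.0, 0.15, n)       # ~7.5% non-wear on average
observed_mvpa = true_mvpa * wear_frac           # MVPA missed during non-wear

# "Non-imputed": use the observed minutes as-is (biased low).
print(f"Non-imputed mean MVPA: {observed_mvpa.mean():.1f} min/day")

# Covariate-based imputation: predict each child's wear-time MVPA rate
# from covariates and fill in the non-wear fraction with that prediction.
X = np.column_stack([age, sport])
rate_model = LinearRegression().fit(X, observed_mvpa / wear_frac)
imputed = observed_mvpa + rate_model.predict(X) * (1 - wear_frac)
print(f"Imputed mean MVPA:     {imputed.mean():.1f} min/day")
```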