949 results for monotone missing data
Abstract:
Customer satisfaction and retention are key issues for organizations in today’s competitive marketplace. As such, much research and revenue have been invested in developing accurate ways of assessing consumer satisfaction at both the macro (national) and micro (organizational) level, facilitating comparisons in performance both within and between industries. Since the instigation of the national customer satisfaction indices (CSI), partial least squares (PLS) has been used to estimate the CSI models in preference to structural equation models (SEM) because it does not rely on strict assumptions about the data. However, this choice was based upon some misconceptions about the use of SEMs and does not take into consideration more recent advances in SEM, including estimation methods that are robust to non-normality and missing data. In this paper, both SEM and PLS approaches were compared by evaluating perceptions of the Isle of Man Post Office Products and Customer service using a CSI format. The new robust SEM procedures were found to be advantageous over PLS. Product quality was found to be the only driver of customer satisfaction, while image and satisfaction were the only predictors of loyalty, thus arguing for the specificity of postal services.
Abstract:
The R-package “compositions” is a tool for advanced compositional analysis. Its basic functionality has seen some conceptual improvement, containing now some facilities to work with and represent ilr bases built from balances, and an elaborated subsystem for dealing with several kinds of irregular data: (rounded or structural) zeroes, incomplete observations and outliers. The general approach to these irregularities is based on subcompositions: for an irregular datum, one can distinguish a “regular” subcomposition (where all parts are actually observed and the datum behaves typically) and a “problematic” subcomposition (with those unobserved, zero or rounded parts, or else where the datum shows an erratic or atypical behaviour). Systematic classification schemes are proposed for both outliers and missing values (including zeros) focusing on the nature of irregularities in the datum subcomposition(s). To compute statistics with values missing at random and structural zeros, a projection approach is implemented: a given datum contributes to the estimation of the desired parameters only on the subcomposition where it was observed. For data sets with values below the detection limit, two different approaches are provided: the well-known imputation technique, and also the projection approach. To compute statistics in the presence of outliers, robust statistics are adapted to the characteristics of compositional data, based on the minimum covariance determinant approach. The outlier classification is based on four different models of outlier occurrence and Monte-Carlo-based tests for their characterization. Furthermore the package provides special plots helping to understand the nature of outliers in the dataset. Keywords: coda-dendrogram, lost values, MAR, missing data, MCD estimator, robustness, rounded zeros
Abstract:
The aim of this thesis is to predict the performance of PhD students at the University of Girona from the students' personal (background), attitudinal and social network characteristics. The population studied consists of third- and fourth-year PhD students and their thesis supervisors. To collect the data, a web questionnaire was designed, specifying its advantages and taking into account some traditional problems of non-coverage and non-response. A web questionnaire was chosen because of the complexity of the social network questions. The electronic questionnaire makes it possible, through a series of instructions, to reduce response time and make answering less burdensome. This web questionnaire is also self-administered, which, according to the literature, yields more honest answers than an interviewer-administered questionnaire. The quality of social network questions in web questionnaires is analysed for egocentric data. To this end, the reliability and validity of this type of question are computed, for the first time, by means of the Multitrait Multimethod model. Because the data are egocentric, they can be considered hierarchical, and for the first time a multilevel Multitrait Multimethod model is used. Reliability and validity can be obtained at the individual level (within group component) or at the group level (between group component), and they are used to carry out a meta-analysis with other European universities in order to analyse certain design characteristics of the questionnaire. These characteristics concern whether social network questions asked in web questionnaires are more reliable and valid when asked "by questions" or "by alters", whether all frequency labels are shown for the items or only the first and last ones, and whether it is better for the questionnaire design to be in colour or in black and white.
The quality of the social network as a whole is also analysed; in this specific case, the networks are the university's research groups. The problems of missing data in complete networks are addressed. A new alternative to the typical solutions of the egocentric network or proxy respondents is proposed. This new alternative, which we have named the "Nosduocentered Network", is based on two central actors in a network. When regression models are estimated, this "Nosduocentered network" has more predictive power for the performance of PhD students than the egocentric network. In addition, the correlations of the attitudinal variables are corrected for attenuation due to the small sample size. Finally, regressions are run for the three types of variables (background, attitudinal and social network), and these are then combined to analyse which best predicts the performance (measured by academic publications) of PhD students. The results lead us to conclude that the academic performance of PhD students depends on personal (background) and attitudinal variables. The results obtained are also compared with other published studies.
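As an illustration of what correcting a correlation for attenuation involves, the sketch below applies Spearman's classical formula, which divides the observed correlation by the square root of the product of the two reliabilities. This is only a hedged example: the function name `disattenuate` and all numbers are hypothetical, and the thesis may well have used a different (for instance model-based) correction.

```python
import math


def disattenuate(r_xy: float, rel_x: float, rel_y: float) -> float:
    """Spearman's correction for attenuation.

    r_xy  : observed correlation between two measured variables
    rel_x : reliability of the first measure (0 < rel_x <= 1)
    rel_y : reliability of the second measure (0 < rel_y <= 1)
    """
    if not (0.0 < rel_x <= 1.0 and 0.0 < rel_y <= 1.0):
        raise ValueError("reliabilities must lie in (0, 1]")
    # Measurement error shrinks the observed correlation by sqrt(rel_x * rel_y),
    # so dividing by that factor estimates the correlation between true scores.
    return r_xy / math.sqrt(rel_x * rel_y)


# Hypothetical values, not taken from the thesis.
corrected = disattenuate(r_xy=0.30, rel_x=0.64, rel_y=0.81)
print(round(corrected, 3))
```

With low reliabilities the corrected value can exceed 1 in small samples, which is one reason such corrections are usually reported alongside the uncorrected correlations.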
Abstract:
An improved algorithm for the generation of gridded window brightness temperatures is presented. The primary data source is the International Satellite Cloud Climatology Project, level B3 data, covering the period from July 1983 to the present. The algorithm takes window brightness temperatures from multiple satellites, both geostationary and polar orbiting, which have already been navigated and normalized radiometrically to the National Oceanic and Atmospheric Administration's Advanced Very High Resolution Radiometer, and generates 3-hourly global images on a 0.5 degrees by 0.5 degrees latitude-longitude grid. The gridding uses a hierarchical scheme based on spherical kernel estimators. As part of the gridding procedure, the geostationary data are corrected for limb effects using a simple empirical correction to the radiances, from which the corrected temperatures are computed. This is in addition to the application of satellite zenith angle weighting to downweight limb pixels in preference to nearer-nadir pixels. The polar orbiter data are windowed on the target time with temporal weighting to account for the noncontemporaneous nature of the data. Large regions of missing data are interpolated from adjacent processed images using a form of motion compensated interpolation based on the estimation of motion vectors using a hierarchical block matching scheme. Examples are shown of the various stages in the process. Also shown are examples of the usefulness of this type of data in GCM validation.
Abstract:
Relationships between the four families placed in the angiosperm order Fabales (Leguminosae, Polygalaceae, Quillajaceae, Surianaceae) were hitherto poorly resolved. We combine published molecular data for the chloroplast regions matK and rbcL with 66 morphological characters surveyed for 73 ingroup and two outgroup species, and use Parsimony and Bayesian approaches to explore matrices with different missing data. All combined analyses using Parsimony recovered the topology Polygalaceae (Leguminosae (Quillajaceae + Surianaceae)). Bayesian analyses with matched morphological and molecular sampling recover the same topology, but analyses based on other data recover a different Bayesian topology: ((Polygalaceae + Leguminosae) (Quillajaceae + Surianaceae)). We explore the evolution of floral characters in the context of the more consistent topology: Polygalaceae (Leguminosae (Quillajaceae + Surianaceae)). This reveals synapomorphies for (Leguminosae (Quillajaceae + Surianaceae)) as the presence of free filaments and marginal/ventral placentation, for (Quillajaceae + Surianaceae) as pentamery and apocarpy, and for Leguminosae the presence of an abaxial median sepal and unicarpellate gynoecium. An octamerous androecium is synapomorphic for Polygalaceae. The development of papilionate flowers, and the evolutionary context in which these phenotypes appeared in Leguminosae and Polygalaceae, shows that the morphologies are convergent rather than synapomorphic within Fabales.
Abstract:
Data augmentation is a powerful technique for estimating models with latent or missing data, but applications in agricultural economics have thus far been few. This paper showcases the technique in an application to data on milk market participation in the Ethiopian highlands. There, a key impediment to economic development is an apparently low rate of market participation. Consequently, economic interest centers on the “locations” of nonparticipants in relation to the market and their “reservation values” across covariates. These quantities are of policy interest because they provide measures of the additional inputs necessary in order for nonparticipants to enter the market. One quantity of primary interest is the minimum amount of surplus milk (the “minimum efficient scale of operations”) that the household must acquire before market participation becomes feasible. We estimate this quantity through routine application of data augmentation and Gibbs sampling applied to a random-censored Tobit regression. Incorporating random censoring markedly affects the marketable-surplus requirements of the household, but only slightly affects the covariate requirement estimates, and generally leads to more plausible policy estimates than those obtained from the zero-censored formulation.
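To make the data augmentation idea concrete, the sketch below shows only the imputation step of a Gibbs sampler for a censored (Tobit-type) regression: each censored observation receives a latent draw from a normal distribution truncated above at the censoring point, while uncensored observations are kept as observed. It is a simplified illustration under stated assumptions (a fixed, known censoring point, given means and standard deviation, hypothetical function names and data), not the random-censored formulation estimated in the paper.

```python
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for cdf / inverse cdf


def draw_truncated_above(mu, sigma, upper, rng):
    """Draw from N(mu, sigma^2) restricted to values <= upper (inverse-CDF method)."""
    p_upper = _STD_NORMAL.cdf((upper - mu) / sigma)
    # Uniform draw on (0, p_upper); the floor avoids calling inv_cdf(0).
    u = max(rng.random() * p_upper, 1e-12)
    return mu + sigma * _STD_NORMAL.inv_cdf(u)


def augment(y, mu, sigma, censor_point, rng):
    """One data-augmentation step: impute latent values for censored observations.

    y            : observed outcomes (censored ones equal censor_point)
    mu           : current conditional means, e.g. x_i' beta at this Gibbs iteration
    sigma        : current error standard deviation
    censor_point : value at which outcomes are censored (assumed known here)
    """
    latent = []
    for y_i, mu_i in zip(y, mu):
        if y_i <= censor_point:
            latent.append(draw_truncated_above(mu_i, sigma, censor_point, rng))
        else:
            latent.append(y_i)  # uncensored: the latent value is the observed one
    return latent


# Hypothetical data: two non-participants (zero sales) and one participant.
rng = random.Random(0)
y_obs = [0.0, 2.5, 0.0]
means = [1.0, 2.0, -0.5]
y_latent = augment(y_obs, means, sigma=2.0, censor_point=0.0, rng=rng)
```

In a full sampler this step would alternate with draws of the regression coefficients and the error variance conditional on the completed data, which then look like an ordinary linear regression.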
Abstract:
Considerable effort is presently being devoted to producing high-resolution sea surface temperature (SST) analyses with a goal of spatial grid resolutions as low as 1 km. Because grid resolution is not the same as feature resolution, a method is needed to objectively determine the resolution capability and accuracy of SST analysis products. Ocean model SST fields are used in this study as simulated “true” SST data and subsampled based on actual infrared and microwave satellite data coverage. The subsampled data are used to simulate sampling errors due to missing data. Two different SST analyses are considered and run using both the full and the subsampled model SST fields, with and without additional noise. The results are compared as a function of spatial scales of variability using wavenumber auto- and cross-spectral analysis. The spectral variance at high wavenumbers (smallest wavelengths) is shown to be attenuated relative to the true SST because of smoothing that is inherent to both analysis procedures. Comparisons of the two analyses (both having grid sizes of roughly ) show important differences. One analysis tends to reproduce small-scale features more accurately when the high-resolution data coverage is good but produces more spurious small-scale noise when the high-resolution data coverage is poor. Analysis procedures can thus generate small-scale features with and without data, but the small-scale features in an SST analysis may be just noise when high-resolution data are sparse. Users must therefore be skeptical of high-resolution SST products, especially in regions where high-resolution (~5 km) infrared satellite data are limited because of cloud cover.
Abstract:
Background: Dietary intervention studies suggest that flavan-3-ol intake can improve vascular function and reduce the risk of cardiovascular diseases (CVD). However, results from prospective studies failed to show a consistent beneficial effect. Objective: To investigate associations between flavan-3-ol intake and CVD risk in the Norfolk arm of the European Prospective Investigation into Cancer and Nutrition (EPIC-Norfolk). Design: Data were available from 24,885 (11,252 men; 13,633 women) participants, recruited between 1993 and 1997 into the EPIC-Norfolk study. Flavan-3-ol intake was assessed using 7-day food diaries and the FLAVIOLA Flavanol Food Composition database. Missing data for plasma cholesterol and vitamin C were imputed using multiple imputation. Associations between flavan-3-ol intake and blood pressure at baseline were determined using linear regression models. Associations with CVD risk were estimated using Cox regression analyses. Results: Median intake of total flavan-3-ols was 1034 mg/d (range: 0 – 8531 mg/d) for men and 970 mg/d (0 – 6695 mg/d) for women; median intake of flavan-3-ol monomers was 233 mg/d (0 – 3248 mg/d) for men and 217 mg/d (0 – 2712 mg/d) for women. There were no consistent associations between flavan-3-ol monomer intake and baseline systolic and diastolic blood pressure (BP). After 286,147 person-years of follow-up, there were 8463 cardiovascular events and 1987 CVD-related deaths; no consistent association between flavan-3-ol intake and CVD risk (HR 0.93, 95% CI: 0.87; 1.00; Q1 vs Q5) or mortality was observed (HR 0.93, 95% CI: 0.84; 1.04). Conclusions: Flavan-3-ol intake in EPIC-Norfolk is not sufficient to achieve a statistically significant reduction in CVD risk.
Abstract:
Background Cognitive–behavioural therapy (CBT) for childhood anxiety disorders is associated with modest outcomes in the context of parental anxiety disorder. Objectives This study evaluated whether or not the outcome of CBT for children with anxiety disorders in the context of maternal anxiety disorders is improved by the addition of (i) treatment of maternal anxiety disorders, or (ii) treatment focused on maternal responses. The incremental cost-effectiveness of the additional treatments was also evaluated. Design Participants were randomised to receive (i) child cognitive–behavioural therapy (CCBT); (ii) CCBT with CBT to target maternal anxiety disorders [CCBT + maternal cognitive–behavioural therapy (MCBT)]; or (iii) CCBT with an intervention to target mother–child interactions (MCIs) (CCBT + MCI). Setting An NHS university clinic in Berkshire, UK. Participants Two hundred and eleven children with a primary anxiety disorder, whose mothers also had an anxiety disorder. Interventions All families received eight sessions of individual CCBT. Mothers in the CCBT + MCBT arm also received eight sessions of CBT targeting their own anxiety disorders. Mothers in the MCI arm received 10 sessions targeting maternal parenting cognitions and behaviours. Non-specific interventions were delivered to balance groups for therapist contact. Main outcome measures Primary clinical outcomes were the child’s primary anxiety disorder status and degree of improvement at the end of treatment. Follow-up assessments were conducted at 6 and 12 months. Outcomes in the economic analyses were identified and measured using estimated quality-adjusted life-years (QALYs). QALYs were combined with treatment, health and social care costs and presented within an incremental cost–utility analysis framework with associated uncertainty. 
Results MCBT was associated with significant short-term improvement in maternal anxiety; however, after children had received CCBT, group differences were no longer apparent. CCBT + MCI was associated with a reduction in maternal overinvolvement and more confident expectations of the child. However, neither CCBT + MCBT nor CCBT + MCI conferred a significant post-treatment benefit over CCBT in terms of child anxiety disorder diagnoses [adjusted risk ratio (RR) 1.18, 95% confidence interval (CI) 0.87 to 1.62, p = 0.29; CCBT + MCI vs. control: adjusted RR 1.22, 95% CI 0.90 to 1.67, p = 0.20, respectively] or global improvement ratings (adjusted RR 1.25, 95% CI 1.00 to 1.59, p = 0.05; adjusted RR 1.20, 95% CI 0.95 to 1.53, p = 0.13). CCBT + MCI outperformed CCBT on some secondary outcome measures. Furthermore, primary economic analyses suggested that, at commonly accepted thresholds of cost-effectiveness, the probability that CCBT + MCI will be cost-effective in comparison with CCBT (plus non-specific interventions) is about 75%. Conclusions Good outcomes were achieved for children and their mothers across treatment conditions. There was no evidence of a benefit to child outcome of supplementing CCBT with either intervention focusing on maternal anxiety disorder or maternal cognitions and behaviours. However, supplementing CCBT with treatment that targeted maternal cognitions and behaviours represented a cost-effective use of resources, although the high percentage of missing data on some economic variables is a shortcoming. Future work should consider whether or not effects of the adjunct interventions are enhanced in particular contexts. The economic findings highlight the utility of considering the use of a broad range of services when evaluating interventions with this client group. Trial registration Current Controlled Trials ISRCTN19762288. 
Funding This trial was funded by the Medical Research Council (MRC) and Berkshire Healthcare Foundation Trust and managed by the National Institute for Health Research (NIHR) on behalf of the MRC–NIHR partnership (09/800/17) and will be published in full in Health Technology Assessment; Vol. 19, No. 38.
Abstract:
1. Comparative analyses are used to address the key question of what makes a species more prone to extinction by exploring the links between vulnerability and intrinsic species’ traits and/or extrinsic factors. This approach requires comprehensive species data but information is rarely available for all species of interest. As a result comparative analyses often rely on subsets of relatively few species that are assumed to be representative samples of the overall studied group. 2. Our study challenges this assumption and quantifies the taxonomic, spatial, and data type biases associated with the quantity of data available for 5415 mammalian species using the freely available life-history database PanTHERIA. 3. Moreover, we explore how existing biases influence results of comparative analyses of extinction risk by using subsets of data that attempt to correct for detected biases. In particular, we focus on links between four species’ traits commonly linked to vulnerability (distribution range area, adult body mass, population density and gestation length) and conduct univariate and multivariate analyses to understand how biases affect model predictions. 4. Our results show important biases in data availability with c.22% of mammals completely lacking data. Missing data, which appear to be not missing at random, occur frequently in all traits (14–99% of cases missing). Data availability is explained by intrinsic traits, with larger mammals occupying bigger range areas being the best studied. Importantly, we find that existing biases affect the results of comparative analyses by overestimating the risk of extinction and changing which traits are identified as important predictors. 5. Our results raise concerns over our ability to draw general conclusions regarding what makes a species more prone to extinction. 
Missing data represent a prevalent problem in comparative analyses and, unfortunately, because data are not missing at random, conventional approaches to filling data gaps are either not valid or present important challenges. These results show the importance of making appropriate inferences from comparative analyses by focusing on the subset of species for which data are available. Ultimately, addressing the data bias problem requires greater investment in data collection and dissemination, as well as the development of methodological approaches to effectively correct existing biases.
Abstract:
Background. There is emerging evidence that context is important for successful transfer of research knowledge into health care practice. The Alberta Context Tool (ACT) is a Canadian-developed, research-based instrument that assesses 10 modifiable concepts of organizational context considered important for health care professionals’ use of evidence. Swedish and Canadian health care have similarities in terms of organisational and professional aspects, suggesting that the ACT could be used for measuring context in Sweden. This paper reports on the translation of the ACT to Swedish and a testing of preliminary aspects of its validity, acceptability and reliability in Swedish elder care. Methods. The ACT was translated into Swedish and back-translated into English before being pilot tested in ten elder care facilities for response processes validity, acceptability and reliability (Cronbach’s alpha). Subsequently, further modification was performed. Results. In the pilot test, the nurses found the questions easy to respond to (52%) and relevant (65%), yet the questions’ clarity was mainly considered ‘neither clear nor unclear’ (52%). Missing data varied between 0 (0%) and 19 (12%) per item, the most common being 1 missing case per item (15 items). Internal consistency (Cronbach’s alpha > .70) was reached for 5 out of 8 contextual concepts. Translation and back translation identified 21 linguistic and semantic issues and 3 context-related deviations, resolved by developers and translators. Conclusion. Modifying an instrument is a detailed process, requiring time and consideration of the linguistic and semantic aspects of the instrument, and understanding of the context where the instrument was developed and where it is to be applied. A team, including the instrument’s developers, translators, and researchers is necessary to ensure a valid translation. 
This study suggests preliminary validity, reliability and acceptability evidence for the ACT when used with nurses in Swedish elder care.
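For readers unfamiliar with the reliability statistic used above, the sketch below computes Cronbach's alpha from a respondents-by-items score table using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name and the toy responses are hypothetical and are not data from the study; the sketch also assumes complete rows, so respondents with missing items would need to be handled beforehand (for example by listwise deletion).

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Assumes every row is complete (no missing items) and k >= 2.
    """
    k = len(rows[0])
    if k < 2:
        raise ValueError("at least two items are required")
    if any(len(r) != k for r in rows):
        raise ValueError("all rows must have the same number of items")
    # Sample variance of each item across respondents.
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical 5-point responses from six respondents to a three-item concept.
responses = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [3, 3, 3],
    [1, 2, 2],
    [4, 4, 5],
]
alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.2f}")
```

A concept would meet the threshold reported above when the computed value exceeds .70.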
Abstract:
We study semiparametric two-step estimators which have the same structure as parametric doubly robust estimators in their second step. The key difference is that we do not impose any parametric restriction on the nuisance functions that are estimated in a first stage, but retain a fully nonparametric model instead. We call these estimators semiparametric doubly robust estimators (SDREs), and show that they possess superior theoretical and practical properties compared to generic semiparametric two-step estimators. In particular, our estimators have substantially smaller first-order bias, allow for a wider range of nonparametric first-stage estimates, rate-optimal choices of smoothing parameters and data-driven estimates thereof, and their stochastic behavior can be well-approximated by classical first-order asymptotics. SDREs exist for a wide range of parameters of interest, particularly in semiparametric missing data and causal inference models. We illustrate our method with a simulation exercise.
Abstract:
The relationship between sanitation policies (access and quality) and health in Brazilian municipalities was estimated from 2003 to 2010 using a panel data model with corrections for missing data. The results suggest a limited effect of sanitation policy on health. Compared with results from the literature, we found that the worsening quality of water appears to be associated with increased rates of mortality and hospitalization for children up to one month of age. Improvements in sewage sanitation have reduced the mortality and morbidity rates in children aged one to four. Improved access to piped water is associated with decreased hospitalization related to dysentery and acute respiratory infections (ARI) and does not have an effect on child mortality. Finally, epidemiological transition is only supported by weak evidence, including a more intense effect of reduced access to sanitation in municipalities with the worst mortality and morbidity indicators. In most models, this theory has been rejected.
Abstract:
In this work we study the survival cure rate model proposed by Yakovlev (1993), considered in a competing risk setting. Covariates are introduced for modeling the cure rate and we allow some covariates to have missing values. We consider only the cases in which the missing covariates are categorical and implement the EM algorithm via the method of weights for maximum likelihood estimation. We present a Monte Carlo simulation experiment to compare the properties of the estimators based on this method with those of the estimators under the complete case scenario. We also evaluate, in this experiment, the impact on the parameter estimates when we increase the proportion of immune individuals and of censored individuals among the non-immune ones. We demonstrate the proposed methodology with a real data set involving the time until graduation for the undergraduate course of Statistics of the Universidade Federal do Rio Grande do Norte.
Abstract:
Few users of statistical packages are capable of analyzing unbalanced factorials properly, because introductory textbooks do not discuss this topic in detail. The present article is directed to agricultural researchers, with the purpose of clarifying the differences among several widely used programs. It shows how to test useful hypotheses about population means in models having qualitative and quantitative factors. The paper emphasizes the pitfalls of blindly applying packages without prior knowledge of the hypotheses being tested.