959 results for not missing at random


Relevance:

100.00%

Abstract:

The Forkhead box transcription factor FoxP3 is pivotal to the development and function of regulatory T cells (Tregs), which make a major contribution to peripheral tolerance. FoxP3 is believed to perform a regulatory role in all the vertebrate species in which it has been detected. The prevailing view is that FoxP3 is absent in birds and that avian Tregs rely on alternative developmental and suppressive pathways. Prompted by the automated annotation of foxp3 in the ground tit (Parus humilis) genome, we have questioned this assumption. Our analysis of all available avian genomes has revealed that the foxp3 locus is missing, incomplete or of poor quality in the relevant genomic assemblies for nearly all avian species. Nevertheless, in two species, the peregrine falcon (Falco peregrinus) and the saker falcon (F. cherrug), there is compelling evidence for the existence of exons showing synteny with foxp3 in the ground tit. A broader phylogenomic analysis has shown that FoxP3 sequences from these three species are similar to sequences from crocodilians, the closest living relatives of birds. In both birds and crocodilians, we have also identified a highly proline-enriched region at the N terminus of FoxP3, a region previously identified only in mammals.

Relevance:

100.00%

Abstract:

Introduction: HIV testing is a cornerstone of efforts to combat the HIV epidemic, and testing conducted as part of surveillance provides invaluable data on the spread of infection and the effectiveness of campaigns to reduce the transmission of HIV. However, participation in HIV testing can be low, and if respondents systematically select not to be tested because they know or suspect they are HIV positive (and fear disclosure), standard approaches to deal with missing data will fail to remove selection bias. We implemented Heckman-type selection models, which can be used to adjust for missing data that are not missing at random, and established the extent of selection bias in a population-based HIV survey in an HIV hyperendemic community in rural South Africa.

Methods: We used data from a population-based HIV survey carried out in 2009 in rural KwaZulu-Natal, South Africa. In this survey, 5565 women (35%) and 2567 men (27%) provided blood for an HIV test. We accounted for missing data using interviewer identity as a selection variable which predicted consent to HIV testing but was unlikely to be independently associated with HIV status. Our approach involved using this selection variable to examine the HIV status of residents who would ordinarily refuse to test, except that they were allocated a persuasive interviewer. Our copula model allows for flexibility when modelling the dependence structure between HIV survey participation and HIV status.

Results: For women, our selection model generated an HIV prevalence estimate of 33% (95% CI 27–40) for all people eligible to consent to HIV testing in the survey. This estimate is higher than the estimate of 24% generated when only information from respondents who participated in testing is used in the analysis, and the estimate of 27% when imputation analysis is used to predict missing data on HIV status. For men, we found an HIV prevalence of 25% (95% CI 15–35) using the selection model, compared to 16% among those who participated in testing, and 18% estimated with imputation. We provide new confidence intervals that correct for the fact that the relationship between testing and HIV status is unknown and requires estimation.

Conclusions: We confirm the feasibility and value of adopting selection models to account for missing data in population-based HIV surveys and surveillance systems. Elements of survey design, such as interviewer identity, present the opportunity to adopt this approach in routine applications. Where non-participation is high, true confidence intervals are much wider than those generated by standard approaches to dealing with missing data suggest.
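The mechanism described in this abstract, where people who are HIV positive are less likely to consent to testing, can be illustrated with a small simulation. This is only a sketch of the selection bias itself, not the authors' Heckman-type or copula model. The prevalence and consent probabilities below are hypothetical values chosen for illustration and are not taken from the survey.

```python
import random


def simulate_survey(n=200_000, true_prev=0.30, p_consent_neg=0.40,
                    p_consent_pos=0.20, seed=0):
    """Simulate a survey in which consent to testing depends on HIV status.

    All parameter values are hypothetical. Returns the prevalence in the
    whole simulated population and the prevalence among those tested.
    """
    rng = random.Random(seed)
    total_pos = 0   # HIV-positive people in the whole population
    tested = 0      # people who consented to a test
    tested_pos = 0  # HIV-positive people among those tested
    for _ in range(n):
        positive = rng.random() < true_prev
        total_pos += positive
        # Consent is less likely for positives: missingness depends on the
        # unobserved outcome, i.e. data are not missing at random.
        p_consent = p_consent_pos if positive else p_consent_neg
        if rng.random() < p_consent:
            tested += 1
            tested_pos += positive
    return total_pos / n, tested_pos / tested


true_prevalence, complete_case_prevalence = simulate_survey()
print(f"population prevalence:   {true_prevalence:.3f}")
print(f"prevalence among tested: {complete_case_prevalence:.3f}")
```

Under these assumed consent rates, the estimate based only on tested respondents falls well below the population prevalence, which is the direction of bias the abstract reports for both women and men.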

Relevance:

100.00%

Abstract:

1. Comparative analyses are used to address the key question of what makes a species more prone to extinction by exploring the links between vulnerability and intrinsic species’ traits and/or extrinsic factors. This approach requires comprehensive species data but information is rarely available for all species of interest. As a result comparative analyses often rely on subsets of relatively few species that are assumed to be representative samples of the overall studied group. 2. Our study challenges this assumption and quantifies the taxonomic, spatial, and data type biases associated with the quantity of data available for 5415 mammalian species using the freely available life-history database PanTHERIA. 3. Moreover, we explore how existing biases influence results of comparative analyses of extinction risk by using subsets of data that attempt to correct for detected biases. In particular, we focus on links between four species’ traits commonly linked to vulnerability (distribution range area, adult body mass, population density and gestation length) and conduct univariate and multivariate analyses to understand how biases affect model predictions. 4. Our results show important biases in data availability with c.22% of mammals completely lacking data. Missing data, which appear to be not missing at random, occur frequently in all traits (14–99% of cases missing). Data availability is explained by intrinsic traits, with larger mammals occupying bigger range areas being the best studied. Importantly, we find that existing biases affect the results of comparative analyses by overestimating the risk of extinction and changing which traits are identified as important predictors. 5. Our results raise concerns over our ability to draw general conclusions regarding what makes a species more prone to extinction. 
Missing data represent a prevalent problem in comparative analyses and, unfortunately, because data are not missing at random, conventional approaches to filling data gaps are not valid or present important challenges. These results show the importance of making appropriate inferences from comparative analyses by focusing on the subset of species for which data are available. Ultimately, addressing the data bias problem requires greater investment in data collection and dissemination, as well as the development of methodological approaches to effectively correct existing biases.

Relevance:

100.00%

Abstract:

Ethnic violence appears to be the major source of violence in the world. Ethnic hostilities are potentially all-pervasive because most countries in the world are multi-ethnic. Public health's focus on violence documents its increasing role in this issue.

The present study is based on a secondary analysis of a dataset of responses by 272 individuals from four ethnic groups (Anglo, African, Mexican, and Vietnamese Americans) who answered questions regarding variables related to ethnic violence from a general questionnaire which was distributed to ethnically diverse purposive, nonprobability, self-selected groups of individuals in Houston, Texas, in 1993.

One goal was psychometric: learning about issues in analysis of datasets with modest numbers, comparison of two approaches to dealing with missing observations not missing at random (conducting analysis on two datasets), transformation analysis of continuous variables for logistic regression, and logistic regression diagnostics.

Regarding the psychometric goal, it was concluded that measurement model analysis was not possible with a relatively small dataset with nonnormal variables, such as Likert-scaled variables; therefore, exploratory factor analysis was used. The two approaches to dealing with missing values resulted in comparable findings. Transformation analysis suggested that the continuous variables were in the correct scale, and diagnostics suggested that the model fit was adequate.

The substantive portion of the analysis included the testing of four hypotheses. Hypothesis One proposed that attitudes/efficacy regarding alternative approaches to resolving grievances from the general questionnaire represented underlying factors: nonpunitive social norms and strategies for addressing grievances (using the political system, organizing protests, using the system to punish offenders, and personal mediation). Evidence was found to support all but one factor, nonpunitive social norms.

Hypothesis Two proposed that the factor variables and the other independent variables (jail, grievance, male, young, and membership in a particular ethnic group) were associated with (non)violence. Jail, grievance, and not using the political system to address grievances were associated with a greater likelihood of intergroup violence.

No evidence was found to support Hypotheses Three and Four, which proposed that grievance and ethnic group membership would interact with other variables (e.g., age and gender) to produce variant levels of subgroup (non)violence.

The generalizability of the results of this study is constrained by the purposive, self-selected nature of the sample and the small sample size (n = 272).

Suggestions for future research include incorporating other possible variables or factors predictive of intergroup violence in models of the kind tested here, and the development and evaluation of interventions that promote electoral and nonelectoral political participation as means of reducing interethnic conflict.

Relevance:

100.00%

Abstract:

OBJECTIVES: To compare three different methods of falls reporting and examine the characteristics of the data missing from the hospital incident reporting system. DESIGN: Fourteen-month prospective observational study nested within a randomized controlled trial. SETTING: Rehabilitation, stroke, medical, surgical, and orthopedic wards in Perth and Brisbane, Australia. PARTICIPANTS: Fallers (n = 153) who were part of a larger trial (1,206 participants, mean age 75.1 ± 11.0). MEASUREMENTS: Three falls events reporting measures: participants' self-report of fall events, fall events reported in participants' case notes, and falls events reported through the hospital reporting systems. RESULTS: The three reporting systems identified 245 falls events in total. Participants' case notes captured 226 (92.2%) falls events, hospital incident reporting systems captured 185 (75.5%) falls events, and participant self-report captured 147 (60.2%) falls events. Falls events were significantly less likely to be recorded in hospital reporting systems when a participant sustained a subsequent fall (P = .01) or when the fall occurred in the morning shift (P = .01) or afternoon shift (P = .01). CONCLUSION: Falls data missing from hospital incident report systems are not missing completely at random and therefore will introduce bias in some analyses if the factor investigated is related to whether the data are missing. Multimodal approaches to collecting falls data are preferable to relying on a single source alone.

Relevance:

100.00%

Abstract:

We review some issues related to the implications of different missing data mechanisms on statistical inference for contingency tables and consider simulation studies to compare the results obtained under such models to those where the units with missing data are disregarded. We confirm that although, in general, analyses under the correct missing at random and missing completely at random models are more efficient even for small sample sizes, there are exceptions where they may not improve the results obtained by ignoring the partially classified data. We show that under the missing not at random (MNAR) model, estimates on the boundary of the parameter space as well as lack of identifiability of the parameters of saturated models may be associated with undesirable asymptotic properties of maximum likelihood estimators and likelihood ratio tests; even in standard cases the bias of the estimators may be low only for very large samples. We also show that the probability of a boundary solution obtained under the correct MNAR model may be large even for large samples and that, consequently, we may not always conclude that an MNAR model is misspecified because the estimate is on the boundary of the parameter space.

Relevance:

100.00%

Abstract:

Objective: In this secondary data analysis, three statistical methodologies were implemented to handle cases with missing data in a motivational interviewing and feedback study. The aim was to evaluate the impact that these methodologies have on the data analysis.

Methods: We first evaluated whether the assumption of missing completely at random held for this study. We then conducted a secondary data analysis using a mixed linear model, handling missing data with three methodologies: (a) complete case analysis, (b) multiple imputation with an explicit model containing outcome variables, time, and the interaction of time and treatment, and (c) multiple imputation with an explicit model containing outcome variables, time, the interaction of time and treatment, and additional covariates (e.g., age, gender, smoking, years in school, marital status, housing, race/ethnicity, and whether participants play on an athletic team). Several comparisons were conducted, including the following: 1) the motivational interviewing with feedback group (MIF) vs. the assessment only group (AO), the motivational interviewing only group (MIO) vs. AO, and the feedback only group (FBO) vs. AO, 2) MIF vs. FBO, and 3) MIF vs. MIO.

Results: We first evaluated the patterns of missingness in this study, which indicated that about 13% of participants showed monotone missing patterns, and about 3.5% showed non-monotone missing patterns. Then we evaluated the assumption of missing completely at random with Little's missing completely at random (MCAR) test, in which the chi-square test statistic was 167.8 with 125 degrees of freedom and the associated p-value was p=0.006, which indicated that the data could not be assumed to be missing completely at random. After that, we compared whether the three different strategies reached the same results. For the comparison between MIF and AO as well as the comparison between MIF and FBO, only the multiple imputation with additional covariates by uncongenial and congenial models reached different results. For the comparison between MIF and MIO, all the methodologies for handling missing values obtained different results.

Discussion: The study indicated that, first, missingness was crucial in this study. Second, understanding the assumptions of the model was important, since we could not determine whether the data were missing at random or missing not at random. Therefore, future research should focus on exploring more sensitivity analyses under the missing not at random assumption.
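The p-value reported for Little's MCAR test can be sanity-checked from the chi-square statistic and degrees of freedom alone. The sketch below does not run Little's test itself, which needs the raw data. It only converts the reported statistic (167.8 on 125 degrees of freedom) into an upper-tail probability, using the Wilson-Hilferty normal approximation to the chi-square distribution so that nothing beyond the standard library is needed. The function name is illustrative.

```python
import math


def chi2_sf_wilson_hilferty(x, df):
    """Approximate P(X >= x) for X ~ chi-square(df).

    Uses the Wilson-Hilferty approximation: (X/df)^(1/3) is roughly normal
    with mean 1 - 2/(9*df) and variance 2/(9*df). This is an approximation,
    not an exact tail probability.
    """
    mean = 1.0 - 2.0 / (9.0 * df)
    sd = math.sqrt(2.0 / (9.0 * df))
    z = ((x / df) ** (1.0 / 3.0) - mean) / sd
    # Upper tail of the standard normal via the complementary error function.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Statistic and degrees of freedom as reported in the abstract.
p_value = chi2_sf_wilson_hilferty(167.8, 125)
print(f"approximate p-value: {p_value:.4f}")
```

The approximate tail probability is well below 0.05, in line with the abstract's conclusion that MCAR could not be assumed. Rejecting MCAR does not by itself distinguish MAR from MNAR, which is the point made in the discussion.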

Relevance:

100.00%

Abstract:

Maximizing data quality may be especially difficult in trauma-related clinical research. Strategies are needed to improve data quality and assess the impact of data quality on clinical predictive models. This study had two objectives. The first was to compare missing data between two multi-center trauma transfusion studies: a retrospective study (RS) using medical chart data with minimal data quality review and the PRospective Observational Multi-center Major Trauma Transfusion (PROMMTT) study with standardized quality assurance. The second objective was to assess the impact of missing data on clinical prediction algorithms by evaluating blood transfusion prediction models using PROMMTT data. RS (2005-06) and PROMMTT (2009-10) investigated trauma patients receiving ≥ 1 unit of red blood cells (RBC) from ten Level I trauma centers. Missing data were compared for 33 variables collected in both studies using mixed effects logistic regression (including random intercepts for study site). Massive transfusion (MT) patients received ≥ 10 RBC units within 24h of admission. Correct classification percentages for three MT prediction models were evaluated using complete case analysis and multiple imputation based on the multivariate normal distribution. A sensitivity analysis for missing data was conducted to estimate the upper and lower bounds of correct classification using assumptions about missing data under best and worst case scenarios. Most variables (17/33=52%) had <1% missing data in RS and PROMMTT. Of the remaining variables, 50% demonstrated less missingness in PROMMTT, 25% had less missingness in RS, and 25% were similar between studies. Missing percentages for MT prediction variables in PROMMTT ranged from 2.2% (heart rate) to 45% (respiratory rate). For variables missing >1%, study site was associated with missingness (all p≤0.021). Survival time predicted missingness for 50% of RS and 60% of PROMMTT variables. 
Complete case proportions for the MT models ranged from 41% to 88%. Complete case analysis and multiple imputation demonstrated similar correct classification results. Sensitivity analysis upper-lower bound ranges for the three MT models were 59-63%, 36-46%, and 46-58%. Prospective collection of ten-fold more variables with data quality assurance reduced overall missing data. Study site and patient survival were associated with missingness, suggesting that data were not missing completely at random, and complete case analysis may lead to biased results. Evaluating clinical prediction model accuracy may be misleading in the presence of missing data, especially with many predictor variables. The proposed sensitivity analysis estimating correct classification under upper (best case scenario)/lower (worst case scenario) bounds may be more informative than multiple imputation, which provided results similar to complete case analysis.
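The best-case/worst-case idea in this abstract can be sketched as a simple bound calculation. The abstract does not say exactly how the bounds were constructed. The sketch below assumes the simplest version: in the worst case every patient with missing predictors is counted as misclassified, and in the best case every such patient is counted as correctly classified. The function name and the counts in the example are hypothetical.

```python
def classification_bounds(n_correct, n_complete, n_missing):
    """Bounds on the correct-classification proportion under missing data.

    n_correct  -- correctly classified patients among the complete cases
    n_complete -- patients with all predictors observed
    n_missing  -- patients the model could not classify (missing predictors)

    Assumed construction (not taken from the abstract): worst case treats
    all missing cases as wrong, best case treats them all as right.
    """
    if n_complete <= 0 or n_missing < 0 or not 0 <= n_correct <= n_complete:
        raise ValueError("inconsistent counts")
    n_total = n_complete + n_missing
    lower = n_correct / n_total
    upper = (n_correct + n_missing) / n_total
    complete_case = n_correct / n_complete
    return lower, complete_case, upper


# Hypothetical counts: 500 complete cases, 300 classified correctly,
# and 100 patients with missing predictors.
lower, complete_case, upper = classification_bounds(300, 500, 100)
print(f"lower={lower:.3f} complete-case={complete_case:.3f} upper={upper:.3f}")
```

The width of the interval grows with the share of missing cases, which is why a bound of this kind can be more informative than a single complete-case or imputed figure when missingness is substantial.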

Relevance:

100.00%

Abstract:

One of the main concerns is the nature of the missing values. Let us consider the extremes for simplicity. If values are missing at random, we need not worry about them. But if the missingness shows structures that covary with substantive variables, we have to make decisions, and there are in fact several options to choose from. So far we have been speaking about one country and one mode. If the survey is cross-cultural (or, more precisely, cross-national) and uses mixed modes, many questions arise. The simplest one, for example: what are we comparing? Reports and books usually go straight into variable distributions and coefficient comparisons. This is possible because the analyst presumes a "tabula rasa" effect from the data collection procedures. But frequently this is not the real situation. This paper exposes the imprint of mixed modes on missing data in international surveys. This will help to evaluate how to deal with the problem and to consider the real meaning of observed cross-national differences.

Relevance:

90.00%

Abstract:

Historically a significant gap between male and female wages has existed in the Australian labour market. Indeed this wage differential was institutionalised in the 1912 arbitration decision which determined that the basic female wage would be set at between 54 and 66 per cent of the male wage. More recently however, the 1969 and 1972 Equal Pay Cases determined that male/female wage relativities should be based upon the premise of equal pay for work of equal value. It is important to note that the mere observation that average wages differ between males and females is not sine qua non evidence of sex discrimination. Economists restrict the definition of wage discrimination to cases where two distinct groups receive different average remuneration for reasons unrelated to differences in productivity characteristics. This paper extends previous studies of wage discrimination in Australia (Chapman and Mulvey, 1986; Haig, 1982) by correcting the estimated male/female wage differential for the existence of non-random sampling. Previous Australian estimates of male/female human-capital-based wage specifications, together with estimates of the corresponding wage differential, all suffer from a failure to address this issue. If the sample of females observed to be working does not represent a random sample then the estimates of the male/female wage differential will be both biased and inconsistent.
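The abstract does not name the correction it uses. As an illustration only, and assuming a standard Heckman-type sample selection model, the reason a non-random sample of working women biases the wage equation can be written out as follows. The symbols ($x_i$, $z_i$, $\beta$, $\gamma$, $\rho$, $\sigma$) are notation introduced here and are not taken from the paper.

```latex
% Wage equation, observed only for women who work:
w_i = x_i'\beta + \varepsilon_i ,
\qquad \text{observed iff } z_i'\gamma + u_i > 0 ,
% with jointly normal errors:
\qquad
\begin{pmatrix} \varepsilon_i \\ u_i \end{pmatrix}
\sim N\!\left(
  \begin{pmatrix} 0 \\ 0 \end{pmatrix},
  \begin{pmatrix} \sigma^2 & \rho\sigma \\ \rho\sigma & 1 \end{pmatrix}
\right).

% Conditional mean of wages among those observed working:
E\left[\, w_i \mid z_i'\gamma + u_i > 0 \,\right]
  = x_i'\beta + \rho\,\sigma\,\lambda(z_i'\gamma),
\qquad
\lambda(c) = \frac{\phi(c)}{\Phi(c)} .
```

Under this model, a regression of wages on $x_i$ among working women omits the term $\rho\sigma\lambda(z_i'\gamma)$. Unless $\rho = 0$, the coefficient estimates, and therefore the estimated wage differential, are biased and inconsistent, which matches the claim in the abstract's final sentence.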

Relevance:

90.00%

Abstract:

• In December 1986 funds were approved to double the intensity of random breath testing (RBT) and provide publicity support for police efforts. These changes were considered necessary to make RBT effective.
• RBT methods were changed in the metropolitan area to enable block testing (pulling over a block of traffic rather than one or two cars), deployment of police to cut off escape routes, and testing by traffic patrols in all police subdivisions. Additional operators were trained for country RBT.
• A publicity campaign was developed, aimed mainly at male drivers aged 18-50. The campaign consisted of the “cardsharp” television commercials, radio commercials, newspaper articles, posters and pamphlets.
• Increased testing and the publicity campaigns were launched on 10 April 1987.
• Police tests increased by 92.5% in May to December 1987, compared with the same period in the previous four years.
• The detection rate for drinking drivers picked up by police who were cutting off escape routes was comparatively high, indicating that drivers were attempting to avoid RBT, and that this police method was effective at detecting these drivers.
• A telephone survey indicated that drivers were aware of the messages of the publicity campaign.
• The telephone survey also indicated that the target group had been exposed to high levels of RBT, as planned, and that fear of apprehension was the major factor deterring them from drink driving.
• A roadside survey of driver blood alcohol concentrations (BACs) by the University of Adelaide’s Road Accident Research Unit (RARU) showed that, between 10 p.m. and 3 a.m., the proportion of drivers in Adelaide with a BAC greater than or equal to 0.08 decreased by 42%.
• Drivers under 21 were identified as a possible problem area.
• Fatalities in the twelve-month period commencing May 1987 decreased by 18% in comparison with the previous twelve-month period, and by 13% in comparison with the average of the previous two twelve-month periods (commencing May 1985 and May 1986). There are indications that this trend is continuing.
• It is concluded that the increase in RBT, plus publicity, was successful in achieving its aims of reductions in drink driving and accidents.

Relevance:

90.00%

Abstract:

Random breath testing (RBT) was introduced in South Australia in 1981 with the intention of reducing the incidence of accidents involving alcohol. In April 1985, a Select Committee of the Upper House which had been established to “review the operation of random breath testing in this State and any other associated matters and report accordingly” presented its report. After consideration of this report, the Government introduced extensive amendments to those sections of the Motor Vehicles Act (MVA) and Road Traffic Act (RTA) which deal with RBT and drink driving penalties. The amended section 47da of the RTA requires that: “(5) The Minister shall cause a report to be prepared within three months after the end of each calendar year on the operation and effectiveness of this section and related sections during that calendar year. (6) The Minister shall, within 12 sitting days after receipt of a report under subsection (5), cause copies of the report to be laid before each House of Parliament.” This is the first such report. Whilst it deals with RBT over a full year, the changed procedures and improved flexibility allowed by the revision to the RTA were only introduced late in 1985 and then only to the extent that the existing resources would allow.

Relevance:

90.00%

Abstract:

Missing data are a common problem in epidemiological studies and, depending on how they arise, estimates of the parameters of interest may be biased. The literature offers several techniques for dealing with the issue, and multiple imputation has received particular attention in recent years. This dissertation presents the results of using multiple imputation in the context of the Estudo Pró-Saúde, a longitudinal study among technical and administrative staff of a university in Rio de Janeiro. In the first study, after simulating the occurrence of missing data, the participants' color/race variable was imputed and a previously established survival analysis model was applied, with self-reported history of uterine fibroids as the outcome. The procedure was replicated 100 times to determine the distribution of the coefficients and standard errors of the estimates for the variable of interest. Despite the cross-sectional nature of the data used here (baseline information from the Estudo Pró-Saúde, collected in 1999 and 2001), the participants' follow-up history was reconstructed from their reports, creating a situation in which the Cox proportional hazards model could be used. In the scenarios evaluated, imputation gave satisfactory results, including in the performance assessment that was carried out. The technique performed well when the missing data mechanism was MAR (Missing At Random) and the non-response rate was 10%. When the data were imputed and the estimates from the 10 generated datasets (m=10) were combined, the bias of the estimates was 0.0011 for the category "preta" (Black) and 0.0015 for "parda" (Brown), corroborating the efficiency of imputation in this scenario. Other configurations gave similar results.
In the second article, a tutorial is developed for applying multiple imputation in epidemiological studies, which should make the technique easier to use for Brazilian researchers not yet familiar with the procedure. The basic steps and decisions needed to impute a dataset are presented, and one of the scenarios used in the first study is given as an example of applying the technique. All analyses were conducted in the statistical software R, version 2.15, and the scripts used are presented at the end of the text.
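The step of combining estimates across the m imputed datasets is usually done with Rubin's rules. The abstract does not state which pooling rule was used, so the sketch below is an assumption. It is written in Python rather than the R scripts the dissertation refers to, and the coefficient values in the example are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m point estimates and their squared standard errors.

    Rubin's rules: the pooled estimate is the mean of the m estimates, and
    the total variance is W + (1 + 1/m) * B, where W is the mean
    within-imputation variance and B is the between-imputation variance.
    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists with at least two imputations")
    pooled = mean(estimates)
    within = mean(variances)
    between = variance(estimates)  # sample variance, denominator m - 1
    total = within + (1.0 + 1.0 / m) * between
    return pooled, total ** 0.5


# Hypothetical log-hazard-ratio estimates from five imputed datasets,
# each with squared standard error 0.01.
est = [0.50, 0.54, 0.46, 0.52, 0.48]
var = [0.01, 0.01, 0.01, 0.01, 0.01]
pooled_estimate, pooled_se = pool_rubin(est, var)
print(f"pooled estimate={pooled_estimate:.3f}, pooled SE={pooled_se:.4f}")
```

The between-imputation term is what separates multiple imputation from single imputation: it carries the uncertainty about the missing values into the pooled standard error.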