791 results for Missing values


Relevance:

60.00%

Publisher:

Abstract:

Surveys can collect important data that inform policy decisions and drive social science research. Large government surveys collect information from the U.S. population on a wide range of topics, including demographics, education, employment, and lifestyle. Analysis of survey data presents unique challenges. In particular, one needs to account for missing data, for complex sampling designs, and for measurement error. Conceptually, a survey organization could spend lots of resources getting high-quality responses from a simple random sample, resulting in survey data that are easy to analyze. However, this scenario often is not realistic. To address these practical issues, survey organizations can leverage the information available from other sources of data. For example, in longitudinal studies that suffer from attrition, they can use the information from refreshment samples to correct for potential attrition bias. They can use information from known marginal distributions or survey design to improve inferences. They can use information from gold standard sources to correct for measurement error.

This thesis presents novel approaches to combining information from multiple sources that address the three problems described above.

The first method addresses nonignorable unit nonresponse and attrition in a panel survey with a refreshment sample. Panel surveys typically suffer from attrition, which can lead to biased inference when basing analysis only on cases that complete all waves of the panel. Unfortunately, the panel data alone cannot inform the extent of the bias due to attrition, so analysts must make strong and untestable assumptions about the missing data mechanism. Many panel studies also include refreshment samples, which are data collected from a random sample of new individuals during some later wave of the panel. Refreshment samples offer information that can be utilized to correct for biases induced by nonignorable attrition while reducing reliance on strong assumptions about the attrition process. To date, these bias correction methods have not dealt with two key practical issues in panel studies: unit nonresponse in the initial wave of the panel and in the refreshment sample itself. As we illustrate, nonignorable unit nonresponse can significantly compromise the analyst's ability to use the refreshment samples for attrition bias correction. Thus, it is crucial for analysts to assess how sensitive their inferences, corrected for panel attrition, are to different assumptions about the nature of the unit nonresponse. We present an approach that facilitates such sensitivity analyses, both for suspected nonignorable unit nonresponse in the initial wave and in the refreshment sample. We illustrate the approach using simulation studies and an analysis of data from the 2007-2008 Associated Press/Yahoo News election panel study.

The second method incorporates informative prior beliefs about marginal probabilities into Bayesian latent class models for categorical data. The basic idea is to append synthetic observations to the original data such that (i) the empirical distributions of the desired margins match those of the prior beliefs, and (ii) the values of the remaining variables are left missing. The degree of prior uncertainty is controlled by the number of augmented records. Posterior inferences can be obtained via typical MCMC algorithms for latent class models, tailored to deal efficiently with the missing values in the concatenated data.

We illustrate the approach using a variety of simulations based on data from the American Community Survey, including an example of how augmented records can be used to fit latent class models to data from stratified samples.
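The augmented-records idea from the second method can be sketched in a few lines. This is only an illustration under stated assumptions, not the thesis's implementation: the function name `augmented_records`, the variable names (`education`, `age_group`, `sex`), and the prior probabilities are all hypothetical. The sketch builds synthetic rows whose margin for one variable matches a prior belief and leaves every other variable missing (`None`); a larger `n_aug` would correspond to a more confident prior.

```python
from collections import Counter


def augmented_records(prior_margin, n_aug, variable, other_variables):
    """Build n_aug synthetic records whose `variable` margin matches prior_margin.

    All other variables are left missing (None). Counts are allocated with the
    largest-remainder rule so that they always sum to exactly n_aug.
    """
    raw = {level: p * n_aug for level, p in prior_margin.items()}
    counts = {level: int(x) for level, x in raw.items()}
    leftover = n_aug - sum(counts.values())
    by_remainder = sorted(raw, key=lambda lv: raw[lv] - counts[lv], reverse=True)
    for level in by_remainder[:leftover]:
        counts[level] += 1

    records = []
    for level, k in counts.items():
        for _ in range(k):
            record = {name: None for name in other_variables}  # left missing
            record[variable] = level
            records.append(record)
    return records


# Hypothetical prior belief about an education margin (illustrative numbers).
prior = {"BA": 0.5, "MS": 0.3, "PhD": 0.2}
synthetic = augmented_records(prior, 10, "education", ["age_group", "sex"])
margin = Counter(r["education"] for r in synthetic)
```

In use, these rows would be appended to the observed data before model fitting, with the sampler treating the `None` entries as missing values to be imputed.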

The third method leverages the information from a gold standard survey to model reporting error. Survey data are subject to reporting error when respondents misunderstand the question or accidentally select the wrong response. Sometimes survey respondents knowingly select the wrong response, for example, by reporting a higher level of education than they actually have attained. We present an approach that allows an analyst to model reporting error by incorporating information from a gold standard survey. The analyst can specify various reporting error models and assess how sensitive their conclusions are to different assumptions about the reporting error process. We illustrate the approach using simulations based on data from the 1993 National Survey of College Graduates. We use the method to impute error-corrected educational attainments in the 2010 American Community Survey using the 2010 National Survey of College Graduates as the gold standard survey.

Relevance:

60.00%

Publisher:

Abstract:

Clustering algorithms, pattern mining techniques and associated quality metrics emerged as reliable methods for modeling learners’ performance, comprehension and interaction in given educational scenarios. The specific characteristics of the available data, such as missing values, extreme values or outliers, make it challenging to extract user models that are meaningful from an educational perspective. In this paper we introduce a pattern detection mechanism within our data analytics tool based on k-means clustering and on the SSE, silhouette, Dunn index and Xie-Beni index quality metrics. Experiments performed on a dataset obtained from our online e-learning platform show that the extracted interaction patterns were representative in classifying learners. Furthermore, the performed monitoring activities created a strong basis for generating automatic feedback to learners in terms of their course participation, while relying on their previous performance. In addition, our analysis introduces automatic triggers that highlight learners who will potentially fail the course, enabling tutors to take timely actions.
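To make the clustering-plus-quality-metric pipeline concrete, here is a minimal standard-library sketch of k-means together with two of the metrics named above, SSE and the mean silhouette. It is not the paper's tool: the function names and the six toy points are assumptions for illustration, the Dunn and Xie-Beni indices are omitted, and `math.dist` requires Python 3.8 or later.

```python
import math
import random


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids drawn from the data
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(axis) / len(members)
                                     for axis in zip(*members)))
            else:
                updated.append(centroids[j])  # keep an empty cluster's centroid
        if updated == centroids:  # converged: labels match these centroids
            break
        centroids = updated
    return labels, centroids


def sse(points, labels, centroids):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))


def mean_silhouette(points, labels):
    """Average silhouette width; singleton clusters score 0."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1 or len(clusters) < 2:
            scores.append(0.0)
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for other_lab, other in clusters.items() if other_lab != lab)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy "interaction" features for six hypothetical learners.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = kmeans(points, k=2)
total_sse = sse(points, labels, centroids)
silhouette = mean_silhouette(points, labels)
```

A lower SSE and a silhouette closer to 1 both indicate tighter, better-separated clusters, which is why such metrics are typically compared across several values of `k`.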

Relevance:

60.00%

Publisher:

Abstract:

This study tests the psychometric properties of a questionnaire that assesses students' perceptions of teacher feedback, students' school identification, school trajectories (facts and expectations), and students' perceptions of their own school engagement. The questionnaire was administered to 1089 students in grades 6, 7, 9 and 10 (M = 13.4, SD = 1.7), 52% of whom were female. The sample consists mainly of students of Portuguese nationality (95.9%). Based on the results of the factor analysis and following the theoretical rationale, a structure with eight main dimensions was obtained. The QFITE shows good internal consistency, with seven of the eight main dimensions obtaining values between .77 and .89. The psychometric analyses thus yield satisfactory values, and we conclude that the QFITE is a useful and adequate instrument for assessing students' school identification, behavioral school engagement, and students' perceptions of teacher feedback.

Relevance:

60.00%

Publisher:

Abstract:

Snapper (Pagrus auratus) is widely distributed throughout subtropical and temperate southern oceans and forms a significant recreational and commercial fishery in Queensland, Australia. Using data from government reports, media sources, popular publications and a government fisheries survey carried out in 1910, we compiled information on individual snapper fishing trips that took place prior to the commencement of fishery-wide organized data collection, from 1871 to 1939. In addition to extracting all available quantitative data, we translated qualitative information into bounded estimates and used multiple imputation to handle missing values, forming 287 records for which catch rate (snapper fisher⁻¹ h⁻¹) could be derived. Uncertainty was handled through a parametric maximum likelihood framework (a transformed trivariate Gaussian), which facilitated statistical comparisons between data sources. No statistically significant differences in catch rates were found among media sources and the government fisheries survey. Catch rates remained stable throughout the time series, averaging 3.75 snapper fisher⁻¹ h⁻¹ (95% confidence interval, 3.42–4.09) as the fishery expanded into new grounds. In comparison, a contemporary (1993–2002) south-east Queensland charter fishery produced an average catch rate of 0.4 snapper fisher⁻¹ h⁻¹ (95% confidence interval, 0.31–0.58). These data illustrate the productivity of a fishery during its earliest years of development and represent the earliest catch rate data globally for this species. By adopting a formalized approach to address issues common to many historical records – missing data, a lack of quantitative information and reporting bias – our analysis demonstrates the potential for historical narratives to contribute to contemporary fisheries management.

Relevance:

60.00%

Publisher:

Abstract:

Objectives: To investigate the association between effort-reward imbalance (ERI) at work and sedentary lifestyle. Methods: Cross-sectional data from the ongoing Finnish Public Sector Study related to 30 433 women and 7718 men aged 17-64 were used (n = 35 918 after exclusion of participants with missing values in covariates). From the responses to a questionnaire, an aggregated mean score for ERI in a work unit was assigned to each participant. The outcome was sedentary lifestyle defined as <2.00 metabolic equivalent task (MET) hours/day. Logistic regression with generalized estimating equations was used as an analysis method to include both individual and work unit level predictors in the models. Adjustments were made for age, marital status, occupational status, job contract, smoking, and heavy drinking. Results: Twenty five percent of women and 27% of men had a sedentary lifestyle. High individual level ERI was associated with a higher likelihood of sedentary lifestyle both among women (odds ratio (OR) = 1.08, 95% CI 1.01 to 1.16) and men (OR = 1.17, 95% CI 1.02 to 1.33). These associations were not explained by relevant confounders and they were also independent of work unit level job strain measured as a ratio of job demands and control. Conclusions: A mismatch between high occupational effort spent and low reward received in turn seems to be associated with an elevated risk of sedentary lifestyle, although this association is relatively weak.

Relevance:

40.00%

Publisher:

Abstract:

National Highway Traffic Safety Administration, Washington, D.C.

Relevance:

30.00%

Publisher:

Abstract:

In an ever-changing world, the adults of the future will be faced with many challenges. To cope with these challenges it seems apparent that values education will need to become paramount within a child's education. A considerable number of research studies have indicated that values education is a critical component within education (Lovat & Toomey, 2007b). Building on this research, Lovat (2006) claimed that values education was the missing link in quality teaching. The concept of quality teaching had risen to the fore within educational research literature in the late 20th century with the claim that it is the teacher who makes the difference in schooling (Hattie, 2004). Thus, if teachers make such a difference to student learning, achievement and well-being, then it must hold true that pre-service teacher education programmes are vital in ensuring the development of quality teachers for our schools. The gap that this current research programme addressed was to link the fields of values education, quality teaching and pre-service teacher education. This research programme aimed to determine the impact of a values-based pedagogy on the development of quality teaching dimensions within pre-service teacher education. The values-based pedagogy that was investigated in this research programme was Philosophy in the Classroom. The research programme adopted a nested case study design based on the constructivist-interpretative paradigm in examining a unit within a pre-service teacher education programme at a Queensland university. The methodology utilised was qualitative where the main source of data was via interviews. In total, 43 pre-service teachers participated in three studies in order to determine if their involvement in a unit where the focus was on introducing pre-service teachers to an explicit values-based pedagogy impacted on their knowledge, skills and confidence in terms of quality teaching dimensions.
The research programme was divided into three separate studies in order to address the two research questions: 1. In what ways do pre-service teachers perceive they are being prepared to become quality teachers? 2. Is there a connection between an explicit values-based pedagogy in pre-service teacher education and the development of pre-service teachers' understanding of quality teaching? Study One provided insight into 21 pre-service teachers' understandings of quality teaching. These 21 participants had not engaged in an explicit values-based pedagogy. Study Two involved the interviewing of 22 pre-service teachers at two separate points in time: prior to exposure to a unit that employed a values-explicit pedagogy, and after this subject's lecture content delivery. Study Three reported on and analysed individual case studies of five pre-service teachers who had participated in Study Two Time 1 and Time 2, as well as a third time following their field experience where they had practice in teaching the values-explicit pedagogy. The results of the research demonstrate that an explicit values-based pedagogy introduced into a teacher education programme has a positive impact on the development of pre-service teachers' understanding of quality teaching skills and knowledge. The teaching and practice of a values-based pedagogy positively impacted on pre-service teachers with increases of knowledge, skills and confidence demonstrated on the quality teaching dimensions of intellectual quality, a supportive classroom environment, recognition of difference, connectedness and values. These findings were reinforced through the comparison of pre-service teachers who had participated in the explicit values-based pedagogical approach, with a sample of pre-service teachers who had not engaged in this same values-based pedagogical approach. A solid values-based pedagogy and practice can and does enhance pre-service teachers' understanding of quality teaching.
These findings surrounding the use of a values-based pedagogy in pre-service teacher education to enhance quality teaching knowledge and skills have contributed theoretically to the field of educational research, as well as having practical implications for teacher education institutions and teacher educators.

Relevance:

30.00%

Publisher:

Abstract:

Objectives: Demonstrate the application of decision trees – classification and regression trees (CARTs), and their cousins, boosted regression trees (BRTs) – to understand structure in missing data. Setting: Data taken from employees at three different industry sites in Australia. Participants: 7915 observations were included. Materials and Methods: The approach was evaluated using an occupational health dataset comprising results of questionnaires, medical tests, and environmental monitoring. Statistical methods included standard statistical tests and the ‘rpart’ and ‘gbm’ packages for CART and BRT analyses, respectively, from the statistical software ‘R’. A simulation study was conducted to explore the capability of decision tree models in describing data with missingness artificially introduced. Results: CART and BRT models were effective in highlighting a missingness structure in the data, related to the type of data (medical or environmental), the site in which it was collected, the number of visits and the presence of extreme values. The simulation study revealed that CART models were able to identify variables and values responsible for inducing missingness. There was greater variation in variable importance for unstructured compared to structured missingness. Discussion: Both CART and BRT models were effective in describing structural missingness in data. CART models may be preferred over BRT models for exploratory analysis of missing data, and selecting variables important for predicting missingness. BRT models can show how values of other variables influence missingness, which may prove useful for researchers. Conclusion: Researchers are encouraged to use CART and BRT models to explore and understand missing data.
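The core idea, treating "is this value missing?" as the response and letting a tree find the predictors that explain it, can be sketched without any tree library. The study itself used the R packages rpart and gbm; the Python sketch below is a stand-in that searches for only the first, root-level split by Gini impurity, and the function names, the `fev1` column, and the six rows are hypothetical.

```python
def gini(flags):
    """Gini impurity of a list of booleans (True = value is missing)."""
    if not flags:
        return 0.0
    p = sum(flags) / len(flags)
    return 2 * p * (1 - p)


def best_missingness_split(rows, target, predictors):
    """Find the categorical split (predictor == value) that best separates
    records where `target` is missing from records where it is observed.

    Returns (gini_gain, predictor, value), or None if no split is possible.
    """
    missing = [row[target] is None for row in rows]
    parent = gini(missing)
    best = None
    for var in predictors:
        for value in sorted({row[var] for row in rows}):
            left = [m for row, m in zip(rows, missing) if row[var] == value]
            right = [m for row, m in zip(rows, missing) if row[var] != value]
            if not left or not right:
                continue
            weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            gain = parent - weighted
            if best is None or gain > best[0]:
                best = (gain, var, value)
    return best


# Hypothetical records: the measurement is never recorded at site "C".
rows = [
    {"site": "A", "type": "medical", "fev1": 3.1},
    {"site": "A", "type": "environmental", "fev1": 2.9},
    {"site": "B", "type": "medical", "fev1": 3.4},
    {"site": "B", "type": "environmental", "fev1": 3.0},
    {"site": "C", "type": "medical", "fev1": None},
    {"site": "C", "type": "environmental", "fev1": None},
]
gain, variable, value = best_missingness_split(rows, "fev1", ["site", "type"])
```

A full CART would apply this search recursively to each resulting subset and prune the tree; the single split is enough to show how a structured missingness pattern surfaces as a high-gain predictor.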

Relevance:

30.00%

Publisher:

Abstract:

Attrition in longitudinal studies can lead to biased results. The study is motivated by the unexpected observation that alcohol consumption decreased despite increased availability, which may be due to sample attrition of heavy drinkers. Several imputation methods have been proposed, but rarely compared in longitudinal studies of alcohol consumption. The imputation of consumption level measurements is computationally particularly challenging due to alcohol consumption being a semi-continuous variable (dichotomous drinking status and continuous volume among drinkers), and the non-normality of data in the continuous part. Data come from a longitudinal study in Denmark with four waves (2003-2006) and 1771 individuals at baseline. Five techniques for missing data are compared: last value carried forward (LVCF) as a single-imputation method, and hot deck, Heckman modelling, multivariate imputation by chained equations (MICE), and a Bayesian approach as multiple-imputation methods. Predictive mean matching was used to account for non-normality, where instead of imputing regression estimates, "real" observed values from similar cases are imputed. Methods were also compared by means of a simulated dataset. The simulation showed that the Bayesian approach yielded the least biased estimates for imputation. The finding of no increase in consumption levels despite a higher availability remained unaltered. Copyright (C) 2011 John Wiley & Sons, Ltd.
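Of the five techniques compared, last value carried forward is the simplest to state in code. The sketch below is an illustration only: the function name `lvcf` and the consumption values are made up, and `None` is assumed to mark a wave in which the respondent did not take part.

```python
def lvcf(waves):
    """Fill each missing wave with the most recent observed value.

    Waves before the first observation stay missing, because there is
    nothing to carry forward yet.
    """
    filled, last = [], None
    for value in waves:
        if value is not None:
            last = value
        filled.append(last)
    return filled


# Hypothetical weekly consumption over four waves for two respondents.
dropout_after_wave_1 = lvcf([12.0, None, None, None])
intermittent = lvcf([12.0, None, 8.5, None])
```

Because every dropout is frozen at the last reported level, LVCF assumes no change after attrition and produces a single completed dataset, which is why the abstract lists it as a single-imputation method rather than alongside the multiple-imputation methods.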

Relevance:

30.00%

Publisher:

Abstract:

Nearly all chemistry–climate models (CCMs) have a systematic bias of a delayed springtime breakdown of the Southern Hemisphere (SH) stratospheric polar vortex, implying insufficient stratospheric wave drag. In this study the Canadian Middle Atmosphere Model (CMAM) and the CMAM Data Assimilation System (CMAM-DAS) are used to investigate the cause of this bias. Zonal wind analysis increments from CMAM-DAS reveal systematic negative values in the stratosphere near 60°S in winter and early spring. These are interpreted as indicating a bias in the model physics, namely, missing gravity wave drag (GWD). The negative analysis increments remain at a nearly constant height during winter and descend as the vortex weakens, much like orographic GWD. This region is also where current orographic GWD parameterizations have a gap in wave drag, which is suggested to be unrealistic because of missing effects in those parameterizations. These findings motivate a pair of free-running CMAM simulations to assess the impact of extra orographic GWD at 60°S. The control simulation exhibits the cold-pole bias and delayed vortex breakdown seen in the CCMs. In the simulation with extra GWD, the cold-pole bias is significantly reduced and the vortex breaks down earlier. Changes in resolved wave drag in the stratosphere also occur in response to the extra GWD, which reduce stratospheric SH polar-cap temperature biases in late spring and early summer. Reducing the dynamical biases, however, results in degraded Antarctic column ozone. This suggests that CCMs that obtain realistic column ozone in the presence of an overly strong and persistent vortex may be doing so through compensating errors.

Relevance:

30.00%

Publisher:

Abstract:

When missing data occur in studies designed to compare the accuracy of diagnostic tests, a common, though naive, practice is to base the comparison of sensitivity, specificity, as well as of positive and negative predictive values on some subset of the data that fits into methods implemented in standard statistical packages. Such methods are usually valid only under the strong missing completely at random (MCAR) assumption and may generate biased and less precise estimates. We review some models that use the dependence structure of the completely observed cases to incorporate the information of the partially categorized observations into the analysis and show how they may be fitted via a two-stage hybrid process involving maximum likelihood in the first stage and weighted least squares in the second. We indicate how computational subroutines written in R may be used to fit the proposed models and illustrate the different analysis strategies with observational data collected to compare the accuracy of three distinct non-invasive diagnostic methods for endometriosis. The results indicate that even when the MCAR assumption is plausible, the naive partial analyses should be avoided.
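For reference, the naive complete-case comparison that this abstract cautions against amounts to the computation below. It is not the authors' two-stage maximum likelihood and weighted least squares procedure; the function names and the counts are illustrative assumptions, with `None` standing for an unrecorded test or reference result.

```python
def complete_case_counts(records):
    """Tabulate a 2x2 table from (test_result, reference_result) pairs,
    silently dropping any pair with a missing entry (the naive practice)."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for test, truth in records:
        if test is None or truth is None:
            continue  # discarded: this is only valid under MCAR
        key = ("t" if test == truth else "f") + ("p" if test else "n")
        counts[key] += 1
    return counts


def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity and predictive values from 2x2 counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Made-up data: 20 fully observed patients plus 5 with a missing result.
records = (
    [(True, True)] * 8 + [(False, True)] * 2
    + [(True, False)] * 1 + [(False, False)] * 9
    + [(None, True)] * 3 + [(True, None)] * 2
)
counts = complete_case_counts(records)
metrics = diagnostic_metrics(**counts)
```

The five partially observed patients contribute nothing here, which is the information the reviewed models are designed to recover.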

Relevance:

30.00%

Publisher:

Abstract:

Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)

Relevance:

30.00%

Publisher:

Abstract:

This Letter describes the search for an enhanced production rate of events with a charged lepton and a neutrino in high-energy pp collisions at the LHC. The analysis uses data collected with the CMS detector, with an integrated luminosity of 5.0 fb⁻¹ at √s=7 TeV, and a further 3.7 fb⁻¹ at √s=8 TeV. No evidence is found for an excess. The results are interpreted in terms of limits on a heavy charged gauge boson (W′) in the sequential standard model, a split universal extra dimension model, and contact interactions in the helicity-nonconserving model. For the last, values of the binding energy below 10.5 (8.8) TeV in the electron (muon) channel are excluded at a 95% confidence level. Interpreting the ℓν final state in terms of a heavy W′ with standard model couplings, masses below 2.90 TeV are excluded. © 2013 CERN.

Relevance:

30.00%

Publisher:

Abstract:

A search for diphoton events with large missing transverse energy is presented. The data were collected with the ATLAS detector in proton-proton collisions at √s=7 TeV at the CERN Large Hadron Collider and correspond to an integrated luminosity of 3.1 pb⁻¹. No excess of such events is observed above the standard model background prediction. In the context of a specific model with one universal extra dimension with compactification radius R and gravity-induced decays, values of 1/R < 729 GeV are excluded at the 95% confidence level, providing the most sensitive limit on this model to date.