909 results for Differential item functioning


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Abstract taken from the publication

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The present research represents a coherent approach to understanding the root causes of ethnic group differences in ability test performance. Two studies were conducted, each of which was designed to address a key knowledge gap in the ethnic bias literature. In Study 1, both the LR Method of Differential Item Functioning (DIF) detection and Mixture Latent Variable Modelling were used to investigate the degree to which Differential Test Functioning (DTF) could explain ethnic group test performance differences in a large, previously unpublished dataset. Though mean test score differences were observed between a number of ethnic groups, neither technique was able to identify ethnic DTF. This calls into question the practical application of DTF to understanding these group differences. Study 2 investigated whether a number of non-cognitive factors might explain ethnic group test performance differences on a variety of ability tests. Two factors – test familiarity and trait optimism – were able to explain a large proportion of ethnic group test score differences. Furthermore, test familiarity was found to mediate the relationship between socio-economic factors – particularly participant educational level and familial social status – and test performance, suggesting that test familiarity develops over time through the mechanism of exposure to ability testing in other contexts. These findings represent a substantial contribution to the field’s understanding of two key issues surrounding ethnic test performance differences. The author calls for a new line of research into these performance facilitating and debilitating factors, before recommendations are offered for practitioners to ensure fairer deployment of ability testing in high-stakes selection processes.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

OBJECTIVE To evaluate the level of HIV/AIDS knowledge among men who have sex with men in Brazil using the latent trait model estimated by Item Response Theory. METHODS Multicenter, cross-sectional study, carried out in ten Brazilian cities between 2008 and 2009. Adult men who have sex with men were recruited (n = 3,746) through Respondent Driven Sampling. HIV/AIDS knowledge was ascertained through ten statements by face-to-face interview and latent scores were obtained through two-parameter logistic modeling (difficulty and discrimination) using Item Response Theory. Differential item functioning was used to examine each item characteristic curve by age and schooling. RESULTS Overall, the HIV/AIDS knowledge scores using Item Response Theory did not exceed 6.0 (scale 0-10), with mean and median values of 5.0 (SD = 0.9) and 5.3, respectively, with 40.7% of the sample with knowledge levels below the average. Some beliefs still exist in this population regarding the transmission of the virus by insect bites, by using public restrooms, and by sharing utensils during meals. With regard to the difficulty and discrimination parameters, eight items were located below the mean of the scale and were considered very easy, and four items presented very low discrimination parameter (< 0.34). The absence of difficult items contributed to the inaccuracy of the measurement of knowledge among those with median level and above. CONCLUSIONS Item Response Theory analysis, which focuses on the individual properties of each item, allows measures to be obtained that do not vary or depend on the questionnaire, which provides better ascertainment and accuracy of knowledge scores. Valid and reliable scales are essential for monitoring HIV/AIDS knowledge among the men who have sex with men population over time and in different geographic regions, and this psychometric model brings this advantage.
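
The two-parameter logistic model named in this abstract can be sketched as follows. This is a minimal illustration, not the study's code: the function names and the parameter values are hypothetical, chosen only to show how a low discrimination parameter flattens the item characteristic curve.

```python
import math


def icc_2pl(theta, a, b):
    """Probability of a correct response under the 2PL model.

    theta: latent trait (knowledge) level of the respondent
    a:     item discrimination
    b:     item difficulty
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def spread(a, b, low=-1.0, high=1.0):
    """Change in response probability between two trait levels."""
    return icc_2pl(high, a, b) - icc_2pl(low, a, b)


# Hypothetical parameters (not taken from the study): two easy items
# (difficulty below the scale mean) with different discrimination.
weak_item = {"a": 0.3, "b": -1.5}
strong_item = {"a": 1.5, "b": -1.5}

if __name__ == "__main__":
    for name, item in (("weak", weak_item), ("strong", strong_item)):
        print(name, round(spread(item["a"], item["b"]), 3))
```

An item with a small `a` separates respondents of different knowledge levels poorly, which is consistent with the abstract's remark that very easy, weakly discriminating items limit measurement accuracy at median levels and above.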

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Background: To develop and validate an item bank to measure mobility in older people in primary care and to analyse differential item functioning (DIF) and differential bundle functioning (DBF) by sex. Methods: A pool of 48 mobility items was administered by interview to 593 older people attending primary health care practices. The pool contained four domains based on the International Classification of Functioning: changing and maintaining body position, carrying, lifting and pushing, walking and going up and down stairs. Results: The Late Life Mobility item bank consisted of 35 items, and measured with a reliability of 0.90 or more across the full spectrum of mobility, except at the higher end of better functioning. No evidence was found of non-uniform DIF but uniform DIF was observed, mainly for items in the changing and maintaining body position and carrying, lifting and pushing domains. The walking domain did not display DBF, but the other three domains did, principally the carrying, lifting and pushing items. Conclusions: During the design and validation of an item bank to measure mobility in older people, we found that strength (carrying, lifting and pushing) items formed a secondary dimension that produced DBF. More research is needed to determine how best to include strength items in a mobility measure, or whether it would be more appropriate to design separate measures for each construct.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

OBJECTIVE: To evaluate the Brazilian version of WHOQOL-OLD Module and to test potential changes to the instrument to increase its psychometric adequacy. METHODS: A total of 424 older adults living in a city in Southern Brazil completed the WHOQOL-OLD instrument, in 2005. Rasch analysis was used to explore the psychometric performance of the scale, as implemented by the RUMM2020 software. Item-trait interaction, threshold disorders, presence of differential item functioning and item fit, were analyzed. RESULTS: Two ("death and dying" and "sensory abilities") out of six domains showed inadequate item-trait interactions. Rescoring the response scale and deleting the most misperforming items led to scale improvement. The evaluation of domains and items individually showed that the "intimacy" domain does perform well in contrast to the findings using the classical approach. In addition, the "sensory abilities" domain does not derive an interval measure in its current format. CONCLUSIONS: Unidimensionality and local independence were seen in all domains. Changes in the response scale and deletion of problematic items improved the scale's performance.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Abstract taken from the publication

Relevance:

100.00% 100.00%

Publisher:

Abstract:

OBJECTIVES: The aim of this study is to assess the structural and cross-cultural validity of the KIDSCREEN-27 questionnaire. METHODS: The 27-item version of the KIDSCREEN instrument was derived from a longer 52-item version and was administered to young people aged 8-18 years in 13 European countries in a cross-sectional survey. Structural and cross-cultural validity were tested using multitrait multi-item analysis, exploratory and confirmatory factor analysis, and Rasch analyses. Zumbo's logistic regression method was applied to assess differential item functioning (DIF) across countries. Reliability was assessed using Cronbach's alpha. RESULTS: Responses were obtained from n = 22,827 respondents (response rate 68.9%). For the combined sample from all countries, exploratory factor analysis with procrustean rotations revealed a five-factor structure which explained 56.9% of the variance. Confirmatory factor analysis indicated an acceptable model fit (RMSEA = 0.068, CFI = 0.960). The unidimensionality of all dimensions was confirmed (INFIT: 0.81-1.15). Differential item functioning (DIF) results across the 13 countries showed that 5 items presented uniform DIF whereas 10 displayed non-uniform DIF. Reliability was acceptable (Cronbach's alpha = 0.78-0.84 for individual dimensions). CONCLUSIONS: There was substantial evidence for the cross-cultural equivalence of the KIDSCREEN-27 across the countries studied and the factor structure was highly replicable in individual countries. Further research is needed to correct scores based on DIF results. The KIDSCREEN-27 is a new short and promising tool for use in clinical and epidemiological studies.
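
The logistic regression approach to DIF mentioned in this abstract is usually described as a comparison of nested models. The sketch below uses the binary-item form and notation introduced here for illustration ($T$ for the matching score, $G$ for the group indicator); for ordinal items such as Likert-type responses an ordinal (cumulative logit) analogue is normally used, and with several countries $G$ would be a set of dummy indicators. It is not the exact specification used in the study.

```latex
\begin{align}
\text{M1:}\quad \operatorname{logit} P(Y = 1) &= \beta_0 + \beta_1 T \\
\text{M2:}\quad \operatorname{logit} P(Y = 1) &= \beta_0 + \beta_1 T + \beta_2 G \\
\text{M3:}\quad \operatorname{logit} P(Y = 1) &= \beta_0 + \beta_1 T + \beta_2 G + \beta_3 (T \times G)
\end{align}
```

Under these assumptions, uniform DIF corresponds to a significant improvement of M2 over M1 ($\beta_2 \neq 0$), and non-uniform DIF to a significant improvement of M3 over M2 ($\beta_3 \neq 0$).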

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Over the past several decades, the prevalence of obesity has dramatically increased. Cause for concern has increased because overweight and obesity are major contributors to morbidity and mortality. Intervention research aimed at reducing the prevalence of obesity has identified the family, specifically the parent, as a key component of the home environment. However, findings from dietary behavior change interventions have been disheartening because few studies have reported meaningful change, suggesting methodological and/or measurement issues within the intervention process. A lack of appropriate mediators and cross-cultural equivalence may partially explain the reason for little change. The study aims were (1) to evaluate the psychometric properties and assess the cross-cultural equivalence of the Food Insecurity Scale (paper 1) and the modified Parent Feeding Practices Questionnaire (paper 2) and (2) to assess the overall relationships among food insecurity, parent mediators, and parent behaviors towards children's dietary behavior (paper 3) through structural equation modeling and tests of invariance. The study aims were accomplished through conducting secondary analyses using baseline data from English- and Spanish-speaking Hispanic women who participated in the Healthy Families: Step by Step (BHF) study. Results indicated that although the FIS and the mPFPQ exhibited sound psychometric properties, the instruments exhibited a lack of invariance across language-spoken groups. The lack of invariance was more pronounced in the FIS. Results also supported the theoretical framework identifying parents' perceived barriers and self-efficacy as mediators of parents' behaviors toward improving children's healthy eating.
Results did not suggest that the relationships were moderated by food insecurity. In conclusion, the identification of differential item functioning in food insecurity and parent feeding practices may be beneficial in enhancing tailored interventions through the incorporation of cultural differences into the change mechanisms. However, future research needs to be conducted to determine if the lack of invariance demonstrates the existence of item bias or if it is a reflection of true difference among the language-spoken groups. Additionally, obesity intervention studies targeting parent/family barriers and parent self-efficacy to provide/encourage healthy diets may result in an increase in parent behaviors which promote healthy eating behaviors among children. Future research should also examine a more complete causal pathway to determine whether parental changes in the mediators ultimately lead to an increase in healthy dietary behavior among children.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

There are very few studies in Spain that treat underachievement rigorously, and those that do are typically related to gifted students. The present study examined the proportion of underachieving students using the Rasch measurement model. A sample of 643 first-year high school students (mean age = 12.09; SD = 0.47) from 8 schools in the province of Alicante (Spain) completed the Battery of Differential and General Skills (Badyg), and these students' Grade Point Averages (GPAs) were supplied by teachers. Dichotomous and partial credit Rasch models were fitted. After adjusting the measurement instruments, the individual underachievement index identified 181 underachieving students, or 28.14% of the total sample, across the ability levels. This study confirms that the Rasch measurement model can accurately estimate the construct validity of both the intelligence test and the academic grades for the calculation of underachieving students. Furthermore, the present study constitutes a pioneering framework for the estimation of the prevalence of underachievement in Spain.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Background: The HCL-32 is a widely used screening questionnaire for hypomania. We aimed to use a Rasch analysis approach to (i) evaluate the measurement properties, principally unidimensionality, of the HCL-32, and (ii) generate a score table to allow researchers to convert raw HCL-32 scores into an interval-level measurement which will be more appropriate for statistical analyses. Methods: Subjects were part of the Bipolar Disorder Research Network (BDRN) study with DSM-IV bipolar disorder (n=389). Multidimensionality was assessed using the Rasch fit statistics and principal components analysis (PCA) of the residuals. Item invariance (differential item functioning, DIF) was tested for gender, bipolar diagnosis and current mental state. Item estimates and reliabilities were calculated. Results: Three items (29, 30, 32) had unacceptable fit to the Rasch unidimensional model. Item 14 displayed significant DIF for gender and items 8 and 17 for current mental state. Item estimates confirmed that not all items measure hypomania equally. Limitations: This sample was recruited as part of a large ongoing genetic epidemiology study of bipolar disorder and may not be fully representative of the broader clinical population of individuals with bipolar disorder. Conclusion: The HCL-32 is unidimensional in practice, but measurements may be further strengthened by the removal of four items. Re-scored linear measurements may be more appropriate for clinical research.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Relational reasoning, or the ability to identify meaningful patterns within any stream of information, is a fundamental cognitive ability associated with academic success across a variety of domains of learning and levels of schooling. However, the measurement of this construct has been historically problematic. For example, while the construct is typically described as multidimensional—including the identification of multiple types of higher-order patterns—it is most often measured in terms of a single type of pattern: analogy. For that reason, the Test of Relational Reasoning (TORR) was conceived and developed to include three other types of patterns that appear to be meaningful in the educational context: anomaly, antinomy, and antithesis. Moreover, as a way to focus on fluid relational reasoning ability, the TORR was developed to include, except for the directions, entirely visuo-spatial stimuli, which were designed to be as novel as possible for the participant. By focusing on fluid intellectual processing, the TORR was also developed to be fairly administered to undergraduate students—regardless of the particular gender, language, and ethnic groups they belong to. However, although some psychometric investigations of the TORR have been conducted, its actual fairness across those demographic groups has yet to be empirically demonstrated. Therefore, a systematic investigation of differential-item-functioning (DIF) across demographic groups on TORR items was conducted. A large (N = 1,379) sample, representative of the University of Maryland on key demographic variables, was collected, and the resulting data was analyzed using a multi-group, multidimensional item-response theory model comparison procedure. Using this procedure, no significant DIF was found on any of the TORR items across any of the demographic groups of interest. 
This null finding is interpreted as evidence of the cultural-fairness of the TORR, and potential test-development choices that may have contributed to that cultural-fairness are discussed. For example, the choice to make the TORR an untimed measure, to use novel stimuli, and to avoid stereotype threat in test administration, may have contributed to its cultural-fairness. Future steps for psychometric research on the TORR, and substantive research utilizing the TORR, are also presented and discussed.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The shift from decentralized to centralized A-level examinations (Abitur) was implemented in the German school system as a measure of Educational Governance in the last decade. This reform was mainly introduced with the intention of providing higher comparability of school examinations and student achievement as well as increasing fairness in school examinations. It is not known yet if these ambitious aims and functions of the new centralized examination format have been achieved and if fairer assessment can be guaranteed in terms of providing all students with the same opportunities to pass the examinations by allocating fair tests to different student subpopulations e.g., students of different background or gender. The research presented in this article deals with these questions and focuses on gender differences. It investigates gender-specific fairness of the test items in centralized Abitur examinations as high school exit examinations in Germany. The data are drawn from Abitur examinations in English (as a foreign language). Differential item functioning (DIF) analysis reveals that at least some parts of the examinations indicate gender inequality. (DIPF/Orig.)

Relevance:

90.00% 90.00%

Publisher:

Abstract:

Model misspecification affects the classical test statistics used to assess the fit of Item Response Theory (IRT) models. Robust tests have been derived under model misspecification, such as the Generalized Lagrange Multiplier and Hausman tests, but their use has not been widely explored in the IRT framework. In the first part of the thesis, we introduce the Generalized Lagrange Multiplier test to detect differential item response functioning in IRT models for binary data under model misspecification. By means of a simulation study and a real data analysis, we compare its performance with the classical Lagrange Multiplier test, computed using the Hessian and the cross-product matrix, and the Generalized Jackknife Score test. The power of these tests is computed empirically and asymptotically. The misspecifications considered are local dependence among items and non-normal distribution of the latent variable. The results highlight that, under mild model misspecification, all tests perform well while, under strong model misspecification, the performance of the tests deteriorates. None of the tests considered shows overall superior performance to the others. In the second part of the thesis, we extend the Generalized Hausman test to detect non-normality of the latent variable distribution. To build the test, we consider a seminonparametric-IRT model that assumes a more flexible latent variable distribution. By means of a simulation study and two real applications, we compare the performance of the Generalized Hausman test with the M2 limited information goodness-of-fit test and the Likelihood-Ratio test. Additionally, the information criteria are computed. The Generalized Hausman test performs better than the Likelihood-Ratio test in terms of Type I error rates and better than the M2 test in terms of power. The performance of the Generalized Hausman test and the information criteria deteriorates when the sample size is small and there are few items.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

In many applications (like social or sensor networks) the information generated can be represented as a continuous stream of RDF items, where each item describes an application event (social network post, sensor measurement, etc). In this paper we focus on compressing RDF streams. In particular, we propose an approach for lossless RDF stream compression, named RDSZ (RDF Differential Stream compressor based on Zlib). This approach takes advantage of the structural similarities among items in a stream by combining a differential item encoding mechanism with the general purpose stream compressor Zlib. Empirical evaluation using several RDF stream datasets shows that this combination produces gains in compression ratios with respect to using Zlib alone.
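
The general idea of pairing a differential item encoding with Zlib can be sketched as below. This is an illustrative toy, not the RDSZ algorithm: the abstract does not give its encoding details, so the delta format (`set`/`del`), the function names, and the flat dictionary representation of an item are all assumptions made for this example.

```python
import json
import zlib


def diff_encode(items):
    """Represent each item by what changed relative to the previous item."""
    prev = {}
    deltas = []
    for item in items:
        changed = {k: v for k, v in item.items() if k not in prev or prev[k] != v}
        removed = [k for k in prev if k not in item]
        deltas.append({"set": changed, "del": removed})
        prev = item
    return deltas


def diff_decode(deltas):
    """Rebuild the original items by replaying the deltas in order."""
    prev = {}
    items = []
    for delta in deltas:
        current = {k: v for k, v in prev.items() if k not in delta["del"]}
        current.update(delta["set"])
        items.append(current)
        prev = current
    return items


def compress(items):
    """Differentially encode the items, then compress the result with zlib."""
    payload = json.dumps(diff_encode(items), separators=(",", ":"))
    return zlib.compress(payload.encode("utf-8"))


def decompress(blob):
    """Invert compress(): zlib-decompress, then undo the differential encoding."""
    return diff_decode(json.loads(zlib.decompress(blob).decode("utf-8")))


# Hypothetical sensor stream: consecutive items share most of their structure.
stream = [
    {"sensor": "s1", "property": "temperature", "unit": "C", "value": "21.5"},
    {"sensor": "s1", "property": "temperature", "unit": "C", "value": "21.7"},
    {"sensor": "s2", "property": "temperature", "unit": "C", "value": "19.0"},
]

if __name__ == "__main__":
    blob = compress(stream)
    print(decompress(blob) == stream)
```

Because structurally similar items collapse to small deltas before Zlib sees them, the scheme is lossless by construction; whether it improves the compression ratio on a given dataset is an empirical question, as the abstract notes.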