978 results for differential item functioning
Abstract:
Abstract taken from the publication
Abstract:
Abstract taken from the publication
Abstract:
The present research represents a coherent approach to understanding the root causes of ethnic group differences in ability test performance. Two studies were conducted, each of which was designed to address a key knowledge gap in the ethnic bias literature. In Study 1, both the LR Method of Differential Item Functioning (DIF) detection and Mixture Latent Variable Modelling were used to investigate the degree to which Differential Test Functioning (DTF) could explain ethnic group test performance differences in a large, previously unpublished dataset. Though mean test score differences were observed between a number of ethnic groups, neither technique was able to identify ethnic DTF. This calls into question the practical application of DTF to understanding these group differences. Study 2 investigated whether a number of non-cognitive factors might explain ethnic group test performance differences on a variety of ability tests. Two factors – test familiarity and trait optimism – were able to explain a large proportion of ethnic group test score differences. Furthermore, test familiarity was found to mediate the relationship between socio-economic factors – particularly participant educational level and familial social status – and test performance, suggesting that test familiarity develops over time through the mechanism of exposure to ability testing in other contexts. These findings represent a substantial contribution to the field’s understanding of two key issues surrounding ethnic test performance differences. The author calls for a new line of research into these performance facilitating and debilitating factors, before recommendations are offered for practitioners to ensure fairer deployment of ability testing in high-stakes selection processes.
Abstract:
Introduction: The Skin Self-Examination Attitude Scale (SSEAS) is a brief measure that allows for the assessment of attitudes in relation to skin self-examination. This study evaluated the psychometric properties of the SSEAS using Item Response Theory (IRT) methods in a large sample of men ≥ 50 years in Queensland, Australia. Methods: A sample of 831 men (420 intervention and 411 control) completed a telephone assessment at the 13-month follow-up of a randomized-controlled trial of a video-based intervention to improve skin self-examination (SSE) behaviour. Descriptive statistics (mean, standard deviation, item–total correlations, and Cronbach’s alpha) were compiled and difficulty parameters were computed with Winsteps using the polytomous Rasch Rating Scale Model (RRSM). An item-person (Wright) map of the SSEAS was examined for content coverage and item targeting. Results: The SSEAS has good psychometric properties, including good internal consistency (Cronbach’s alpha = 0.80) and fit with the model; no evidence of differential item functioning (DIF) due to experimental trial grouping was detected. Conclusions: The present study confirms the SSEA scale as a brief, useful and reliable tool for assessing attitudes towards skin self-examination in a population of men 50 years or older in Queensland, Australia. The 8-item scale shows unidimensionality, allowing levels of SSE attitude, and the item difficulties, to be ranked on a single continuous scale. In terms of clinical practice, it is very important to assess skin cancer self-examination attitude to identify people who may need a more extensive intervention to allow early detection of skin cancer.
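The internal-consistency figure reported above is Cronbach's alpha. As a minimal sketch of how that coefficient is computed from an item-by-respondent score matrix (the function name and the demo ratings are hypothetical and are not SSEAS data), one might write:

```python
from statistics import pvariance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("need at least two items")
    # Variance of each item across respondents.
    item_vars = [pvariance([r[i] for r in responses]) for i in range(k)]
    # Variance of the respondents' total scores.
    total_var = pvariance([sum(r) for r in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings from five respondents on four items (illustration only).
demo = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
]
print(round(cronbach_alpha(demo), 3))
```

Population variances are used for both the items and the totals; using sample variances throughout would give the same ratio and therefore the same alpha.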
Abstract:
Background: To develop and validate an item bank to measure mobility in older people in primary care and to analyse differential item functioning (DIF) and differential bundle functioning (DBF) by sex. Methods: A pool of 48 mobility items was administered by interview to 593 older people attending primary health care practices. The pool contained four domains based on the International Classification of Functioning: changing and maintaining body position; carrying, lifting and pushing; walking; and going up and down stairs. Results: The Late Life Mobility item bank consisted of 35 items and measured with a reliability of 0.90 or more across the full spectrum of mobility, except at the higher end of better functioning. No evidence was found of non-uniform DIF, but uniform DIF was observed, mainly for items in the changing and maintaining body position domain and the carrying, lifting and pushing domain. The walking domain did not display DBF, but the other three domains did, principally the carrying, lifting and pushing items. Conclusions: During the design and validation of an item bank to measure mobility in older people, we found that strength (carrying, lifting and pushing) items formed a secondary dimension that produced DBF. More research is needed to determine how best to include strength items in a mobility measure, or whether it would be more appropriate to design separate measures for each construct.
Abstract:
This paper proposes a framework for analysing performance on multiple-choice questions with a focus on linguistic factors. Item Response Theory (IRT) is used to estimate ability and question difficulty levels. A logistic regression model is used to detect questions that exhibit Differential Item Functioning. Probit models test the relationships between performance and linguistic factors, controlling for the effects of question construction and students’ background. The empirical results have important implications. The lexical density of stems affects performance. The use of specialised vocabulary from outside Economics has differing impacts on the performance of students with different language backgrounds. The IRT-based ability and difficulty estimates help explain performance variations.
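To illustrate the logistic regression step mentioned in this abstract, here is a minimal sketch of a uniform-DIF check for a single question. It rests on assumptions that are not in the abstract: the matching variable is a simple total score, group membership is coded 0/1, the data are invented, and all function names are hypothetical. The test compares a model with the score alone against a model with score plus group using a likelihood-ratio statistic (1 degree of freedom, so values above about 3.84 are significant at the 0.05 level).

```python
import math


def _solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_logistic(X, y, iters=50):
    """Newton-Raphson fit; returns (coefficients, maximised log-likelihood)."""
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(iters):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for xi, yi in zip(X, y):
            mu = 1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, xi))))
            w = mu * (1.0 - mu)
            for j in range(p):
                grad[j] += (yi - mu) * xi[j]
                for k in range(p):
                    hess[j][k] += w * xi[j] * xi[k]
        step = _solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < 1e-10:
            break
    ll = 0.0
    for xi, yi in zip(X, y):
        eta = sum(b * v for b, v in zip(beta, xi))
        ll += yi * eta - math.log(1.0 + math.exp(eta))
    return beta, ll


def uniform_dif_lr(scores, groups, correct):
    """Likelihood-ratio statistic (1 df) and coefficient for the group term."""
    base = [[1.0, s] for s in scores]
    full = [[1.0, s, g] for s, g in zip(scores, groups)]
    _, ll0 = fit_logistic(base, correct)
    beta, ll1 = fit_logistic(full, correct)
    return 2.0 * (ll1 - ll0), beta[2]


def make_data(correct_ref, correct_focal, per_cell):
    """Build invented data: counts of correct answers per score level and group."""
    scores, groups, correct = [], [], []
    for g, counts in ((0, correct_ref), (1, correct_focal)):
        for s, k in enumerate(counts):
            for i in range(per_cell):
                scores.append(float(s))
                groups.append(float(g))
                correct.append(1 if i < k else 0)
    return scores, groups, correct


# Invented example: the focal group answers correctly less often at every score.
s, g, y = make_data([8, 16, 24, 32, 36], [4, 8, 12, 20, 28], 40)
stat, coef = uniform_dif_lr(s, g, y)
print("LR statistic:", round(stat, 2), "flagged:", stat > 3.84)
```

Adding a score-by-group interaction column to the design matrix would extend the same comparison to non-uniform DIF. In practice a statistics package would normally replace the hand-written Newton-Raphson fit.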
Abstract:
In non-Western societies, research on social development and personality change has focused on economic development and social modernization. The present study aimed to explore the relationship between social transformation and personality change among Chinese people using an indigenous personality measure, the CPAI (Chinese Personality Assessment Inventory). The influence of the CPAI instrument itself and of measurement theory was also taken into consideration. In Study 1, two sets of CPAI data collected 10 years apart were analyzed. The CPAI-2 data were also analyzed in terms of the modernization level of the cities from which the data were collected. However, this study did not consider the measurement equivalence of the CPAI. In Study 2, we tested for DIF (differential item functioning) across the period groups to confirm whether the CPAI was equivalent for people in different periods. Both CTT and IRT methods were used. The results indicated that some items showed DIF. In Study 3, to ensure that the personality measurement was fair to people in different periods, we retained only those items whose DIF effect size was lower than 0.01 and used an IRT method to estimate test-takers' personality. Cohort analysis was then used to explore the pattern of personality change among Chinese people. In Study 4, we factor-analyzed the DIF items to examine the relation between social transformation and the latent personality variables composed of the DIF items.
From these four studies, we drew the following conclusions: (1) The 22 CPAI traits could be divided into two categories. Type I traits did not change with age, period, and cohort: Logical vs Affective Orientation, Enterprise, Responsibility, Inferiority vs Self-Acceptance, Optimism vs Pessimism, Face, Family, Defensive, and Graciousness vs Meanness. Type II traits did change with age, period, and cohort: Leadership, Self vs. Social Orientation, Veraciousness vs Slickness, Traditionalism vs Modernity, Harmony, Renqing, Meticulousness, Extraversion vs Introversion, Emotionality, Practical Mindedness, Internal vs External Locus of Control, Thrift vs Extravagance, and Discipline. The DIF items measured five psychological characteristics that changed greatly with age, period, and cohort: a life attitude of cynicism-realism, psychological maladjustment, a coping style of Waiyuanneifang, self-efficacy, and the value of individualism. (2) In sum, Chinese people in 1992 were more traditional than those in 2001, and over 10 years of rapid development, in line with the needs of the market economy, Chinese people became more individualistic. (3) The CTT and IRT methods of DIF detection were comparable, but in general the IRT method was more accurate and valid in detecting DIF as well as in estimating personality. (4) The DIF results showed that the CPAI had good item validity. It is also possible to develop a subscale from CPAI items to assess some psychological characteristics. In the current study, personality traits and psychological characteristics could be divided into three categories according to their stability and variability, and the results supported the hypothesis of the “Six Factor Model”; these findings have some theoretical significance. In addition, exploring the relation between social development and personality change may help Chinese people cope with a rapidly changing society.
In this study, we also found that it is possible to develop a subscale from CPAI items to assess obverse personality traits, which has some practical use. Furthermore, the use of different measurement theories together with cohort analysis represents a methodological innovation.
Abstract:
Abstract taken from the publication
Testing the structural and cross-cultural validity of the KIDSCREEN-27 quality of life questionnaire
Abstract:
OBJECTIVES: The aim of this study is to assess the structural and cross-cultural validity of the KIDSCREEN-27 questionnaire. METHODS: The 27-item version of the KIDSCREEN instrument was derived from a longer 52-item version and was administered to young people aged 8-18 years in 13 European countries in a cross-sectional survey. Structural and cross-cultural validity were tested using multitrait multi-item analysis, exploratory and confirmatory factor analysis, and Rasch analyses. Zumbo's logistic regression method was applied to assess differential item functioning (DIF) across countries. Reliability was assessed using Cronbach's alpha. RESULTS: Responses were obtained from n = 22,827 respondents (response rate 68.9%). For the combined sample from all countries, exploratory factor analysis with procrustean rotations revealed a five-factor structure which explained 56.9% of the variance. Confirmatory factor analysis indicated an acceptable model fit (RMSEA = 0.068, CFI = 0.960). The unidimensionality of all dimensions was confirmed (INFIT: 0.81-1.15). Differential item functioning (DIF) results across the 13 countries showed that 5 items presented uniform DIF whereas 10 displayed non-uniform DIF. Reliability was acceptable (Cronbach's alpha = 0.78-0.84 for individual dimensions). CONCLUSIONS: There was substantial evidence for the cross-cultural equivalence of the KIDSCREEN-27 across the countries studied and the factor structure was highly replicable in individual countries. Further research is needed to correct scores based on DIF results. The KIDSCREEN-27 is a new short and promising tool for use in clinical and epidemiological studies.
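As a rough sketch of the logistic regression approach to DIF named in the methods, the item response is regressed in three nested steps on the scale score, the grouping variable, and their interaction. The notation below is illustrative and is not taken from the KIDSCREEN paper: $Y$ is the ordinal item response, $X$ the dimension score, and $G$ the grouping variable (with 13 countries, $G$ would in practice be a set of indicator variables rather than a single term).

```latex
\begin{align}
\text{Model 1:}\quad & \operatorname{logit} P(Y \le j) = \alpha_j + \beta_1 X \\
\text{Model 2:}\quad & \operatorname{logit} P(Y \le j) = \alpha_j + \beta_1 X + \beta_2 G \\
\text{Model 3:}\quad & \operatorname{logit} P(Y \le j) = \alpha_j + \beta_1 X + \beta_2 G + \beta_3 (X \times G)
\end{align}
```

Under this formulation, uniform DIF corresponds to $\beta_2 \neq 0$ (Model 2 fits better than Model 1), and non-uniform DIF corresponds to $\beta_3 \neq 0$ (Model 3 fits better than Model 2).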
Abstract:
Over the past several decades, the prevalence of obesity has dramatically increased. Cause for concern has increased because overweight and obesity are major contributors to morbidity and mortality. Intervention research aimed at reducing the prevalence of obesity has identified the family, specifically the parent, as a key component of the home environment. However, findings from dietary behavior change interventions have been disheartening because few studies have reported meaningful change, suggesting methodological and/or measurement issues within the intervention process. A lack of appropriate mediators and cross-cultural equivalence may partially explain the reason for little change. The study aims were to (1) evaluate the psychometric properties and assess the cross-cultural equivalence of the Food Insecurity Scale (FIS; paper 1) and the modified Parent Feeding Practices Questionnaire (mPFPQ; paper 2) and (2) assess the overall relationships among food insecurity, parent mediators, and parent behaviors towards children's dietary behavior (paper 3) through structural equation modeling and tests of invariance. The study aims were accomplished through conducting secondary analyses using baseline data from English- and Spanish-speaking Hispanic women who participated in the Healthy Families: Step by Step (BHF) study. Results indicated that although the FIS and the mPFPQ exhibited sound psychometric properties, the instruments exhibited a lack of invariance across language groups. The lack of invariance was more pronounced in the FIS. Results also supported the theoretical framework identifying parents' perceived barriers and self-efficacy as mediators of parents' behaviors toward improving children's healthy eating.
Results did not suggest that the relationships were moderated by food insecurity. In conclusion, the identification of differential item functioning in food insecurity and parent feeding practices may be beneficial in enhancing tailored interventions through the incorporation of cultural differences into the change mechanisms. However, future research needs to be conducted to determine whether the lack of invariance demonstrates the existence of item bias or reflects a true difference among the language groups. Additionally, obesity intervention studies targeting parent/family barriers and parent self-efficacy to provide/encourage healthy diets may result in an increase in parent behaviors which promote healthy eating behaviors among children. Future research should also examine a more complete causal pathway to determine whether parental changes in the mediators ultimately lead to an increase in healthy dietary behavior among children.
Abstract:
There are very few studies in Spain that treat underachievement rigorously, and those that do are typically related to gifted students. The present study examined the proportion of underachieving students using the Rasch measurement model. A sample of 643 first-year high school students (mean age = 12.09; SD = 0.47) from 8 schools in the province of Alicante (Spain) completed the Battery of Differential and General Skills (Badyg), and these students' grade point averages (GPAs) were obtained from teachers. Dichotomous and partial credit Rasch models were fitted. After adjusting the measurement instruments, the individual underachievement index identified a total of 181 underachieving students, or 28.14% of the total sample across the ability levels. This study confirms that the Rasch measurement model can accurately estimate the construct validity of both the intelligence test and the academic grades for the calculation of underachieving students. Furthermore, the present study constitutes a pioneering framework for the estimation of the prevalence of underachievement in Spain.
Abstract:
Background: The HCL-32 is a widely-used screening questionnaire for hypomania. We aimed to use a Rasch analysis approach to (i) evaluate the measurement properties, principally unidimensionality, of the HCL-32, and (ii) generate a score table to allow researchers to convert raw HCL-32 scores into an interval-level measurement which will be more appropriate for statistical analyses. Methods: Subjects were part of the Bipolar Disorder Research Network (BDRN) study with DSM-IV bipolar disorder (n=389). Multidimensionality was assessed using the Rasch fit statistics and principal components analysis (PCA) of the residuals. Item invariance (differential item functioning, DIF) was tested for gender, bipolar diagnosis and current mental state. Item estimates and reliabilities were calculated. Results: Three items (29, 30, 32) had unacceptable fit to the Rasch unidimensional model. Item 14 displayed significant DIF for gender and items 8 and 17 for current mental state. Item estimates confirmed that not all items measure hypomania equally. Limitations: This sample was recruited as part of a large ongoing genetic epidemiology study of bipolar disorder and may not be fully representative of the broader clinical population of individuals with bipolar disorder. Conclusion: The HCL-32 is unidimensional in practice, but measurements may be further strengthened by the removal of four items. Re-scored linear measurements may be more appropriate for clinical research.
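The raw-score-to-measure conversion described in aim (ii) can be sketched for the dichotomous Rasch model, where the maximum-likelihood person measure for a given raw score is the value at which the expected score equals that raw score. The following is only an illustration under stated assumptions: the item difficulties are invented (they are not HCL-32 estimates), the function names are hypothetical, and the authors' own score table may have been produced differently.

```python
import math


def rasch_prob(theta, b):
    """Probability of endorsing an item of difficulty b at person measure theta."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def expected_score(theta, difficulties):
    """Expected raw score at theta: the sum of endorsement probabilities."""
    return sum(rasch_prob(theta, b) for b in difficulties)


def raw_to_measure(raw, difficulties, lo=-10.0, hi=10.0, tol=1e-9):
    """Bisection search for the theta whose expected score equals the raw score."""
    if not 0 < raw < len(difficulties):
        raise ValueError("extreme scores have no finite estimate")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if expected_score(mid, difficulties) < raw:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical item difficulties in logits (illustration only).
items = [-1.5, -0.5, 0.0, 0.5, 1.5]
for raw in range(1, len(items)):
    print(raw, round(raw_to_measure(raw, items), 2))
```

Each printed row pairs a raw score with a logit measure, which is the kind of lookup a score table provides. The lowest and highest possible raw scores are excluded because they have no finite estimate under this model.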
Abstract:
Relational reasoning, or the ability to identify meaningful patterns within any stream of information, is a fundamental cognitive ability associated with academic success across a variety of domains of learning and levels of schooling. However, the measurement of this construct has been historically problematic. For example, while the construct is typically described as multidimensional—including the identification of multiple types of higher-order patterns—it is most often measured in terms of a single type of pattern: analogy. For that reason, the Test of Relational Reasoning (TORR) was conceived and developed to include three other types of patterns that appear to be meaningful in the educational context: anomaly, antinomy, and antithesis. Moreover, as a way to focus on fluid relational reasoning ability, the TORR was developed to include, except for the directions, entirely visuo-spatial stimuli, which were designed to be as novel as possible for the participant. By focusing on fluid intellectual processing, the TORR was also developed to be fairly administered to undergraduate students—regardless of the particular gender, language, and ethnic groups they belong to. However, although some psychometric investigations of the TORR have been conducted, its actual fairness across those demographic groups has yet to be empirically demonstrated. Therefore, a systematic investigation of differential-item-functioning (DIF) across demographic groups on TORR items was conducted. A large (N = 1,379) sample, representative of the University of Maryland on key demographic variables, was collected, and the resulting data was analyzed using a multi-group, multidimensional item-response theory model comparison procedure. Using this procedure, no significant DIF was found on any of the TORR items across any of the demographic groups of interest. 
This null finding is interpreted as evidence of the cultural-fairness of the TORR, and potential test-development choices that may have contributed to that cultural-fairness are discussed. For example, the choice to make the TORR an untimed measure, to use novel stimuli, and to avoid stereotype threat in test administration, may have contributed to its cultural-fairness. Future steps for psychometric research on the TORR, and substantive research utilizing the TORR, are also presented and discussed.