2 results for "variable sample size"
at DigitalCommons@University of Nebraska - Lincoln
Abstract:
The three-parameter logistic (3PL) model is a flexible and widely used tool in assessment, but its usefulness is limited by its need for large sample sizes. This simulation study introduces and evaluates the efficacy of a new sample-size augmentation technique called Duplicate, Erase, and Replace (DupER) Augmentation. Data are augmented using several variations of DupER Augmentation (differing in imputation methodology, deletion rate, and duplication rate), analyzed in BILOG-MG 3, and the results are compared to those obtained from analyzing the raw data. Additional manipulated variables include test length and sample size. Estimates are compared using seven evaluative criteria. Results are mixed and inconclusive. DupER-augmented data tend to yield larger root mean squared errors (RMSEs) and lower correlations between estimates and parameters for both item and ability parameters. However, some DupER variations produce estimates that are much less biased than those obtained from the raw data alone, and one variation produced better results for low-ability simulees and worse results for high-ability simulees. Findings, limitations, and recommendations for future studies are discussed. Specific recommendations include applying DupER Augmentation (1) to empirical data and (2) with additional IRT models, and (3) analyzing the efficacy of the procedure under different item and ability parameter distributions.
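The abstract does not spell out the mechanics of the procedure, but its name describes the three steps. Below is a minimal sketch of one possible DupER-style variation, assuming a binary item-response matrix and a simple proportion-correct imputation; the function name duper_augment and the parameter names duplication_rate and deletion_rate are illustrative, not taken from the study.

```python
import numpy as np

def duper_augment(responses, duplication_rate=1.0, deletion_rate=0.2, seed=None):
    """Duplicate, Erase, and Replace: a sketch of one hypothetical variation.

    responses        : (n_persons, n_items) binary 0/1 item-response matrix
    duplication_rate : proportion of examinees to duplicate (illustrative name)
    deletion_rate    : proportion of duplicated responses to erase and re-impute
    """
    rng = np.random.default_rng(seed)
    n_persons, n_items = responses.shape

    # Duplicate: resample rows (with replacement) to append to the data.
    n_dup = int(round(duplication_rate * n_persons))
    dup = responses[rng.integers(0, n_persons, size=n_dup)].astype(float)

    # Erase: blank out a random subset of the duplicated responses.
    dup[rng.random(dup.shape) < deletion_rate] = np.nan

    # Replace: impute each erased cell with a Bernoulli draw from the item's
    # observed proportion correct (one of many possible imputation methods).
    p_correct = responses.mean(axis=0)
    for j in range(n_items):
        missing = np.isnan(dup[:, j])
        dup[missing, j] = (rng.random(missing.sum()) < p_correct[j]).astype(float)

    # Augmented sample: original examinees plus duplicated-and-replaced rows.
    return np.vstack([responses.astype(float), dup])
```

The augmented matrix would then be calibrated (in the study, with BILOG-MG 3) and the resulting item and ability estimates compared against those from the raw data alone.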
Abstract:
Evaluations of measurement invariance provide essential construct validity evidence. However, the quality of such evidence depends in part on the validity of the resulting statistical conclusions; the presence of Type I or Type II errors can render measurement invariance conclusions meaningless. The purpose of this study was to determine the effects of categorization and censoring on the behavior of the chi-square/likelihood ratio test statistic and two alternative fit indices (the comparative fit index, CFI, and the root mean square error of approximation, RMSEA) in the context of evaluating measurement invariance. Monte Carlo simulation was used to examine Type I error and power rates for (a) the overall test statistic/fit indices and (b) the change in the test statistic/fit indices. Data were generated according to a multiple-group, single-factor confirmatory factor analysis (CFA) model across 40 conditions that varied by sample size, strength of item factor loadings, and categorization thresholds. Seven combinations of model estimators (ML, Yuan-Bentler scaled ML, and WLSMV) and specified measurement scales (continuous, censored, and categorical) were used to analyze each simulation condition. As hypothesized, non-normality increased Type I error rates for the continuous scale of measurement but did not affect error rates for the categorical scale of measurement. Maximum likelihood estimation combined with a categorical scale of measurement resulted in more correct statistical conclusions than the other analysis combinations. For the continuous and censored scales of measurement, the Yuan-Bentler scaled ML resulted in more correct conclusions than normal-theory ML. The censored measurement scale did not offer any advantages over the continuous measurement scale. Comparing across fit statistics and indices, the chi-square-based test statistics were preferred over the alternative fit indices, and ΔRMSEA was preferred over ΔCFI. Results from this study should inform the modeling decisions of applied researchers; however, no single analysis combination can be recommended for all situations, so researchers must consider the context and purpose of their analyses.
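For readers unfamiliar with this style of simulation, the sketch below illustrates the data-generation side only: continuous indicators are drawn from a single-factor model and then categorized or censored. It is a minimal sketch under assumed values; the loadings, thresholds, and function names (generate_factor_data, categorize, censor) are illustrative, not taken from the study.

```python
import numpy as np

def generate_factor_data(n, loadings, seed=None):
    """Continuous indicators from a single-factor model:
    y_j = lambda_j * eta + eps_j, with eta ~ N(0, 1) and
    Var(eps_j) = 1 - lambda_j**2, so each indicator has unit variance."""
    rng = np.random.default_rng(seed)
    loadings = np.asarray(loadings, dtype=float)
    eta = rng.standard_normal((n, 1))            # common factor scores
    eps = rng.standard_normal((n, loadings.size)) * np.sqrt(1.0 - loadings**2)
    return eta * loadings + eps

def categorize(y, thresholds):
    """Cut each continuous response into ordered categories 0..K."""
    return np.digitize(y, thresholds)

def censor(y, floor=0.0):
    """Left-censor responses at a floor (e.g., a scale minimum)."""
    return np.maximum(y, floor)

# Example: 6 indicators with moderate loadings; asymmetric thresholds
# induce the kind of non-normality the study manipulates.
y = generate_factor_data(n=500, loadings=[0.7] * 6, seed=1)
y_cat = categorize(y, thresholds=[-0.5, 0.5, 1.5])   # 4 ordered categories
y_cen = censor(y, floor=0.0)
```

Each generated dataset would then be fit under the estimator/scale combinations described above (ML, Yuan-Bentler scaled ML, and WLSMV crossed with continuous, censored, and categorical specifications), with rejection rates of the overall and change statistics tallied across replications.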