2 results for matrix population models

in Digital Commons @ DU | University of Denver Research


Relevance:

30.00%

Publisher:

Abstract:

The purposes of this study were (1) to validate the item-attribute matrix using two levels of attributes (Level 1 attributes and Level 2 sub-attributes), and (2) by retrofitting diagnostic models to the mathematics test of the Trends in International Mathematics and Science Study (TIMSS), to evaluate the construct validity of the TIMSS mathematics assessment by comparing the results of two assessment booklets. Item data were extracted from Booklets 2 and 3 for the 8th grade in TIMSS 2007, which included a total of 49 mathematics items and every student's response to every item. The study developed three categories of attributes at two levels: content, cognitive process (TIMSS or new), and comprehensive cognitive process (or IT), based on the TIMSS assessment framework, cognitive procedures, and item type. At Level 1, there were 4 content attributes (number, algebra, geometry, and data and chance), 3 TIMSS process attributes (knowing, applying, and reasoning), and 4 new process attributes (identifying, computing, judging, and reasoning). At Level 2, the Level 1 attributes were further divided into 32 sub-attributes. There was only one level of IT attributes (multiple steps/responses, complexity, and constructed-response). Twelve Q-matrices (4 originally specified, 4 random, and 4 revised) were investigated with eleven Q-matrix models (QM1 to QM11) using multiple regression and the least squares distance method (LSDM). Comprehensive analyses indicated that the proposed Q-matrices explained most of the variance in item difficulty (i.e., 64% to 81%). The cognitive process attributes contributed to the item difficulties more than the content attributes, and the IT attributes contributed much more than both the content and process attributes. The new retrofitted process attributes explained the items better than the TIMSS process attributes. Results generated from the Level 1 attributes and the Level 2 attributes were consistent.
Most attributes could be used to recover students' performance, but some attributes' probabilities showed unreasonable patterns. The analysis approaches could not demonstrate whether the same construct validity was supported across booklets. The proposed attributes and Q-matrices explained the items of Booklet 2 better than the items of Booklet 3. The specified Q-matrices explained the items better than the random Q-matrices.
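
The multiple-regression step mentioned in this abstract (how much of the variance in item difficulty a Q-matrix explains) can be illustrated with a minimal sketch. It assumes ordinary least squares with an intercept and reports R². The function names, the toy Q-matrix, and the toy difficulty values are hypothetical and are not taken from the study. The LSDM analysis is not shown.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Move the row with the largest entry in this column to the pivot position.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def r_squared(q_matrix, difficulty):
    """Share of variance in item difficulty explained by Q-matrix attributes."""
    # Design matrix: an intercept column followed by the 0/1 attribute columns.
    x = [[1.0] + [float(v) for v in row] for row in q_matrix]
    k = len(x[0])
    # Normal equations (X'X) beta = X'y.
    xtx = [[sum(row[i] * row[j] for row in x) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(x, difficulty)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(bv * v for bv, v in zip(beta, row)) for row in x]
    mean_y = sum(difficulty) / len(difficulty)
    ss_res = sum((y - f) ** 2 for y, f in zip(difficulty, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in difficulty)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy data: 6 items, 2 attributes (rows = items, columns = attributes).
toy_q = [[1, 0], [0, 1], [1, 1], [0, 0], [1, 0], [0, 1]]
toy_difficulty = [0.45, 0.70, 1.05, 0.20, 0.55, 0.68]

print(f"R^2 = {r_squared(toy_q, toy_difficulty):.3f}")
```

Comparing the R² of a specified Q-matrix with that of a randomly generated one, on the same items, mirrors the kind of specified-versus-random comparison the abstract reports.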

Relevance:

30.00%

Publisher:

Abstract:

This dissertation introduces an approach for generating tests of fail-safe behavior in web applications. We apply the approach to a commercial web application. We build models for both behavioral and mitigation requirements. We create mitigation tests from an existing functional black-box test suite by determining the failure types and points of failure in the test suite and weaving in the required mitigation, based on weaving rules, to generate a test suite that tests proper mitigation of failures. A genetic algorithm (GA) is used to determine the points of failure and the types of failure that need to be tested. Mitigation test paths are woven into the behavioral test at the point of failure based on failure-specific weaving rules. A simulator was developed to evaluate the choice of parameters for the genetic algorithm. We showed how to tune the fitness function and performed tuning experiments for the GA to determine what values to use for the exploration weight and the prospecting weight. We found that higher defect densities make prospecting and mining more successful, while lower mitigation defect densities need more exploration. We compare the efficiency and effectiveness of the approach. First, the GA approach is compared to random selection. The results show that the GA performed better than random selection and that the approach was robust when the search space increased. Second, we compare the GA against four coverage criteria. The comparison shows that, for large search spaces, test requirements generated by the GA are more efficient than three of the four coverage criteria and equally effective. For small search spaces, the genetic algorithm is less effective than three of the four coverage criteria. The fourth coverage criterion is too weak and is unable to find all defects in almost all cases. We also present a large case study of a mortgage system at one of our industrial partners and show how we formalize the approach.
We evaluate the use of a GA to create test requirements. The evaluation includes the choice of initial population, the multiplicity of runs, and a discussion of the cost of evaluating fitness. Finally, we build a selective regression testing approach based on the types of changes (add, delete, or modify) that could occur in the behavioral model, the fault model, the mitigation models, the weaving rules, and the state-event matrix. We provide a systematic method by showing the formalization steps for each type of change to the various models.
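
The idea of using a GA to search for (point of failure, failure type) pairs can be illustrated with a generic sketch. It is not the dissertation's algorithm: the fitness function, the seeded defect set, and all parameter names and values below are hypothetical, and the exploration and prospecting weights mentioned in the abstract are not modeled. The sketch assumes a simple elitist GA in which the better half of the population survives each generation.

```python
import random

# Hypothetical seeded defects: (position in the test suite, failure type).
HYPOTHETICAL_DEFECTS = {(3, 1), (17, 0), (42, 2)}


def toy_fitness(individual):
    """Score a (point, failure_type) pair; 1.0 means it hits a seeded defect."""
    point, failure_type = individual
    best = 0.0
    for d_point, d_type in HYPOTHETICAL_DEFECTS:
        type_bonus = 1.0 if failure_type == d_type else 0.5
        best = max(best, type_bonus / (1 + abs(point - d_point)))
    return best


def run_ga(fitness, n_points, n_types, pop_size=20, generations=30,
           mutation_rate=0.2, seed=0):
    """Elitist GA over (point, failure_type) pairs; returns best pair and history."""
    rng = random.Random(seed)

    def random_individual():
        return (rng.randrange(n_points), rng.randrange(n_types))

    population = [random_individual() for _ in range(pop_size)]
    best_history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        best_history.append(fitness(population[0]))
        # Keep the better half unchanged, so the best score never decreases.
        elite = population[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            child = (a[0], b[1])  # crossover: point from one parent, type from the other
            if rng.random() < mutation_rate:
                child = random_individual()  # mutation: replace with a random pair
            children.append(child)
        population = elite + children
    population.sort(key=fitness, reverse=True)
    best_history.append(fitness(population[0]))
    return population[0], best_history


best_pair, history = run_ga(toy_fitness, n_points=50, n_types=3)
print("best (point, failure type):", best_pair, "fitness:", history[-1])
```

A random-selection baseline, of the kind the abstract compares against, would draw the same number of pairs uniformly from the search space and evaluate them with the same fitness function.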