929 results for Test theory
Abstract:
The purpose of this paper was to evaluate the psychometric properties of a stage-specific self-efficacy scale for physical activity with classical test theory (CTT), confirmatory factor analysis (CFA) and item response modeling (IRM). Women who enrolled in the Women On The Move study completed a 20-item stage-specific self-efficacy scale developed for this study [n = 226, 51.1% African-American and 48.9% Hispanic women, mean age = 49.2 (±7.0) years, mean body mass index = 29.7 (±6.4)]. Three analyses were conducted: (i) a CTT item analysis, (ii) a CFA to validate the factor structure and (iii) an IRM analysis. The CTT item analysis and the CFA results showed that the scale had high internal consistency (ranging from 0.76 to 0.93) and a strong factor structure. Results also showed that the scale could be improved by modifying or eliminating some of the existing items without significantly altering its content. The IRM results showed that the scale had few items targeting high self-efficacy, and that the stage-specific assumption underlying the scale was rejected. In addition, the IRM analyses found that the five-point response format functioned more like a four-point response format. Overall, employing multiple methods to assess the psychometric properties of the stage-specific self-efficacy scale demonstrated the complementary nature of these methods and highlighted the strengths and weaknesses of the scale.
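As a concrete illustration of the internal-consistency figures quoted above, here is a minimal Python sketch of Cronbach's alpha, the statistic a CTT item analysis typically reports; the response matrix below is simulated stand-in data, not the study's:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, n_items) score matrix."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)      # per-item sample variances
    total_var = items.sum(axis=1).var(ddof=1)  # variance of the sum score
    return (k / (k - 1)) * (1.0 - item_vars.sum() / total_var)

# Simulated 226 respondents x 20 items on a five-point format, mirroring
# the scale's dimensions; real data would yield values like 0.76-0.93.
rng = np.random.default_rng(0)
responses = rng.integers(1, 6, size=(226, 20)).astype(float)
print(round(cronbach_alpha(responses), 3))
```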
Abstract:
The subject of this thesis is the accurate measurement of time dilation, aiming at a quantitative test of special relativity. By means of laser spectroscopy, the relativistic Doppler shifts of a clock transition in the metastable triplet spectrum of ^7Li^+ are measured simultaneously with and against the direction of motion of the ions. Saturation or optical double resonance spectroscopy eliminates the Doppler broadening caused by the ions' velocity distribution. From these shifts, both the time dilation and the ion velocity can be extracted with high accuracy, allowing a test of the predictions of special relativity. A diode laser and a frequency-doubled titanium-sapphire laser were set up for antiparallel and parallel excitation of the ions, respectively. To achieve robust control of the laser frequencies required for the beam times, a redundant system of frequency standards consisting of a rubidium spectrometer, an iodine spectrometer and a frequency comb was developed. At the experimental section of the ESR, an automated laser beam guiding system for exact control of polarisation, beam profile and overlap with the ion beam, as well as a fluorescence detection system, were set up. During the first experiments, the production, acceleration and lifetime of the metastable ions at the GSI heavy ion facility were investigated for the first time. The characterisation of the ion beam made it possible for the first time to measure its velocity directly via the Doppler effect, which resulted in a new, improved calibration of the electron cooler. In the next step, the first sub-Doppler spectroscopy signals from an ion beam at 33.8 % of the speed of light could be recorded. The unprecedented accuracy of these experiments allowed a new upper bound to be derived for possible higher-order deviations from special relativity. Moreover, future measurements with the experimental setup developed in this thesis have the potential to improve the sensitivity to low-order deviations by at least one order of magnitude compared with previous experiments, and will thus contribute further to tests of the standard model.
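For orientation, the key relation behind this parallel/antiparallel measurement scheme (the standard Ives-Stilwell analysis, stated here for context rather than quoted from the thesis): with the ions moving at velocity β = v/c, the two laboratory-frame resonance frequencies are Doppler-shifted such that their product is independent of the velocity,

\[
\nu_p = \nu_0\,\gamma\,(1+\beta), \qquad
\nu_a = \nu_0\,\gamma\,(1-\beta), \qquad
\nu_p\,\nu_a = \nu_0^2\,\gamma^2\,(1-\beta^2) = \nu_0^2,
\]

where \(\nu_0\) is the rest-frame transition frequency and \(\gamma = 1/\sqrt{1-\beta^2}\). Any measured deviation of the product \(\nu_p\nu_a\) from \(\nu_0^2\) would signal a departure from special-relativistic time dilation, which is what such experiments bound.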
Abstract:
The invariance of physical laws under Lorentz transformations is one of the fundamental postulates of modern physics, and all theories of the fundamental interactions are formulated in covariant form. Although special relativity (SR) has been tested and confirmed with high accuracy in a large number of experiments, further improved tests are of fundamental interest because of the far-reaching significance of this postulate. Moreover, modern approaches to unifying gravitation with the other interactions point to a possible violation of Lorentz invariance. In this context, Ives-Stilwell experiments testing time dilation in SR play an important role: high-resolution laser spectroscopy is used to probe the validity of the relativistic Doppler formula, and thus of the time dilation factor γ, on relativistic particle beams. In this work, an Ives-Stilwell experiment was performed on 7Li+ ions stored at 34 % of the speed of light in the experimental storage ring (ESR) of the GSI Helmholtzzentrum für Schwerionenforschung. Using the 1s2s 3S1 → 1s2p 3P2 transition, both Λ-spectroscopy and saturation spectroscopy were carried out. Computer-assisted analysis of the fluorescence detection and optimized edge filters in the detection path decisively improved the signal-to-noise ratio, and, with an additional pump laser, a saturation signal was observed for the first time. The frequency stability of the two laser systems used was characterized with a frequency comb in order to reach the highest possible accuracy. The data obtained from the beam times were interpreted within the Robertson-Mansouri-Sexl test theory (RMS) and the Standard Model Extension (SME), and corresponding upper bounds were determined for the relevant test parameters of each theory. The upper bound on the test parameter α of the RMS theory was improved by a factor of 4 compared with earlier measurements at 6.4 % of the speed of light at the test storage ring (TSR) of the Max-Planck-Institut für Kernphysik in Heidelberg.
Abstract:
The position effect describes the influence of just-completed items in a psychological scale on subsequent items. This effect has been repeatedly reported for psychometric reasoning scales and is assumed to reflect implicit learning during testing. One way to identify the position effect is fixed-links modeling. With this approach, two latent variables are derived from the test items. Factor loadings on one latent variable are fixed to 1 for all items to represent ability-related variance. Factor loadings on the second latent variable increase from the first to the last item, describing the position effect. Previous studies using fixed-links modeling of the position effect investigated reasoning scales constructed in accordance with classical test theory (e.g., Raven's Progressive Matrices) but, to the best of our knowledge, not Rasch-scaled tests. The latter, however, meet stronger requirements of item homogeneity. In the present study, therefore, we will analyze data from 239 participants who completed the Rasch-scaled Viennese Matrices Test (VMT). Applying a fixed-links modeling approach, we will test whether a position effect can be depicted as a latent variable and separated from a latent variable representing basic reasoning ability. The results have implications for the assumption of homogeneity in Rasch-homogeneous tests.
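A compact way to write the fixed-links measurement model described above (the notation and the particular loading pattern are illustrative assumptions, not taken from the paper): for the score \(X_i\) on item \(i\) of a \(k\)-item test,

\[
X_i \;=\; 1 \cdot \eta_{\text{ability}} \;+\; \lambda_i\,\eta_{\text{position}} \;+\; \varepsilon_i,
\qquad \lambda_1 \le \lambda_2 \le \dots \le \lambda_k \ \text{fixed a priori, e.g. } \lambda_i = \tfrac{i-1}{k-1},
\]

where every loading on \(\eta_{\text{ability}}\) is fixed to 1, the loadings \(\lambda_i\) on \(\eta_{\text{position}}\) increase from the first to the last item, and only the latent variances and covariance are freely estimated. A position effect is then indicated by a substantial variance of \(\eta_{\text{position}}\).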
Abstract:
The Work Limitations Questionnaire (WLQ) is used to determine the amount of work loss and productivity decline that stem from certain health conditions, including rheumatoid arthritis and cancer. The questionnaire is currently scored using methodology from Classical Test Theory. Item Response Theory (IRT), by contrast, models the probability of each item response as a function of an underlying latent trait. This study sought to determine the validity of using IRT to analyze data from the WLQ. Item responses from 572 employed adults with dysthymia, major depressive disorder (MDD), double depression (both dysthymia and MDD), rheumatoid arthritis, and from healthy individuals were used to determine the validity of IRT (Adler et al., 2006). PARSCALE, IRT software from Scientific Software International, Inc., was used to calculate estimates of work limitations based on item responses from the WLQ. These estimates, also known as ability estimates, were then correlated with the raw scores calculated from the sum of all item responses. Concurrent validity, under which a measurement is considered valid if its correlation with an established measurement is greater than or equal to .90, was used to judge the validity of the IRT methodology for the WLQ. Ability estimates from IRT were fairly highly correlated with the raw scores from the WLQ (above .80). However, the only subscale with a correlation high enough for IRT to be considered valid was the time management subscale (r = .90). The other subscales, mental/interpersonal, physical, and output, did not produce valid IRT ability estimates. These lower-than-expected correlations can be explained in part by outliers in the sample. In addition, acquiescent responding (AR) bias, caused by the tendency of people to respond the same way to every question on a questionnaire, and the multidimensionality of the questionnaire (the WLQ comprises four dimensions and thus four different latent variables) probably had a major impact on the IRT estimates. Furthermore, it is possible that the mental/interpersonal dimension violated the monotonicity assumption of IRT, causing PARSCALE to fail to run for these estimates; the monotonicity assumption needs to be checked for this dimension. The use of multidimensional IRT methods would most likely remove the AR bias and increase the validity of using IRT to analyze data from the WLQ.
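The concurrent-validity check described above is, computationally, a Pearson correlation between the IRT ability estimates and the raw sum scores. A minimal Python sketch with simulated stand-in arrays (the names theta and raw are placeholders, not the study's variables):

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(1)
theta = rng.normal(size=572)              # stand-in IRT ability estimates
raw = 2.0 * theta + rng.normal(size=572)  # stand-in raw sum scores

r, _ = pearsonr(theta, raw)               # correlate the two scorings
print(f"r = {r:.2f}; criterion (r >= .90) met: {r >= 0.90}")
```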
Abstract:
Report published in the Proceedings of the National Conference on "Education and Research in the Information Society", Plovdiv, May 2014.
Abstract:
Ambiguity resolution plays a crucial role in real-time kinematic GNSS positioning, which delivers centimetre-precision results if all the ambiguities in each epoch are correctly fixed to integers. However, incorrectly fixed ambiguities can produce positioning offsets of up to several metres without warning. Hence, ambiguity validation is essential to control the quality of ambiguity resolution. Currently, the most popular ambiguity validation method is the ratio test, whose criterion is often determined empirically. An empirically determined criterion can be dangerous, because a fixed criterion cannot fit all scenarios and does not directly control the ambiguity resolution risk. In practice, depending on the strength of the underlying model, the ratio test criterion can be too conservative for some models and too risky for others. A more rational approach is to determine the criterion according to the underlying model and the user's requirements. Undetected incorrect integers lead to hazardous results and should be strictly controlled; in ambiguity resolution, this missed-detection rate is known as the failure rate. In this paper, a fixed failure rate ratio test method is presented and applied to the analysis of GPS and Compass positioning scenarios. The fixed failure rate approach is derived from integer aperture estimation theory and is theoretically rigorous. In this approach, a table of ratio test criteria is computed from extensive data simulations, and real-time users can determine the ratio test criterion by looking it up in the table. The method has been applied to medium-distance GPS ambiguity resolution, but multi-constellation and high-dimensional scenarios have not been discussed so far. In this paper, a general ambiguity validation model is derived from hypothesis test theory, the fixed failure rate approach is introduced, and in particular the relationship between the ratio test threshold and the failure rate is examined. Finally, the factors that influence the fixed failure rate ratio test threshold are analysed through extensive data simulation. The results show that the fixed failure rate approach is a more reasonable ambiguity validation method, provided a proper stochastic model is used.
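For reference, the ratio test discussed here is conventionally written as follows (notation assumed): with the float ambiguity solution \(\hat a\), its variance matrix \(Q_{\hat a}\), and the best and second-best integer candidates \(\check a_1\) and \(\check a_2\), the fixed solution \(\check a_1\) is accepted only if

\[
R \;=\; \frac{(\hat a - \check a_2)^{\mathsf T}\, Q_{\hat a}^{-1}\,(\hat a - \check a_2)}
             {(\hat a - \check a_1)^{\mathsf T}\, Q_{\hat a}^{-1}\,(\hat a - \check a_1)} \;\ge\; c .
\]

In the fixed failure rate approach, \(c\) is not a universal constant: it is read from the simulated criteria table as the smallest threshold for which the probability of accepting a wrong integer vector stays below the tolerated failure rate (for example 0.1%) for the model at hand, rather than being set empirically.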
Abstract:
This paper gives an overall account of the testing principles and design of an automobile gearbox performance testing system, and describes the system's composition in detail from both the hardware and the software side. The adoption of a fieldbus makes the system's structure simpler and its implementation more convenient. Practice has shown the approach to be feasible and reliable.
Abstract:
During the past 11 years, with the rapid development of the Internet, more and more psychologists have come to recognize and take advantage of it, leading to a growing number of psychological tests being administered online for data collection. However, there has been controversy about the reliability and representativeness of this new method. To examine the applicability of online surveys and how different types of scales function on the internet, we first revised the measurement instrument and then investigated the equivalence of online survey and paper-and-pencil assessment at three levels: sample, scale and item. Both Classical Test Theory and Item Response Theory were used to analyze the invariance of the different scale types administered online. The main conclusions of this study are as follows: 1. In the sample-based study, the self-selected online sample was compared with the randomly sampled paper-and-pencil sample. There was no gender difference between them (p > 0.05), but the online sample was characterized by higher education, higher income and younger age (88% of the sample had post-secondary education or above, and 71% were aged 20-29 years). There were significant differences in the scores of all scales between online and paper-and-pencil assessment (p < 0.01); with demographics controlled, there was no significant difference in Neuroticism between the two modes (p > 0.05). 2. With a within-group design, the BI (Attitude toward Brand Importance), BT (Attitude toward Brand Switcher), Extraversion and Conscientiousness scales proved equivalent across modes in reliability, construct validity and average scores. 3. At the item level, the Item Response Theory analysis showed that the two-parameter logistic model (2PLM) is appropriate for personality and attitude scales. For the personality scale, some items showed differential item functioning (DIF) in the Openness to Experience and Agreeableness subscales; however, there were no significant differences in test functioning. 4. Exploring the psychometric properties of five-, six-, seven- and ten-point response formats showed that measurement validity differed between the online survey and the paper-and-pencil test, and that the six-point format had lower reliability and validity. In conclusion, the results support administering personality scales online; attitude scales, however, need to be chosen prudently.
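The two-parameter logistic model (2PLM) referred to in point 3 models the probability that a respondent with latent trait level \(\theta\) endorses item \(j\) as

\[
P_j(\theta) \;=\; \frac{1}{1 + e^{-a_j(\theta - b_j)}},
\]

where \(a_j\) is the item discrimination and \(b_j\) the item location (difficulty). Differential item functioning, as found in the Openness and Agreeableness subscales, means that \(a_j\) or \(b_j\) differs between online and paper-and-pencil respondents who are at the same \(\theta\).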
Abstract:
Competency Assessment Method (CAM) is an important technique in Human Resource Management and Development, in theory and in practice, especially in personnel selection and training. Drawing on the literature of related fields, this thesis explored the feasibility of CAM in China. The main results are as follows: 1. Competencies scored in Behavioral Event Interviews (BEI) are influenced neither by length of protocol nor by performance in the preceding year. Average level and maximal level of complexity correlate significantly with length of protocol. The total competency frequency of outstanding executives is not significantly different from that of typical executives. These results support McClelland's view. There is, however, a significant correlation between length of protocol and competency frequencies, which McClelland did not accept. We also found that competency scores coded by average level and maximal level of complexity are more reliable than those coded by competency frequencies, a result not confirmed by McClelland. 2. Inter-rater reliability was studied. The results indicated that total Category Agreement (CA) is 55.45%; that for over 70 percent of the 20 competencies, the inter-rater reliability coefficients based on classical test theory are significantly correlated; and that the G coefficient based on generalizability theory is 0.85697. 3. The criterion-sample study shows that the competencies of managers in China's communication enterprises are: Impact and Influence, Organization Commitment, Information Seeking, Achievement Orientation, Team Leadership, Interpersonal Understanding, Initiative, Market Awareness, Self-confidence, and Developing Others. This result is similar to the generic competency model of managers presented in Spencer's book. 4. CAM showed more advantages than expert panel judgement.
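Category Agreement, as used in result 2, is commonly computed as the share of coding decisions on which two raters assign the same competency category; a minimal Python sketch under that assumption (the competency labels are invented examples):

```python
def category_agreement(rater_a: list[str], rater_b: list[str]) -> float:
    """Share of protocol segments coded into the same competency category."""
    assert len(rater_a) == len(rater_b)
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)

print(category_agreement(
    ["Impact and Influence", "Initiative", "Team Leadership", "Initiative"],
    ["Impact and Influence", "Initiative", "Team Leadership", "Self-confidence"],
))  # 0.75
```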
Abstract:
It is well recognised that medical students and junior doctors find fluid prescription a challenging topic. This study was designed to gain a greater understanding of the experiences that medical students face when learning about fluid prescribing. Methods: A qualitative approach, using focus groups, was employed in this research. Final-year medical students in academic year 2011-12 at Queen's University Belfast were invited to participate during their 'Assistantship' placement in March 2012. Discussions in focus groups, consisting of between six and eight students, were recorded and transcribed verbatim. The research team, consisting of three separate investigators, conducted thematic analysis independently. A final consensus regarding emerging themes was reached by discussion within the whole research team. Results: Five prominent themes emerged: 'Teaching experience: a disruptive variation'; 'Curricular disconnections'; 'The driving test: Theory-practice transformation'; 'Role modelling: which standard to aspire to?'; and finally 'Reconciling the perceived risk'. Discussion: This research provided insights into medical students' opinions of the teaching practices and learning experiences related to fluid prescribing. The learning of prescribing skills is complex and contextual. In developing such skills, medical students are often exposed to conflicting educational experiences that challenge the novice learner in making judgements on best prescribing practice. This study adds to the body of evidence that fluid prescription is a difficult topic, and has generated a number of multifaceted and strategic recommendations to potentially improve fluid prescription teaching.
Abstract:
In this paper, we analyze recent developments in econometrics in the light of statistical testing theory. We first review some fundamental principles of the philosophy of science and of statistical theory, emphasizing parsimony and falsifiability as criteria for evaluating models, the role of testing theory as a formalization of the falsification principle for probabilistic models, and the logical justification of the basic notions of testing theory (such as the level of a test). We then show that some of the most widely used statistical and econometric methods are fundamentally inappropriate for the problems and models considered, while many hypotheses for which testing procedures are commonly proposed are in fact not testable at all. Such situations lead to ill-posed statistical problems. We analyze several particular cases of such problems: (1) the construction of confidence intervals in structural models that raise identification problems; (2) the construction of tests for nonparametric hypotheses, including procedures robust to heteroskedasticity, non-normality or dynamic specification. We point out that these difficulties often stem from the ambition to weaken the regularity conditions required for any statistical analysis, as well as from an inappropriate use of asymptotic distributional theory. Finally, we underscore the importance of formulating testable hypotheses and models, and of proposing econometric techniques whose properties can be demonstrated in finite samples.
Abstract:
Today's world, marked by an unceasing increase in professional demands, requires teachers to adapt constantly to social, cultural and economic change. If, for experienced teachers, accommodating these transformations brings several challenges, for new teachers who have not yet mastered every aspect of the profession, integration into the school environment can be extremely difficult or even unbearable, to the point where some leave the profession. Nevertheless, through perseverance, a number of new teachers overcome the obstacles the profession imposes. In their case, professional satisfaction and engagement may be important characteristics that encourage them to continue teaching. In this context, the study analyzes the elements linked to the construction of teachers' professional identity during their entry into the profession, based on the perceptions of new teachers and of primary and secondary school administrators. Harmony between the perceptions of these two groups of school actors can be an important factor in professionals' performance and in the effectiveness of educational institutions. Thus, on the side of new teachers, the study examines the variables that may be linked to their professional engagement; on the side of administrators, it analyzes the elements that may be linked to their satisfaction with the work done by new teachers. This quantitative study consists of secondary analyses of data from pan-Canadian surveys of principals and teachers in Canadian primary and secondary schools, conducted in 2005 and 2006 by a team of professors from several Canadian universities. The statistical analyses are based on two theoretical models: (1) the professional engagement of new teachers, and (2) administrators' satisfaction with the work done by new teachers. These models are examined using both classical test theory (CTT) and item response theory (IRT) in order to benefit from the advantages of each method. On the CTT side, path analyses and structural equation modeling were carried out to examine the theoretical models. On the IRT side, Rasch modeling was used to examine the psychometric properties of the scales used in the research, in order to verify whether the data fit the models well and whether the items group together logically to explain the latent traits under study. The results highlight the human relationships that define the teaching profession. In other words, for new teachers, emotions in the classroom, a consequence of the process of interaction with their students, are the major factor linked to professional engagement. Likewise, new teachers' relationships with the various members of the school community (students' parents, administrators, school staff and other teachers) are a key factor in administrators' satisfaction with new teachers' work. The analyses also indicate the importance of job satisfaction in the new-teacher model: this variable is an important determinant of professional engagement and can be associated with all the other elements of the model. Finally, the results indicate the need to construct latent variables with a larger number of items in order to position respondents better on the measurement scale; this finding is particularly important for the administrators' model, which shows poor item-person fit.
Abstract:
The use of subjective measures in epidemiology has intensified recently, notably with the increasingly explicit aim of integrating subjects' perception of their own health into the study of diseases and the evaluation of interventions. Psychometrics encompasses the statistical methods used to construct questionnaires and to analyze the resulting data. The aim of this thesis was to explore various methodological problems raised by the use of psychometric techniques in epidemiology. Three empirical studies are presented, concerning: 1/ the instrument validation phase: the objective was to develop, using simulated data, a sample size calculation tool for scale validation in psychiatry; 2/ the mathematical properties of the resulting measure: the objective was to compare the performance of a questionnaire's minimal clinically important difference computed on cohort data within either classical test theory (CTT) or item response theory (IRT); 3/ its use in a longitudinal design: the objective was to compare, using simulated data, the performance of a statistical method for analyzing the longitudinal evolution of a subjective phenomenon measured with CTT or IRT, in particular when some of the items available for measurement differed at each time point. Finally, directed acyclic graphs were used to discuss, in the light of the results of these three studies, the notion of information bias when subjective measures are used in epidemiology.