Background Many medical exams use 5 options for multiple choice questions (MCQs), although the literature suggests that 3 options are optimal. Previous studies on this topic have often been based on non-medical examinations, so we sought to analyse rarely selected, 'non-functional' distractors (NF-D) in high stakes medical examinations, and their detection by item authors as well as psychometric changes resulting from a reduction in the number of options. Methods Based on Swiss Federal MCQ examinations from 2005-2007, the frequency of NF-D (selected by <1% or <5% of the candidates) was calculated. Distractors that were chosen the least or second least were identified and candidates who chose them were allocated to the remaining options using two extreme assumptions about their hypothetical behaviour: In case rarely selected distractors were eliminated, candidates could randomly choose another option - or purposively choose the correct answer, from which they had originally been distracted. In a second step, 37 experts were asked to mark the least plausible options. The consequences of a reduction from 4 to 3 or 2 distractors - based on item statistics or on the experts' ratings - with respect to difficulty, discrimination and reliability were modelled. Results About 70% of the 5-option-items had at least 1 NF-D selected by <1% of the candidates (97% for NF-Ds selected by <5%). Only a reduction to 2 distractors and assuming that candidates would switch to the correct answer in the absence of a 'non-functional' distractor led to relevant differences in reliability and difficulty (and to a lesser degree discrimination). The experts' ratings resulted in slightly greater changes compared to the statistical approach. Conclusions Based on item statistics and/or an expert panel's recommendation, the choice of a varying number of 3-4 (or partly 2) plausible distractors could be performed without marked deteriorations in psychometric characteristics.
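The two reallocation assumptions described in the Methods can be illustrated with a minimal sketch. The option counts, the function names and the even split used for the "random" case are assumptions made for illustration; they are not the study's data or code.

```python
from typing import Dict, List


def non_functional_distractors(counts: Dict[str, int], key: str,
                               threshold: float) -> List[str]:
    """Return distractors chosen by fewer than `threshold` of all candidates."""
    total = sum(counts.values())
    return [opt for opt, n in counts.items()
            if opt != key and n / total < threshold]


def reallocate(counts: Dict[str, int], key: str, removed: List[str],
               assumption: str) -> Dict[str, float]:
    """Move candidates who chose removed distractors to the remaining options.

    assumption="random":  expected counts if they pick evenly among all
                          remaining options (illustrative simplification)
    assumption="correct": all of them switch to the keyed answer
    """
    remaining = {opt: float(n) for opt, n in counts.items()
                 if opt not in removed}
    displaced = sum(counts[opt] for opt in removed)
    if assumption == "correct":
        remaining[key] += displaced
    elif assumption == "random":
        share = displaced / len(remaining)
        for opt in remaining:
            remaining[opt] += share
    else:
        raise ValueError("assumption must be 'random' or 'correct'")
    return remaining


def difficulty(counts: Dict[str, float], key: str) -> float:
    """Item difficulty as the proportion of candidates choosing the key."""
    return counts[key] / sum(counts.values())


# Hypothetical 5-option item answered by 200 candidates; "A" is the key.
counts = {"A": 120, "B": 50, "C": 20, "D": 9, "E": 1}
nfd = non_functional_distractors(counts, "A", threshold=0.05)
for assumption in ("random", "correct"):
    print(assumption, round(difficulty(reallocate(counts, "A", nfd, assumption), "A"), 3))
```

The two assumptions bracket the plausible range: the true shift in difficulty after removing a distractor would be expected to lie somewhere between the "random" and the "correct" result.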


Sexual selection theory largely rests on the assumption that populations contain individual variation in mating preferences and that individuals are consistent in their preferences. However, there are few empirical studies of within-population variation and even fewer have examined individual male mating preferences. Here, we studied a color polymorphic population of the Lake Victoria cichlid fish Neochromis omnicaeruleus, a species in which color morphs are associated with different sex-determining factors. Wild-caught males were tested in three-way choice trials with multiple combinations of different females belonging to the three color morphs. Compositional log-ratio techniques were applied to analyze individual male mating preferences. Large individual variation in consistency, strength, and direction of male mating preferences for female color morphs was found and hierarchical clustering of the compositional data revealed the presence of four distinct preference groups corresponding to the three color morphs in addition to a no-preference class. Consistency of individual male mating preferences was higher in males with strongest preferences. We discuss the implications of these findings for our understanding of the mechanisms underlying polymorphism in mating preferences.
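The abstract does not say which log-ratio technique was used. As one common example, the centered log-ratio (CLR) transform maps a composition (here, the shares of a male's responses directed at each of three female morphs) to unconstrained values that can be fed into standard clustering. The sketch below is an assumption-laden illustration with made-up shares, not the authors' analysis.

```python
import math
from typing import List, Sequence


def clr(composition: Sequence[float]) -> List[float]:
    """Centered log-ratio transform of a strictly positive composition."""
    if any(part <= 0 for part in composition):
        # Zeros have no logarithm; they must be replaced before transforming.
        raise ValueError("all parts must be > 0")
    logs = [math.log(part) for part in composition]
    mean_log = sum(logs) / len(logs)
    # Each value is the log of a part relative to the geometric mean.
    return [value - mean_log for value in logs]


# Hypothetical male: share of courtship directed at each of three morphs.
shares = [0.6, 0.3, 0.1]
print(clr(shares))
```

A male with no preference (equal shares) maps to all zeros, and CLR values always sum to zero, so the direction and strength of a preference can be read from the sign and size of each component.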


Background: It is yet unclear if there are differences between using electronic key feature problems (KFPs) or electronic case-based multiple choice questions (cbMCQ) for the assessment of clinical decision making. Summary of Work: Fifth year medical students were exposed to clerkships which ended with a summative exam. Assessment of knowledge per exam was done by 6-9 KFPs, 9-20 cbMCQ and 9-28 MC questions. Each KFP consisted of a case vignette and three key features (KF) using “long menu” as question format. We sought students’ perceptions of the KFPs and cbMCQs in focus groups (n of students=39). Furthermore statistical data of 11 exams (n of students=377) concerning the KFPs and (cb)MCQs were compared. Summary of Results: The analysis of the focus groups resulted in four themes reflecting students’ perceptions of KFPs and their comparison with (cb)MCQ: KFPs were perceived as (i) more realistic, (ii) more difficult, (iii) more motivating for the intense study of clinical reasoning than (cb)MCQ and (iv) showed an overall good acceptance when some preconditions are taken into account. The statistical analysis revealed that there was no difference in difficulty; however KFP showed a higher discrimination and reliability (G-coefficient) even when corrected for testing times. Correlation of the different exam parts was intermediate. Conclusions: Students perceived the KFPs as more motivating for the study of clinical reasoning. Statistically KFPs showed a higher discrimination and higher reliability than cbMCQs. Take-home messages: Including KFPs with long menu questions into summative clerkship exams seems to offer positive educational effects.


Research question/Introduction: It is unclear to what extent key feature problems (KFPs) with long-menu questions and case-based Type A questions (FTA) differ when used to assess clinical reasoning in the clinical training of medical students. Methods: Fifth-year medical students took part in their clinical paediatrics rotation, which ended with a summative examination. In each examination, knowledge was assessed electronically with 6-9 KFPs [1], [3], 9-20 FTA and 9-28 non-case-based multiple choice questions (NFTA). Each KFP consisted of a case vignette and three key features and used a so-called long menu [4] as the answer format. We investigated students' perceptions of the KFPs and FTA in focus groups [2] (n of students=39). In addition, the statistical indices of the KFPs and FTA from 11 examinations (n of students=377) were compared. Results: The analysis of the focus groups yielded four themes describing the perception of the KFPs and their comparison with FTA: KFPs were perceived as (1) more realistic, (2) more difficult and (3) more motivating for intensive self-study of clinical reasoning than FTA, and (4) showed good overall acceptance provided that certain preconditions are met. The statistical analysis showed no difference in difficulty; however, the KFPs showed higher discrimination and reliability (G-coefficient), even when corrected for testing time. The correlation between the different examination parts was moderate. Discussion/Conclusion: Students experienced the KFPs as motivating for self-study of clinical reasoning. Statistically, the KFPs showed greater discrimination and higher reliability than the FTA. Including KFPs with long menus in examinations during the clinical phase of studies appears promising and seems to have an educational effect.


Research question/Introduction: Examinations are an essential component of medical education. They provide valuable information about students' development and both accompany and shape learning [1], [2]. Written examinations are currently dominated by multiple choice questions, which come in various types. Type A questions, in which exactly one answer is correct, are used most often. Multiple True-False (MTF) questions, by contrast, allow several correct answers: for each answer option, the candidate must decide whether it is true or false. Because they allow multiple answers, MTF questions appear to reflect certain clinical situations better. MTF questions also appear to be superior to Type A questions with regard to reliability and the information gained per unit of testing time [3]. Nevertheless, MTF questions have rarely been used so far, and there is little literature on this question format. This study investigates to what extent the use of MTF questions can increase the utility of written examinations as defined by van der Vleuten (reliability, validity, cost, effect on the learning process and acceptance by participants) [4]. To increase test reliability and reduce the cost of examinations, we aim to determine the optimal scoring system for MTF questions. Methods: We analyse data from summative examinations of the Faculty of Medicine of the University of Bern. Our data include examinations from the first to the sixth year of study as well as one specialist examination. All examinations comprise both MTF and Type A questions. For these examinations we compare quarter-point, half-point and full-point scoring of MTF questions. With quarter-point scoring, candidates receive ¼ point for each correct partial answer.
With half-point scoring, ½ point is awarded if more than half of the answer options are answered correctly, and candidates receive a full point if all options are answered correctly. With full-point scoring, candidates receive a point only if the entire question is answered correctly. These scoring schemes are compared with respect to item characteristics such as discrimination and difficulty and with respect to test characteristics such as reliability. The results are also compared with those for Type A questions. Results: Preliminary results suggest that half-point scoring is optimal. Half-point scoring leads to medium item difficulties and, as a result, to high discrimination indices. This contributes to high test reliability. Discussion/Conclusion: In combination with an optimal scoring system, MTF questions appear to yield higher test reliabilities than Type A questions. Depending on the content to be examined, MTF questions could be a valuable complement to Type A questions. A suitable combination of MTF and Type A questions could improve the utility of written examinations.
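The three scoring schemes compared in the Methods can be written down as a short sketch. The function name, the boolean encoding of responses and the generalisation of "¼ point per correct partial answer" to 1/n for an item with n statements are assumptions for illustration; the abstract's quarter points imply items with four statements.

```python
from typing import Sequence


def score_mtf(responses: Sequence[bool], key: Sequence[bool], scheme: str) -> float:
    """Score one Multiple True-False item under a given scheme.

    "quarter": 1/n point per correctly judged statement (1/4 for n = 4)
    "half":    1 point if all correct, 1/2 if more than half correct, else 0
    "full":    1 point only if every statement is judged correctly
    """
    if len(responses) != len(key):
        raise ValueError("responses and key must have the same length")
    n = len(key)
    correct = sum(r == k for r, k in zip(responses, key))
    if scheme == "quarter":
        return correct / n
    if scheme == "half":
        if correct == n:
            return 1.0
        return 0.5 if correct > n / 2 else 0.0
    if scheme == "full":
        return 1.0 if correct == n else 0.0
    raise ValueError("scheme must be 'quarter', 'half' or 'full'")


# Hypothetical item with four statements; the candidate gets three right.
key = [True, False, True, True]
responses = [True, False, True, False]
for scheme in ("quarter", "half", "full"):
    print(scheme, score_mtf(responses, key, scheme))
```

For the same response pattern the schemes award 0.75, 0.5 and 0 points respectively, which shows how the choice of scheme shifts item difficulty.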


The maintenance of colour polymorphisms within populations has been a long-standing interest in evolutionary ecology. African cichlid fish contain some of the most striking known cases of this phenomenon. Intrasexual selection can be negative frequency dependent when males bias aggression towards phenotypically similar rivals, stabilizing male colour polymorphisms. We propose that where females are territorial and competitive, aggression biases in females may also promote coexistence of female morphs. We studied a polymorphic population of the cichlid fish Neochromis omnicaeruleus from Lake Victoria, in which three distinct female colour morphs coexist: one plain brown and two blotched morphs. Using simulated intruder choice tests in the laboratory, we show that wild-caught females of each morph bias aggression towards females of their own morph, suggesting that females of all three morphs may have an advantage when their morph is locally the least abundant. This mechanism may contribute to the establishment and stabilization of colour polymorphisms. Next, by crossing the morphs, we generated sisters belonging to different colour morphs. We find no sign of aggression bias in these sisters, making pleiotropy unlikely to explain the association between colour and aggression bias in wild fish, which is maintained in the face of gene flow. We conclude that female-female aggression may be one important force for stabilizing colour polymorphism in cichlid fish.


Both inter- and intrasexual selection have been implicated in the origin and maintenance of species-rich taxa with diverse sexual traits. Simultaneous disruptive selection by female mate choice and male-male competition can, in theory, lead to speciation without geographical isolation if both act on the same male trait. Female mate choice can generate discontinuities in gene flow, while male-male competition can generate negative frequency-dependent selection stabilizing the male trait polymorphism. Speciation may be facilitated when mating preference and/or aggression bias are physically linked to the trait they operate on. We tested for genetic associations among female mating preference, male aggression bias and male coloration in the Lake Victoria cichlid Pundamilia. We crossed females from a phenotypically variable population with males from both extreme ends of the phenotype distribution in the same population (blue or red). Male offspring of a red sire were significantly redder than males of a blue sire, indicating that intra-population variation in male coloration is heritable. We tested mating preferences of female offspring and aggression biases of male offspring using binary choice tests. There was no evidence for associations at the family level between female mating preferences and coloration of sires, but dam identity had a significant effect on female mate preference. Sons of the red sire directed significantly more aggression to red than blue males, whereas sons of the blue sire did not show any bias. There was a positive correlation among individuals between male aggression bias and body coloration, possibly due to pleiotropy or physical linkage, which could facilitate the maintenance of color polymorphism.


A new Swiss federal licencing examination for human medicine (FLE) was developed and released in 2011. This paper describes the process from concept design to the first results obtained on implementation of the new examination. The development process was based on the Federal Act on University Medical Professions and involved all national stakeholders in this venture. During this process questions relating to the assessment aims, the assessment formats, the assessment dimensions, the examination content and necessary trade-offs were clarified. The aims were to create a feasible, fair, valid and psychometrically sound examination in accordance with international standards, thereby indicating the expected knowledge and skills level at the end of undergraduate medical education. Finally, a centrally managed and locally administered examination comprising a written multiple-choice element and a practical “clinical skills” test in the objective structured clinical examination (OSCE) format was developed. The first two administrations of the new FLE show that the examination concept could be implemented as intended. The anticipated psychometric indices were achieved and the results support the validity of the examination. Possible changes to the format or content in the future are discussed.


OBJECTIVE To survey retention procedures used in orthodontic practices in Switzerland. MATERIAL AND METHODS A questionnaire previously developed by Renkema et al. (2009) was sent to 223 Swiss orthodontists. The questionnaire comprised six parts, mainly containing multiple-choice questions. It collected information on the background education of the individual orthodontist, retention in general, the frequency with which different types of removable or bonded retainers were used, the retention protocol, and the type and size of the wire used for bonded retainers. RESULTS The overall response rate was 65 percent. Most orthodontists placed a bonded retainer in the upper and lower arch, except when the upper arch was expanded during treatment or when extractions were performed in the upper arch, in which case they placed a combination of fixed and removable retainers. Opinions varied with regard to how many hours the removable retainers should be worn and the duration of the retention phase. As far as bonded retainers were concerned, 87 percent of the orthodontists preferred life-long retention. Ninety-three percent of the orthodontists considered that the development of a guideline on retention procedures would be useful. CONCLUSIONS The choice of retention procedures is mostly based on orthodontists' personal preference. Further research into the long-term effectiveness of individual retention protocols is needed.


Research question/Introduction: The Federal Examination in Human Medicine (Eidgenössische Prüfung Humanmedizin, EP) has by now been administered successfully three times. Until now, data on its strengths, weaknesses and need for further development have been sparse. These were therefore to be collected in a qualitative study among the experts and education policy decision-makers involved. Methods: Four focus groups with a total of 25 participants were conducted in accordance with international standards to obtain the views of the involved experts and education policy decision-makers on the perceived strengths and influences of the EP and its need for further development. The focus group discussions were transcribed verbatim and evaluated by content analysis. Results: The perceived strengths were mainly the combination of the two examination parts, “Multiple Choice” (MC) and “Clinical Skills” (CS), the format-specific strengths of the MC and CS examinations, and the collaborative approach. Perceived influences of the EP were mainly on student learning behaviour, the examiners, the teaching staff, curriculum reform, collaboration between the faculties and the perceived importance of the Swiss Catalogue of Learning Objectives (SCLO). A need for further development was seen mainly in the following: that modifications be undertaken only if they are well considered and evidence-based, improved authenticity of the CS examination, additional examination formats, an improved communication strategy, further revision of the SCLO, acknowledgement of the limitations of a “single-shot examination”, and the establishment of an incentive structure for the clinicians who actively help shape the EP. Discussion/Conclusion: Overall, the EP is regarded as fit for its purposes. The examination influences the education of medical students in Switzerland beyond its direct summative aspects.
A need for further development was seen, but changes should be well founded.


PURPOSE Dyslexia is the most common developmental reading disorder that affects language skills. Latent strabismus (heterophoria) has been suspected to be causally involved. Even though phoria correction in dyslexic children is commonly applied, the evidence in support of a benefit is poor. In order to provide experimental evidence on this issue, we simulated phoria in healthy readers by modifying the vergence tone required to maintain binocular alignment. METHODS Vergence tone was altered with prisms that were placed in front of one eye in 16 healthy subjects to induce exophoria, esophoria, or vertical phoria. Subjects read one paragraph for each condition, from which reading speed was determined. Text comprehension was tested with a forced multiple choice test. Eye movements were recorded during reading and subsequently analyzed for saccadic amplitudes, saccades per 10 letters, percentage of regressive (backward) saccades, average fixation duration, first fixation duration on a word, and gaze duration. RESULTS Acute change of horizontal and vertical vergence tone significantly affected neither reading performance nor reading-associated eye movements. CONCLUSION Prisms in healthy subjects fail to induce a significant change in reading performance. This finding is not compatible with a role of phoria in dyslexia. Our results argue against the proposal to correct small-angle heterophorias in dyslexic children.


The selection of oviposition sites by syrphids and other aphidophagous insects is influenced by the presence of con- and heterospecific competitors. Chemical cues play a role in this selection process, some of them being volatile semiochemicals. Yet, little is known about the identity and specificity of chemical signals that are involved in the searching behavior of these predators. In this study, we used olfactometer bioassays to explore the olfactory responses of gravid females and larvae of the syrphid Sphaerophoria rueppellii, focussing on volatiles from conspecific immature stages, as well as odors from immature stages of the competing coccinellid Adalia bipunctata. In addition, a multiple-choice oviposition experiment was conducted to study if females respond differently when they can also sense their competitors through visual or tactile cues. Results showed that volatiles from plants and aphids did not affect the behavior of second-instars, whereas adult females strongly preferred odors from aphid colonies without competitors. Odors from conspecific immature stages had a repellent effect on S. rueppellii adult females, whereas their choices were not affected by volatiles coming from immature heterospecific A. bipunctata. The results imply that the syrphid uses odors to avoid sites that are already occupied by conspecifics. They did not avoid the odor of the heterospecific competitor, although in close vicinity they were found to avoid laying eggs on leaves that had traces of the coccinellid. Apparently adult syrphids do not rely greatly on volatile semiochemicals to detect the coccinellid, but rather use other stimuli at close range (e.g., visual or non-volatile compounds) to avoid this competitor.


PURPOSE Austrian out-of-hospital emergency physicians (OOHEP) undergo mandatory biannual emergency physician refresher courses to maintain their licence. The purpose of this study was to compare different reported emergency skills and knowledge, recommended by the European Resuscitation Council (ERC) guidelines, between OOHEP who work regularly at an out-of-hospital emergency service and those who do not currently work as OOHEP but are licenced. METHODS We obtained data from 854 participants from 19 refresher courses. Demographics, questions about their practice and multiple-choice questions about ALS knowledge were answered and analysed. We particularly explored the application of therapeutic hypothermia, intraosseous access, pocket guide use and knowledge about the participants' defibrillator in use. A multivariate logistic regression analysed differences between both groups of OOHEP. Age, gender, years of clinical experience, ERC-ALS provider course attendance and the self-reported number of resuscitations were control variables. RESULTS Licenced OOHEP who are currently employed in emergency service are significantly more likely to initiate intraosseous access (OR = 4.013, p < 0.01), they initiate mild therapeutic hypothermia after successful resuscitation more often (OR = 2.550, p < 0.01), and their knowledge about the defibrillator in use was higher (OR = 2.292, p < 0.01). No difference was found for the use of pocket guides. OOHEP who have attended an ERC-ALS provider course since 2005 initiated mild therapeutic hypothermia after successful resuscitation more often (OR = 1.670, p < 0.05), as did participants who had resuscitated within the last year (OR = 2.324, p < 0.01), while older OOHEP initiated mild therapeutic hypothermia less often, measured per year of age (OR = 0.913, p < 0.01).
CONCLUSION Licenced and employed OOHEP implement ERC guidelines better into clinical practice, but more training on life-saving rescue techniques needs to be done to improve knowledge and to raise these rates of application.
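To read the per-year odds ratio for age reported in the Results: in a logistic regression with age as a linear term (an assumption here, suggested by the "per year of age" wording), odds ratios multiply across units, so a difference of k years corresponds to the per-year odds ratio raised to the power k. The helper name below is hypothetical, and the calculation is an illustration of the arithmetic rather than an additional result of the study.

```python
def odds_ratio_over(per_unit_or: float, units: float) -> float:
    """Odds ratio implied for a difference of `units`, given a per-unit OR.

    Assumes the predictor enters the logistic model linearly, so that
    log-odds change by log(per_unit_or) for every unit of difference.
    """
    return per_unit_or ** units


# Reported OR per year of age for initiating mild therapeutic hypothermia.
per_year = 0.913
print(round(odds_ratio_over(per_year, 10), 2))  # implied OR for a 10-year age difference
```

Under that linearity assumption, an odds ratio that looks close to 1 per year still implies a substantial difference between physicians a decade apart in age.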