Assessment of multiple-choice questions in medicine: validity evidence for an instrument
Abstract
Introduction: The appropriate writing of test items for an examination constitutes validity evidence in itself. Despite a general consensus on item-writing guidelines, several studies report a high incidence of violations of these standards. An instrument is proposed to assess the quality of multiple-choice item writing, and the process of gathering its validity evidence is described.
Methods: Validity evidence was gathered for an instrument designed to assess the features of multiple-choice items, according to the sources proposed by the Standards for Educational and Psychological Testing, particularly those related to content, response process, and internal structure. Fleiss' kappa and the point-biserial correlation coefficient were used to measure agreement among judges on the criteria assessed by the instrument. An exploratory factor analysis was performed to identify the instrument's dimensions, and Cronbach's alpha was calculated as an internal consistency statistic.
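The two agreement statistics named above can be computed with standard Python libraries. The sketch below is purely illustrative: the rating table, judge count, item scores, and total scores are hypothetical placeholders, not the study's data.

```python
import numpy as np
from scipy.stats import pointbiserialr
from statsmodels.stats.inter_rater import fleiss_kappa

# Hypothetical ratings: 5 items judged by 4 raters on a binary criterion.
# Rows are items, columns are category counts ("fails", "meets");
# each row must sum to the number of judges.
ratings = np.array([
    [0, 4],   # all 4 judges agreed the item meets the criterion
    [1, 3],
    [4, 0],
    [3, 1],
    [0, 4],
])
kappa = fleiss_kappa(ratings, method="fleiss")
print(f"Fleiss' kappa: {kappa:.3f}")

# Point-biserial correlation between a dichotomous item score
# (correct/incorrect) and the continuous total test score.
item_correct = np.array([1, 0, 1, 1, 0, 1, 0, 1])
total_score = np.array([78, 55, 83, 90, 48, 72, 60, 88])
r_pb, p_val = pointbiserialr(item_correct, total_score)
print(f"point-biserial r = {r_pb:.3f} (p = {p_val:.3f})")
```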
Results: Agreement among multiple judges was greater than 0.8 (almost perfect) for 12 of the 21 criteria, but only 0.19 for Bloom's taxonomy level. Factor analysis identified 4 dimensions, with a Kaiser-Meyer-Olkin (KMO) measure of 0.666 (p < .01), 49.979% of explained variance, and a Cronbach's alpha of 0.627.
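A minimal sketch of the internal-structure analyses (KMO, a 4-factor exploratory factor analysis, and Cronbach's alpha) is shown below, using the third-party factor_analyzer package. The data matrix, its dimensions, and the varimax rotation are assumptions for illustration; the abstract does not specify them, and the random values will not reproduce the reported statistics.

```python
import numpy as np
from factor_analyzer import FactorAnalyzer, calculate_kmo

# Hypothetical data: rows are rated items, columns are scores on the
# 21 criteria; randomly generated here purely as a placeholder.
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 21))

# Sampling adequacy (the abstract reports KMO = 0.666).
_, kmo_model = calculate_kmo(data)
print(f"KMO = {kmo_model:.3f}")

# Exploratory factor analysis retaining the 4 reported dimensions
# (varimax rotation is an assumption, not stated in the abstract).
fa = FactorAnalyzer(n_factors=4, rotation="varimax")
fa.fit(data)
_, _, cumulative_var = fa.get_factor_variance()
print(f"variance explained by 4 factors: {cumulative_var[-1]:.1%}")

# Cronbach's alpha for internal consistency:
# alpha = k/(k-1) * (1 - sum of item variances / variance of total score)
def cronbach_alpha(scores: np.ndarray) -> float:
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1)
    total_var = scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

print(f"Cronbach's alpha = {cronbach_alpha(data):.3f}")
```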
Conclusion: This instrument can be used to assess multiple-choice items, as it is supported by validity evidence related to content, response process, and internal structure, and by psychometric values adequate for an instrument of this type.