Item-Score Reliability as a Selection Tool in Test Construction

E.A.O. Zijlmans; J. Tijmstra; L.A. van der Ark; K. Sijtsma

doi:https://doi.org/10.3389/fpsyg.2018.02298

Authors	E.A.O. Zijlmans; J. Tijmstra; L.A. van der Ark; K. Sijtsma
Publication date	01-2019
Journal	Frontiers in Psychology
Article number	2298
Volume | Issue number	9
Number of pages	12
Organisations	Faculty of Social and Behavioural Sciences (FMG) - Research Institute of Child Development and Education (RICDE)
Abstract	This study investigates the usefulness of item-score reliability as a criterion for item selection in test construction. Methods MS, λ₆, and CA were investigated as item-assessment methods in item selection and compared to the corrected item-total correlation, which was used as a benchmark. An ideal ordering in which to add items to the test (bottom-up procedure) or omit items from the test (top-down procedure) was defined based on the population test-score reliability. The orderings that the four item-assessment methods produced in samples were compared to the ideal ordering, and the degree of resemblance was expressed by means of Kendall's τ. To investigate the concordance of the orderings across 1,000 replicated samples, Kendall's W was computed for each item-assessment method. The results showed that for both the bottom-up and the top-down procedures, item-assessment method CA and the corrected item-total correlation most closely resembled the ideal ordering. Generally, all item-assessment methods resembled the ideal ordering better, and concordance of the orderings was greater, for larger sample sizes and for greater variance of the item discrimination parameters.
Document type	Article
Language	English
Published at	https://doi.org/10.3389/fpsyg.2018.02298 (Final published version)
Downloads	fpsyg-09-02298 (Final published version)
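
The statistics named in the abstract can be illustrated with a minimal sketch. It covers only the benchmark (the corrected item-total correlation) and the two agreement measures (Kendall's τ and Kendall's W); methods MS, λ₆, and CA and the bottom-up and top-down procedures are not reproduced here. All function names are hypothetical, the formulas assume rankings without ties, and the example data are made up for illustration.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def corrected_item_total(scores):
    """Correlate each item with the rest score (total score minus that item).

    `scores` is a list of respondent rows, one item score per column.
    """
    n_items = len(scores[0])
    result = []
    for j in range(n_items):
        item = [row[j] for row in scores]
        rest = [sum(row) - row[j] for row in scores]
        result.append(pearson(item, rest))
    return result


def kendall_tau(rank_a, rank_b):
    """Kendall's tau-a between two rankings of the same items (no ties assumed)."""
    concordant = discordant = 0
    for i, j in combinations(range(len(rank_a)), 2):
        s = (rank_a[i] - rank_a[j]) * (rank_b[i] - rank_b[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n_pairs = len(rank_a) * (len(rank_a) - 1) / 2
    return (concordant - discordant) / n_pairs


def kendall_w(rankings):
    """Kendall's W for m rankings of n items, using ranks 1..n (no ties assumed)."""
    m, n = len(rankings), len(rankings[0])
    totals = [sum(r[i] for r in rankings) for i in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Made-up data: 4 respondents, 3 dichotomous items.
toy_scores = [[1, 1, 0], [0, 0, 1], [1, 1, 1], [0, 0, 0]]
item_rest_correlations = corrected_item_total(toy_scores)

# A hypothetical ideal ordering compared with one sample ordering.
ideal_ordering = [1, 2, 3, 4]
sample_ordering = [1, 3, 2, 4]
tau = kendall_tau(ideal_ordering, sample_ordering)

# Concordance of three hypothetical replicated orderings.
w = kendall_w([[1, 2, 3, 4], [1, 3, 2, 4], [2, 1, 3, 4]])
```

In this sketch, τ near 1 means a sample ordering closely resembles the reference ordering, and W near 1 means the orderings agree across replications.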

UvA-DARE

Digital Academic Repository