Rasch & quality control: Controlling data, forgetting quality?
by Gerry Lassche (Miyagi Gakuin Women's University)
". . . while Rasch analysis may be useful for controlling data, it does not have anything to say about the quality of testing practice, or validity, and cannot be a stand-in proxy for such validation procedures."
"Shows great flexibility in reformulating ideas in differing linguistic forms to convey finer shades of meaning precisely, to give emphasis, to differentiate and to eliminate ambiguity."

Ah, the devil is in the details! What does "great" mean? How many forms are required for "differing"? How precise is "precisely"? How does one define relative "emphasis", "differentiation", and "ambiguity"? In order for this test to have construct validity, I believe the meaning of these various terms needs to be unpacked, so that stakeholders can indeed be in agreement about what they refer to.
That the test obtains samples of consistent individual performance while minimizing irrelevant variation is a measure of reliability (Hegelheimer and Chapelle, 2000). This is done through the use of reliable instrumentation, which is essential for claiming valid test use (Lynch, 1996, p. 44). Assuming that an interactionalist paradigm is essential for interpreting such variation, factors such as item/task characteristics (i.e., input such as a text to be manipulated in some way), the instructional rubric, and characteristics of the test setting need to be specified in order to ensure reliability.

When test-takers are assessed differently, and therefore unfairly, because some raters are more severe than others, this is clearly connected to faulty instrumentation: in that case, scoring that is inconsistently applied, that is, more or less severe. Where items do not differentiate between test-takers in a way that adequately reflects differing degrees of performance, that too is a failure of reliability: the items do not refer back in a consistent way to the original construct.
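To make the rater-severity point concrete, here is a minimal sketch (my own illustration, not from the original report) of the dichotomous Rasch model, in which the probability of a correct response depends only on the gap between person ability and item difficulty, both in logits. A rater who is uniformly more severe behaves as if every item were harder by a fixed amount, depressing expected scores for the same underlying ability; the variable names and the 0.5-logit severity figure are hypothetical.

```python
import math

def rasch_prob(theta, difficulty):
    """Dichotomous Rasch model: P(X=1) = exp(theta - b) / (1 + exp(theta - b)),
    where theta is person ability and b is item difficulty (both in logits)."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

# A rater who is 0.5 logits more severe acts as if every item's difficulty
# were raised by 0.5, so the same test-taker gets a lower expected score.
theta = 1.0        # hypothetical person ability
b = 0.0            # hypothetical item difficulty
severity = 0.5     # hypothetical extra severity of one rater

p_lenient = rasch_prob(theta, b)
p_severe = rasch_prob(theta, b + severity)
```

The key property is that `p_severe < p_lenient` for every ability level: the severe rater's scores are systematically shifted, which is exactly the kind of construct-irrelevant variation that Rasch analysis can detect, without saying anything about whether the construct itself is valid.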
". . . all elements which make up the three stages of test development, must be progressively examined in order to determine if test use is validated."
". . . when testers talk about reliability, the assumption should be that validation has taken place first."
Presenter | Test type | Test content | Rasch analysis
Takaaki Kumazawa | Course achievement exit test | 20 MC vocabulary items; 20 MC reading comprehension items | Dependable
Trevor Bond | Placement proficiency test | MC test (item #'s na) | Not dependable
Ed Schaeffer | Rater difficulty in thesis exit evaluation | Theses | Dependable
Presenter 1 - Course achievement analysis
And so these men of Indostan
Disputed loud and long,
Each in his own opinion
Exceeding stiff and strong,
Though each was partly in the right,
And all were in the wrong!

In this poem, the wise blind men have never seen an elephant, and do not know what one looks like. By using their own peculiar perspective (in this case, what was tactilely proximate), they defined their own narrow view of "elephant-ness".
NOTE: A response to some of the criticisms mentioned in this paper is online at http://www.jalt.org/pansig/2008/HTML/SchaKum.htm.