Teacher development and assessment literacy
by Tim Newfields (Toyo University)
After defining the concept of assessment literacy and suggesting possible operationalizations of this concept for three different populations, the rationale for developing an assessment literacy scale is explained. Using a modified Angoff procedure, a small panel of experts evaluated the suitability of 100 possible assessment literacy items for three target populations. Sample items are described, and 70 items concerning assessment-related issues that may be appropriate for high school foreign language teachers are outlined. The paper concludes by considering possible uses and limitations of the Assessment Literacy for High School Foreign Language Teachers Inventory and by calling for further research on assessment literacy.

Keywords: assessment standards, evaluation skills, test competence, statistical literacy, test development
This paper examines the notion of assessment literacy and some of its possible components. After mentioning why assessment literacy is important for teachers, I briefly conceptualize the term, attempt to operationalize it, and finally examine some screening items that might begin to express what this notion represents for three different groups. Those hoping to find a single, cogent definition of "assessment literacy" that works for all groups will be disappointed, because I believe the construct represents a wide matrix of skills that vary significantly from population to population. What might be called "assessment literacy" from the viewpoint of a university student, a high school teacher, and a professional test developer probably involves vastly different skills.
"Instead of conceptualizing assessment literacy solely as a set of given skills, perhaps we should also focus on the conditions needed to foster such skills." |
When I ask university students in Japan about assessment literacy in their native language, very few are able to articulate anything. Is this because they lack the meta-language needed to describe the concept? Or is it because the term satei nouryoku (perhaps the best translation of "assessment literacy") is not a widely used word in Japanese? Such questions are fascinating, but beyond the scope of this paper.
"Often, the biggest challenge in promoting assessment literacy seems to be convincing end-users that the topic is actually worth learning: when many people encounter the arcane jargon and complex statistical formulas sometimes used in assessment, a frequent response is numbness." |
Figure 1. Procedure adopted in this assessment literacy research
Part I: Terminology
question # | response format(s) | sample task(s) | sample topic(s)
Q1 - Q15 | matching | match testing terms with appropriate symbols | sample variance, null hypothesis, mean
Q16 - Q29 | multiple choice | select the correct term for a concept described | exam types, variable types, error types
Q30 - Q35 | open response | explain or contrast various statistical terms | explain the central limit theorem
Part II: Procedures
Q36 - Q40 | short completion | specify the M and SD for five types of test scores | quartile / percentile / stanine / T-score / z-score
Q41 - Q45 | short completion | calculate and interpret basic statistics | calculate the M and SD for a test; pinpoint strong correlation(s)
Q46 - Q50 | short completion | calculate advanced statistics; decide on an appropriate statistic | determine the effect size for two groups; decide which type of ANOVA to use
Q51 - Q55 | short completion | calculate five correlation statistics | determine the Pearson correlation index
Q56 - Q59 | mostly open response | interpret pretest/posttest results | decide what classroom "progress" occurred
Part III: Test Interpretation
Q60 - Q74 | mostly open response | interpret published research | construct validity, accommodation
Part IV: Assessment Ethics
Q75 - Q100 | multiple choice | select the most appropriate sentence response for each question | grading procedures, reporting test scores, handling ethical violations
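To make the computational tasks in Part II above more concrete, the sketch below illustrates the kinds of calculations Q36 - Q50 refer to: converting a raw score to the z-score, T-score, and stanine scales, and computing a Pearson correlation and an effect size for two groups. This is only a minimal Python sketch for illustration; the score data and function names are hypothetical and are not taken from the actual test items.

```python
# Illustrative sketch (hypothetical data) of the Part II-style calculations:
# descriptive statistics, standard-score conversions, correlation, effect size.
from math import sqrt
from statistics import mean, stdev  # stdev = sample standard deviation

raw_scores = [62, 71, 55, 80, 66, 73, 59, 68, 77, 64]  # invented scores
m, sd = mean(raw_scores), stdev(raw_scores)

def z_score(x, m, sd):
    """Standard score scale: M = 0, SD = 1."""
    return (x - m) / sd

def t_score(x, m, sd):
    """T-score scale: M = 50, SD = 10."""
    return 50 + 10 * z_score(x, m, sd)

def stanine(x, m, sd):
    """Approximate stanine (M = 5, SD = 2), clipped to the 1-9 range."""
    return max(1, min(9, round(5 + 2 * z_score(x, m, sd))))

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two score lists."""
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return num / den

def cohens_d(group1, group2):
    """Effect size for two independent groups using a pooled SD."""
    n1, n2 = len(group1), len(group2)
    s1, s2 = stdev(group1), stdev(group2)
    pooled = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (mean(group1) - mean(group2)) / pooled

if __name__ == "__main__":
    print(f"M = {m:.2f}, SD = {sd:.2f}")
    print(f"z for a raw score of 80: {z_score(80, m, sd):.2f}")
    print(f"T for a raw score of 80: {t_score(80, m, sd):.1f}")
    print(f"stanine for a raw score of 80: {stanine(80, m, sd)}")
```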
recommendations in favor | items | total
4 | Q3, Q5, Q11, Q13-14, Q20, Q22-24, Q32, Q36, Q41-42, Q56, Q71, Q76-82, Q84-86, Q88-90, Q92, Q94-100 | 36 items
3 | Q16, Q28, Q32, Q37, Q43, Q72-74, Q75, Q83, Q87, Q91 | 12 items
2 | Q17, Q39-40, Q45, Q47, Q50 | 6 items
1 | Q1, Q4, Q6, Q8, Q15, Q21, Q26, Q38, Q44, Q57-59, Q66 | 13 items
0 | Q2, Q7, Q9-10, Q12, Q18-19, Q25, Q27, Q29-31, Q33-35, Q46, Q48-49, Q51-55, Q60-65, Q67-70, Q93 | 34 items
Based on this procedure, 48 items from Appendix A were adopted into the first version of the Assessment Literacy Test for High School Foreign Language Teachers in Appendix C, and 47 were rejected. The remaining six items, each of which had received two votes, were considered on a case-by-case basis. (The decision rule is sketched informally below.)
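The following is a minimal Python sketch of the screening rule just described: items with three or four recommendations in favor were adopted, items with one or none were rejected, and items with exactly two were set aside for case-by-case review. The helper function name is hypothetical; the handful of vote tallies shown follow the table above.

```python
# Hypothetical sketch of the item-screening rule described above.
def screening_decision(votes_in_favor: int) -> str:
    """Map a panel vote tally to an adopt / reject / review decision."""
    if votes_in_favor >= 3:
        return "adopt"
    if votes_in_favor == 2:
        return "review case by case"
    return "reject"

# A small subset of items, with tallies taken from the table above.
panel_votes = {"Q3": 4, "Q16": 3, "Q17": 2, "Q1": 1, "Q2": 0}
for item, votes in panel_votes.items():
    print(f"{item}: {votes} recommendation(s) in favor -> {screening_decision(votes)}")
```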
Part I: Terminology
question # | response format(s) | sample task(s) | sample topic(s)
Q1 - Q9 | matching | match testing terms with appropriate symbols | sample variance, null hypothesis, mean
Q10 - Q16 | multiple choice | select the correct term for a concept described | exam types, variable types, cutoff points
Q17 - Q20 | open response | explain or contrast various statistical terms | distinguishing masters and non-masters
Part II: Procedures
Q21 - Q25 | short completion | calculate and interpret basic descriptive statistics | calculate the mean and SD for a test; identify points of significance
Q26 - Q29 | open response | interpret pretest/posttest gains | assess whether classroom "progress" occurred
Q30 - Q33 | short completion | calculate three descriptive statistics | describe a box plot and bell curve
Q34 - Q36 | open response | think of three ways to increase the reliability of a writing test item | validity & reliability issues
Part III: Test Interpretation
Q37 - Q44 | mostly open response | interpret tests and research | invalid test items, sloppy statistics, interpreting error of measurement
Part IV: Assessment Ethics
Q45 - Q56 | multiple choice | select the most appropriate sentence response for each question | grading procedures, reporting test scores, handling ethical violations
Q57 - Q70 | mostly open response | identify an ethical problem and/or suggest a solution to a problem | grading procedures, confidentiality issues, dealing with test anxiety
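Several of the open-response items above (Q26 - Q29 and Q37 - Q44) hinge on understanding measurement error when judging whether a pretest/posttest "gain" is meaningful. The sketch below is only an illustration, not part of the inventory: the reliability estimate and scores are invented, and the function name is hypothetical. It uses the standard formula SEM = SD × √(1 − reliability) and a rough 95% band for the difference between two scores.

```python
# Illustrative sketch (invented data) of how measurement error bears on
# interpreting pretest/posttest gains (cf. Q26-Q29 and Q37-Q44).
from math import sqrt

def sem(sd: float, reliability: float) -> float:
    """Standard error of measurement: SD * sqrt(1 - reliability)."""
    return sd * sqrt(1 - reliability)

test_sd, test_reliability = 12.0, 0.85   # hypothetical test characteristics
error = sem(test_sd, test_reliability)

pretest, posttest = 61, 67               # hypothetical scores for one learner
gain = posttest - pretest

# Assuming equal SEMs and independent errors, the standard error of the
# difference between two scores is SEM * sqrt(2); a gain smaller than about
# 1.96 times that value could plausibly be due to measurement error alone.
sed = error * sqrt(2)
print(f"SEM = {error:.2f}, SE of the difference = {sed:.2f}")
print(f"Observed gain = {gain}; rough 95% threshold = {1.96 * sed:.1f}")
```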
" . . . many aspects of assessment are inter-related: ethics often impinge upon interpretation and statistical procedure use." |
Another point that is clear from the inventory in Appendix C1 is that many items focus on those aspects of assessment literacy that are easily measurable. As a result, the Assessment Literacy Test for High School Foreign Language Teachers has a strong quantitative orientation and perhaps too many questions about statistics. Such aspects can be measured in vitro through writing, but perhaps the most important forms of classroom assessment happen in vivo and informally. Moreover, if we look at the tentative operationalization of assessment literacy for teachers suggested in Table 2, it is clear that some areas are under-represented in the test in Appendix C: specifically, items #6 and #11 are not sufficiently covered. This suggests that the test needs to be augmented in some areas (quite likely), that the operationalization of the concept needs to be refined further (also likely), or both.
If we take the optimistic view that, given the time and resources, most teachers will be motivated to improve their own assessment literacy skills, then several suggestions are in order. Table 7 lists some ways that ordinary teachers can become more literate about assessment.
Acknowledgement: I am grateful to Kristie Sage and Peter Ross for their feedback on this article. The limitations of this paper, however, are my responsibility.