Teacher development and assessment literacy

Appendix A:

Foreign Language Assessment Literacy Test -
Preliminary Item Screening

Part 1:
Part 2:
Part 3: Test Interpretation
Part 4: Assessment Ethics

PART II. Procedures

(A) Exercise 1
INSTRUCTIONS: Specify the mean and standard deviation for the following types of norm-referenced tests assuming that the curve has a normal distribution:
36. quartile score: mean = ____ standard deviation = ____
Level A Level B Level C
37. percentile score: mean = ____ standard deviation = ____
Level A Level B Level C
38. stanine score: mean = ____ standard deviation = ____
Level A Level B Level C
39. T score: mean = ____ standard deviation = ____
Level A Level B Level C
40. z score: mean = ____ standard deviation = ____
Level A Level B Level C
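
As a brief illustration of how some of these score types relate to one another, the sketch below converts a raw score into a z score, a T score, and a stanine. It relies on the standard conventions (z: mean 0, SD 1; T: mean 50, SD 10; stanine: mean 5, SD approximately 2). The raw score, mean, and standard deviation used here are hypothetical values chosen for the example, not data from this appendix, and the stanine rounding is a simplification of the usual band boundaries.

```python
def z_score(raw, mean, sd):
    """Standard (z) score: distance from the mean in SD units."""
    return (raw - mean) / sd


def t_score(z):
    """T score: the z score rescaled to mean 50, SD 10."""
    return 50 + 10 * z


def stanine(z):
    """Stanine: the z score rescaled to mean 5, SD 2, rounded, clipped to 1..9.

    Rounding is a simplification; scores exactly on a band boundary
    may be assigned differently by published stanine tables.
    """
    return max(1, min(9, round(2 * z + 5)))


# Hypothetical example: raw score 65 on a test with mean 50 and SD 10.
z = z_score(raw=65, mean=50, sd=10)
print(f"z = {z}, T = {t_score(z)}, stanine = {stanine(z)}")
```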

(B) Exercise 2
INSTRUCTIONS: Look at the data from the test below, then answer Questions 41-45 using any electronic device or software program that you know how to operate.
Raw scores for the four sections of a norm-referenced test of general English ability.
(The number of correct items for each section of the test appears below.)
                 Section 1  Section 2  Section 3  Section 4  Total
k (# of items)      10         30         20         30        90
1. Diana             8         28         15         14        65
2. Cindy             7         22         10         15        54
3. Marilyn           4         11          9          8        32
4. Jack             10         26         19         26        81
5. Chris             5         15         10         16        46
6. Faith             7         18         15         22        62
7. Doug              9         10         12         21        52
8. James             3         10          5         11        29
9. Emiko             8         23         16         25        72
10. Eric             6         19         12         18        55
etc. . .

41. What is the mean of the total test?
Level A Level B Level C

42. What is the standard deviation?
Level A Level B Level C

43. Which student(s) is/are more than one standard deviation from the mean?
Level A Level B Level C

44. Do any sections of this test correlate in a way that is statistically
significant at the p < .05 level? (If so, mention which.)
Level A Level B Level C

45. What sort of distribution curve does this test have so far?
Level A Level B Level C
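
One way to approach Questions 41-44 with software is sketched below in Python, using only the standard library. It covers just the ten students listed (the table continues with "etc."), so the figures describe this partial sample only. The choice of the sample (n - 1) standard deviation, and of 0.632 as the two-tailed critical value of r for p < .05 with 8 degrees of freedom, are assumptions made for this sketch rather than anything specified in the exercise.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, stdev

# Section scores (Sections 1-4) for the ten listed students.
scores = {
    "Diana": (8, 28, 15, 14),
    "Cindy": (7, 22, 10, 15),
    "Marilyn": (4, 11, 9, 8),
    "Jack": (10, 26, 19, 26),
    "Chris": (5, 15, 10, 16),
    "Faith": (7, 18, 15, 22),
    "Doug": (9, 10, 12, 21),
    "James": (3, 10, 5, 11),
    "Emiko": (8, 23, 16, 25),
    "Eric": (6, 19, 12, 18),
}

totals = {name: sum(parts) for name, parts in scores.items()}
mean_total = mean(list(totals.values()))
sd_total = stdev(list(totals.values()))  # sample SD (n - 1), an assumption

# Students whose total lies more than one SD from the mean.
outliers = [n for n, t in totals.items() if abs(t - mean_total) > sd_total]


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Transpose the table so each entry holds one section's ten scores.
sections = list(zip(*scores.values()))
r = {
    (i + 1, j + 1): pearson(sections[i], sections[j])
    for i, j in combinations(range(4), 2)
}

CRITICAL_R = 0.632  # two-tailed, p < .05, df = n - 2 = 8 (assumed threshold)
significant = [pair for pair, value in r.items() if abs(value) > CRITICAL_R]

print(f"mean = {mean_total:.1f}, SD = {sd_total:.2f}")
print("more than 1 SD from the mean:", outliers)
for pair, value in r.items():
    print(f"sections {pair}: r = {value:.3f}")
print("significant pairs:", significant)
```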

[ p. 64 ]

(C) Exercise 3
INSTRUCTIONS: The table below shows hypothetical data for a 50-item test that was given to two different population samples. Look at the data, then calculate the statistics mentioned in Questions 46-50:
                              Population A   Population B
sample size:                       20             80
mean score:                        32             25
standard deviation:                7.5            6
low-high:                        14 - 48        12 - 50
alpha reliability estimate:        .7             .8

46. ANOVA:
Level A Level B Level C

47. F-ratio:
Level A Level B Level C

48. Chi-square distribution:
Level A Level B Level C

49. effect size:
Level A Level B Level C

50. standard error of measurement:
Level A Level B Level C
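
For the two statistics here that follow directly from the summary table, a minimal sketch is given below. It assumes the standard error of measurement is computed as SD × √(1 − reliability), and it uses Cohen's d with a pooled standard deviation as the effect size; the exercise does not say which effect size index is intended, so that choice is an assumption. The function names are illustrative only.

```python
from math import sqrt


def sem(sd, reliability):
    """Standard error of measurement: SD * sqrt(1 - reliability)."""
    return sd * sqrt(1 - reliability)


def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Cohen's d using the pooled standard deviation of two groups."""
    pooled = sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled


# Values taken from the Exercise 3 table.
sem_a = sem(sd=7.5, reliability=0.7)
sem_b = sem(sd=6, reliability=0.8)
d = cohens_d(32, 7.5, 20, 25, 6, 80)

print(f"SEM, Population A: {sem_a:.2f}")
print(f"SEM, Population B: {sem_b:.2f}")
print(f"Cohen's d (A vs. B): {d:.2f}")
```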

(D) Exercise 4
INSTRUCTIONS: Compare the oral interview ratings below by two raters of the same student, then calculate the statistics mentioned in Questions 51-55. Note that all ratings are in terms of 5-point bands, with 5 representing the highest possible rating.
Category         Rater A   Rater B
Grammar            3.5       3
Fluency            4         4
Pronunciation      4         3.5
Cohesion           4         3.5
Vocabulary         4.5       4
Total             20        18

51. The inter-rater reliability coefficient for A and B is ____.
Level A Level B Level C

52. The Pearson correlation index for the two raters is ____.
Level A Level B Level C

53. The index of concordance between the two raters is ____.
Level A Level B Level C

54. The chi-square test of independence for these two raters is ____.
Level A Level B Level C

55. The kappa coefficient of the combined rating is ____.
Level A Level B Level C
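
The sketch below shows how two of the simpler rater statistics could be computed from the five category ratings: the Pearson correlation between the raters, and two plain agreement proportions (exact agreement, and agreement within half a band). Treating a simple agreement proportion as a stand-in for an agreement index is an assumption of this example, and with only five ratings of a single student any such figure is very unstable.

```python
from math import sqrt
from statistics import mean

# Category ratings from the Exercise 4 table (totals excluded).
rater_a = [3.5, 4, 4, 4, 4.5]
rater_b = [3, 4, 3.5, 3.5, 4]


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def agreement(x, y, tolerance=0.0):
    """Proportion of categories where the ratings differ by at most tolerance."""
    hits = sum(1 for a, b in zip(x, y) if abs(a - b) <= tolerance)
    return hits / len(x)


r = pearson(rater_a, rater_b)
exact = agreement(rater_a, rater_b)
within_half_band = agreement(rater_a, rater_b, tolerance=0.5)

print(f"Pearson r = {r:.3f}")
print(f"exact agreement = {exact:.2f}")
print(f"agreement within half a band = {within_half_band:.2f}")
```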

(E) Exercise 5
INSTRUCTIONS: Read the hypothetical data below comparing a 60-item classroom pretest and posttest, then answer the questions that follow. Note that following the pretest, the top one-third of the students were classified into an "upper group" and the bottom one-third were classified into a "bottom group":
Category                            Pretest   Posttest
sample size:                          48        42
total mean:                           30        33
total range:                         7-44      12-52
total standard deviation:             3.6       4.3
upper group mean score:               45        50
upper group standard deviation:       4.0       3.9
bottom group mean:                    20        20
bottom group standard deviation:      4.2       5.8

56. How did the upper group perform differently from the bottom group?
Level A Level B Level C

57. What sort of distribution curve would this posttest likely have?
Level A Level B Level C

58. Which type of ANOVA, if any, would be suitable for measuring
the pretest/posttest gains made by this sample group?
Level A Level B Level C

59. What sort of claims could validly be made about the "progress" of this class?
Level A Level B Level C



2006 Pan SIG-Proceedings

[ p. 65 ]