Appendix A, Part I of Teacher development and assessment literacy

Appendix A:

Foreign Language Assessment Literacy Test -
Preliminary Item Screening

(Ver. 4 - November 7, 2006)

Part 1:
Terminology

INSTRUCTIONS: Below is a list of possible items for three different foreign language assessment literacy tests: one for professional test validators (Level A), one for language teachers with bachelor's degrees in education (Level B), and one for first-year undergraduate education majors (Level C). If you think that an item represents something that professional language test validators should know, mark "Level A". If you believe that an item should be known by foreign language teachers with B.A. degrees, mark "Level B". If you think that an item is something an education major should know before entering college, mark "Level C". If you believe it's not necessary for any of these three populations to know a given item, leave it blank. Please remember that Levels A, B, and C represent, in your view, the minimal competency levels for each of these three populations. If an item is beyond what you believe a member of a group ought to know, then leave it blank.

It's not necessary to answer any of the items below, but you're welcome to do so if you wish. When clicking the boxes for Levels A, B, and C remember that you may click more than one box if it seems appropriate or leave all boxes blank.

When you have completed this document, please email a copy to timothy*at*toyonet*dot*toyo*dot*ac*dot*jp. Thank you for your cooperation.

PART I. Terminology

(A) Matching exercise

INSTRUCTIONS: Match the statistical symbols on the left (1-15) with the corresponding terms on the right (A-P). One item has already been completed as an example. Note that one term does not have a corresponding match.
	(1) df	_13_ (A) chi-square	Level A Level B Level C
	(2) F	___ (B) coefficient of determination	Level A Level B Level C
	(3) H_o	___ (C) degrees of freedom	Level A Level B Level C
	(4) k	___ (D) F-value, variance ratio	Level A Level B Level C
	(5) N	___ (E) null hypothesis	Level A Level B Level C
	(6) n	___ (F) number of cases in a population	Level A Level B Level C
	(7) ρ	___ (G) number of cases in a sample	Level A Level B Level C
	(8) r	___ (H) number of items in a test	Level A Level B Level C
	(9) r²	___ (I) Pearson's correlation coefficient	Level A Level B Level C
	(10) r₂	___ (J) probability of a Type I error	Level A Level B Level C
	(11) σ, SD, S_x	___ (K) sample mean	Level A Level B Level C
	(12) s²	___ (L) sample variance	Level A Level B Level C
	(13) χ²	___ (M) standard deviation	Level A Level B Level C
	(14) X̄, M	___ (N) frequency	Level A Level B Level C
	(15) v	___ (O) x-value	Level A Level B Level C
		___ (P) (1) level of significance, (2) the proportion of responses to an item that are correct	Level A Level B Level C

(B) Multiple choice questions

INSTRUCTIONS: Select the best response (A-E) for the items below.
Note that some items have more than one "correct" possible response.

16. Gender, occupation, or nationality are considered ___ variables in most language studies.	Level A Level B Level C
17. If a test only seems to measure what it claims to, then it is said to have ___ validity.	Level A Level B Level C
18. A ___ error occurs when a researcher thinks there is no relationship between two variables, but there actually is.	Level A Level B Level C
19. The cutoff point for a criterion-referenced test should be when the ___ is equal to or greater than 1.	Level A Level B Level C
20. Exams used to determine a student's progress toward mastery of a content area are known as ___ tests.	Level A Level B Level C
21. How many standard deviations a score is from the mean is revealed by a test's ___.	Level A Level B Level C

22. The test excerpt below is an example of a ___ test.
[Test excerpt (image): German poem]

23. To find out how well a particular item in a test correlates with the total test score, a ___ should be ascertained.	Level A Level B Level C
24. Any variable that is not part of a research study, but still has an effect on its results, is said to ___ that study.	Level A Level B Level C
25. In a 3-parameter IRT test model, the point on an ability scale at which the probability of a correct response for a given item is .5 is known as the ___.	Level A Level B Level C
26. To predict how many more items need to be added to a given test to increase its reliability to a desired value, the ___ should be calculated.	Level A Level B Level C
27. If a test is unidimensional, then it should automatically show a high degree of ___.	Level A Level B Level C
28. The tendency of examinee expectations to contaminate test results is known as ___.	Level A Level B Level C
29. A test administration procedure in which a large set of test items is organized into shorter sub-sets, each of which is randomly assigned to a sub-sample, hence avoiding the need to administer all items to all examinees, is known as ___ sampling.	Level A Level B Level C

30. To compare the mean of a particular sub-group to the mean of a larger group that is within the same population, a ___ should be performed.	Level A Level B Level C
31. Briefly explain the difference between the standard error of estimate (SEE) and standard error of measurement (SEM) in the space below, mentioning when each of these statistics should be used.	Level A Level B Level C
32. If you want to see how much "masters" who scored high on a particular CRT differed from "non-masters" who scored closer to the bottom, which technique(s) might you use?	Level A Level B Level C
33. What's the difference between a predictive and a concurrent validation study? When should each type of study be used?	Level A Level B Level C
34. How do the Kuder-Richardson Formula 20 and Formula 21 differ? When should each be used?	Level A Level B Level C
35. What does the central limit theorem tell us?	Level A Level B Level C