Analyzing A Test
Most of you will be teaching at an institution that has scoring services available. Especially if you are giving objective tests, you might wonder whether you should bother with these services; after all, it won't take that long to do the scoring yourself.
However, analyzing a test goes beyond counting the number of items right on student papers. The information that follows, when used, will help you give feedback to both yourself and your students. In addition, you’ll be able to use it in the future to improve your tests.
Item difficulty is simply the proportion of students who answered an item correctly. If j indicates item number, Nc is the number of students getting the item correct, and N is the total number of students taking the test, then the item difficulty for the jth item is
pj = Nc / N.
Table 1 shows the student by item matrix for a six-item test. A “1” indicates that the student got that item correct. To get item difficulties, simply count the number of students who got an item correct and divide by the number of students. For example, the difficulty for item 1 is p1 = Nc / N = 9/10 or .90. Item 1 was an easy item because almost all students answered it correctly. On the other hand, item 6 has a difficulty p6 = Nc / N of 2/10 or .20. Item 6 was difficult for the group. In this example, students 2, 3, and 4 make up the upper 30% of the class and students 6, 7, and 8 make up the lower 30%.
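Since Table 1 itself is not reproduced on this page, here is a minimal Python sketch of the difficulty calculation. The 0/1 matrix below is made-up data chosen to agree with the difficulties quoted above (items 1 and 6), not the original table:

```python
# Hypothetical student-by-item 0/1 matrix (10 students x 6 items).
# Made-up data, not the original Table 1; a "1" means the student
# answered that item correctly.
scores = [
    [1, 1, 1, 1, 0, 0],  # student 1
    [1, 1, 1, 1, 1, 1],  # student 2
    [1, 0, 1, 1, 1, 1],  # student 3
    [1, 0, 1, 1, 1, 0],  # student 4
    [1, 1, 1, 0, 1, 0],  # student 5
    [1, 0, 1, 1, 0, 0],  # student 6
    [1, 1, 0, 1, 0, 0],  # student 7
    [0, 0, 1, 0, 1, 0],  # student 8
    [1, 1, 1, 1, 0, 0],  # student 9
    [1, 1, 0, 1, 1, 0],  # student 10
]

def item_difficulty(scores):
    """p_j = (number of students answering item j correctly) / N."""
    n_students = len(scores)
    n_items = len(scores[0])
    return [sum(row[j] for row in scores) / n_students
            for j in range(n_items)]

difficulties = item_difficulty(scores)
print(difficulties[0])  # item 1: 0.9 (easy)
print(difficulties[5])  # item 6: 0.2 (hard)
```

Counting down a column and dividing by the class size is all the formula does; the function just applies it to every item at once.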
Table 2, below, shows the answering pattern of the 10 students taking the six items. The asterisk above the option indicates the correct answer. The number in a cell indicates the number of students who answered a particular option.
In looking at Table 2 you can see that for the first five items, most of the students selected the correct answer. In most classroom achievement tests, as a general rule of thumb you want to get an average difficulty of .7 - .8. Why? The reason here is psychological—not mathematical. Even if a student is “given” a B when he has correctly answered only half the items, he still may come away with a feeling of failure. A good B student should feel that he was able to show his knowledge.
What should you do in the future with a test that is too difficult? Consider adding items, usually in the topic areas that deserve more weight. Try to make these items somewhat easier than those on the original test.
Item discrimination is the ability of an item to differentiate students with more knowledge from those with less. To calculate item discrimination, the total test score is used as a surrogate for that knowledge: the top-scoring students are separated from the bottom-scoring students, and you then compare their response patterns. Typically the group is divided into thirds and the middle group is excluded. For our purposes on the sample test, let's call
Students 2, 3, and 4 the top group (Nu)
Students 6, 7, and 8 the bottom group (Nl)
An item discriminates positively if more students in the upper group got the item right than students in the lower group. To calculate the discrimination index, subtract the number of students in the lower group who got the item correct from the number in the upper group, and divide by the number of students in the upper (or lower) group. The formula for the discrimination index of item j, where pju = Ncu / Nu is the item difficulty for the upper third and pjl = Ncl / Nl is the difficulty for the lower third, is
dj = pju - pjl
or, if Nu = Nl, that is, the number of students in the bottom one third is the same as that in the upper third,
dj = (Ncu - Ncl ) / Nu.
For example, item 1 was answered correctly by all three students in the upper group and by two of the three students in the lower group. The discrimination index is
d1 = (Ncu - Ncl) / Nu = (3 - 2) / 3 = 1/3 or .33.
The discrimination index for item 6 is (2 – 0)/3 or .67. The discrimination index for item 6 is higher because more good students got this item correct. Another way of viewing this item is that if you got it correct, you are apt to be a good student.
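The upper/lower split described above can be sketched in Python. The matrix below is the same made-up data as before (not the original Table 1), and ranking by total score is one common way to form the thirds:

```python
# Same hypothetical 10-student x 6-item matrix as above (made-up data,
# not the original Table 1).
scores = [
    [1, 1, 1, 1, 0, 0],  # student 1
    [1, 1, 1, 1, 1, 1],  # student 2
    [1, 0, 1, 1, 1, 1],  # student 3
    [1, 0, 1, 1, 1, 0],  # student 4
    [1, 1, 1, 0, 1, 0],  # student 5
    [1, 0, 1, 1, 0, 0],  # student 6
    [1, 1, 0, 1, 0, 0],  # student 7
    [0, 0, 1, 0, 1, 0],  # student 8
    [1, 1, 1, 1, 0, 0],  # student 9
    [1, 1, 0, 1, 1, 0],  # student 10
]

def discrimination_index(scores, item, frac=1 / 3):
    """d_j = (Ncu - Ncl) / Nu: correct answers in the upper group
    minus correct answers in the lower group, divided by group size.
    Groups are formed by ranking students on total test score."""
    n_group = max(1, round(len(scores) * frac))
    ranked = sorted(scores, key=sum, reverse=True)  # best scorers first
    upper, lower = ranked[:n_group], ranked[-n_group:]
    n_cu = sum(row[item] for row in upper)
    n_cl = sum(row[item] for row in lower)
    return (n_cu - n_cl) / n_group

print(round(discrimination_index(scores, 0), 2))  # item 1: 0.33
print(round(discrimination_index(scores, 5), 2))  # item 6: 0.67
```

Note that ties in total score make the group boundaries somewhat arbitrary with only 10 students; with a larger class the choice of tie-breaking matters less.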
Item 2, however, produces different results. One student in the upper group got it correct; one in the lower got it correct yielding a discrimination index of 0. Why did as many good as poor students get this item wrong? Going back to Table 2, you can see that many students were drawn to distractor “a”. What is in distractor “a” that pulls students? Have you possibly mis-keyed the item? Whatever it is, it’s something you want to look at before you hand back the exam. Perhaps you can see immediately why students chose this and you say to yourself, “I can see why this might be considered a correct answer.” If so, give credit for both options.
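The distractor check described above (is some wrong option pulling students, or is the item mis-keyed?) is just a tally of the answer column. Here is a short sketch with hypothetical responses, since Table 2's actual counts are not reproduced on this page:

```python
from collections import Counter

# Hypothetical raw responses for one multiple-choice item, keyed "c".
# Made-up data, not Table 2's actual counts.
key = "c"
responses = ["a", "c", "a", "a", "c", "b", "a", "d", "c", "a"]

counts = Counter(responses)
print(counts.most_common())  # [('a', 5), ('c', 3), ('b', 1), ('d', 1)]

# Flag any distractor chosen as often as (or more than) the key --
# a sign of a possibly mis-keyed item or a defensible second answer.
for option, n in counts.items():
    if option != key and n >= counts[key]:
        print(f"look at distractor {option!r}: chosen {n} times")
```

In this made-up example the tally would flag distractor "a", which is exactly the situation the paragraph above describes: before handing back the exam, decide whether "a" deserves credit too.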
Item analyses can give you useful information. To illustrate the various points, the example given had just 10 students. For small classes (fewer than 20 students), remember that special characteristics of the particular students may affect the responses; if you were to give this test to another class, you might get somewhat different results. With a large class, your results are apt to be more stable.
If you give several tests, errors will eventually cancel out. Overall, remember, if you give a point after a reasoned argument, you are a hero; if you take away points or lower grades, you are a cad.
© CET, SFSU 2003