
Some Mathematical Underpinnings

This Page Includes
 Scales
  Correlation Coefficients
  Regression

 

Scales

Scales have long been categorized into four types, defined by their mathematical properties and the mathematical operations those properties allow. We have already discussed ratio and interval scales in Module I, but to repeat briefly: a ratio scale has a true zero point (the numeral zero represents the absence of the construct) and equal distances between the points on the scale, while an interval scale keeps the equal-distance assumption but drops the true-zero assumption. Any mathematical operation is allowable with a ratio scale, while only addition and subtraction are allowable with an interval scale. Note, however, that as soon as you subtract two values, the numeral zero takes on meaning: a difference of zero means no difference. The implication is that you can always discuss distance from a point on an interval scale, but you can discuss the ratio of two points only if you have a ratio scale. Put another way, the points or locations on a ratio scale are meaningful, while only distances are meaningful on an interval scale.
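As a small illustration of that last point, the sketch below relabels the same two scores with a different origin. The scores and the shift of 100 are hypothetical; the only assumption is that an interval scale's zero is arbitrary, so either labeling is equally valid.

```python
# Hypothetical scores on an interval scale (the zero point is arbitrary).
scores = [30, 60]

# The same two scores relabeled with a different, equally arbitrary origin.
shifted = [s + 100 for s in scores]

# Distances survive the change of origin...
diff_original = scores[1] - scores[0]
diff_shifted = shifted[1] - shifted[0]

# ...but ratios of the raw values do not.
ratio_original = scores[1] / scores[0]
ratio_shifted = shifted[1] / shifted[0]

print(diff_original, diff_shifted)
print(ratio_original, round(ratio_shifted, 3))
```

The two differences agree, while the two ratios do not, which is why "twice as much" is only safe to say on a ratio scale.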

activity


Why is the difference between two values on an interval scale itself on a ratio scale? Can you say that my knowledge is twice Sam’s if my test score is 40 and his is 20? If I grew from 35 to 40 and Sam grew from 10 to 20, who changed more? By how much? How would you compare the changes? Who should get the higher grade?

Two other scales are used.

One is called an ordinal scale because only order is meaningful. The scale values are 1st, 2nd, 3rd, and so forth. So while we record these data as numbers, they really aren't. They are only relative positions and so are not truly numeric. To be 'first' in a race, i.e. to say 'I won' is completely meaningless outside of some context: the situation, length of the race, and most importantly the other contestants. I can always find 10 people who will not be able to run as fast as I, so I will win.

BUT:
  1. A ranking of five teachers (ordinal scale!) is not a rating of them (interval scale).
  2. A ranking (ordinal scale) of 30 students is not a rating of them.
  3. The best teacher I ever had is not necessarily a good teacher.
  4. A bronze medal in the Olympics is great—all the contestants were good athletes and 2/10 of a second is a meaningless difference. (Gold, silver, bronze is ordinal!)
  5. I am the second best speller in my household. (Even if there are only two of us.)
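To see why a ranking is not a rating, consider the rough sketch below. The two sets of ratings are made up, and the `ranks` helper is a hypothetical illustration that ignores ties.

```python
def ranks(ratings):
    """Return 1-based ranks, highest rating first (ties are not handled)."""
    order = sorted(ratings, reverse=True)
    return [order.index(r) + 1 for r in ratings]

# Hypothetical ratings of three teachers on an interval-type scale.
close_group = [9.9, 9.8, 9.7]      # all three are nearly identical
spread_group = [9.9, 5.0, 1.0]     # very large gaps between them

print(ranks(close_group))
print(ranks(spread_group))
```

Both groups produce the same ranking, so the ranks alone cannot tell us whether the gaps between people are trivial or enormous.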

The last scale is not really a scale but a set of categories. Called a nominal scale, it is a classification listing of mutually exclusive categories under an inclusive title. Thus, the variable 'gender' is a dichotomy, male and female; and the variable 'major' may be composed of science, liberal arts, fine arts, and other.


Correlation Coefficients

The statistic that summarizes the link (the relationship, or correlation) between two variables is called a correlation coefficient. There are many types of correlation coefficient, best learned as you need them; each was developed for a specific situation and has its own interpretation. The correlation coefficients most prominently used in test theory are:

Pearson r = the straight-line (linear) relationship between two interval or ratio variables, e.g. between two tests
point biserial/biserial = the relationship between a ratio or interval variable and a real (point biserial) or artificial (biserial) dichotomy, e.g. between an item and a test score
phi (φ) coefficient = the relationship between two dichotomies, e.g. between two items

The Pearson r is the grandparent of all correlation coefficients: some are derived from it and others are compared to it. Chapters could be written on it, but for our purposes, several points should be remembered.

  1. It is sensitive only to the linear relationship between two variables, not to other, more complex shapes. That is, a fixed increase in one variable goes with a fixed increase (or decrease) in the other, and that rate is constant throughout the range of the variables.
  2. The actual redundancy between the two variables is calculated by squaring the coefficient. Thus, if r_XY = .71, then the redundancy in information between the variables X and Y is (.71)² ≈ .50, or 50%.
  3. The values run between -1 and +1, that is, -1 ≤ r_XY ≤ +1.
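The points above can be checked with a short calculation. This is a minimal sketch using the standard product-moment formula; the two sets of test scores and the curved example are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)

# Hypothetical scores of five students on two tests.
test_a = [1, 2, 3, 4, 5]
test_b = [2, 4, 5, 4, 5]
r = pearson_r(test_a, test_b)
print(round(r, 3), round(r ** 2, 3))   # r and the redundancy r squared

# A perfect but curved (U-shaped) relationship that r does not detect.
curve_x = [-2, -1, 0, 1, 2]
curve_y = [v ** 2 for v in curve_x]
print(pearson_r(curve_x, curve_y))
```

For these scores r is about .77, so the redundancy is .60. The U-shaped data give r = 0 even though Y is completely determined by X, which is point 1 in action.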

Figure 3: Degrees of correlation between two variables in terms of variance overlap. Notice that greater overlap results in a greater value for the Pearson r.

If one of the variables, say Y, is a dichotomy, then the Pearson r simplifies to the point-biserial r. This index measures the overlap in X between the two groups of people defined by the dichotomous nominal variable, Y = 1 and Y = 2. If no person in group 1 (say, the lower group) is higher on the X variable than anyone in group 2, then the coefficient is at or near 1. If the two groups overlap completely, and are thus indistinguishable from one another with respect to the X variable, then the coefficient is 0. The range is the same as for the Pearson r, but the sign is merely a function of which group is labeled “1” and which “2”, so in practice only positive values are considered.

Figure 4: Three degrees of overlap between two groups with approximate point biserial correlation coefficients
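Because the point-biserial r is simply the Pearson r with a dichotomous Y, it can be sketched with the ordinary Pearson formula. The group codes and X values below are hypothetical and are not the data behind Figure 4.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)

# Hypothetical data: Y codes group membership (1 = lower group, 2 = upper group).
group = [1, 1, 1, 2, 2, 2]
x_separated = [10, 11, 12, 20, 21, 22]   # no one in group 1 is above anyone in group 2
x_identical = [10, 15, 20, 10, 15, 20]   # both groups have exactly the same X values

r_separated = pearson_r(x_separated, group)
r_identical = pearson_r(x_identical, group)

# Swapping the labels "1" and "2" changes only the sign.
swapped = [3 - g for g in group]
r_swapped = pearson_r(x_separated, swapped)

print(round(r_separated, 3), round(r_identical, 3), round(r_swapped, 3))
```

With no overlap the coefficient comes out close to 1, with identical groups it is 0, and relabeling the groups flips the sign without changing the size.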

The phi (φ) coefficient describes the overlap between two dichotomies, and its rationale derives directly from the study of logic and Venn diagrams. For our purposes, however, a 2 × 2 table can illustrate varying degrees of relationship. Suppose we are talking about two test items and have narrowed the discussion down to whether students have passed or failed each item. Success on one item can tell us about success on the other with differing degrees of probability. Studying Figure 5 will show you how this works and the kind of information you can get from test data. Here, one hundred students are represented.

Figure 5: Three degrees of overlap between two test items with approximate phi coefficients
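A 2 × 2 table of counts can be turned into a phi coefficient with the usual cross-product formula. The sketch below uses that formula with three hypothetical tables of one hundred students each; the counts are illustrative and are not taken from Figure 5.

```python
from math import sqrt

def phi(a, b, c, d):
    """Phi coefficient for a 2 x 2 table of counts laid out as:

                     item 2 pass   item 2 fail
        item 1 pass       a             b
        item 1 fail       c             d
    """
    return (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))

# Hypothetical tables, each describing 100 students.
perfect = phi(50, 0, 0, 50)       # passing one item always means passing the other
strong = phi(40, 10, 10, 40)      # most students pass both or fail both
unrelated = phi(25, 25, 25, 25)   # one item says nothing about the other

print(perfect, strong, unrelated)
```

The three tables give 1.0, 0.6, and 0.0, moving from complete overlap to none.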

 

activity

 

  1. For each pair of variables, select the appropriate correlation coefficient:
  • Sex and pass/fail item 3 on the test
  • Number of class absences and class grade
  • Number of years of mother’s education and sex

 

 

Regression

When we use the Pearson correlation and its underlying linear relationship to predict the Y variable from the X variable, we are applying linear regression. The Pearson correlation, then, can also be used to describe how much, or how accurately, we can predict Y from X. The regression equation is

Model 1: Y’ = bX + c

where Y’ is the predicted Y, b = ΔY/ΔX, the change in Y for a unit change in X (the slope of the line), and c is the Y intercept, the place where the line crosses the Y axis at X = 0.
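As a rough sketch of Model 1, the code below estimates b and c from data and then predicts Y for a new X. It assumes the usual least-squares fit, which the text above does not spell out, and the scores and the `fit_line` name are hypothetical.

```python
def fit_line(x, y):
    """Least-squares slope b and intercept c for the model Y' = bX + c."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (v - mean_y) for a, v in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    b = s_xy / s_xx            # change in Y per unit change in X
    c = mean_y - b * mean_x    # predicted Y when X = 0
    return b, c

# Hypothetical scores of five students on a predictor X and a criterion Y.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b, c = fit_line(x, y)
predicted = b * 6 + c          # Y' for a new student with X = 6

print(round(b, 3), round(c, 3), round(predicted, 3))
```

For these scores the slope is .6 and the intercept is 2.2, so a student with X = 6 has a predicted Y of 5.8.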

For testing purposes, the regression equation has several implications. First, for Model 1 stated above, (r_XY)² is the overlap between X and Y, or the amount of Y that can be predicted from X. Second, the equation can be expanded to include more X (predictor) variables:

Model 2: Y’ = b_1 X_1 + b_2 X_2 + … + b_J X_J + c

Our correlation expands to the “multiple correlation coefficient” (R), and R² is the amount of redundancy between Y and the complete set of predictor variables, X_1 through X_J.
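Model 2 with two predictors can be sketched in plain Python. This assumes a least-squares fit solved directly from the centered sums of squares and cross-products; the function name and the data are hypothetical, and with more predictors a statistics package would be the practical choice.

```python
def fit_two_predictors(x1, x2, y):
    """Least-squares fit of Y' = b1*X1 + b2*X2 + c; returns (b1, b2, c, R squared)."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]
    s11 = sum(a * a for a in d1)
    s22 = sum(a * a for a in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * b for a, b in zip(d1, dy))
    s2y = sum(a * b for a, b in zip(d2, dy))
    syy = sum(a * a for a in dy)
    det = s11 * s22 - s12 ** 2          # zero if X1 and X2 are perfectly correlated
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    c = my - b1 * m1 - b2 * m2
    r_squared = (b1 * s1y + b2 * s2y) / syy   # share of Y variance predicted by the set
    return b1, b2, c, r_squared

# Hypothetical scores of six students on two predictors and one criterion.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
y = [3, 4, 6, 6, 9, 9]

b1, b2, c, r_squared = fit_two_predictors(x1, x2, y)
print(round(b1, 3), round(b2, 3), round(c, 3), round(r_squared, 3))
```

The returned R² plays the same role for the pair of predictors that (r_XY)² plays for a single predictor in Model 1.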

activity


Suppose that for every hour spent studying per week, students’ GPAs increase by .15 points. What is the slope of the prediction line?

If you wanted to predict course grades from grades in the two course prerequisites, what would your regression model look like?

 



© CET, SFSU 2003