Some Mathematical Underpinnings
Scales
Scales have been categorized for years into four types, defined by their mathematical qualities and the mathematical operations those qualities allow. We have already discussed ratio and interval scales in Module I, but to repeat briefly: a ratio scale has a true zero point (the numeral zero represents the absence of the construct) and equal distances between the points on the scale; an interval scale has equal distances but no true zero. Any mathematical operation is allowable with a ratio scale, while only addition and subtraction are allowable with an interval scale. Note, however, that as soon as you subtract two values, the numeral zero takes on meaning: a difference of zero means there is no distance between the two points. The implication is that you can always discuss distance from a point on an interval scale, and you can also discuss the ratio of two points if you have a ratio scale. Put another way, the points or locations themselves are meaningful on a ratio scale, while only distances are meaningful on an interval scale.
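A small sketch may make the distinction concrete. Temperature is not an example used in this module; it is assumed here only because Celsius behaves like an interval scale (arbitrary zero) and Kelvin like a ratio scale (true zero). The readings are made-up values.

```python
# Celsius has an arbitrary zero (interval scale); Kelvin has a true zero (ratio scale).
# The two readings below are illustrative values, not data from the module.

def celsius_to_kelvin(c):
    return c + 273.15

a_c, b_c = 10.0, 20.0
a_k, b_k = celsius_to_kelvin(a_c), celsius_to_kelvin(b_c)

# Distances (differences) agree on both scales, so they are meaningful on either.
diff_c = b_c - a_c
diff_k = b_k - a_k

# Ratios do not agree: the Celsius ratio depends on where zero happens to sit.
ratio_c = b_c / a_c
ratio_k = b_k / a_k

print(diff_c, round(diff_k, 6))
print(ratio_c, round(ratio_k, 3))
```

The difference is 10 degrees on both scales, but "twice as hot" holds only on the Celsius numbers, which is why ratios are reserved for scales with a true zero.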
Two other scales are used
The last scale is not really a scale but a set of categories. Called a nominal scale, it is a classification listing of mutually exclusive categories under an inclusive title. Thus, the variable "gender" is a dichotomy (male and female), and the variable "major" may be composed of science, liberal arts, fine arts, and other.
The statistical method that summarizes the link (the relationship or correlation) between two variables is called the correlation coefficient. There are many of these types of correlation coefficients, best learned as you need them, but they were all developed for specific situations and have different interpretations. The correlation coefficients most prominently used in test theory are the
| Pearson r | the straight-line (linear) relationship between two interval or ratio variables, e.g. between two tests |
| point biserial/biserial | between a ratio or interval variable and a real/false dichotomy, e.g. between an item and a test score |
| phi (φ) | between two dichotomies, e.g. between two items |
The Pearson r is the grandparent of all coefficients: the one from which some are derived and to which others are compared. Chapters could be written on it, but for our purposes, several points should be remembered.
Figure 3: Degrees of correlation between two variables in terms of variance overlap. Notice that greater overlap results in a greater value for the Pearson r.
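The variance-overlap idea can be sketched numerically. The helper `pearson_r` and the six pairs of test scores below are hypothetical, written only for illustration; the squared coefficient is the proportion of variance the two variables share.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores of six students on two tests.
test_a = [52, 60, 65, 71, 80, 88]
test_b = [48, 63, 60, 75, 78, 90]

r = pearson_r(test_a, test_b)
# Print r and r squared (the proportion of shared variance).
print(round(r, 3), round(r * r, 3))
```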
If one of the variables, say Y, is a dichotomy, then the Pearson r simplifies to a point-biserial r. This index is a measure of overlap in X between the two groups of people defined by the dichotomous nominal variable, Y = 1 and Y = 2. If no person in group 1 (say the lower group) is higher on the X variable than anyone in group 2, the coefficient is high, approaching 1 as the gap between the groups grows relative to the spread within each group. If the two groups overlap completely and are thus indistinguishable from one another with respect to the X variable, the coefficient is 0. The range is the same as for the Pearson r, but the sign is merely a function of which group is "1" and which is "2", so in practice only positive values are considered.
Figure 4: Three degrees of overlap between two groups with approximate point biserial correlation coefficients
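As a rough check on the claim that the point-biserial r is just a Pearson r with a dichotomous variable, the sketch below computes it both ways. The function names, the 0/1 coding of the groups (the text codes them 1 and 2, which gives the same correlation), and the eight scores are assumptions made for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def point_biserial(x, group):
    """Point-biserial r from the two group means; group must be coded 0/1."""
    n = len(x)
    upper = [a for a, g in zip(x, group) if g == 1]
    lower = [a for a, g in zip(x, group) if g == 0]
    p = len(upper) / n              # proportion of people in group 1
    q = 1 - p                       # proportion of people in group 0
    mean_x = sum(x) / n
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)  # population SD of X
    mean_gap = sum(upper) / len(upper) - sum(lower) / len(lower)
    return mean_gap / sd_x * sqrt(p * q)

# Hypothetical data: total test score and pass (1) / fail (0) on one item.
scores = [12, 15, 11, 18, 20, 14, 17, 19]
item = [0, 0, 0, 1, 1, 0, 1, 1]

r_pearson = pearson_r(scores, item)
r_pb = point_biserial(scores, item)
print(round(r_pearson, 4), round(r_pb, 4))  # the two values agree
```

Swapping the group labels flips the sign but not the size of the coefficient, which is why only the positive value is usually reported.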
The phi (φ) coefficient describes the overlap between two dichotomies, and its rationale derives directly from the study of logic and Venn diagrams. For our purposes, however, a 2 X 2 table can illustrate varying degrees of relationship. Suppose we are talking about two test items and have narrowed the discussion down to whether students have either passed or failed the items. Success on one item can tell us about success on the other with differing degrees of probability. Studying Figure 5 will show you how this works and the kind of information you can get from test data.
Here, one hundred students are represented.
Figure 5: Three degrees of overlap between two test items with approximate phi coefficients
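The 2 X 2 logic can be sketched with the standard formula for phi computed from cell counts. The counts below are hypothetical tables of one hundred students each and are not the values shown in Figure 5; the function name `phi` and its argument order are likewise assumptions.

```python
from math import sqrt

def phi(both_pass, pass_1_only, pass_2_only, both_fail):
    """Phi coefficient for a 2 x 2 table of pass/fail counts on two items."""
    a, b, c, d = both_pass, pass_1_only, pass_2_only, both_fail
    numerator = a * d - b * c
    # Product of the four marginal totals (two row sums, two column sums).
    denominator = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return numerator / denominator

# Three hypothetical tables, each representing 100 students.
no_relationship = phi(25, 25, 25, 25)   # passing item 1 says nothing about item 2
moderate = phi(40, 10, 10, 40)          # most students pass both or fail both
perfect = phi(50, 0, 0, 50)             # every student passes both or fails both

print(round(no_relationship, 2), round(moderate, 2), round(perfect, 2))  # 0.0 0.6 1.0
```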
When we use the Pearson correlation and its underlying linear relationship to predict the Y variable from the X variable, we are applying linear regression. The Pearson correlation, then, can also be used to describe how much, or how accurately, we can predict Y from X. The regression equation is
Model 1: Y’ = bX + c
where Y’ is the predicted Y, b = ΔY/ΔX is the change in Y for a unit change in X (the slope of the line), and c is the Y intercept, the place where the line crosses the Y axis at X = 0.
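Model 1 can be sketched with an ordinary least-squares fit, which is the usual way b and c are estimated (the text does not say how they are obtained, so this is an assumption). The function `fit_line` and the five data points are hypothetical.

```python
def fit_line(x, y):
    """Least-squares slope b and intercept c for the model y' = b*x + c."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (v - mean_y) for a, v in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    b = sxy / sxx              # change in predicted y per unit change in x
    c = mean_y - b * mean_x    # predicted y where the line crosses x = 0
    return b, c

# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b, c = fit_line(x, y)
y_pred_at_6 = b * 6 + c
print(round(b, 3), round(c, 3), round(y_pred_at_6, 3))  # 0.6 2.2 5.8
```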
For testing purposes, the regression equation has several implications. First, for Model 1 stated above, (rxy)² is the overlap between X and Y, that is, the proportion of the variance in Y that can be predicted from X. Second, the equation can be expanded to more X (predictor) variables,
Y’ = b1X1 + b2X2 + … + bJXJ + c
Our correlation expands to the multiple correlation coefficient (R), and R² is the amount of redundancy between Y and the complete set of variables, X1 through XJ.
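For the smallest expanded case, two predictors, the weights and R² can be worked out by hand from sums of squares and cross-products. The sketch below assumes an ordinary least-squares fit; the function `fit_two_predictors` and the data are hypothetical and use generic variables rather than any example from this module.

```python
def fit_two_predictors(x1, x2, y):
    """Least-squares fit of y' = b1*x1 + b2*x2 + c; returns (b1, b2, c, R squared)."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    d1 = [a - m1 for a in x1]   # deviations from the mean
    d2 = [a - m2 for a in x2]
    dy = [v - my for v in y]
    s11 = sum(a * a for a in d1)
    s22 = sum(a * a for a in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * v for a, v in zip(d1, dy))
    s2y = sum(a * v for a, v in zip(d2, dy))
    det = s11 * s22 - s12 ** 2  # zero if the two predictors are perfectly collinear
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    c = my - b1 * m1 - b2 * m2
    # R squared: share of the variance in y reproduced by the predicted values.
    sse = sum((v - (b1 * a + b2 * b + c)) ** 2 for a, b, v in zip(x1, x2, y))
    sst = sum(v * v for v in dy)
    return b1, b2, c, 1 - sse / sst

# Hypothetical data for six people on two predictors and one outcome.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
y = [3, 4, 6, 7, 10, 10]

b1, b2, c, r_squared = fit_two_predictors(x1, x2, y)
print(round(b1, 3), round(b2, 3), round(c, 3), round(r_squared, 3))
```

With more than two predictors the same idea applies, but the weights are normally obtained with matrix methods in a statistics package rather than by hand.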
If you wanted to predict course grades from grades in the two course prerequisites, what would your regression model look like?
| © CET, SFSU 2003 |