Multicollinearity



YouTube link: http://www.youtube.com/watch?v=Ybzc3AB1E-E

Commentary:

There is a tendency in the literature to use the words collinearity and multicollinearity interchangeably. However, there is some benefit to distinguishing the two terms. Strictly speaking, collinearity is observed when two variables correlate perfectly, at 1.0 or -1.0, with each other (Pedhazur, 1997). In practice, collinearity may be said to be observed when two independent variables included in a multiple regression analysis correlate very highly with each other (say, at .95 or above).
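
As a rough illustration of this pairwise check, the sketch below (in Python with NumPy, using a small simulated data set) flags any pair of predictors whose correlation exceeds a cutoff of .95. The variable names, the simulated data, and the threshold are illustrative choices, not part of the original discussion.

    import numpy as np

    # Simulate three predictors; x2 is constructed to be nearly collinear with x1.
    rng = np.random.default_rng(0)
    n = 200
    x1 = rng.normal(size=n)
    x2 = 0.98 * x1 + 0.02 * rng.normal(size=n)
    x3 = rng.normal(size=n)
    X = np.column_stack([x1, x2, x3])

    # Flag any pair of predictors correlated above the (illustrative) cutoff.
    corr = np.corrcoef(X, rowvar=False)
    cutoff = 0.95
    for i in range(corr.shape[0]):
        for j in range(i + 1, corr.shape[1]):
            if abs(corr[i, j]) >= cutoff:
                print(f"x{i + 1} and x{j + 1} correlate at {corr[i, j]:.3f}")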

By contrast, multicollinearity may be said to be observed when two or more independent variables in combination predict a very substantial percentage of another independent variable's variance. Precisely how much of the variance must be shared is a matter of debate. The implications are serious, however, as the presence of collinearity or multicollinearity is known to adversely affect the estimation of regression statistics. In particular, multicollinearity can be expected to undermine the accuracy of the beta weights and to inflate their standard errors.
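
One way to see this shared variance directly is to regress each independent variable on the remaining ones and examine the resulting R-squared; the corresponding tolerance is simply 1 minus that R-squared. The sketch below does this for a simulated example (the data and variable names are arbitrary, chosen only to show a case where no single pairwise correlation is extreme but one predictor is still largely predictable from the others).

    import numpy as np

    def r_squared_on_others(X, j):
        # R^2 from regressing column j of X on the remaining columns (with intercept).
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(len(y)), others])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        ss_res = np.sum((y - A @ beta) ** 2)
        ss_tot = np.sum((y - y.mean()) ** 2)
        return 1.0 - ss_res / ss_tot

    # Simulated predictors: x3 is close to the sum of x1 and x2, so much of its
    # variance is predictable from the other two in combination.
    rng = np.random.default_rng(0)
    n = 200
    x1 = rng.normal(size=n)
    x2 = rng.normal(size=n)
    x3 = x1 + x2 + 0.2 * rng.normal(size=n)
    X = np.column_stack([x1, x2, x3])

    for j in range(X.shape[1]):
        r2 = r_squared_on_others(X, j)
        print(f"x{j + 1}: R^2 on the other predictors = {r2:.3f}, tolerance = {1 - r2:.3f}")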

Although an examination of the correlation matrix can give a suggestive indication of multicollinearity, the two most common diagnostics are tolerance and the variance inflation factor (VIF). More sophisticated techniques include the use of condition indices in conjunction with variance-decomposition proportions (Belsley, Kuh, & Welsch, 2004).
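
A minimal sketch of these diagnostics, assuming the Python statsmodels library is available, is given below: the VIF (and its reciprocal, the tolerance) is computed for each predictor, and condition indices are obtained from the singular values of the column-scaled design matrix, in the spirit of Belsley et al. The simulated data are purely illustrative, and the conventional cutoffs for these indices are themselves debated (see O'Brien, 2007, under Further Reading).

    import numpy as np
    import statsmodels.api as sm
    from statsmodels.stats.outliers_influence import variance_inflation_factor

    # Simulated predictors: x2 is strongly related to x1.
    rng = np.random.default_rng(1)
    n = 200
    x1 = rng.normal(size=n)
    x2 = x1 + 0.1 * rng.normal(size=n)
    x3 = rng.normal(size=n)
    X = sm.add_constant(np.column_stack([x1, x2, x3]))  # intercept in column 0

    # VIF and tolerance (= 1/VIF) for each predictor, skipping the intercept.
    for j in range(1, X.shape[1]):
        vif = variance_inflation_factor(X, j)
        print(f"x{j}: VIF = {vif:.2f}, tolerance = {1 / vif:.3f}")

    # Condition indices: largest singular value of the column-scaled design
    # matrix divided by each singular value (Belsley, Kuh, & Welsch, 2004).
    X_scaled = X / np.linalg.norm(X, axis=0)
    singular_values = np.linalg.svd(X_scaled, compute_uv=False)
    print("Condition indices:", np.round(singular_values[0] / singular_values, 1))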

See also

● Variance inflation factor

References

● Belsley, D. A., Kuh, E., & Welsch, R. E. (2004). Regression diagnostics: Identifying influential data and sources of collinearity. New York: Wiley.
  
● Pedhazur, E. J. (1997). Multiple regression in behavioral research. South Melbourne: Wadsworth.

Further Reading

● O'Brien, R. M. (2007). A caution regarding rules of thumb for variance inflation factors. Quality & Quantity, 41, 673-690.