A nonlinear relationship between y and x can often
be approximately represented within the general linear model as a polynomial
function of x.
Example:
$y_i = b_0 + b_1 x_i + b_2 x_i^2 + e_i$
may be represented as a linear model
$y_i = b_0 + b_1 x_{i1} + b_2 x_{i2} + e_i$
with the transformed variables $x_{i1} = x_i$ and $x_{i2} = x_i^2$.
The order of the polynomial function is the highest exponent of x; the model above is a second-order model.
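A minimal sketch of this idea, assuming NumPy is available: the squared term is added as an extra column of the design matrix and the coefficients are then estimated by ordinary least squares. The data, the seed, and the "true" coefficients are hypothetical and serve only to illustrate.

```python
import numpy as np

# Hypothetical data generated from a known quadratic (illustration only).
rng = np.random.default_rng(0)
x = np.linspace(-3.0, 3.0, 50)
y = 1.0 + 2.0 * x - 0.5 * x**2 + rng.normal(scale=0.1, size=x.size)

# Transformed variables: first column is the constant, then x_i1 = x_i
# and x_i2 = x_i^2.
X = np.column_stack([np.ones_like(x), x, x**2])

# The model is linear in b0, b1, b2, so ordinary least squares applies.
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print("estimated b0, b1, b2:", b)
```

The estimates should land close to the values used to generate the data, even though the relationship between y and x is curved.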
To estimate a polynomial function, x is often first deviated from its mean (or median) to reduce collinearity between x and higher powers of x. A variable deviated from its mean is called centered. The transformation is
$x = X - \bar{X}$
where x (lowercase) represents the centered variable and X (uppercase) the original (uncentered) variable.
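The effect of centering on collinearity can be checked numerically. The sketch below assumes NumPy and uses hypothetical predictor values; it compares the correlation between the predictor and its square before and after centering.

```python
import numpy as np

# Hypothetical predictor values, all positive, so X and X^2 rise together.
X = np.arange(1.0, 21.0)      # 1, 2, ..., 20
x = X - X.mean()              # centered variable

r_raw = np.corrcoef(X, X**2)[0, 1]
r_centered = np.corrcoef(x, x**2)[0, 1]

print("correlation of X with X^2:", r_raw)
print("correlation of x with x^2:", r_centered)
```

Because these illustrative values are symmetric around their mean, the correlation after centering is essentially zero. With skewed data, centering typically reduces the correlation without removing it entirely.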
A polynomial function can be used when
The response function E{y} for any polynomial model with one predictor variable can be represented on a 2-dimensional plot of y against x.
A second degree polynomial implies a parabolic relationship. The signs of the coefficients determine the shape of the response function, as shown in these graphs:
Example: The Kuznets curve postulates an inverted U-shaped relationship between income inequality and economic development (measured as log GDP per capita in this example). This curvilinear relationship is often approximated with a second degree polynomial (aka quadratic function).
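The turning point of a second degree polynomial follows from setting its first derivative to zero. A short derivation, written for the one-predictor second-order response function with coefficients b0, b1, b2:

```latex
\begin{align*}
E\{y\} &= b_0 + b_1 x + b_2 x^2 \\
\frac{dE\{y\}}{dx} &= b_1 + 2 b_2 x = 0
  \quad\Longrightarrow\quad x^{*} = -\frac{b_1}{2 b_2} \\
\frac{d^2 E\{y\}}{dx^2} &= 2 b_2
\end{align*}
```

When b2 is negative the curve is an inverted U with its maximum at x*, which is the shape the Kuznets curve postulates; when b2 is positive the curve is U-shaped with its minimum at x*.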
Higher degree polynomials produce curves with more inflection points:
When estimating a polynomial function, it is often useful to test for the joint significance of the coefficients of x, $x^2$, and higher powers of x, in addition to testing for the significance of each coefficient separately. In a joint test of significance one tests $H_0: b_1 = b_2 = 0$ against the alternative that at least one of the coefficients is not zero. Joint significance tests are explained in Module 8.
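One common way to carry out such a joint test is to compare the error sum of squares of the full model with that of the reduced model implied by H0. The sketch below assumes NumPy; the data, the seed, and the helper name `sse` are hypothetical. It computes only the F statistic, which would then be compared with the F distribution on q and n - p degrees of freedom.

```python
import numpy as np

def sse(X, y):
    """Residual sum of squares from an OLS fit of y on the columns of X."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    return float(resid @ resid)

# Hypothetical data with a genuine quadratic relationship (illustration only).
rng = np.random.default_rng(1)
x = np.linspace(-2.0, 2.0, 40)
y = 3.0 + 1.5 * x - 0.8 * x**2 + rng.normal(scale=0.5, size=x.size)

n = x.size
X_full = np.column_stack([np.ones(n), x, x**2])   # b0, b1, b2
X_reduced = np.ones((n, 1))                        # model under H0: b1 = b2 = 0

q = 2                                # number of coefficients set to zero by H0
df_error = n - X_full.shape[1]       # n - p for the full model
sse_full = sse(X_full, y)
sse_reduced = sse(X_reduced, y)

F = ((sse_reduced - sse_full) / q) / (sse_full / df_error)
print("F statistic:", F, "on", q, "and", df_error, "degrees of freedom")
```

A large F relative to the critical value leads to rejecting H0, even in cases where the individual t-tests are weakened by collinearity between x and its square.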
NOTES
A second-order model with two predictors has the general response function
$E\{Y\} = b_0 + b_1 x_1 + b_2 x_2 + b_{11} x_1^2 + b_{22} x_2^2 + b_{12} x_1 x_2$
where
$x_1 = X_1 - \bar{X}_1$
$x_2 = X_2 - \bar{X}_2$
(The x variables are centered.) The indexing of the coefficients reflects the composition of the corresponding term. The response function is a quadratic function of x1 and x2. The product term x1x2 represents the interaction of x1 and x2. The coefficient b12 therefore represents the effect of the interaction of x1 and x2 on Y (more on this below).
The response function E{y} of a polynomial regression model of any order with two predictor variables may be represented in 3-dimensional space with dimensions y, x1, and x2. The response function defines a surface in 3-dimensional space which can alternatively be represented
Example: the model with response function
$E\{y\} = 1{,}740 - 4x_1^2 - 3x_2^2 - 3x_1x_2$
yields the quadratic response surface
The following exhibit (figure b) is another example of a quadratic surface
Polynomial models involving more than 2 predictor variables are possible but the response function can no longer be represented in 3-dimensional space.
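Where the exhibits are not at hand, a two-predictor response surface can be explored numerically. The sketch below assumes NumPy and evaluates the example response function E{y} = 1,740 - 4x1^2 - 3x2^2 - 3x1x2 on a grid; the grid range is an assumption chosen for illustration.

```python
import numpy as np

def response(x1, x2):
    """Example response function: 1,740 - 4*x1^2 - 3*x2^2 - 3*x1*x2."""
    return 1740.0 - 4.0 * x1**2 - 3.0 * x2**2 - 3.0 * x1 * x2

# Grid of centered predictor values (range is an assumption).
grid = np.linspace(-10.0, 10.0, 21)
x1, x2 = np.meshgrid(grid, grid)
surface = response(x1, x2)

# Locate the highest point of the surface on the grid.
i, j = np.unravel_index(np.argmax(surface), surface.shape)
print("maximum on grid at x1 =", x1[i, j], "x2 =", x2[i, j],
      "with E{y} =", surface[i, j])
```

The array `surface` is the kind of input a contour or perspective plotting routine would take; here it is only searched for its peak, which lies at the center of the predictor values.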
A regression model with p-1 predictors is called additive if the response function can be written in the form
$E\{y\} = f_1(x_1) + f_2(x_2) + \dots + f_{p-1}(x_{p-1})$
where $f_1, f_2, \dots, f_{p-1}$ can be any functions.
In the interaction model with response function
$E\{y\} = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_1 x_2$
the meaning of the regression coefficients b1 and b2 is not the same as it is in a model without interaction. The change in E{y} with a unit increase in x1 when x2 is held constant is
$\partial E\{y\} / \partial x_1 = b_1 + b_3 x_2$
and the change in E{y} with a unit increase in x2 when x1 is held constant is
$\partial E\{y\} / \partial x_2 = b_2 + b_3 x_1$
Therefore in the interaction model the effect of each of x1 and x2 depends on the level of the other variable, so the regression model is no longer additive.
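The conditional effect b1 + b3x2 can be verified directly by raising x1 by one unit while x2 stays fixed. A small sketch in plain Python; the coefficient values and function names are hypothetical and chosen only for illustration.

```python
def expected_y(x1, x2, b0, b1, b2, b3):
    """Response function E{y} = b0 + b1*x1 + b2*x2 + b3*x1*x2."""
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2


def effect_of_x1(x2, b1, b3):
    """Change in E{y} per unit increase in x1 with x2 held constant."""
    return b1 + b3 * x2


# Hypothetical coefficients (illustration only).
coef = dict(b0=1.0, b1=2.0, b2=3.0, b3=0.5)

effects = {}
for x2 in (0.0, 2.0, 4.0):
    # Direct check: raise x1 from 0 to 1 while x2 is held at its value.
    unit_change = expected_y(1.0, x2, **coef) - expected_y(0.0, x2, **coef)
    effects[x2] = effect_of_x1(x2, coef["b1"], coef["b3"])
    print("x2 =", x2, "unit change =", unit_change, "b1 + b3*x2 =", effects[x2])
```

The two columns agree at every value of x2, and the effect of x1 grows with x2 because b3 is positive in this illustration.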
Example: compare the additive model
(a) $E\{y\} = 10 + 2x_1 + 5x_2$
to the interaction models
(b) $E\{y\} = 10 + 2x_1 + 5x_2 + 0.5x_1x_2$ (reinforcement effect)
(c) $E\{y\} = 10 + 2x_1 + 5x_2 - 0.5x_1x_2$ (interference effect)
In the first interaction model (b) the value of y is increased (relative to the additive model) when x1 and x2 both have high values; hence x1 and x2 reinforce each other.
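The three models (a), (b), and (c) can be compared numerically at a low and a high pair of predictor values. The sketch below uses plain Python; the evaluation points and function names are assumptions chosen for illustration.

```python
def additive(x1, x2):
    """Model (a): no interaction term."""
    return 10 + 2 * x1 + 5 * x2


def reinforcement(x1, x2):
    """Model (b): positive coefficient on the product term."""
    return 10 + 2 * x1 + 5 * x2 + 0.5 * x1 * x2


def interference(x1, x2):
    """Model (c): negative coefficient on the product term."""
    return 10 + 2 * x1 + 5 * x2 - 0.5 * x1 * x2


# Evaluation points (x1, x2) are assumptions chosen for illustration.
points = [(1, 1), (3, 3)]
rows = [(x1, x2, additive(x1, x2), reinforcement(x1, x2), interference(x1, x2))
        for x1, x2 in points]

for x1, x2, a, b, c in rows:
    print(f"x1={x1} x2={x2}  (a)={a}  (b)={b}  (c)={c}")
```

At (1, 1) the interaction models differ from the additive model by 0.5 in either direction; at (3, 3) the gap widens to 4.5, since the interaction term scales with the product x1x2.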
Interactions can be represented as plots of y against x1 conditional on the value of x2, called conditional effects plots:
Interaction effects can also be represented by drawing the response surface (y as function of x1 and x2) in perspective in 3-dimensional space or using contour plots.
The following exhibits show how to plot a response surface using SYSTAT and 3 representations of the interaction model of NKNW Problem 7.39 p. 323. (As of V.9 STATA does not do 3-dimensional plots.)
Example: (From von Eye and Schuster. 1998. Regression Analysis for Social Science. New York: Academic Press. Pp. 159-162.)
The authors report a regression analysis with variables
REC: Recall performance (dependent variable)
CC1: Cognitive complexity measure
EDUC: Educational background
The estimated model is (t-ratios in parentheses):
| REC = | -10.78 | + 5.34 CC1 | + 16.6 EDUC | - 0.97 (CC1 x EDUC) | R² = .043 | n = 327 |
| | (-0.41) | (2.58) | (3.02) | (-2.35) | | |
The coefficients of CC1, EDUC, and the interaction term are significant at the .05 level (|t| > 1.96); the intercept is not. Interpret the results. Is the interaction effect of the reinforcement or interference type?
The interaction model can be visualized in 3-D space (after determining the range of the independent variables):
(SYSTAT command is: fplot y=-10.78+5.34*x1+16.6*x2-0.97*x1*x2; stick=out xmin=4 xmax=72 ymin=4 ymax=7 surface=xycut)
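As a numerical complement to the surface plot, the conditional slope of REC on CC1 can be tabulated at several levels of EDUC using the reported coefficients. This plain-Python sketch assumes that x1 in the plotting command stands for CC1 and x2 for EDUC, and that EDUC ranges from 4 to 7 as the ymin and ymax settings suggest; the function and variable names are hypothetical.

```python
# Coefficients as reported in the estimated model.
B0, B_CC1, B_EDUC, B_INT = -10.78, 5.34, 16.6, -0.97


def predicted_rec(cc1, educ):
    """Predicted recall from the estimated interaction model."""
    return B0 + B_CC1 * cc1 + B_EDUC * educ + B_INT * cc1 * educ


def slope_cc1(educ):
    """Change in predicted REC per unit of CC1, with EDUC held constant."""
    return B_CC1 + B_INT * educ


# EDUC levels 4 to 7 are an assumption taken from the plot range.
for educ in (4, 5, 6, 7):
    print("EDUC =", educ, "slope of REC on CC1 =", round(slope_cc1(educ), 2))

# Level of EDUC at which the conditional slope of CC1 equals zero.
crossover = -B_CC1 / B_INT
print("slope of CC1 is zero at EDUC =", round(crossover, 2))
```

The conditional slope of CC1 declines as EDUC rises and changes sign within the assumed range, which is worth keeping in mind when judging the type of interaction.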