Module 6 - POLYNOMIAL REGRESSION & INTERACTIONS

1.  POLYNOMIAL REGRESSION WITH ONE PREDICTOR VARIABLE

1.  Formulation of the Model

A nonlinear relationship between y and x can often be approximately represented within the general linear model as a polynomial function of x.
Example:

y_i = b_0 + b_1 x_i + b_2 x_i^2 + e_i
may be represented as a linear model
y_i = b_0 + b_1 x_{i1} + b_2 x_{i2} + e_i
with the transformed variables x_{i1} = x_i and x_{i2} = x_i^2.

The order of the polynomial function is the highest exponent of x; the model above is a second-order model.
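
The idea that a second-order polynomial is just a linear model in the transformed variables x and x^2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are made up and noise-free, and the helper names (solve_linear_system, fit_quadratic) are hypothetical, not part of any statistical package. The fit uses the ordinary least-squares normal equations.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_quadratic(x, y):
    """Least-squares estimates (b0, b1, b2) for y = b0 + b1*x + b2*x^2."""
    # Design matrix columns: constant, x_i1 = x_i, x_i2 = x_i^2
    design = [[1.0, xi, xi ** 2] for xi in x]
    p = 3
    xtx = [[sum(row[j] * row[k] for row in design) for k in range(p)]
           for j in range(p)]
    xty = [sum(row[j] * yi for row, yi in zip(design, y)) for j in range(p)]
    return solve_linear_system(xtx, xty)


# Made-up, noise-free data generated from y = 2 + 0.5x - 1.5x^2
x = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
y = [2 + 0.5 * xi - 1.5 * xi ** 2 for xi in x]
b0, b1, b2 = fit_quadratic(x, y)
print(round(b0, 6), round(b1, 6), round(b2, 6))
```

Because the data contain no error term, the estimates should reproduce the generating coefficients up to rounding; with real data the same steps give the usual least-squares fit.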

To estimate a polynomial function, x is often first deviated from its mean (or median) to reduce collinearity between x and higher powers of x.  A variable deviated from its mean is called centered.  The transformation is x = X - \bar{X}, where x (lowercase) represents the centered variable, X (uppercase) the original (uncentered) variable, and \bar{X} the mean of X.
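
The effect of centering on collinearity can be checked numerically. The sketch below compares the correlation between X and X^2 before and after centering. The data (the integers 1 to 10) are made up for illustration, and pearson_r is a hypothetical helper written here, not a library function.

```python
from math import sqrt


def pearson_r(u, v):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    s_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    s_uu = sum((a - mean_u) ** 2 for a in u)
    s_vv = sum((b - mean_v) ** 2 for b in v)
    return s_uv / sqrt(s_uu * s_vv)


# Made-up predictor values (uncentered)
X = [float(v) for v in range(1, 11)]
X_mean = sum(X) / len(X)

# Centered variable: x = X - mean of X
x = [v - X_mean for v in X]

r_raw = pearson_r(X, [v ** 2 for v in X])
r_centered = pearson_r(x, [v ** 2 for v in x])
print(f"uncentered: {r_raw:.3f}   centered: {r_centered:.3f}")
```

With these evenly spaced values the correlation is close to 1 before centering and essentially zero afterwards. The correlation vanishes completely only because the made-up data are symmetric about their mean; with skewed data, centering typically reduces the correlation without eliminating it.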
A polynomial function can be used when

2.  Graphic Representation of the Model

The response function E{y} for any polynomial model with one predictor variable can be represented on a 2-dimensional plot of y against x.

A second degree polynomial implies a parabolic relationship.  The signs of the coefficients determine the shape of the response function:

as shown in these graphs:

Example:  The Kuznets curve postulates an inverted U-shaped relationship between income inequality and economic development (measured as log GDP per capita in this example).  This curvilinear relationship is often approximated with a second degree polynomial (aka quadratic function).
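
For a second-degree response function E{y} = b_0 + b_1 x + b_2 x^2, setting the derivative b_1 + 2 b_2 x to zero locates the turning point at x* = -b_1 / (2 b_2); the curve is U-shaped when b_2 > 0 and inverted-U-shaped when b_2 < 0. The short sketch below applies this. The coefficient values are hypothetical (they are not Kuznets-curve estimates), and quadratic_turning_point is an illustrative helper name.

```python
def quadratic_turning_point(b0, b1, b2):
    """Return (x*, E{y} at x*, shape) for E{y} = b0 + b1*x + b2*x^2."""
    if b2 == 0:
        raise ValueError("b2 = 0: the response function is linear")
    # The derivative b1 + 2*b2*x is zero at x* = -b1 / (2*b2)
    x_star = -b1 / (2 * b2)
    y_star = b0 + b1 * x_star + b2 * x_star ** 2
    shape = "U-shaped (minimum)" if b2 > 0 else "inverted U (maximum)"
    return x_star, y_star, shape


# Hypothetical coefficients with b2 < 0, i.e. an inverted U
x_star, y_star, shape = quadratic_turning_point(b0=10.0, b1=4.0, b2=-0.5)
print(x_star, y_star, shape)  # 4.0 18.0 inverted U (maximum)
```

When x is centered, x* is expressed in deviations from the mean, so the mean must be added back to report the turning point on the original scale.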

Higher degree polynomials produce curves with more inflection points:

When estimating a polynomial function, it is often useful to test for the joint significance of the coefficients of x, x^2, and higher powers of x, in addition to testing for the significance of each coefficient separately.  In a joint test of significance for the second-order model, one tests H0: b_1 = b_2 = 0 against the alternative that at least one of the coefficients is not zero.  Joint significance tests are explained in Module 8.

NOTES

2.  POLYNOMIAL REGRESSION WITH MORE THAN ONE PREDICTOR VARIABLE

1. Formulation of the Model

A second-order model with two predictors has the general response function

E{Y} = b_0 + b_1 x_1 + b_2 x_2 + b_{11} x_1^2 + b_{22} x_2^2 + b_{12} x_1 x_2
where
x_1 = X_1 - \bar{X}_1
x_2 = X_2 - \bar{X}_2
(The x variables are centered.)  The indexing of the coefficients reflects the composition of the corresponding term.  The response function is a quadratic function of x_1 and x_2.  The product term x_1 x_2 represents the interaction of x_1 and x_2.  The coefficient b_{12} therefore represents the effect of the interaction of x_1 and x_2 on Y (more on this below).

2.  Graphic Representation of the Model

The response function E{y} of a polynomial regression model of any order with two predictor variables may be represented in 3-dimensional space with dimensions y, x_1, and x_2.  The response function defines a surface in 3-dimensional space which can alternatively be represented

Example: the model with response function

E{y} = 1,740 - 4x_1^2 - 3x_2^2 - 3x_1 x_2
yields the quadratic response surface

The following exhibit (figure b) is another example of a quadratic surface

Polynomial models involving more than 2 predictor variables are possible but the response function can no longer be represented in 3-dimensional space.
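
Even without a 3-dimensional plot, a quadratic response surface can be explored by evaluating it on a grid. The sketch below does this for the example response function with intercept 1,740 given above. The grid range of -10 to 10 for the centered predictors is an assumption made for illustration, and the function name response is hypothetical.

```python
def response(x1, x2):
    """Example response function E{y} = 1,740 - 4*x1^2 - 3*x2^2 - 3*x1*x2."""
    return 1740 - 4 * x1 ** 2 - 3 * x2 ** 2 - 3 * x1 * x2


# Assumed grid of centered predictor values
grid = range(-10, 11)

# Largest value of E{y} on the grid, with the (x1, x2) point where it occurs
best = max((response(a, b), a, b) for a in grid for b in grid)
print(best)  # (1740, 0, 0)

# One slice of the surface: E{y} as a function of x1 with x2 held at 5
slice_at_5 = [(a, response(a, 5)) for a in (-10, -5, 0, 5, 10)]
print(slice_at_5)
```

All three second-order terms pull E{y} down as x_1 and x_2 move away from zero in the same direction, so the surface peaks at the means of the original variables (x_1 = x_2 = 0). Each slice with x_2 fixed is itself a parabola in x_1.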

3.  INTERACTION REGRESSION MODELS

1.  Formulation of the Model

A regression model with p-1 predictors is called additive if the response function can be written in the form

E{y} = f_1(x_1) + f_2(x_2) + ... + f_{p-1}(x_{p-1})
where f_1, f_2, ..., f_{p-1} can be any functions.
Models that are not additive contain interaction effects.  Interactions are commonly represented as cross-product terms called interaction terms.
The simplest interaction model is a special case (without the square terms) of the second-order polynomial model with two predictor variables with response function
E{y} = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_1 x_2
The meaning of the regression coefficients b_1 and b_2 is not the same as it is in a model without interaction.  In the interaction model, the change in E{y} with a unit increase in x_1 when x_2 is held constant is
b_1 + b_3 x_2
and the change in E{y} with a unit increase in x_2 when x_1 is held constant is
b_2 + b_3 x_1
Therefore, in the interaction model the effect of each of x_1 and x_2 depends on the level of the other variable, so the regression model is no longer additive.
NOTE:  The effects of x_1 and x_2 are obtained by taking the partial derivatives of E{y} with respect to x_1 and x_2, respectively:
\partial E{y} / \partial x_1 = b_1 + b_3 x_2
\partial E{y} / \partial x_2 = b_2 + b_3 x_1

Example: compare the additive model

(a) E{y} = 10 + 2x_1 + 5x_2
to the interaction models
(b) E{y} = 10 + 2x_1 + 5x_2 + 0.5x_1 x_2  (reinforcement effect)
(c) E{y} = 10 + 2x_1 + 5x_2 - 0.5x_1 x_2  (interference effect)
In the first interaction model (b) the value of y is increased (relative to the additive model) when x_1 and x_2 both have high values; hence x_1 and x_2 reinforce each other.
In the second interaction model (c) the value of y is decreased (relative to the additive model) when x_1 and x_2 both have high values; hence x_1 and x_2 interfere with each other.
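
The comparison of models (a), (b), and (c) can be made concrete by computing the effect of a unit increase in x_1, b_1 + b_3 x_2, at a few levels of x_2. The sketch below uses the coefficients of the three example models; the chosen levels of x_2 (1 and 3) and the function names are arbitrary choices for illustration.

```python
def expected_y(x1, x2, b3):
    """E{y} = 10 + 2*x1 + 5*x2 + b3*x1*x2; b3 = 0 gives the additive model (a)."""
    return 10 + 2 * x1 + 5 * x2 + b3 * x1 * x2


def effect_of_x1(x2, b3, b1=2):
    """Change in E{y} per unit increase in x1 with x2 held constant: b1 + b3*x2."""
    return b1 + b3 * x2


models = {"(a) additive": 0.0, "(b) reinforcement": 0.5, "(c) interference": -0.5}

for label, b3 in models.items():
    # Effect of x1 at two arbitrary levels of x2, and E{y} at x1 = x2 = 3
    effects = [effect_of_x1(x2, b3) for x2 in (1, 3)]
    print(label, effects, expected_y(3, 3, b3))
```

In model (a) the effect of x_1 is 2 at every level of x_2. In model (b) it rises from 2.5 to 3.5 as x_2 goes from 1 to 3, and in model (c) it falls from 1.5 to 0.5. At x_1 = x_2 = 3 the three models give E{y} of 31, 35.5, and 26.5, respectively.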

2. Graphic Representation of the Model

Interactions can be represented as plots of y against x_1 conditional on the value of x_2, called conditional effects plots:

Interaction effects can also be represented by drawing the response surface (y as a function of x_1 and x_2) in perspective in 3-dimensional space or using contour plots.

The following exhibits show how to plot a response surface using SYSTAT and 3 representations of the interaction model of NKNW Problem 7.39 p. 323.  (As of V.9 STATA does not do 3-dimensional plots.)

Example: (From von Eye and Schuster. 1998.  Regression Analysis for Social Science.  New York: Academic Press. Pp. 159-162.)

The authors report a regression analysis with variables

REC: Recall performance (dependent variable)
CC1: Cognitive complexity measure
EDUC: Educational background

The estimated model is (t-ratios in parentheses):

REC = -10.78 + 5.34 CC1 + 16.6 EDUC - 0.97 (CC1 x EDUC)     R^2 = .043   n = 327
      (-.41)   (2.58)     (3.02)      (-2.35)

All coefficients except the intercept (t = -.41) are significant.  Interpret the results.  Is the interaction effect of the reinforcement or interference type?

The interaction model can be visualized in 3-D space (after determining the range of the independent variables):

(SYSTAT command is: fplot y=-10.78+5.34*x1+16.6*x2-0.97*x1*x2; stick=out xmin=4 xmax=72 ymin=4 ymax=7 surface=xycut)
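
For readers without SYSTAT, the same estimated response function can be evaluated numerically. The sketch below is a rough Python counterpart, not a translation of the fplot command. It assumes that x1 in the command stands for CC1 (range 4 to 72) and x2 for EDUC (range 4 to 7), as the order of the terms and the axis limits suggest; the function names are hypothetical.

```python
def rec_hat(cc1, educ):
    """Estimated recall: -10.78 + 5.34*CC1 + 16.6*EDUC - 0.97*CC1*EDUC."""
    return -10.78 + 5.34 * cc1 + 16.6 * educ - 0.97 * cc1 * educ


def effect_of_cc1(educ):
    """Change in estimated REC per unit increase in CC1 at a given EDUC."""
    return 5.34 - 0.97 * educ


# Conditional effect of CC1 at each level of EDUC in the assumed range 4..7
for educ in (4, 5, 6, 7):
    print(f"EDUC = {educ}: effect of CC1 = {effect_of_cc1(educ):.2f}")

# Estimated REC at the four corners of the assumed plotting region
for cc1 in (4, 72):
    for educ in (4, 7):
        print(f"CC1 = {cc1:2d}, EDUC = {educ}: REC-hat = {rec_hat(cc1, educ):.2f}")
```

Printing the conditional effect of CC1 by level of EDUC gives the slopes of the lines that a conditional effects plot would show, which is a useful starting point for the interpretation asked for above.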



Last modified 27 February 2006