SYSTAT Rectangular file D:\mydocs\ys209\Survey2.syd

This file is provided with SYSTAT V9.0, from Afifi & Clark (1984).

>names
 
Variables in the SYSTAT Rectangular file are:
 ID           SEX          AGE          MARITAL      EDUCATN      EMPLOY
 INCOME       RELIGION     BLUE         DEPRESS      LONELY       CRY
 SAD          FEARFUL      FAILURE      AS_GOOD      HOPEFUL      HAPPY
 ENJOY        BOTHERED     NO_EAT       EFFORT       BADSLEEP     GETGOING
 MIND         TALKLESS     UNFRNDLY     DISLIKE      TOTAL        CASECONT
 DRINK        HEALTHY      DOCTOR       MEDS         BED_DAYS     ILLNESS
 CHRONIC      MARITAL$     SEX$         AGE$         EDUC$        DRINKS
 LOGINC       MALE         MARRIED      PROT         CATH         FEMALE

TOTAL is a depression score that is the sum of the 20 items BLUE to DISLIKE.
SEX is coded 1=male, 2=female.

>let female=0
>if sex=2 then let female=1
>let l10inc=l10(income+1)
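
For readers who do not use SYSTAT, the recoding above can be sketched in Python. This is only an illustration: the function name recode and the sample cases are made up, and it assumes that SYSTAT's L10 function is the base-10 logarithm. Adding 1 to income keeps the logarithm defined when income is zero.

```python
import math

def recode(sex, income):
    """Mirror the SYSTAT LET/IF statements for a single case."""
    # let female=0 / if sex=2 then let female=1
    female = 1 if sex == 2 else 0
    # let l10inc=l10(income+1)  (assumes L10 is log base 10)
    l10inc = math.log10(income + 1)
    return female, l10inc

# Hypothetical (sex, income) pairs, not taken from Survey2.syd
cases = [(1, 9), (2, 99), (2, 0)]
recoded = [recode(sex, income) for sex, income in cases]

for (sex, income), (female, l10inc) in zip(cases, recoded):
    print(f"sex={sex} income={income} -> female={female} l10inc={l10inc:.3f}")
```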

>mglh
>model total=constant+age+female+l10inc+educatn
>estimate
 
Dep Var: TOTAL   N: 256   Multiple R: 0.325690   Squared multiple R: 0.106074
 
Adjusted squared multiple R: 0.091828   Standard error of estimate: 8.490592
 
Effect         Coefficient    Std Error     Std Coef Tolerance     t   P(2 Tail)
 
CONSTANT         21.395008     3.033244     0.000000   .        7.05351  0.00000
AGE              -0.095752     0.033526    -0.172383  0.977636 -2.85607  0.00465
FEMALE            1.982634     1.095914     0.109506  0.972044  1.80912  0.07163
L10INC           -5.847107     1.836840    -0.204768  0.860685 -3.18324  0.00164
EDUCATN          -0.593161     0.439126    -0.087098  0.856596 -1.35078  0.17798
 
                             Analysis of Variance
 
Source             Sum-of-Squares   df  Mean-Square     F-ratio       P
 
Regression           2147.123745     4   536.780936    7.445968    0.000011
Residual             1.80946E+04   251    72.090144
-------------------------------------------------------------------------------
 
*** WARNING ***
Case          220 is an outlier        (Studentized Residual =     3.863695)
Case          256 is an outlier        (Studentized Residual =     4.344835)
 
Durbin-Watson D Statistic     0.706
First Order Autocorrelation   0.612
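
The summary statistics in this output can be recovered from the Analysis of Variance table. The sketch below uses the standard formulas with the printed sums of squares; because the residual sum of squares is printed to only six significant digits, the results agree with the output to about four decimal places rather than exactly. The variable names are illustrative.

```python
# Values copied from the first model's output
n = 256                  # number of cases
p = 4                    # predictors, excluding the constant
ss_reg = 2147.123745     # regression sum of squares
ss_res = 1.80946e4       # residual sum of squares (as printed, rounded)

df_res = n - p - 1       # residual degrees of freedom (251)

# Squared multiple R: share of the total sum of squares explained
r2 = ss_reg / (ss_reg + ss_res)

# Adjusted squared multiple R: penalizes the number of predictors
adj_r2 = 1 - (1 - r2) * (n - 1) / df_res

# F-ratio: regression mean square over residual mean square
f_ratio = (ss_reg / p) / (ss_res / df_res)

# Standard error of estimate: square root of the residual mean square
see = (ss_res / df_res) ** 0.5

print(f"R2={r2:.6f}  adjusted R2={adj_r2:.6f}  F={f_ratio:.4f}  SEE={see:.4f}")
```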



MARITAL is coded 1 to 5 with 2=married.

>let married=0
>if marital=2 then let married=1

>model total=constant+age+female+l10inc+educatn+married
>estimate
 
Dep Var: TOTAL   N: 256   Multiple R: 0.325734   Squared multiple R: 0.106103
 
Adjusted squared multiple R: 0.088225   Standard error of estimate: 8.507419
 
Effect         Coefficient    Std Error     Std Coef Tolerance     t   P(2 Tail)
 
CONSTANT         21.349262     3.081694     0.000000   .        6.92777  0.00000
AGE              -0.094726     0.035486    -0.170535  0.876091 -2.66940  0.00810
FEMALE            1.982414     1.098089     0.109494  0.972039  1.80533  0.07223
L10INC           -5.779476     1.988751    -0.202399  0.737132 -2.90608  0.00399
EDUCATN          -0.600881     0.448323    -0.088232  0.825071 -1.34029  0.18137
MARRIED          -0.108490     1.208720    -0.006100  0.774089 -0.08976  0.92855
 
                             Analysis of Variance
 
Source             Sum-of-Squares   df  Mean-Square     F-ratio       P
 
Regression           2147.706821     5   429.541364    5.934845    0.000033
Residual             1.80940E+04   250    72.376173
-------------------------------------------------------------------------------
 
*** WARNING ***
Case          220 is an outlier        (Studentized Residual =     3.862356)
Case          256 is an outlier        (Studentized Residual =     4.342031)
 
Durbin-Watson D Statistic     0.706
First Order Autocorrelation   0.613
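
The Tolerance column is one minus the squared multiple correlation of each predictor with the other predictors, so its reciprocal is the variance inflation factor (VIF). As a rough sketch, the tolerances printed for this model can be converted as follows; the dictionary names are illustrative.

```python
# Tolerances copied from the model that adds MARRIED
tolerance = {
    "AGE": 0.876091,
    "FEMALE": 0.972039,
    "L10INC": 0.737132,
    "EDUCATN": 0.825071,
    "MARRIED": 0.774089,
}

# VIF is the reciprocal of tolerance
vif = {name: 1.0 / tol for name, tol in tolerance.items()}

# Predictor most affected by overlap with the other predictors
most_collinear = max(vif, key=vif.get)

for name, value in vif.items():
    print(f"{name:8s} VIF = {value:.3f}")
print("Largest VIF:", most_collinear)
```

All of the VIFs here are well under 2, so collinearity among these predictors is mild.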



RELIGION is coded from 1 to 5 with 1=Protestant, 2=Catholic.

>let prot=0
>if religion=1 then let prot=1
>let cath=0
>if religion=2 then let cath=1

>model total=constant+age+female+l10inc+educatn+married+prot+cath
>estimate
 
Dep Var: TOTAL   N: 256   Multiple R: 0.381671   Squared multiple R: 0.145673
 
Adjusted squared multiple R: 0.121559   Standard error of estimate: 8.350458
 
Effect         Coefficient    Std Error     Std Coef Tolerance     t   P(2 Tail)
 
CONSTANT         23.690470     3.113848     0.000000   .        7.60810  0.00000
AGE              -0.081829     0.035070    -0.147317  0.864187 -2.33329  0.02043
FEMALE            2.456177     1.088035     0.135661  0.953889  2.25744  0.02485
L10INC           -5.841078     1.957106    -0.204557  0.733335 -2.98455  0.00312
EDUCATN          -0.750378     0.442644    -0.110184  0.815434 -1.69522  0.09129
MARRIED           0.268498     1.192327     0.015097  0.766437  0.22519  0.82202
PROT             -4.168188     1.236377    -0.234197  0.713842 -3.37129  0.00087
CATH             -3.105215     1.574103    -0.134071  0.745791 -1.97269  0.04964
 
                             Analysis of Variance
 
Source             Sum-of-Squares   df  Mean-Square     F-ratio       P
 
Regression           2948.672486     7   421.238927    6.040987    0.000002
Residual             1.72931E+04   248    69.730151
-------------------------------------------------------------------------------
 
*** WARNING ***
Case          216 is an outlier        (Studentized Residual =     3.787911)
Case          220 is an outlier        (Studentized Residual =     4.049760)
Case          256 is an outlier        (Studentized Residual =     4.645155)
 
Durbin-Watson D Statistic     0.732
First Order Autocorrelation   0.595



DRINK is coded 1=yes, regularly; 2=no.

>let drinks=drink
>if drink=2 then let drinks=0

>model total=constant+age+female+l10inc+educatn+married+prot+cath+drinks
>estimate
 
Dep Var: TOTAL   N: 256   Multiple R: 0.384897   Squared multiple R: 0.148146
 
Adjusted squared multiple R: 0.120556   Standard error of estimate: 8.355224
 
Effect         Coefficient    Std Error     Std Coef Tolerance     t   P(2 Tail)
 
CONSTANT         24.763945     3.363627     0.000000   .        7.36227  0.00000
AGE              -0.085121     0.035305    -0.153244  0.853706 -2.41104  0.01664
FEMALE            2.388638     1.091574     0.131930  0.948797  2.18825  0.02959
L10INC           -5.759389     1.960598    -0.201696  0.731560 -2.93757  0.00362
EDUCATN          -0.743934     0.442962    -0.109237  0.815193 -1.67945  0.09433
MARRIED           0.252884     1.193150     0.014219  0.766253  0.21195  0.83232
PROT             -4.364748     1.258669    -0.245241  0.689567 -3.46775  0.00062
CATH             -3.119165     1.575087    -0.134674  0.745710 -1.98031  0.04878
DRINKS           -1.145426     1.352584    -0.051825  0.920861 -0.84684  0.39790
 
                             Analysis of Variance
 
Source             Sum-of-Squares   df  Mean-Square     F-ratio       P
 
Regression           2998.736008     8   374.842001    5.369477    0.000003
Residual             1.72430E+04   247    69.809773
-------------------------------------------------------------------------------
 
*** WARNING ***
Case          216 is an outlier        (Studentized Residual =     3.830503)
Case          220 is an outlier        (Studentized Residual =     3.973130)
Case          256 is an outlier        (Studentized Residual =     4.561884)
 
Durbin-Watson D Statistic     0.738
First Order Autocorrelation   0.594



Researchers often present a "trimmed" model, like the following, that excludes variables that were non-significant in previous models.  One problem with this practice is the possibility that a variable that was non-significant in a previous model would have been significant in the trimmed model.  The correct procedure is to test the joint significance of all the variables that are removed in the trimmed model (i.e., educatn, married, and drinks).  If these variables are jointly non-significant, they can be excluded safely.  The procedure to test for joint significance is shown in Module 8.
 
>model total=constant+age+female+l10inc+prot+cath
>estimate
 
Dep Var: TOTAL   N: 256   Multiple R: 0.367122   Squared multiple R: 0.134778
 
Adjusted squared multiple R: 0.117474   Standard error of estimate: 8.369851
 
Effect         Coefficient    Std Error     Std Coef Tolerance     t   P(2 Tail)
 
CONSTANT         21.764300     2.895827     0.000000   .        7.51575  0.00000
AGE              -0.073341     0.033245    -0.132036  0.966154 -2.20608  0.02829
FEMALE            2.541221     1.089255     0.140358  0.956179  2.33299  0.02044
L10INC           -6.784909     1.706329    -0.237610  0.969216 -3.97632  0.00009
PROT             -3.873875     1.224085    -0.217661  0.731636 -3.16471  0.00174
CATH             -3.014877     1.576540    -0.130171  0.746945 -1.91234  0.05698
 
                             Analysis of Variance
 
Source             Sum-of-Squares   df  Mean-Square     F-ratio       P
 
Regression           2728.148500     5   545.629700    7.788656    0.000001
Residual             1.75136E+04   250    70.054406
-------------------------------------------------------------------------------
 
*** WARNING ***
Case          220 is an outlier        (Studentized Residual =     4.088155)
Case          256 is an outlier        (Studentized Residual =     4.764508)
 
Durbin-Watson D Statistic     0.735
First Order Autocorrelation   0.592
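
The joint test of educatn, married, and drinks can be approximated by hand from the two Analysis of Variance tables, since the trimmed model is nested in the model that includes drinks. The following is a sketch of the usual incremental F-test, not necessarily the procedure used in Module 8. Both models use the same 256 cases, so the drop in the regression sum of squares equals the rise in the residual sum of squares.

```python
# Full model: age, female, l10inc, educatn, married, prot, cath, drinks
ss_reg_full = 2998.736008
ms_res_full = 69.809773   # residual mean square of the full model
df_res_full = 247

# Trimmed model: age, female, l10inc, prot, cath
ss_reg_trim = 2728.148500

q = 3                     # variables removed: educatn, married, drinks

# Incremental F: explained sum of squares lost per removed variable,
# relative to the full model's residual mean square
f_joint = ((ss_reg_full - ss_reg_trim) / q) / ms_res_full

print(f"F({q}, {df_res_full}) = {f_joint:.3f}")
```

The resulting F of about 1.29 is well below the 5% critical value for 3 and 247 degrees of freedom (roughly 2.6 to 2.7), so on this calculation the three variables are jointly non-significant.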


Last modified 25 Feb 2000