From leland@spss.com Wed Mar 17 10:42:21 1999
Date: Wed, 17 Mar 1999 08:11:40 -0600
From: "Wilkinson, Leland" <leland@spss.com>
Reply-To: systat-l@spss.com
To: "'systat-l@spss.com'" <systat-l@spss.com>
Subject: RE: INFLUENCE Plots

The manual describes the procedure exactly. The word INFLUENCE has so many
meanings that it is of no help here. The computation is very simple. The
size of a symbol is computed as follows:

    Compute the Pearson correlation coefficient for all the cases.
    Drop the one case.
    Compute the Pearson correlation coefficient again.
    The absolute value of the difference between the two numbers is
    proportional to the size of the symbol. The sign of the difference
    determines whether the symbol is filled.

This procedure is done for all points in the scatterplot. The computations
are fast because the algorithm uses a drop-out/insert method for the
calculations.

Thus, the size of the symbol reflects a particular kind of influence on the
Pearson correlation. Under several assumptions, Cook's D and other
regression influence measures are related to this statistic. The main
assumptions involve scaling of the variables. The Pearson influence is
computed after standardizing both X and Y. It therefore is of less use in a
regression context. I originally put this one into SYSTAT because Thissen
and Wainer showed its use for preliminary scatterplot diagnostics where
correlations were being computed. (Psychological Bulletin, 1981, 90,
179-184). This is particularly helpful in a SPLOM.

Also, some of the regression influence measures are amalgams of leverage
and error, so the relations are a bit more complicated. For this purpose, I
would advise using Cook's D or h or other measures to determine the size of
the symbols in a residuals plot. Blank and I discuss this in our book,
Desktop Data Analysis.
LW

> -----Original Message-----
> From: Francois Nielsen [SMTP:nielsen@email.unc.edu]
> Sent: Tuesday, March 16, 1999 1:38 PM
> To: systat-l@spss.com
> Subject: INFLUENCE Plots
>
> I am using INFLUENCE plots, in which the size of the symbol denotes the
> influence of an observation on the Pearson correlation of Y and X, in my
> linear regression models course. I would like more information than the
> STATISTICS manual (v6/v7) provides on two points:
> 1. When used in a regular bivariate scatterplot of Y against X, what
> exactly is the measure of influence used to size the symbols? Is it
> related to the COOK distance and if so how?
> 2. When used in the context of a partial regression plot, say of
> YPARTIAL(1) plotted against XPARTIAL(1), is the measure of influence
> related to the DFBETAS statistic?
> I am still using v7.0 so I may have missed any related changes in v8.0.
>
> FN.
>
> ______________________________________________________________________
>
> Francois Nielsen, Professor      919-962-5064 (office)
> Department of Sociology          919-962-1007 (sociology department)
> University of North Carolina     919-962-7568 (departmental fax)
> Chapel Hill, NC 27599-3210       919-968-0245 (home)
> E-mail: francois_nielsen@unc.edu (alias for nielsen@email.unc.edu)
> ______________________________________________________________________
>
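
The leave-one-out computation described in the reply above can be sketched roughly as follows. This is only an illustration, not SYSTAT's actual code. It assumes the "drop-out" step can be done by subtracting one case's contribution from running sums, and it takes the difference as r(all cases) minus r(case dropped); the message does not say which direction SYSTAT uses or which sign is drawn filled. The function names are hypothetical.

```python
import math


def pearson_from_sums(n, sx, sy, sxx, syy, sxy):
    """Pearson r from the case count and the raw sums of x, y, x*x, y*y, x*y."""
    cov = n * sxy - sx * sy
    var_x = n * sxx - sx * sx
    var_y = n * syy - sy * sy
    return cov / math.sqrt(var_x * var_y)


def correlation_influence(x, y):
    """Return (r_all, diffs), where diffs[i] = r_all - r with case i dropped.

    Assumes at least three cases and that neither variable becomes
    constant when a single case is removed.
    """
    n = len(x)
    # Accumulate the sums once over all cases.
    sx, sy = sum(x), sum(y)
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    sxy = sum(a * b for a, b in zip(x, y))
    r_all = pearson_from_sums(n, sx, sy, sxx, syy, sxy)

    diffs = []
    for a, b in zip(x, y):
        # "Drop out" one case by subtracting its terms from the sums,
        # rather than recomputing everything from scratch.
        r_drop = pearson_from_sums(
            n - 1, sx - a, sy - b, sxx - a * a, syy - b * b, sxy - a * b
        )
        diffs.append(r_all - r_drop)
    return r_all, diffs


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 5.0, 10.0]
    y = [1.1, 1.9, 3.2, 3.9, 5.1, 0.0]
    r_all, diffs = correlation_influence(x, y)
    for i, d in enumerate(diffs):
        # abs(d) would drive symbol size; the sign would select filled or hollow.
        print(i, round(abs(d), 3), "+" if d >= 0 else "-")
```

In a plot, abs(diffs[i]) would be scaled to a symbol size and the sign mapped to filled versus hollow. Subtracting from raw sums keeps each case at constant cost, but it can lose precision when the data have a large mean relative to their spread; centering the variables first is one way to reduce that.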