USING PRLN04 - 22 SEP 1997

François Nielsen
Department of Sociology
University of North Carolina
Chapel Hill, NC 27599-3210

email francois_nielsen@unc.edu
919-962-5064 (office)
919-968-0245 (home)
919-962-7568 (departmental fax)

PRLN04 is a DOS program coded by François Nielsen to calculate the
Gini coefficient of income inequality from income distribution data
giving the number of individuals in income categories, with the top
category open, such as those published by the Census Bureau.

The program estimates the Gini coefficient by reconstructing the
continuous distribution of income underlying the empirical
distribution. It does so by estimating the average income of
recipients in a category as equal to the category mid-point (for
categories below the category including the median observation), and
by attempting to fit a Pareto distribution to the data (for the
category including the median observation and categories above). In
cases where fitting the Pareto distribution is not possible (e.g.,
when there is an empty category or the estimate of the Pareto slope is
out of plausible range), the program takes appropriate action (e.g.,
uses the category midpoint as the estimate). The details of the
algorithm are embodied in the procedure GinParLin in the source code
(PRLN04.PAS). PRLN04.EXE is a self-standing compiled version of the
program.

To use the program you need to prepare two text files:

1. a text file containing the LOWER BOUNDS of the income categories
   separated by one or more spaces. Example: for calculating Gini
   coefficients using income distribution data from the 1970 census
   (in which income categories were 0-999; 1,000-1,999; etc.; and
   50,000 and above) you might create a file LB70.PRN consisting of
   the single line of text:

      0 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000 12000 15000 25000 50000

2. a text file containing the distribution data, i.e.
   the number of income recipients in each income category, separated
   by one or more spaces, and preceded by TWO integer numbers serving
   as ID numbers. (The two ID numbers are required but arbitrary: they
   serve only to identify the output.) Example: the income
   distribution data for a particular county (e.g., Alamance County,
   North Carolina) in 1970 might be entered into a text file
   NC70TST.PRN which looks like this:

      37 001 496 730 824 1221 1499 1704 1712 1888 2149 2122 3808 3713 3084 642 133

   Note that the data in NC70TST.PRN should fit on a single line. The
   two ID numbers are used here to record the State (37 = North
   Carolina) and County (001 = Alamance) FIPS codes. If you want the
   Gini coefficient estimates for several units, use a separate line
   for each distribution.

With the lower bounds file and income distribution file in the same
directory as PRLN04.EXE (using the LB70.PRN and NC70TST.PRN files as
examples), a session with PRLN04 proceeds as follows.

o  At the DOS prompt, go to the directory where PRLN04.EXE is located
   and make sure that LB70.PRN and NC70TST.PRN are there. Type

      prln04

o  The program asks

      Input file name?

   Type

      nc70tst.prn

o  The program asks

      Lower Bounds file name?

   Type

      lb70.prn

o  The program asks

      Output to (S)creen, (F)ile, (P)rinter?

   Type s, f, or p as desired. If you choose f, the program will
   prompt you for a file name.

o  The following output appears on the screen, in the text file that
   you specified, or on the printer, depending on your choice of
   output device:

      1 37 1 33.41 33.35

The first number in the output (1) is a sequential number provided by
the program; the next two numbers (37 1) are the two ID numbers
provided by the user in the income distribution data file (NC70TST.PRN
in this example). The next two numbers are alternative estimates of
the Gini coefficient of income inequality, expressed as percentages.
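
Before a run, it can help to confirm that the two input files agree
with each other. The following is a minimal Python sketch of such a
check, not part of PRLN04: the function names are hypothetical, and
the rules it enforces (strictly increasing lower bounds, non-negative
counts, exactly two IDs plus one count per category on each line) are
assumptions drawn from the file descriptions above rather than from
the program's source code.

```python
def parse_lower_bounds(text):
    """Return the category lower bounds as a list of numbers."""
    bounds = [float(tok) for tok in text.split()]
    if not bounds:
        raise ValueError("no lower bounds found")
    # Assumption: bounds increase strictly from one category to the next.
    if any(b >= c for b, c in zip(bounds, bounds[1:])):
        raise ValueError("lower bounds must be strictly increasing")
    return bounds


def parse_distribution_line(line, n_categories):
    """Split one distribution line into (id1, id2, counts)."""
    tokens = line.split()
    expected = n_categories + 2  # two ID numbers, then one count per category
    if len(tokens) != expected:
        raise ValueError(f"expected {expected} numbers, found {len(tokens)}")
    id1, id2 = int(tokens[0]), int(tokens[1])
    counts = [int(tok) for tok in tokens[2:]]
    if any(c < 0 for c in counts):
        raise ValueError("counts must be non-negative")
    return id1, id2, counts


# Contents of LB70.PRN and NC70TST.PRN as given in this manual.
LB70 = "0 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000 12000 15000 25000 50000"
NC70TST = "37 001 496 730 824 1221 1499 1704 1712 1888 2149 2122 3808 3713 3084 642 133"

bounds = parse_lower_bounds(LB70)
state, county, counts = parse_distribution_line(NC70TST, len(bounds))
print(state, county, len(counts), sum(counts))
```

For a file with several units, the same check would be applied to each
line in turn. Reading "001" as the integer 1 is consistent with the ID
being echoed as 1 in the sample output.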
Of the two Gini estimates in the output, the first is calculated using
intervals of the Lorenz curve corresponding to the original categories
of the empirical income distribution. The second estimate is based on
a decomposition of the Lorenz curve into 100 intervals. The second
estimate is the one used in the analysis of Nielsen and Alderson
(1997). Nielsen is currently doing research on the properties of these
alternative estimators.

REFERENCE

Nielsen, François and Arthur S. Alderson. 1997. "The Kuznets Curve and
the Great U-Turn: Income Inequality in U.S. Counties, 1970 to 1990."
American Sociological Review 62:12-33.

FILES

PRLNMAN.PRN   This file! (TEXT)
PRLN04.EXE    Executable DOS program. (BINARY)
PRLN04.PAS    Turbo Pascal V 7.0 source code. (TEXT)
LB70.PRN      Lower bounds of income categories for 1970 census. (TEXT)
LB80.PRN      Lower bounds of income categories for 1980 census. (TEXT)
LB90.PRN      Lower bounds of income categories for 1990 census. (TEXT)
NC70TST.PRN   Test data from 1970 census (Alamance County, NC) (TEXT)
INEQ70.TXT    Gini coefficients for all counties, 1970 (TEXT)
INEQ70.WQ1    Gini coefficients for all counties, 1970 (LOTUS TYPE 2 - BINARY)
INEQ80.TXT    Gini coefficients for all counties, 1980 (TEXT)
INEQ80.WQ1    Gini coefficients for all counties, 1980 (LOTUS TYPE 2 - BINARY)
INEQ90.TXT    Gini coefficients for all counties, 1990 (TEXT)
INEQ90.WQ1    Gini coefficients for all counties, 1990 (LOTUS TYPE 2 - BINARY)
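
As a rough illustration of a category-based Lorenz curve calculation
of the kind mentioned for the first estimate, the Python sketch below
computes a Gini coefficient from grouped data by the trapezoid rule.
It is not the GinParLin procedure and should not be expected to
reproduce PRLN04 output: it assumes every recipient in a category has
the category midpoint as income, it requires the caller to supply a
mean for the open top category (the hypothetical top_mean argument)
instead of fitting a Pareto distribution, and it ignores inequality
within categories. The function names are hypothetical.

```python
def category_means(bounds, top_mean):
    """Midpoint of each closed category, plus a caller-supplied top mean."""
    mids = [(lo + hi) / 2 for lo, hi in zip(bounds, bounds[1:])]
    return mids + [top_mean]


def gini_from_groups(counts, means):
    """Gini (as a percentage) from a piecewise-linear Lorenz curve.

    Categories must be ordered from lowest to highest mean income.
    """
    total_n = sum(counts)
    total_income = sum(n * m for n, m in zip(counts, means))
    if total_n <= 0 or total_income <= 0:
        raise ValueError("need positive population and income")
    area = 0.0        # area under the Lorenz curve
    cum_income = 0.0  # cumulative income share at the left edge of a segment
    for n, m in zip(counts, means):
        pop_share = n / total_n
        next_income = cum_income + n * m / total_income
        area += pop_share * (cum_income + next_income) / 2  # trapezoid
        cum_income = next_income
    return 100 * (1 - 2 * area)


# Toy example: two equal-sized groups with mean incomes 1 and 3.
print(gini_from_groups([1, 1], [1.0, 3.0]))  # 25.0
```

With real data, the means would come from category_means applied to
the lower bounds file together with an assumed top-category mean, and
the result would depend heavily on that assumption.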