Notes and Errata
Tabulation and Suppression Rules in the SSDC Data File
Census Tract-Level Data, 1960: San Diego, CA

The DualLabs Census Bureau 1960 State census tract data files use population and housing counts of zero to identify counts that are 1) zero, 2) suppressed or 3) not tabulated. The printed Census volume uses what the Bureau calls a "leader" (an ellipsis: "...") to identify any of these three counts. Neither source adequately distinguishes counts that are actually zero from counts the Census Bureau suppressed or did not tabulate.

The SSDC staff created a revised DualLabs data file for San Diego County that resolves this problem.

The SSDC data file uses a missing value indicator where values have been suppressed or not tabulated and the number zero to indicate a value of zero.

During the revision process, staff discovered additional motives for revising the DualLabs data file:

There are three versions of the SSDC data file. All versions change DualLabs labels for "Puerto Rican or Spanish Surname persons" to "Spanish Surname persons". In all three versions, the number zero is used only to indicate a count of zero, never to indicate a suppressed or not tabulated count. Not tabulated and suppressed values are indicated differently:

  1. Excel spreadsheet: Leaders "..." are used to indicate counts that are suppressed or not tabulated.
  2. ASCII data file: Values of 99999 are used to indicate counts that are suppressed or not tabulated.
  3. SPSS Portable file: User-defined missing values (discrete values of 99999) are used to indicate counts that are suppressed or not tabulated.
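The conventions above can be sketched in a few lines of Python. This is only an illustration, not part of the SSDC files: the function name `decode_count` and the use of `None` for a missing value are assumptions, while the code 99999 and the leader "..." come from the list above.

```python
MISSING_CODE = 99999  # code used in the ASCII version, per the list above
LEADER = "..."        # leader used in the spreadsheet version

def decode_count(raw):
    """Return the count as an int, or None if it is suppressed or not tabulated.

    decode_count is a hypothetical helper, not something shipped with the data.
    """
    text = str(raw).strip()
    if text == LEADER:
        return None
    value = int(text)
    # A true zero stays zero; only the missing code becomes None.
    return None if value == MISSING_CODE else value

cells = ["27", "0", "99999", "..."]
print([decode_count(c) for c in cells])  # [27, 0, None, None]
```

The point of the sketch is that a zero survives decoding as a real count, while both missing-value conventions collapse to the same "no value" marker.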

Tabulation Rules
The Census Bureau developed and used "tabulation rules" that indicate whether a value for a particular cell in a table will be reported or not reported. Five tabulation rules are documented on page three of the DualLabs codebook.

SSDC staff used the tabulation rules to change zeros in the DualLabs data file to missing value indicators in the SSDC data file, where appropriate.

Tabulation Rules Applied to the SSDC Data File
Which tabulation rule applies to a tract depends on whether the tract lies in a place with 50,000 or more inhabitants or in a place with fewer than 50,000. The only place in San Diego County with 50,000 or more inhabitants is San Diego City. All other places (Chula Vista, El Cajon, La Mesa and National City) have fewer than 50,000. The following tabulation rules were applied to the SSDC data file:

Tabulation Rule 1 - In tables 7 and 10, counts are included only in tracts of places with 50,000 or more inhabitants. Therefore, in tables 7 and 10, in tracts of places with fewer than 50,000 inhabitants, SSDC staff replaced zeros with missing value indicators in the SSDC data file.

Tabulation Rule 2 - In table 35, counts are included only for rural farm and non-farm residents in tracts of places with fewer than 50,000 inhabitants. Therefore, in table 35, in tracts of places with 50,000 or more inhabitants, SSDC staff replaced zeros with missing value indicators in the SSDC data file.

Tabulation Rule 3 - No changes were made to Table 52 in the SSDC data file.

Tabulation Rule 4 - In table 56, counts are included only in tracts of places with 50,000 or more inhabitants. Therefore, in table 56, in tracts of places with fewer than 50,000 inhabitants, SSDC staff replaced zeros with missing value indicators in the SSDC data file.

Tabulation Rule 5 - In tables 58 and 59, counts are included only in tracts of places with fewer than 50,000 inhabitants. Therefore, in tables 58 and 59, in tracts of places with 50,000 or more inhabitants, SSDC staff replaced zeros with missing value indicators in the SSDC data file.
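As a rough illustration of how the five rules partition the tables, the following Python sketch replaces zeros with a missing marker wherever a table is not tabulated for the tract's place size. The names (`apply_tabulation_rules`, `in_large_place`) and the use of `None` as the missing value indicator are assumptions made for this example. The table numbers come from the rules above. Table 52 (Rule 3) is left untouched.

```python
MISSING = None  # stand-in for the SSDC missing value indicator (assumption)

# Tables tabulated only for places with 50,000 or more inhabitants (Rules 1 and 4).
LARGE_PLACE_ONLY = {7, 10, 56}
# Tables tabulated only for places with fewer than 50,000 inhabitants (Rules 2 and 5).
SMALL_PLACE_ONLY = {35, 58, 59}

def apply_tabulation_rules(table, in_large_place, cells):
    """Return a copy of cells with zeros replaced where the table is not tabulated.

    in_large_place is True for tracts in a place with 50,000 or more inhabitants
    (in San Diego County, only San Diego City).
    """
    not_tabulated = (
        (table in LARGE_PLACE_ONLY and not in_large_place)
        or (table in SMALL_PLACE_ONLY and in_large_place)
    )
    if not not_tabulated:
        return list(cells)
    return [MISSING if c == 0 else c for c in cells]

print(apply_tabulation_rules(7, False, [0, 0, 0]))  # [None, None, None]
print(apply_tabulation_rules(7, True, [12, 0, 3]))  # [12, 0, 3]
print(apply_tabulation_rules(35, True, [0, 0]))     # [None, None]
```

In the second call the zero is kept, because table 7 is tabulated for San Diego City and the zero may be a real count.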

100% Suppression Rules
For reasons of confidentiality, the Census Bureau suppressed counts of fewer than 5 persons or housing units in 100% complete count tables.

Although individual cells in a table may be suppressed, in the printed volume the Bureau did not suppress totals for a tract. It is possible, therefore, to use the presence of totals in the printed volume to determine if cell values of zero in the DualLabs file were suppressed or were, in fact, zero (see footnote 3).

Figure 1 illustrates the county and census tract counts available in a typical table in the printed volume.

Printed Volume Table - Characteristics of the Population

                                     San Diego
                                     County     San Diego
Subject                              Total      Tract A   Tract B   Tract C   Tract D
Total Population                     2141       902       1235      4         ...

Household Relationship
Population in Households             1694       807       883       ...       ...
    Head of primary family           56         27        29        ...       ...
    Primary individual               514        236       278       ...       ...
    Wife of head                     821        370       451       ...       ...
    Related single child under 18    255        155       100       ...       ...
    Other relative of head           28         11        17        ...       ...
    Non-relative of head             20         8         8         ...       ...
Population in Group Quarters         447        95        352       ...       ...
    Inmate of institution            148        22        126       ...       ...
    Other                            299        73        226       ...       ...

Figure 1

The first row has total population counts for the county and tracts in the county; the value 2141 ("Total, San Diego County") is a sum of the "Total Population" values for all tracts in the county. The "leader" in Tract D is an indicator of a value of zero.

In Figure 1, the two subtotals in the Tract C column ("Population in Households" and "Population in Group Quarters"), highlighted in yellow in the original figure, have been suppressed because the total population of Tract C is less than 5. Subject counts in the remaining rows of the Tract C column are also suppressed. The "leaders" in the Tract D column, by contrast, indicate actual zeros because the total population count in Tract D is zero.

Table Name and Label: DualLabs Table 000 - Household Relationship

Table Cell Name   Table Cell Label
T000001           Head of primary family
T000002           Primary individual
T000003           Wife of head
T000004           Related single child under 18
T000005           Other relative of head
T000006           Non-relative of head
T000007           Inmate of Institution
T000008           Other in group quarters

                 T000001  T000002  T000003  T000004  T000005  T000006  T000007  T000008
Census Tract A   27       236      370      155      11       8        22       73
Census Tract B   29       278      451      100      17       8        126      226
Census Tract C   0        0        0        0        0        0        0        0
Census Tract D   0        0        0        0        0        0        0        0

Figure 2

Figure 2 is an example of data in the DualLabs file. In this example, the zeros in Census Tracts C and D could indicate either an actual value of zero or a suppressed value. The printed census volume (as illustrated in Figure 1) resolves the ambiguity: in Tract C the zeros should be replaced with missing value indicators because the total population count for Tract C is 4, whereas in Tract D the zeros are not replaced because the printed volume shows that Tract D's total population is zero.

100% Suppression Rules Applied to the SSDC Data File
After applying the tabulation rule changes explained above, SSDC staff determined which zeros in the DualLabs file were actual values of zero and which indicated suppressed values by doing the following:

  1. Computed a total for each census tract in every table.
  2. If the computed total tract count was zero, staff looked up the value for that tract in row 1 of the printed volume.
  3. If the tract count was greater than zero in the printed volume, staff replaced zeros with missing value indicators in the SSDC data file.
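The three steps above can be sketched as a small Python function. It is a minimal illustration under stated assumptions: the name `flag_suppressed` is hypothetical, `None` stands in for the missing value indicator, and the printed-volume total is passed in by hand because it has to be looked up outside the data file.

```python
def flag_suppressed(cells, printed_total):
    """Apply the three-step check to one tract in one table.

    cells         -- DualLabs cell counts for the tract
    printed_total -- the tract's value in row 1 of the printed volume
                     (0 where the printed volume shows a leader for a true zero)
    """
    computed_total = sum(cells)                       # step 1
    if computed_total == 0 and printed_total > 0:     # steps 2 and 3
        # Every cell is zero here, so every cell becomes missing.
        return [None] * len(cells)
    return list(cells)

tract_c = [0, 0, 0, 0, 0, 0, 0, 0]  # Figure 2, printed total population 4
tract_d = [0, 0, 0, 0, 0, 0, 0, 0]  # Figure 2, printed total population 0
print(flag_suppressed(tract_c, 4))  # eight None values: suppressed
print(flag_suppressed(tract_d, 0))  # eight zeros: true zeros
```

Tracts C and D look identical in the DualLabs file. Only the printed total tells them apart.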

In making this determination, SSDC staff applied the following suppression rules:

Suppression Rule 1 - In tables 1, 8, and 11 through 16, if total population is greater than 0 and less than 5 (1-4), cells with a value of zero were replaced with missing value indicators in the SSDC data file.

Suppression Rule 2 - In tables 2 through 7, if total housing unit count is greater than 0 and less than 5 (1-4), cells with a value of zero in the DualLabs data file were replaced with missing value indicators in the SSDC data file.

Suppression Rule 3a - In table 7, if total housing unit count is 5 or more but total count of "owner occupied units reporting value" is greater than 0 but less than 5 (1-4), then cell counts of zero were replaced with missing value indicators in the SSDC data file.

Suppression Rule 3b - In table 10, if total count of housing units is 5 or more but total count of "renter occupied units reporting rent" is less than 5, then cells with a value of zero in the DualLabs file were replaced with missing value indicators in the SSDC data file.
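Rules 3a and 3b differ from Rules 1 and 2 in that they test a second, narrower total after the tract total has passed. A minimal Python sketch of Rule 3a follows. The function names and the use of `None` as the missing value indicator are assumptions for illustration. The thresholds are those stated in the rule.

```python
def in_suppression_range(total):
    """True when a total is greater than 0 and less than 5 (1-4)."""
    return 0 < total < 5

def apply_rule_3a(total_housing_units, owner_units_reporting_value, cells):
    """Table 7: replace zeros when the tract total is 5 or more but the
    'owner occupied units reporting value' total falls in the 1-4 range."""
    if total_housing_units >= 5 and in_suppression_range(owner_units_reporting_value):
        return [None if c == 0 else c for c in cells]
    return list(cells)

print(apply_rule_3a(120, 3, [0, 0, 0]))   # [None, None, None]
print(apply_rule_3a(120, 40, [0, 25, 15]))  # [0, 25, 15]
print(apply_rule_3a(120, 0, [0, 0, 0]))   # [0, 0, 0]
```

The last call shows why the lower bound matters in Rule 3a: when no owner occupied units report a value, the zeros are real counts and stay as zeros.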

Sample (25%, 20%, and 5%) Suppression Rules
Estimated subject counts in sample tables have a degree of uncertainty associated with the counts. In general, the smaller the sample, the higher the level of sampling error. The DualLabs codebook documents one sample suppression rule:

SSDC staff examined sample table cell counts in the DualLabs data file and determined that this sample suppression rule was not applied to any "Nonwhite" or "Spanish Surname" tables in the DualLabs data file. In fact, there are no suppression rules applied to sample Tables 17 through 63.

Therefore, SSDC staff made no changes to DualLabs tables 17 through 63 in the SPSS and ASCII SSDC data files. Researchers can choose confidence intervals and confidence levels suited to their own research when working with the sample table counts.

SSDC staff did make changes to the spreadsheet version of the SSDC data file to make it conform to the printed census volume. The Census Bureau did suppress sample counts for "Nonwhite" and "Spanish Surname" persons in the printed Census volume. Population counts for "Nonwhite" and "Spanish Surname" persons were suppressed in census tracts with fewer than 400 persons in Tables P-4 and P-5. Housing unit counts for "Nonwhites" were suppressed in census tracts with fewer than 100 units in Table H-3, and housing unit counts for "Spanish Surname" persons were suppressed in census tracts with fewer than 400 units in Table H-4.

Therefore, in the spreadsheet version of the SSDC data file, SSDC staff suppressed the estimated counts for the above "Nonwhite" and "Spanish Surname" sample table cells. Staff does not want to supply counts with high levels of sampling error to users who are looking up counts rather than performing analysis on them.
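The printed-volume thresholds described above can be collected in a small lookup, sketched here in Python. The names `PRINT_THRESHOLDS` and `suppressed_in_print` are hypothetical. The text above does not say exactly which tract count each threshold is compared against, so the sketch leaves that choice to the caller and takes the count as an argument.

```python
# Minimum tract counts below which the printed volume suppresses sample data,
# as stated above (persons for P-4 and P-5, housing units for H-3 and H-4).
PRINT_THRESHOLDS = {"P-4": 400, "P-5": 400, "H-3": 100, "H-4": 400}

def suppressed_in_print(table, tract_count):
    """True when the tract falls below the printed-volume threshold for the table.

    tract_count is whichever tract count the threshold applies to; the caller
    supplies it (an assumption of this sketch).
    """
    return tract_count < PRINT_THRESHOLDS[table]

print(suppressed_in_print("P-4", 399))  # True
print(suppressed_in_print("P-4", 400))  # False
print(suppressed_in_print("H-3", 150))  # False
```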

Users can subset the SSDC data file for "Nonwhite" or "Spanish Surname" persons on the SSDC extraction Web page in CSV or Excel output formats if they require subsets without suppressed census tract counts for these population groups in spreadsheet format.


Footnotes

1. United States. Bureau of the Census. U.S. Censuses of Population and Housing: 1960. Census Tracts. Final Report PHC(1)-135 [San Diego, Calif.]. U.S. Government Printing Office, Washington, D.C., 1962.

2. "... white persons of Spanish surname were distinguished separately in five Southwestern States (Arizona, California, Colorado, New Mexico and Texas). In all other States, Puerto Rican persons ... were identified" (Source: Page 1, column 2 of the Introduction to the Census Printed Volume).

3. The "total" columns (for the SMSA, counties, cities, etc.) include statistics [suppressed counts] for those tracts which are omitted from the tables because they have fewer than the specified number of persons or housing units. These totals, therefore, are not necessarily the sum of the figures for the tracts [which exclude suppressed counts] that are shown in the tables. (Source: Page 1, column 2 of the Introduction to the Census Printed Volume).

4. The computed sum of all census tract housing units in the printed Census volume table H-1 is 339,440 because there is an error in the printed count of housing units for Census Tract 183 (1186). The housing unit count in Tract 183 should be 1188. This can be confirmed by summing owner occupied + renter occupied + available vacant + other vacant housing units in Tract 183 (270+715+135+68=1188). Therefore, the total county count of housing units in the printed volume (339,442) is correct.

5. It is important to make a distinction between the "Nonwhite" and "Spanish Surname" population groups. The Census Bureau defines "Spanish Surname" persons as a subgroup of "White" persons. "In order to obtain data on Spanish- and Mexican-Americans ... white persons (and white heads of households) of Spanish surname were distinguished separately ...". (Source: Page 3, column 2 of the Introduction to the Census Printed Volume). A detailed discussion of Spanish Surname population counts is available in Hernandez, et al. "Census Data and the Problem of Conceptually Defining the Mexican American Population." SOCIAL SCIENCE QUARTERLY, Vol. 53, no. 4. March 1973, pp. 677 ff.