Is the Digital Divide Still Closing? New Evidence Points to Skewed Online Results Absent Non-Internet Households
By Mario Callegaro, Ph.D. and Tom Wells, Ph.D.
In 2002, a Department of Commerce report titled, A nation online: How Americans are expanding their use of the Internet,1 sparked a discussion around the differences between Internet and non-Internet households: the digital divide. Recently, the debate has shifted to the broadband digital divide, which shows similar findings. Now that the number of non-Internet households seems to have stabilized, we need to consider the ongoing impact of Internet penetration in the U.S. with regard to online research. Figure 1 shows rapid growth in Internet adoption up to 2001, then a slowdown and a leveling off at 64% by Spring 2008. Data from the Pew Internet & American Life Project2 show the same type of trend in Internet penetration, with the main difference being that Pew measures Internet adoption at a 'person' level, counting users who go online at least occasionally.
From these two sources of data, we can see that non-Internet households are not going to disappear anytime soon. Therefore, we want to re-assess the contribution of non-Internet households to the final estimate of survey statistics, and whether we can afford to 'forget about' them.

Impact of Non-Internet Households on Survey Estimates
Because non-Internet households have different characteristics, what is their impact on a final survey estimate? For each estimate, the impact depends on how many non-Internet households there are and how different they are from Internet households. From a non-response point of view, the question is: What happens if we do not talk to non-Internet households? To answer this, we present some results from late 2007 through early 2008 in table 1. It is apparent that there are substantial differences between the two groups, in both attitudes and behaviors. It is worth mentioning that Knowledge Networks' (KN's) probability-based approach3 enables us to compute confidence intervals to test whether a difference is statistically significant. The upshot: Using an Internet-only population can produce biased results.
Estimate | Non-Internet (%) | Internet (%) | Total (%) | Stat Diff. |
Receive TV signal with a standard antenna* | 26.7 | 16.3 | 21.2 | Yes |
Regular cable ownership* | 47.0 | 57.8 | 53.8 | Yes |
Digital cable ownership* | 51.6 | 40.2 | 44.5 | Yes |
Recycled your newspaper or other papers in the past 12 months* | 49.1 | 66.7 | 59.6 | Yes |
Recycled your glass in the past 12 months* | 38.2 | 56.4 | 49.1 | Yes |
Taken steps to reduce your use of energy in the past 12 months* | 55.7 | 64.5 | 60.9 | Yes |
It is a citizen’s duty to keep informed about politics even if it is time-consuming** | 56.8 | 68.1 | 63.5 | Yes |
It is a citizen’s duty to report a crime even if it might put him or her in some jeopardy** | 60.8 | 71.1 | 66.9 | Yes |
Someone like me can’t really influence government decisions** | 37.5 | 31.7 | 34.1 | Yes |
Do you feel that things in this country... have gotten off on the wrong track* | 72.2 | 72.4 | 72.3 | No |
Note: For the measures above, one person per household was randomly selected for the analysis. The last column reports whether the difference is statistically significant in all pair-wise comparisons (Internet vs. non-Internet; Internet vs. total; non-Internet vs. total) at the .05 level.
*: "Yes"/"No" answer options.
**: Top-two-box score: strongly agree + agree.
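The figures in table 1 can be turned into a rough measure of how far an Internet-only estimate sits from the full-population figure. The sketch below is illustrative only: the percentages are copied from the table, while the function names are hypothetical and the second function assumes that the total is a simple share-weighted average of the two groups, which the published, weighted figures need not satisfy exactly.

```python
# Percentages copied from table 1: (non-Internet, Internet, total).
TABLE_1 = {
    "Receive TV signal with a standard antenna": (26.7, 16.3, 21.2),
    "Regular cable ownership": (47.0, 57.8, 53.8),
    "Recycled newspaper or other papers": (49.1, 66.7, 59.6),
    "Things have gotten off on the wrong track": (72.2, 72.4, 72.3),
}


def internet_only_error(internet, total):
    """Percentage-point gap between the Internet-only and the total estimate."""
    return round(internet - total, 1)


def error_from_share(share_non_internet, internet, non_internet):
    """The same gap written as (how many) x (how different).

    Assumption: total = share * non_internet + (1 - share) * internet.
    """
    return share_non_internet * (internet - non_internet)


if __name__ == "__main__":
    for label, (non_internet, internet, total) in TABLE_1.items():
        gap = internet_only_error(internet, total)
        print(f"{label}: {gap:+.1f} points")
```

On the table 1 figures, relying on the Internet column alone understates antenna reception by 4.9 points and overstates paper recycling by 7.1 points, while the "wrong track" item barely moves.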
The Importance of Non-Internet Households
Leaving out non-Internet households can lead to serious over- or under-estimations. But online researchers need as clear a picture as possible of the entire U.S. population for:
Unfortunately, even the largest Internet-only sample will never include this important portion of U.S. households. This factor, coupled with a multitude of additional complications (respondent self-selection, wear-out, and the ever-dwindling 'reservoir' of survey-takers, to mention just a few), further negates the possibility of getting complete information. And again, it seems that non-Internet households will be around for some time.
What Kinds of Respondents Make Up Non-Internet Households?
As figure 2 shows, income is the strongest predictor of being a non-Internet household, according to our segmentation procedure.4 Income and education level data share similar patterns; non-Internet households are concentrated among those with low education and low income. Missing this sub-group can produce a distorted picture of any target audience.

In the following table, we report additional characteristics of Internet and non-Internet households to provide a more complete portrait of each. A picture of non-Internet respondents emerges: they are disproportionately unmarried, living in non-urban areas, and members of a minority. Our data closely follow other estimates of non-Internet status.5
Ethnicity | Non-Internet (%) | Internet (%) |
White, Non-Hispanic | 30.2 | 69.8 |
Black, Non-Hispanic | 60.0 | 40.0 |
Other, Non-Hispanic | 26.2 | 73.8 |
Hispanic | 49.1 | 50.9 |
2+ races, Non-Hispanic | 39.8 | 60.2 |
Marital Status | | |
Married | 24.9 | 75.1 |
Widowed | 56.3 | 43.7 |
Divorced | 49.6 | 50.4 |
Separated | 59.5 | 40.5 |
Never married | 45.3 | 54.7 |
Living with partner | 45.4 | 54.6 |
Metropolitan Statistical Area | | |
Non-Urban | 43.2 | 56.8 |
Urban | 35.8 | 64.2 |
Note: For ethnicity and marital status, one person per household was randomly selected for the analysis. The percentage of Hispanics refers to respondents who speak English proficiently enough to complete our recruitment call. Starting in July '08, we are including Spanish in the recruitment call.
Can Weighting Correct the Data?
In a previous study, the authors used an RDD sample to compare attitudes on the economic outlook between Internet and non-Internet households. They showed that by using model-based weighting or a more general calibration to population totals, one can reduce and almost eliminate bias for these variables.6 A very recent paper shows similar results for health-related variables, where the weighting, when applied, reduces but does not eliminate the coverage bias due to non-Internet households.7 However, more recent data compare results from an opt-in panel with those of a probability-based consumer panel. Even with sophisticated geo-demographic weighting, differences between Internet and non-Internet households may not be eliminated.8 When we conduct a preliminary analysis of the dataset in this article, using multinomial logistic regression, we see that for some variables, differences between Internet and non-Internet households still exist, even after controlling for the relevant demographic variables. This is initial evidence that weighting cannot fully solve the problem of excluding non-Internet households.
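To see why weighting can shrink the gap without closing it, consider a toy post-stratification example. Everything in it is an assumption made for illustration: the two education cells, their population shares, the non-Internet rates, and the response percentages are invented, and the function names are hypothetical. It is not the model-based weighting or calibration used in the studies cited above.

```python
# All figures are hypothetical and chosen only to illustrate the mechanics.
# Per cell: population share, share of the cell without Internet access,
# and the percentage answering "yes" in each type of household.
CELLS = {
    "low education": {
        "pop_share": 0.4, "share_non_internet": 0.6,
        "yes_internet": 60.0, "yes_non_internet": 45.0,
    },
    "high education": {
        "pop_share": 0.6, "share_non_internet": 0.2,
        "yes_internet": 70.0, "yes_non_internet": 65.0,
    },
}


def true_total(cells):
    """Population percentage, counting Internet and non-Internet households."""
    return sum(
        c["pop_share"] * (
            c["share_non_internet"] * c["yes_non_internet"]
            + (1 - c["share_non_internet"]) * c["yes_internet"]
        )
        for c in cells.values()
    )


def internet_only_unweighted(cells):
    """Internet households only, with cells in the mix found online."""
    online = {
        name: c["pop_share"] * (1 - c["share_non_internet"])
        for name, c in cells.items()
    }
    total_online = sum(online.values())
    return sum(
        online[name] / total_online * cells[name]["yes_internet"]
        for name in cells
    )


def internet_only_poststratified(cells):
    """Internet households only, reweighted to each cell's population share."""
    return sum(c["pop_share"] * c["yes_internet"] for c in cells.values())


if __name__ == "__main__":
    print(f"True total:                   {true_total(CELLS):.1f}")
    print(f"Internet-only, unweighted:    {internet_only_unweighted(CELLS):.1f}")
    print(f"Internet-only, post-stratified: {internet_only_poststratified(CELLS):.1f}")
```

With these made-up inputs the full-population figure is 61.8%, the unweighted Internet-only figure is 67.5%, and the post-stratified Internet-only figure is 66.0%. Weighting removes the part of the error that comes from the mix of cells, but not the part that comes from Internet and non-Internet households answering differently within the same cell.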
Conclusion
The impact of non-Internet households on survey estimates is impossible to predict in advance, and sometimes the differences are substantial. Leaving out non-Internet households can lead to serious over- or under-estimation, and most of the time no external validation data are available to corroborate the estimates.
As marketers and academicians shift core surveys to the Internet, for accuracy's sake, it is critical to consider representation of the non-Internet population. This segment possesses a unique collective voice that is inextinguishable and central to the reliability of decisions based on online research. Those who ignore this sub-group will skew their survey results, and the evidence on whether weighting can help is inconclusive.
Mario Callegaro is Knowledge Networks' Survey Research Scientist. He has published nationally and internationally in the areas of telephone and cell phone surveys; polling and exit polls; longitudinal surveys; event history calendars; interviewer effects; web surveys; and survey quality. He holds a B.A. in Sociology from the University of Trento, Italy, and an M.S. and a Ph.D. in Survey Research and Methodology from the University of Nebraska, Lincoln.
Tom Wells is Director of KN's Profile Group. He is responsible for designing, updating and administering our key profile surveys and overseeing our profile survey datasets. He holds a B.A. in Sociology from the University of California, Berkeley and an M.S. and Ph.D. in Sociology from the University of Wisconsin, Madison.