[ Fall/Winter 2008 ]
River Samples: A Good Catch for Researchers?
By Charles DiSogra, Ph.D., Chief Statistician
In recent publications and conference presentations, "river sampling" has been positioned as the best option for reaching a random, less-surveyed online audience. Yet is it? What is meant by randomness or newness in this context, and what is the real value of each?
River sampling – as practiced most prominently by DMS Research – recruits using banner ads, pop-up ads and similar instant "capture" promotions. Individuals who volunteer to participate are screened for their reported demographic characteristics and then "randomly assigned" to the appropriate survey. Hence the metaphor of being captured from the flowing river of online persons. Yet, online "audience" is perhaps the more accurate descriptor of this sample type, because an audience is a group of people who have chosen to be somewhere. In sampling methodology, this "self-selection" is a massive problem for representative samples; whether or not respondents are new is a relatively minor issue by comparison.
Are river samples representative?
Representative samples are generally built on random selection from the eligible population. Random assignment ex post facto without initial random selection only guarantees random allocation of a non-representative sample. As an example, let's say you conveniently have any 200 white males ages 18-24. If you randomly assign them to four groups, there is no assurance that any group has received an allocation representative of anything other than the assortment you started with. How you got those 200 males, and knowing what they represent, trumps any subsequent random assignment as an issue of sampling and representation.
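The point about random assignment can be illustrated with a minimal Python sketch. It uses the hypothetical 200-person convenience sample from the example above; the function names (randomly_assign, share) and the seed are illustrative assumptions, not part of any study discussed here.

```python
import random

def randomly_assign(sample, n_groups, seed=0):
    """Shuffle the sample and deal it into n_groups groups of near-equal size."""
    rng = random.Random(seed)
    shuffled = list(sample)
    rng.shuffle(shuffled)
    return [shuffled[i::n_groups] for i in range(n_groups)]

def share(group, key, value):
    """Fraction of people in the group whose attribute `key` equals `value`."""
    return sum(1 for person in group if person[key] == value) / len(group)

# Hypothetical convenience sample from the example: 200 white males aged 18-24.
convenience_sample = [
    {"sex": "male", "race": "white", "age_group": "18-24"} for _ in range(200)
]

groups = randomly_assign(convenience_sample, n_groups=4)
male_shares = [share(group, "sex", "male") for group in groups]

for number, group in enumerate(groups, start=1):
    print(f"group {number}: n={len(group)}, share male={share(group, 'sex', 'male'):.2f}")
```

However the shuffle falls, every group is 100% male and 100% aged 18-24. The assignment is genuinely random, but each group can only mirror the sample it was drawn from, not any larger population.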
Fundamentally, river sampling is a web-based convenience sample of opt-in participants that uses a quota sampling approach to build a study sample. The selection method is not probability based, and participants cannot be described as representing any larger defined population. As an opt-in process, river sampling carries an inherent bias, and that bias is both unknown and not measurable. Therefore, adding river sample to an opt-in panel sample does not address the projectability of results, nor does it ensure that the sample is fresh, since the same respondent could be in both a panel and a river.
Proponents of river sampling, however, prefer to focus attention away from issues of selection and representation and onto the fact that their captured samples are more "fresh" survey takers than the repeatedly surveyed members of most online panels. More to the point, there is an invented concept that high-tenured panelists can become burned-out; the contribution of these panelists is somehow implied to be flat and devalued.
Now, let's be clear at this point: online opt-in or volunteer panels suffer from serious flaws, such as a high proportion of members belonging to numerous other panels, an infiltration of professional respondents, and – their biggest flaw – non-representative membership. With the exception of high-quality, probability-based panels such as KnowledgePanel®, the vast majority of online panels are not representative of any population other than those who join them. They each have different recruitment methods and different reward strategies. They are as much samples of convenience as are river samples.
Is River Sample® fresher than other online sample?
Returning to the topic of river respondents being more "fresh" and somehow better, DMS, Inc. has recently introduced results from a study it completed as an argument in support of river sampling. The purpose of their study was to evaluate the differences between river respondents and ongoing (tenured) panelists – plus, as a gold standard, compare both of these to a random-digit dial (RDD) telephone survey sample. Although the results are neither surprising nor comforting, the problems with the DMS study have more to do with what is not there than what is there.
The only river sample used is a DMS River Sample®, whose methods are proprietary and thus not described. The online comparative samples are an aggregate of a DMS panel sample, again with undisclosed proprietary methods, and two other unnamed online panel samples with no disclosure of their methods for recruitment. There is also no report on how large any of these online panels are in total membership, or even whether their corporate scope is national or global, although the sample is presumed to be national. We are left to assume that their methods, from recruitment to survey assignment and reward scheme, are similar. This may be a false assumption.
An RDD telephone survey provides the third comparison sample. RDD surveys are a venerable methodology; however, their quality can range enormously. Fortunately, the American Association for Public Opinion Research (AAPOR) publishes both standards and metrics (like response rates) for evaluating the quality of RDD surveys. The DMS study presents no metrics or methods on the RDD survey, making it impossible to understand how good a survey it is and whether the results are worthy of comparison. As for being representative, there is no discussion of how or if these RDD data were weighted, other than that quota cells were established for "gender, age, income and ethnicity."
Although the study states that "at least 400 responses were collected in each sample cell," it is impossible to understand how the 2,412 survey responses were distributed; thus precision estimates for proportional comparisons at the question-respondent level are not known, regardless of the stated 95% confidence level. Also, the problem of statistical testing in a multiple-comparisons situation (where alpha levels must be adjusted to be more stringent) was not discussed, and no such adjustment is assumed to have been applied.
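To give a rough sense of the numbers involved, the Python sketch below computes the best-case margin of error for a cell of 400 responses and a Bonferroni-adjusted alpha level. Several things here are assumptions and not details of the DMS study: the formula presumes simple random sampling (which these samples cannot claim), Bonferroni is only one common adjustment, and the count of three comparisons (one per pair of the three samples) is purely illustrative.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(n, p=0.5, confidence=0.95):
    """Half-width of a normal-approximation confidence interval for a proportion.

    Assumes simple random sampling; p=0.5 gives the widest (most conservative) interval.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sqrt(p * (1 - p) / n)

def bonferroni_alpha(alpha, n_comparisons):
    """Per-test alpha that keeps the family-wise error rate at or below `alpha`."""
    return alpha / n_comparisons

# A cell of 400 responses, as quoted from the study.
moe_400 = margin_of_error(400)

# Illustrative assumption: three pairwise comparisons among the three samples.
adjusted_alpha = bonferroni_alpha(0.05, 3)

print(f"margin of error at n=400: +/- {moe_400 * 100:.1f} percentage points")
print(f"Bonferroni-adjusted alpha for 3 comparisons: {adjusted_alpha:.4f}")
```

Even under these generous assumptions, a single 400-response cell carries a margin of roughly plus or minus five percentage points, and each added comparison tightens the threshold a difference must clear to count as significant.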
Conclusive evidence?
With these methodological unknowns, it is difficult to lend any degree of credibility to these findings, including the benchmark comparisons. However, the descriptive results on the sample population are somewhat suggestive regarding who showed up on survey day and how they compare to each other, regardless of how they got there or what they may represent. To read them this way, it is also necessary to ignore the mode effect that is certainly taking place between the telephone interview and the online survey instrument.
Avoiding any conclusions, it can be gleaned from these data that the river sample participants are more likely to be students, instant messengers, "into" the latest technology, and takers of online surveys (although not necessarily as panel members; this may simply reflect the time they spend on the Internet), and slightly more likely to be risk-takers. Since "online surveys" was not defined, it could also include taking brief customer surveys or service satisfaction surveys – or bogus surveys that are really masking marketing traps for sales. In any case, results from this type of sample are as unlikely to be generalizable as results from opt-in panel samples.
As a final point, studies done with members of the probability-based KnowledgePanel® by Knowledge Networks, and published in Marketing Research and also presented at the 2007 AAPOR meeting, have shown either no or only trivial differences in survey results between new panel members (recent "catches," so to speak) and those with longer tenure and survey-taking experience.
In the end, while we may be able to assert that river samples are "new" and "random," they are not representative of anything. And the value of respondent "newness" on its own for research is questionable, if not non-existent.
Dr. Charles DiSogra is Knowledge Networks' Chief Statistician and heads the statistics unit at the Menlo Park office.
He brings more than 20 years of experience in survey research, sample design, data analysis, and administration. Dr. DiSogra has a master's degree in public health and a doctorate in nutritional epidemiology with an emphasis in biostatistics and policy analysis from the University of California at Berkeley. He can be reached at cdisogra@knowledgenetworks.com.
Fresh Photo: © Stephen Snyder - Dreamstime.com







