It is reasonable to wonder how the experts are coming up with the numbers of COVID cases and calculating its lethality. Some might be concerned or even skeptical because the numbers often change and vary across countries, age groups, ethnicities, and economic classes. This essay provides a basic overview of a core method of making inferences from samples to entire populations, which is what philosophers call the inductive generalization.
An inductive generalization is an inductive argument. In philosophy, an argument consists of premises and one conclusion. The premises are the reasons or evidence being offered to support the conclusion, which is the claim being argued for. Philosophers generally divide arguments into deductive and inductive. A deductive argument is such that the premises provide (or are supposed to provide) complete support for the conclusion. An inductive argument is such that the premises provide (or are supposed to provide) some degree of support, but less than complete support, for the conclusion. If the premises of an inductive argument support the conclusion adequately (or better), it is a strong argument: if the premises are true, the conclusion is likely to be true. If a strong inductive argument has all true premises, it is sometimes referred to as cogent.
One feature of inductive logic is that a strong inductive argument can have a false conclusion even when all the premises are true. This is because of what is known as the inductive leap: the conclusion always goes beyond the premises. This can also be put in terms of drawing a conclusion about what has not been observed from what has been observed. Our good dead friend David Hume argued back in the 1700s that this meant we could never be certain about inductive reasoning; later philosophers called this the problem of induction. In practical terms, this means that even if we engage in perfect inductive reasoning using premises that are certain, our conclusion can still turn out to be wrong. But induction is often the only option, so we use it because we must. So, when the initial numbers about COVID-19 turn out to be wrong, this is exactly what we should expect.
What, then, is an inductive generalization? Roughly put, it is an argument in which a conclusion about an entire population is based on evidence from a sample of observed members of that population. The formal version looks like this:
Premise 1: P% of observed Xs are Ys.
Conclusion: P% of all Xs are Ys.
The observed Xs would be the sample and all the Xs would be the target population. As an example, if someone wanted to know the mortality rate for males over sixty, the target population would be all males over sixty.
While the argument is simple, sorting out when a generalization is a good one can be quite challenging. Without getting into the complicated statistics and methods used for rigorous generalizations, I will go over the basic method of assessment, so you can make some sense of what experts say about such matters in the context of COVID-19 or anything else. There will be various factors whose presence or absence in the sample can affect the presence or absence of the property the argument is concerned with; a representative sample will have those factors in the same proportion as the target population. For example, if we wanted to determine the infection rate for all people, we would need to ensure that our sample included all factors that impact the infection rate and that it mirrored the target population in terms of age, ethnicity, base health, and so on for all relevant features. Sorting out which factors are relevant can itself be challenging; in the case of COVID-19, experts are taking samples and doing causal analysis to build better samples and make stronger generalizations. To the degree that the sample properly mirrors the target population, it is representative.
A sample is biased relative to a factor to the extent that the factor is not present in the sample in the same proportion as in the population. This sort of sample bias is a major problem when trying to generalize about COVID-19. One example is trying to draw a conclusion about the lethality of COVID-19. While the math is easy (a simple calculation of the percentage of the infected who die from it), getting the numbers right is hard because we need to know how many people have been infected and how many of them die from it.
Experts have been trying to determine the number of people who have been infected; this is done by testing and modeling, both of which also involve inductive reasoning. In the United States, most testing is being done on people who are showing symptoms, and this creates a biased sample; to get an unbiased sample, everyone would need to be tested. Until testing is broad enough, inferences about the lethality of the virus will be weakened by this biasing factor. There is also the practical matter of the accuracy of the tests and the determination of the cause of death.
To use a concrete but made-up example, if 5% of those who tested positive for COVID-19 ended up dying, the generalization from that sample to the whole population would only be as strong as the sample is representative. If only sick people are being tested, the sample will not be representative and the conclusion about the lethality of the virus will likely be wrong.
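To make the arithmetic concrete, here is a small sketch in Python with entirely made-up numbers, showing how testing only the sickest people can inflate an estimated lethality rate:

```python
# A toy illustration (all numbers are invented) of how testing only
# the symptomatic can bias an estimated fatality rate upward.

def fatality_rate(deaths, infected):
    """Percentage of the infected who died."""
    return 100 * deaths / infected

# Suppose 1,000 people are actually infected and 10 of them die.
true_rate = fatality_rate(10, 1000)      # 1.0%

# But only the 200 sickest people get tested, and all 10 deaths
# occur within that tested group.
observed_rate = fatality_rate(10, 200)   # 5.0%

print(f"true rate: {true_rate}%  observed rate: {observed_rate}%")
```

The same 10 deaths yield a rate five times too high when the denominator comes only from a biased sample of the infected.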
There is also the challenge of sorting out the effect of the virus on different populations. While there will be an overall infection rate and lethality rate for the whole population, there will also be different infection rates and lethality rates for different groups within the human population. As an example, the elderly appear to have a higher lethality rate than the population as a whole.
In addition to representativeness, sample size is important: the larger, the better. This brings us to two more concepts: margin of error and confidence level. A margin of error is a range of percentage points within which the conclusion of an inductive generalization falls; it is usually presented as plus or minus some number of points. The margin of error depends on the sample size and the confidence level of the argument. The confidence level is typically presented as a percentage and represents the proportion of arguments like the one in question that have a true conclusion.
When generalizing about large (10,000+) populations, a sample will need to have 1,000+ individuals to be representative (assuming the sample is taken properly). This table, from Moore & Parker’s Critical Thinking text, shows the connection between sample size and error margin (confidence level of 95%):
Sample Size | Error Margin (%) | Corresponding Range (percentage points)
10 | +/- 30 | 60
25 | +/- 22 | 44
50 | +/- 14 | 28
100 | +/- 10 | 20
250 | +/- 6 | 12
500 | +/- 4 | 8
1,000 | +/- 3 | 6
1,500 | +/- 2 | 4
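The error margins in the table are roughly what the standard worst-case formula for a 95% confidence level produces. The formula is my addition, not something from the text: margin = 1.96 × √(p(1−p)/n), with p set to 0.5 to give the widest (most cautious) margin:

```python
import math

def margin_of_error(n, z=1.96, p=0.5):
    """Approximate worst-case margin of error, in percentage points,
    for a sample of size n at a 95% confidence level (z = 1.96)."""
    return 100 * z * math.sqrt(p * (1 - p) / n)

for n in (10, 25, 50, 100, 250, 500, 1000, 1500):
    print(f"n = {n:>5}: +/- {margin_of_error(n):.1f} points")
```

For the larger samples the results land close to the table’s values (about 3.1 points for n = 1,000 and about 13.9 for n = 50); the smallest rows differ a bit because of rounding and the approximation involved.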
The practical takeaway here is that sample size matters a great deal: a small sample will have a margin of error large enough to make it useless. For example, suppose that a group of 50 COVID-19 patients received hydroxychloroquine tablets and 10 of them recovered fully. Laying aside all causal reasoning (which would be a huge mistake), the best we could say is that 20% of patients treated with hydroxychloroquine, plus or minus 14 percentage points, will recover fully. It is very important to note that this is just a simple generalization and that a controlled experiment or study would be needed to properly assess a causal claim, something I will discuss in the future.
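That made-up example can be checked with the same kind of back-of-the-envelope calculation; the 14-point figure is the worst-case 95% margin for a sample of 50:

```python
import math

# Invented example: 10 of 50 treated patients recover fully.
n, recovered = 50, 10
rate = 100 * recovered / n                 # 20.0 percent

# Worst-case 95% margin of error for n = 50 (about 14 points).
moe = 100 * 1.96 * math.sqrt(0.25 / n)

low, high = rate - moe, rate + moe
print(f"{rate:.0f}% +/- {moe:.0f} points, i.e. roughly {low:.0f}% to {high:.0f}%")
```

An interval running from roughly 6% to 34% is far too wide to support any useful conclusion, which is exactly the point of the example.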
There are various fallacies (mistakes in reasoning) that can occur with a generalization. I will discuss those in the next essay. Stay safe and I will see you in the future.