
In my last essay I looked briefly at how to pick between experts. While people often reply on experts when making arguments, they also rely on studies (and experiments). Since most people do not do their own research, the studies mentioned are typically those conducted by others. While using study results in an argument is quite reasonable, making a good argument based on study results requires being able to pick between studies rationally.
Not surprisingly, people tend to pick based on fallacious reasoning. One common approach is to pick a study based on the fact that it agrees with what you already believe. This is rather obviously not good reasoning: to infer that something is true simply because I believe it gets things backwards. It should be first established that a claim is probably true, then it is reasonable to believe it.
Another common approach is to accept a study as correct because the results match what you really want to be true. For example, a liberal might accept a study that claims liberals are smarter and more generous than conservatives. This sort of “reasoning” is the classic fallacy of wishful thinking. Obviously enough, wishing that something is true (or false) does not prove that the claim is true (or false).
In some cases, people try to create their own “studies” by appealing to their own anecdotal data about some matter. For example, a person might claim that poor people are lazy based on his experience with some poor people. While anecdotes can be interesting, to take an anecdote as evidence is to fall victim to the classic fallacy of anecdotal evidence.
While fully assessing a study requires expertise in the relevant field, non-experts can still make rational evaluations of studies, provided that they have the relevant information about the study. The following provides a concise guide to studies—and experiments.
In normal use, people often jam together studies and experiments. While this is fine for informal purposes, this distinction is actually important. A properly done controlled cause-to-effect experiment is the gold standard of research, although it is not always a viable option.
The objective of the experiment is to determine the effect of a cause and this is done by the following general method. First, a random sample is selected from the population. Second, the sample is split into two groups: the experimental group and the control group. The two groups need to be as alike as possible—the more alike the two groups, the better the experiment.
The experimental group is then exposed to the causal agent while the control group is not. Ideally, that should be the only difference between the groups. The experiment then runs its course and the results are examined to determine if there is a statistically significant difference between the two. If there is such a difference, then it is reasonable to infer that the causal factor brought about the difference.
Assuming that the experiment was conducted properly, whether or not the results are statistically significant depends on the size of the sample and the difference between the control group and experimental group. The key idea is that experiments with smaller samples are less able to reliably capture effects. As such, when considering whether an experiment actually shows there is a causal connection it is important to know the size of the sample used. After all, the difference between the experimental and control groups might be rather large, but might not be significant. For example, imagine that an experiment is conducted involving 10 people. 5 people get a diet drug (experimental group) while 5 do not (control group). Suppose that those in the experimental group lose 30% more weight than those in the control group. While this might seem impressive, it is actually not statistically significant: the sample is so small, the difference could be due entirely to chance. The following table shows some information about statistical significance.
Sample Size (Control group + Experimental Group) |
Approximate Figure That The Difference Must Exceed To Be Statistically Significant (in percentage points) |
10 | 40 |
100 | 13 |
500 | 6 |
1,000 | 4 |
1,500 | 3 |
While the experiment is the gold standard, there are cases in which it would be impractical, impossible or unethical to conduct an experiment. For example, exposing people to radiation to test its effect would be immoral. In such cases studies are used rather than experiments.
One type of study is the Nonexperimental Cause-to-Effect Study. Like the experiment, it is intended to determine the effect of a suspected cause. The main difference between the experiment and this sort of study is that those conducting the study do not expose the experimental group to the suspected cause. Rather, those selected for the experimental group were exposed to the suspected cause by their own actions or by circumstances. For example, a study of this sort might include people who were exposed to radiation by an accident. A control group is then matched to the experimental group and, as with the experiment, the more alike the groups are, the better the study.
After the study has run its course, the results are compared to see if these is a statistically significant difference between the two groups. As with the experiment, merely having a large difference between the groups need not be statistically significant.
Since the study relies on using an experimental group that was exposed to the suspected cause by the actions of those in the group or by circumstances, the study is weaker (less reliable) than the experiment. After all, in the study the researchers have to take what they can find rather than conducting a proper experiment.
In some cases, what is known is the effect and what is not known is the cause. For example, we might know that there is a new illness, but not know what is causing it. In these cases, a Nonexperimental Effect-to-Cause Study can be used to sort things out.
Since this is a study rather than an experiment, those in the experimental group were not exposed to the suspected cause by those conducting the study. In fact, the cause it not known, so those in the experimental group are those showing the effect.
Since this is an effect-to-cause study, the effect is known, but the cause must be determined. This is done by running the study and determining if these is a statistically significant suspected causal factor. If such a factor is found, then that can be tentatively taken as a causal factor—one that will probably require additional study. As with the other study and experiment, the statistical significance of the results depends on the size of the study—which is why a study of adequate size is important.
Of the three methods, this is the weakest (least reliable). One reason for this is that those showing the effect might be different in important ways from the rest of the population. For example, a study that links cancer of the mouth to chewing tobacco would face the problem that those who chew tobacco are often ex-smokers. As such, the smoking might be the actual cause. To sort this out would involve a study involving chewers who are not ex-smokers.
It is also worth referring back to my essay on experts—when assessing a study, it is also important to consider the quality of the experts conducting the study. If those conducting the study are biased, lack expertise, and so on, then the study would be less credible. If those conducting it are proper experts, then that increases the credibility of the study.
As a final point, there is also a reasonable concern about psychological effects. If an experiment or study involves people, what people think can influence the results. For example, if an experiment is conducted and one group knows it is getting pain medicine, the people might be influenced to think they are feeling less pain. To counter this, the common approach is a blind study/experiment in which the participants do not know which group they are in, often by the use of placebos. For example, an experiment with pain medicine would include “sugar pills” for those in the control group.
Those conducting the experiment can also be subject to psychological influences—especially if they have a stake in the outcome. As such, there are studies/experiments in which those conducting the research do not know which group is which until the end. In some cases, neither the researchers nor those in the study/experiment know which group is which—this is a double blind experiment/study.
Overall, here are some key questions to ask when picking a study:
Was the study/experiment properly conducted?
Was the sample size large enough?
Were the results statistically significant?
Were those conducting the study/experiment experts?