Back in 2016 my husky, Isis, and I had slowed down since teaming up in 2004; pulling for so many years will slow down both man and dog. Isis faced a crisis, most likely due to the wear of time on her spine, but the steroids she was prescribed helped with the pain and inflammation, and for a while she was tail up and bright-eyed once more.
In my previous essay I looked at using causal reasoning on a small scale by applying the methods of difference and agreement. In this essay I will look at thinking critically about experiments and studies.
The gold standard in science is the controlled cause to effect experiment. The objective of an experiment is to determine the effect of a cause. As such, the question is “I wonder what this does?” While conducting such an experiment can be complicated and difficult, the basic idea is simple.
The first step is to have a question about a causal agent. For example, it might be wondered what effect steroids have on arthritis in elderly dogs. The second step is to determine the target population, which might already be settled by the first step. In the example, elderly dogs would be the target population. The third step is to pull a random sample from the target population. This sample needs to be representative, which means it needs to be like the target population. For example, a sample from the population of elderly dogs would ideally include all breeds of dogs, male dogs, female dogs, and so on for all relevant qualities of dogs. If a sample is not properly taken, it can be biased. The problem with a biased sample is that the inference will be weak because the sample might not be adequately like the general population. The sample also needs to be large enough. A sample that is too small will also fail to adequately support the inference drawn from the experiment.
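The sampling step can be illustrated with a minimal Python sketch. Everything in it is assumed for illustration: the population of 1,000 elderly dogs, their attributes, and the helper names `draw_random_sample` and `share` are made up, and comparing one trait is only a rough stand-in for checking that a sample is representative.

```python
import random

def draw_random_sample(population, size, seed=None):
    """Return a simple random sample (without replacement) from the population."""
    rng = random.Random(seed)
    return rng.sample(population, size)

def share(dogs, key, value):
    """Fraction of dogs whose attribute `key` equals `value`."""
    return sum(1 for dog in dogs if dog[key] == value) / len(dogs)

# Hypothetical target population: 1,000 elderly dogs with made-up attributes.
_builder = random.Random(0)
population = [
    {
        "id": i,
        "sex": _builder.choice(["male", "female"]),
        "breed": _builder.choice(["husky", "labrador", "beagle", "mixed"]),
    }
    for i in range(1000)
]

# Third step: pull a random sample from the target population.
sample = draw_random_sample(population, 100, seed=1)

# Rough representativeness check: the sample's share of a trait should be
# reasonably close to the population's share of that trait.
print("female share in population:", share(population, "sex", "female"))
print("female share in sample:    ", share(sample, "sex", "female"))
```

A real study would compare many relevant qualities, not just one, and a close match on one trait does not show that the sample is unbiased.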
The fourth step involves splitting the sample into the control group and the experimental group. These groups need to be as similar as possible (and can be made of the same individuals). They need to be alike because in the fifth step the experimenters introduce the cause (such as steroids) to the experimental group, and the experiment is run to see what difference this makes between the two groups. The final step is getting the results and determining whether the difference is statistically significant. A difference is statistically significant when it can be confidently attributed to the presence of the cause (as opposed to chance or other factors). While calculating this can be complicated, when assessing an experiment (such as a clinical trial) it is easy enough to compare the number of individuals in the sample to the difference between the experimental and control groups. This handy table from Critical Thinking makes this easy and also shows the importance of having a large enough sample.
| Number in Experimental Group (with similarly sized control group) | Approximate Figure That the Difference Must Exceed to Be Statistically Significant (in percentage points) |
|---|---|
| 10 | 40 |
| 25 | 27 |
| 50 | 19 |
| 100 | 13 |
| 250 | 8 |
| 500 | 6 |
| 1,000 | 4 |
| 1,500 | 3 |
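For readers who like to see the check spelled out, here is a minimal Python sketch of how the table above might be used. The figures are copied from the table; the function names are made up for illustration. Handling a group size that falls between two rows by using the smaller tabulated size (and thus the stricter threshold) is my own assumption, not something the table specifies.

```python
# Thresholds copied from the table above:
# (number in experimental group, percentage points the difference must exceed)
THRESHOLDS = [
    (10, 40), (25, 27), (50, 19), (100, 13),
    (250, 8), (500, 6), (1000, 4), (1500, 3),
]

def required_difference(group_size):
    """Return the threshold for the largest tabulated size not exceeding
    group_size, or None if the group is smaller than the smallest row.

    Assumption: rounding down to a tabulated size is the cautious choice,
    since smaller groups require a larger difference.
    """
    required = None
    for size, points in THRESHOLDS:
        if group_size >= size:
            required = points
    return required

def looks_significant(group_size, experimental_pct, control_pct):
    """Rough check: does the difference between the groups (in percentage
    points) exceed the table's approximate threshold for this group size?"""
    required = required_difference(group_size)
    if required is None:
        return False  # too small for the table to say anything
    return abs(experimental_pct - control_pct) > required

# Hypothetical trial: 100 dogs per group, 50% improve with the treatment
# and 30% improve without it. The difference is 20 points; the table asks for more than 13.
print(looks_significant(100, 50, 30))
```

This is only as good as the table's approximations, and it says nothing about whether the experiment was otherwise run correctly.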
Many “clinical trials” mentioned in articles and blog posts have very small sample sizes and this can make their results all but meaningless. This table also shows why anecdotal evidence is fallacious: a sample size of one is useless when it comes to an experiment.
The above table also assumes that the experiment is run correctly: the sample was representative, the control group was adequately matched to the experimental group, the experimenters were not biased, and so on for all the relevant factors. As such, when considering the results of an experiment it is important to consider those factors as well. If, for example, you are reading an article about an herbal supplement for arthritic dogs and it mentions a clinical trial, you would want to check the sample size and the difference between the two groups, and determine whether the experiment was properly conducted. Without this information, you would need to rely entirely on the credibility of the source. If the source is credible and claims that the experiment was conducted properly, then it would be reasonable to trust the results. If the source’s credibility is in question, then trust should be withheld. Assessing credibility is a matter of determining expertise and the goal is to avoid being a victim of a fallacious appeal to authority. Here is a short checklist for determining whether a person (or source) is an expert or not:
- The person has sufficient expertise in the subject matter in question.
- The claim being made by the person is within her area(s) of expertise.
- There is an adequate degree of agreement among the other experts in the subject in question.
- The person in question is not significantly biased.
- The area of expertise is a legitimate area or discipline.
- The authority in question must be identified.
While the experiment is the gold standard, there are times when it cannot be used. In some cases, this is a matter of ethics: exposing people or animals to something potentially dangerous might be deemed morally unacceptable. In other cases, it is a matter of practicality or necessity. In such cases, studies are used.
One type of study is the non-experimental cause to effect study. This is identical to the cause to effect experiment with one critical difference: the experimental group is not exposed to the cause by those running the study. For example, a study might be conducted of dogs who recovered from Lyme disease to see what long-term effects it has on them. It would be cruel to give dogs Lyme disease to study its effects, although researchers often try to justify such cruelty in the name of progress.
The study, as would be expected, runs in the same basic way as the experiment and if there is a statistically significant difference between the two groups (and it has been adequately conducted) then it is reasonable to make the relevant inference about the effect of the cause in question.
While useful, the study is weaker than the experiment. This is because those conducting the study must take what they get as the experimental group is already exposed to the cause and this can create problems in properly sorting out the effect of the cause in question. As such, while a properly run experiment can still get erroneous results, a properly run study is even more likely to have issues.
A second type of study is the effect to cause study. It differs from the cause to effect experiment and study in that the effect is known but the cause is not. Hence, the goal is to infer an unknown cause from the known effect. It also differs from the experiment in that those conducting the study obviously do not introduce the cause.
This study is conducted by comparing the experimental group and the control group (which are, ideally, as similar as possible) to sort out a likely cause by considering the differences between them. As would be expected, this method is less reliable than the others since those doing the study are trying to backtrack from an effect to a cause. If considerable time has passed since the suspected cause, this can make the matter even more difficult to sort out. Those conducting the study also must work with the experimental group they happen to get and this can introduce complications into the study, making a strong inference problematic.
An example of this would be a study of elderly dogs who suffer from paw knuckling (the paw flips over so the dog is walking on the top of the paw) to determine the cause of this effect. As one might suspect, finding the cause would be challenging as there would be a multitude of potential causes in the history of the dogs ranging from injury to disease. It is also likely that there are many causes in play here, and this would require sorting out the different causes for this same effect. Because of such factors, the effect to cause study is the weakest of the three and supports the lowest level of confidence in its results even when conducted properly. This explains why it can be so difficult for researchers to determine the causes of many problems that, for example, elderly dogs suffer from.
In the case of Isis, the steroids that she was taking were well-studied, so it was quite reasonable for me to believe that they were a causal factor in her remarkable but all too brief recovery. I do not, however, know for sure what caused her knuckling as there are so many potential causes for that effect. However, the important thing is that she was able to walk normally about 90% of the time and her tail was back in the air, showing that she was a happy husky.
