Avoiding confirmation bias in genetic medicine


Confirmation bias. Image by Beatrice the Biologist. Used with permission.

Confirmation bias is the tendency to search for, interpret and favour information that confirms your pre-existing beliefs or hypotheses. It is a universal tendency, an integral part of human nature. We like to make connections, we like to find explanations. These core human desires have led to some of our greatest discoveries. But they can also lead us astray. In genetic medicine confirmation bias causes error and harm. We must diligently guard against it.


I want to find a genetic variant to explain this disease

Confirmation bias often occurs during investigations into possible links between genetic variants and medical problems. We are keen to find such links, both for the patient in front of us and for the advancement of knowledge. So we are primed to favour information that supports links. To guard against this, it is vital that we fully comply with the strict evidence requirements for proving causal links between genes and disease.

This is particularly important in genetic medicine because everyone has hundreds of genetic variants with the potential to impact human functions. This makes it fairly easy to come up with a plausible reason why one of them has caused a given disease in a given person, if you want to.

This is similar to the confirmation bias that has led to so many people believing that vaccinations cause autism. We vaccinate most children, so most autistic children will have been vaccinated. This doesn’t show that vaccinations cause autism. All vaccinated children will want to stay up late, but no one thinks vaccinations cause that! Moreover, many huge studies have unequivocally shown that vaccinations don’t cause autism.
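The base-rate point can be checked with simple arithmetic. The sketch below assumes vaccination has no effect on autism and uses made-up inputs (a population of 100,000, a 90% vaccination rate and a 1% autism rate, all hypothetical and chosen only for illustration). The function name is likewise just an illustration.

```python
def vaccinated_share_by_group(population, vaccination_rate, autism_rate):
    """Return (share vaccinated among autistic children,
    share vaccinated among all other children),
    assuming vaccination has no effect on autism."""
    autistic = round(population * autism_rate)
    not_autistic = population - autistic
    # With no causal link, both groups are vaccinated at the same rate.
    vaccinated_autistic = round(autistic * vaccination_rate)
    vaccinated_not_autistic = round(not_autistic * vaccination_rate)
    return (vaccinated_autistic / autistic,
            vaccinated_not_autistic / not_autistic)


# Hypothetical inputs, for illustration only.
autistic_share, other_share = vaccinated_share_by_group(100_000, 0.90, 0.01)
print(f"Vaccinated among autistic children: {autistic_share:.0%}")
print(f"Vaccinated among other children:    {other_share:.0%}")
```

Both groups come out at 90%. Seeing that most autistic children were vaccinated tells you nothing until you compare it with the vaccination rate in everyone else.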

Why do people still believe vaccinations cause autism? Because it was their existing belief and no other explanation for autism has displaced it. In this context, the fact that most kids with autism have been vaccinated seems compelling supporting evidence.


I want to find a phenotype this genetic variant explains

“It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts.”

Sherlock Holmes

Trying to find a phenotype that a genetic variant has caused is a more recent area where confirmation bias is causing errors. As we described in last week’s post, we increasingly have genetic information on people with no, or only mild, medical problems. Sometimes this absence of a phenotype doesn’t fit our favoured hypothesis that the gene causes disease. When this happens, it can be tempting to wonder if the data, rather than the theory, should be different.

For example, sometimes a variant is found in a child and is thought to have caused their medical condition. But the variant is also in their father, who was reported to be entirely healthy. All too often a repeat medical examination will suddenly uncover ‘mild features’ of the syndrome in the father, which are then used as evidence that the variant causes the condition. Of course, sometimes this will be the truth: the features were simply not noticed the first time around. But if the ‘mild features’ are only detectable when you are trying very hard to find them, and an independent reviewer without knowledge of the desired theory would not find them, confirmation bias is probably influencing deductions.

So how do we prevent confirmation bias in genetic medicine?


Recognise the problem

The first and most important thing we should do is to be more open and less defensive about confirmation bias. This is not easy for scientists and doctors. Confirmation bias happens when we are not fully objective about the available data, and being objective is the lifeblood of a scientist. Accepting that we could ever be less than objective will be very uncomfortable. Perhaps focusing on the universality of confirmation bias as a human trait and providing the objective evidence showing how common it is will be helpful starting points.


Reset the baseline

There is a pervasive misconception at the heart of genetic medicine today that leads people to overestimate the chance that a genetic variant has caused a phenotype. It is the mother of all confirmation biases in our field. It has arisen because we have not adapted to the change in how we use genetic analyses today, as I described in a previous post. Previously the chance of finding a causative genetic variant was high because we only did genetic analyses if the chance of finding a genetic cause was high! Now we use genetic testing much more liberally and so the chance of finding a causative genetic variant, in the clinic or in research, is much, much lower.
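The effect of the baseline can be sketched with Bayes' rule. The numbers below are hypothetical and chosen only for illustration: a 50% prior stands in for the old, highly selected testing, and a 1% prior stands in for liberal testing. The two detection probabilities are assumptions too, not measured values.

```python
def probability_variant_is_causal(prior, p_found_if_genetic, p_found_if_not):
    """Bayes' rule: chance that a suspicious-looking variant is truly causal.

    prior              -- chance, before testing, that the condition has a
                          genetic cause the test could detect
    p_found_if_genetic -- chance of finding a suspicious variant when such
                          a cause exists
    p_found_if_not     -- chance of finding a suspicious variant anyway
                          when no such cause exists
    """
    true_hits = prior * p_found_if_genetic
    false_hits = (1 - prior) * p_found_if_not
    return true_hits / (true_hits + false_hits)


# Hypothetical scenarios, for illustration only.
selected = probability_variant_is_causal(0.50, 0.90, 0.05)
liberal = probability_variant_is_causal(0.01, 0.90, 0.05)
print(f"Highly selected testing: {selected:.0%}")
print(f"Liberal testing:         {liberal:.0%}")
```

With these assumed inputs the same suspicious-looking variant is about 95% likely to be causal in the selected setting, but only about 15% likely in the liberal one. Nothing about the variant changed, only the baseline.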


Don’t move the goalposts

Genetic research is often trying to answer a very simple central question. For example, ‘do variants in DNA repair genes cause disease-X?’ The research team will conduct well-designed, objective experiments to investigate this question. They may find no evidence of an association. But, searching long and hard in the data, they find a small subset of patients enriched for a small subset of variants.

We generate large amounts of data in genetic studies these days. This means you can always find a subset of patients that look like they are linked to a subset of genetic variants, if you want to. But you have moved the goalposts. This result is not the answer to the original question. It was not investigated objectively. It was found because the team believed DNA repair genes cause disease-X and so (unconsciously) favoured any data supporting that theory.

Of course science needs to be able to follow-up on new leads from research. But if a new theory emerges during a study, then an objective, properly designed investigation of the new theory needs to be done. You can’t keep moving the goalposts until you score a goal.
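Why a subset can always be found is a matter of arithmetic. As a minimal sketch, assume there is no real effect anywhere, every subset comparison is independent, and each is judged at the conventional 5% significance threshold. These are simplifying assumptions, and the function name is only illustrative.

```python
def chance_of_spurious_hit(n_comparisons, alpha=0.05):
    """Chance that at least one of n independent comparisons looks
    'significant' at threshold alpha when no real effect exists."""
    chance_all_clean = (1 - alpha) ** n_comparisons
    return 1 - chance_all_clean


for n in (1, 10, 100):
    print(f"{n:>3} comparisons: {chance_of_spurious_hit(n):.0%} chance of a false lead")
```

Under these assumptions a single pre-planned comparison has a 5% chance of producing a false lead, ten comparisons about 40%, and a hundred about 99%. The more subsets you search, the more certain you are to score a 'goal' that is not real.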


Data before deduction

Wherever possible we should obtain the data we are going to use to investigate a question before we start trying to find answers. This is particularly important for phenotype data. Genetic data by its nature is fixed and objective. You can’t decide a genetic variant is present if it is not present. Phenotypes are more nebulous. How floppy does a child have to be to have ‘hypotonia’? How big does a tongue have to be to be ‘macroglossia’? There is no objective definition of intellectual disability, or hemihypertrophy, or hundreds of other phenotypes. This makes phenotypes much more vulnerable to confirmation bias. Ideally phenotype data should be obtained before a study or clinical genetic test. But if that is not possible, try to get someone who doesn’t know the proposed answer to do the phenotyping.

Sherlock Holmes summed it up very well: “It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts.”

