Lack of in vivo validation may cause inaccurate diagnosis for infertility patients
Many infertility cases have genetic causes, but pinpointing the culprit mutations remains difficult because fertility and reproduction are controlled by many genes. Additionally, these genes carry many harmless but suspicious mutations in different people, making it hard to spot the truly damaging ones.
In a recent study published in PNAS, a team led by Dr. John Schimenti, professor of genetics in the Department of Biomedical Sciences, tested the accuracy of existing methods used to predict which genetic variants cause infertility.
Getting an accurate interpretation of genetic variation is crucial for giving patients the right diagnosis and recommendations. “Interpreting the functional impacts of genetic variation is challenging but profoundly important for clinical management and genetic counseling,” says Schimenti.
When scientists want to identify the genetic mutations responsible for a trait, they use a combination of computational tools and molecular techniques. Typically, complex algorithms analyze a patient’s DNA sequence and classify each genetic variant according to how likely it is to cause disease.
Most of the variation in our DNA is classified either as benign or as variants of unknown significance (VUS). “A mutation that causes infertility will exist within a background of multiple VUS in candidate genes,” says Schimenti. “It is difficult to conclusively implicate any single variant as being responsible for infertility.” Scientists use both terms, mutation and genetic variant, to describe a change in the DNA sequence.
For many traits, like rare diseases and cancers, a panel of experts in specific disease areas then examines the computational predictions. The experts check whether other evidence, such as published laboratory experiments, confirms the predictions. This verification process increases the reliability of the clinical database of genetic variants. Unfortunately, no such panel exists for infertility; establishing one requires support and approval from the NIH. For reproductive traits, most conclusions are based solely on algorithmic predictions.
Schimenti and his team wanted to assess whether computational methods alone provide accurate predictions for infertility-related mutations. They set up an experiment in which they examined the fertility of mice engineered to carry human genetic variants in genes essential for male reproduction. They focused on 11 genetic variants that algorithms predicted would disrupt the function of these key fertility genes. Three of these 11 mutations were also observed in men clinically diagnosed with fertility issues.
Of the 11 mutations that algorithms predicted to be harmful, Schimenti and colleagues observed that 10 had no effect on mouse fertility. Only one genetic variant, found in a male infertility patient, greatly reduced sperm production in mice. This means that the other 10 mutations were predicted to be harmful but proved benign in the mouse models.
Schimenti can think of several reasons why the in vivo observations did not match the computational predictions. One is that algorithms are trained on datasets that contain errors, and if the models learn from partially wrong data, their predictions will be partially wrong too. “Some studies have demonstrated that nearly half of the rare mutations that were algorithmically predicted to have a negative impact on health did not have the predicted effect,” says Schimenti.
Another reason could be that the computational predictions are not wrong, but that biological systems are resilient against mutations. “Living systems have robustness or redundancies that can mask minor biochemical or structural defects of proteins,” says Schimenti. Some of these mutations may affect the function of a gene as predicted, but this alone may not be enough to compromise an organism’s fertility. Sometimes, genetic variants in a gene affect a trait only when combined with specific variations in other genes.
Schimenti also acknowledges that his experiments tested human mutations in mouse models. “It is possible that mice may be more tolerant to the protein alterations than humans,” says Schimenti. “It is also possible that the consequences only manifest themselves over longer human lifespans.”
Nevertheless, Schimenti’s study shows that computational predictions or in vitro experiments alone are insufficient for diagnosis in the clinical setting. Used in isolation, these methods can wrongly label harmless mutations as harmful, and they can fail to identify the genetic factors responsible for infertility in actual patients.
“Computational prediction is only one piece of the evidence, and if we don’t look at the other pieces, we are bound to make mistakes in our interpretation of genetic variants,” says Schimenti. With his study, he calls for improving existing computational approaches and increasing experimental validation, both of which are essential to providing accurate genetic diagnoses and giving patients meaningful answers.
Co-authors of the study include postdoctoral fellow Dr. Xinbao Ding of Schimenti’s lab; members of the laboratory of Dr. Haiyuan Yu, professor in the Department of Computational Biology and the Weill Institute for Cell and Molecular Biology in the College of Agriculture and Life Sciences; and researchers from the University of Pittsburgh, Oregon Health & Science University, and the Polish Academy of Sciences.
A modified version of this story appears in the Cornell Chronicle.
Written by Elodie Smith