Frequently Asked Questions
Is GeVIR just a modified gnomAD constraint score (e.g. pLI, observed/expected)?
No, GeVIR is based on a different hypothesis and is complementary to gnomAD constraint metrics. gnomAD constraint scores estimate the expected number of variants of a specific group (e.g. LoF or missense) and compare it with the observed number of such variants in a gene. These metrics do not consider how variants are distributed within a gene. In contrast, GeVIR is based on the idea that long conserved regions without any functional variants (e.g. LoF, missense, INDELs) might indicate important genes even if other parts of these genes are enriched with variants. Thus gnomAD constraint metrics prioritise genes with a low overall number of variants, whereas GeVIR prioritises genes with long variation intolerant regions. For more information please read our paper.
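The contrast between a gene-wide count and the distribution of variants can be illustrated with a toy example. This is only a sketch and not the GeVIR or gnomAD method: the two genes, their variant positions, the expected count of 20 and both function names are made up for illustration.

```python
from typing import List


def observed_expected_ratio(observed: int, expected: float) -> float:
    """Gene-wide ratio of observed to expected variant counts."""
    return observed / expected


def longest_variant_free_run(variant_positions: List[int], protein_length: int) -> int:
    """Length of the longest stretch of residues (1-based) carrying no variant."""
    longest = 0
    previous = 0  # sentinel position just before the first residue
    # A second sentinel just after the last residue closes the final stretch.
    for pos in sorted(set(variant_positions)) + [protein_length + 1]:
        longest = max(longest, pos - previous - 1)
        previous = pos
    return longest


# Two hypothetical 100-residue genes, each with 10 observed variants
# and the same made-up expectation of 20 variants.
evenly_spread = [5, 15, 25, 35, 45, 55, 65, 75, 85, 95]
clustered = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

for name, positions in (("evenly_spread", evenly_spread), ("clustered", clustered)):
    ratio = observed_expected_ratio(len(positions), expected=20.0)
    run = longest_variant_free_run(positions, protein_length=100)
    print(f"{name}: observed/expected = {ratio:.2f}, longest variant-free run = {run}")
```

Both toy genes have the same observed/expected ratio (0.50), so a count-based metric cannot tell them apart, while the longest variant-free stretch differs sharply (9 versus 90 residues).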
I can't find gene X. Why is there no rank for it?
We calculated ranks only for genes with a "valid" canonical transcript (starts with Met, ends with a stop codon, CDS length is divisible by 3) according to the Ensembl build 37 annotation. Therefore gene X might have had a different name/ID in Ensembl build 37, or it did not have a "valid" canonical transcript.
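The three validity conditions above can be sketched as a small check. This is a minimal illustration, not the code used to build GeVIR: it assumes the CDS is supplied as a DNA string on the coding strand, that the standard genetic code applies (Met = ATG; stop = TAA, TAG or TGA), and that a CDS has at least a start and a stop codon. The function name is hypothetical.

```python
# Stop codons of the standard genetic code (an assumption of this sketch).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_valid_cds(cds: str) -> bool:
    """Return True if the CDS starts with ATG, ends with a stop codon
    and has a length divisible by 3."""
    seq = cds.upper()
    return (
        len(seq) >= 6  # assumed minimum: one start codon plus one stop codon
        and len(seq) % 3 == 0
        and seq.startswith("ATG")
        and seq[-3:] in STOP_CODONS
    )


print(is_valid_cds("ATGGCCTAA"))  # complete toy CDS
print(is_valid_cds("ATGGCCTA"))   # truncated: length not divisible by 3
```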
When should I use GeVIR and when should I use gnomAD constraint scores?
GeVIR can be used to prioritise candidate genes intolerant to missense variants, especially if they are short. LOEUF is recommended to prioritise candidate genes intolerant to LoF variants, especially if they are long, although GeVIR might be more useful for short genes, for which LOEUF cannot be confidently estimated. VIRLoF can be used when a single metric is required, as it shows the best performance of all the variant-based gene constraint metrics assessed.
Is GeVIR based on the Constrained Coding Regions (CCRs) study data/method?
No, but it is based on the same idea. GeVIR, like the Constrained Coding Regions (CCRs) map, analyses regions without protein-coding variants in the gnomAD database. However, while the CCRs study shows that novel variants in such regions are more likely to be deleterious (i.e. CCRs complement variant prioritisation methods such as SIFT), GeVIR shows that these regions represent a deviation from the equal variant distribution expected in the absence of natural selection (i.e. GeVIR is an alternative to gene missense variation intolerance metrics, such as Missense Z-score).
Moreover, we used stricter coverage filters than the CCRs study and adjusted variant intolerant region weights by evolutionary conservation (measured by mean GERP++ score) to reduce the number of false positive regions. Finally, the GeVIR gene score is calculated based on ALL variant intolerant regions of a gene and not only the longest one (a method described in the CCRs study). This approach allowed us to prioritise dominant genes more precisely and to create a continuous metric which can also be used to prioritise recessive genes.
My disease candidate gene is variant intolerant according to GeVIR (i.e. low gene percentile), but my candidate disease-causing variant is not located in a variant intolerant region (i.e. high region percentile). Does GeVIR still support my hypothesis?
It adds evidence that your candidate gene might be intolerant to variation and consequently be associated with some disease, but not that your particular variant is disease causing. However, it also does not disprove your hypothesis. We found that disease-causing missense variants were observed ~3 times more often in very long variant intolerant regions (>20 amino acids) than in short regions (1-5 amino acids). However, only ~10% of pathogenic missense variants were located inside such long regions, and more than half of the studied pathogenic missense variants were located in very short regions (1-5 amino acids). Moreover, we did not observe this trend among pathogenic LoF variants, which in most cases are expected to be deleterious regardless of their location in the protein. Thus, while the location of a candidate missense variant inside a long variant intolerant region might be used as supporting evidence of its pathogenicity, location outside these regions should not be used as evidence that the variant is benign. For example, there are multiple long variant intolerant regions in the TARDBP gene, but all known disease-causing variants are located outside them, which leads to the speculative hypothesis that deleterious changes inside such regions might result in more severe or even lethal phenotypes.
How can I contact GeVIR?
Please contact us by email: firstname.lastname@example.org
Why are there antlers on the GeVIR logo?
"Gevir" means "Antlers" in Norwegian :)