Oral Presentation 27th Annual Lorne Proteomics Symposium 2022

Proteome-wide estimation of the change in protein stability due to missense mutation (#25)

Dan Andrews 1, Aaron Chuah 1, Sean Li 1, Matthew Field 2
  1. John Curtin School of Medical Research, Australian National University, Acton, ACT, Australia
  2. Australian Institute of Tropical Health and Medicine, James Cook University, Cairns, Queensland, Australia

It is widely accepted that decreased protein stability due to missense genetic variation is strongly associated with inherited genetic disease. However, the interpretive power of the destabilising effects of mutation on proteins has been limited by the need for structural data covering the observed variation. Consequently, analysis of the influence of genetic variation on protein stability is not routinely performed as part of clinical genomic workflows to detect pathogenic variation. The availability of AlphaFold-predicted structures for whole proteomes, including the human proteome, now allows genome-wide assessment of the stability effects of genetic variation. Importantly, this allows routine assessment of personal genetic variation for missense variants that potentially have clinical interpretive value. Here, we present both a web tool and a standalone software package that allow prioritisation of genetic variation to identify just those variants that lead to strong destabilisation effects in essential proteins. Our methodology innately down-weights variation in non-essential and redundant proteins, as these genes inherently harbour more variation deleterious to protein function. Our predictive metrics are trained on the presumed non-disease-associated variation present in the gnomAD dataset. For each missense variant from an incident clinical genome, a test statistic is calculated that assesses the magnitude of the destabilisation effect with respect to all other non-disease-associated variation identified in this protein in the gnomAD dataset. Missense variants in a personal genome are prioritised such that variation that produces unusually large stability effects in the encoded protein, given the population variation observed in that protein, is easily identified. The results of this methodology are assessed and contrasted between the HapMap and two clinical cohorts.
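
The per-protein comparison described above can be illustrated with a minimal sketch. The abstract does not specify the form of the test statistic, so the example below assumes one simple possibility: an add-one-smoothed empirical tail fraction, i.e. the share of population variants in the same protein that are at least as destabilising as the candidate. The function names, the gene labels, the sign convention (positive ΔΔG means destabilising) and all numeric values are hypothetical and chosen only for illustration; they are not taken from the authors' software or from gnomAD.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    protein: str   # hypothetical gene/protein label
    change: str    # amino-acid change, e.g. "R10W"
    ddg: float     # predicted stability change; positive = destabilising (assumed convention)


def empirical_tail(candidate_ddg, population_ddg):
    """Add-one-smoothed fraction of population variants in the same protein
    that are at least as destabilising as the candidate.
    Small values mean the candidate is unusual for that protein."""
    n_at_least = sum(1 for d in population_ddg if d >= candidate_ddg)
    return (n_at_least + 1) / (len(population_ddg) + 1)


def prioritise(variants, population_by_protein):
    """Score each variant against its own protein's population variation
    and return (score, variant) pairs, most unusual first."""
    scored = [
        (empirical_tail(v.ddg, population_by_protein[v.protein]), v)
        for v in variants
    ]
    scored.sort(key=lambda pair: pair[0])
    return scored


# Toy population values (invented): one protein that tolerates destabilising
# variation and one in which population variants are all near-neutral.
TOLERANT = [0.2, 1.5, 2.8, 3.1, 0.9, 2.2, 3.5, 1.1, 2.9]
CONSTRAINED = [0.1, 0.3, -0.2, 0.5, 0.4, 0.0, 0.6, 0.2, 0.3]
population = {"TOLERANT_GENE": TOLERANT, "CONSTRAINED_GENE": CONSTRAINED}

# The same predicted stability change is observed in both proteins.
personal = [
    Variant("TOLERANT_GENE", "A50V", 2.5),
    Variant("CONSTRAINED_GENE", "R10W", 2.5),
]

ranked = prioritise(personal, population)
for score, v in ranked:
    print(f"{v.protein}\t{v.change}\tddG={v.ddg}\ttail={score:.2f}")
```

In this toy setting the identical stability change ranks higher in the protein whose population variation is near-neutral, which mirrors the down-weighting of variation-tolerant proteins that the abstract describes. A real analysis would need the actual stability predictions and a statistic matched to the authors' method.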