Poster Presentation 27th Annual Lorne Proteomics Symposium 2022

Detection of novel protein coding regions in Campylobacter jejuni using N-TAILS  (#109)

Joel Cain 1,2, Melanie White 2,3, Stuart Cordwell 1,2,4
  1. School of Life and Environmental Sciences, University of Sydney, Sydney, NSW, Australia
  2. Charles Perkins Centre, The University of Sydney, Sydney, NSW, Australia
  3. School of Medical Sciences, The University of Sydney, Sydney, NSW, Australia
  4. Sydney Mass Spectrometry, University of Sydney, Sydney, NSW, Australia

In proteomics experiments, protein identification via automated database searching relies on the definition of protein coding regions through genome annotation. While this approach is sufficient to capture the vast majority of proteins in any given proteome, it excludes regions that do not conform to the classical definitions of a gene (e.g. very small predicted proteins). Providing physical evidence for the expression of novel protein coding regions is a key step on the path to comprehensively mapping any given biological system and is of particular importance for initiatives such as the Human Proteome Project.

N-terminal amine isotopic labelling of substrates (N-TAILS) is a technique that allows the enrichment of N-terminal peptides from complex mixtures. While commonly applied to the functional characterisation of proteases and their recognition sites, this method has broader applicability to the identification of novel protein coding regions. It has the added benefit of mitigating the dynamic range issues that hamper detection of small proteins, which generate few tryptic peptides, if any, during shotgun proteomics.

Here, we interrogated N-TAILS datasets against a 6-frame translation of the Campylobacter jejuni NCTC11168 genome to allow for unbiased detection of products of all potential coding regions. C. jejuni NCTC11168 was chosen due to the recent identification of a previously unrecognised protein coding region in this strain, despite its extensive genome annotation. Mapping the 9628 detected N-terminal peptides against existing gene annotations identified a subset that did not agree with these annotations, including true N-termini from genes previously classified as pseudogenes (e.g. Cj0742 and Cj1064), suggesting that their classification in genome repositories be revisited. N-termini from 26 previously unidentified open reading frames were also detected, and we provide additional evidence to support their status as protein coding elements of C. jejuni.