Poster Presentation 27th Annual Lorne Proteomics Symposium 2022

Proteome Discoverer 3.0 software with the CHIMERYS intelligent search algorithm: an AI-driven leap forward in peptide identification (#124)

Martin Frejno 1, Daniel Zolg 1, Tobias Schmidt 1, Siegfried Gessulat 1, Michael Graber 1, Florian Seefried 1, Magnus Rathke-Kuhnert 1, Samia Ben Fredj 1, Kai Fritzemeier 2, Frank Berg 2, Waqas Nasir 2, David Horn 3, Bernard Delanghe 2, Christoph Henrich 2, Daniel Hermanson 3, Bernhard Küster 4, Mathias Wilhelm 4
  1. MSAID GmbH, Garching b. München, Germany
  2. Thermo Fisher Scientific GmbH, Bremen, Germany
  3. Thermo Fisher Scientific, San Jose, California, United States
  4. Technical University Munich, Freising, Germany

Matching peptide sequences to tandem mass spectra is integral to bottom-up proteomics, where chimeric spectra are estimated to constitute >50% of data-dependent acquisition data. Some search engines allow multi-pass searches or account for multiple possible precursors; however, such approaches over- or underutilize the measured fragment ions. This introduces errors or leaves valuable information unused, resulting in far fewer peptide identifications than could be obtained from the data. Here, we describe Proteome Discoverer 3.0 software with CHIMERYS, an intelligent search algorithm that rethinks the analysis of tandem mass spectra. This innovative approach routinely doubles the number of peptide identifications and reaches identification rates of >80% for typical proteomic data sets. CHIMERYS uses accurate predictions of peptide fragment ion intensities and retention times provided by the deep learning framework INFERYS. It aims to explain as much of the measured intensity as possible with as few candidate peptides as possible, which results in the deconvolution of chimeric spectra. We validated this approach in multiple ways, including entrapment searches and an in silico chimeric spectra system. Analyzing a HeLa cell lysate digest with CHIMERYS identified 114k PSMs, 61k unique peptides and 7,300 unique protein groups at 1% FDR. These are 3.5-, 2- and 1.5-fold increases, respectively, compared to Sequest HT, resulting in more identified peptides per protein and more identified proteins. CHIMERYS is compatible with all Orbitrap mass spectrometers, but provides more additional identifications on recent instruments, owing to their increased sensitivity. CHIMERYS provides exceptional performance with short chromatographic gradients and high protein loads, enabling higher throughput and deeper mining of the data. CHIMERYS is the first search algorithm that embraces chimeric spectra in a highly scalable, cloud-native, AI-powered implementation.
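
The stated objective, explaining as much measured intensity as possible with as few candidate peptides as possible, can be illustrated with a toy sparse, non-negative regression. This is only a sketch of the general idea and not the CHIMERYS implementation, which the abstract does not describe: the function name `deconvolve`, the peptide labels, the intensity templates and the penalty value are all hypothetical, and the generic technique used is L1-penalised non-negative least squares solved by coordinate descent.

```python
from typing import Dict, List


def deconvolve(observed: List[float],
               templates: Dict[str, List[float]],
               penalty: float = 0.05,
               sweeps: int = 200) -> Dict[str, float]:
    """Toy L1-penalised, non-negative least squares via coordinate descent.

    Minimises 0.5 * ||observed - sum_j w_j * template_j||^2 + penalty * sum_j w_j
    subject to w_j >= 0. All inputs are hypothetical illustrations.
    """
    weights = {name: 0.0 for name in templates}
    residual = list(observed)  # observed minus the current reconstruction
    for _ in range(sweeps):
        for name, t in templates.items():
            norm = sum(x * x for x in t)
            if norm == 0.0:
                continue
            # Correlation of this template with the residual, after adding
            # back the template's own current contribution.
            rho = sum(ti * ri for ti, ri in zip(t, residual)) + weights[name] * norm
            # The penalty shrinks weak candidates to exactly zero.
            new_w = max(0.0, (rho - penalty) / norm)
            delta = new_w - weights[name]
            if delta != 0.0:
                residual = [ri - delta * ti for ri, ti in zip(residual, t)]
                weights[name] = new_w
    return weights


# Hypothetical predicted fragment intensities on five shared m/z bins.
templates = {
    "pep_A": [1.0, 0.5, 0.0, 0.0, 0.0],
    "pep_B": [0.0, 0.0, 1.0, 0.8, 0.0],
    "pep_C": [0.0, 0.5, 0.0, 0.0, 1.0],  # candidate that is not in the spectrum
}

# Simulated chimeric spectrum: 2 parts pep_A plus 1 part pep_B.
observed = [2.0 * a + 1.0 * b for a, b in zip(templates["pep_A"], templates["pep_B"])]

weights = deconvolve(observed, templates)
for name, w in weights.items():
    print(f"{name}: {w:.3f}")
```

In this toy case the two peptides that were mixed receive positive weights close to their true contributions (slightly shrunk by the penalty), while the third candidate, which shares one fragment bin with `pep_A`, is assigned a weight of zero. A real search engine would additionally have to handle mass tolerances, retention times, noise and error-rate control, none of which is modelled here.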