Mass spectrometry (MS)-based quantitative proteomics research relies heavily on reliable downstream analysis to detect diagnostic markers, characterize changes in expression patterns in response to different signals, and understand pathogenicity mechanisms [1].
However, despite considerable advances in MS technologies, data from label-free proteomics analyses remain susceptible to systematic bias. The source of this bias is usually unknown; it arises from non-biological variation, including, but not limited to, differences in sample preparation and handling, instrument calibration, and changes in temperature. This bias cannot be overcome by adjusting the experimental settings alone [2,3].
Unwanted variation is one of the major challenges in proteomics data analysis. Selecting an appropriate normalization method is crucial for achieving reliable results with a low false discovery rate [4]. We compared several normalization methods on a range of Data Independent Acquisition (DIA) label-free proteomics datasets, covering plasma, immunoprecipitation and global (cell line) proteomics. The normalization methods were evaluated in terms of their ability to reduce variation between technical replicates, their effect on differential expression analysis, and their effect on the estimation of logarithmic fold changes. All data analyses were performed in the R statistical programming language, version 4.1.1. We found that Remove Unwanted Variation III Complete (RUVIIIC) reduced variation between technical replicates the most, leading to increased detection of differentially expressed proteins.
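To make the first evaluation criterion concrete, the sketch below shows one way to measure variation between technical replicates before and after normalization. It is illustrative only: it is written in Python rather than the R code used in the study, it uses simple median centring as a stand-in (not RUVIIIC), and the function names, sample names and toy log2 intensities are all hypothetical.

```python
from statistics import median, stdev


def median_normalize(log_intensities):
    """Shift each sample so that all samples share a common median.

    log_intensities: dict mapping sample name -> list of log2 intensities,
    with proteins in the same order for every sample (no missing values).
    """
    sample_medians = {s: median(v) for s, v in log_intensities.items()}
    grand_median = median(sample_medians.values())
    return {
        s: [x - sample_medians[s] + grand_median for x in values]
        for s, values in log_intensities.items()
    }


def median_replicate_sd(log_intensities, replicate_groups):
    """Median, over all proteins and replicate groups, of the within-group
    standard deviation of log2 intensities (lower means less variation)."""
    sds = []
    for samples in replicate_groups:
        n_proteins = len(log_intensities[samples[0]])
        for i in range(n_proteins):
            sds.append(stdev([log_intensities[s][i] for s in samples]))
    return median(sds)


# Toy data (hypothetical): two technical replicates of one sample, where the
# second injection carries a systematic offset of roughly one log2 unit.
raw = {
    "rep1": [10.0, 12.0, 14.0, 16.0],
    "rep2": [11.1, 12.9, 15.0, 17.0],
}
groups = [["rep1", "rep2"]]

normalized = median_normalize(raw)
print("replicate SD before:", median_replicate_sd(raw, groups))
print("replicate SD after: ", median_replicate_sd(normalized, groups))
```

The same summary statistic can be computed for each candidate normalization method, so that methods are ranked by how far they reduce replicate variation; real DIA data would additionally require handling of missing values, which this sketch omits.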
References