Modeling the Determinants of Protein and Phosphoprotein Abundance in Cancers
Protein abundance and post-translational modification play important roles in cancer. In this talk, we report the first-place entry in the 2017 NCI-CPTAC DREAM Proteogenomics Challenge, which predicts protein abundance from mRNA levels and phosphorylation levels from protein levels. Our model consists of four parts, motivated by the principles behind the regulation of protein and phosphoprotein levels: 1) a global model capturing, for a single gene, the correlation between mRNA and protein abundance and between protein abundance and phosphorylation level; 2) a gene-specific model capturing the interdependencies across different genes; 3) a cross-tissue model capturing information from regulatory networks and pathways shared across cell types; 4) a cross-phosphorylation-site model capturing the co-regulation of different phosphorylation sites of the same gene. The third and fourth models have not previously been reported in the literature, and they set a new state of the art for predicting protein and phosphoprotein abundances.
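To make the first two parts concrete, the following is a minimal sketch of how a global model and a gene-specific model might be combined to predict protein abundance from mRNA. It is not the challenge entry itself. The choice of ordinary least squares for the global part, ridge regression for the gene-specific part, the equal-weight average, the synthetic data, and all function names (`fit_global`, `fit_gene_specific`, `predict_ensemble`) are assumptions made purely for illustration.

```python
import numpy as np

def fit_global(mrna, protein):
    """Assumed global model: one line shared by all genes, protein ~ a * mRNA + b."""
    x = mrna.ravel()
    y = protein.ravel()
    design = np.column_stack([x, np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef  # (slope, intercept)

def predict_global(coef, mrna):
    return coef[0] * mrna + coef[1]

def fit_gene_specific(mrna, protein, alpha=1.0):
    """Assumed gene-specific model: ridge regression of each gene's protein
    level on the mRNA levels of all genes."""
    n_genes = mrna.shape[1]
    x_mean = mrna.mean(axis=0)
    y_mean = protein.mean(axis=0)
    xc = mrna - x_mean
    yc = protein - y_mean
    gram = xc.T @ xc + alpha * np.eye(n_genes)
    weights = np.linalg.solve(gram, xc.T @ yc)  # shape: (features, target genes)
    return weights, x_mean, y_mean

def predict_gene_specific(params, mrna):
    weights, x_mean, y_mean = params
    return (mrna - x_mean) @ weights + y_mean

def predict_ensemble(global_coef, gene_params, mrna):
    """Equal-weight average of the two predictions (an illustrative choice)."""
    return 0.5 * (predict_global(global_coef, mrna)
                  + predict_gene_specific(gene_params, mrna))

# Synthetic data for illustration only: 40 samples, 5 genes.
rng = np.random.default_rng(0)
mrna = rng.normal(size=(40, 5))
protein = 0.6 * mrna + 0.1 * rng.normal(size=(40, 5))

global_coef = fit_global(mrna, protein)
gene_params = fit_gene_specific(mrna, protein, alpha=1.0)
pred = predict_ensemble(global_coef, gene_params, mrna)
```

The same structure would apply to the phosphorylation task by swapping the inputs (protein levels in place of mRNA, phosphorylation levels in place of protein). The cross-tissue and cross-phosphorylation-site parts are not sketched here, since the abstract does not describe how they are built.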