BERT_PLPS: A BERT-based Model for Predicting Lysine Phosphoglycerylation Sites

Posted on 2023-04-11, 16:26. Authored by Songning Lai, Pengwei Wang, Lan Ye, Zhi Liu

As one of the most important post-translational modifications, lysine phosphoglycerylation affects many important biosynthetic processes in the human body. However, traditional experimental methods for recognizing lysine phosphoglycerylation sites are both expensive and time-consuming. Computational techniques may provide an economical and efficient way to predict these sites, so establishing prediction models with high accuracy is both necessary and meaningful. In the present study, we propose a BERT-based model, BERT_PLPS, which accurately predicts lysine phosphoglycerylation sites. The model extracts amino acid sequence features with three algorithms: CKSAAP, AAC, and BE. Sample equalization is performed using the ADASYN and KNN algorithms. The dimensionality of the data is reduced by the ISOMap algorithm, and the features are encoded into feature sequences by an encoder as the input to a BERT-based prediction model. To better learn the intrinsic biological language of lysine, we replaced the original static mask with a dynamic random mask. Compared with other machine learning and deep learning-based models, BERT_PLPS reaches up to 99.53% accuracy and outperforms the most advanced model (PLP_FS), with an increase of approximately 0.35% in ACC and approximately 0.93% in MCC.
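The three feature encodings named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 20-letter residue alphabet, the maximum gap `k_max=2`, and the example window are all assumptions made here for illustration, and the abstract does not state the window length or gap range actually used.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues (assumed alphabet)


def aac(seq):
    """Amino acid composition: frequency of each residue in a non-empty window."""
    n = len(seq)
    return [seq.count(a) / n for a in AMINO_ACIDS]


def binary_encoding(seq):
    """BE: one-hot encode each position; unknown residues become all zeros."""
    vec = []
    for residue in seq:
        vec.extend(1 if residue == a else 0 for a in AMINO_ACIDS)
    return vec


def cksaap(seq, k_max=2):
    """Composition of k-spaced amino acid pairs for gaps k = 0..k_max."""
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(pairs, 0)
        total = len(seq) - k - 1  # number of residue pairs separated by k positions
        for i in range(total):
            pair = seq[i] + seq[i + k + 1]
            if pair in counts:
                counts[pair] += 1
        features.extend(counts[p] / total if total > 0 else 0.0 for p in pairs)
    return features


def extract_features(seq, k_max=2):
    """Concatenate AAC, BE, and CKSAAP into one feature vector (hypothetical helper)."""
    return aac(seq) + binary_encoding(seq) + cksaap(seq, k_max)


# Toy window for illustration only; real windows are centred on a lysine (K).
vector = extract_features("AKAKA")
print(len(vector))  # 20 (AAC) + 5 * 20 (BE) + 3 * 400 (CKSAAP)
```

In a pipeline like the one described, a vector of this kind would then go through class balancing and dimensionality reduction before being encoded for the BERT-based model; those stages are omitted here because the abstract does not give their settings.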


Submitting Author's Institution

The School of Information Science and Engineering, Shandong University, Qingdao 266237, China.

Submitting Author's Country

  • China