A Comprehensive Exploration of Pre-training Language Models
Recently, the development of pre-trained language models has brought natural language processing (NLP) tasks to a new state of the art. In this paper we explore the efficiency of various pre-trained language models. We pre-train a list of transformer-based models with the same amount of text and the same number of training steps. The experimental results show that the largest improvement over the original BERT comes from adding an RNN layer to capture more contextual information for short-text understanding.
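As a rough illustration of the kind of modification the abstract describes, the sketch below stacks a bidirectional LSTM on top of a BERT encoder's token representations before classification. This is only a minimal sketch of one plausible realization, not the authors' implementation; the class name `BertWithRNN`, the pooling strategy, and the use of the Hugging Face `transformers` library are all assumptions.

```python
# Minimal sketch (assumed, not the paper's code): BERT with an extra RNN layer
# over its token representations for short-text classification.
import torch
import torch.nn as nn
from transformers import BertModel  # assumes the Hugging Face transformers library


class BertWithRNN(nn.Module):  # hypothetical name for illustration
    def __init__(self, model_name: str = "bert-base-uncased",
                 rnn_hidden: int = 256, num_labels: int = 2):
        super().__init__()
        self.bert = BertModel.from_pretrained(model_name)
        hidden_size = self.bert.config.hidden_size
        # Bidirectional LSTM over the sequence of BERT token embeddings,
        # intended to capture additional contextual information.
        self.rnn = nn.LSTM(hidden_size, rnn_hidden,
                           batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * rnn_hidden, num_labels)

    def forward(self, input_ids, attention_mask):
        outputs = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        token_states = outputs.last_hidden_state        # (batch, seq_len, hidden)
        rnn_out, _ = self.rnn(token_states)             # (batch, seq_len, 2*rnn_hidden)
        # Mean-pool the RNN outputs over non-padding tokens, then classify.
        mask = attention_mask.unsqueeze(-1).float()
        pooled = (rnn_out * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1.0)
        return self.classifier(pooled)
```

In this sketch the RNN sits between the transformer encoder and the task head, so it can be added to a standard BERT fine-tuning pipeline without changing the pre-trained weights; the actual placement and pooling used in the paper may differ.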