TechRxiv

A Comprehensive Comparison of Pre-training Language Models

preprint
posted on 2023-07-30, 12:36 authored by Tong Guo

Recently, the development of pre-trained language models has brought natural language processing (NLP) tasks to a new state of the art. In this paper, we explore the efficiency of various pre-trained language models. We pre-train a set of transformer-based models on the same amount of text and for the same number of training steps. The experimental results show that the largest improvement over the original BERT comes from adding an RNN layer to capture more contextual information for short text understanding. The overall conclusion, however, is that BERT-like architectures bring no remarkable improvement for short text understanding, while a data-centric method [12] can achieve better performance.
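The best-performing variant mentioned in the abstract adds an RNN layer on top of BERT for short text understanding. The sketch below shows one plausible form of that architecture under a PyTorch / Hugging Face Transformers setup: a bidirectional LSTM over BERT's token-level hidden states, followed by a linear classifier. The checkpoint name, hidden sizes, and pooling at the [CLS] position are illustrative assumptions, not the paper's released implementation.

import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer


class BertWithRNN(nn.Module):
    """BERT encoder with a bidirectional LSTM over its hidden states (illustrative sketch)."""

    def __init__(self, num_labels: int = 2, rnn_hidden: int = 256):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")  # assumed checkpoint
        # Bidirectional LSTM over the sequence of BERT hidden states,
        # intended to capture additional contextual information.
        self.rnn = nn.LSTM(
            input_size=self.bert.config.hidden_size,
            hidden_size=rnn_hidden,
            batch_first=True,
            bidirectional=True,
        )
        self.classifier = nn.Linear(2 * rnn_hidden, num_labels)

    def forward(self, input_ids, attention_mask):
        # (batch, seq_len, hidden_size) token representations from BERT
        hidden = self.bert(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
        rnn_out, _ = self.rnn(hidden)   # (batch, seq_len, 2 * rnn_hidden)
        pooled = rnn_out[:, 0, :]       # pool at the [CLS] position (an assumed choice)
        return self.classifier(pooled)


if __name__ == "__main__":
    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertWithRNN(num_labels=2)
    batch = tokenizer(["a short text example"], return_tensors="pt", padding=True)
    logits = model(batch["input_ids"], batch["attention_mask"])
    print(logits.shape)  # torch.Size([1, 2])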

History

Email Address of Submitting Author: 779222056@qq.com

Submitting Author's Institution: _

Submitting Author's Country: China
