
A Survey of Backpropagation-free Training for LLMs
Hanzi Mei, Dongqi Cai, Yaozong Wu, Shangguang Wang, Mengwei Xu
Beijing University of Posts and Telecommunications (BUPT)

Corresponding Author: [email protected]

Abstract

Large language models (LLMs) have achieved remarkable performance on various downstream tasks. However, training LLMs is computationally expensive and requires a large amount of memory. To address this issue, backpropagation-free (BP-free) training has been proposed as a promising approach to reduce the computational and memory costs of training LLMs. In this survey, we provide a comprehensive overview of BP-free training for LLMs. We first outline three mainstream BP-free training methods, and then introduce their optimizations for LLMs. The goal of this survey is to provide a thorough understanding of BP-free training for LLMs and to inspire future research in this area.
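To make the memory argument concrete, the sketch below shows one representative BP-free technique, a perturbation-based (SPSA-style) zeroth-order update that estimates the gradient from two forward passes and therefore never stores activations for a backward pass. Whether this particular method is among the three surveyed here is an assumption; the function names, `loss_fn`, `batch`, and the hyperparameters are illustrative only.

```python
# Minimal sketch of a zeroth-order (SPSA-style) BP-free update in PyTorch.
# Assumption: this is only an illustration of the general idea, not the
# specific algorithms covered by this survey.
import torch

def spsa_step(model, loss_fn, batch, lr=1e-6, eps=1e-3, seed=0):
    """One BP-free update: estimate a directional gradient from two forward
    passes under a shared random perturbation, then move against it."""
    params = [p for p in model.parameters() if p.requires_grad]

    def perturb(scale):
        # Re-seed so the same perturbation z can be regenerated instead of stored.
        torch.manual_seed(seed)
        for p in params:
            p.data.add_(scale * eps * torch.randn_like(p))

    with torch.no_grad():
        perturb(+1.0)                       # theta + eps * z
        loss_plus = loss_fn(model, batch)
        perturb(-2.0)                       # theta - eps * z
        loss_minus = loss_fn(model, batch)
        perturb(+1.0)                       # restore original parameters

        grad_scale = (loss_plus - loss_minus) / (2 * eps)
        torch.manual_seed(seed)             # regenerate the same z for the update
        for p in params:
            p.data.add_(-lr * grad_scale * torch.randn_like(p))

    return loss_plus.item(), loss_minus.item()
```

Because the perturbation is regenerated from a seed rather than stored, the memory footprint stays close to that of inference, which is the main appeal of such forward-only methods for large models.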
Submitted to TechRxiv: 19 Mar 2024
Published in TechRxiv: 29 Mar 2024