Large Language Models: Training, Challenges, Applications, and Development
Kazem Taghandiki
Faculty Member, Department of Computer Engineering, Technical and Vocational University (TVU)

Mohammad Mohammadi
Faculty Member, Department of Computer Engineering, Technical and Vocational University (TVU)

Abstract

This paper examines large language models (LLMs), powerful artificial intelligence tools that enable computers to understand and use human language at scale. LLMs such as GPT-3 perform well across a wide range of linguistic tasks, from writing and language translation to customer service. Despite their effectiveness, these models have notable drawbacks: they often exhibit biases, their inner workings are opaque, and they consume substantial energy. Tracing the evolution of LLMs from rudimentary models to sophisticated systems capable of intricate language processing, this paper explores their learning mechanisms, diverse applications, and associated concerns, including the need for fairness and privacy preservation. Looking to the future, it advocates research aimed at improving the environmental sustainability, interpretability, and impartiality of LLMs. The overarching objective is to harness these models effectively and ethically, mindful of their transformative potential across technological and societal landscapes. The authors reviewed more than 40 articles over a span of two months to become thoroughly acquainted with the subject matter and to present large language models to interested readers as effectively as possible.
25 May 2024: Submitted to TechRxiv
03 Jun 2024: Published in TechRxiv