
A Comparative Analysis of Large Language Models to Evaluate Robustness and Reliability in Adversarial Conditions
  • Takeshi Goto,
  • Kensuke Ono,
  • Akira Morita
Corresponding Author: Takeshi Goto, [email protected]


This study conducted a comprehensive evaluation of four prominent Large Language Models (LLMs): Google Gemini, Mistral 8x7B, ChatGPT-4, and Microsoft Phi-1.5, to assess their robustness and reliability under a variety of adversarial conditions. Using the Microsoft PromptBench dataset, the research investigates each model's performance against syntactic manipulations, semantic alterations, and contextually misleading cues. The findings reveal notable differences in model resilience, highlighting the distinct strengths and weaknesses of each LLM in responding to adversarial challenges. The comparative analysis underscores the necessity of multifaceted evaluation approaches for enhancing model resilience, and suggests future research directions involving the augmentation of training datasets with adversarial examples and the exploration of advanced natural language understanding algorithms. This study contributes to the ongoing discourse in LLM research by providing insights into model vulnerabilities and advocating comprehensive strategies to bolster LLM robustness against the evolving landscape of adversarial threats.
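The evaluation methodology described above — perturbing prompts syntactically and measuring whether a model's answers stay correct — can be sketched in a few lines. The snippet below is an illustrative sketch only, not the authors' actual pipeline or the PromptBench API: `perturb_typo` applies a simple character-swap perturbation (one kind of syntactic manipulation), and `robustness_score` (a hypothetical helper name) reports the fraction of perturbed prompts on which a model still matches the reference answer.

```python
import random


def perturb_typo(prompt: str, rate: float = 0.1, seed: int = 0) -> str:
    """Syntactic perturbation: randomly swap adjacent letters at the given rate.

    A toy stand-in for character-level adversarial attacks; deterministic
    for a fixed seed so evaluations are reproducible.
    """
    rng = random.Random(seed)
    chars = list(prompt)
    for i in range(len(chars) - 1):
        if chars[i].isalpha() and chars[i + 1].isalpha() and rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def robustness_score(model_answer, perturbed_prompts, reference_answers) -> float:
    """Fraction of perturbed prompts for which the model still answers correctly.

    `model_answer` is any callable mapping a prompt string to an answer string;
    in a real evaluation it would wrap an LLM API call.
    """
    correct = sum(
        1
        for noisy, ref in zip(perturbed_prompts, reference_answers)
        if model_answer(noisy) == ref
    )
    return correct / len(perturbed_prompts)


if __name__ == "__main__":
    clean = ["What is the capital of France?", "Is water wet?"]
    refs = ["Paris", "Yes"]
    noisy = [perturb_typo(p, rate=0.3) for p in clean]
    # A stub "model" that ignores perturbations entirely scores 1.0.
    stub = {0: "Paris", 1: "Yes"}
    score = robustness_score(lambda p: refs[noisy.index(p)], noisy, refs)
    print(f"robustness: {score:.2f}")
```

In a real study each model under test would replace the stub callable, and the score would be compared across perturbation types (typos, paraphrases, misleading context) to produce the kind of per-model resilience profile the abstract describes.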
21 Mar 2024: Submitted to TechRxiv
29 Mar 2024: Published in TechRxiv