Stegomalware: A Systematic Survey of Malware Hiding and Detection in Images, Machine Learning Models and Research Challenges

Raj chaganti; vinayakumar R; Mamoun Alazab; Tuan Pham

doi:10.36227/techrxiv.16755457.v1

loading page

Stegomalware: A Systematic Survey of Malware Hiding and Detection in Images, Machine Learning Models and Research Challenges

Raj chaganti ,
vinayakumar R ,
Mamoun Alazab ,
Tuan Pham

Abstract

Malware distribution to the victim network is commonly performed through file attachments in phishing email or downloading illegitimate files from the internet, when the victim interacts with the source of infection. To detect and prevent the malware distribution in the victim machine, the existing end device security applications may leverage sophisticated techniques such as signature-based or anomaly-based, machine learning techniques. The well-known file formats Portable Executable (PE) for Windows and Executable and Linkable Format (ELF) for Linux based operating system are used for malware analysis and the malware detection capabilities of these files has been well advanced for real time detection. But the malware payload hiding in multimedia like cover images using steganography detection has been a challenge for enterprises, as these are rarely seen and usually act as a stager in sophisticated attacks. In this article, to our knowledge, we are the first to try to address the knowledge gap between the current progress in image steganography and steganalysis academic research focusing on data hiding and the review of the stegomalware (malware payload hiding in images) targeting enterprises with cyberattacks current status. We present the stegomalware history, generation tools, file format specification description. Based on our findings, we perform the detail review of the image steganography techniques including the recent Generative Adversarial Networks (GAN) based models and the image steganalysis methods including the Deep Learning opportunities and challenges in stegomalware generation and detection are presented based on our findings.