CloudStrike: Chaos Engineering for Security and Resiliency in Cloud Infrastructure

Abstract

Most cyber-attacks and data breaches in cloud infrastructure are due to human
errors and misconfiguration vulnerabilities. Cloud customer-centric tools are
lacking, and existing security models do not efficiently tackle these security
challenges. Novel security mechanisms are therefore imperative, and we propose
Risk-driven Fault Injection (RDFI) techniques to tackle these challenges. RDFI
applies the principles of chaos engineering to cloud security and leverages
feedback loops to execute, monitor, analyze and plan security fault injection
campaigns, based on a knowledge-base. The knowledge-base consists of fault
models designed from cloud security best practices and observations derived
during iterative fault injection campaigns. Furthermore, the observations
indicate security weaknesses and verify the correctness of security attributes
(integrity, confidentiality and availability) and security controls. Ultimately,
this knowledge is critical in guiding security hardening efforts and risk
analysis. We have designed and implemented the RDFI strategies, including
various chaos algorithms, as a software tool: CloudStrike. Furthermore,
CloudStrike has been evaluated against infrastructure deployed on two major
public cloud systems: Amazon Web Services and Google Cloud Platform. Execution
time increases linearly with the attack rate, and CPU and memory consumption
rates are acceptable. Also, the analysis of vulnerabilities detected via
security fault injection has been used to harden the security of cloud
resources to demonstrate the value of CloudStrike. Therefore, we opine that our
approaches are suitable for overcoming contemporary cloud security issues.
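
The feedback loop described in the abstract (plan, execute, monitor, analyze, backed by a knowledge-base) can be illustrated with a minimal sketch. This is not CloudStrike's implementation: the names FaultModel, KnowledgeBase, security_control and run_campaign, the risk weights, and the dictionary standing in for a cloud resource are all assumptions made for illustration, and no real cloud API is called.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Simulated cloud resource configuration (an assumption, not a real cloud API).
Resource = Dict[str, object]


@dataclass
class FaultModel:
    name: str
    risk: float                         # hypothetical risk weight used for planning
    attribute: str                      # configuration key the fault tampers with
    inject: Callable[[Resource], None]  # mutates the simulated resource


@dataclass
class KnowledgeBase:
    fault_models: List[FaultModel]
    observations: List[dict] = field(default_factory=list)


def make_public(resource: Resource) -> None:
    resource["acl"] = "public-read"


def disable_logging(resource: Resource) -> None:
    resource["logging"] = False


def security_control(resource: Resource) -> List[str]:
    """Toy control under test: flags public ACLs but ignores logging changes."""
    findings = []
    if resource["acl"] != "private":
        findings.append("acl")
    return findings


def run_campaign(kb: KnowledgeBase, baseline: Resource, rounds: int) -> List[dict]:
    for _ in range(rounds):
        # Plan: pick the highest-risk fault model not yet observed.
        tried = {obs["fault"] for obs in kb.observations}
        candidates = [f for f in kb.fault_models if f.name not in tried]
        if not candidates:
            break
        fault = max(candidates, key=lambda f: f.risk)

        # Execute: inject the fault into a copy so the baseline stays intact.
        resource = dict(baseline)
        fault.inject(resource)

        # Monitor: ask the security control what it noticed.
        findings = security_control(resource)

        # Analyze: record whether the control detected the injected fault.
        kb.observations.append(
            {"fault": fault.name, "detected": fault.attribute in findings}
        )
    return kb.observations


baseline = {"acl": "private", "logging": True}
kb = KnowledgeBase(
    fault_models=[
        FaultModel("logging_off", 0.6, "logging", disable_logging),
        FaultModel("public_acl", 0.9, "acl", make_public),
    ]
)
observations = run_campaign(kb, baseline, rounds=5)
for obs in observations:
    print(obs)
```

In this toy run, an undetected fault (here, the disabled logging) is the kind of observation that would feed back into the knowledge-base and point at a control that needs hardening.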

Published in IEEE Access, volume 8, pages 123044-123060, 2020. DOI: 10.1109/ACCESS.2020.3007338