Abstract
Contribution
We present initial algorithms and a framework for assisted root cause
analysis, built on a modular architecture with collection,
identification, analysis, and presentation steps. The framework
automatically creates pre-structured data from vast heterogeneous
datasets, enriches the data with additional information from the CI
system, and adds fine-grained default and user-defined labels that
support the root cause analysis of failures.
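The four steps named above can be pictured as a small pipeline. The
following is only a minimal sketch under stated assumptions, not the
framework's actual implementation: the JobRun record, the regex-based
label rules, and the function names collect, identify, analyze, and
present are all hypothetical and chosen purely for illustration.

```python
import re
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class JobRun:
    """One CI job run; every field name here is hypothetical."""
    job: str
    log: str
    metadata: dict = field(default_factory=dict)  # enrichment from the CI system
    labels: set = field(default_factory=set)


# Hypothetical default labels: label name -> regex applied to the raw log.
DEFAULT_RULES = {
    "timeout": r"timed out|deadline exceeded",
    "oom": r"out of memory|OOMKilled",
}


def collect(raw_runs, ci_metadata):
    """Collection: wrap raw records and enrich them with CI metadata."""
    return [
        JobRun(job=r["job"], log=r["log"], metadata=ci_metadata.get(r["job"], {}))
        for r in raw_runs
    ]


def identify(runs, user_rules=None):
    """Identification: attach default and user-defined labels."""
    rules = {**DEFAULT_RULES, **(user_rules or {})}
    for run in runs:
        for label, pattern in rules.items():
            if re.search(pattern, run.log, re.IGNORECASE):
                run.labels.add(label)
    return runs


def analyze(runs):
    """Analysis: count how often each label occurs across runs."""
    return Counter(label for run in runs for label in run.labels)


def present(counts):
    """Presentation: render label counts, most frequent first."""
    return [f"{label}: {n}" for label, n in counts.most_common()]
```

Chaining the steps as present(analyze(identify(collect(raw, meta),
user_rules))) yields one summary line per label. Keeping each step a
separate function mirrors the modular architecture, so that a single
step could be swapped out without touching the others.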
Background
Projects spanning hundreds of thousands of lines of code and several
thousand daily continuous integration workflows cannot rely on manual
prelabeling and qualitative interviews to meaningfully improve broken
CI job runs.
Evaluation
We evaluated our approach by measuring manual root cause analysis times
over several CI jobs. The data we used is publicly available via the
Kubernetes and OpenShift projects, allowing other researchers to
reproduce and extend our work.