Dimensional Outlier Detection
  • Jiawei Yang,
  • Sylwan Rahardja,
  • Susanto Rahardja
Corresponding author: Jiawei Yang

Abstract

Few outlier detectors address all industrial standards for time complexity, space complexity, accuracy, interpretability, and scalability together, which is very challenging in big data applications. To address these challenges, we propose a framework for dimensional outlier detection (FDOD) that detects outliers that are dimensionally separable. Based on FDOD, two detectors were derived. On 18 real-world datasets, both outperformed all 22 state-of-the-art baseline detectors, including 10 detectors published in the past three years. Compared with existing standards, one proposed detector improved the area under the receiver operating characteristic curve (ROC AUC) by around 4% while requiring only 42% of the computational time and 13% of the memory usage; the other proposed detector performed even better. The proposed techniques are suitable for deployment in mobile settings that require lightweight models. This study also raises the question of whether technically complex solutions are really needed for outlier detection in big data applications, since complex solutions normally require more computational resources. In addition, we advocate for the release of larger outlier detection datasets from industry to support the development of outlier detection in big data. For reproducibility, the implementation of the proposed methods is available at www.OutlierNet.com.
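As a reading aid for the ROC AUC figures quoted in the abstract, the following is a minimal sketch of how that metric can be computed from outlier scores. It is not the authors' evaluation code: the function name `roc_auc`, the label convention (1 for outlier, 0 for inlier), and the toy data are assumptions made for illustration. It uses the standard pairwise (Mann-Whitney) formulation, in which ROC AUC equals the probability that a randomly chosen outlier receives a higher score than a randomly chosen inlier, with ties counted as one half.

```python
def roc_auc(labels, scores):
    """Pairwise ROC AUC for outlier scores (hypothetical helper).

    labels: 1 for outlier, 0 for inlier (assumed convention).
    scores: higher means "more outlying".
    """
    outlier_scores = [s for y, s in zip(labels, scores) if y == 1]
    inlier_scores = [s for y, s in zip(labels, scores) if y == 0]
    if not outlier_scores or not inlier_scores:
        raise ValueError("ROC AUC needs at least one outlier and one inlier")

    # Count outlier/inlier pairs ranked correctly; a tie counts as half.
    wins = 0.0
    for o in outlier_scores:
        for i in inlier_scores:
            if o > i:
                wins += 1.0
            elif o == i:
                wins += 0.5
    return wins / (len(outlier_scores) * len(inlier_scores))


# Toy data, invented for illustration only.
toy_labels = [0, 0, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80, 0.30]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"ROC AUC = {toy_auc:.4f}")
```

The double loop is quadratic in the number of samples, so it only suits small examples; a sort-based rank computation gives the same value in O(n log n) time on large datasets.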
19 Mar 2024: Submitted to TechRxiv
29 Mar 2024: Published in TechRxiv