
Data-Driven Insights: Boosting Algorithms to Uncover Electricity Theft Patterns in AMI
  • Inam Ullah Khan,
  • Arshid Ali,
  • C James Taylor,
  • Xiandong Ma
Inam Ullah Khan
Technology Hub, Computer Science Department, Edge Hill University

Corresponding Author: [email protected]

Arshid Ali
Department of Electrical Engineering and Computer Science, South Dakota State University
C James Taylor
School of Engineering, Lancaster University
Xiandong Ma
School of Engineering, Lancaster University


This study introduces a supervised machine learning method for electricity theft detection based on a customized Histogram Gradient Boosting (HGB) algorithm. The time-series data is first preprocessed through imputation, normalization, outlier management, and resampling. The SMOTE-ENN algorithm then corrects class imbalance ahead of the feature optimization stage, in which key features are selected and extracted. The HGB algorithm, enhanced through Bayesian optimization, is central to the training process and yields a model that classifies electricity consumption patterns as genuine or fraudulent. The model is benchmarked against other established boosting methods, such as Adaptive Boosting (ADB), Gradient Boosting Decision Tree (GBDT), and LightGBM, as well as various ensemble and traditional machine learning models. Validated with accuracy, F1 score, and AUC, the proposed model achieves 93% accuracy, a 95% F1 score, and 98% AUC, outperforming the comparison group under similar dataset and hyperparameter conditions. These results indicate the model's potential as a highly accurate tool for combating electricity theft within an advanced metering infrastructure (AMI).
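As a rough illustration of the three validation metrics named in the abstract, the following minimal Python sketch computes accuracy, F1 score, and AUC for a binary classifier using only the standard library. It is not the authors' evaluation code: the function names, the toy labels and scores, the 0.5 decision threshold, and the convention that label 1 means "fraudulent" are all assumptions made for this example.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the positive class (1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random
    negative (ties count as half). O(n_pos * n_neg), fine for a sketch."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (assumed, not from the study): 1 = fraudulent, 0 = genuine.
y_true = [0, 0, 0, 0, 1, 1, 1, 0]
scores = [0.1, 0.4, 0.35, 0.8, 0.9, 0.7, 0.3, 0.2]
# Assumed decision threshold of 0.5 to turn scores into hard labels.
y_pred = [int(s >= 0.5) for s in scores]

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"F1 score = {f1_score(y_true, y_pred):.3f}")
print(f"AUC      = {roc_auc(y_true, scores):.3f}")
```

Accuracy and F1 depend on the chosen threshold, whereas AUC is computed from the raw scores and is threshold-independent, which is one reason the three metrics are commonly reported together on imbalanced data.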
13 Mar 2024: Submitted to TechRxiv
19 Mar 2024: Published in TechRxiv