
Enhancing Electricity Theft Detection through K-Nearest Neighbors and Logistic Regression Algorithms with Synthetic Minority Oversampling Technique: A Case Study on State Electricity Company (PLN) Customer Data

Yan Maraden, Gunawan Wibisono, I Gde Dharma Nugraha, Budi Sudiarto, Fauzan Hanif Jufri, Kazutaka and Anton Satria Prabuwono
Additional contact information
Yan Maraden: Department of Electrical Engineering, Universitas Indonesia, Depok 16424, Indonesia
Gunawan Wibisono: Department of Electrical Engineering, Universitas Indonesia, Depok 16424, Indonesia
I Gde Dharma Nugraha: Department of Electrical Engineering, Universitas Indonesia, Depok 16424, Indonesia
Budi Sudiarto: Department of Electrical Engineering, Universitas Indonesia, Depok 16424, Indonesia
Fauzan Hanif Jufri: Department of Electrical Engineering, Universitas Indonesia, Depok 16424, Indonesia
Kazutaka: Department of Electrical Engineering, Universitas Indonesia, Depok 16424, Indonesia
Anton Satria Prabuwono: Faculty of Computing and Information Technology in Rabigh, King Abdulaziz University, Rabigh 21911, Saudi Arabia

Energies, 2023, vol. 16, issue 14, 1-24

Abstract: Electricity theft causes massive losses and damage for electricity utilities. The damage degrades the quality of the electricity supply and increases the generation load. The losses fall not only on the utilities but also on legitimate users, who end up paying excessive electricity bills. A method for detecting electricity theft is therefore indispensable. Recently, machine learning algorithms have been used to develop models for detecting electricity theft. However, most of these algorithms suffer from imbalanced data, overfitting, and a lack of data. This paper therefore proposes a solution that applies the synthetic minority oversampling technique (SMOTE) to the imbalanced dataset in order to address these problems and increase the accuracy of the developed model. The proposed method begins with a pre-processing step that removes empty values and extracts several parameters; oversampling is then performed on the pre-processed data. On the state electricity company (PLN) customer data, logistic regression combined with SMOTE gives the best results among the developed electricity theft detection models, with a training accuracy of 98.97%, a testing accuracy of 98.7%, a precision of 95%, a recall of 99%, and an F1-score of 97%. The experiment also shows that the proposed solution outperforms existing methods.
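
The oversampling step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a plain-Python rendering of the general SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbours), and the function name `smote`, its parameters, and the two toy features are assumptions made only for illustration.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by interpolation (SMOTE idea).

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        # Choose one of the k nearest neighbours at random.
        neighbour = minority[rng.choice(others[:k])]
        # Place the new point somewhere on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Assumed toy features per customer: (mean monthly kWh, std of monthly kWh).
normal = [[300.0, 20.0], [320.0, 25.0], [310.0, 18.0],
          [295.0, 22.0], [305.0, 21.0], [315.0, 19.0]]
theft = [[120.0, 60.0], [100.0, 75.0], [140.0, 55.0]]

# Oversample the theft class until both classes have the same size.
synthetic = smote(theft, n_new=len(normal) - len(theft), k=2)
balanced_theft = theft + synthetic
```

After this step the two classes are the same size, and a classifier such as logistic regression or k-nearest neighbors would be trained on `normal` plus `balanced_theft`. In practice oversampling is normally applied to the training split only, so that synthetic points do not leak into the test data.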

Keywords: machine learning; k-nearest neighbors; logistic regression; anomalies detection; electricity theft
JEL-codes: Q Q0 Q4 Q40 Q41 Q42 Q43 Q47 Q48 Q49
Date: 2023

Downloads: (external link)
https://www.mdpi.com/1996-1073/16/14/5405/pdf (application/pdf)
https://www.mdpi.com/1996-1073/16/14/5405/ (text/html)


Persistent link: https://EconPapers.repec.org/RePEc:gam:jeners:v:16:y:2023:i:14:p:5405-:d:1195106

Energies is currently edited by Ms. Agatha Cao

Bibliographic data for series maintained by MDPI Indexing Manager.

 
Page updated 2025-03-19
Handle: RePEc:gam:jeners:v:16:y:2023:i:14:p:5405-:d:1195106