Probabilistic Forecasting Based Joint Detection and Imputation of Clustered Bad Data in Residential Electricity Loads

Soyeong Park, Seungwook Yoon, Byungtak Lee, Seokkap Ko and Euiseok Hwang
Additional contact information
Soyeong Park: School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), 123 Cheomdangwagi-ro, Buk-gu, Gwangju 61005, Korea
Seungwook Yoon: School of Mechatronics, GIST, 123 Cheomdangwagi-ro, Buk-gu, Gwangju 61005, Korea
Byungtak Lee: Honam Research Center, Electronics and Telecommunications Research Institute, Gwangju 61012, Korea
Seokkap Ko: Honam Research Center, Electronics and Telecommunications Research Institute, Gwangju 61012, Korea
Euiseok Hwang: School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), 123 Cheomdangwagi-ro, Buk-gu, Gwangju 61005, Korea

Energies, 2020, vol. 14, issue 1, 1-13

Abstract: Residential electricity load data can include numerous types of bad data, including clustered bad data, because they are typically captured by simple measurement instruments. For example, when a time series contains a run of Not-a-Number (NaN) errors, the value just before or after the run may appear as the sum of the actual values over the NaN period. To use load data containing such errors for prediction or data mining, customized detection and imputation are required. This study proposes a new joint detection and imputation method for handling clustered bad data in residential electricity loads. Examples of such data are known invalid data points, such as consecutive NaN or zero values, that are followed or preceded by an outlier. The proposed scheme first examines the neighbors of the invalid data points using probabilistic forecasting techniques, which are applied to the next valid neighbors to determine whether they are anomalous. Adaptive imputation is then applied on the basis of the detection result, which decides whether the candidate point should be imputed together with the invalid points. To assess the potential of the proposed scheme to characterize clustered bad data, we analyzed the electricity loads of 354 households. Moreover, joint detection and imputation are tested on randomly injected synthetic clustered bad data, consisting of NaN runs of various lengths followed by the sum of the actual values over the NaN period. The proposed scheme detected clustered bad data with an accuracy of 95.5% and a false alarm rate of 3.6% across all households in the dataset. Outlier detection-assisted imputation schemes are evaluated for NaNs with optional outliers. The results demonstrate that these schemes significantly improve overall accuracy compared with schemes without outlier detection.
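
The detect-then-impute idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several pieces are assumptions made for illustration: the function names (`find_nan_runs`, `empirical_quantile`, `detect_and_impute`) are hypothetical, an empirical upper quantile of past loads stands in for the probabilistic forecast, only the valid neighbor after a NaN run is checked, a flagged value is spread evenly over the run, and unflagged runs are filled by linear interpolation.

```python
import math


def find_nan_runs(load):
    """Return (start, end) index pairs, end exclusive, for each run of NaNs."""
    runs, start = [], None
    for i, value in enumerate(load):
        if math.isnan(value):
            if start is None:
                start = i
        elif start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(load)))
    return runs


def empirical_quantile(values, q):
    """Nearest-rank quantile of the values, for q in (0, 1]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


def detect_and_impute(load, history, upper_q=0.99):
    """Flag the valid neighbor after each NaN run and impute accordingly.

    Assumption: the upper quantile of `history` replaces the upper bound
    of a probabilistic forecast for the neighbor.
    """
    out = list(load)
    flags = []
    bound = empirical_quantile(history, upper_q)
    for start, end in find_nan_runs(load):
        if end >= len(load):
            continue  # no valid neighbor after the run, nothing to test
        neighbor = load[end]
        is_outlier = neighbor > bound
        flags.append((start, end, is_outlier))
        if is_outlier:
            # Assumption: treat the neighbor as the accumulated sum and
            # spread it evenly over the NaN run and the neighbor itself.
            share = neighbor / (end - start + 1)
            for i in range(start, end + 1):
                out[i] = share
        else:
            # Neighbor looks normal: fill only the NaNs by linear
            # interpolation between the valid values on either side.
            left = load[start - 1] if start > 0 else neighbor
            span = end - start + 1
            for k, i in enumerate(range(start, end), start=1):
                out[i] = left + (neighbor - left) * k / span
    return out, flags
```

Calling `detect_and_impute(load, history)` returns the imputed series together with one `(start, end, is_outlier)` tuple per NaN run. Spreading a flagged value evenly preserves the total energy over the affected interval, which matches the abstract's description of the outlier as a sum of the missing values; a profile-weighted split would be a natural refinement.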

Keywords: bad data detection; probabilistic forecasting; residential electricity load
JEL-codes: Q Q0 Q4 Q40 Q41 Q42 Q43 Q47 Q48 Q49
Date: 2020

Downloads: (external link)
https://www.mdpi.com/1996-1073/14/1/165/pdf (application/pdf)
https://www.mdpi.com/1996-1073/14/1/165/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jeners:v:14:y:2020:i:1:p:165-:d:472594

Energies is currently edited by Ms. Agatha Cao

Bibliographic data for series maintained by MDPI Indexing Manager.

 
Page updated 2025-03-19
Handle: RePEc:gam:jeners:v:14:y:2020:i:1:p:165-:d:472594