IMPLEMENTATION OF K-NEAREST NEIGHBORS WITH CROSS-VALIDATION AND EUCLIDEAN DISTANCE FOR ELECTRICITY MISUSE PREDICTION AT PLN UP3 DEMAK
DOI:
https://doi.org/10.30659/
Abstract
Electricity misuse is a critical issue that severely impacts both operational efficiency and revenue within utility companies, particularly in developing regions. This study presents the implementation of the K-Nearest Neighbors (K-NN) algorithm with cross-validation and the Euclidean distance metric to predict electricity misuse in the PLN UP3 Demak area. The analysis focuses on the P2 and P3 customer segments, known for their diverse consumption patterns and higher risk of fraudulent activities. Given the inherent class imbalance in the dataset, where instances of misuse are significantly outnumbered by legitimate consumption, the Synthetic Minority Over-sampling Technique (SMOTE) was applied to improve the model’s ability to detect minority-class instances.
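As a rough illustration of the approach described above, the following minimal sketch classifies a customer by majority vote among its k nearest neighbors under Euclidean distance and scores the classifier with k-fold cross-validation. The feature columns, the toy values, and the function names (`euclidean`, `knn_predict`, `cross_validate`) are assumptions made for illustration only; they are not the study's data or code.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train_X, train_y, query, k=3):
    """Majority vote among the k training points closest to `query`."""
    order = sorted(range(len(train_X)),
                   key=lambda i: euclidean(train_X[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def cross_validate(X, y, k=3, folds=5):
    """Per-fold accuracy; sample i is held out in fold i % folds."""
    scores = []
    for f in range(folds):
        test_idx = [i for i in range(len(X)) if i % folds == f]
        train_idx = [i for i in range(len(X)) if i % folds != f]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct = sum(knn_predict(train_X, train_y, X[i], k) == y[i]
                      for i in test_idx)
        scores.append(correct / len(test_idx))
    return scores

# Hypothetical toy data: [monthly kWh, contracted power in kVA].
# Label 0 = normal consumption, 1 = suspected misuse.
X = [
    [100.0, 1.3], [110.0, 1.3], [105.0, 0.9], [95.0, 0.9], [120.0, 1.3],
    [400.0, 1.3], [420.0, 0.9], [410.0, 1.3], [430.0, 0.9], [415.0, 1.3],
]
y = [0] * 5 + [1] * 5

print(cross_validate(X, y, k=3, folds=5))
```

Because Euclidean distance is sensitive to feature scale, a real pipeline would normally normalize the features first; that step is left out here for brevity.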
Our findings reveal that applying SMOTE resulted in a substantial increase in the model’s accuracy, precision, and recall, demonstrating its effectiveness in balancing the dataset. Specifically, K-NN with SMOTE showed improved detection of irregular consumption patterns indicative of electricity misuse, which were less discernible in the original imbalanced dataset. The comparative analysis between models trained with and without SMOTE underscored the importance of addressing class imbalance to achieve reliable predictive performance. Furthermore, the study identified distinct behavioral patterns in P2 and P3 customers, which are critical for early detection of potential misuse. These findings were supported by the cross-validation results, which confirmed the model’s robustness and its capability to generalize well to unseen data.
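To make the balancing step concrete, here is a minimal sketch of the core SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbors. The function name `smote`, its parameters, and the toy values are hypothetical and chosen only for illustration; the study's actual settings are not stated in the abstract.

```python
import math
import random

def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbors of `base`, excluding base itself
        neighbors = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbors)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic

# Hypothetical toy data: [monthly kWh, contracted power in kVA].
normal = [[100.0, 1.3], [110.0, 1.3], [105.0, 0.9], [95.0, 0.9],
          [120.0, 1.3], [102.0, 1.3], [98.0, 0.9], [115.0, 1.3]]
misuse = [[400.0, 1.3], [420.0, 0.9], [410.0, 1.1]]

# Generate enough synthetic misuse cases to match the majority class size.
synthetic = smote(misuse, n_new=len(normal) - len(misuse))
balanced_misuse = misuse + synthetic
print(len(normal), len(balanced_misuse))
```

When SMOTE is combined with cross-validation, oversampling is usually applied to the training folds only, so that synthetic points derived from held-out samples do not leak into the evaluation.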
Overall, this research provides valuable insights for utility companies, highlighting the importance of implementing advanced machine learning techniques such as K-NN with SMOTE to enhance fraud detection capabilities. The study also emphasizes the necessity of continuous monitoring and analysis of customer consumption patterns to proactively identify and mitigate potential misuse.
License
Copyright (c) 2025 retno supiyanti (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.
License. Articles in PULSE—JEIB are published open access under the Creative Commons Attribution 4.0 International (CC BY 4.0) license. This permits use, sharing, adaptation, distribution, and reproduction in any medium or format, as long as appropriate credit is given to the original authors and the source, a link to the license is provided, and changes are indicated.
Copyright. Authors retain copyright and grant the journal the right of first publication. The publisher may archive and index the published version and its metadata in trusted services (e.g., Crossref, PKP PN/LOCKSS/CLOCKSS).
Fees. APC: no fee (no submission or processing charges).