Flight delay prediction: Evaluating machine learning algorithms for enhanced accuracy
Sarah Ahmed A AlBassam and Dhafir N AlShahrani
PLOS ONE, 2025, vol. 20, issue 12, 1-17
Abstract:
Flight delays pose substantial operational and economic challenges for airlines, directly affecting scheduling efficiency, resource allocation, and passenger satisfaction. Accurate prediction of arrival delays is therefore critical for optimizing airline operations and enhancing customer experience. This study systematically evaluates the predictive performance of six machine learning classifiers (Decision Tree, Random Forest, Support Vector Classifier (SVC), Logistic Regression, K-Nearest Neighbors (KNN), and Naive Bayes) on a comprehensive flight dataset, with particular attention to the challenges posed by class imbalance. To mitigate skewed class distributions, resampling techniques including Random Oversampling, Synthetic Minority Oversampling Technique (SMOTE), and Adaptive Synthetic Sampling (ADASYN) were applied to the training data. Model performance was assessed using stratified 10-fold cross-validation and further validated on a hold-out test set, employing multiple evaluation metrics: Accuracy, F1-score, Matthews Correlation Coefficient (MCC), and ROC-AUC. The results demonstrate that Random Forest combined with Random Oversampling and Decision Tree combined with SMOTE both achieved the highest predictive performance (accuracy 0.90, F1-score 0.90, MCC 0.73, ROC-AUC 0.87). Notably, simpler models such as Naive Bayes exhibited competitive results under balanced conditions, underscoring the continued relevance of probabilistic classifiers in certain operational contexts. These findings highlight the critical role of resampling strategies and rigorous cross-validation in developing reliable, high-performing predictive models for imbalanced flight delay datasets, offering actionable insights for both airline operations and data-driven decision-making.
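As an illustration of two ideas named in the abstract, the following minimal Python sketch shows random oversampling applied to training labels and the Matthews Correlation Coefficient computed from confusion-matrix counts. It is not the authors' code: the function names (`random_oversample`, `mcc`, `accuracy`), the toy labels, and the 8-to-2 class ratio are all assumptions chosen for illustration, and it uses only the Python standard library rather than whatever toolkit the study used.

```python
import math
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until every
    class has as many rows as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    by_class = {}
    for features, label in zip(X, y):
        by_class.setdefault(label, []).append(features)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for features in rows + extra:
            X_out.append(features)
            y_out.append(label)
    return X_out, y_out


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mcc(y_true, y_pred, positive=1):
    """Matthews Correlation Coefficient; defined as 0.0 when any
    confusion-matrix margin is empty (a common convention)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Toy training labels (assumed): 0 = on time, 1 = delayed.
X_train = [[i] for i in range(10)]
y_train = [0] * 8 + [1] * 2

# Oversample the training portion only; test data stays untouched.
X_bal, y_bal = random_oversample(X_train, y_train, seed=1)
print(Counter(y_bal))  # both classes now have 8 rows

# A classifier that always predicts "on time" looks good on accuracy
# but carries no information, which MCC exposes.
always_on_time = [0] * len(y_train)
print(accuracy(y_train, always_on_time))  # 0.8
print(mcc(y_train, always_on_time))       # 0.0
```

The toy comparison at the end suggests why the abstract reports MCC alongside accuracy: on skewed labels, a trivial majority-class predictor can reach high accuracy while its MCC stays at zero.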
Date: 2025
Downloads: (external link)
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0335141 (text/html)
https://journals.plos.org/plosone/article/file?id= ... 35141&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pone00:0335141
DOI: 10.1371/journal.pone.0335141