Detection of COVID‐19 Using Protein Sequence Data via Machine Learning Classification Approach
Siti Aminah,
Gianinna Ardaneswari,
Mufarrido Husnah,
Ghani Deori and
Handi Bagus Prasetyo
Journal of Applied Mathematics, 2023, vol. 2023, issue 1
Abstract:
The emergence of severe acute respiratory syndrome coronavirus 2 (SARS‐CoV‐2) in late 2019 led to the COVID‐19 pandemic and created a need for rapid, accurate detection of pathogens from protein sequence data. This study aims to develop an efficient classification model for coronavirus protein sequences, using machine learning algorithms and feature selection techniques, to aid the early detection and prediction of novel viruses. We used a dataset of 2000 protein sequences: 1000 SARS‐CoV‐2 sequences and 1000 non‐SARS‐CoV‐2 sequences. Feature extraction with the Discere package yielded 27 essential features representing the primary structural data. To optimize performance, we employed machine learning classification algorithms such as K‐nearest neighbor (KNN), XGBoost, and Naïve Bayes, along with feature selection techniques such as genetic algorithm (GA), LASSO, and support vector machine recursive feature elimination (SVM‐RFE). The SVM‐RFE+KNN model achieved a classification accuracy of 99.30%, a specificity of 99.52%, and a sensitivity of 99.55%. These results demonstrate the model’s efficacy in classifying coronavirus protein sequences. The resulting classification model supports early detection and prediction of protein sequences in SARS‐CoV‐2 and other coronaviruses, and could facilitate the development of targeted treatments and preventive strategies against future viral outbreaks.
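The three metrics reported in the abstract (accuracy, specificity, sensitivity) follow from the counts of a binary confusion matrix. The sketch below is illustrative only and is not the authors' code: the function name `classification_metrics`, the label encoding (1 = SARS‐CoV‐2, 0 = non‐SARS‐CoV‐2), and the toy labels are all assumptions made for the example.

```python
import math


def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, sensitivity, and specificity for binary labels.

    Hypothetical helper for illustration; `positive` marks the class
    treated as SARS-CoV-2 (an assumed encoding, not from the paper).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    def ratio(numerator, denominator):
        # Undefined when a class is absent from y_true; report NaN.
        return numerator / denominator if denominator else math.nan

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
    }


# Toy labels, made up for the example (not data from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

With a balanced dataset such as the 1000/1000 split described above, accuracy is the mean of sensitivity and specificity when all sequences are scored together; figures obtained under a different evaluation protocol (for example, averaged over folds or a held-out subset) need not satisfy that relation exactly.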
Date: 2023
DOI: https://doi.org/10.1155/2023/9991095
Persistent link: https://EconPapers.repec.org/RePEc:wly:jnljam:v:2023:y:2023:i:1:n:9991095
Publisher: John Wiley & Sons