COVID-19 Genome Sequence Analysis for New Variant Prediction and Generation
Amin Ullah,
Khalid Mahmood Malik,
Abdul Khader Jilani Saudagar (),
Muhammad Badruddin Khan,
Mozaherul Hoque Abul Hasanat,
Abdullah AlTameem,
Mohammed AlKhathami and
Muhammad Sajjad ()
Additional contact information
Amin Ullah: CORIS Institute, Oregon State University, Corvallis, OR 97331, USA
Khalid Mahmood Malik: Department of Computer Science and Engineering, Oakland University, Rochester, MI 48309, USA
Abdul Khader Jilani Saudagar: Information Systems Department, College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 11564, Saudi Arabia
Muhammad Badruddin Khan: Information Systems Department, College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 11564, Saudi Arabia
Mozaherul Hoque Abul Hasanat: Information Systems Department, College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 11564, Saudi Arabia
Abdullah AlTameem: Information Systems Department, College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 11564, Saudi Arabia
Mohammed AlKhathami: Information Systems Department, College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 11564, Saudi Arabia
Muhammad Sajjad: Color and Visual Computing Lab, Department of Computer Science, Norwegian University of Science and Technology (NTNU), 2815 Gjøvik, Norway
Mathematics, 2022, vol. 10, issue 22, 1-16
Abstract:
The new COVID-19 variants of concern are causing more infections and spreading much faster than their predecessors. Recent cases show that even vaccinated people are highly affected by these new variants. The proactive nucleotide sequence prediction of possible new variants of COVID-19 and developing better healthcare plans to address their spread require a unified framework for variant classification and early prediction. This paper attempts to answer the following research questions: can a convolutional neural network with self-attention by extracting discriminative features from nucleotide sequences be used to classify COVID-19 variants? Second, is it possible to employ uncertainty calculation in the predicted probability distribution to predict new variants? Finally, can synthetic approaches such as variational autoencoder-decoder networks be employed to generate a synthetic new variant from random noise? Experimental results show that the generated sequence is significantly similar to the original coronavirus and its variants, proving that our neural network can learn the mutation patterns from the old variants. Moreover, to our knowledge, we are the first to collect data for all COVID-19 variants for computational analysis. The proposed framework is extensively evaluated for classification, new variant prediction, and new variant generation tasks and achieves better performance for all tasks. Our code, data, and trained models are available on GitHub (https://github.com/Aminullah6264/COVID19, accessed on 16 September 2022).
Keywords: artificial intelligence; deep learning; RNA analysis; SARS-CoV-2; self-attention; uncertainty analysis; variational autoencoders (search for similar items in EconPapers)
JEL-codes: C (search for similar items in EconPapers)
Date: 2022
References: View references in EconPapers View complete reference list from CitEc
Citations: View citations in EconPapers (1)
Downloads: (external link)
https://www.mdpi.com/2227-7390/10/22/4267/pdf (application/pdf)
https://www.mdpi.com/2227-7390/10/22/4267/ (text/html)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:10:y:2022:i:22:p:4267-:d:973219
Access Statistics for this article
Mathematics is currently edited by Ms. Emma He
More articles in Mathematics from MDPI
Bibliographic data for series maintained by MDPI Indexing Manager ().