Multiple partition Markov model for B.1.1.7, B.1.351, B.1.617.2, and P.1 variants of SARS-CoV 2 virus

García, Jesús Enrique; González-López, Verónica Andrea; Tasca, Gustavo Henrique

Jesús Enrique García, Verónica Andrea González-López and Gustavo Henrique Tasca
Additional contact information
Jesús Enrique García: University of Campinas
Verónica Andrea González-López: University of Campinas

Computational Statistics, 2025, vol. 40, issue 6, No 13, 3153-3189

Abstract: With tools originating from Markov processes, we investigate the similarities and differences between genomic sequences in FASTA format coming from four variants of the SARS-CoV 2 virus: B.1.1.7 (UK), B.1.351 (South Africa), B.1.617.2 (India), and P.1 (Brazil). We treat the virus's sequences as samples of finite memory Markov processes acting in $A=\{a,c,g,t\}$. We model each sequence, revealing some heterogeneity between sequences belonging to the same variant. We identify the five most representative sequences for each variant using a robust notion of classification, see Fernández et al. (Math Methods Appl Sci 43(13):7537–7549. https://doi.org/10.1002/mma.5705). Using a notion derived from a metric between processes, see García et al. (Appl Stoch Models Bus Ind 34(6):868–878. https://doi.org/10.1002/asmb.2346), we identify four groups, each group representing a variant. This metric also detects global proximity between the variants B.1.351 and B.1.1.7. With the selected sequences, we assemble a multiple partition model, see Cordeiro et al. (Math Methods Appl Sci 43(13):7677–7691. https://doi.org/10.1002/mma.6079), revealing in which states of the state space the variants differ in the mechanism for choosing the next element of $A$. Through this model, we find that the variants differ in their transition probabilities in eleven states out of a total of 256 states. For these eleven states, we show how the transition probabilities change from variant (or group of variants) to variant (or group of variants). In other words, we indicate precisely the stochastic reasons for the discrepancies.
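
As a rough illustration of the objects the abstract refers to, the sketch below estimates next-letter transition probabilities for a finite memory Markov chain on the alphabet {a, c, g, t}. It is not the authors' procedure: the multiple partition model and its model selection step are not reproduced here. The memory length of 4 is an assumption, chosen only because 4^4 = 256 matches the number of states quoted above; the function names and the toy sequence are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import product

ALPHABET = "acgt"
ORDER = 4  # assumption: 4**4 = 256 states, matching the count in the abstract


def transition_counts(sequence, order=ORDER):
    """Count how often each length-`order` state is followed by each letter."""
    counts = defaultdict(Counter)
    seq = sequence.lower()
    for i in range(order, len(seq)):
        state, nxt = seq[i - order:i], seq[i]
        # Skip windows containing ambiguous symbols such as 'n'.
        if set(state) <= set(ALPHABET) and nxt in ALPHABET:
            counts[state][nxt] += 1
    return counts


def transition_probabilities(counts):
    """Relative-frequency estimate of P(next letter | state) for observed states."""
    probs = {}
    for state, c in counts.items():
        total = sum(c.values())
        probs[state] = {x: c[x] / total for x in ALPHABET}
    return probs


# The full state space under the order-4 assumption.
all_states = ["".join(p) for p in product(ALPHABET, repeat=ORDER)]

# Hypothetical toy sequence; a real analysis would read genomes from FASTA files.
toy = "acgtacgtacgaacgt"
probs = transition_probabilities(transition_counts(toy))
```

Comparing variants in this picture amounts to asking, state by state, whether the estimated rows of `probs` computed from different variants are close enough to be treated as one distribution.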

Keywords: Metric between Markov processes; Bayesian information criterion; Model selection; Model comparison
Date: 2025

Downloads: (external link)
http://link.springer.com/10.1007/s00180-022-01291-8 Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:compst:v:40:y:2025:i:6:d:10.1007_s00180-022-01291-8

Ordering information: This journal article can be ordered from
http://www.springer.com/statistics/journal/180/PS2

DOI: 10.1007/s00180-022-01291-8

Computational Statistics is currently edited by Wataru Sakamoto, Ricardo Cao and Jürgen Symanzik

Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.