Entropy Based Biological Sequence Study
Bimal Kumar Sarkar
A chapter in Entropy and Exergy in Renewable Energy from IntechOpen
Abstract:
SARS-CoV-2 virus strains are analyzed as digitized sequences of information using the notion of entropy. Special attention is paid to the occurrence of particular patterns in the coronavirus sequence. The incidence of a genetic word is represented as a density. The incidence frequency of each q-gram genetic word is determined along the sequence with a finite impulse response (FIR) filter. This frequency in turn gives the probability distribution of genetic word incidence, which is the input for calculating the entropy of the sequence. The sequence entropy is then used in a principal component analysis (PCA) to determine the similarity or dissimilarity between the viral sequences. Seven human coronavirus sequences are considered, and an entropy-based similarity study of SARS-CoV-2 strains is presented.
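The pipeline outlined in the abstract (q-gram incidence, FIR filtering, probability distribution, entropy, PCA) can be illustrated with a minimal Python sketch. The abstract does not give the filter coefficients, the entropy definition, or the PCA inputs, so everything here is an assumption: a moving-average FIR filter, the A/C/G/T alphabet, Shannon entropy in bits, and PCA by singular value decomposition. The function names and the toy sequences are hypothetical and are not taken from the chapter.

```python
import itertools
import numpy as np

ALPHABET = "ACGT"  # assumed nucleotide alphabet

def qgram_indicator(seq, word):
    """Return 1.0 at every position where `word` starts in `seq`, else 0.0."""
    q = len(word)
    return np.array([1.0 if seq[i:i + q] == word else 0.0
                     for i in range(len(seq) - q + 1)])

def fir_density(indicator, window):
    """Local incidence density of a word via a moving-average FIR filter."""
    taps = np.ones(window) / window  # assumed coefficients: uniform taps
    return np.convolve(indicator, taps, mode="valid")

def qgram_entropy(seq, q):
    """Shannon entropy (bits) of the q-gram incidence distribution of `seq`."""
    words = ["".join(w) for w in itertools.product(ALPHABET, repeat=q)]
    counts = np.array([qgram_indicator(seq, w).sum() for w in words])
    p = counts / counts.sum()
    p = p[p > 0]  # words that never occur contribute nothing
    return float(-(p * np.log2(p)).sum())

def pca_scores(features, n_components=2):
    """Project the rows of `features` onto the leading principal components."""
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

# Toy sequences for illustration only; these are not real viral genomes.
toy = {
    "s1": "ACGTACGTACGTACGT",
    "s2": "AAAAAAAAAAAAAAAA",
    "s3": "ACGGTTCAAGCTTGCA",
}

# One feature vector per sequence: entropies for q = 1, 2, 3.
features = np.array([[qgram_entropy(s, q) for q in (1, 2, 3)]
                     for s in toy.values()])
scores = pca_scores(features)  # one row of PCA coordinates per sequence

# Local density of the word "AC" along s1, using a window of 4 positions.
density = fir_density(qgram_indicator(toy["s1"], "AC"), window=4)
```

In this sketch, sequences whose rows in `scores` lie close together have similar entropy profiles. A real study would replace the toy strings with full genome sequences and might choose a different window length or filter shape.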
Keywords: sequences; genetic information; FIR filter; entropy; PCA; corona virus
JEL-codes: Q20 Q40
Downloads: (external link)
https://www.intechopen.com/chapters/75997 (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:ito:pchaps:222875
DOI: 10.5772/intechopen.96615
Bibliographic data for series maintained by Slobodan Momcilovic.