Phylogenetic Trees Construction with Compressed DNA Sequences Using GENBIT COMPRESS Tool

RajaRajeswari, P.; Raju, S. Viswanadha

P. RajaRajeswari and S. Viswanadha Raju
P. RajaRajeswari: K L University
S. Viswanadha Raju: JNTUH

Annals of Data Science, 2017, vol. 4, issue 1, No 6, 105-121

Abstract: The information contained in the DNA molecule of even simple unicellular organisms is massive and requires efficient storage. Efficient storage implies removing all redundancy from the data being stored. The proposed compression algorithm, “GENBIT Compress”, is designed specifically to eliminate redundancy from the DNA sequences of large genomes. We define a compression distance, based on a normal compressor, and show that it is an admissible distance. Only recently have scientists begun to appreciate that compression ratios signify a great deal of important statistical information. In applying the approach, we use a new DNA sequence compressor, “GENBIT Compress”. The normalized compression distance (NCD) is universal in that it is not restricted to a specific application area and works across application area boundaries. A theoretical precursor, the normalized information distance, is provably optimal in the sense that it minimises every computable normalized metric that satisfies a certain density requirement. However, this optimality comes at the price of using the non-computable notion of Kolmogorov complexity. We propose precise notions of similarity metric and normal compressor, and show that the NCD based on a normal compressor is a similarity metric that approximates optimality. The NCD, an efficiently computable and thus practically applicable form of the normalized information distance, is used to calculate the distance matrix. In this paper, this new distance matrix is proposed for reconstructing phylogenetic trees. Phylogenies are the fundamental tool for representing the relationships among biological entities. Phylogenetic reconstruction methods attempt to find the evolutionary history of a given set of species.
This history is usually described by an edge-weighted tree, where edges correspond to different branches of evolution and the weight of an edge corresponds to the amount of evolutionary change on that particular branch. We constructed a phylogenetic tree from BChE DNA sequences of mammals by supplying the newly proposed distance matrix, computed with the GENBIT compressor, to the NJ (Neighbor-Joining) algorithm. The results of the present research confirm the existence of low compression ratios for natural DNA sequences with highly repetitive DNA bases (A, C, G, T): the more repetitive the bases, the lower the compression ratio. The ultimate goal is, of course, to learn the principles of “genome organization” and to explain this organization using our knowledge of evolution.
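
The abstract does not spell out the NCD formula. In its commonly cited form it is NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C(.) is the compressed length and xy is the concatenation of the two sequences. The sketch below shows how a distance matrix of this kind might be computed. It is an illustration only, not the authors' implementation: GENBIT Compress is not available here, so Python's `zlib` is swapped in as the compressor, and the function names and the synthetic sequences are assumptions made for the example.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Compressed length in bytes. zlib is an assumed stand-in for GENBIT Compress."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)  # compressed length of the concatenation
    return (cxy - min(cx, cy)) / max(cx, cy)


def distance_matrix(seqs):
    """Pairwise NCD matrix. Only the upper triangle is computed and then
    mirrored, so the result is symmetric with a zero diagonal by construction."""
    n = len(seqs)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = ncd(seqs[i], seqs[j])
    return d


# Synthetic illustration data (hypothetical, not the BChE sequences from the paper):
# a random sequence, a lightly mutated copy of it, and an unrelated random sequence.
rng = random.Random(0)
base = [rng.choice("ACGT") for _ in range(2000)]
variant = list(base)
for pos in rng.sample(range(len(base)), 20):
    variant[pos] = rng.choice("ACGT")
unrelated = [rng.choice("ACGT") for _ in range(2000)]

seqs = ["".join(s).encode("ascii") for s in (base, variant, unrelated)]
matrix = distance_matrix(seqs)
for row in matrix:
    print(" ".join(f"{v:.3f}" for v in row))
```

A matrix of this form is the kind of input a Neighbor-Joining implementation expects. The mutated copy should come out much closer to the original than the unrelated sequence does, which is the property the tree construction relies on. Real compressors are only approximately "normal", so the raw NCD of a sequence with itself is small but not exactly zero; the diagonal is set to zero here by convention.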

Keywords: Normalized compression distance; Kolmogorov complexity; GENBIT compress; Phylogeny; Bioinformatics; Distance matrix; Phylogenetic tree; Neighbor-joining algorithm
Date: 2017
Publisher page (abstract): http://link.springer.com/10.1007/s40745-016-0098-4
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:aodasc:v:4:y:2017:i:1:d:10.1007_s40745-016-0098-4

Ordering information: This journal article can be ordered from
https://www.springer ... gement/journal/40745

DOI: 10.1007/s40745-016-0098-4

Annals of Data Science is currently edited by Yong Shi