Power Laws in the Size Distribution of Gene Families in Complete Genomes: Biological Interpretations
Martijn A. Huynen and Erik van Nimwegen
Working Papers from Santa Fe Institute
Abstract:
We compare the frequency distribution of gene family sizes in the complete genomes of five Bacteria (Escherichia coli, Haemophilus influenzae, Mycoplasma genitalium, Mycoplasma pneumoniae, and Synechocystis sp. PCC6803), one Archaeon (Methanococcus jannaschii), one eukaryote (Saccharomyces cerevisiae), the Vaccinia virus and the bacteriophage T4. The sizes of the gene families versus their frequencies show power-law distributions that tend to become flatter (have a larger exponent) as the number of genes in the genome increases. Power-law distributions generally occur as the limit distribution of a multiplicative stochastic process with a boundary constraint. The exponent of the power-law distribution depends on the average and the variance of the logarithm of the multiplication factor. We discuss various models that can account for a multiplicative process determining the size of gene families in the genome. In particular we argue that the size distribution of the gene families in complete genomes indicates that the genes within a family do not behave independently, and that the dynamics of gene family sizes does not operate at the level of single genes.
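The abstract's statement that a multiplicative process with a boundary constraint yields a power law can be illustrated with a minimal simulation. This is only a sketch under assumptions that are not taken from the paper: the multiplication factor is log-normal with parameters `MU` and `SIGMA`, the boundary is a reflecting lower bound at size 1, and all function names are hypothetical. Under these assumptions the standard result for a reflected random walk in log space gives a tail P(size > x) proportional to x^(-alpha) with alpha = -2*MU/SIGMA^2, which shows how the exponent depends on the mean and variance of the logarithm of the factor.

```python
import math
import random


def simulate_log_sizes(mu, sigma, steps, burn_in=1000, seed=0):
    """Simulate x -> max(lam * x, 1) with log(lam) ~ Normal(mu, sigma).

    The process is tracked as y = log(x), so the multiplicative update
    becomes an additive random walk reflected at y = 0 (size 1).
    Assumption for illustration only: Gaussian log-factors, mu < 0.
    """
    rng = random.Random(seed)
    y = 0.0
    samples = []
    for t in range(steps + burn_in):
        y = max(y + rng.gauss(mu, sigma), 0.0)  # lower boundary constraint
        if t >= burn_in:
            samples.append(y)
    return samples


def hill_exponent(log_sizes, log_threshold):
    """Hill-type estimate of the tail exponent alpha.

    For P(size > x) ~ x^(-alpha), the excess of log(size) over a threshold
    is roughly exponential with rate alpha, so alpha ~ 1 / mean(excess).
    """
    excess = [y - log_threshold for y in log_sizes if y > log_threshold]
    return len(excess) / sum(excess)


# Hypothetical parameters: mean and standard deviation of log(factor).
MU, SIGMA = -0.1, 0.5

predicted = -2.0 * MU / SIGMA ** 2
samples = simulate_log_sizes(MU, SIGMA, steps=200_000)
estimated = hill_exponent(samples, log_threshold=1.0)

print(f"predicted tail exponent: {predicted:.3f}")
print(f"estimated tail exponent: {estimated:.3f}")
```

Successive samples are correlated, so the estimate is noisy. The sketch is meant only to show the qualitative dependence on the mean and variance of the log-factor, not to reproduce the gene-family models discussed in the paper.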
Keywords: Gene family; comparative genome analysis; genome evolution; power-law distribution
Date: 1997-03
Persistent link: https://EconPapers.repec.org/RePEc:wop:safiwp:97-03-025