The multiple sequence sets: problem and heuristic algorithms
Kang Ning and
Hon Wai Leong
Additional contact information
Kang Ning: University of Michigan
Hon Wai Leong: National University of Singapore
Journal of Combinatorial Optimization, 2011, vol. 22, issue 4, No 21, 778-796
Abstract:
“Sequence set” is a mathematical model used in many applications, such as biological sequence analysis and text processing. However, a single sequence set model is not appropriate for rapidly increasing problem sizes. For example, very large genome sequences should be separated and processed chunk by chunk. For these applications, the underlying mathematical model is “Multiple Sequence Sets” (MSS). To process multiple sequence sets, sequences are distributed to different sets, and the sequences in each set are then processed in parallel. Deriving an effective algorithm for MSS processing is challenging. In this paper, we first define the cost functions for the problem of Process of Multiple Sequence Sets (PMSS). The PMSS problem is then formulated as minimizing the total processing cost. Based on an analysis of the features of multiple sequence sets, we propose the Distribution and Deposition (DDA) algorithm and the DDA* algorithm for the PMSS problem. In the DDA algorithm, the sequences are first distributed to multiple sets according to their alphabet contents; the sequences in each set are then processed by a deposition algorithm. The DDA* algorithm differs from DDA in that it distributes sequences by clustering based on a set of sequence features. Experiments showed that the results of DDA and DDA* were always smaller than those of the other algorithms, and that DDA* outperformed DDA in most instances. The DDA and DDA* algorithms were also efficient in both time and space.
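The two-phase idea in the abstract (distribute, then process each set) can be illustrated with a minimal Python sketch. It is not the paper's DDA implementation. It assumes that "alphabet content" means the set of distinct symbols in a sequence, substitutes the well-known Majority Merge heuristic for the paper's deposition algorithm, and takes the total cost to be the summed length of the per-set common supersequences. All function names are hypothetical.

```python
from collections import Counter, defaultdict


def distribute_by_alphabet(sequences):
    """Group sequences that use exactly the same set of distinct symbols.

    Assumption: this is only one possible reading of "alphabet content";
    the paper may define the distribution criterion differently.
    """
    groups = defaultdict(list)
    for seq in sequences:
        groups[frozenset(seq)].append(seq)
    return list(groups.values())


def majority_merge(sequences):
    """Greedy common-supersequence heuristic (Majority Merge).

    Stand-in for the paper's deposition step: repeatedly emit the symbol
    that appears at the front of the most remaining sequences.
    """
    remaining = [s for s in sequences if s]
    result = []
    while remaining:
        fronts = Counter(s[0] for s in remaining)
        # Ties are broken alphabetically so the output is deterministic.
        symbol = max(sorted(fronts), key=fronts.get)
        result.append(symbol)
        # Consume the emitted symbol wherever it is at the front.
        remaining = [s[1:] if s[0] == symbol else s for s in remaining]
        remaining = [s for s in remaining if s]
    return "".join(result)


def is_supersequence(candidate, seq):
    """Return True if seq is a subsequence of candidate."""
    it = iter(candidate)
    return all(ch in it for ch in seq)


def process_sets(sequences):
    """Distribute sequences, build one supersequence per set, sum the lengths."""
    sets = distribute_by_alphabet(sequences)
    supers = [majority_merge(group) for group in sets]
    total_cost = sum(len(s) for s in supers)  # assumed cost function
    return sets, supers, total_cost
```

Calling `process_sets(["ACGT", "AGCT", "TTAA", "ATAT"])` would place the first two sequences in one set and the last two in another, since each pair shares the same symbols. A feature-based clustering step, as described for DDA*, would replace `distribute_by_alphabet` in this sketch.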
Keywords: Multiple sequence sets; Distribution and deposition; Shortest common supersequence; Sequence features; Performance ratio
Date: 2011
Downloads: (external link)
http://link.springer.com/10.1007/s10878-010-9329-3 Abstract (text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:spr:jcomop:v:22:y:2011:i:4:d:10.1007_s10878-010-9329-3
Ordering information: This journal article can be ordered from
https://www.springer.com/journal/10878
DOI: 10.1007/s10878-010-9329-3
Journal of Combinatorial Optimization is currently edited by Thai, My T.
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.