Minimizing genomic duplication episodes
Abstract (EN)
Background: The study of genomic duplication is fundamental to understanding the process of evolution. In evolutionary molecular biology, many approaches focus on detecting gene duplications and multiple gene duplication episodes, and on locating them in the Tree of Life. To reconstruct such episodes, one can cluster single gene duplications inferred by reconciling a set of gene trees with a species tree.

Results: We propose an efficient quadratic-time algorithm for the genomic duplication clustering problem in which the input gene trees are rooted, episode locations are restricted so that the minimal number of single gene duplications is preserved, clustering rules follow the minimum episodes method, and the objective, following a recently introduced approach, is to minimize the maximal number of duplication episodes on a single path, called here the score. Based on our theoretical results, we show new algorithmic relationships between the score and the minimum episodes score, defined as the minimal number of duplication episodes.

Conclusions: Our evaluation on three empirical datasets demonstrates that, under the model in which the minimal number of duplications is preserved, duplication clusterings with minimal score support the clusterings with the minimal total number of duplication episodes.

Availability: The software is available at https://bitbucket.org/pgor17/rmp
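
The two quantities compared in the abstract can be illustrated with a small sketch. It is not the authors' clustering algorithm and is unrelated to the linked software: it only evaluates a clustering that is already given. The function name `episode_scores`, the dictionary-based tree encoding, and the toy episode counts are assumptions made for this illustration. Given the number of duplication episodes assigned to each species tree node, it computes the maximal number of episodes on a single root-to-leaf path (the score) and the total number of episodes (the quantity that the minimum episodes score minimizes).

```python
from typing import Dict, List, Tuple


def episode_scores(
    children: Dict[str, List[str]],
    episodes: Dict[str, int],
    root: str,
) -> Tuple[int, int]:
    """Return (max episodes on any root-to-leaf path, total episodes).

    children maps each internal species node to its child nodes;
    episodes maps a species node to the number of duplication
    episodes placed at it (nodes that are absent count as 0).
    Both encodings are hypothetical, chosen for this sketch.
    """
    total = sum(episodes.values())
    best = 0
    # Iterative depth-first traversal carrying the running path sum.
    stack = [(root, episodes.get(root, 0))]
    while stack:
        node, acc = stack.pop()
        kids = children.get(node, [])
        if not kids:
            # Leaf reached: the path from the root is complete.
            best = max(best, acc)
        for kid in kids:
            stack.append((kid, acc + episodes.get(kid, 0)))
    return best, total


# Toy species tree ((a,b)ab,c)root with made-up episode counts.
children = {"root": ["ab", "c"], "ab": ["a", "b"]}
episodes = {"root": 1, "ab": 2, "c": 1}
path_score, total_episodes = episode_scores(children, episodes, "root")
print(path_score, total_episodes)  # 3 4
```

In this toy case the paths through `ab` carry 1 + 2 = 3 episodes and the path to `c` carries 1 + 1 = 2, so the path-based value is 3, while the total over all nodes is 4. The two objectives can therefore differ on the same clustering, which is why their relationship is worth studying.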