Licencja
Cophenetic Distances: A Near-Linear Time Algorithmic Framework
Abstrakt (EN)
Tree metrics that compare pairs of trees are an elementary tool for analyzing phylogenetic trees. The cophenetic distance is a classic vector-based tree metric introduced by Cardona et al. that originates from the pioneering work of Sokal and Rohlf more than 50 years ago. However, when faced with phylogenetic analyses where sets of large-scale trees are compared, the quadratic runtime of the current best-known (naïve) algorithm to compute the cophenetic distance becomes prohibitive. Here we describe an algorithmic framework that computes the cophenetic distance under the ??1 -norm in ??(??log2??) time, where n is the size of the compared pair of trees. Based on the work from Sokal and Rohlf, we introduce a natural class of cophenetic distances and show that our algorithmic framework can compute each member of this class in ??(??log2??) time. In addition, we present a modification of this framework for computing these distances under the ??2 -norm in ??(??log??) time. Finally, we demonstrate the scalability of our algorithm.