'TBRDist' is an R package that calculates rearrangement distances between phylogenetic trees using C++ algorithms by Chris Whidden:
- Unrooted trees (
TBRDist(),USPRDist(),ReplugDist()): TBR, SPR, and Replug distances based on the algorithms of Whidden and Matsen (2017). The C++ implementation has diverged from the upstream uspr library with optimizations including integer-encoded tree identity (replacing Newick string round-tripping), precomputed SPR lookup tables for small trees, and reduced heap allocation in the A* search and branch-and-bound routines, yielding roughly 2.5× faster computation. - Rooted trees (
RSPRDist()): Rooted SPR distances via rspr, implementing the exact fixed-parameter algorithm (with cluster decomposition) of Whidden, Beiko, and Zeh (2013). Supports exact distances, a linear-time 3-approximation, and maximum agreement forest output.
The uSPR distance is a natural distance metric with respect to phylogenetic tree search, as common tree search and sampling software mainly use SPR operations (or NNI operations, a subset of SPR operations). The uSPR distance is also a lower bound on the number of lateral gene transfer events required to explain the difference between a reference/species tree and a gene tree.
Note that the uSPR distance is NP-hard to compute (as are the TBR and replug
distances) and the running time of most of the algorithms used in this software
scales exponentially with the distance computed.
The unrooted functions are aimed at trees with up to 50 leaves and uSPR
distances up to 14. The rooted RSPRDist() can handle much larger trees
(200+ leaves, distances > 50).
Install and load the library from CRAN as follows:
install.packages('TBRDist')
library('TBRDist')
You can install the development version thus:
if(!require(devtools)) install.packages("devtools")
devtools::install_github('ms609/TBRDist')Advanced users wishing to access the source code should note that the project
contains submodules.
To download the contents of these subdirectories, clone the project using
git clone --recursive https://github.com/ms609/TBRDist,
or if you've already cloned the project, run
git submodule update --init --recursive.
If you use unrooted distances (TBRDist, USPRDist, ReplugDist) in your
research, please cite:
Whidden C, Matsen FA IV (2019). Calculating the Unrooted Subtree-Prune-and-Regraft Distance. IEEE/ACM Transactions on Computational Biology and Bioinformatics 16(3):898–911. doi:10.1109/TCBB.2018.2802911
If you use rooted SPR distances (RSPRDist), please cite:
Whidden C, Beiko RG, Zeh N (2013). Fixed-Parameter Algorithms for Maximum Agreement Forests. SIAM Journal on Computing 42(4):1431–1466. doi:10.1137/110845045
A citation to the R package itself would also be appreciated:
Smith MR (2019). TBRDist: Rearrangement Distances Between Phylogenetic Trees. doi:10.5281/zenodo.3548333
Some text in this README file originates from the 'uspr' readme.
Please note that the 'TBRDist' project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.