References

GTDB-Tk is described in:

Chaumeil PA, et al. 2022. GTDB-Tk2: memory friendly classification with the genome taxonomy database. Bioinformatics, btac672.

Chaumeil PA, et al. 2019. GTDB-Tk: A toolkit to classify genomes with the Genome Taxonomy Database. Bioinformatics, btz848.

The Genome Taxonomy Database (GTDB) is described in:

Parks DH, et al. 2020. A complete domain-to-species taxonomy for Bacteria and Archaea. Nature Biotechnology, https://doi.org/10.1038/s41587-020-0501-8.

Parks DH, et al. 2018. A standardized bacterial taxonomy based on genome phylogeny substantially revises the tree of life. Nature Biotechnology, http://dx.doi.org/10.1038/nbt.4229.

We strongly encourage you to cite the following third-party dependencies:

Software

Reference

Prodigal

Hyatt D, et al. 2010. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics, 11:119. doi: 10.1186/1471-2105-11-119.

HMMER

Eddy SR. 2011. Accelerated profile HMM searches. PLOS Comp. Biol., 7:e1002195.

pplacer

Matsen FA, et al. 2010. pplacer: linear time maximum-likelihood and Bayesian phylogenetic placement of sequences onto a fixed reference tree. BMC Bioinformatics, 11:538.

skani

Shaw J. and Yu Y.W. 2023. Fast and robust metagenomic sequence comparison through sparse chaining with skani. Nature Methods, 20:1661–1665.

FastTree

Price MN, et al. 2010. FastTree 2 - Approximately Maximum-Likelihood Trees for Large Alignments. PLoS One, 5, e9490.

Mash

Ondov BD, et al. 2016. Mash: fast genome and metagenome distance estimation using MinHash. Genome Biol, 17:132. doi: 10.1186/s13059-016-0997-x.

DendroPy

Sukumaran J. and Holder M.T. 2010. DendroPy: A Python library for phylogenetic computing. Bioinformatics, 26:1569-1571.

NumPy

Harris CR, et al. 2020. Array programming with NumPy. Nature, 585:357–362. doi: 10.1038/s41586-020-2649-2.

tqdm

DOI: 10.5281/zenodo.595120