Bioconda

Step 1: Install conda (if not already done)

We strongly recommend Mamba, which is much faster than Miniconda/conda, but any of them will work.
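As a quick check before continuing, the following sketch reports which front end is available on the PATH. The function name pick_conda_frontend is hypothetical and introduced only for illustration; it prefers mamba, in line with the recommendation above.

```shell
# Hypothetical helper (not part of conda, mamba, or GTDB-Tk):
# print "mamba" if it is available, otherwise "conda";
# return non-zero if neither is found.
pick_conda_frontend() {
    if command -v mamba >/dev/null 2>&1; then
        echo "mamba"
    elif command -v conda >/dev/null 2>&1; then
        echo "conda"
    else
        return 1
    fi
}

if frontend="$(pick_conda_frontend)"; then
    echo "Using ${frontend}"
else
    echo "Neither mamba nor conda was found on the PATH" >&2
fi
```

If neither tool is reported, complete Step 1 before moving on.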

Step 2: Create the GTDB-Tk environment

It is strongly recommended to create a new conda environment for each version of GTDB-Tk released.

Warning

You must always specify the version of GTDB-Tk, as conda may try to install a very old version (v1.0.2).

GTDB-Tk requires third-party packages from the conda-forge and bioconda channels. Make sure to specify the channels in that order: conda-forge first, then bioconda.

# NOTE: replace 2.1.1 with the version you wish to install

# using conda
conda create -n gtdbtk-2.1.1 -c conda-forge -c bioconda gtdbtk=2.1.1

# using mamba (alternative)
mamba create -n gtdbtk-2.1.1 -c conda-forge -c bioconda gtdbtk=2.1.1
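Because the version appears twice in each command (in the environment name and in the package pin), it can help to keep it in a single variable. The sketch below is only an illustration: the function gtdbtk_create_cmd and the variable GTDBTK_VERSION are hypothetical names, and the function prints the command rather than running it so that it can be inspected first.

```shell
# Hypothetical helper: print the create command for a given GTDB-Tk version.
#   $1 = GTDB-Tk version (e.g. 2.1.1)
#   $2 = front end, conda or mamba (default: conda)
gtdbtk_create_cmd() {
    # conda-forge is listed before bioconda, matching the required channel order
    echo "${2:-conda} create -n gtdbtk-${1} -c conda-forge -c bioconda gtdbtk=${1}"
}

GTDBTK_VERSION="2.1.1"   # replace with the version you wish to install
gtdbtk_create_cmd "${GTDBTK_VERSION}"
gtdbtk_create_cmd "${GTDBTK_VERSION}" mamba
```

Once the printed command looks right, it can be copied into the shell and run as usual.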

Step 3: Download and alias the GTDB-Tk reference data

GTDB-Tk requires an environment variable named GTDBTK_DATA_PATH to be set to the directory containing the unarchived GTDB-Tk reference data.

Automatically

The conda package bundles a script, download-db.sh (source), that automatically downloads and extracts the GTDB-Tk reference data. The script is on the system path once the environment is activated, so simply run:

download-db.sh

Manually

If you downloaded and extracted the reference data yourself, you can have GTDBTK_DATA_PATH set automatically whenever the environment is activated by defining an environment-specific variable, e.g.:

# Activate the GTDB-Tk conda environment
conda activate gtdbtk-2.1.1

# Set the environment variable to the directory containing the GTDB-Tk reference data
conda env config vars set GTDBTK_DATA_PATH="/path/to/unarchived/gtdbtk/data"

# Reactivate the environment so that the new variable takes effect
conda deactivate
conda activate gtdbtk-2.1.1
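
Once the variable has been set and the environment reactivated, a small check can confirm that it points to an existing directory. This is a minimal sketch; the function name check_gtdbtk_data_path is hypothetical, and it only tests that the directory exists, not that the reference data inside it is complete.

```shell
# Hypothetical helper: succeed only if GTDBTK_DATA_PATH is set
# and refers to an existing directory.
check_gtdbtk_data_path() {
    if [ -z "${GTDBTK_DATA_PATH:-}" ]; then
        echo "GTDBTK_DATA_PATH is not set" >&2
        return 1
    fi
    if [ ! -d "${GTDBTK_DATA_PATH}" ]; then
        echo "GTDBTK_DATA_PATH is not a directory: ${GTDBTK_DATA_PATH}" >&2
        return 1
    fi
    echo "GTDBTK_DATA_PATH=${GTDBTK_DATA_PATH}"
}

check_gtdbtk_data_path || echo "Set GTDBTK_DATA_PATH before running GTDB-Tk" >&2
```

For a fuller verification of the reference data itself, GTDB-Tk provides a check_install subcommand.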