Compute the average nucleotide identity (ANI) of input genomes to all GTDB-Tk representative genomes.


usage: gtdbtk ani_rep (--genome_dir GENOME_DIR | --batchfile BATCHFILE)
                      --out_dir OUT_DIR [--no_mash] [--mash_k MASH_K]
                      [--mash_s MASH_S] [--mash_d MASH_D] [--mash_v MASH_V]
                      [--mash_db MASH_DB] [--min_af MIN_AF] [-x EXTENSION]
                      [--prefix PREFIX] [--cpus CPUS] [--tmpdir TMPDIR]
                      [--debug] [-h]

mutually exclusive required arguments

--genome_dir

directory containing genome files in FASTA format

--batchfile

path to file describing genomes - tab separated in 2 or 3 columns (FASTA file, genome ID, translation table [optional])
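
A batchfile in the layout described above can be generated with a short script. The following is a minimal sketch, assuming all genomes sit in one directory, share one extension, and that the file name without its extension is an acceptable genome ID. The helper name `write_batchfile` and the output file name are illustrative and not part of GTDB-Tk.

```python
import csv
from pathlib import Path


def write_batchfile(genome_dir, batchfile, extension="fna", translation_table=None):
    """Write one tab-separated row per genome: FASTA path, genome ID[, translation table].

    Assumption: the genome ID is the file name with the extension removed.
    Returns the number of genomes written.
    """
    suffix = "." + extension
    # Collect matching files first so the listing is stable and sorted.
    fasta_files = sorted(p for p in Path(genome_dir).iterdir() if p.name.endswith(suffix))
    with open(batchfile, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        for fasta in fasta_files:
            row = [str(fasta), fasta.name[: -len(suffix)]]
            if translation_table is not None:
                row.append(str(translation_table))  # optional third column
            writer.writerow(row)
    return len(fasta_files)
```

The resulting file would then be passed with `--batchfile` in place of `--genome_dir`.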

required named arguments

--out_dir

directory to output files

optional Mash arguments

--no_mash

skip pre-filtering of genomes using Mash

--mash_k

k-mer size [1-32]

Default: 16

--mash_s

maximum number of non-redundant hashes

Default: 5000

--mash_d

maximum distance to keep [0-1]

Default: 0.15

--mash_v

maximum p-value to keep [0-1]

Default: 1.0

--mash_db

path to save/read (if exists) the Mash reference sketch database (.msh)

optional FastANI arguments

--min_af

minimum alignment fraction to assign genome to a species cluster

Default: 0.5
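
The documented ranges and defaults for the Mash and FastANI options can be checked before a run is launched. The following is a minimal sketch of a wrapper that validates the values and assembles the argument list. The helper `build_ani_rep_command` is hypothetical, and omitting the Mash tuning options when `--no_mash` is set is an assumption made here for tidiness, not documented behaviour.

```python
def build_ani_rep_command(out_dir, genome_dir=None, batchfile=None,
                          mash_k=16, mash_s=5000, mash_d=0.15, mash_v=1.0,
                          min_af=0.5, no_mash=False):
    """Return an argument list for `gtdbtk ani_rep` after checking documented ranges."""
    # --genome_dir and --batchfile are mutually exclusive, and one is required.
    if (genome_dir is None) == (batchfile is None):
        raise ValueError("give exactly one of genome_dir or batchfile")
    if not 1 <= mash_k <= 32:
        raise ValueError("mash_k must be in [1, 32]")
    for name, value in (("mash_d", mash_d), ("mash_v", mash_v)):
        if not 0 <= value <= 1:
            raise ValueError(f"{name} must be in [0, 1]")

    cmd = ["gtdbtk", "ani_rep"]
    if genome_dir is not None:
        cmd += ["--genome_dir", str(genome_dir)]
    else:
        cmd += ["--batchfile", str(batchfile)]
    cmd += ["--out_dir", str(out_dir)]
    if no_mash:
        # Assumption: Mash tuning options are left out when pre-filtering is skipped.
        cmd.append("--no_mash")
    else:
        cmd += ["--mash_k", str(mash_k), "--mash_s", str(mash_s),
                "--mash_d", str(mash_d), "--mash_v", str(mash_v)]
    cmd += ["--min_af", str(min_af)]
    return cmd
```

The returned list can be handed to `subprocess.run(cmd, check=True)` on a machine where GTDB-Tk is installed.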

Named Arguments

-x, --extension

extension of files to process, gz = gzipped

Default: “fna”

--prefix

prefix for all output files

Default: “gtdbtk”

--cpus

number of CPUs to use

Default: 1

--tmpdir

specify alternative directory for temporary files

Default: “/tmp”

--debug

create intermediate files for debugging purposes



Example

gtdbtk ani_rep --genome_dir genomes/ --out_dir ani_rep/ --cpus 70

Output


[2020-04-13 10:51:58] INFO: GTDB-Tk v1.1.0
[2020-04-13 10:51:58] INFO: gtdbtk ani_rep --genome_dir genomes/ --out_dir ani_rep/ --cpus 70
[2020-04-13 10:51:58] INFO: Using GTDB-Tk reference data version r89: /release89
[2020-04-13 10:51:59] INFO: Using Mash version 2.2.2
[2020-04-13 10:51:59] INFO: Creating Mash sketch file: ani_rep/intermediate_results/mash/gtdbtk.user_query_sketch.msh
==> Sketching 3 of 3 (100.0%) genomes
[2020-04-13 10:51:59] INFO: Creating Mash sketch file: ani_rep/intermediate_results/mash/gtdbtk.gtdb_ref_sketch.msh
==> Sketching 24706 of 24706 (100.0%) genomes
[2020-04-13 10:53:13] INFO: Calculating Mash distances.
[2020-04-13 10:53:14] INFO: Calculating ANI with FastANI.
==> Processing 874 of 874 (100.0%) comparisons.
[2020-04-13 10:53:23] INFO: Summary of results saved to: ani_rep/gtdbtk.ani_summary.tsv
[2020-04-13 10:53:23] INFO: Closest representative hits saved to: ani_rep/gtdbtk.ani_closest.tsv
[2020-04-13 10:53:23] INFO: Done.