
GAMBIT Overview

GAMBIT (Genomic Approximation Method for Bacterial Identification and Tracking) determines the taxon of the query genome assembly using a k-mer-based approach to match the assembly sequence to the closest complete genome in a database.

<aside> 🧬 GAMBIT genomic distance metric correlates with sequence identity! GAMBIT uses an efficient genomic distance metric along with a curated database to identify genome assemblies in seconds. You can read more about how the distance metric is calculated in the Technical Details section!

</aside>
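As a rough illustration of what a k-mer-based distance looks like, the sketch below builds a set of k-mers for each sequence and compares the sets with the Jaccard distance. It assumes that only k-mers directly following a fixed prefix are collected, on both strands. The prefix, the k-mer length, and all function names are illustrative assumptions rather than GAMBIT's actual API; the Technical Details section remains the authoritative description.

```python
from __future__ import annotations

PREFIX = "ATGAC"  # assumed prefix, for illustration only
K = 11            # assumed k-mer length, for illustration only

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmer_signature(seq: str, prefix: str = PREFIX, k: int = K) -> set[str]:
    """Collect every k-mer that directly follows `prefix` on either strand."""
    seq = seq.upper()
    found: set[str] = set()
    for strand in (seq, reverse_complement(seq)):
        start = strand.find(prefix)
        while start != -1:
            begin = start + len(prefix)
            kmer = strand[begin:begin + k]
            # Skip truncated k-mers and k-mers with ambiguous bases.
            if len(kmer) == k and set(kmer) <= set("ACGT"):
                found.add(kmer)
            start = strand.find(prefix, start + 1)
    return found


def jaccard_distance(a: set[str], b: set[str]) -> float:
    """1 - |A ∩ B| / |A ∪ B|; two empty sets are treated as distance 0."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)
```

A distance of 0 means the two k-mer sets are identical and a distance of 1 means they share nothing, which is why a smaller distance suggests a closer relationship between genomes.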

If the distance between the query genome assembly and the closest genome in the database is within a built-in species threshold, GAMBIT will assign the query genome to that species. Species thresholds are determined through a combination of automated and manual curation processes based on the diversity within the taxon.
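The threshold rule above can be sketched as follows. This is a minimal illustration, not GAMBIT's implementation: the `ReferenceGenome` class, the `assign_species` function, and the inclusive `<=` comparison are all assumptions, and any further logic the real tool applies when no species threshold is met is omitted.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ReferenceGenome:
    """Hypothetical record for one database genome."""
    accession: str
    species: str
    species_threshold: float  # maximum distance still assigned to this species


def assign_species(
    distances: dict[str, float],
    references: dict[str, ReferenceGenome],
) -> Optional[str]:
    """Return the species of the closest reference if it is within threshold.

    `distances` maps a reference accession to its distance from the query.
    Returns None when there are no references or the closest one is too far.
    """
    if not distances:
        return None
    closest = min(distances, key=distances.get)
    reference = references[closest]
    if distances[closest] <= reference.species_threshold:
        return reference.species
    return None
```

Because each species carries its own threshold, the same distance can yield a call for a diverse taxon and no call for a narrowly defined one.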

<aside> 🧬 GAMBIT includes a manually curated, high-quality database! GAMBIT databases consist of two files:

  1. A signatures file containing the GAMBIT signatures (compressed representations) of all genomes represented in the database
  2. A metadata file relating the represented genomes to their genome accessions, taxonomic identifications, and species thresholds </aside>

Date of Last Update: August 2nd, 2024

GAMBIT Version: v1.0.1

Latest Database Versions: GAMBIT Prokaryotic GTDB Database v2.0.0 and GAMBIT Fungal Database v0.2.0

GAMBIT on Terra.bio

Theiagen’s Public Health Bioinformatics (PHB) is a suite of workflows for characterization, epidemiology and sharing of pathogen genomes. Workflows are available for viruses, bacteria, and fungi.

Importing and using GAMBIT via the PHB workflows

The GAMBIT_Query_PHB workflow performs taxon assignment of a genome assembly using GAMBIT. It can be imported directly to Terra.bio via Dockstore.

Two inputs are required for the GAMBIT_Query_PHB workflow: a genome assembly and a sample name associated with the genome assembly. The default GAMBIT database used for taxonomic identification is the GAMBIT Prokaryotic GTDB Database v2.0.0, but alternate GAMBIT databases can be provided.

<aside> <img src="https://prod-files-secure.s3.us-west-2.amazonaws.com/be290196-9090-4f3c-a9ab-fe730ad213e0/a7a39960-0058-472b-bafa-5109dd1bd393/Picture3.png" alt="https://prod-files-secure.s3.us-west-2.amazonaws.com/be290196-9090-4f3c-a9ab-fe730ad213e0/a7a39960-0058-472b-bafa-5109dd1bd393/Picture3.png" width="40px" /> More information on GAMBIT_Query_PHB is available on the following page:

GAMBIT_Query

</aside>

GAMBIT is also part of the TheiaProk and TheiaEuk workflow collections. TheiaProk is dedicated to the analysis of prokaryotic data and TheiaEuk to the analysis of fungal data. The TheiaProk or TheiaEuk workflow most appropriate for your type of input data can be imported from the Dockstore links on the right.

In both, GAMBIT performs the taxonomic identification of the assembled sequences, which can trigger taxa-specific submodules for further genomic characterization. For TheiaProk, the default database is the GAMBIT Prokaryotic GTDB Database v2.0.0; for TheiaEuk, the default database is the GAMBIT Fungal Database v0.2.0.

<aside> <img src="https://prod-files-secure.s3.us-west-2.amazonaws.com/be290196-9090-4f3c-a9ab-fe730ad213e0/a7a39960-0058-472b-bafa-5109dd1bd393/Picture3.png" alt="https://prod-files-secure.s3.us-west-2.amazonaws.com/be290196-9090-4f3c-a9ab-fe730ad213e0/a7a39960-0058-472b-bafa-5109dd1bd393/Picture3.png" width="40px" /> More information on TheiaProk and TheiaEuk is available on the following pages:

TheiaEuk Workflow Series

</aside>

<aside> <img src="https://prod-files-secure.s3.us-west-2.amazonaws.com/be290196-9090-4f3c-a9ab-fe730ad213e0/a7a39960-0058-472b-bafa-5109dd1bd393/Picture3.png" alt="https://prod-files-secure.s3.us-west-2.amazonaws.com/be290196-9090-4f3c-a9ab-fe730ad213e0/a7a39960-0058-472b-bafa-5109dd1bd393/Picture3.png" width="40px" /> Detailed documentation for each PHB release, including helpful workflow input and output explanations, can be found on the Public Health Resources page!

Theiagen Public Health Resources

</aside>

<aside> <img src="https://prod-files-secure.s3.us-west-2.amazonaws.com/be290196-9090-4f3c-a9ab-fe730ad213e0/d6d09689-a26a-4184-9ee7-fa181383aa99/download.png" alt="https://prod-files-secure.s3.us-west-2.amazonaws.com/be290196-9090-4f3c-a9ab-fe730ad213e0/d6d09689-a26a-4184-9ee7-fa181383aa99/download.png" width="40px" /> Use on Terra.bio:


GAMBIT on your local machine

This guide assumes you have prior knowledge of how to install software locally in a Unix command-line environment. The necessary databases must be downloaded separately before they can be used with GAMBIT. They are available in the GAMBIT Databases section of this document and should be placed in a directory of your choice. The directory should not contain any other files with the same extensions.
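Before running GAMBIT, it can help to confirm that the database directory holds exactly one file of each type. The following is a minimal sketch of such a check. The `find_database_files` helper is hypothetical, and the default extensions `.gs` (signatures) and `.gdb` (metadata) are assumptions; adjust them to match the files you actually downloaded.

```python
from __future__ import annotations

from pathlib import Path


def find_database_files(
    db_dir: str,
    signatures_ext: str = ".gs",   # assumed signatures file extension
    metadata_ext: str = ".gdb",    # assumed metadata file extension
) -> dict[str, Path]:
    """Return the single signatures and metadata file found in `db_dir`.

    Raises ValueError if either extension matches zero or several files,
    since the directory should hold only one file per extension.
    """
    directory = Path(db_dir)
    found: dict[str, Path] = {}
    for label, ext in (("signatures", signatures_ext), ("metadata", metadata_ext)):
        matches = sorted(
            p for p in directory.iterdir() if p.is_file() and p.suffix == ext
        )
        if len(matches) != 1:
            raise ValueError(
                f"expected exactly one {ext} file in {directory}, found {len(matches)}"
            )
        found[label] = matches[0]
    return found
```

Calling `find_database_files("/path/to/database")` either returns both paths or fails early with a message naming the offending extension.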

Installation

Installation from Bioconda

The recommended way to install the tool is through the Conda package manager from the Bioconda channel. Run the following command to install GAMBIT’s latest version:

conda install -c bioconda gambit

Installation with Docker

The latest version of the GAMBIT software is available as a Docker container in Theiagen’s Google Artifact Registry (GAR). If Docker is installed on your system, run the following command to download the container:

docker pull us-docker.pkg.dev/general-theiagen/staphb/gambit:1.0.0

You can access the container with the following command (note: the -v $PWD:/data option maps your current directory to the /data folder inside the container):

docker run -v $PWD:/data -it us-docker.pkg.dev/general-theiagen/staphb/gambit:1.0.0 bash

Installation from source

This guide assumes that you have Git, Python, and pip installed on your system. Navigate to https://github.com/jlumpe/gambit and clone the repository:

git clone https://github.com/jlumpe/gambit.git