TheiaProk workflows

Overview

The TheiaProk workflows are for the assembly, quality assessment, and characterization of bacterial genomes.

There are currently two TheiaProk workflows: one for Illumina paired-end sequencing (TheiaProk_Illumina_PE) and another for Illumina single-end sequencing (TheiaProk_Illumina_SE). Apart from the type of input data, the two workflows differ only minimally.

All input reads are processed through “core tasks” in each workflow. These tasks perform read trimming, assembly, quality assessment, species identification, and some genome characterization. For certain identified taxa, “taxa-specific sub-workflows” are activated automatically and perform additional characterization steps specific to those taxa. When setting up each workflow, users may also choose “optional tasks”, which run in addition to, or as alternatives to, the tasks run by default.

Inputs


TheiaProk_Illumina_PE

TheiaProk_Illumina_SE

Core Tasks


versioning: Version Capture for TheiaProk

screen: Total Raw Read Quantification and Genome Size Estimation

read_QC_trim: Read Quality Trimming, Adapter Removal, Quantification, and Identification

CG-Pipeline: Assessment of Read Quality and Estimation of Genome Coverage

shovill: De novo Assembly

QUAST: Assembly Quality Assessment

BUSCO: Assembly Quality Assessment

MUMmer_ANI: Average Nucleotide Identity (optional)

GAMBIT: Taxon Assignment

AMRFinderPlus: AMR Genotyping (default)

ResFinder: AMR Genotyping (alternative)

TS_MLST: MLST Profiling

Prokka: Assembly Annotation (default)

Bakta: Assembly Annotation (alternative)

PlasmidFinder: Plasmid Identification

QC_check: Check QC Metrics Against User-Defined Thresholds (optional)
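
The default, alternative, and optional labels in the task list above can be read as a simple task-selection rule. The Python sketch below is illustrative only. The function `select_tasks` and its parameters (`amr_tool`, `annotation_tool`, `run_ani`, `run_qc_check`) are hypothetical names, not the workflow's actual inputs. It also assumes that choosing an alternative task replaces the corresponding default, which may not match how the workflow behaves (an alternative could instead run in addition to the default).

```python
# Tasks listed above without a qualifier; assumed here to run for every sample.
# The list order is the order of the documentation, not an execution order.
ALWAYS_RUN = [
    "versioning", "screen", "read_QC_trim", "CG-Pipeline", "shovill",
    "QUAST", "BUSCO", "GAMBIT", "TS_MLST", "PlasmidFinder",
]

# (default, alternative) pairs as labelled in the task list.
AMR_TOOLS = ("AMRFinderPlus", "ResFinder")
ANNOTATION_TOOLS = ("Prokka", "Bakta")


def select_tasks(amr_tool="AMRFinderPlus", annotation_tool="Prokka",
                 run_ani=False, run_qc_check=False):
    """Return the task names that would run for one sample.

    All parameter names are hypothetical and chosen for illustration only.
    """
    if amr_tool not in AMR_TOOLS:
        raise ValueError(f"unknown AMR tool: {amr_tool}")
    if annotation_tool not in ANNOTATION_TOOLS:
        raise ValueError(f"unknown annotation tool: {annotation_tool}")

    tasks = list(ALWAYS_RUN)
    # Assumption: exactly one AMR tool and one annotation tool are used.
    tasks += [amr_tool, annotation_tool]
    # Optional tasks are added only when explicitly requested.
    if run_ani:
        tasks.append("MUMmer_ANI")
    if run_qc_check:
        tasks.append("QC_check")
    return tasks
```

Under these assumptions, calling `select_tasks()` with no arguments gives the default configuration, while `select_tasks(annotation_tool="Bakta", run_ani=True)` swaps the annotation tool and adds the optional ANI task.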

Taxa-specific sub-workflows


The TheiaProk workflow automatically activates taxa-specific sub-workflows after identifying relevant taxa using GAMBIT. Such sub-workflows are available for the following taxa:

Escherichia spp.

Shigella spp.

Salmonella spp.

Listeria monocytogenes

Legionella pneumophila

Klebsiella spp.

Mycobacterium tuberculosis

Acinetobacter baumannii

Pseudomonas aeruginosa

Streptococcus pneumoniae
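
The automatic activation described above amounts to a lookup from the predicted taxon to the listed taxa. The following Python sketch shows one way such a lookup could work. It is not the workflow's own logic: the function name `taxa_specific_subworkflow` is hypothetical, the input is assumed to be a plain "Genus species" string, and entries listed at genus level are assumed to match any species in that genus.

```python
# Taxa from the list above. Genus-level entries are assumed to match any
# species in the genus; the others are assumed to need the full species name.
GENUS_LEVEL = {"Escherichia", "Shigella", "Salmonella", "Klebsiella"}
SPECIES_LEVEL = {
    "Listeria monocytogenes",
    "Legionella pneumophila",
    "Mycobacterium tuberculosis",
    "Acinetobacter baumannii",
    "Pseudomonas aeruginosa",
    "Streptococcus pneumoniae",
}


def taxa_specific_subworkflow(predicted_taxon):
    """Return the matching taxon label, or None if no sub-workflow applies."""
    words = predicted_taxon.split()
    if not words:
        return None
    # Compare only the first two words so strain suffixes are ignored.
    species = " ".join(words[:2])
    if species in SPECIES_LEVEL:
        return species
    if words[0] in GENUS_LEVEL:
        return f"{words[0]} spp."
    return None
```

With this sketch, a prediction of "Escherichia coli" maps to the Escherichia entry, while "Listeria innocua" returns `None` because only Listeria monocytogenes is listed.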

Outputs


TheiaProk_Illumina_PE Outputs
