TheiaProk Genomic Characterization
The TheiaProk workflows are for the assembly, quality assessment, and characterization of bacterial genomes. There are currently four TheiaProk workflows designed to accommodate different kinds of input data:
All input reads are processed through “core tasks” in the TheiaProk Illumina and ONT workflows. These undertake read trimming and assembly appropriate to the input data type. TheiaProk workflows subsequently launch default genome characterization modules for quality assessment, species identification, antimicrobial resistance gene detection, sequence typing, and more. For some taxa identified, “taxa-specific sub-workflows” will be automatically activated, undertaking additional taxa-specific characterization steps. When setting up each workflow, users may choose to use “optional tasks” as additions or alternatives to tasks run in the workflow by default.
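The way optional tasks act as additions or alternatives to the defaults can be sketched as a small planning function. This is only an illustration, not the workflow's implementation: the `TaskOptions` fields and the `plan_tasks` function are hypothetical names invented here and are not actual workflow inputs. The task names come from the task lists in this document.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TaskOptions:
    """Hypothetical user choices; field names are illustrative, not real workflow inputs."""
    annotation: str = "Prokka"    # default annotator; "Bakta" is the alternative
    run_mummer_ani: bool = False  # optional addition
    run_kmerfinder: bool = False  # optional addition


def plan_tasks(options: TaskOptions) -> List[str]:
    """Return the characterization tasks that would run for the given choices."""
    if options.annotation not in ("Prokka", "Bakta"):
        raise ValueError(f"Unknown annotation tool: {options.annotation}")

    # Tasks assumed to run by default in this sketch.
    tasks = ["QUAST", "BUSCO", "GAMBIT", "AMRFinderPlus", "TS_MLST", "PlasmidFinder"]

    # An alternative replaces the default task of the same kind.
    tasks.append(options.annotation)

    # Optional additions are appended only when requested.
    if options.run_mummer_ani:
        tasks.append("MUMmer_ANI")
    if options.run_kmerfinder:
        tasks.append("KmerFinder")
    return tasks


if __name__ == "__main__":
    print(plan_tasks(TaskOptions()))
    print(plan_tasks(TaskOptions(annotation="Bakta", run_kmerfinder=True)))
```

In this sketch an alternative swaps out one default task, while an optional addition only extends the plan, which mirrors the two roles described above.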
Illumina core tasks:
versioning: Version Capture for TheiaProk
screen: Total Raw Read Quantification and Genome Size Estimation
read_QC_trim: Read Quality Trimming, Adapter Removal, Quantification, and Identification
CG-Pipeline: Assessment of Read Quality and Estimation of Genome Coverage
shovill: De novo Assembly

ONT core tasks:
versioning: Version Capture for TheiaProk
screen: Total Raw Read Quantification and Genome Size Estimation
read_QC_trim_ont: Read Quality Trimming, Quantification, and Identification
dragonflye: De novo Assembly

The following tasks are performed for all TheiaProk workflows.
QUAST: Assembly Quality Assessment
BUSCO: Assembly Quality Assessment
MUMmer_ANI: Average Nucleotide Identity (optional)
GAMBIT: Taxon Assignment
KmerFinder: Taxon Assignment (optional)
AMRFinderPlus: AMR Genotyping (default)
ResFinder: AMR Genotyping (alternative)
TS_MLST: MLST Profiling
Prokka: Assembly Annotation (default)
Bakta: Assembly Annotation (alternative)
PlasmidFinder: Plasmid Identification
QC_check: Check QC Metrics Against User-Defined Thresholds (optional)
Taxon Tables: Copy outputs to new data tables based on taxonomic assignment (optional)

The TheiaProk workflows automatically activate taxa-specific sub-workflows after the identification of relevant taxa using GAMBIT. Alternatively, the user can provide the expected taxon in the expected_taxon workflow input to override the taxonomic assignment made by GAMBIT. Modules are launched for all TheiaProk workflows unless otherwise indicated.
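The override behaviour described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the workflow's actual logic: the `TAXON_SUBWORKFLOWS` mapping and its sub-workflow names are hypothetical, and matching on the first word of the taxon (the genus) is an assumption made for the example. Only the `expected_taxon` input name and the role of GAMBIT come from the text.

```python
from typing import Dict, Optional

# Hypothetical mapping from genus to sub-workflow name (illustrative only).
TAXON_SUBWORKFLOWS: Dict[str, str] = {
    "Escherichia": "escherichia_characterization",
    "Salmonella": "salmonella_characterization",
}


def resolve_taxon(gambit_taxon: str, expected_taxon: Optional[str] = None) -> str:
    """Return expected_taxon when the user supplied one, otherwise the GAMBIT assignment."""
    if expected_taxon and expected_taxon.strip():
        return expected_taxon.strip()
    return gambit_taxon


def select_subworkflow(gambit_taxon: str, expected_taxon: Optional[str] = None) -> Optional[str]:
    """Pick the sub-workflow for the resolved taxon, or None if no taxon-specific one applies."""
    taxon = resolve_taxon(gambit_taxon, expected_taxon)
    words = taxon.split()
    if not words:
        return None
    # Assumption: sub-workflows are keyed by genus, the first word of the taxon name.
    return TAXON_SUBWORKFLOWS.get(words[0])


if __name__ == "__main__":
    # GAMBIT assignment used as-is.
    print(select_subworkflow("Escherichia coli"))
    # User-provided expected_taxon takes precedence over GAMBIT.
    print(select_subworkflow("Escherichia coli", expected_taxon="Salmonella enterica"))
```

Returning `None` for unmapped taxa reflects that only some taxa trigger additional characterization; all other samples receive just the default modules.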