The TheiaProk workflows are for the assembly, quality assessment, and characterization of bacterial genomes. There are currently four TheiaProk workflows designed to accommodate different kinds of input data:

Illumina paired-end sequencing (TheiaProk_Illumina_PE)
Illumina single-end sequencing (TheiaProk_Illumina_SE)
ONT sequencing (TheiaProk_ONT)
Genome assemblies (TheiaProk_FASTA)

In the TheiaProk Illumina and ONT workflows, all input reads are first processed by “core tasks” that perform read trimming and assembly appropriate to the input data type. TheiaProk workflows then launch default genome characterization modules for quality assessment, species identification, antimicrobial resistance gene detection, sequence typing, and more. For certain identified taxa, “taxa-specific sub-workflows” are activated automatically and perform additional characterization steps. When setting up a workflow, users may enable “optional tasks” as additions or alternatives to the tasks run by default.
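The mapping from input data type to workflow can be sketched in a few lines of Python. This is an illustrative sketch only: the workflow names come from the list above, while `InputType`, `select_workflow`, and `runs_core_tasks` are hypothetical helpers that are not part of TheiaProk. The assumption that TheiaProk_FASTA skips read trimming and assembly is inferred from the fact that it takes assembled genomes as input.

```python
from enum import Enum


class InputType(Enum):
    """Hypothetical labels for the four kinds of input data."""
    ILLUMINA_PE = "Illumina paired-end sequencing"
    ILLUMINA_SE = "Illumina single-end sequencing"
    ONT = "ONT sequencing"
    ASSEMBLY = "Genome assemblies"


# Workflow names as listed in this documentation.
WORKFLOW_FOR_INPUT = {
    InputType.ILLUMINA_PE: "TheiaProk_Illumina_PE",
    InputType.ILLUMINA_SE: "TheiaProk_Illumina_SE",
    InputType.ONT: "TheiaProk_ONT",
    InputType.ASSEMBLY: "TheiaProk_FASTA",
}


def select_workflow(input_type: InputType) -> str:
    """Return the name of the workflow that matches the input data type."""
    return WORKFLOW_FOR_INPUT[input_type]


def runs_core_tasks(input_type: InputType) -> bool:
    """Assumption: only read-based inputs go through trimming and assembly."""
    return input_type is not InputType.ASSEMBLY


if __name__ == "__main__":
    for kind in InputType:
        print(f"{kind.value}: {select_workflow(kind)} "
              f"(core tasks: {runs_core_tasks(kind)})")
```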

Inputs

TheiaProk_Illumina_PE

TheiaProk_Illumina_SE

TheiaProk_ONT

TheiaProk_FASTA

Core Tasks for TheiaProk_Illumina_PE and TheiaProk_Illumina_SE

`versioning`: Version Capture for TheiaProk

`screen`: Total Raw Read Quantification and Genome Size Estimation

`read_QC_trim`: Read Quality Trimming, Adapter Removal, Quantification, and Identification

`CG-Pipeline`: Assessment of Read Quality and Estimation of Genome Coverage

`shovill`: De novo Assembly

Core Tasks for TheiaProk_ONT

`versioning`: Version Capture for TheiaProk

`screen`: Total Raw Read Quantification and Genome Size Estimation

`read_QC_trim_ont`: Read Quality Trimming, Quantification, and Identification

`dragonflye`: De novo Assembly
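For a side-by-side view, the core tasks listed above can be collected into a small lookup table. This is a sketch for comparison only: the task names are taken from the headings in this section, the list order follows the documentation and is not guaranteed to be the execution order, and `CORE_TASKS` and `shared_core_tasks` are hypothetical names that do not exist in TheiaProk.

```python
# Core tasks per workflow, in the order they appear in this documentation.
ILLUMINA_CORE_TASKS = [
    "versioning",
    "screen",
    "read_QC_trim",
    "CG-Pipeline",
    "shovill",
]

CORE_TASKS = {
    "TheiaProk_Illumina_PE": ILLUMINA_CORE_TASKS,
    "TheiaProk_Illumina_SE": ILLUMINA_CORE_TASKS,
    "TheiaProk_ONT": [
        "versioning",
        "screen",
        "read_QC_trim_ont",
        "dragonflye",
    ],
}


def shared_core_tasks(workflow_a: str, workflow_b: str) -> list[str]:
    """Return the core tasks listed for both workflows, in workflow_a's order."""
    tasks_b = set(CORE_TASKS[workflow_b])
    return [task for task in CORE_TASKS[workflow_a] if task in tasks_b]


if __name__ == "__main__":
    print(shared_core_tasks("TheiaProk_Illumina_PE", "TheiaProk_ONT"))
```

Comparing the two lists this way shows that the Illumina and ONT workflows share `versioning` and `screen`, and differ in their read QC and assembly tasks.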

Outputs

TheiaProk_ONT_PHB Outputs
