The TheiaEuk_PE workflow is for the assembly, quality assessment, and characterization of fungal genomes. It is designed to accept Illumina paired-end sequencing data as the primary input.
All input reads are processed through the workflow's “core tasks”. The core tasks include raw-read quality assessment, read cleaning (quality trimming and adapter removal), de novo assembly, assembly quality assessment, and species-level taxon identification. For some identified taxa, “taxa-specific sub-workflows” are activated automatically and perform additional characterization steps, including clade-typing and/or antifungal resistance detection.
TheiaEuk_PE has a number of required and optional inputs. The page linked below shows all input variables as they appear in the Terra workflow input form, along with descriptions of these variables, their default values, and links to the relevant sections of this documentation page.
<aside> ℹ️ Input read data
The TheiaEuk_PE workflow takes in Illumina paired-end read data. Read file names should end with .fastq or .fq, with the optional addition of .gz. When possible, Theiagen recommends zipping files with gzip prior to Terra upload to minimize data upload time.
By default, the workflow anticipates 2 x 150bp reads (i.e. the input reads were generated using a 300-cycle sequencing kit). Modifications to the optional trim_minlen parameter may be required to accommodate shorter read data, such as the 2 x 75bp reads generated using a 150-cycle sequencing kit.
</aside>
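The input requirements above can be checked locally before upload. The following is a minimal Python sketch, not part of the TheiaEuk_PE workflow itself: the helper names (has_valid_read_name, gzip_reads, max_read_length) are hypothetical, and the read-length check assumes standard four-line FASTQ records with unwrapped sequence lines.

```python
import gzip
import shutil
from pathlib import Path

# Suffixes the documentation lists as acceptable for input reads.
VALID_SUFFIXES = (".fastq", ".fq", ".fastq.gz", ".fq.gz")


def has_valid_read_name(path):
    """Return True if the name ends with .fastq or .fq, optionally followed by .gz."""
    return Path(path).name.endswith(VALID_SUFFIXES)


def gzip_reads(path):
    """Compress an uncompressed read file with gzip and return the new path."""
    src = Path(path)
    if src.name.endswith(".gz"):
        return src  # already compressed, nothing to do
    dst = src.with_name(src.name + ".gz")
    with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst


def max_read_length(path, n_reads=1000):
    """Return the longest sequence among the first n_reads FASTQ records.

    Assumes four lines per record (header, sequence, '+', qualities).
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    longest = 0
    with opener(path, "rt") as handle:
        for i, line in enumerate(handle):
            if i >= 4 * n_reads:
                break
            if i % 4 == 1:  # second line of each record holds the sequence
                longest = max(longest, len(line.strip()))
    return longest
```

If max_read_length reports reads well below 150bp (for example, 2 x 75bp data), that is a signal to review the trim_minlen setting before launching the workflow.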
versioning: Version capture for TheiaEuk
screen: Total Raw Read Quantification and Genome Size Estimation
rasusa: Read subsampling
read_QC_trim: Read Quality Trimming, Adapter Removal, and Quantification
shovill: De novo Assembly
QUAST: Assembly Quality Assessment
CG-Pipeline: Assessment of Read Quality, and Estimation of Genome Coverage
GAMBIT: Taxon Assignment
TS_MLST: MLST Profiling
QC_check: Check QC Metrics Against User-Defined Thresholds (optional)
The TheiaEuk workflow automatically activates taxa-specific sub-workflows after identification of relevant taxa using GAMBIT. Many of these taxa-specific workflows do not require any additional workflow inputs from the user.
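The automatic activation of taxa-specific sub-workflows can be pictured as a lookup keyed on the taxon that GAMBIT predicts. The Python sketch below is illustrative only: the taxon names, step names, and the select_sub_workflow_steps function are hypothetical placeholders and do not reflect the workflow's actual configuration or code.

```python
# Hypothetical registry: predicted taxon -> extra characterization steps.
# Taxon and step names are placeholders, not the workflow's real settings.
TAXON_STEPS = {
    "Example taxon A": ["clade_typing", "antifungal_resistance_detection"],
    "Example taxon B": ["antifungal_resistance_detection"],
}


def select_sub_workflow_steps(predicted_taxon):
    """Return the extra steps for a predicted taxon, or an empty list if none apply."""
    # Copy the list so callers cannot mutate the registry by accident.
    return list(TAXON_STEPS.get(predicted_taxon, []))


# A taxon in the registry triggers its extra steps; any other taxon
# passes through the core tasks only.
for taxon in ("Example taxon A", "Unlisted taxon"):
    steps = select_sub_workflow_steps(taxon)
    print(taxon, "->", steps if steps else "core tasks only")
```

Because the selection depends only on the predicted taxon, no extra user input is needed in this sketch, which mirrors the behavior described for many of the taxa-specific sub-workflows.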