TheiaCoV Genomic Characterization
Sharing sample read and assembly data through internationally accessible databases allows insights to be drawn about how the virus is spreading and mutating across the globe. International researchers and public health scientists can then use these data to help all of us make stronger public health decisions.
The Mercury workflow series was developed to allow users to efficiently and accurately prepare submission files for GISAID, SRA, and GenBank submissions as well as BioSample registration. These workflows were developed to ingest read, assembly, and metadata files associated with SARS-CoV-2 amplicon reads from clinical samples and format these data for submission per the Public Health Alliance for Genomic Epidemiology (PHA4GE)’s SARS-CoV-2 Contextual Data Specifications.
A series of introductory training videos, which provide a conceptual overview of the methods and walkthrough tutorials on how to use these Mercury workflows through Terra, is available on the Theiagen Genomics YouTube page.
The Mercury workflows have been combined into a single workflow called Mercury_Prep_N_Batch, which performs the functions of both Mercury_PE/SE_Prep and Mercury_Batch. The legacy Mercury_PE/SE_Prep and Mercury_Batch workflows can be found in the PHVG repository.
This workflow processes read data, assembly files, and contextual metadata to prepare submissions for a group of samples.
<aside>
⚠️ By default, this workflow uses the read1_dehosted and read2_dehosted columns for your SRA read files and the assembly_fasta column for GenBank and GISAID. The workflow will not work if these columns are not present in your data table. If you want to use other files, the following two options should help:
using_clearlabs_data, when set to true, will change: read1_dehosted → clearlabs_fastq_gz; assembly_fasta → clearlabs_fasta; and assembly_mean_coverage → clearlabs_assembly_coverage.
using_reads_dehosted, when set to true, will only change read1_dehosted → reads_dehosted.
If both using_clearlabs_data and using_reads_dehosted are set to true, reads_dehosted will be used instead of clearlabs_fastq_gz, but all other using_clearlabs_data changes will still occur.
</aside>
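The column-selection rules in the note above can be summarized in a short Python sketch. This only illustrates the behavior as described on this page and is not the workflow's actual code; the function name resolve_columns and the dictionary keys are hypothetical. Because the note describes no change to read2_dehosted, the sketch leaves that column untouched.

```python
def resolve_columns(using_clearlabs_data=False, using_reads_dehosted=False):
    """Return the data-table column used for each input (illustrative only)."""
    # Defaults described in the note above.
    columns = {
        "read1": "read1_dehosted",
        "read2": "read2_dehosted",
        "assembly": "assembly_fasta",
        "coverage": "assembly_mean_coverage",
    }
    if using_clearlabs_data:
        # Clear Labs option swaps three columns.
        columns["read1"] = "clearlabs_fastq_gz"
        columns["assembly"] = "clearlabs_fasta"
        columns["coverage"] = "clearlabs_assembly_coverage"
    if using_reads_dehosted:
        # Applied last, so it takes precedence over clearlabs_fastq_gz.
        columns["read1"] = "reads_dehosted"
    return columns


# Both options set: reads_dehosted wins for read1, other Clear Labs changes remain.
print(resolve_columns(using_clearlabs_data=True, using_reads_dehosted=True))
```

Applying the using_reads_dehosted rule after the Clear Labs rule is one simple way to reproduce the stated precedence when both options are true.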
<aside>
⚠️ A new column, library_layout, is required for workflow success. Please use the updated metadata formatters below to ensure all required metadata is provided.
</aside>
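Before launching the workflow, it can help to confirm that the required columns are present in the data table. The following is a minimal Python sketch, assuming the table has been exported as tab-separated text; the helper name missing_columns and the example table contents are hypothetical and shown for illustration only.

```python
import csv
import io


def missing_columns(tsv_text, required):
    """Return the required column names that are absent from the TSV header."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader, [])  # an empty table yields an empty header
    present = set(header)
    return [name for name in required if name not in present]


# Hypothetical table export that lacks the library_layout column.
example = (
    "entity:sample_id\tread1_dehosted\tassembly_fasta\n"
    "sample1\tsample1_R1.fastq.gz\tsample1.fasta\n"
)
required = ["read1_dehosted", "assembly_fasta", "library_layout"]
print(missing_columns(example, required))  # ['library_layout']
```

The list of required columns here is an assumption drawn from the notes on this page; adjust it to match the columns your configuration actually uses.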
<aside> 📢 SARS-CoV-2 metadata formatter:
Mercury_Prep_N_Batch_SC2_Metadata_Formatter_2023_05_22.xlsx
Monkeypox metadata formatter:
Mercury_Prep_N_Batch_MPXV_Metadata_Formatter_2022_12_23.xlsx
</aside>