Overview

Sharing sample read and assembly data through internationally accessible databases allows insights to be drawn about how a virus is spreading and mutating across the globe. International researchers and public health scientists can then use these data to help all of us make stronger public health decisions.

The Mercury workflow series was developed to allow users to efficiently and accurately prepare files for GISAID, SRA, and GenBank submissions as well as BioSample registration. These workflows were developed to ingest read, assembly, and metadata files associated with SARS-CoV-2 amplicon reads from clinical samples and format these data for submission per the Public Health Alliance for Genomic Epidemiology (PHA4GE)’s SARS-CoV-2 Contextual Data Specifications.

A series of introductory training videos is available on the Theiagen Genomics YouTube page. The videos provide a conceptual overview of the methods and walkthrough tutorials on how to use these Mercury workflows through Terra:

https://youtu.be/h8YASVckOrw

Mercury_Prep_N_Batch

The Mercury workflows have been combined into a single workflow, Mercury_Prep_N_Batch, which performs the functions of both Mercury_PE/SE_Prep and Mercury_Batch. The legacy workflows can still be found in the PHVG repository.

This workflow processes read data, assembly files, and contextual metadata to prepare submissions for a group of samples.

<aside> ⚠️ By default, this workflow uses the read1/read2_dehosted columns for your SRA read files and the assembly_fasta column for GenBank and GISAID. The workflow will not work if these columns are not present in your data table. If you want to use other files, the following two options should help:

using_clearlabs_data, when set to true, will change:
- read1_dehosted → clearlabs_fastq_gz;
- assembly_fasta → clearlabs_fasta; and
- assembly_mean_coverage → clearlabs_assembly_coverage.

using_reads_dehosted, when set to true, will change only read1_dehosted → reads_dehosted.

If both using_clearlabs_data and using_reads_dehosted are set to true, reads_dehosted will be used instead of clearlabs_fastq_gz, but all other using_clearlabs_data changes will still occur.

</aside>
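The column selection described above can be sketched in Python as a small lookup. This is only an illustration of the documented rules, not the workflow's actual code: the function name `resolve_columns` and the role keys (`read1`, `read2`, `assembly`, `coverage`) are hypothetical, and the sketch assumes that read2_dehosted is left unchanged by both options, since no replacement for it is described.

```python
from typing import Dict


def resolve_columns(using_clearlabs_data: bool = False,
                    using_reads_dehosted: bool = False) -> Dict[str, str]:
    """Return the data table column used for each role under the given options.

    Hypothetical helper; the role keys are illustrative names only.
    """
    # Default columns named in the note above.
    columns = {
        "read1": "read1_dehosted",
        "read2": "read2_dehosted",  # assumption: not affected by either option
        "assembly": "assembly_fasta",
        "coverage": "assembly_mean_coverage",
    }
    if using_clearlabs_data:
        columns["read1"] = "clearlabs_fastq_gz"
        columns["assembly"] = "clearlabs_fasta"
        columns["coverage"] = "clearlabs_assembly_coverage"
    if using_reads_dehosted:
        # Applied last so that reads_dehosted replaces clearlabs_fastq_gz
        # when both options are true; the other changes are kept.
        columns["read1"] = "reads_dehosted"
    return columns


if __name__ == "__main__":
    for role, column in resolve_columns(True, True).items():
        print(f"{role}: {column}")
```

Listing the resolved column names for your chosen options and comparing them against the columns in your data table is a quick way to confirm that the required columns are present before launching the workflow.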

<aside> ⚠️ A new column library_layout is required for workflow success. Please use the updated metadata formatters below to ensure all required metadata is provided.

</aside>
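A simple pre-flight check for this requirement might look like the sketch below, which scans a tab-separated export of a data table for rows where library_layout is absent or blank. The function name `rows_missing_library_layout` and the `id_column` parameter are hypothetical, and the sketch assumes the table is available as TSV text; it does not check which values the workflow accepts.

```python
import csv
import io
from typing import List


def rows_missing_library_layout(tsv_text: str, id_column: str) -> List[str]:
    """Return the IDs of rows whose library_layout value is absent or blank.

    Hypothetical helper; id_column is whichever column identifies a sample.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    if "library_layout" not in (reader.fieldnames or []):
        raise ValueError("column 'library_layout' is not present in the table")
    # Short rows yield None for missing fields, so treat None as blank.
    return [row[id_column] for row in reader
            if not (row.get("library_layout") or "").strip()]
```

Any sample IDs returned by the check would need a library_layout value filled in, for example through the metadata formatter, before the workflow is run.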

<aside> 📢 SARS-CoV-2 metadata formatter:

Mercury_Prep_N_Batch_SC2_Metadata_Formatter_2023_05_22.xlsx

Monkeypox metadata formatter:

Mercury_Prep_N_Batch_MPXV_Metadata_Formatter_2022_12_23.xlsx

</aside>

User Inputs

Outputs
