
The SRA_Fetch workflow downloads sequence data from NCBI’s Sequence Read Archive (SRA). Given an SRA run accession, it populates a Terra data table with the associated read files.

Inputs

The only input for the SRA_Fetch workflow is a run accession: either an SRA run accession, which begins with “SRR”, or an ENA run accession, which begins with “ERR”. Please see the NCBI Metadata and Submission Overview for assistance with identifying accessions: https://www.ncbi.nlm.nih.gov/sra/docs/submitmeta/. Briefly, NCBI-accessioned objects have the following naming scheme:

STUDY SRP#
SAMPLE SRS#
EXPERIMENT SRX#
RUN SRR#

The workflow succeeds only when given a RUN-level accession; STUDY, SAMPLE, and EXPERIMENT accessions cause it to fail.
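
Because a non-run accession causes the workflow to fail, it can help to check accessions before launching it. The snippet below is a minimal sketch of such a check and is not part of SRA_Fetch itself. The function names and the pattern (a three-letter prefix followed by digits only) are assumptions made for illustration; the prefixes are the ones listed above.

```python
import re
from typing import Optional

# Prefixes from the naming scheme above, plus the ENA run prefix "ERR"
# mentioned under Inputs.
LEVEL_BY_PREFIX = {
    "SRP": "study",
    "SRS": "sample",
    "SRX": "experiment",
    "SRR": "run",
    "ERR": "run",
}

# Assumption: an accession is a three-letter prefix followed by digits only.
ACCESSION_PATTERN = re.compile(r"([A-Z]{3})(\d+)")


def accession_level(accession: str) -> Optional[str]:
    """Return 'study', 'sample', 'experiment', or 'run'; None if unrecognized."""
    match = ACCESSION_PATTERN.fullmatch(accession.strip().upper())
    if match is None:
        return None
    return LEVEL_BY_PREFIX.get(match.group(1))


def is_run_accession(accession: str) -> bool:
    """True only for run-level accessions (SRR or ERR prefixes)."""
    return accession_level(accession) == "run"
```

For example, `is_run_accession("SRX1234567")` returns `False` because SRX denotes an experiment, so that value should be replaced with the corresponding SRR accession before the workflow is run.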

Tasks/Actions

Read files associated with the SRA run accession provided as input are copied to your workspace’s associated Google bucket. Hyperlinks to those files are shown in the “read1” and “read2” columns of the Terra data table.

Outputs

This workflow produces output columns for the read data. Paired-end data populate both the read1 and read2 columns; single-end data populate only the read1 column.
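
The paired-end versus single-end rule can be sketched as follows. This is an illustration only, not the workflow's actual code: the function name `read_columns` is hypothetical, and the file naming (`<run>_1.fastq.gz` and `<run>_2.fastq.gz` for paired-end reads, `<run>.fastq.gz` for single-end reads) is an assumption about how the downloaded files are named.

```python
from typing import Dict, List


def read_columns(run_accession: str, filenames: List[str]) -> Dict[str, str]:
    """Map downloaded FASTQ files to the read1/read2 output columns.

    Assumed naming (not confirmed by the workflow documentation):
    paired-end files end in _1.fastq.gz and _2.fastq.gz, and a
    single-end file is named <run>.fastq.gz.
    """
    names = set(filenames)
    first = f"{run_accession}_1.fastq.gz"
    second = f"{run_accession}_2.fastq.gz"
    single = f"{run_accession}.fastq.gz"

    if first in names and second in names:
        # Paired-end: both columns are populated.
        return {"read1": first, "read2": second}
    if single in names:
        # Single-end: only read1 is populated.
        return {"read1": single}
    if first in names:
        # Single-end data that happens to carry a _1 suffix.
        return {"read1": first}
    raise ValueError(f"No read files found for {run_accession}")
```

Under these assumptions, a paired-end run yields a row with both columns, while a single-end run yields a row with read1 only.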

References

This workflow relies on fastq-dl (https://github.com/rpetit3/fastq-dl), a very handy bioinformatics tool by Robert A. Petit III.
