Overview
RASUSA randomly downsamples raw reads to a user-defined threshold.
Use Cases:
- to reduce computing resources when samples end up with drastically more data than needed to perform analyses
- to perform limit of detection (LOD) studies to identify appropriate minimum coverage thresholds required to perform downstream analyses
Desired size may be specified by inputting any one of the following:
- coverage (e.g. 20X)
- number of bases (e.g. "5m" for 5 megabases)
- number of reads (e.g. 100000 total reads)
- fraction of reads (e.g. 0.5 to sample half the reads)
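The options above can be related to one another with a little arithmetic. The following is a minimal Python sketch, not RASUSA's own code: `parse_size`, `bases_for_coverage`, and `reads_for_fraction` are hypothetical helpers, and the sketch assumes that metric suffixes are powers of 1000 and that a coverage target is converted to a base count by multiplying by an estimated genome size.

```python
import re

# Hypothetical helpers for illustration only; not part of RASUSA or the workflow.
# Assumption: suffixes are metric (k = 1e3, m = 1e6, g = 1e9).
_SUFFIXES = {"": 1, "k": 10**3, "m": 10**6, "g": 10**9}


def parse_size(text: str) -> int:
    """Convert a size string such as "5m" into a number of bases."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([kmg]?)b?\s*", text.lower())
    if match is None:
        raise ValueError(f"unrecognised size string: {text!r}")
    number, suffix = match.groups()
    return round(float(number) * _SUFFIXES[suffix])


def bases_for_coverage(coverage: float, genome_size: str) -> int:
    """Assumed relationship: target bases = coverage x estimated genome size."""
    return round(coverage * parse_size(genome_size))


def reads_for_fraction(total_reads: int, fraction: float) -> int:
    """Number of reads kept when sampling a fraction of the input."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return round(total_reads * fraction)


if __name__ == "__main__":
    print(parse_size("5m"))                 # bases in "5m"
    print(bases_for_coverage(20, "5m"))     # bases needed for 20X of a 5 Mb genome
    print(reads_for_fraction(200000, 0.5))  # reads kept at fraction 0.5
```

Under these assumptions, a 20X target for a 5-megabase genome corresponds to 100 megabases of sequence, which can help when choosing between a coverage and a number-of-bases threshold.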
<aside>
NOTE
- If using RASUSA_PHB workflow version v2.0.0 or higher, the call-caching feature of Terra has been DISABLED to ensure that the workflow is run from the beginning and data is downloaded fresh. Call-caching will not be enabled, even if the user checks the box in the Terra workflow interface.
</aside>
Inputs
Enter all String values in quotation marks (e.g. "5m"), except for Strings selected from a dropdown.
Required Inputs
Optional Inputs
Outputs
Don't Forget! Remember to use the subsampled reads in downstream analyses by supplying this.read1_subsampled and this.read2_subsampled as inputs.
Before running downstream analyses, verify that the reads were successfully subsampled by comparing the subsampled read file size(s) to the original read file size(s). To view a file's size, click on the read file listed in the Terra data table.
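File size is only a rough proxy. If the read files are available locally, counting records gives a more direct check. The sketch below is an illustration under stated assumptions, not part of the workflow: `count_fastq_records`, `count_reads`, and `kept_fraction` are hypothetical helpers, and the FASTQ files are assumed to use the standard four-lines-per-record layout.

```python
import gzip
from typing import Iterable

# Hypothetical helpers for illustration only; not part of RASUSA or the workflow.
# Assumption: each FASTQ record spans exactly four lines.


def count_fastq_records(lines: Iterable[str]) -> int:
    """Count records in an iterable of FASTQ lines, ignoring blank lines."""
    non_empty = sum(1 for line in lines if line.strip())
    return non_empty // 4


def count_reads(path: str) -> int:
    """Count reads in a plain or gzip-compressed FASTQ file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return count_fastq_records(handle)


def kept_fraction(original_reads: int, subsampled_reads: int) -> float:
    """Fraction of the original reads that remain after subsampling."""
    if original_reads <= 0:
        raise ValueError("original_reads must be positive")
    return subsampled_reads / original_reads
```

For example, `kept_fraction(count_reads("sample_R1.fastq.gz"), count_reads("sample_R1_subsampled.fastq.gz"))` would report the share of reads retained; both file names here are placeholders.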
Terra Outputs
References