The Lyve_SET WDL workflow runs the Lyve-SET pipeline (https://github.com/lskatz/lyve-SET), developed by Lee Katz et al., for phylogenetic analysis of bacterial genomes using high-quality single nucleotide polymorphisms (hqSNPs). The workflow identifies SNPs among a set of samples by mapping sequencing reads to a reference genome, identifying high-quality SNPs, and inferring a phylogeny using RAxML.
The Lyve_SET WDL workflow is run using read data from a set of samples. The workflow will produce a pairwise SNP matrix for the sample set and a maximum likelihood phylogenetic tree. Details regarding the default implementation of Lyve_SET and optional modifications are listed below.
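To make the idea of a pairwise SNP matrix concrete, here is a minimal Python sketch that counts differing positions between every pair of samples in a small alignment of SNP sites. The sample names, sequences, and the rule of skipping ambiguous calls ("N" or "-") are assumptions made for illustration. This is not how Lyve-SET itself computes the matrix.

```python
from itertools import combinations


def pairwise_snp_matrix(alignment):
    """Count differing positions between every pair of aligned sequences.

    Positions where either sample has an ambiguous call ('N' or '-') are
    skipped. That rule is an illustrative assumption, not Lyve-SET's
    documented behavior.
    """
    names = sorted(alignment)
    # Start with a square matrix of zeros (the diagonal stays at 0).
    matrix = {a: {b: 0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        dist = sum(
            1
            for x, y in zip(alignment[a], alignment[b])
            if x != y and x not in "N-" and y not in "N-"
        )
        # The matrix is symmetric, so fill both cells.
        matrix[a][b] = matrix[b][a] = dist
    return matrix


# Hypothetical alignment of SNP sites for three samples.
example = {
    "sample_A": "ACGTA",
    "sample_B": "ACGTT",
    "sample_C": "TCGNT",
}
snp_matrix = pairwise_snp_matrix(example)
```

Each cell of `snp_matrix` holds the number of unambiguous sites at which two samples differ, which is the kind of quantity a pairwise SNP matrix reports.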
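Read cleaning is controlled by the read_cleaner input variable.
Cliff and phage masking can be enabled by setting the mask_cliffs and mask_phages variables to "true".
The default read mapper and SNP caller are smalt and varscan. Additional options for each are available using the mapper and snpcaller input variables.
Variant-calling thresholds can be adjusted using the min_alt_frac and min_coverage input variables.
The multiple sequence alignment, SNP matrix, and phylogenetic trees can be skipped by setting nomsa = true, nomatrix = true, or notrees = true, respectively.
For full descriptions of Lyve-SET pipeline outputs, we recommend consulting the Lyve-SET documentation: https://github.com/lskatz/lyve-SET/blob/master/docs/OUTPUT.md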
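As a rough illustration of how the optional input variables named above might be gathered into a JSON inputs file, the Python sketch below builds a dictionary keyed by workflow name. The workflow prefix `lyveset_workflow`, the helper `build_inputs`, the boolean defaults of `False`, and the example override values are all assumptions made for illustration. Check the workflow's actual input names and types in Terra before relying on any of them.

```python
import json

# Hypothetical workflow name used as the key prefix; the real name may differ.
WORKFLOW = "lyveset_workflow"

# Variable names come from the text above; the default values are assumptions.
DEFAULT_OPTIONS = {
    "mapper": "smalt",       # read mapper named in the text
    "snpcaller": "varscan",  # SNP caller named in the text
    "mask_cliffs": False,
    "mask_phages": False,
    "nomsa": False,
    "nomatrix": False,
    "notrees": False,
}

# Variables named in the text for which no default value is assumed here.
OPTIONAL_WITHOUT_DEFAULT = {"read_cleaner", "min_alt_frac", "min_coverage"}


def build_inputs(**overrides):
    """Return a prefixed inputs dict, rejecting variable names not listed above."""
    unknown = set(overrides) - set(DEFAULT_OPTIONS) - OPTIONAL_WITHOUT_DEFAULT
    if unknown:
        raise ValueError(f"Unknown input variable(s): {sorted(unknown)}")
    options = {**DEFAULT_OPTIONS, **overrides}
    return {f"{WORKFLOW}.{name}": value for name, value in options.items()}


# Example: skip tree building and raise the coverage threshold.
# The value 20 is purely illustrative, not a recommended setting.
inputs = build_inputs(notrees=True, min_coverage=20)
inputs_json = json.dumps(inputs, indent=2)
```

Rejecting unknown names early is a small design choice that catches typos such as `notree` before a run is submitted.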
The following output files are populated to the Terra data table. However, please note that certain files may not appear in the data table following a run for two main reasons:
For example, if notrees = true, no tree file will appear.
In addition to these outputs, all of the files produced by the Lyve-SET pipeline are available in the task-level outputs, including intermediate files and individual BAM and VCF files for each sample. These files can be accessed by viewing the execution directory for the run.
Lyve-SET: Katz LS, Griswold T, Williams-Newkirk AJ, Wagner D, Petkau A, et al. (2017) A Comparative Analysis of the Lyve-SET Phylogenomics Pipeline for Genomic Epidemiology of Foodborne Pathogens. Frontiers in Microbiology 8.