PHBG workflows
The kSNP3 workflow performs phylogenetic analysis of bacterial genomes using single nucleotide polymorphisms (SNPs). It identifies SNPs among a set of genome assemblies, then calculates phylogenetic trees based on those SNPs: a pan-genome tree (_pan) and a core-genome tree (_core).
This workflow also features an optional module, summarize_data, that creates a presence/absence matrix for the analyzed samples from a list of indicated columns (such as AMR genes or plasmid types). If the phandango_coloring variable is set to true, the matrix is formatted for visualization in Phandango; otherwise it can be viewed in Excel.
The ksnp3 workflow is run on the set of assembly files to produce both pan-genome and core-genome phylogenies. It also produces alignment files, which are used by snp-dists to generate a pairwise SNP distance matrix for both the pan-genome and the core genome.
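To make the idea of a pairwise SNP distance matrix concrete, here is a minimal Python sketch. It is not the snp-dists implementation. It assumes the alignment is already loaded as a mapping from sample name to equal-length sequence, and it counts only positions where both samples have A, C, G or T. The sample names and sequences are made up for illustration.

```python
from itertools import combinations

# Only unambiguous bases are compared; gaps and N are skipped (an assumption
# made for this sketch).
VALID_BASES = set("ACGT")


def snp_distance(seq_a, seq_b):
    """Count aligned positions where two sequences carry different valid bases."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a in VALID_BASES and b in VALID_BASES
    )


def distance_matrix(alignment):
    """Build a symmetric sample-by-sample matrix of SNP distances."""
    names = list(alignment)
    matrix = {n: {m: 0 for m in names} for n in names}
    for n, m in combinations(names, 2):
        d = snp_distance(alignment[n], alignment[m])
        matrix[n][m] = d
        matrix[m][n] = d
    return matrix


# Hypothetical toy alignment, for illustration only.
toy_alignment = {
    "sample_A": "ACGTAC",
    "sample_B": "ACGTTC",
    "sample_C": "ATGTT-",
}
toy_matrix = distance_matrix(toy_alignment)
```

In the workflow, the same kind of matrix is produced once from the pan-genome alignment and once from the core-genome alignment.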
The optional summarize_data task runs only if all of the data_summary_* and sample_names optional variables are filled out.
The task takes a comma-separated list of column names, such as "amrfinderplus_virulence_genes,amrfinderplus_stress_genes", that can be found within the origin Terra data table. It then reads the contents of those columns for each sample. For example, if the amrfinder_amr_genes column for a sample contains the values "aph(3')-IIIa,tet(O),blaOXA-193", the summarize_data task checks every sample in the set to see whether those AMR genes were also detected in it.
By default, this task appends a Phandango coloring tag so that all items from the same column are colored the same; this can be turned off by setting the optional phandango_coloring variable to false.
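The presence/absence logic of summarize_data can be sketched in Python as follows. This is an illustration of the idea, not the task's actual code. The input structure, the function name, and the choice of "TRUE" versus an empty cell are assumptions, and the Phandango coloring tag is left out because its exact format is not described here.

```python
def presence_absence(samples, columns):
    """Return (header, rows) marking which items each sample carries.

    samples: dict mapping sample name -> dict of column name -> comma-separated values
    columns: list of column names to summarize
    """
    per_sample = {}
    all_items = []  # keeps first-seen order so the output columns are stable
    for name, row in samples.items():
        found = set()
        for col in columns:
            raw = row.get(col) or ""
            for item in raw.split(","):
                item = item.strip()
                if not item:
                    continue
                found.add(item)
                if item not in all_items:
                    all_items.append(item)
        per_sample[name] = found

    header = ["sample"] + all_items
    rows = [
        [name] + ["TRUE" if item in per_sample[name] else "" for item in all_items]
        for name in samples
    ]
    return header, rows


# Hypothetical table contents, reusing the gene names from the example above.
example_samples = {
    "sample_1": {"amrfinder_amr_genes": "aph(3')-IIIa,tet(O),blaOXA-193"},
    "sample_2": {"amrfinder_amr_genes": "tet(O)"},
    "sample_3": {"amrfinder_amr_genes": ""},
}
header, rows = presence_absence(example_samples, ["amrfinder_amr_genes"])
```

Each row can then be written out with the standard csv module and opened in a spreadsheet.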