Overview

TheiaValidate performs basic comparisons between user-designated columns in two separate tables. We anticipate this workflow being run to determine whether any differences exist between version releases or between two workflows, such as TheiaProk_ONT vs TheiaProk_Illumina_PE. It produces a summary PDF report and an Excel spreadsheet that lists the values for any columns whose content does not match for a sample.

<aside> ⚠️ The two tables being compared must contain the same number of samples, and the sample names must be identical. If the sample names differ, validation will not work; if the number of samples differs, validation will not be attempted.

</aside>
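The requirement above can be checked before launching the workflow. The following is a minimal sketch, not part of TheiaValidate itself; the function name `check_sample_sets` and the idea of passing sample names as plain Python lists are assumptions made for illustration.

```python
def check_sample_sets(samples1, samples2):
    """Return a list of problems; an empty list means the tables look comparable.

    Hypothetical helper: samples1 and samples2 are the sample names taken
    from the two tables to be compared.
    """
    problems = []
    # The two tables must hold the same number of samples.
    if len(samples1) != len(samples2):
        problems.append(
            f"sample counts differ: {len(samples1)} vs {len(samples2)}"
        )
    # The sample names must be identical (order does not matter here).
    only_in_1 = sorted(set(samples1) - set(samples2))
    only_in_2 = sorted(set(samples2) - set(samples1))
    if only_in_1:
        problems.append(f"only in table 1: {', '.join(only_in_1)}")
    if only_in_2:
        problems.append(f"only in table 2: {', '.join(only_in_2)}")
    return problems


if __name__ == "__main__":
    for problem in check_sample_sets(["s1", "s2"], ["s1", "s3"]):
        print(problem)
```

An empty result suggests the two tables satisfy the precondition; any returned message points to the samples that need to be reconciled first.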

In order to enable this workflow to function for different workflow series, we require users to provide a list of columns they want to compare between the two tables. Feel free to use the information below that Theiagen uses to compare versions of the three main workflow series as a starting point for your own validations:

Validation Starting Points

If additional validation metrics are desired, users can provide a validation_criteria_tsv file that specifies the type of comparison to perform for each column. There are several options for additional validation checks:

Inputs

Please note that all string inputs must be enclosed in quotes; for example, "column1,column2" or "workspace1".

Terra Inputs

The optional validation_criteria_tsv file takes the following format (tab-delimited; a header line is required):

column_name	criteria
columnB	SET
columnC	IGNORE
columnD	0.01
columnE	EXACT

Please see the overview section for a description of all available criteria options (EXACT, IGNORE, SET, <PERCENT_DIFF>).
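To illustrate how a criteria file in this format might be read and applied, here is a minimal sketch. It is not the workflow's actual implementation. The function names `load_criteria` and `values_match` are hypothetical, and the meaning given to each criterion is an assumption: EXACT as a strict string match, IGNORE as always passing, SET as an order-insensitive comparison of comma-separated items, and a numeric value as a maximum relative difference measured against the mean of the two values.

```python
import csv
import io


def load_criteria(tsv_text):
    """Parse tab-delimited text with a 'column_name<TAB>criteria' header."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {row["column_name"]: row["criteria"] for row in reader}


def values_match(a, b, criterion):
    """Compare two cell values under one criterion (assumed semantics)."""
    crit = criterion.strip().upper()
    if crit == "IGNORE":
        return True  # assumption: the column is skipped entirely
    if crit == "EXACT":
        return a == b  # assumption: strict string equality
    if crit == "SET":
        # assumption: comma-separated items compared regardless of order
        def split(value):
            return {item.strip() for item in value.split(",")}
        return split(a) == split(b)
    # Otherwise treat the criterion as a numeric threshold, e.g. 0.01.
    threshold = float(criterion)
    x, y = float(a), float(b)
    if x == y:
        return True
    # assumption: difference relative to the mean magnitude of both values
    return abs(x - y) / ((abs(x) + abs(y)) / 2) <= threshold


EXAMPLE_TSV = (
    "column_name\tcriteria\n"
    "columnB\tSET\n"
    "columnC\tIGNORE\n"
    "columnD\t0.01\n"
    "columnE\tEXACT\n"
)

if __name__ == "__main__":
    criteria = load_criteria(EXAMPLE_TSV)
    print(values_match("geneA,geneB", "geneB, geneA", criteria["columnB"]))
    print(values_match("100", "110", criteria["columnD"]))
```

Parsing with `csv.DictReader` and an explicit tab delimiter also makes formatting mistakes visible early: a row that uses a space instead of a tab will not yield a `criteria` value.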

Outputs

Terra Outputs

Example Data and Outputs

If the above inputs are provided, then the following output files will be generated:

example.xlsx

example.pdf
