<aside> <img src="https://prod-files-secure.s3.us-west-2.amazonaws.com/be290196-9090-4f3c-a9ab-fe730ad213e0/39a844d4-3053-4f0c-a8c3-dc265e8f9325/Picture3.png" alt="https://prod-files-secure.s3.us-west-2.amazonaws.com/be290196-9090-4f3c-a9ab-fe730ad213e0/39a844d4-3053-4f0c-a8c3-dc265e8f9325/Picture3.png" width="40px" /> Running workflows on the command line requires working directly with WDL (Workflow Description Language), the language used to write and execute workflows. Frank has put together a great video describing 📺 WDL Task and Workflow Files, and you can find full instructions below on running these WDL workflows.
</aside>
You will need to have access to the WDL workflow file (.wdl) and any associated input files (such as reference genomes, input data files, etc.). To do this, complete the following steps:
If you don't already have Git installed on your system, install it first by following the official Git installation instructions for your operating system.
Open your terminal.
Create a directory where you want to store the cloned repository and navigate to it.
mkdir /path/to/your/desired/new/directory
cd /path/to/your/desired/new/directory
Clone the public_health_bioinformatics repository from GitHub using the following command:
git clone https://github.com/theiagen/public_health_bioinformatics.git
After running the command, Git will download all the repository files and set up a local copy in the directory you specified.
Change your working directory to the newly cloned repository:
cd public_health_bioinformatics
You're now inside the cloned repository's directory. Here, you should find all the files and directories from the GitHub repository.
You can verify that the repository has been cloned successfully by listing the contents of the current directory using the ls (on Linux/macOS) or dir (on Windows) command:
ls
This should display the files and directories within the public_health_bioinformatics repository.
Congratulations! You've successfully cloned the public_health_bioinformatics repository from GitHub to your local command line environment. You're now ready to proceed with running the bioinformatics analysis workflows using WDL as described in subsequent steps.
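To see which workflow files the cloned repository contains, a small helper along these lines may be useful. It is only a sketch: the function name list_wdl_files is made up for illustration, and it simply searches a directory tree for files ending in .wdl.

```shell
# Hypothetical helper (not part of the repository): list every WDL file
# under a directory, defaulting to the current directory.
list_wdl_files() {
  find "${1:-.}" -type f -name '*.wdl' | sort
}

# Run from inside the cloned repository to list its workflow and task files.
list_wdl_files .
```

Passing a path, for example list_wdl_files /path/to/your/desired/new/directory/public_health_bioinformatics, searches that directory instead of the current one.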
Docker and miniwdl will be required for command line execution. We will check if these are installed on your system and if not, install them now.
Open your terminal.
Navigate to the directory where your workflow and input files are located using the cd command:
cd /path/to/your/workflow/directory
Check if Docker is installed:
docker --version
If Docker is not installed, follow the official installation guide for your operating system: **https://docs.docker.com/get-docker/**
Check if miniwdl is installed:
miniwdl --version
If miniwdl is not installed, you can install it using pip:
pip install miniwdl
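The individual version checks above can be combined into one pass. The following is a minimal sketch, assuming a POSIX-compatible shell; check_tool is a hypothetical helper name, and it only tests whether each command is on your PATH, not whether the Docker daemon is actually running.

```shell
# Hypothetical helper: report whether a command is available on the PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Check every tool this tutorial relies on and remember if any is absent.
missing=0
for tool in git docker miniwdl; do
  check_tool "$tool" || missing=1
done
echo "missing=$missing"
```

If everything is found, miniwdl also provides a run_self_test subcommand (miniwdl run_self_test) that runs a small built-in workflow, which is a quick way to confirm that miniwdl and Docker work together on your machine.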
In a WDL (Workflow Description Language) workflow, an input JSON file supplies the values (strings, file paths, numbers, and so on) for the workflow's input variables. The names of the input variables must match the names of the inputs declared in the workflow file; the workflow files can be found within the Git repository that you cloned. Each input variable has a specific type, such as String, File, Int, Boolean, or Array, and the value given in the JSON file must be compatible with that type.
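As an illustration of how the different input types look in practice, the sketch below writes a small input.json from the command line. The workflow name my_workflow, every input name, and every path are placeholders invented for this example; replace them with the inputs declared in the workflow you actually intend to run. Keys follow the workflow_name.input_name form used by WDL input files.

```shell
# Write an example input file. All names and paths are hypothetical.
cat > input.json <<'EOF'
{
  "my_workflow.samplename": "sample01",
  "my_workflow.read1": "/path/to/sample01_R1.fastq.gz",
  "my_workflow.min_depth": 10,
  "my_workflow.skip_screen": false,
  "my_workflow.extra_files": ["/path/to/a.txt", "/path/to/b.txt"]
}
EOF
```

Here a String is a quoted value, a File is a quoted path, an Int is an unquoted number, a Boolean is true or false without quotes, and an Array is a JSON list whose elements follow the same rules.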
Run the workflow using miniwdl with the following command, replacing your_workflow.wdl with the actual filename of your WDL workflow and input.json with the filename of your input JSON file.
miniwdl run your_workflow.wdl --input input.json
You can monitor the progress of the workflow by checking the console output for updates and log messages. This can help you identify any potential issues or errors during execution.
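Once the workflow completes successfully, you will find the output files and results in the run directory that miniwdl creates. By default this is a timestamped directory under your current working directory; it contains an outputs.json file listing the workflow outputs and an out directory linking to the output files. You can choose a different location with the --dir option of miniwdl run.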
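A quick way to look at the results of a finished run is sketched below. It assumes miniwdl's default run-directory layout, in which each run directory holds an outputs.json file and a _LAST symbolic link next to the run directories points at the most recent run; show_run_outputs is a hypothetical helper name, not a miniwdl command.

```shell
# Hypothetical helper: print the outputs.json of a miniwdl run directory.
# With no argument it looks at _LAST, assumed to point at the latest run.
show_run_outputs() {
  run_dir="${1:-_LAST}"
  if [ ! -f "$run_dir/outputs.json" ]; then
    echo "no outputs.json found in $run_dir" >&2
    return 1
  fi
  cat "$run_dir/outputs.json"
}

# Example call, to be run from the directory where you started miniwdl:
# show_run_outputs _LAST
```

If the helper reports that outputs.json is missing, the run most likely did not finish; the log files inside the run directory are the place to look next.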
Conclusion
Reviewing the outputs of your bioinformatics workflow is a critical step to ensure the quality of your analysis. Logs, stderr, stdout, and generated output files provide valuable insights into the execution process and results. By carefully reviewing these outputs and addressing any issues, you can enhance the reliability and accuracy of your bioinformatics analysis.
Congratulations! You have successfully executed a bioinformatics analysis workflow using WDL on the command line. This tutorial covered the basic steps to run a WDL workflow using the miniwdl command line tool.
Remember that the specific steps and commands might vary depending on the details of your workflow, software versions, and environment. Be sure to consult the documentation for miniwdl, WDL, and any other tools you're using for more advanced usage and troubleshooting.
Happy analyzing!