Workflows
What is a Workflow?Filters
GermlineStructuralV-nf is a pipeline for identifying structural variant events in human Illumina short read whole genome sequence data. GermlineStructuralV-nf identifies structural variant and copy number events from BAM files using Manta, Smoove, and TIDDIT. Variants are then merged using SURVIVOR, ...
Type: Nextflow
Creators: Georgina Samaha, Marina Kennerson, Tracy Chew, Sarah Beecroft
Submitter: Georgina Samaha
Fastq-to-BAM @ NCI-Gadi is a genome alignment workflow that takes raw FASTQ files, aligns them to a reference genome and outputs analysis ready BAM files. This workflow is designed for the National Computational Infrastructure's (NCI) Gadi supercompter, leveraging multiple nodes on NCI Gadi to run all stages of the workflow in parallel, either massively parallel using the scatter-gather approach or parallel by sample. It consists of a number of stages and follows the BROAD Institute's best practice ...
Type: Shell Script
Creators: Cali Willet, Tracy Chew, Georgina Samaha, Rosemarie Sadsad, Andrey Bliznyuk, Ben Menadue, Rika Kobayashi, Matthew Downton, Yue Sun
Submitter: Georgina Samaha
GermlineShortV_biovalidation
Type: Shell Script
Creators: Georgina Samaha, Tracy Chew, Cali Willet, Nandan Deshpande
Submitter: Georgina Samaha
Germline-ShortV @ NCI-Gadi is an implementation of the BROAD Institute's best practice workflow for germline short variant discovery. This implementation is optimised for the National Compute Infrastucture's Gadi HPC, utilising scatter-gather parallelism to enable use of multiple nodes with high CPU or memory efficiency. This workflow requires sample BAM files, which can be generated using the Fastq-to-bam @ NCI-Gadi pipeline. Germline-ShortV can be applied ...
Type: Shell Script
Creators: Rosemarie Sadsad, Georgina Samaha, Tracy Chew, Cali Willet
Submitter: Tracy Chew
Somatic-ShortV @ NCI-Gadi is a variant calling pipeline that calls somatic short variants (SNPs and indels) from tumour and matched normal BAM files following GATK's Best Practice Workflow. This workflow is designed for the National Computational Infrastructure's (NCI) Gadi supercompter, leveraging multiple nodes on NCI Gadi to run all stages of the workflow in parallel. ...
The Flashlite-Supernova pipeline runs Supernova to generate phased whole-genome de novo assemblies from a Chromium prepared library on University of Queensland's HPC, Flashlite.
Infrastructure_deployment_metadata: FlashLite (QRISCloud)
Bootstrapping-for-BQSR @ NCI-Gadi is a pipeline for bootstrapping a variant resource to enable GATK base quality score recalibration (BQSR) for non-model organisms that lack a publicly available variant resource. This implementation is optimised for the National Compute Infrastucture's Gadi HPC. Multiple rounds of bootstrapping can be performed. Users can use Fastq-to-bam @ NCI-Gadi and Germline-ShortV @ NCI-Gadi to ...
SLURM HPC Cromwell implementation of GATK4 germline variant calling pipeline
See the GATK website for more information on this toolset
Assumptions
- Using hg38 human reference genome build
- Running using HPC/SLURM scheduling. This repo was specifically tested on Pawsey Zeus machine, primarily running in the
/scratch
partition. - Starting from short-read Illumina paired-end fastq files as input
Dependencies
The following versions have been ...
Local Cromwell implementation of GATK4 germline variant calling pipeline
See the GATK website for more information on this toolset
Assumptions
- Using hg38 human reference genome build
- Running 'locally' i.e. not using HPC/SLURM scheduling, or containers. This repo was specifically tested on Pawsey Nimbus 16 CPU, 64GB RAM virtual machine, primarily running in the
/data
volume storage partition. - Starting from short-read Illumina paired-end fastq ...