Short read quality control, trimming and contamination filter

Workflow Type: Common Workflow Language
Stable

Workflow for quality control, trimming and filtering of short paired-end reads.
Multiple paired datasets are merged into a single paired dataset.
Summary:

  • Sequali QC on raw data files
  • fastp for read quality trimming
  • BBduk for phiX and rRNA filtering (optional)
  • Filter human reads using Hostile (optional)
  • Custom read filtering using Hostile (optional)
  • Sequali QC on filtered (merged) data

Other UNLOCK workflows on WorkflowHub: https://workflowhub.eu/projects/16/workflows?view=default

All tool CWL files and other workflows can be found at:
https://gitlab.com/m-unlock/cwl

How to set up and use an UNLOCK workflow:
https://docs.m-unlock.nl/docs/workflows/setup.html


Inputs

ID Name Description Type
identifier Identifier Identifier for this dataset used in this workflow.
  • string
threads Number of threads Number of threads to use for computational processes. (default 2)
  • int
memory Maximum memory in MB Maximum memory usage in MegaBytes. (default 8000)
  • int
forward_reads Forward reads Local FASTQ file(s) with the forward reads
  • File[]
reverse_reads Reverse reads Local FASTQ file(s) with the reverse reads
  • File[]
do_not_output_filtered_reads Don't output reads. Do not output filtered reads. (default false)
  • boolean
skip_qc_unfiltered Skip QC unfiltered Skip Sequali QC of the raw input reads (default false)
  • boolean
skip_qc_filtered Skip QC filtered Skip Sequali QC of the filtered reads (default false)
  • boolean
filter_rrna filter rRNA Optionally remove rRNA sequences from the reads (default false)
  • boolean
deduplicate Deduplicate reads Remove exact duplicate reads with fastp. (default false)
  • boolean
humandb Filter human reads Bowtie2 index folder. Provide the folder in which the index files are located.
  • Directory?
reference_filter_db Filter reference file(s) Custom reference database for filtering with Hostile. Provide the folder in which the bowtie2 index files are located. (default false)
  • Directory?
keep_reference_mapped_reads Keep mapped reads Keep the reads that map to the given reference and discard the unmapped reads. (default false: mapped reads are discarded)
  • boolean
destination Output Destination Optional output destination only used for cwl-prov reporting.
  • string?
source Input URLs used for this run A provenance element to capture the original source of the input data
  • string[]?
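
The inputs above are normally supplied to a CWL runner as a job file. The following is a minimal sketch in Python that assembles such a job object and prints it as JSON. The input IDs and the defaults for threads and memory come from the table above. The helper names (file_obj, build_job), the sample identifier, all file and folder paths, and the check that forward and reverse lists have equal length are assumptions made for illustration only.

```python
import json


def file_obj(path):
    """Return a CWL File object for a local path."""
    return {"class": "File", "path": str(path)}


def build_job(identifier, forward, reverse, humandb=None, threads=2, memory=8000):
    """Build a job-input dictionary keyed by the input IDs listed above.

    build_job is a hypothetical helper, not part of the workflow itself.
    """
    # Assumption: every forward file has exactly one matching reverse file.
    if len(forward) != len(reverse):
        raise ValueError("forward_reads and reverse_reads must pair up one-to-one")
    job = {
        "identifier": identifier,
        "threads": threads,      # default 2 according to the inputs table
        "memory": memory,        # in MB, default 8000 according to the inputs table
        "forward_reads": [file_obj(p) for p in forward],
        "reverse_reads": [file_obj(p) for p in reverse],
        "filter_rrna": False,
        "deduplicate": False,
    }
    # Optional inputs (type Directory?) are only added when provided.
    if humandb is not None:
        job["humandb"] = {"class": "Directory", "path": str(humandb)}
    return job


# Hypothetical sample name and paths, for illustration only.
job = build_job(
    "sample1",
    ["run1_R1.fastq.gz", "run2_R1.fastq.gz"],
    ["run1_R2.fastq.gz", "run2_R2.fastq.gz"],
    humandb="/data/hostile/human_index",
)
print(json.dumps(job, indent=2))
```

The printed JSON can be saved to a file and passed to a CWL runner together with the workflow file. Optional inputs that are left out, such as reference_filter_db here, are simply omitted from the job object.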

Steps

ID Name Description
workflow_merge_pe_reads Merge paired reads Merge multiple forward and reverse fastq reads to single file objects
sequali_illumina_before Sequali before Quality assessment and report of reads before filtering
fastp fastp Read quality filtering and (barcode) trimming.
rrna_filter rRNA filter (bbduk) Filters rRNA sequences from reads using bbduk
human_filter Human filter Filter human reads from the dataset using Hostile
reference_filter Custom reference filter Filter reads using custom references with Hostile
phix_filter PhiX filter (bbduk) Filters Illumina spike-in PhiX sequences from reads using bbduk
sequali_illumina_after Sequali after Quality assessment and report of reads after filtering
reports_files_to_folder Reports to folder Preparation of QC output files to a specific output folder
out_fwd_reads Output fwd reads Step needed to output the filtered forward reads, because outputting them is optional.
out_rev_reads Output rev reads Step needed to output the filtered reverse reads, because outputting them is optional.

Outputs

ID Name Description Type
reports_folder Filtering reports folder Folder containing all reports of filtering and quality control
  • Directory
QC_forward_reads Filtered forward read Filtered forward read
  • File?
QC_reverse_reads Filtered reverse read Filtered reverse read
  • File?
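
Both read outputs have the optional type File?, so a consumer of the results should expect them to be absent, for example when do_not_output_filtered_reads is set. The sketch below, in Python, shows one way to handle that. It assumes the CWL runner reports the workflow outputs as a JSON object keyed by the output IDs above (cwltool prints such an object to standard output). The function name summarize_outputs and the example paths are hypothetical.

```python
import json

OUTPUT_IDS = ("reports_folder", "QC_forward_reads", "QC_reverse_reads")


def summarize_outputs(cwl_output_json):
    """Map each output ID to its path, or to None when the output is absent."""
    outputs = json.loads(cwl_output_json)
    summary = {}
    for key in OUTPUT_IDS:
        value = outputs.get(key)
        if value is None:
            # Optional File? outputs are reported as null when not produced.
            summary[key] = None
        else:
            # Prefer a local path, fall back to the location URI.
            summary[key] = value.get("path") or value.get("location")
    return summary


# Hypothetical runner output for a run that did not emit filtered reads.
example = json.dumps({
    "reports_folder": {"class": "Directory", "path": "/results/sample1_reports"},
    "QC_forward_reads": None,
    "QC_reverse_reads": None,
})
summary = summarize_outputs(example)
print(summary)
```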

Version History

Version 2 (latest) Created 6th Nov 2025 at 13:41 by Bart Nijsse

Major changes: Replaced FastQC with Sequali. Reference and contamination filtering is now done using Hostile instead of BBduk (requires pre-built indexes).


Open master f0dcdb1

Version 1 (earliest) Created 21st Apr 2022 at 14:00 by Bart Nijsse

Initial commit


Frozen Version-1 5c2e0e5

Created: 21st Apr 2022 at 14:00

Last updated: 6th Nov 2025 at 16:37
