Workflow for quality control, trimming and filtering of short paired-end reads.
Multiple paired datasets are merged into a single paired dataset.
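As a rough illustration of what merging multiple paired datasets means, the sketch below concatenates forward and reverse FASTQ files in the same order so that mates stay in sync. It is not the workflow's own merge step; the function names, the `suffix` parameter and the output naming are hypothetical.

```python
import shutil
from pathlib import Path


def merge_fastq(parts, merged):
    """Concatenate FASTQ files, in the given order, into one file.

    Byte-wise concatenation works for plain-text FASTQ and for gzip
    files alike (concatenated gzip members form a valid gzip stream),
    as long as the two kinds are not mixed.
    """
    with open(merged, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
    return Path(merged)


def merge_pairs(forward, reverse, out_prefix, suffix=".fastq"):
    """Merge forward and reverse files in matching order.

    The i-th forward file must belong to the i-th reverse file,
    otherwise read pairs in the merged files no longer line up.
    """
    if len(forward) != len(reverse):
        raise ValueError("need exactly one reverse file per forward file")
    fwd = merge_fastq(forward, f"{out_prefix}_1{suffix}")
    rev = merge_fastq(reverse, f"{out_prefix}_2{suffix}")
    return fwd, rev
```

The important design point is the shared ordering: both lists are walked in the same order, so the n-th read in the merged forward file still pairs with the n-th read in the merged reverse file.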
Summary:
- Sequali QC on raw data files
- fastp for read quality trimming
- BBduk for phiX and rRNA filtering (optional)
- Filter human reads using Hostile (optional)
- Custom read filtering using Hostile (optional)
- Sequali QC on filtered (merged) data
Other UNLOCK workflows on WorkflowHub: https://workflowhub.eu/projects/16/workflows?view=default
All tool CWL files and other workflows can be found at:
https://gitlab.com/m-unlock/cwl
How to set up and use an UNLOCK workflow:
https://docs.m-unlock.nl/docs/workflows/setup.html
Inputs
| ID | Name | Description | Type |
|---|---|---|---|
| identifier | Identifier | Identifier for this dataset used in this workflow. | |
| threads | Number of threads | Number of threads to use for computational processes. (default 2) | |
| memory | Maximum memory in MB | Maximum memory usage in megabytes. (default 8000) | |
| forward_reads | Forward reads | Local forward-read FASTQ file(s). | |
| reverse_reads | Reverse reads | Local reverse-read FASTQ file(s). | |
| do_not_output_filtered_reads | Don't output reads | Do not output filtered reads. (default false) | |
| skip_qc_unfiltered | Skip QC unfiltered | Skip Sequali QC of raw input reads. (default false) | |
| skip_qc_filtered | Skip QC filtered | Skip Sequali QC of filtered reads. (default false) | |
| filter_rrna | Filter rRNA | Optionally remove rRNA sequences from the reads. (default false) | |
| deduplicate | Deduplicate reads | Remove exact duplicate reads with fastp. (default false) | |
| humandb | Filter human reads | Bowtie2 index folder. Provide the folder in which the index files are located. | |
| reference_filter_db | Filter reference file(s) | Custom reference database for filtering with Hostile. Provide the folder in which the Bowtie2 index files are located. (default false) | |
| keep_reference_mapped_reads | Keep mapped reads | Keep reads mapped to the given reference and discard unmapped reads. (default false: discard mapped reads) | |
| destination | Output Destination | Optional output destination, only used for cwl-prov reporting. | |
| source | Input URLs used for this run | A provenance element to capture the original source of the input data. | |
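To show how the inputs above fit together, here is a minimal sketch that builds a CWL job object and serialises it as JSON, which CWL runners accept as a job file. The input IDs and defaults come from the table; the input types are assumptions (a list of `File` objects for the reads, a `Directory` for `humandb`), and the helper names and example paths are hypothetical.

```python
import json


def file_obj(path):
    """CWL File object for a local path."""
    return {"class": "File", "path": path}


def dir_obj(path):
    """CWL Directory object for a local path."""
    return {"class": "Directory", "path": path}


def build_job(identifier, forward, reverse, humandb=None,
              threads=2, memory=8000, filter_rrna=False, deduplicate=False):
    """Build a job object keyed by the workflow's input IDs.

    Assumption: forward_reads and reverse_reads take lists of File
    objects and humandb takes a Directory. Check the workflow's CWL
    file for the actual types before using this.
    """
    job = {
        "identifier": identifier,
        "threads": threads,          # default 2, as in the inputs table
        "memory": memory,            # default 8000 MB, as in the inputs table
        "forward_reads": [file_obj(p) for p in forward],
        "reverse_reads": [file_obj(p) for p in reverse],
        "filter_rrna": filter_rrna,
        "deduplicate": deduplicate,
    }
    if humandb is not None:
        # Optional: only set when human read filtering is wanted.
        job["humandb"] = dir_obj(humandb)
    return job


if __name__ == "__main__":
    # Hypothetical sample name and paths, for illustration only.
    job = build_job(
        "sample1",
        forward=["reads/sample1_1.fastq.gz"],
        reverse=["reads/sample1_2.fastq.gz"],
        humandb="databases/human_index",
    )
    print(json.dumps(job, indent=2))
```

Optional inputs that are left out of the job object fall back to the defaults listed in the table.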
Steps
| ID | Name | Description |
|---|---|---|
| workflow_merge_pe_reads | Merge paired reads | Merge multiple forward and reverse fastq reads to single file objects |
| sequali_illumina_before | Sequali before | Quality assessment and report of reads before filtering |
| fastp | fastp | Read quality filtering and (barcode) trimming. |
| rrna_filter | rRNA filter (BBduk) | Filters rRNA sequences from reads using BBduk |
| human_filter | Human filter | Filter human reads from the dataset using Hostile |
| reference_filter | Custom reference filter | Filter reads using custom references with Hostile |
| phix_filter | PhiX filter (BBduk) | Filters Illumina spike-in PhiX sequences from reads using BBduk |
| sequali_illumina_after | Sequali after | Quality assessment and report of reads after filtering |
| reports_files_to_folder | Reports to folder | Preparation of QC output files to a specific output folder |
| out_fwd_reads | Output fwd reads | Step needed to output the filtered forward reads, because writing them is optional. |
| out_rev_reads | Output rev reads | Step needed to output the filtered reverse reads, because writing them is optional. |
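The way the optional inputs switch steps on and off can be pictured with the sketch below. It is a guess at the selection logic based only on the input and step descriptions, not the workflow's actual wiring: the order follows the steps table, the PhiX filter is assumed to always run, and the human and reference filters are assumed to run only when their index folder is provided. The function name `planned_steps` is hypothetical.

```python
def planned_steps(skip_qc_unfiltered=False, skip_qc_filtered=False,
                  filter_rrna=False, humandb=None, reference_filter_db=None,
                  do_not_output_filtered_reads=False):
    """Return the step IDs expected to run for a given set of inputs.

    Assumptions (not confirmed by the workflow source):
    - steps run in the order of the steps table,
    - phix_filter always runs,
    - human_filter / reference_filter run only if an index folder is given.
    """
    steps = ["workflow_merge_pe_reads"]
    if not skip_qc_unfiltered:
        steps.append("sequali_illumina_before")
    steps.append("fastp")
    if filter_rrna:
        steps.append("rrna_filter")
    if humandb is not None:
        steps.append("human_filter")
    if reference_filter_db is not None:
        steps.append("reference_filter")
    steps.append("phix_filter")
    if not skip_qc_filtered:
        steps.append("sequali_illumina_after")
    steps.append("reports_files_to_folder")
    if not do_not_output_filtered_reads:
        steps.extend(["out_fwd_reads", "out_rev_reads"])
    return steps
```

For example, `planned_steps(filter_rrna=True, do_not_output_filtered_reads=True)` would include `rrna_filter` but neither of the two read output steps.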
Outputs
| ID | Name | Description | Type |
|---|---|---|---|
| reports_folder | Filtering reports folder | Folder containing all reports of filtering and quality control | |
| QC_forward_reads | Filtered forward read | Filtered forward read | |
| QC_reverse_reads | Filtered reverse read | Filtered reverse read | |
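A CWL runner reports results as a JSON output object keyed by output ID. The sketch below pulls the three outputs listed above out of such an object. The helper name `summarize_outputs` is hypothetical, and treating the read outputs as possibly null (for instance when `do_not_output_filtered_reads` is set) is an assumption.

```python
import json

# Output IDs taken from the outputs table.
EXPECTED_OUTPUTS = ("reports_folder", "QC_forward_reads", "QC_reverse_reads")


def summarize_outputs(output_json):
    """Map each expected output ID to its path or location, or None.

    `output_json` is the JSON text of a CWL output object. File and
    Directory values carry a "path" and/or "location" field; outputs
    that are missing or null are reported as None.
    """
    outputs = json.loads(output_json)
    summary = {}
    for key in EXPECTED_OUTPUTS:
        value = outputs.get(key)
        if value is None:
            summary[key] = None
        else:
            summary[key] = value.get("path") or value.get("location")
    return summary
```

A caller could use the returned mapping to check that the reports folder exists before passing the filtered reads on to a downstream workflow.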
Version History
Version 2 (latest) Created 6th Nov 2025 at 13:41 by Bart Nijsse
Major changes: Replaced FastQC with Sequali. Reference and contamination filtering is now done using Hostile instead of BBduk (requires pre-built indexes).
Open
master
f0dcdb1
Version 1 (earliest) Created 21st Apr 2022 at 14:00 by Bart Nijsse
Initial commit
Frozen
Version-1
5c2e0e5
Created: 21st Apr 2022 at 14:00
Last updated: 6th Nov 2025 at 16:37
https://orcid.org/0000-0001-8172-8981