Longread 16S classification workflow
Version 1

Workflow Type: Common Workflow Language
Work-in-progress

Workflow for quality assessment and taxonomic classification of amplicon long read sequences.
In addition, output files are exported to their respective subfolders to simplify data management at a later stage.

Inputs are expected to be basecalled FASTQ files.

Steps:
- NanoPlot read quality control, before and after filtering
- fastplong read quality and length filtering
- Emu abundance; species-level taxonomic abundance for full-length 16S reads
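
To make the filtering step more concrete, the following is a minimal Python sketch of the read-level rules described under Inputs (minimum_length, length_limit, mean_qual, and the cut_front sliding window). It is not fastplong's implementation. The function names are hypothetical, Phred+33 quality encoding is assumed, the mean is taken as a plain arithmetic mean of per-base scores, and the window is advanced one full window at a time, which is only one literal reading of the cut_front description.

```python
def phred_scores(quality_line):
    """Decode a FASTQ quality string, assuming Phred+33 encoding."""
    return [ord(ch) - 33 for ch in quality_line]


def cut_front(scores, window=4, threshold=20):
    """Count the leading bases a 5' sliding window would drop.

    Assumption: the whole window is dropped while its mean quality is
    below the threshold, and trimming stops at the first passing window.
    """
    start = 0
    while start + window <= len(scores):
        if sum(scores[start:start + window]) / window >= threshold:
            break
        start += window
    return start


def passes_filter(scores, minimum_length=1200, length_limit=1600, mean_qual=10):
    """Apply the length and mean-quality rules to one read's scores."""
    if not scores:
        return False
    if not minimum_length <= len(scores) <= length_limit:
        return False
    # Arithmetic mean of Phred scores is an assumption of this sketch.
    return sum(scores) / len(scores) >= mean_qual


if __name__ == "__main__":
    read_scores = phred_scores("I" * 1500)  # synthetic read, every base Q40
    print(passes_filter(read_scores))
```

The default arguments mirror the defaults documented for this workflow; the actual tool may compute or apply them differently.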


Inputs

ID Name Description Type
identifier identifier used Identifier for this dataset used in this workflow
  • string
reads Read file Read file in FASTA or FASTQ format (can be gz)
  • File[]
reference_db Reference database Reference database used in FASTA format
  • Directory
readtype Read type Type of read: nanopore or pacbio. (Default nanopore)
  • enum of: nanopore, pacbio
fastq_rich Fastq rich (ONT) Input FASTQ was generated by Albacore, MinKNOW or Guppy and carries additional information on channel and time. Used to create more informative quality plots. (Default false)
  • boolean
threads Number of threads Number of threads to use for computational processes
  • int?
skip_read_filter Skip quality filtering Skip quality reporting and filtering. (Default false)
  • boolean
disable_quality_filtering Disable_quality_filtering Quality filtering is enabled by default. If this option is specified, quality filtering is disabled. Quality plots will still be generated when skip_read_filter is false. (Default false)
  • boolean?
qualified_quality_phred Qualified_quality_phred The quality value at which a base counts as qualified. The default of 8 means bases with Phred quality >= Q8 are qualified.
  • int?
mean_qual Mean quality If a read's mean quality score is below mean_qual, the read is discarded. (Default 10)
  • int?
minimum_length Minimum length required Reads shorter than minimum_length will be discarded. (Default 1200)
  • int?
length_limit Maximum length limit Reads longer than length_limit will be discarded. (Default 1600)
  • int?
trim_front Trim_front Number of bases to trim from the front (5' end) of each read. (Default not set, 0)
  • int?
trim_tail trim_tail Number of bases to trim from the tail (3' end) of each read. (Default not set, 0)
  • int?
cut_front Cut front Move a sliding window from front (5') to tail; drop the bases in the window if its mean quality < threshold, stop otherwise. (Default false)
  • boolean?
cut_tail Cut tail Move a sliding window from tail (3') to front; drop the bases in the window if its mean quality < threshold, stop otherwise. (Default false)
  • boolean?
cut_window_size Cut window size The window size option shared by cut_front, cut_tail or cut_sliding. Range: 1~1000. Default 4
  • int?
cut_mean_quality Cut mean quality The mean quality requirement option shared by cut_front, cut_tail or cut_sliding. Range: 1~36. Default 20
  • int?
start_adapter start_adapter The adapter sequence at read start (5'). (Default auto-detect)
  • string?
end_adapter End adapter The adapter sequence at read end (3'). (Default auto-detect)
  • string?
adapter_fasta Adapter fasta Specify a FASTA file to trim both read ends by all the sequences in this FASTA file. (Default None)
  • File?
disable_adapter_trimming Disable adapter trimming Adapter trimming is enabled by default. If this option is specified, adapter trimming is disabled. Default true
  • boolean
output_filtered_reads Output filtered reads Output filtered reads when filtering is applied. (Default false)
  • boolean
destination Output Destination Optional Output destination used for cwl-prov reporting.
  • string?
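
As an illustration of how the inputs above fit together, the following minimal Python sketch builds a CWL job object and prints it as JSON, which CWL runners accept as a job file. The dataset label and both paths are hypothetical placeholders. The sketch also assumes that inputs with a documented default can be omitted; only a few are set here to show the expected value types.

```python
import json

# Job object keyed by the input IDs listed in the table above.
job = {
    "identifier": "sample01",  # hypothetical dataset label
    "reads": [  # File[]: one or more read files
        {"class": "File", "path": "reads/sample01.fastq.gz"},  # hypothetical path
    ],
    "reference_db": {  # typed as Directory in this workflow
        "class": "Directory",
        "path": "databases/emu_db",  # hypothetical path
    },
    "readtype": "nanopore",  # enum: nanopore or pacbio
    "threads": 4,
    "minimum_length": 1200,
    "length_limit": 1600,
    "mean_qual": 10,
    "output_filtered_reads": True,
}

job_text = json.dumps(job, indent=2)
print(job_text)
```

Saved to a file such as job.json, this could be passed to a CWL runner together with the workflow file, for example as the second argument to cwltool.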

Steps

ID Name Description
workflow_merge_reads Merge paired reads Creates a single file object. Also merges reads if multiple files are given.
workflow_longread_quality Oxford Nanopore quality workflow Quality, filtering and taxonomic classification workflow for Oxford Nanopore reads
emu Emu abundance Emu abundance; species-level taxonomic abundance for full-length 16S reads
step_output_filtered_reads Output reads Step needed to output filtered reads.
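
The merge step is described only as producing a single file object from one or more read files. How the workflow does this internally is not stated here, so the following Python sketch shows just one plausible approach: concatenating plain or gzip-compressed read files into a single gzip-compressed file. The function name and file names are hypothetical.

```python
import gzip
import shutil
from pathlib import Path


def merge_reads(paths, merged_path):
    """Concatenate read files into one gzip-compressed file.

    Inputs ending in .gz are decompressed on the fly; other inputs are
    read as plain text. Each input is assumed to end with a newline,
    otherwise records from adjacent files would run together.
    """
    with gzip.open(merged_path, "wb") as out:
        for path in paths:
            opener = gzip.open if str(path).endswith(".gz") else open
            with opener(path, "rb") as src:
                shutil.copyfileobj(src, out)
    return Path(merged_path)
```

A call such as merge_reads(["run1.fastq.gz", "run2.fastq"], "merged.fastq.gz") would then yield one file containing the records of both inputs in order.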

Outputs

ID Name Description Type
quality_folder NanoPlot Folder with quality plots from NanoPlot
  • Directory?
filtered_reads n/a Filtered reads output file
  • File?
emu_abundance Emu abundances n/a
  • File
emu_read_assignment_distributions Emu read assignment distribution n/a
  • File
emu_unclassified Emu unclassified n/a
  • File
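
For downstream use of the emu_abundance output, a short Python sketch of ranking taxa by abundance may help. It assumes the file is tab-separated with a header row containing columns named species and abundance; these column names are an assumption and should be checked against the header of the actual file. The example rows are made-up placeholders, not real results.

```python
import csv
import io


def top_taxa(tsv_text, n=3, name_col="species", value_col="abundance"):
    """Return the n (name, abundance) pairs with the highest abundance.

    Column names are assumptions; pass the names found in the real header.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    rows = [(row[name_col], float(row[value_col])) for row in reader]
    return sorted(rows, key=lambda pair: pair[1], reverse=True)[:n]


# Made-up placeholder table, for illustration only.
example = (
    "tax_id\tabundance\tspecies\n"
    "1\t0.25\tSpecies A\n"
    "2\t0.60\tSpecies B\n"
    "3\t0.15\tSpecies C\n"
)

if __name__ == "__main__":
    print(top_taxa(example, n=2))
```

For a real run, the text of the abundance file would be read from disk and passed in place of the placeholder table.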

Version History

Version 1 (earliest) Created 10th Sep 2025 at 13:31 by Bart Nijsse

Initial commit



Created: 10th Sep 2025 at 13:30

Last updated: 10th Sep 2025 at 14:40
