Workflow Type: Galaxy

COVID-19: consensus construction

This workflow generates reliable consensus sequences from variant calls according to transparent criteria that capture at least some of the complexity of variant calling.

It takes a collection of VCFs (with DP and DP4 INFO fields) and a collection of the corresponding aligned reads (used to calculate genome-wide coverage), such as those produced by any of the variant calling workflows in https://github.com/galaxyproject/iwc/tree/main/workflows/sars-cov-2-variant-calling. From these it generates a collection of viral consensus sequences and a multisample FASTA of all these sequences.
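Because the input VCFs are expected to carry DP and DP4, a per-variant allele frequency can be derived from those two fields. The following is a minimal sketch, not the workflow's own filter expression: it assumes the common DP4 layout (ref-forward, ref-reverse, alt-forward, alt-reverse) and defines the allele frequency as alt-supporting reads divided by DP. The function names `parse_info` and `allele_frequency_from_info` are hypothetical.

```python
def parse_info(info):
    """Split a VCF INFO string into a dict; flags without '=' map to True."""
    fields = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def allele_frequency_from_info(info):
    """Estimate the alt allele frequency from DP and DP4.

    Assumption: DP4 = ref-fwd, ref-rev, alt-fwd, alt-rev, and
    AF = (alt-fwd + alt-rev) / DP.
    """
    fields = parse_info(info)
    depth = int(fields["DP"])
    dp4 = [int(x) for x in fields["DP4"].split(",")]
    if depth <= 0:
        return 0.0
    return (dp4[2] + dp4[3]) / depth


print(allele_frequency_from_info("DP=100;DP4=10,10,40,40"))
```

A variant caller may also write its own AF field; whether that value or a DP4-derived one is used for thresholding is not stated here.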

Each consensus sequence is guaranteed to capture all called, filter-passing (as per the FILTER column of the VCF input) variants found in the VCF of its sample that reach a user-defined consensus allele frequency threshold.

Filter-failing variants and variants below a second user-defined minimal allele frequency threshold will be ignored.

Genomic positions of filter-passing variants with an allele frequency in between the two thresholds will be hard-masked (with N) in the consensus sequence of their sample.

Genomic positions with a coverage (calculated from the read alignments input) below another user-defined threshold will be hard-masked, too, unless they are consensus variant sites.
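The rules above can be summarized as a small decision function. This is an illustrative sketch only: the default threshold values, the treatment of values exactly at a threshold, and the names `Variant`, `classify_variant`, and `site_is_masked` are assumptions and are not taken from the workflow definition.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Variant:
    pos: int      # 1-based genomic position
    filter: str   # value of the VCF FILTER column
    af: float     # allele frequency of the call


def classify_variant(v, min_af_consensus=0.7, min_af_mask=0.25):
    """Return 'consensus', 'mask', or 'ignore' for one variant call.

    Assumptions: only FILTER == 'PASS' counts as filter-passing, and a
    value exactly at a threshold counts as reaching it.
    """
    if v.filter != "PASS":
        return "ignore"
    if v.af >= min_af_consensus:
        return "consensus"
    if v.af >= min_af_mask:
        return "mask"
    return "ignore"


def site_is_masked(depth, variant_class: Optional[str], min_depth=5):
    """Decide whether a site ends up as N in the consensus sequence."""
    if variant_class == "consensus":
        return False          # consensus variant sites are never masked
    if variant_class == "mask":
        return True           # questionable variant: always masked
    return depth < min_depth  # otherwise only low coverage masks the site


print(classify_variant(Variant(241, "PASS", 0.5)))
```

Note that a consensus variant site stays unmasked even when its coverage is below the depth threshold, which mirrors the "unless they are consensus variant sites" exception.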


Inputs

ID Name Description Type
Depth-threshold for masking Depth-threshold for masking Sites in the viral genome covered by fewer than this number of reads are considered questionable and will be masked (with Ns) in the consensus sequence regardless of whether a variant has been called at them.
  • int?
Reference genome Reference genome The SARS-CoV-2 reference genome
  • File
Variant calls Variant calls Collection of VCFs produced by upstream workflows for variation analysis
  • File[]
aligned reads data for depth calculation aligned reads data for depth calculation Fully processed BAMs as generated by upstream workflows for variation analysis. Note: for ARTIC data, these BAMs should NOT have undergone processing with ivar removereads.
  • File[]
min-AF for consensus variant min-AF for consensus variant Only variant calls with an allele frequency greater than this value will be considered consensus variants.
  • float?
min-AF for failed variants min-AF for failed variants Variant calls with an allele frequency higher than this value, but lower than the AF threshold for consensus variants, will be considered questionable, and the respective sites will be masked (with Ns) in the consensus sequence.
  • float?

Steps

ID Name Description
6 Compose text parameter value toolshed.g2.bx.psu.edu/repos/iuc/compose_text_param/compose_text_param/0.1.1
7 Compose text parameter value toolshed.g2.bx.psu.edu/repos/iuc/compose_text_param/compose_text_param/0.1.1
8 bedtools Genome Coverage toolshed.g2.bx.psu.edu/repos/iuc/bedtools/bedtools_genomecoveragebed/2.29.2
9 Compose text parameter value toolshed.g2.bx.psu.edu/repos/iuc/compose_text_param/compose_text_param/0.1.1
10 SnpSift Filter toolshed.g2.bx.psu.edu/repos/iuc/snpsift/snpSift_filter/4.3+t.galaxy1
11 SnpSift Filter toolshed.g2.bx.psu.edu/repos/iuc/snpsift/snpSift_filter/4.3+t.galaxy1
12 Filter Filter1
13 SnpSift Extract Fields toolshed.g2.bx.psu.edu/repos/iuc/snpsift/snpSift_extractFields/4.3+t.galaxy0
14 SnpSift Extract Fields toolshed.g2.bx.psu.edu/repos/iuc/snpsift/snpSift_extractFields/4.3+t.galaxy0
15 Compute toolshed.g2.bx.psu.edu/repos/devteam/column_maker/Add_a_column1/1.6
16 Compute toolshed.g2.bx.psu.edu/repos/devteam/column_maker/Add_a_column1/1.6
17 Compute toolshed.g2.bx.psu.edu/repos/devteam/column_maker/Add_a_column1/1.6
18 Compute toolshed.g2.bx.psu.edu/repos/devteam/column_maker/Add_a_column1/1.6
19 Cut Cut1
20 Cut Cut1
21 Concatenate toolshed.g2.bx.psu.edu/repos/devteam/concat/gops_concat_1/1.0.1
22 Merge toolshed.g2.bx.psu.edu/repos/devteam/merge/gops_merge_1/1.0.0
23 Subtract toolshed.g2.bx.psu.edu/repos/devteam/subtract/gops_subtract_1/1.0.0
24 Compute toolshed.g2.bx.psu.edu/repos/devteam/column_maker/Add_a_column1/1.6
25 Cut Cut1
26 bcftools consensus toolshed.g2.bx.psu.edu/repos/iuc/bcftools_consensus/bcftools_consensus/1.10+galaxy1
27 Collapse Collection toolshed.g2.bx.psu.edu/repos/nml/collapse_collections/collapse_dataset/5.1.0
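The Concatenate, Merge, and Subtract steps suggest an interval computation: low-coverage regions and questionable variant sites are combined, and consensus variant sites are then taken out of the result before masking. The sketch below illustrates that idea in plain Python on 0-based, half-open intervals. It is an assumption about the intent of these steps, not a reproduction of the Galaxy tools: in particular, it merges touching intervals and removes only the overlapping pieces, which the actual tool settings may or may not do. The function names are hypothetical.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def subtract_intervals(regions, holes):
    """Remove every part of `regions` that overlaps an interval in `holes`."""
    holes = merge_intervals(holes)
    result = []
    for start, end in regions:
        cursor = start
        for h_start, h_end in holes:
            if h_end <= cursor or h_start >= end:
                continue  # no overlap with the remaining part of the region
            if h_start > cursor:
                result.append((cursor, h_start))
            cursor = max(cursor, h_end)
            if cursor >= end:
                break
        if cursor < end:
            result.append((cursor, end))
    return result


# Hypothetical example data
low_cov = [(0, 50), (200, 260)]
failed_sites = [(40, 41), (500, 501)]
called_sites = [(220, 221)]

combined = merge_intervals(low_cov + failed_sites)
masking_regions = subtract_intervals(combined, called_sites)
print(masking_regions)
```

With this example data, the consensus variant site at (220, 221) splits the surrounding low-coverage region into two masked pieces.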

Outputs

ID Name Description Type
coverage_depth coverage_depth n/a
  • File
consensus_variants consensus_variants n/a
  • File
filter_failed_variants filter_failed_variants n/a
  • File
low_cov_regions low_cov_regions n/a
  • File
chrom_pos_ref_called_variants chrom_pos_ref_called_variants n/a
  • File
chrom_pos_ref_failed_variants chrom_pos_ref_failed_variants n/a
  • File
chrom_pos_ref_called_variants_with_0_based_start chrom_pos_ref_called_variants_with_0_based_start n/a
  • File
chrom_pos_ref_failed_variants_with_0_based_start chrom_pos_ref_failed_variants_with_0_based_start n/a
  • File
chrom_pos_ref_called_variants_with_0_based_start_end chrom_pos_ref_called_variants_with_0_based_start_end n/a
  • File
chrom_pos_ref_failed_variants_with_0_based_start_end chrom_pos_ref_failed_variants_with_0_based_start_end n/a
  • File
called_variant_sites called_variant_sites n/a
  • File
failed_variant_sites failed_variant_sites n/a
  • File
low_cov_regions_plus_filter_failed low_cov_regions_plus_filter_failed n/a
  • File
low_cov_regions_plus_filter_failed_combined low_cov_regions_plus_filter_failed_combined n/a
  • File
masking_regions masking_regions n/a
  • File
masking_regions_with_1_based_start masking_regions_with_1_based_start n/a
  • File
1_based_masking_regions 1_based_masking_regions n/a
  • File
consensus consensus n/a
  • File
multisample_consensus_fasta multisample_consensus_fasta n/a
  • File
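Several of the intermediate output names refer to 0-based and 1-based coordinates. As a hedged illustration of what such conversions typically look like, the sketch below turns a 1-based VCF site (CHROM, POS, REF) into a 0-based, half-open interval spanning the reference allele, and shifts an interval start back to 1-based. The exact arithmetic used by the workflow's Compute steps is not shown in this description, so both formulas and the function names are assumptions.

```python
def vcf_site_to_bed(chrom, pos, ref):
    """Convert a 1-based VCF site to a 0-based, half-open interval.

    Assumption: the interval covers the full length of the REF allele.
    """
    start = pos - 1
    end = start + len(ref)
    return (chrom, start, end)


def bed_to_one_based(chrom, start, end):
    """Convert a 0-based, half-open interval to 1-based, inclusive."""
    return (chrom, start + 1, end)


print(vcf_site_to_bed("NC_045512.2", 241, "C"))
print(bed_to_one_based("NC_045512.2", 240, 241))
```

A single-nucleotide site keeps the same start and end after conversion to 1-based, inclusive coordinates, while a multi-base REF allele (for example, a deletion) spans its full length.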

Version History

v0.4 (latest) Created 25th Oct 2022 at 03:01 by WorkflowHub Bot

Updated to v0.4


Frozen v0.4 1482e55

v0.3 Created 5th Feb 2022 at 03:00 by WorkflowHub Bot

Updated to v0.3


Frozen v0.3 d49f767

v0.2.2 Created 21st Dec 2021 at 03:01 by WorkflowHub Bot

Updated to v0.2.2


Open master 58bf9e2

v0.2.1 Created 27th Jul 2021 at 03:01 by WorkflowHub Bot

Updated to v0.2.1


Frozen master ac346ee

v0.2 (earliest) Created 23rd Jul 2021 at 10:18 by WorkflowHub Bot

Added/updated 10 files


Frozen master 2175b4a
Creators and Submitter
Creators
Not specified
Additional credit

Wolfgang Maier

Activity

Created: 23rd Jul 2021 at 10:18

Last updated: 25th Oct 2022 at 03:01
