COVID-19 sequence analysis on Illumina Amplicon PE data
This workflow implements an iVar-based analysis similar to the one in ncov2019-artic-nf, covid-19-signal and the Theiagen Titan workflow. These workflows (written in Nextflow, Snakemake and WDL respectively) are widely used in COG-UK, CanCOGeN and some US state public health laboratories.
This workflow is also the subject of a Galaxy Training Network tutorial (currently a Work in Progress).
It differs from this workflow in that it does not use lofreq and is aimed at rapid analysis of majority variants and lineage/clade assignment with pangolin and nextclade.
TODO:
- Add support for QC using negative and positive controls
- Integrate with phylogeny tools including IQTree and UShER (and possibly more).
Inputs
ID | Name | Description | Type |
---|---|---|---|
Minimum quality score to call base | Minimum quality score to call base | Minimum base quality score to count a base towards the sequence consensus. | |
Paired read collection for samples | Paired read collection for samples | FASTQ format Illumina Reads (Amplicon Protocol) | |
Primer BED | Primer BED | Primer BED file (from ARTIC project or similar) | |
Read fraction to call variant | Read fraction to call variant | Specify the proportion of reads that need to agree with each other to call a variant. This is a floating point value between 0 and 1. | |
Reference FASTA | Reference FASTA | SARS-CoV-2 reference genome (typically MN908947.3) | |
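As a rough illustration of how the two numeric inputs interact, the Python sketch below filters the base observations at a single position by quality and then reports any alternate allele whose read fraction reaches the threshold. It is a simplified model, not iVar's implementation: the function name `alleles_above_fraction`, the `(base, quality)` representation of reads and the example thresholds are all assumptions made for illustration.

```python
from collections import Counter


def alleles_above_fraction(observations, ref_base, min_quality, min_fraction):
    """Return {alt_base: fraction} for alternate bases supported by enough reads.

    observations: list of (base, phred_quality) pairs seen at one position.
    Hypothetical helper for illustration; iVar's own logic is more involved.
    """
    # Discard observations below the minimum base quality score.
    kept = [base for base, quality in observations if quality >= min_quality]
    if not kept:
        return {}
    depth = len(kept)
    counts = Counter(kept)
    # Keep only non-reference bases whose fraction of the depth meets the threshold.
    return {
        base: count / depth
        for base, count in counts.items()
        if base != ref_base and count / depth >= min_fraction
    }


# Example: 8 high-quality T, 2 high-quality C, 5 low-quality T; reference base is C.
pileup = [("T", 35)] * 8 + [("C", 35)] * 2 + [("T", 10)] * 5
result = alleles_above_fraction(pileup, ref_base="C", min_quality=20, min_fraction=0.7)
print(result)
```

In this example the five low-quality T observations are ignored, so T is supported by 8 of the 10 remaining reads and passes a 0.7 threshold; raising the threshold to 0.9 would report nothing.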
Steps
ID | Name | Description |
---|---|---|
5 | fastp: Trimmed Illumina Reads | toolshed.g2.bx.psu.edu/repos/iuc/fastp/fastp/0.20.1+galaxy0 |
6 | Rename reference to NC_045512.2 | If the reference is named MN908947.3 (GenBank name of the SARS-CoV-2 reference genome), rename it to NC_045512.2 (RefSeq name of the SARS-CoV-2 reference genome). toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_sed_tool/1.1.1 |
7 | Map with BWA-MEM | toolshed.g2.bx.psu.edu/repos/devteam/bwa/bwa_mem/0.7.17.1 |
8 | Samtools stats | toolshed.g2.bx.psu.edu/repos/devteam/samtools_stats/samtools_stats/2.0.2+galaxy2 |
9 | Samtools view | toolshed.g2.bx.psu.edu/repos/iuc/samtools_view/samtools_view/1.9+galaxy3 |
10 | QualiMap BamQC | toolshed.g2.bx.psu.edu/repos/iuc/qualimap_bamqc/qualimap_bamqc/2.2.2d+galaxy3 |
11 | ivar trim | toolshed.g2.bx.psu.edu/repos/iuc/ivar_trim/ivar_trim/1.3.1+galaxy2 |
12 | Flatten Collection | __FLATTEN__ |
13 | ivar variants | toolshed.g2.bx.psu.edu/repos/iuc/ivar_variants/ivar_variants/1.3.1+galaxy2 |
14 | ivar consensus | toolshed.g2.bx.psu.edu/repos/iuc/ivar_consensus/ivar_consensus/1.3.1+galaxy0 |
15 | Quality Control Report | toolshed.g2.bx.psu.edu/repos/iuc/multiqc/multiqc/1.9+galaxy1 |
16 | Annotated variants | toolshed.g2.bx.psu.edu/repos/iuc/snpeff_sars_cov_2/snpeff_sars_cov_2/4.5covid19 |
17 | Consensus genome (masked for depth) | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_sed_tool/1.1.1 |
18 | Concatenate datasets | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_cat/0.1.1 |
19 | Pangolin | toolshed.g2.bx.psu.edu/repos/iuc/pangolin/pangolin/3.1.14+galaxy0 |
20 | Nextclade | toolshed.g2.bx.psu.edu/repos/iuc/nextclade/nextclade/1.4.1+galaxy0 |
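Step 6 can be pictured with the short Python sketch below, which rewrites only the sequence identifier on FASTA header lines. The workflow itself uses a sed-based text-processing tool whose exact expression is not shown on this page, so the function `rename_reference` and its defaults are an assumed equivalent, not the workflow's actual command.

```python
def rename_reference(fasta_text, old_id="MN908947.3", new_id="NC_045512.2"):
    """Rename a FASTA record whose identifier equals old_id (hypothetical helper).

    Only header lines (starting with '>') are inspected; sequence lines and
    any description after the identifier are left unchanged.
    """
    renamed = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # Split the header into identifier and optional description.
            name, sep, description = line[1:].partition(" ")
            if name == old_id:
                line = ">" + new_id + sep + description
        renamed.append(line)
    return "\n".join(renamed) + "\n"


# Example with a shortened, made-up sequence line.
example = ">MN908947.3 Severe acute respiratory syndrome coronavirus 2\nATTAAAGGTT\n"
print(rename_reference(example), end="")
```

Records that already carry the RefSeq name, or any other identifier, pass through unchanged, which matches the conditional wording of the step description.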
Outputs
ID | Name | Description | Type |
---|---|---|---|
primer_trimmed_bam | primer_trimmed_bam | n/a | |
ivar_variants_tabular | ivar_variants_tabular | n/a | |
bamqc_report_html | bamqc_report_html | n/a | |
snpeff_annotated_vcf | snpeff_annotated_vcf | n/a | |
ivar_consensus_genome | ivar_consensus_genome | n/a | |
combined_multifasta | combined_multifasta | n/a | |
all_samples_pangolin | all_samples_pangolin | n/a | |
all_samples_nextclade | all_samples_nextclade | n/a | |
Version History
v0.2.3 (latest) Created 7th Oct 2024 at 16:32 by WorkflowHub Bot
Updated to v0.2.3
Frozen at v0.2.3 (c7afd32)
v0.1 (earliest) Created 31st Aug 2021 at 03:01 by WorkflowHub Bot
Added/updated 7 files
Frozen at master (fe01ebf)
Additional credit: Peter van Heusden
Created: 31st Aug 2021 at 03:01
Last updated: 3rd Oct 2024 at 03:03