Generic variation analysis reporting
This workflow generates reports from a list of variants produced by the Variant Calling Workflow.
The workflow accepts a single input:
- A collection of VCF files
The workflow produces two outputs (format description below):
- A list of variants grouped by Sample
- A list of variants grouped by Variant
Here is an example of output by sample. In this table, all variants in all samples are explicitly listed:
Sample | POS | FILTER | REF | ALT | DP | AF | AFcaller | SB | DP4 | IMPACT | FUNCLASS | EFFECT | GENE | CODON | AA | TRID | min(AF) | max(AF) | countunique(change) | countunique(FUNCLASS) | change |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ERR3485786 | 11644 | PASS | A | G | 97 | 0.979381 | 0.907216 | 0 | 1,1,49,46 | LOW | SILENT | SYNONYMOUS_CODING | D7L | tgT/tgC | C512 | AKG51361.1 | 0.979381 | 1 | 1 | 1 | A>G |
ERR3485786 | 11904 | PASS | T | C | 102 | 0.990196 | 0.95098 | 0 | 0,0,51,50 | MODERATE | MISSENSE | NON_SYNONYMOUS_CODING | D7L | Act/Gct | T426A | AKG51361.1 | 0.990196 | 1 | 1 | 1 | T>C |
Note the two alternative allele frequency fields: "AFcaller" and "AF". LoFreq reports the AF values listed in "AFcaller". They are incorrect due to a known LoFreq bug. To correct for this, we recompute AF values from the DP4 and DP fields as follows:
AF = (DP4[2] + DP4[3]) / DP
where DP4[2] and DP4[3] (zero-based indices) are the forward and reverse read counts supporting the alternative allele.
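The recomputation can be checked by hand against the example rows. The following is a minimal Python sketch of the formula, not the workflow's own implementation (the workflow uses Galaxy tools for this step); the function name `recompute_af` is hypothetical.

```python
def recompute_af(dp4: str, dp: int) -> float:
    """Recompute allele frequency from the DP4 and DP fields.

    dp4 is the comma-separated DP4 string holding four read counts:
    ref-forward, ref-reverse, alt-forward, alt-reverse.
    """
    counts = [int(x) for x in dp4.split(",")]
    if len(counts) != 4:
        raise ValueError(f"DP4 must have four values, got {dp4!r}")
    if dp <= 0:
        raise ValueError("DP must be positive")
    # Reads supporting the alternative allele on both strands, over total depth
    return (counts[2] + counts[3]) / dp


# DP4 and DP values taken from the two rows of the by-sample example table
print(round(recompute_af("1,1,49,46", 97), 6))   # 0.979381
print(round(recompute_af("0,0,51,50", 102), 6))  # 0.990196
```

Both results match the AF column of the table, while the AFcaller column (0.907216 and 0.95098) does not.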
Here is an example of output by variant. In this table data is aggregated by variant across all samples in which this variant is present:
POS | REF | ALT | IMPACT | FUNCLASS | EFFECT | GENE | CODON | AA | TRID | countunique(Sample) | min(AF) | max(AF) | SAMPLES(above-thresholds) | SAMPLES(all) | AFs(all) | change |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
11644 | A | G | LOW | SILENT | SYNONYMOUS_CODING | D7L | tgT/tgC | C512 | AKG51361.1 | 11 | 0.979381 | 1 | ERR3485786,ERR3485787... | ERR3485786,ERR3485787,ERR3485789 ... | 0.979381,1.0... | A>G |
11904 | T | C | MODERATE | MISSENSE | NON_SYNONYMOUS_CODING | D7L | Act/Gct | T426A | AKG51361.1 | 12 | 0.990196 | 1 | ERR3485786,ERR3485787... | ERR3485786,ERR3485787,ERR3485789... | 0.990196,1.0,1.0... | T>C |
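To illustrate how the by-variant table relates to the by-sample table, here is a minimal Python sketch of the aggregation. It is an assumption about the logic, not the workflow's implementation (the workflow chains Datamash, Join and Cut steps); the function name `aggregate_by_variant` is hypothetical, and the input rows use only the two samples and AF values visible in the truncated example lists above.

```python
from collections import defaultdict


def aggregate_by_variant(rows):
    """Group per-sample records by (POS, REF, ALT) and summarize each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["POS"], row["REF"], row["ALT"])].append(row)

    report = []
    for (pos, ref, alt), members in sorted(groups.items()):
        samples = [m["Sample"] for m in members]
        afs = [m["AF"] for m in members]
        report.append({
            "POS": pos,
            "REF": ref,
            "ALT": alt,
            "countunique(Sample)": len(set(samples)),
            "min(AF)": min(afs),
            "max(AF)": max(afs),
            "SAMPLES(all)": ",".join(samples),
            "AFs(all)": ",".join(str(a) for a in afs),
            "change": f"{ref}>{alt}",
        })
    return report


# Illustrative subset: two samples per variant, values from the example tables
example_rows = [
    {"Sample": "ERR3485786", "POS": 11644, "REF": "A", "ALT": "G", "AF": 0.979381},
    {"Sample": "ERR3485787", "POS": 11644, "REF": "A", "ALT": "G", "AF": 1.0},
    {"Sample": "ERR3485786", "POS": 11904, "REF": "T", "ALT": "C", "AF": 0.990196},
    {"Sample": "ERR3485787", "POS": 11904, "REF": "T", "ALT": "C", "AF": 1.0},
]

report = aggregate_by_variant(example_rows)
for line in report:
    print(line)
```

With the full data set, countunique(Sample) would be 11 and 12 for these two variants, as in the table, rather than 2 as in this subset.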
The workflow can be accessed at usegalaxy.org
The general idea of the workflow is:
Inputs
ID | Name | Description | Type |
---|---|---|---|
AF Filter | AF Filter | Allele Frequency Filter. This is the minimum allele frequency required for variants to be included in the reports. | |
DP Filter | DP Filter | Depth Filter. This is the minimum depth of all alignments at a variant site. | |
DP_ALT Filter | DP_ALT Filter | Depth Filter for variant allele. This is the minimum depth of alignments supporting a variant. | |
Variation data to report | Variation data to report | Variation data in VCF format. Can be the output of any of the workflows in https://github.com/galaxyproject/iwc/tree/main/workflows/sars-cov-2-variant-calling | |
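The three filter inputs above can be read as thresholds on each variant record. The sketch below shows one plausible reading in Python. It assumes that the depth supporting a variant is DP4[2] + DP4[3], that AF is computed from DP4 and DP, and that thresholds are inclusive minimums. The workflow itself applies its filters with SnpSift Filter, whose exact expression is not given here. The function name `passes_filters` and the threshold values are hypothetical placeholders, not the workflow's defaults.

```python
def passes_filters(dp, dp4, af_min, dp_min, dp_alt_min):
    """Return True if a variant meets all three minimum thresholds.

    dp         -- total depth at the site (DP)
    dp4        -- four counts: ref-fwd, ref-rev, alt-fwd, alt-rev
    af_min     -- AF Filter: minimum allele frequency
    dp_min     -- DP Filter: minimum depth of all alignments
    dp_alt_min -- DP_ALT Filter: minimum depth supporting the variant
    """
    if dp <= 0:
        return False  # no coverage, nothing to report
    alt_depth = dp4[2] + dp4[3]  # assumed definition of variant-supporting depth
    if dp < dp_min or alt_depth < dp_alt_min:
        return False
    return alt_depth / dp >= af_min


# Placeholder thresholds chosen only for illustration
AF_MIN, DP_MIN, DP_ALT_MIN = 0.05, 1, 10

# First row of the by-sample example: DP=97, DP4=1,1,49,46
print(passes_filters(97, (1, 1, 49, 46), AF_MIN, DP_MIN, DP_ALT_MIN))  # True
# A made-up low-support site: only 2 alt reads out of 20
print(passes_filters(20, (10, 8, 1, 1), AF_MIN, DP_MIN, DP_ALT_MIN))   # False
```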
Steps
ID | Name | Description |
---|---|---|
4 | SnpSift Filter | toolshed.g2.bx.psu.edu/repos/iuc/snpsift/snpSift_filter/4.3+t.galaxy1 |
5 | Compose text parameter value | toolshed.g2.bx.psu.edu/repos/iuc/compose_text_param/compose_text_param/0.1.1 |
6 | Compose text parameter value | toolshed.g2.bx.psu.edu/repos/iuc/compose_text_param/compose_text_param/0.1.1 |
7 | SnpSift Filter | toolshed.g2.bx.psu.edu/repos/iuc/snpsift/snpSift_filter/4.3+t.galaxy1 |
8 | SnpSift Extract Fields | toolshed.g2.bx.psu.edu/repos/iuc/snpsift/snpSift_extractFields/4.3+t.galaxy0 |
9 | Compute | toolshed.g2.bx.psu.edu/repos/devteam/column_maker/Add_a_column1/1.6 |
10 | Datamash | toolshed.g2.bx.psu.edu/repos/iuc/datamash_ops/datamash_ops/1.1.0 |
11 | Replace | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_find_and_replace/1.1.3 |
12 | Replace | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_find_and_replace/1.1.3 |
13 | Replace | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_find_and_replace/1.1.3 |
14 | Collapse Collection | toolshed.g2.bx.psu.edu/repos/nml/collapse_collections/collapse_dataset/5.1.0 |
15 | Compute | toolshed.g2.bx.psu.edu/repos/devteam/column_maker/Add_a_column1/1.6 |
16 | Compute | toolshed.g2.bx.psu.edu/repos/devteam/column_maker/Add_a_column1/1.6 |
17 | Replace | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_find_and_replace/1.1.3 |
18 | Datamash | toolshed.g2.bx.psu.edu/repos/iuc/datamash_ops/datamash_ops/1.1.0 |
19 | Filter | Filter1 |
20 | Datamash | toolshed.g2.bx.psu.edu/repos/iuc/datamash_ops/datamash_ops/1.1.0 |
21 | Join | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_easyjoin_tool/1.1.2 |
22 | Datamash | toolshed.g2.bx.psu.edu/repos/iuc/datamash_ops/datamash_ops/1.1.0 |
23 | Datamash | toolshed.g2.bx.psu.edu/repos/iuc/datamash_ops/datamash_ops/1.1.0 |
24 | Datamash | toolshed.g2.bx.psu.edu/repos/iuc/datamash_ops/datamash_ops/1.1.0 |
25 | Join | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_easyjoin_tool/1.1.2 |
26 | Join | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_easyjoin_tool/1.1.2 |
27 | Cut | Cut1 |
28 | Join | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_easyjoin_tool/1.1.2 |
29 | Cut | Cut1 |
30 | Replace | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_find_and_replace/1.1.3 |
31 | Cut | Cut1 |
32 | Split file | toolshed.g2.bx.psu.edu/repos/bgruening/split_file_to_collection/split_file_to_collection/0.5.0 |
33 | Sort | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_sort_header_tool/1.1.1 |
34 | Sort | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_sort_header_tool/1.1.1 |
Outputs
ID | Name | Description | Type |
---|---|---|---|
prefiltered_variants | prefiltered_variants | n/a | |
filtered_variants | filtered_variants | n/a | |
filtered_extracted_variants | filtered_extracted_variants | n/a | |
af_recalculated | af_recalculated | n/a | |
collapsed_effects | collapsed_effects | n/a | |
highest_impact_effects | highest_impact_effects | n/a | |
cleaned_header | cleaned_header | n/a | |
processed_variants_collection | processed_variants_collection | n/a | |
all_variants_all_samples | all_variants_all_samples | n/a | |
variants_for_plotting | variants_for_plotting | n/a | |
by_variant_report | by_variant_report | n/a | |
combined_variant_report | combined_variant_report | n/a | |
Version History
Version 1 (earliest) Created 1st Jun 2022 at 16:36 by Anton Nekrutenko
Initial commit
master
0a39792
Creators
Not specified
Additional credit
Wolfgang Maier
Created: 1st Jun 2022 at 16:36
Last updated: 3rd Jun 2022 at 10:28