Workflow Type: Galaxy
Frozen
Purge Duplicate Contigs
Purge contigs marked as duplicates by purge_dups in a single haplotype (the duplication may be haplotypic or overlap duplication). This is the 6th workflow of the VGP pipeline. It is meant to be run after one of the contigging steps (Workflow 3, 4, or 5).
Inputs
- Genomescope model parameters [txt] (Generated by the k-mer profiling workflow)
- HiFi long reads - trimmed [fastq] (Generated by Cutadapt in the contigging workflow)
- Assembly to purge (e.g. hap1) [fasta] (Generated by the contigging workflow)
- K-mer database [meryldb] (Generated by the k-mer profiling workflow)
- Estimated Genome Size [txt]
- Assembly to leave alone (used for Merqury statistics) (e.g. hap2) [fasta] (Generated by the contigging workflow)
- Name of un-altered assembly
- Name of purged assembly
Outputs
- Haplotype 1 purged assembly (Fasta and gfa)
- Haplotype 2 purged assembly (Fasta and gfa)
- QC: BUSCO report for both assemblies
- QC: Merqury report for both assemblies
- QC: Assembly statistics for both assemblies
- QC: Nx plot for both assemblies
- QC: Size plot for both assemblies
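The Nx plot listed above summarises assembly contiguity. As a rough illustration of what a single point on such a plot means, the sketch below computes an Nx value from a list of contig lengths. It is not the gfastats implementation; the function name `nx` and the example lengths are hypothetical.

```python
def nx(lengths, x=50):
    """Return the Nx value of an assembly.

    Nx is the length L such that contigs of length >= L together
    cover at least x percent of the total assembly size.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    if not 0 < x <= 100:
        raise ValueError("x must be in the range (0, 100]")

    total = sum(lengths)
    running = 0
    # Walk from the longest contig down, accumulating covered bases.
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point rounding issues.
        if running * 100 >= total * x:
            return length


# Hypothetical contig lengths, for illustration only.
contig_lengths = [80, 70, 50, 40, 30, 20]
n50 = nx(contig_lengths, 50)
n90 = nx(contig_lengths, 90)
```

Plotting `nx(contig_lengths, x)` for x from 1 to 100 gives the general shape of an Nx curve; comparing the curves of the purged and un-altered assemblies shows whether purging changed contiguity.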
Inputs
ID | Name | Description | Type |
---|---|---|---|
Assembly to leave alone (For Merqury comparison) | Assembly to leave alone (For Merqury comparison) | n/a | |
Assembly to purge | Assembly to purge | n/a | |
Estimated genome size - Parameter File | Estimated genome size - Parameter File | n/a | |
Genomescope model parameters | Genomescope model parameters | n/a | |
Meryl Database | Meryl Database | n/a | |
Name of purged assembly | Name of purged assembly | n/a | |
Name of un-altered assembly | Name of un-altered assembly | n/a | |
Pacbio Reads Collection - Trimmed | Pacbio Reads Collection - Trimmed | n/a | |
Steps
ID | Name | Description |
---|---|---|
8 | Compute | toolshed.g2.bx.psu.edu/repos/devteam/column_maker/Add_a_column1/2.0 |
9 | Map with minimap2 | toolshed.g2.bx.psu.edu/repos/iuc/minimap2/minimap2/2.28+galaxy0 |
10 | Purge overlaps | toolshed.g2.bx.psu.edu/repos/iuc/purge_dups/purge_dups/1.2.6+galaxy0 |
11 | gfastats | toolshed.g2.bx.psu.edu/repos/bgruening/gfastats/gfastats/1.3.6+galaxy0 |
12 | Estimated genome size | param_value_from_file |
13 | Cut | Cut1 |
14 | Cut | Cut1 |
15 | Map with minimap2 | toolshed.g2.bx.psu.edu/repos/iuc/minimap2/minimap2/2.28+galaxy0 |
16 | gfastats_data_prep | n/a |
17 | gfastats | toolshed.g2.bx.psu.edu/repos/bgruening/gfastats/gfastats/1.3.6+galaxy0 |
18 | Parse parameter value | param_value_from_file |
19 | Parse parameter value | param_value_from_file |
20 | Text reformatting | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_awk_tool/9.3+galaxy1 |
21 | Purge overlaps | toolshed.g2.bx.psu.edu/repos/iuc/purge_dups/purge_dups/1.2.6+galaxy0 |
22 | Purge overlaps | toolshed.g2.bx.psu.edu/repos/iuc/purge_dups/purge_dups/1.2.6+galaxy0 |
23 | Remove REPEATs from BED | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_grep_tool/9.3+galaxy1 |
24 | Purge overlaps | toolshed.g2.bx.psu.edu/repos/iuc/purge_dups/purge_dups/1.2.6+galaxy0 |
25 | Merqury | toolshed.g2.bx.psu.edu/repos/iuc/merqury/merqury/1.3+galaxy4 |
26 | gfastats | toolshed.g2.bx.psu.edu/repos/bgruening/gfastats/gfastats/1.3.6+galaxy0 |
27 | Busco | toolshed.g2.bx.psu.edu/repos/iuc/busco/busco/5.5.0+galaxy0 |
28 | Convert purged fasta to gfa | toolshed.g2.bx.psu.edu/repos/bgruening/gfastats/gfastats/1.3.6+galaxy0 |
29 | gfastats | toolshed.g2.bx.psu.edu/repos/bgruening/gfastats/gfastats/1.3.6+galaxy0 |
30 | merqury_QV | __EXTRACT_DATASET__ |
31 | output_merqury.spectra-cn.fl | __EXTRACT_DATASET__ |
32 | output_merqury.spectra-asm.fl | __EXTRACT_DATASET__ |
33 | output_merqury.assembly_01.spectra-cn.fl | __EXTRACT_DATASET__ |
34 | merqury_stats | __EXTRACT_DATASET__ |
35 | output_merqury.assembly_02.spectra-cn.fl | __EXTRACT_DATASET__ |
36 | gfastats_data_prep | n/a |
37 | Text reformatting | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_awk_tool/9.3+galaxy1 |
38 | gfastats_plot | n/a |
39 | Join two Datasets | join1 |
40 | Advanced Cut | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_cut_tool/9.3+galaxy1 |
41 | Replace | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_find_and_replace/9.3+galaxy1 |
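Step 23 in the table above ("Remove REPEATs from BED") uses a grep tool to filter the purge_dups BED output. The exact pattern used by the workflow is not shown here, so the following is only a sketch of the idea. It assumes a tab-separated BED file whose fourth column holds the duplication type (for example HAPLOTIG, OVLP, or REPEAT); the function name `drop_repeats` and the contig names are hypothetical.

```python
def drop_repeats(bed_lines):
    """Return the BED lines whose type column is not REPEAT.

    Assumption: the duplication type is stored in the fourth
    tab-separated column of each line.
    """
    kept = []
    for line in bed_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 4 and fields[3] == "REPEAT":
            continue  # skip entries flagged as repeats
        kept.append(line)
    return kept


# Hypothetical BED records, for illustration only.
example = [
    "contig_1\t0\t15000\tHAPLOTIG\tcontig_7\n",
    "contig_2\t0\t8000\tREPEAT\tcontig_3\n",
    "contig_4\t120000\t135000\tOVLP\tcontig_9\n",
]
filtered = drop_repeats(example)
```

Matching on the type column rather than on the whole line avoids dropping a record only because a contig name happens to contain the text "REPEAT".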
Outputs
ID | Name | Description | Type |
---|---|---|---|
Cutoffs | Cutoffs | n/a | |
Read Coverage and cutoffs calculation: Histogram plot | Read Coverage and cutoffs calculation: Histogram plot | n/a | |
Removed haplotigs | Removed haplotigs | n/a | |
Purged assembly | Purged assembly | n/a | |
Merqury on Phased assemblies: Images | Merqury on Phased assemblies: Images | n/a | |
Merqury on Phased assemblies: stats | Merqury on Phased assemblies: stats | n/a | |
qv_files | qv_files | n/a | |
Busco on Purged Primary assembly: short summary | Busco on Purged Primary assembly: short summary | n/a | |
Busco on Purged Primary assembly: summary image | Busco on Purged Primary assembly: summary image | n/a | |
Purged assembly (GFA) | Purged assembly (GFA) | n/a | |
Purged assembly statistics | Purged assembly statistics | n/a | |
merqury_QV | merqury_QV | n/a | |
output_merqury.spectra-cn.fl | output_merqury.spectra-cn.fl | n/a | |
output_merqury.spectra-asm.fl | output_merqury.spectra-asm.fl | n/a | |
output_merqury.assembly_01.spectra-cn.fl | output_merqury.assembly_01.spectra-cn.fl | n/a | |
merqury_stats | merqury_stats | n/a | |
output_merqury.assembly_02.spectra-cn.fl | output_merqury.assembly_02.spectra-cn.fl | n/a | |
Nx Plot | Nx Plot | n/a | |
Size Plot | Size Plot | n/a | |
Assembly statistics for both assemblies | Assembly statistics for both assemblies | n/a | |
clean_stats | clean_stats | n/a | |
Version History
v0.7.1 (latest) Created 7th Oct 2024 at 16:34 by WorkflowHub Bot
Updated to v0.7.1
cfe3920
v0.1 (earliest) Created 15th Feb 2024 at 03:01 by WorkflowHub Bot
Updated to v0.1
49773bd
Creators and Submitter
Creators
Not specified
Additional credit
Galaxy, VGP
Activity
Views: 3292 Downloads: 865 Runs: 0
Created: 15th Feb 2024 at 03:01
Last updated: 17th Aug 2024 at 03:02