Purge-duplicates-one-haplotype-VGP6b/main
v0.7.1

Workflow Type: Galaxy

Purge Duplicate Contigs

Purge contigs marked as duplicates by purge_dups in a single haplotype (these may be haplotypic duplications or overlap duplications). This is the 6th workflow of the VGP pipeline. It is meant to be run after one of the contigging steps (Workflow 3, 4, or 5).

Inputs

  1. Genomescope model parameters [txt] (Generated by the k-mer profiling workflow)
  2. HiFi long reads - trimmed [fastq] (Generated by Cutadapt in the contigging workflow)
  3. Assembly to purge (e.g. hap1) [fasta] (Generated by the contigging workflow)
  4. K-mer database [meryldb] (Generated by the k-mer profiling workflow)
  5. Assembly to leave alone (used for Merqury statistics) (e.g. hap2) [fasta] (Generated by the contigging workflow)
  6. Estimated Genome Size [txt]
  7. Database for BUSCO lineage (recommended: latest)
  8. BUSCO lineage (recommended: vertebrata)
  9. Name of un-altered assembly
  10. Name of purged assembly
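
As a minimal sketch of how the inputs above might be checked before launching a run, the snippet below lists the input labels as they appear in the workflow's input table and reports any required one that is unset. The labels come from this page; the helper `missing_inputs`, the `example_job` dictionary, and every file name in it are hypothetical placeholders. The two name inputs are treated as optional because the table types them as `string?`.

```python
# Input labels copied from the workflow's input table on this page.
REQUIRED_INPUTS = [
    "Assembly to leave alone (For Merqury comparison)",
    "Assembly to purge",
    "Database for Busco Lineage",
    "Estimated genome size - Parameter File",
    "Genomescope model parameters",
    "Lineage",
    "Meryl Database",
    "Pacbio Reads Collection - Trimmed",
]
# Typed "string?" in the table, so treated here as optional.
OPTIONAL_INPUTS = ["Name of purged assembly", "Name of un-altered assembly"]


def missing_inputs(job: dict) -> list:
    """Return the required input labels that are absent or empty in `job`."""
    return [
        label
        for label in REQUIRED_INPUTS
        if label not in job or job[label] in (None, "", [])
    ]


# Hypothetical job description: all paths and values are placeholders.
example_job = {
    "Assembly to leave alone (For Merqury comparison)": "hap2.fasta",
    "Assembly to purge": "hap1.fasta",
    "Database for Busco Lineage": "latest",
    "Estimated genome size - Parameter File": "genome_size.txt",
    "Genomescope model parameters": "genomescope_model.txt",
    "Lineage": "vertebrata",
    "Meryl Database": "reads.meryldb",
    "Pacbio Reads Collection - Trimmed": ["reads_1.fastq", "reads_2.fastq"],
}

if __name__ == "__main__":
    print(missing_inputs(example_job))
```

Running the check before submission catches an unset label early; it does not validate file formats or contents.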

Outputs

  1. Haplotype 1 purged assembly (FASTA and GFA)
  2. Haplotype 2 purged assembly (FASTA and GFA)
  3. QC: BUSCO report for both assemblies
  4. QC: Merqury report for both assemblies
  5. QC: Assembly statistics for both assemblies
  6. QC: Nx plot for both assemblies
  7. QC: Size plot for both assemblies
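
For readers unfamiliar with the Nx plot listed above, the sketch below computes Nx values from a list of contig lengths using the standard definition: Nx is the length of the shortest contig in the smallest set of longest contigs that together cover at least x% of the assembly. This illustrates the statistic only; it is not taken from the workflow, and the function name `nx` and the example lengths are assumptions.

```python
def nx(lengths, x):
    """Return Nx for a collection of contig lengths (standard definition).

    Contigs are visited from longest to shortest; the length of the contig
    at which the running total first reaches x% of the assembly size is Nx.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    if not 0 < x <= 100:
        raise ValueError("x must be in the range (0, 100]")
    ordered = sorted(lengths, reverse=True)
    threshold = sum(ordered) * x / 100
    running = 0
    for length in ordered:
        running += length
        if running >= threshold:
            return length
    return ordered[-1]  # Guard against floating-point rounding at x = 100.


if __name__ == "__main__":
    contigs = [100, 80, 60, 40, 20]  # Made-up contig lengths.
    for x in (10, 50, 90):
        print(f"N{x} = {nx(contigs, x)}")
```

An Nx plot draws this value for every x from 1 to 100, so a curve that stays high for longer indicates a more contiguous assembly.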

Inputs

ID / Name | Description | Type
Assembly to leave alone (For Merqury comparison) | n/a | File
Assembly to purge | n/a | File
Database for Busco Lineage | n/a | string
Estimated genome size - Parameter File | n/a | File
Genomescope model parameters | n/a | File
Lineage | Taxonomic lineage for the organism being assembled for Busco analysis | string
Meryl Database | n/a | File
Name of purged assembly | n/a | string?
Name of un-altered assembly | n/a | string?
Pacbio Reads Collection - Trimmed | n/a | File[]

Steps

ID Name Description
10 Compute toolshed.g2.bx.psu.edu/repos/devteam/column_maker/Add_a_column1/2.0
11 Map with minimap2 toolshed.g2.bx.psu.edu/repos/iuc/minimap2/minimap2/2.28+galaxy0
12 Purge overlaps toolshed.g2.bx.psu.edu/repos/iuc/purge_dups/purge_dups/1.2.6+galaxy0
13 gfastats toolshed.g2.bx.psu.edu/repos/bgruening/gfastats/gfastats/1.3.6+galaxy0
14 Estimated genome size param_value_from_file
15 Cut Cut1
16 Cut Cut1
17 Map with minimap2 toolshed.g2.bx.psu.edu/repos/iuc/minimap2/minimap2/2.28+galaxy0
18 gfastats_data_prep n/a
19 gfastats toolshed.g2.bx.psu.edu/repos/bgruening/gfastats/gfastats/1.3.6+galaxy0
20 Parse parameter value param_value_from_file
21 Parse parameter value param_value_from_file
22 Text reformatting toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_awk_tool/9.3+galaxy1
23 Purge overlaps toolshed.g2.bx.psu.edu/repos/iuc/purge_dups/purge_dups/1.2.6+galaxy0
24 Purge overlaps toolshed.g2.bx.psu.edu/repos/iuc/purge_dups/purge_dups/1.2.6+galaxy0
25 Remove REPEATs from BED toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_grep_tool/9.3+galaxy1
26 Purge overlaps toolshed.g2.bx.psu.edu/repos/iuc/purge_dups/purge_dups/1.2.6+galaxy0
27 Merqury toolshed.g2.bx.psu.edu/repos/iuc/merqury/merqury/1.3+galaxy4
28 gfastats toolshed.g2.bx.psu.edu/repos/bgruening/gfastats/gfastats/1.3.6+galaxy0
29 Busco toolshed.g2.bx.psu.edu/repos/iuc/busco/busco/5.5.0+galaxy0
30 Convert purged fasta to gfa toolshed.g2.bx.psu.edu/repos/bgruening/gfastats/gfastats/1.3.6+galaxy0
31 gfastats toolshed.g2.bx.psu.edu/repos/bgruening/gfastats/gfastats/1.3.6+galaxy0
32 merqury_QV __EXTRACT_DATASET__
33 output_merqury.spectra-cn.fl __EXTRACT_DATASET__
34 output_merqury.spectra-asm.fl __EXTRACT_DATASET__
35 output_merqury.assembly_01.spectra-cn.fl __EXTRACT_DATASET__
36 merqury_stats __EXTRACT_DATASET__
37 output_merqury.assembly_02.spectra-cn.fl __EXTRACT_DATASET__
38 gfastats_data_prep n/a
39 Text reformatting toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_awk_tool/9.3+galaxy1
40 gfastats_plot n/a
41 Join two Datasets join1
42 Advanced Cut toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_cut_tool/9.3+galaxy1
43 Replace toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_find_and_replace/9.3+galaxy1
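
The tool identifiers in the steps table follow the Galaxy ToolShed pattern `host/repos/owner/repository/tool/version`, while built-in tools such as `Cut1` or `join1` carry a bare identifier. As a small sketch, the helper below splits a ToolShed identifier into its parts, which can be handy when auditing which tool versions a workflow release pins. The names `ToolRef` and `parse_tool_id` are illustrative and not part of any Galaxy API.

```python
from typing import NamedTuple, Optional


class ToolRef(NamedTuple):
    """Components of a ToolShed-style tool identifier."""
    host: str
    owner: str
    repo: str
    tool: str
    version: str


def parse_tool_id(tool_id: str) -> Optional[ToolRef]:
    """Split 'host/repos/owner/repo/tool/version'; return None otherwise."""
    parts = tool_id.strip().split("/")
    if len(parts) != 6 or parts[1] != "repos":
        return None  # Built-in tools (e.g. 'Cut1') have no ToolShed path.
    host, _, owner, repo, tool, version = parts
    return ToolRef(host, owner, repo, tool, version)


if __name__ == "__main__":
    # Identifiers copied from the steps table above.
    for tool_id in (
        "toolshed.g2.bx.psu.edu/repos/iuc/purge_dups/purge_dups/1.2.6+galaxy0",
        "toolshed.g2.bx.psu.edu/repos/iuc/minimap2/minimap2/2.28+galaxy0",
        "Cut1",
    ):
        print(tool_id, "->", parse_tool_id(tool_id))
```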

Outputs

ID / Name | Description | Type
Cutoffs | n/a | File
Read Coverage and cutoffs calculation: Histogram plot | n/a | File
Removed haplotigs | n/a | File
Purged assembly | n/a | File
Merqury on Phased assemblies: Images | n/a | File
Merqury on Phased assemblies: stats | n/a | File
qv_files | n/a | File
Busco on Purged Primary assembly: short summary | n/a | File
Busco on Purged Primary assembly: summary image | n/a | File
Purged assembly (GFA) | n/a | File
Purged assembly statistics | n/a | File
merqury_QV | n/a | File
output_merqury.spectra-cn.fl | n/a | File
output_merqury.spectra-asm.fl | n/a | File
output_merqury.assembly_01.spectra-cn.fl | n/a | File
merqury_stats | n/a | File
output_merqury.assembly_02.spectra-cn.fl | n/a | File
Nx Plot | n/a | File
Size Plot | n/a | File
Assembly statistics for both assemblies | n/a | File
clean_stats | n/a | File

Version History

v0.7.1 (earliest) Created 7th Sep 2024 at 03:02 by WorkflowHub Bot

Updated to v0.7.1


Frozen v0.7.1 cfe3920
Creators and Submitter
Creators
Not specified
Additional credit

Galaxy, VGP
