Workflow Type: Galaxy
Stable

Assembly Evaluation for ERGA-BGE Reports

One Assembly, HiFi WGS reads + HiC reads

The workflow requires the following:

  • Species Taxonomy ID number
  • NCBI Genome assembly accession code
  • BUSCO Lineage
  • WGS accurate reads accession code
  • NCBI HiC reads accession code

The workflow downloads the data and processes it to generate genome profiling (GenomeScope and, optionally, Smudgeplot), assembly stats (gfastats), Merqury stats (QV, completeness), BUSCO, snailplot, contamination blobplot, and HiC heatmap.

Use this workflow for HiFi-based assemblies, where the WGS accurate reads are PacBio HiFi.
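
The Merqury QV listed among the outputs can be illustrated with a short sketch. It assumes the k-mer based estimate described in the Merqury publication: the fraction of assembly k-mers that are also found in the reads is converted to a per-base error rate and then to a Phred-scaled value. The function name and arguments are hypothetical and are not part of the workflow; in the workflow itself the Merqury tool computes this value.

```python
import math


def merqury_style_qv(asm_only_kmers: int, total_asm_kmers: int, k: int = 21) -> float:
    """Phred-scaled consensus quality from k-mer counts (illustrative only).

    asm_only_kmers:  k-mers found in the assembly but not in the reads
    total_asm_kmers: all k-mers found in the assembly
    k:               k-mer length (21 is the workflow's default kmer length)
    """
    if total_asm_kmers <= 0 or not 0 <= asm_only_kmers <= total_asm_kmers:
        raise ValueError("k-mer counts are inconsistent")
    if asm_only_kmers == 0:
        # No unsupported k-mers: the estimated error rate is zero.
        return math.inf
    # Probability that a single base is correct, given the shared k-mer fraction.
    shared_fraction = 1 - asm_only_kmers / total_asm_kmers
    error_rate = 1 - shared_fraction ** (1 / k)
    return -10 * math.log10(error_rate)


if __name__ == "__main__":
    # Hypothetical counts, chosen only to show the call.
    print(round(merqury_style_qv(100, 1_000_000), 1))
```

A higher value means fewer assembly k-mers lack read support. The counts in the usage line are made up for illustration and do not come from any run of this workflow.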

Inputs

Each input is listed as name (type): description. The input IDs are identical to the names.

  • BUSCO Lineage (string): Choose the (eukaryotic) BUSCO lineage that corresponds to the assembled species, e.g. mammalia_odb10
  • Multiple HiC paired-end files? (boolean): IMPORTANT! If you entered more than one HiC accession code, select Yes
  • NCBI Genome assembly accession code (string): Should start with GCA or GCF, e.g. GCA_963556495.2
  • NCBI HiC reads accession code (string): Comma-separated accession codes of the reads. Each must start with SRR, DRR or ERR, e.g. SRR925743, ERR343809
  • NCBI HiFi reads accession code (string): Comma-separated accession codes of the reads. Each must start with SRR, DRR or ERR, e.g. SRR925743, ERR343809
  • Ploidy (int?): Default value: 2
  • Run Smudgeplot? (boolean): n/a
  • Species Taxonomy ID number (int): Get the NCBI taxonomy number here: https://www.ncbi.nlm.nih.gov/taxonomy
  • kmer length (int?): Default value: 21
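
The accession rules above can be checked before launching a run. The following is a minimal sketch, not part of the workflow: the function names are hypothetical, and the exact patterns (an underscore and digits after GCA/GCF, digits only after SRR/DRR/ERR) are assumptions that go slightly beyond the stated "must start with" rules.

```python
import re

# Assumed patterns; the workflow only states the required prefixes.
ASSEMBLY_RE = re.compile(r"^GC[AF]_\d+(\.\d+)?$")
RUN_RE = re.compile(r"^[SDE]RR\d+$")


def split_accessions(value: str) -> list[str]:
    """Split a comma-separated accession string and drop surrounding spaces."""
    return [item.strip() for item in value.split(",") if item.strip()]


def check_inputs(assembly: str, hifi: str, hic: str) -> dict:
    """Validate the three accession inputs and derive the 'multiple HiC' flag."""
    assembly = assembly.strip()
    if not ASSEMBLY_RE.match(assembly):
        raise ValueError(f"assembly accession should start with GCA or GCF: {assembly!r}")
    hifi_runs = split_accessions(hifi)
    hic_runs = split_accessions(hic)
    if not hifi_runs or not hic_runs:
        raise ValueError("at least one HiFi and one HiC accession are required")
    for run in hifi_runs + hic_runs:
        if not RUN_RE.match(run):
            raise ValueError(f"read accession must start with SRR, DRR or ERR: {run!r}")
    return {
        "assembly": assembly,
        "hifi_runs": hifi_runs,
        "hic_runs": hic_runs,
        # Mirrors the "Multiple HiC paired-end files?" input.
        "multiple_hic": len(hic_runs) > 1,
    }


if __name__ == "__main__":
    # Accessions taken from the examples given in the input descriptions.
    print(check_inputs("GCA_963556495.2", "SRR925743", "SRR925743, ERR343809"))
```

With two HiC accessions, as in the usage line, the derived flag is true, which corresponds to selecting Yes for "Multiple HiC paired-end files?".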

Steps

ID Name Description
1 taxdump address toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_text_file_with_recurring_lines/9.3+galaxy1
10 downloads lftp
11 NCBI Datasets Genomes toolshed.g2.bx.psu.edu/repos/iuc/ncbi_datasets/datasets_download_genome/16.20.0+galaxy0
12 Faster Download and Extract Reads in FASTQ toolshed.g2.bx.psu.edu/repos/iuc/sra_tools/fasterq_dump/3.1.1+galaxy0
13 Faster Download and Extract Reads in FASTQ toolshed.g2.bx.psu.edu/repos/iuc/sra_tools/fasterq_dump/3.1.1+galaxy0
14 Collapse Collection toolshed.g2.bx.psu.edu/repos/nml/collapse_collections/collapse_dataset/5.1.0
15 Flatten collection __FLATTEN__
16 Cutadapt toolshed.g2.bx.psu.edu/repos/lparsons/cutadapt/cutadapt/4.9+galaxy1
17 fastp toolshed.g2.bx.psu.edu/repos/iuc/fastp/fastp/0.23.4+galaxy0
18 Extract dataset __EXTRACT_DATASET__
19 Flatten collection __FLATTEN__
20 Create BlobtoolKit toolshed.g2.bx.psu.edu/repos/bgruening/blobtoolkit/blobtoolkit/4.0.7+galaxy2
21 gfastats toolshed.g2.bx.psu.edu/repos/bgruening/gfastats/gfastats/1.3.6+galaxy0
22 Diamond toolshed.g2.bx.psu.edu/repos/bgruening/diamond/bg_diamond/2.0.15+galaxy0
23 Map with minimap2 toolshed.g2.bx.psu.edu/repos/iuc/minimap2/minimap2/2.28+galaxy0
24 Busco toolshed.g2.bx.psu.edu/repos/iuc/busco/busco/5.5.0+galaxy0
25 BWA-MEM2 toolshed.g2.bx.psu.edu/repos/iuc/bwa_mem2/bwa_mem2/2.2.1+galaxy1
26 Convert FASTA to fai file CONVERTER_fasta_to_fai
27 Meryl toolshed.g2.bx.psu.edu/repos/iuc/meryl/meryl/1.3+galaxy6
28 Merge BAM Files toolshed.g2.bx.psu.edu/repos/devteam/sam_merge/sam_merge2/1.2.0
29 Sambamba merge toolshed.g2.bx.psu.edu/repos/bgruening/sambamba_merge/sambamba_merge/1.0.1+galaxy1
30 Extract dataset __EXTRACT_DATASET__
31 Cut Cut1
32 Meryl toolshed.g2.bx.psu.edu/repos/iuc/meryl/meryl/1.3+galaxy6
33 BlobToolKit toolshed.g2.bx.psu.edu/repos/bgruening/blobtoolkit/blobtoolkit/4.0.7+galaxy2
34 BAM/SAM Mapping Stats toolshed.g2.bx.psu.edu/repos/nilesh/rseqc/rseqc_bam_stat/5.0.3+galaxy0
35 Pick parameter value toolshed.g2.bx.psu.edu/repos/iuc/pick_value/pick_value/0.2.0
36 bedtools MakeWindowsBed toolshed.g2.bx.psu.edu/repos/iuc/bedtools/bedtools_makewindowsbed/2.31.1
37 Merqury toolshed.g2.bx.psu.edu/repos/iuc/merqury/merqury/1.3+galaxy3
38 Meryl toolshed.g2.bx.psu.edu/repos/iuc/meryl/meryl/1.3+galaxy6
39 BlobToolKit toolshed.g2.bx.psu.edu/repos/bgruening/blobtoolkit/blobtoolkit/4.0.7+galaxy2
40 BlobToolKit toolshed.g2.bx.psu.edu/repos/bgruening/blobtoolkit/blobtoolkit/4.0.7+galaxy2
41 Pairtools parse toolshed.g2.bx.psu.edu/repos/iuc/pairtools_parse/pairtools_parse/1.1.0+galaxy1
42 Sambamba flagstat toolshed.g2.bx.psu.edu/repos/bgruening/sambamba_flagstat/sambamba_flagstat/1.0.1+galaxy1
43 Smudgeplot toolshed.g2.bx.psu.edu/repos/galaxy-australia/smudgeplot/smudgeplot/0.2.5+galaxy3
44 GenomeScope toolshed.g2.bx.psu.edu/repos/iuc/genomescope/genomescope/2.0+galaxy2
45 Pairtools sort toolshed.g2.bx.psu.edu/repos/iuc/pairtools_sort/pairtools_sort/1.1.0+galaxy1
46 Pairtools dedup toolshed.g2.bx.psu.edu/repos/iuc/pairtools_dedup/pairtools_dedup/1.1.0+galaxy1
47 Pairtools split toolshed.g2.bx.psu.edu/repos/iuc/pairtools_split/pairtools_split/1.1.0+galaxy1
48 cooler csort with tabix toolshed.g2.bx.psu.edu/repos/lldelisle/cooler_csort_tabix/cooler_csort_tabix/0.8.11+galaxy1
49 cooler_cload_tabix toolshed.g2.bx.psu.edu/repos/lldelisle/cooler_cload_tabix/cooler_cload_tabix/0.8.11+galaxy1
50 hicMergeMatrixBins toolshed.g2.bx.psu.edu/repos/bgruening/hicexplorer_hicmergematrixbins/hicexplorer_hicmergematrixbins/3.7.2+galaxy0
51 hicMergeMatrixBins toolshed.g2.bx.psu.edu/repos/bgruening/hicexplorer_hicmergematrixbins/hicexplorer_hicmergematrixbins/3.7.2+galaxy0
52 hicPlotMatrix toolshed.g2.bx.psu.edu/repos/bgruening/hicexplorer_hicplotmatrix/hicexplorer_hicplotmatrix/3.7.2+galaxy0
53 hicPlotMatrix toolshed.g2.bx.psu.edu/repos/bgruening/hicexplorer_hicplotmatrix/hicexplorer_hicplotmatrix/3.7.2+galaxy0
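
Most entries in the Description column are Tool Shed identifiers of the form host/repos/owner/repository/tool/version, while a few (for example __FLATTEN__ or Cut1) are bare tool IDs. A small sketch of how such identifiers could be split for an inventory of tool versions follows; the function name and the returned keys are hypothetical, and the six-part layout is inferred from the identifiers listed above.

```python
def parse_tool_id(tool_id: str) -> dict:
    """Split a step's tool identifier into its components.

    Identifiers with six slash-separated parts and 'repos' in second position
    are treated as Tool Shed tools; anything else is treated as a bare tool ID.
    """
    parts = tool_id.strip().split("/")
    if len(parts) == 6 and parts[1] == "repos":
        host, _, owner, repo, tool, version = parts
        return {
            "toolshed": True,
            "host": host,
            "owner": owner,
            "repo": repo,
            "tool": tool,
            "version": version,
        }
    return {"toolshed": False, "tool": tool_id.strip()}


if __name__ == "__main__":
    # Identifiers copied from the step list above.
    for tid in (
        "toolshed.g2.bx.psu.edu/repos/iuc/busco/busco/5.5.0+galaxy0",
        "toolshed.g2.bx.psu.edu/repos/iuc/merqury/merqury/1.3+galaxy3",
        "__FLATTEN__",
    ):
        print(parse_tool_id(tid))
```

Splitting the identifiers this way makes it easy to list which tool versions a given workflow revision depends on.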

Outputs

Each output is listed as name (type): description. The output IDs are identical to the names.

  • Busco on input dataset(s): short summary (File): n/a
  • Busco on input dataset(s): full table (File): n/a

Version History

Version 1.1 (latest) Created 4th Nov 2024 at 14:29 by Diego De Panis

No revision comments

Open master 9b0d0d4

Version 1 (earliest) Created 20th Aug 2024 at 14:19 by Diego De Panis

Initial commit


Frozen Version-1 48bc4d9
Creators and Submitter

Additional credit: ERGA

Citation
De Panis, D. (2024). ERGA-BGE Genome Report ASM analyses (one-asm HiFi + HiC). WorkflowHub. https://doi.org/10.48546/WORKFLOWHUB.WORKFLOW.1104.1
Created: 20th Aug 2024 at 14:19

Last updated: 20th Aug 2024 at 14:21