Workflow Type: Galaxy
Stable

Post-assembly quality control workflow for genome assemblies, using Quast, BUSCO, Meryl, Merqury and Fasta Statistics. Updates, November 2023.

Inputs: reads in fastqsanger.gz format (not fastq.gz), and an assembly as assembly.fasta.

New default settings: for BUSCO, lineage = eukaryota; for Quast, lineage = eukaryotes and genome = large.

The workflow reports assembly statistics in a table called metrics.tsv, which includes selected metrics from Fasta Stats and the read coverage. It also reports BUSCO versions and dependencies, and displays these tables in the workflow report.

Note: a known bug is that the workflow report text sometimes resets to the default text. To restore it, find an earlier workflow version with the correct report text, then copy and paste that text into the current version.
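The workflow steps describe read coverage as total read length divided by assembly length. The sketch below illustrates that arithmetic outside Galaxy; it is not the workflow's own implementation. The function names are hypothetical, the FASTQ file is assumed to use four lines per record, and rounding to two decimal places is a choice made for this example.

```python
import gzip
from pathlib import Path


def total_read_bases(fastq_gz: Path) -> int:
    """Sum the lengths of all reads in a gzipped FASTQ file.

    Assumption: every record occupies exactly four lines, so the
    sequence is always the second line of each group of four.
    """
    total = 0
    with gzip.open(fastq_gz, "rt") as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 1:
                total += len(line.strip())
    return total


def assembly_length(fasta: Path) -> int:
    """Sum the lengths of all sequence lines in a FASTA file."""
    total = 0
    with open(fasta) as handle:
        for line in handle:
            if not line.startswith(">"):
                total += len(line.strip())
    return total


def estimate_coverage(fastq_gz: Path, fasta: Path, ndigits: int = 2) -> float:
    """Coverage = total read bases / assembly length, rounded for display."""
    size = assembly_length(fasta)
    if size == 0:
        raise ValueError("assembly contains no sequence")
    return round(total_read_bases(fastq_gz) / size, ndigits)
```

For example, `estimate_coverage(Path("reads.fastq.gz"), Path("assembly.fasta"))` returns a single number that can be compared with the coverage row in metrics.tsv; the two file names here are placeholders.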

Inputs

ID | Name | Description | Type
FASTA contigs - Primary Assembly | #main/FASTA contigs - Primary Assembly | n/a | File
Raw reads | #main/Raw reads | n/a | File

Steps

ID | Name | Description
2 | FASTQ to FASTA | toolshed.g2.bx.psu.edu/repos/devteam/fastqtofasta/fastq_to_fasta_python/1.1.5
3 | Meryl | toolshed.g2.bx.psu.edu/repos/iuc/meryl/meryl/1.3+galaxy6
4 | Fasta Statistics | toolshed.g2.bx.psu.edu/repos/iuc/fasta_stats/fasta-stats/2.0
5 | Quast | toolshed.g2.bx.psu.edu/repos/iuc/quast/quast/5.0.2+galaxy1
6 | Busco | toolshed.g2.bx.psu.edu/repos/iuc/busco/busco/5.4.6+galaxy0
7 | Fasta Statistics | toolshed.g2.bx.psu.edu/repos/iuc/fasta_stats/fasta-stats/2.0
8 | Merqury | toolshed.g2.bx.psu.edu/repos/iuc/merqury/merqury/1.3
9 | Search in textfiles | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_grep_tool/1.1.1
10 | Relabel some items in Fasta stats | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_sed_tool/1.1.1
11 | Get required Busco stats | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_grep_tool/1.1.1
12 | Get Busco version | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_grep_tool/1.1.1
13 | Get Busco dependencies | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_grep_tool/1.1.1
14 | Search in textfiles | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_grep_tool/1.1.1
15 | Cut | Cut1
16 | Filter out unneeded lines from fasta stats | toolshed.g2.bx.psu.edu/repos/iuc/filter_tabular/filter_tabular/3.3.0
17 | Rename some items and add in delimiters for later | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_sed_tool/1.1.1
18 | Reformat some text | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_sed_tool/1.1.1
19 | Cut | Cut1
20 | Extract assembly size | toolshed.g2.bx.psu.edu/repos/iuc/filter_tabular/filter_tabular/3.3.0
21 | Extract number of contigs | toolshed.g2.bx.psu.edu/repos/iuc/filter_tabular/filter_tabular/3.3.0
22 | Extract Contig N and L 50s and 90s | toolshed.g2.bx.psu.edu/repos/iuc/filter_tabular/filter_tabular/3.3.0
23 | Extract longest contig | toolshed.g2.bx.psu.edu/repos/iuc/filter_tabular/filter_tabular/3.3.0
24 | Extract GC content | toolshed.g2.bx.psu.edu/repos/iuc/filter_tabular/filter_tabular/3.3.0
25 | Convert commas to tabs | Convert characters1
26 | Collate Busco info | cat1
27 | Paste | Paste1
28 | Add blank header | toolshed.g2.bx.psu.edu/repos/bgruening/add_line_to_file/add_line_to_file/0.1.0
29 | Transpose cols to rows | toolshed.g2.bx.psu.edu/repos/iuc/datamash_transpose/datamash_transpose/1.8+galaxy0
30 | Convert to table | Convert characters1
31 | Compute coverage, total reads length divided by assembly length | toolshed.g2.bx.psu.edu/repos/devteam/column_maker/Add_a_column1/2.0
32 | Convert underscores to tabs | Convert characters1
33 | Keep two columns | Cut1
34 | Round the percentage to 2 decimal places | toolshed.g2.bx.psu.edu/repos/devteam/column_maker/Add_a_column1/2.0
35 | Label the column | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_replace_in_column/1.1.3
36 | Join info into one table | cat1
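Steps 20 to 24 extract assembly size, number of contigs, contig N50/L50 and N90/L90, longest contig, and GC content from the Fasta Statistics output. As a rough illustration of what those metrics mean, the sketch below computes them directly from contig sequences using the conventional definitions (Nx is the length of the contig at which the running total of contigs, sorted longest first, reaches x% of the assembly; Lx is the number of contigs needed to get there). This is not the Fasta Statistics implementation, and the function names and dictionary keys are hypothetical.

```python
from typing import Dict, List, Tuple


def nx_lx(lengths: List[int], percent: int) -> Tuple[int, int]:
    """Return (Nx, Lx) for a list of contig lengths and a percentage x."""
    if not lengths:
        raise ValueError("at least one contig length is required")
    if not 0 < percent <= 100:
        raise ValueError("percent must be between 1 and 100")
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # Integer comparison avoids floating-point edge cases at the threshold.
        if running * 100 >= total * percent:
            return length, count
    return ordered[-1], len(ordered)


def assembly_metrics(contigs: Dict[str, str]) -> Dict[str, float]:
    """Compute a small set of assembly metrics from {name: sequence}."""
    lengths = [len(seq) for seq in contigs.values()]
    total = sum(lengths)
    gc = sum(seq.upper().count("G") + seq.upper().count("C")
             for seq in contigs.values())
    n50, l50 = nx_lx(lengths, 50)
    n90, l90 = nx_lx(lengths, 90)
    return {
        "assembly_size": total,
        "num_contigs": len(lengths),
        "longest_contig": max(lengths),
        "N50": n50,
        "L50": l50,
        "N90": n90,
        "L90": l90,
        "gc_percent": round(100 * gc / total, 2),
    }
```

Values computed this way can serve as a sanity check against the corresponding rows of the workflow's metrics table, bearing in mind that tools may differ in how they treat ambiguous bases or very short contigs.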

Outputs

ID | Name | Description | Type
Busco and dependencies version | #main/Busco and dependencies version | n/a | File
Busco on input dataset(s): full table | #main/Busco on input dataset(s): full table | n/a | File
Fasta Statistics on input dataset(s): summary stats | #main/Fasta Statistics on input dataset(s): summary stats | n/a | File
Genome assembly metrics | #main/Genome assembly metrics | n/a | File
Genome coverage | #main/Genome coverage | n/a | File
Merqury on input dataset(s): bed | #main/Merqury on input dataset(s): bed | n/a | File
Merqury on input dataset(s): png | #main/Merqury on input dataset(s): png | n/a | File
Merqury on input dataset(s): qv | #main/Merqury on input dataset(s): qv | n/a | File
Merqury on input dataset(s): size files | #main/Merqury on input dataset(s): size files | n/a | File
Merqury on input dataset(s): stats | #main/Merqury on input dataset(s): stats | n/a | File
Merqury on input dataset(s): wig | #main/Merqury on input dataset(s): wig | n/a | File
Meryl on input dataset(s): read-db.meryldb | #main/Meryl on input dataset(s): read-db.meryldb | n/a | File
Quast on input dataset(s): HTML report | #main/Quast on input dataset(s): HTML report | n/a | File
Quast on input dataset(s): PDF report | #main/Quast on input dataset(s): PDF report | n/a | File
Quast on input dataset(s): Log | #main/Quast on input dataset(s): Log | n/a | File
Quast on input dataset(s): tabular report | #main/Quast on input dataset(s): tabular report | n/a | File
_anonymous_output_1 | #main/_anonymous_output_1 | n/a | File
_anonymous_output_10 | #main/_anonymous_output_10 | n/a | File
_anonymous_output_11 | #main/_anonymous_output_11 | n/a | File
_anonymous_output_12 | #main/_anonymous_output_12 | n/a | File
_anonymous_output_13 | #main/_anonymous_output_13 | n/a | File
_anonymous_output_14 | #main/_anonymous_output_14 | n/a | File
_anonymous_output_15 | #main/_anonymous_output_15 | n/a | File
_anonymous_output_16 | #main/_anonymous_output_16 | n/a | File
_anonymous_output_17 | #main/_anonymous_output_17 | n/a | File
_anonymous_output_18 | #main/_anonymous_output_18 | n/a | File
_anonymous_output_19 | #main/_anonymous_output_19 | n/a | File
_anonymous_output_2 | #main/_anonymous_output_2 | n/a | File
_anonymous_output_20 | #main/_anonymous_output_20 | n/a | File
_anonymous_output_21 | #main/_anonymous_output_21 | n/a | File
_anonymous_output_22 | #main/_anonymous_output_22 | n/a | File
_anonymous_output_23 | #main/_anonymous_output_23 | n/a | File
_anonymous_output_24 | #main/_anonymous_output_24 | n/a | File
_anonymous_output_25 | #main/_anonymous_output_25 | n/a | File
_anonymous_output_26 | #main/_anonymous_output_26 | n/a | File
_anonymous_output_3 | #main/_anonymous_output_3 | n/a | File
_anonymous_output_4 | #main/_anonymous_output_4 | n/a | File
_anonymous_output_5 | #main/_anonymous_output_5 | n/a | File
_anonymous_output_6 | #main/_anonymous_output_6 | n/a | File
_anonymous_output_7 | #main/_anonymous_output_7 | n/a | File
_anonymous_output_8 | #main/_anonymous_output_8 | n/a | File
_anonymous_output_9 | #main/_anonymous_output_9 | n/a | File
out_file1 | #main/out_file1 | n/a | File
outfile | #main/outfile | n/a | File

Version History

v2.0.5 (latest) Created 6th Aug 2024 at 11:04 by Anna Syme

Merge pull request #7 from AustralianBioCommons/supernord-workflow-name-fix

Update workflow name in ro-crate-metadata.json


Frozen v2.0.5 fe2213b

v2.0.4 Created 19th Apr 2024 at 02:56 by Anna Syme

Merge pull request #7 from AustralianBioCommons/supernord-workflow-name-fix

Update workflow name in ro-crate-metadata.json


Frozen v2.0.4 fe2213b

v2.0.2 Created 16th Apr 2024 at 08:19 by Johan Gustafsson

Update .lifemonitor.yaml


Frozen v2.0.2 4ad99a2

v1.1.0 Created 9th May 2023 at 01:59 by Johan Gustafsson

Add missing raw data input


Frozen v1.1.0 46d8253

v1.0.0 (earliest) Created 7th Nov 2022 at 07:10 by Johan Gustafsson

Update links


Frozen v1.0.0 efaf002
Citation
Price, G., & Syme, A. (2024). Genome-assessment-post-assembly. WorkflowHub. https://doi.org/10.48546/WORKFLOWHUB.WORKFLOW.403.3
Activity

Created: 7th Nov 2022 at 07:10

Last updated: 6th Aug 2024 at 11:04
