Workflow Type: Galaxy
Stable

Post-genome assembly quality control workflow using Quast, BUSCO, Meryl, Merqury and Fasta Statistics. Updates November 2023.

  • Inputs: reads as fastqsanger.gz (not fastq.gz), and assembly.fasta. (To change format: click on the pencil icon next to the file in the Galaxy history, then "Datatypes", then set "New type" as fastqsanger.gz).
  • New default settings for BUSCO: lineage = eukaryota; for Quast: lineage = eukaryotes, genome = large.
  • Reports assembly stats into a table called metrics.tsv, including selected metrics from Fasta Stats, and read coverage; reports BUSCO versions and dependencies; and displays these tables in the workflow report.
  • Note: a known bug sometimes resets the workflow report text to the default text. To restore it:
      • Open the workflow in Galaxy for editing.
      • Click on the "Edit Report" icon.
      • Copy and paste the following text into the workflow report, then exit and save.

# Workflow Execution Report

Workflow name: Genome assessment post assembly

## Genome assembly metrics

Selected statistics from the workflow outputs. Additional metrics are available in other outputs in the history.

```galaxy
history_dataset_display(output="Genome assembly metrics")
```

## Software

Busco version and dependencies:

```galaxy
history_dataset_display(output="Busco and dependencies version")
```

## Galaxy Australia

Thanks for using Galaxy! When you use Galaxy Australia to support your publication or project, please acknowledge its use with the following statement: "This work is supported by Galaxy Australia, a service provided by the Australian BioCommons and its partners. The service receives NCRIS funding through Bioplatforms Australia and the Australian Research Data Commons (https://doi.org/10.47486/PL105), as well as The University of Melbourne and Queensland Government RICF funding."

Inputs

| ID | Name | Description | Type |
| --- | --- | --- | --- |
| FASTA contigs - Primary Assembly | #main/FASTA contigs - Primary Assembly | n/a | File |
| Raw reads | #main/Raw reads | n/a | File |
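
The raw reads input must carry the fastqsanger.gz datatype, which can also be set programmatically instead of through the pencil icon. The sketch below assumes the BioBlend Python client and its `histories.update_dataset` method with a `datatype` keyword; the function name `set_reads_datatype`, the server URL, the API key, and the IDs are placeholders for illustration, not part of the workflow.

```python
from typing import Any

READS_DATATYPE = "fastqsanger.gz"  # the type the workflow expects for raw reads


def set_reads_datatype(gi: Any, history_id: str, dataset_id: str) -> Any:
    """Ask Galaxy to relabel one history dataset as fastqsanger.gz.

    `gi` is assumed to be a bioblend.galaxy.GalaxyInstance, or any object
    exposing the same `histories.update_dataset` method.
    """
    # Only the datatype label changes; the file contents are not converted.
    return gi.histories.update_dataset(
        history_id, dataset_id, datatype=READS_DATATYPE
    )


# Hypothetical usage (URL, key, and IDs are placeholders):
#   from bioblend.galaxy import GalaxyInstance
#   gi = GalaxyInstance(url="https://usegalaxy.org.au", key="YOUR_API_KEY")
#   set_reads_datatype(gi, history_id="...", dataset_id="...")
```

Passing the client in as an argument keeps the helper independent of any particular server, so the same call works against whichever Galaxy instance hosts the history.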

Steps

| ID | Name | Description |
| --- | --- | --- |
| 2 | FASTQ to FASTA | toolshed.g2.bx.psu.edu/repos/devteam/fastqtofasta/fastq_to_fasta_python/1.1.5 |
| 3 | Meryl | toolshed.g2.bx.psu.edu/repos/iuc/meryl/meryl/1.3+galaxy6 |
| 4 | Fasta Statistics | toolshed.g2.bx.psu.edu/repos/iuc/fasta_stats/fasta-stats/2.0 |
| 5 | Quast | toolshed.g2.bx.psu.edu/repos/iuc/quast/quast/5.0.2+galaxy1 |
| 6 | Busco | toolshed.g2.bx.psu.edu/repos/iuc/busco/busco/5.4.6+galaxy0 |
| 7 | Fasta Statistics | toolshed.g2.bx.psu.edu/repos/iuc/fasta_stats/fasta-stats/2.0 |
| 8 | Merqury | toolshed.g2.bx.psu.edu/repos/iuc/merqury/merqury/1.3 |
| 9 | Search in textfiles | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_grep_tool/1.1.1 |
| 10 | Relabel some items in Fasta stats | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_sed_tool/1.1.1 |
| 11 | Get required Busco stats | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_grep_tool/1.1.1 |
| 12 | Get Busco version | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_grep_tool/1.1.1 |
| 13 | Get Busco dependencies | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_grep_tool/1.1.1 |
| 14 | Search in textfiles | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_grep_tool/1.1.1 |
| 15 | Cut | Cut1 |
| 16 | Filter out unneeded lines from fasta stats | toolshed.g2.bx.psu.edu/repos/iuc/filter_tabular/filter_tabular/3.3.0 |
| 17 | Rename some items and add in delimiters for later | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_sed_tool/1.1.1 |
| 18 | Reformat some text | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_sed_tool/1.1.1 |
| 19 | Cut | Cut1 |
| 20 | Extract assembly size | toolshed.g2.bx.psu.edu/repos/iuc/filter_tabular/filter_tabular/3.3.0 |
| 21 | Extract number of contigs | toolshed.g2.bx.psu.edu/repos/iuc/filter_tabular/filter_tabular/3.3.0 |
| 22 | Extract Contig N and L 50s and 90s | toolshed.g2.bx.psu.edu/repos/iuc/filter_tabular/filter_tabular/3.3.0 |
| 23 | Extract longest contig | toolshed.g2.bx.psu.edu/repos/iuc/filter_tabular/filter_tabular/3.3.0 |
| 24 | Extract GC content | toolshed.g2.bx.psu.edu/repos/iuc/filter_tabular/filter_tabular/3.3.0 |
| 25 | Convert commas to tabs | Convert characters1 |
| 26 | Collate Busco info | cat1 |
| 27 | Paste | Paste1 |
| 28 | Add blank header | toolshed.g2.bx.psu.edu/repos/bgruening/add_line_to_file/add_line_to_file/0.1.0 |
| 29 | Transpose cols to rows | toolshed.g2.bx.psu.edu/repos/iuc/datamash_transpose/datamash_transpose/1.8+galaxy0 |
| 30 | Convert to table | Convert characters1 |
| 31 | Compute coverage, total reads length divided by assembly length | toolshed.g2.bx.psu.edu/repos/devteam/column_maker/Add_a_column1/2.0 |
| 32 | Convert underscores to tabs | Convert characters1 |
| 33 | Keep two columns | Cut1 |
| 34 | Round the percentage to 2 decimal places | toolshed.g2.bx.psu.edu/repos/devteam/column_maker/Add_a_column1/2.0 |
| 35 | Label the column | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_replace_in_column/1.1.3 |
| 36 | Join info into one table | cat1 |
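
Step 22 extracts the contig N50/L50 and N90/L90 values, and step 31 computes coverage as total reads length divided by assembly length. The sketch below illustrates what these numbers mean using the conventional definitions; it is not the code the workflow tools run, the function names `nx_lx` and `read_coverage` are made up for this example, and the Fasta Statistics tool may treat edge cases differently.

```python
from typing import Iterable, Tuple


def nx_lx(contig_lengths: Iterable[int], fraction: float) -> Tuple[int, int]:
    """Return (Nx, Lx) for the given fraction, e.g. 0.5 for N50/L50.

    Contigs are taken longest first. Nx is the length of the contig at which
    the running total first reaches `fraction` of the assembly size, and Lx is
    the number of contigs needed to get there.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    target = fraction * sum(lengths)
    running = 0
    for count, length in enumerate(lengths, start=1):
        running += length
        if running >= target:
            return length, count
    # Only reached if rounding keeps the running total just under the target.
    return lengths[-1], len(lengths)


def read_coverage(total_read_bases: int, assembly_length: int) -> float:
    """Coverage as described in step 31: total reads length / assembly length."""
    if assembly_length <= 0:
        raise ValueError("assembly length must be positive")
    return total_read_bases / assembly_length


# Toy assembly of five contigs, 300 bases in total.
contigs = [100, 80, 60, 40, 20]
n50, l50 = nx_lx(contigs, 0.5)   # (80, 2): 100 + 80 reaches half of 300
n90, l90 = nx_lx(contigs, 0.9)   # (40, 4): 100 + 80 + 60 + 40 reaches 270
depth = read_coverage(9000, sum(contigs))  # 30.0
```

In the toy example, two contigs are enough to cover half the assembly, so L50 is 2 and N50 is the length of the second contig.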

Outputs

| ID | Name | Description | Type |
| --- | --- | --- | --- |
| Busco and dependencies version | #main/Busco and dependencies version | n/a | File |
| Busco on input dataset(s): full table | #main/Busco on input dataset(s): full table | n/a | File |
| Fasta Statistics on input dataset(s): summary stats | #main/Fasta Statistics on input dataset(s): summary stats | n/a | File |
| Genome assembly metrics | #main/Genome assembly metrics | n/a | File |
| Genome coverage | #main/Genome coverage | n/a | File |
| Merqury on input dataset(s): bed | #main/Merqury on input dataset(s): bed | n/a | File |
| Merqury on input dataset(s): png | #main/Merqury on input dataset(s): png | n/a | File |
| Merqury on input dataset(s): qv | #main/Merqury on input dataset(s): qv | n/a | File |
| Merqury on input dataset(s): size files | #main/Merqury on input dataset(s): size files | n/a | File |
| Merqury on input dataset(s): stats | #main/Merqury on input dataset(s): stats | n/a | File |
| Merqury on input dataset(s): wig | #main/Merqury on input dataset(s): wig | n/a | File |
| Meryl on input dataset(s): read-db.meryldb | #main/Meryl on input dataset(s): read-db.meryldb | n/a | File |
| Quast on input dataset(s): HTML report | #main/Quast on input dataset(s): HTML report | n/a | File |
| Quast on input dataset(s): PDF report | #main/Quast on input dataset(s): PDF report | n/a | File |
| Quast on input dataset(s): Log | #main/Quast on input dataset(s): Log | n/a | File |
| Quast on input dataset(s): tabular report | #main/Quast on input dataset(s): tabular report | n/a | File |
| _anonymous_output_1 | #main/_anonymous_output_1 | n/a | File |
| _anonymous_output_10 | #main/_anonymous_output_10 | n/a | File |
| _anonymous_output_11 | #main/_anonymous_output_11 | n/a | File |
| _anonymous_output_12 | #main/_anonymous_output_12 | n/a | File |
| _anonymous_output_13 | #main/_anonymous_output_13 | n/a | File |
| _anonymous_output_14 | #main/_anonymous_output_14 | n/a | File |
| _anonymous_output_15 | #main/_anonymous_output_15 | n/a | File |
| _anonymous_output_16 | #main/_anonymous_output_16 | n/a | File |
| _anonymous_output_17 | #main/_anonymous_output_17 | n/a | File |
| _anonymous_output_18 | #main/_anonymous_output_18 | n/a | File |
| _anonymous_output_19 | #main/_anonymous_output_19 | n/a | File |
| _anonymous_output_2 | #main/_anonymous_output_2 | n/a | File |
| _anonymous_output_20 | #main/_anonymous_output_20 | n/a | File |
| _anonymous_output_21 | #main/_anonymous_output_21 | n/a | File |
| _anonymous_output_22 | #main/_anonymous_output_22 | n/a | File |
| _anonymous_output_23 | #main/_anonymous_output_23 | n/a | File |
| _anonymous_output_24 | #main/_anonymous_output_24 | n/a | File |
| _anonymous_output_25 | #main/_anonymous_output_25 | n/a | File |
| _anonymous_output_26 | #main/_anonymous_output_26 | n/a | File |
| _anonymous_output_3 | #main/_anonymous_output_3 | n/a | File |
| _anonymous_output_4 | #main/_anonymous_output_4 | n/a | File |
| _anonymous_output_5 | #main/_anonymous_output_5 | n/a | File |
| _anonymous_output_6 | #main/_anonymous_output_6 | n/a | File |
| _anonymous_output_7 | #main/_anonymous_output_7 | n/a | File |
| _anonymous_output_8 | #main/_anonymous_output_8 | n/a | File |
| _anonymous_output_9 | #main/_anonymous_output_9 | n/a | File |
| out_file1 | #main/out_file1 | n/a | File |
| outfile | #main/outfile | n/a | File |

Version History

v2.0.5 (latest) Created 6th Aug 2024 at 11:04 by Anna Syme

Merge pull request #7 from AustralianBioCommons/supernord-workflow-name-fix

Update workflow name in ro-crate-metadata.json


Frozen v2.0.5 fe2213b

v2.0.4 Created 19th Apr 2024 at 02:56 by Anna Syme

Merge pull request #7 from AustralianBioCommons/supernord-workflow-name-fix

Update workflow name in ro-crate-metadata.json


Frozen v2.0.4 fe2213b

v2.0.2 Created 16th Apr 2024 at 08:19 by Johan Gustafsson

Update .lifemonitor.yaml


Frozen v2.0.2 4ad99a2

v1.1.0 Created 9th May 2023 at 01:59 by Johan Gustafsson

Add missing raw data input


Frozen v1.1.0 46d8253

v1.0.0 (earliest) Created 7th Nov 2022 at 07:10 by Johan Gustafsson

Update links


Frozen v1.0.0 efaf002
Citation
Price, G., & Syme, A. (2024). Genome-assessment-post-assembly. WorkflowHub. https://doi.org/10.48546/WORKFLOWHUB.WORKFLOW.403.4
Activity

Views: 5371   Downloads: 648   Runs: 6

Created: 7th Nov 2022 at 07:10

Last updated: 6th Aug 2024 at 11:04


Total size: 978 KB