The workflow takes trimmed HiC forward and reverse reads and Pri/Alt assemblies, and produces a scaffolded primary assembly (and alternate contigs) using YaHS. It also runs all the QC analyses (gfastats, BUSCO, and Merqury).
The workflow takes a trimmed HiFi reads collection, Pri/Alt contigs, and the values for the transition parameter and max coverage depth (calculated from WF1) to run Purge_Dups. It produces purged Pri and Alt contig assemblies, and runs all the QC analyses (gfastats, BUSCO, and Merqury).
The workflow takes a trimmed HiFi reads collection and the max coverage depth (calculated from WF1) to run Hifiasm in HiFi solo mode. It produces a Pri/Alt assembly, and runs all the QC analyses (gfastats, BUSCO, and Merqury).
Assembly Evaluation for ERGA-BGE Reports
One Assembly, HiFi WGS reads + HiC reads
The workflow requires the following:
- Species Taxonomy ID number
- NCBI Genome assembly accession code
- BUSCO Lineage
- WGS accurate reads accession code
- NCBI HiC reads accession code
The workflow retrieves the data and processes it to generate genome profiling (GenomeScope, and optionally Smudgeplot), assembly stats (gfastats), Merqury stats (QV, completeness), BUSCO, snailplot, contamination blobplot, and HiC ...
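As a rough illustration, the five inputs listed above can be collected and sanity-checked before launching the workflow. This is only a sketch: the class name, field names, and example values are hypothetical and not part of the workflow itself. The accession patterns assume the usual NCBI assembly format (GCA_/GCF_ followed by nine digits and a version) and SRA/ENA/DDBJ run accessions (SRR/ERR/DRR).

```python
import re
from dataclasses import dataclass
from typing import List

# Assumed accession formats; adjust if your accessions differ.
ASSEMBLY_RE = re.compile(r"^GC[AF]_\d{9}\.\d+$")
RUN_RE = re.compile(r"^[SED]RR\d+$")


@dataclass
class EvaluationInputs:
    """Hypothetical container for the five inputs the workflow asks for."""
    taxonomy_id: int
    assembly_accession: str
    busco_lineage: str
    wgs_reads_accession: str
    hic_reads_accession: str

    def problems(self) -> List[str]:
        """Return a list of human-readable issues; empty if all checks pass."""
        issues = []
        if self.taxonomy_id <= 0:
            issues.append("taxonomy ID must be a positive integer")
        if not ASSEMBLY_RE.match(self.assembly_accession):
            issues.append("assembly accession does not look like GCA_/GCF_")
        if not self.busco_lineage.strip():
            issues.append("BUSCO lineage is empty")
        if not RUN_RE.match(self.wgs_reads_accession):
            issues.append("WGS reads accession does not look like a run accession")
        if not RUN_RE.match(self.hic_reads_accession):
            issues.append("HiC reads accession does not look like a run accession")
        return issues


# Placeholder values for illustration only.
example = EvaluationInputs(
    taxonomy_id=9606,
    assembly_accession="GCA_000000000.1",
    busco_lineage="primates_odb10",
    wgs_reads_accession="ERR0000001",
    hic_reads_accession="ERR0000002",
)
print(example.problems())
```

The checks only catch obviously malformed values; they do not confirm that an accession exists or that the lineage is one BUSCO provides.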
Assembly Evaluation for ERGA-BGE Reports
One Assembly, Illumina WGS reads + HiC reads
The workflow requires the following:
- Species Taxonomy ID number
- NCBI Genome assembly accession code
- BUSCO Lineage
- WGS accurate reads accession code
- NCBI HiC reads accession code
The workflow retrieves the data and processes it to generate genome profiling (GenomeScope, and optionally Smudgeplot), assembly stats (gfastats), Merqury stats (QV, completeness), BUSCO, snailplot, contamination blobplot, and ...
The workflow requires the user to provide:
- ENSEMBL link address of the annotation GFF3 file
- ENSEMBL link address of the assembly FASTA file
- NCBI taxonomy ID
- BUSCO lineage
- OMArk database
The workflow will produce statistics of the annotation based on AGAT, BUSCO and OMArk.
A pipeline to demultiplex, QC and map Nanopore data
Post-genome assembly quality control workflow using Quast, BUSCO, Meryl, Merqury and Fasta Statistics. Updates November 2023. Inputs: reads as fastqsanger.gz (not fastq.gz), and assembly.fasta. New default settings for BUSCO: lineage = eukaryota; for Quast: lineage = eukaryotes, genome = large. Reports assembly stats into a table called metrics.tsv, including selected metrics from Fasta Stats, and read coverage; reports BUSCO versions and dependencies; and displays these tables in the workflow ...
Post-genome assembly quality control workflow using Quast, BUSCO, Meryl, Merqury and Fasta Statistics. Updates November 2023.
- Inputs: reads as fastqsanger.gz (not fastq.gz), and assembly.fasta. (To change format: click on the pencil icon next to the file in the Galaxy history, then "Datatypes", then set "New type" as fastqsanger.gz).
- New default settings for BUSCO: lineage = eukaryota; for Quast: lineage = eukaryotes, genome = large.
- Reports assembly stats into a table called metrics.tsv, ...
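Once the workflow has finished, the metrics.tsv table can be downloaded from the Galaxy history and inspected outside Galaxy. The sketch below assumes the file is tab-separated with a header row; the function name is hypothetical, and the column names are not known here, so the code reads whatever header the file provides.

```python
import csv
from typing import Dict, List


def load_metrics(path: str) -> List[Dict[str, str]]:
    """Read a tab-separated table with a header row into a list of dicts."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))
```

Calling `load_metrics("metrics.tsv")` returns one dictionary per row, keyed by the header names, with all values kept as strings.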
BAM-to-FASTQ-QC
General recommendations for using BAM-to-FASTQ-QC
Please see the Genome assembly with hifiasm on Galaxy Australia guide.
Acknowledgements
The workflow & the doc_guidelines template used are supported by the Australian BioCommons via Bioplatforms Australia funding, the Australian Research Data Commons (https://doi.org/10.47486/PL105) ...
Collection of Galaxy workflows for generating results used for creating ERGA-BGE Reports
For a given genome, two workflows should be run: the assembly evaluation (ASM analyses) and the annotation evaluation (ANNOT analyses).
Depending on the kind of data used for the genome assembly, you should choose HiFi or ONT (Illumina) workflows for ASM analyses.