SEEK ID: https://workflowhub.eu/people/200
Location: Australia
ORCID: https://orcid.org/0000-0002-9906-0673
Joined: 8th Nov 2021
Expertise: Not specified
Tools: Not specified
Related items
The Australian BioCommons enhances digital life science research through world class collaborative distributed infrastructure. It aims to ensure that Australian life science research remains globally competitive, through sustained strategic leadership, research community engagement, digital service provision, training and support.
Teams: Australian BioCommons, QCIF Bioinformatics, Pawsey Supercomputing Research Centre, Sydney Informatics Hub, Janis, Melbourne Data Analytics Platform (MDAP), Galaxy Australia
Web page: https://www.biocommons.org.au/
Space: Australian BioCommons
Public web page: https://www.biocommons.org.au/
Organisms: Not specified
Galaxy is an open, web-based platform for accessible, reproducible, and transparent computational biological research.
- Accessible: Users can easily run tools without writing code or using the CLI; all via a user-friendly web interface.
- Reproducible: Galaxy captures all the metadata from an analysis, making it completely reproducible.
- Transparent: Users share and publish analyses via interactive pages that can enhance analyses with user annotations.
- Scalable: Galaxy ...
Space: Australian BioCommons
Public web page: https://usegalaxy.org.au/
Organisms: Not specified
Genome assembly workflow for nanopore reads, for TSI
Input:
- Nanopore reads (can be in format: fastq, fastq.gz, fastqsanger, or fastqsanger.gz)
Optional settings to specify when the workflow is run:
- [1] number of files to split the original input into (to speed up the workflow). default = 0. example: set to 2000 to split a 60 GB read file into 2000 files of ~30 MB each.
- [2] filtering: min average read quality score. default = 10
- [3] filtering: min read length. default = 200
- [4] ...
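The splitting and filtering settings above can be illustrated with a minimal Python sketch. It is not the workflow's implementation (the Galaxy tools behind these settings are not named here). The function names and the use of a plain arithmetic mean for "average read quality" are assumptions made for illustration; real filtering tools may compute the average differently.

```python
def approx_chunk_size_mb(total_gb, n_chunks):
    """Approximate size of each chunk in MB, taking 1 GB = 1000 MB."""
    return total_gb * 1000 / n_chunks


def mean_phred(quality_string, offset=33):
    """Arithmetic mean of the Phred scores in a FASTQ quality string.

    Assumes Sanger encoding (ASCII offset 33), as in fastqsanger files.
    """
    scores = [ord(ch) - offset for ch in quality_string]
    return sum(scores) / len(scores)


def passes_filters(sequence, quality_string, min_quality=10, min_length=200):
    """Return True if a read meets both thresholds (defaults from the text)."""
    if len(sequence) < min_length:
        return False
    return mean_phred(quality_string) >= min_quality
```

For instance, `approx_chunk_size_mb(60, 2000)` reproduces the ~30 MB figure given for a 60 GB read file split into 2000 files.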
Post-genome assembly quality control workflow using Quast, BUSCO, Meryl, Merqury and Fasta Statistics. Updates November 2023.
- Inputs: reads as fastqsanger.gz (not fastq.gz), and assembly.fasta. (To change format: click on the pencil icon next to the file in the Galaxy history, then "Datatypes", then set "New type" as fastqsanger.gz).
- New default settings for BUSCO: lineage = eukaryota; for Quast: lineage = eukaryotes, genome = large.
- Reports assembly stats into a table called metrics.tsv, ...
Scaffolding using HiC data with YAHS
This workflow has been created from a Vertebrate Genomes Project (VGP) scaffolding workflow.
- For more information about the VGP see https://galaxyproject.org/projects/vgp/.
- The scaffolding workflow is at https://dockstore.org/workflows/github.com/iwc-workflows/Scaffolding-HiC-VGP8/main:main?tab=info
- Please see that link for the workflow diagram.
Some minor changes have been made to better fit with TSI project data:
- optional inputs of SAK info ...
This is part of a series of workflows to annotate a genome, tagged with TSI-annotation.
These workflows are based on command-line code by Luke Silver, converted into Galaxy Australia workflows.
The workflows can be run in this order:
- Repeat masking
- RNAseq QC and read trimming
- Find transcripts
- Combine transcripts
- Extract transcripts
- Convert formats
- Fgenesh annotation
Workflow information:
- Input = genome.fasta.
- Outputs = soft_masked_genome.fasta, hard_masked_genome.fasta, ...
This is part of a series of workflows to annotate a genome, tagged with TSI-annotation.
These workflows are based on command-line code by Luke Silver, converted into Galaxy Australia workflows.
The workflows can be run in this order:
- Repeat masking
- RNAseq QC and read trimming
- Find transcripts
- Combine transcripts
- Extract transcripts
- Convert formats
- Fgenesh annotation
For this workflow:
Inputs:
- assembled-genome.fasta
- hard-repeat-masked-genome.fasta
- If using the mRNAs option, ...
This is part of a series of workflows to annotate a genome, tagged with TSI-annotation.
These workflows are based on command-line code by Luke Silver, converted into Galaxy Australia workflows.
The workflows can be run in this order:
- Repeat masking
- RNAseq QC and read trimming
- Find transcripts
- Combine transcripts
- Extract transcripts
- Convert formats
- Fgenesh annotation
About this workflow:
- Inputs: transdecoder-peptides.fasta, transdecoder-nucleotides.fasta
- Runs many steps ...
This is part of a series of workflows to annotate a genome, tagged with TSI-annotation.
These workflows are based on command-line code by Luke Silver, converted into Galaxy Australia workflows.
The workflows can be run in this order:
- Repeat masking
- RNAseq QC and read trimming
- Find transcripts
- Combine transcripts
- Extract transcripts
- Convert formats
- Fgenesh annotation
About this workflow:
- Input: merged_transcriptomes.fasta.
- Runs TransDecoder to produce longest_transcripts.fasta ...
This is part of a series of workflows to annotate a genome, tagged with TSI-annotation.
These workflows are based on command-line code by Luke Silver, converted into Galaxy Australia workflows.
The workflows can be run in this order:
- Repeat masking
- RNAseq QC and read trimming
- Find transcripts
- Combine transcripts
- Extract transcripts
- Convert formats
- Fgenesh annotation
About this workflow:
- Inputs: multiple transcriptome.gtfs from different tissues, genome.fasta, coding_seqs.fasta, ...
This is part of a series of workflows to annotate a genome, tagged with TSI-annotation.
These workflows are based on command-line code by Luke Silver, converted into Galaxy Australia workflows.
The workflows can be run in this order:
- Repeat masking
- RNAseq QC and read trimming
- Find transcripts
- Combine transcripts
- Extract transcripts
- Convert formats
- Fgenesh annotation
About this workflow:
- Run this workflow per tissue.
- Inputs: masked_genome.fasta and the trimmed RNAseq reads ...
This is part of a series of workflows to annotate a genome, tagged with TSI-annotation.
These workflows are based on command-line code by Luke Silver, converted into Galaxy Australia workflows.
The workflows can be run in this order:
- Repeat masking
- RNAseq QC and read trimming
- Find transcripts
- Combine transcripts
- Extract transcripts
- Convert formats
- Fgenesh annotation
About this workflow:
- Repeat this workflow separately for datasets from different tissues.
- Inputs = collections ...
workflow-partial-gstacks-populations
These workflows are part of a set designed to work for RAD-seq data on the Galaxy platform, using the tools from the Stacks program.
Galaxy Australia: https://usegalaxy.org.au/
Stacks: http://catchenlab.life.illinois.edu/stacks/
This workflow is part of the reference-guided stacks workflow, https://workflowhub.eu/workflows/347
This workflow takes in bam files and a population map.
To generate bam files see: https://workflowhub.eu/workflows/351
workflow-partial-bwa-mem
These workflows are part of a set designed to work for RAD-seq data on the Galaxy platform, using the tools from the Stacks program.
Galaxy Australia: https://usegalaxy.org.au/
Stacks: http://catchenlab.life.illinois.edu/stacks/
This workflow is part of the reference-guided stacks workflow, https://workflowhub.eu/workflows/347
Inputs
- demultiplexed reads in fastq format, may be output from the QC workflow. Files are in a collection.
- reference genome in fasta format ...
workflow-partial-cstacks-sstacks-gstacks
These workflows are part of a set designed to work for RAD-seq data on the Galaxy platform, using the tools from the Stacks program.
Galaxy Australia: https://usegalaxy.org.au/
Stacks: http://catchenlab.life.illinois.edu/stacks/
This workflow takes in ustacks output, and runs cstacks, sstacks and gstacks.
To generate ustacks output see https://workflowhub.eu/workflows/349
For the full de novo workflow see https://workflowhub.eu/workflows/348
workflow-partial-ustacks-only
These workflows are part of a set designed to work for RAD-seq data on the Galaxy platform, using the tools from the Stacks program.
Galaxy Australia: https://usegalaxy.org.au/
Stacks: http://catchenlab.life.illinois.edu/stacks/
For the full de novo workflow see https://workflowhub.eu/workflows/348
You may want to run ustacks with different batches of samples.
- To be able to combine these later, there are some necessary steps - we need to keep track of how many ...
workflow-denovo-stacks
These workflows are part of a set designed to work for RAD-seq data on the Galaxy platform, using the tools from the Stacks program.
Galaxy Australia: https://usegalaxy.org.au/
Stacks: http://catchenlab.life.illinois.edu/stacks/
Inputs
- demultiplexed reads in fastq format, may be output from the QC workflow. Files are in a collection.
- population map in text format
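A Stacks population map is a plain-text, tab-separated file with one sample per line: the sample name, then the population it belongs to. The sketch below builds such a file's contents; the helper name and the sample and population names are hypothetical and serve only as an illustration.

```python
def format_population_map(sample_to_population):
    """Build population map text: one 'sample<TAB>population' line per sample."""
    lines = [
        f"{sample}\t{population}\n"
        for sample, population in sample_to_population.items()
    ]
    return "".join(lines)


# Hypothetical sample and population names, for illustration only.
example_map = format_population_map(
    {"sample_01": "pop_A", "sample_02": "pop_A", "sample_03": "pop_B"}
)
```

The resulting text can be saved to a file and uploaded to the Galaxy history alongside the reads.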
Steps and outputs
ustacks:
- input reads go to ustacks.
- ustacks assembles the reads into matching ...
workflow-ref-guided-stacks
These workflows are part of a set designed to work for RAD-seq data on the Galaxy platform, using the tools from the Stacks program.
Galaxy Australia: https://usegalaxy.org.au/
Stacks: http://catchenlab.life.illinois.edu/stacks/
Inputs
- demultiplexed reads in fastq format, may be output from the QC workflow. Files are in a collection.
- population map in text format
- reference genome in fasta format
Steps and outputs
BWA MEM 2:
- The reads are mapped to the ...
workflow-qc-of-radseq-reads
These workflows are part of a set designed to work for RAD-seq data on the Galaxy platform, using the tools from the Stacks program.
Galaxy Australia: https://usegalaxy.org.au/
Stacks: http://catchenlab.life.illinois.edu/stacks/
Inputs
- demultiplexed reads in fastq format, in a collection
- two adapter sequences in fasta format, for input into cutadapt
Steps and outputs
The workflow can be modified to suit your own parameters.
The workflow steps are:
- Run ...
Combined workflow for large genome assembly
The tutorial document for this workflow is here: https://doi.org/10.5281/zenodo.5655813
What it does: A workflow for genome assembly, containing subworkflows:
- Data QC
- Kmer counting
- Trim and filter reads
- Assembly with Flye
- Assembly polishing
- Assess genome quality
Inputs:
- long reads and short reads in fastq format
- reference genome for Quast
Outputs:
- Data information - QC, kmers
- Filtered, trimmed reads
- Genome assembly, assembly graph, ...
Assess genome quality; can run alone or as part of a combined workflow for large genome assembly.
- What it does: Assesses the quality of the genome assembly: generates some statistics, determines whether expected genes are present, and aligns contigs to a reference genome.
- Inputs: polished assembly; reference_genome.fasta (e.g. of a closely-related species, if available).
- Outputs: Busco table of genes found; Quast HTML report, and link to Icarus contigs browser, showing contigs aligned to a reference ...
Assembly polishing subworkflow: Racon polishing with long reads
Inputs: long reads and assembly contigs
Workflow steps:
- minimap2 : long reads are mapped to assembly => overlaps.paf.
- overlaps, long reads, assembly => Racon => polished assembly 1
- using polished assembly 1 as input, repeat minimap2 + racon => polished assembly 2
- using polished assembly 2 as input, repeat minimap2 + racon => polished assembly 3
- using polished assembly 3 as input, repeat minimap2 + racon => ...
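The repeated minimap2 + Racon rounds can be sketched as a loop in which each round's polished assembly becomes the input to the next round. The sketch below only builds the command lines and does not run them. The file names, the `map-ont` preset, and the number of rounds passed in are assumptions; the parameters used by the Galaxy workflow itself may differ.

```python
def polishing_commands(reads, assembly, rounds, preset="map-ont"):
    """Return the shell commands for iterative minimap2 + Racon polishing."""
    commands = []
    current = assembly
    for i in range(1, rounds + 1):
        overlaps = f"overlaps_{i}.paf"
        polished = f"polished_assembly_{i}.fasta"
        # Map the long reads to the current assembly (minimap2 writes PAF by default).
        commands.append(f"minimap2 -x {preset} {current} {reads} > {overlaps}")
        # Racon takes the reads, the overlaps, and the target assembly, in that order.
        commands.append(f"racon {reads} {overlaps} {current} > {polished}")
        # The polished assembly becomes the target of the next round.
        current = polished
    return commands


# Hypothetical file names, two rounds shown for brevity.
for command in polishing_commands("long_reads.fastq", "assembly.fasta", rounds=2):
    print(command)
```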
This is part of a series of workflows to annotate a genome, tagged with TSI-annotation.
These workflows are based on command-line code by Luke Silver, converted into Galaxy Australia workflows.
The workflows can be run in this order:
- Repeat masking
- RNAseq QC and read trimming
- Find transcripts
- Combine transcripts
- Extract transcripts
- Convert formats
- Fgenesh annotation