Workflows
MGnify (http://www.ebi.ac.uk/metagenomics) provides a free-to-use platform for the assembly, analysis and archiving of microbiome data derived from sequencing microbial populations that are present in particular environments. Over the past 2 years, MGnify (formerly EBI Metagenomics) has more than doubled the number of publicly available analysed datasets held within the resource. Recently, an updated approach to data analysis has been unveiled (version 5.0), replacing the previous single pipeline ...
Type: Common Workflow Language
Creator: Alex L Mitchell, Alexandre Almeida, Martin Beracochea, Miguel Boland, Josephine Burgin, Guy Cochrane, Michael R Crusoe, Varsha Kale, Simon C Potter, Lorna J Richardson, Ekaterina Sakharova, Maxim Scheremetjew, Anton Korobeynikov, Alex Shlemov, Olga Kunyavskaya, Alla Lapidus, Robert D Finn
Submitter: Martin Beracochea
HiFi de novo genome assembly workflow
HiFi-assembly-workflow is a bioinformatics pipeline for de novo genome assembly from PacBio Circular Consensus Sequencing (CCS) reads. The workflow is implemented in Nextflow and has 3 major sections.
Please refer to the following documentation for a detailed description of each workflow section:
Phylogenetic reconstruction using genome-wide and single-gene alignment data. Here we use the maximum-likelihood reconstruction program IQ-TREE. Data can be prepared with the phylogenetic data preparation workflow before phylogenetic reconstruction. The resulting trees can be viewed interactively using Galaxy's 'Phyloviz' or 'Phylogenetic Tree Visualization'.
This workflow begins from a set of genome assemblies of different samples, strains or species. Each genome is first annotated with Funannotate. Predicted proteins are further annotated with BUSCO. Next, 'ProteinOrtho' finds orthologs across the samples and builds orthogroups. Orthogroups in which all samples are represented are extracted. Orthologs in each orthogroup are aligned with ClustalW. Test dataset: https://zenodo.org/record/6610704#.Ypn3FzlBw5k
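The orthogroup filtering step described above (keep only orthogroups in which every sample is represented) can be sketched in Python. This is a minimal illustration, not the workflow's own implementation: the in-memory dictionary layout, the function name `complete_orthogroups` and the sample and protein names are all assumptions made for the example, and do not reflect ProteinOrtho's output format.

```python
def complete_orthogroups(orthogroups, samples):
    """Return only the orthogroups with at least one protein from every sample.

    orthogroups: dict mapping an orthogroup id to a dict of
                 sample name -> list of protein ids (hypothetical layout).
    samples:     iterable of all sample names that must be represented.
    """
    wanted = set(samples)
    return {
        group_id: members
        for group_id, members in orthogroups.items()
        # A sample counts as represented only if its protein list is non-empty.
        if wanted <= {s for s, proteins in members.items() if proteins}
    }


# Hypothetical example data
samples = ["strainA", "strainB", "strainC"]
orthogroups = {
    "OG1": {"strainA": ["a1"], "strainB": ["b1"], "strainC": ["c1"]},
    "OG2": {"strainA": ["a2"], "strainB": [], "strainC": ["c2"]},  # strainB empty
    "OG3": {"strainA": ["a3"], "strainC": ["c3"]},                 # strainB missing
}

kept = complete_orthogroups(orthogroups, samples)
print(sorted(kept))
```

Only `OG1` passes the filter here, because the other two groups lack a protein from `strainB`.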
Generic variation analysis on WGS PE data
This workflow performs paired-end read mapping with bwa-mem, followed by sensitive variant calling across a wide range of allele frequencies (AFs) with lofreq and variant annotation with snpEff. The reference genome can be provided as a GenBank file.
Generic consensus building
This workflow generates consensus sequences from a list of variants produced by the Variant Calling Workflow.
The workflow accepts a single input:
- A collection of VCF files
The workflow produces a single output:
- Consensus sequence for each input VCF file
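The idea of building one consensus sequence per set of variants can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the tools the workflow actually uses are not described here, variants are given as plain `(position, ref, alt)` tuples rather than parsed from VCF files, only single-nucleotide variants are handled, and the function name `apply_snvs` and all data are hypothetical.

```python
def apply_snvs(reference, variants):
    """Return the reference sequence with each single-nucleotide variant applied.

    variants: iterable of (pos, ref, alt) tuples, with pos 1-based as in VCF.
    """
    seq = list(reference)
    for pos, ref, alt in variants:
        if len(ref) != 1 or len(alt) != 1:
            raise ValueError(f"only SNVs are handled in this sketch: {pos} {ref}>{alt}")
        if seq[pos - 1] != ref:
            raise ValueError(f"reference mismatch at position {pos}")
        seq[pos - 1] = alt
    return "".join(seq)


# Hypothetical reference and per-sample variant lists
reference = "ACGTACGT"
per_sample_variants = {
    "sample1": [(2, "C", "T")],
    "sample2": [(4, "T", "A"), (8, "T", "G")],
}

# One consensus sequence per input variant list
consensus = {
    name: apply_snvs(reference, variants)
    for name, variants in per_sample_variants.items()
}
print(consensus)
```

Insertions, deletions and allele-frequency thresholds are deliberately left out; a real consensus step would need to handle them.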
The workflow can be accessed at usegalaxy.org
Generic variation analysis reporting
This workflow generates reports from a list of variants produced by the Variant Calling Workflow.
The workflow accepts a single input:
- A collection of VCF files
The workflow produces two outputs (format description below):
- A list of variants grouped by Sample
- A list of variants grouped by Variant
Here is an example of the output by sample. In this table, all variants in all samples are explicitly listed:
| Sample | ...
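The difference between the two report layouts (grouped by sample versus grouped by variant) can be sketched in Python. The record fields, function names and values below are assumptions chosen for illustration; the actual report columns are not reproduced here.

```python
from collections import defaultdict

# Hypothetical variant records: (sample, position, ref, alt, allele_frequency)
records = [
    ("sample1", 100, "A", "G", 0.95),
    ("sample1", 250, "C", "T", 0.10),
    ("sample2", 100, "A", "G", 0.40),
]


def by_sample(records):
    """Group records so that each sample lists every variant found in it."""
    grouped = defaultdict(list)
    for sample, pos, ref, alt, af in records:
        grouped[sample].append((pos, ref, alt, af))
    return dict(grouped)


def by_variant(records):
    """Group records so that each variant lists the samples that carry it."""
    grouped = defaultdict(list)
    for sample, pos, ref, alt, af in records:
        grouped[(pos, ref, alt)].append((sample, af))
    return dict(grouped)


sample_report = by_sample(records)
variant_report = by_variant(records)
print(sample_report)
print(variant_report)
```

The by-sample view repeats a shared variant once per sample, while the by-variant view collapses it into a single entry listing the samples that carry it.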
Generic variant calling
A generic workflow for identification of variants in a haploid genome such as genomes of bacteria or viruses. It can be readily used on MonkeyPox. The workflow accepts two inputs:
- A GenBank file with the reference genomes
- A collection of paired fastqsanger files
The workflow outputs a collection of VCF files for each sample (each fastq pair). These VCF files serve as input to the Reporting workflow.
Workflow can be accessed ...
workflow-partial-gstacks-populations
These workflows are part of a set designed to work for RAD-seq data on the Galaxy platform, using the tools from the Stacks program.
Galaxy Australia: https://usegalaxy.org.au/
Stacks: http://catchenlab.life.illinois.edu/stacks/
This workflow is part of the reference-guided stacks workflow, https://workflowhub.eu/workflows/347
This workflow takes in BAM files and a population map.
To generate BAM files see: https://workflowhub.eu/workflows/351
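A population map for Stacks is a plain tab-separated text file with one sample per line: the sample name, then the population it belongs to. The Python sketch below renders such a file as text. The helper name `population_map` and the sample and population names are hypothetical; in practice the sample names need to match the names of the BAM files being analysed.

```python
import csv
import io


def population_map(assignments):
    """Render (sample, population) pairs as tab-separated text, one sample per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    for sample, population in assignments:
        writer.writerow([sample, population])
    return buffer.getvalue()


# Hypothetical sample-to-population assignments
popmap_text = population_map([
    ("sample_01", "pop1"),
    ("sample_02", "pop1"),
    ("sample_03", "pop2"),
])
print(popmap_text, end="")
```

The resulting text can be saved to a file and supplied to the workflow as its population map input.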