Workflows
What is a Workflow?Filters
Genome assembly workflow for nanopore reads, for TSI
Input:
- Nanopore reads (can be in format: fastq, fastq.gz, fastqsanger, or fastqsanger.gz)
Optional settings to specify when the workflow is run:
- [1] how many input files to split the original input into (to speed up the workflow). default = 0. example: set to 2000 to split a 60 GB read file into 2000 files of ~ 30 MB.
- [2] filtering: min average read quality score. default = 10
- [3] filtering: min read length. default = 200
- [4] ...
Process argo data with the Pangeo Ecosystem and visualise them with Ocean Data View (ODV)
A R workflow for proteomics data analysis is reported. This pipeline was basing on protein expression projects, stored on the PRIDE database and reported on the COVID-19 Data portal. This is an R pipeline to analyze protein expression data, built on lung cell lines infected by SARS-CoV-2 variants: B.1, Delta, and Omicron BA.1 (Mezler et al. 2023) https://www.ebi.ac.uk/pride/archive/projects/PXD037265. This pipeline can obtain DEPs for each variant, starting from normalized protein expression ...
Pairwise alignment pipeline (genome to genome or reads to genome)
GraphRBF is a state-of-the-art protein-protein/nucleic acid interaction site prediction model built by enhanced graph neural networks and prioritized radial basis function neural networks. This project serves users to use our software to directly predict protein binding sites or train our model on a new database. Identification of protein-protein and protein-nucleic acid binding sites provides insights into biological processes related to protein functions and technical guidance for disease ...
Swedish Earth Biogenome Project - Genome Assembly Workflow
The primary genome assembly workflow for the Earth Biogenome Project at NBIS.
Workflow overview
General aim:
flowchart LR
hifi[/ HiFi reads /] --> data_inspection
ont[/ ONT reads /] --> data_inspection
hic[/ Hi-C reads /] --> data_inspection
data_inspection[[ Data inspection ]] --> preprocessing
preprocessing[[ Preprocessing ]] --> assemble
assemble[[ Assemble ]] --> validation
validation[[ Assembly
...
Secondary metabolite biosynthetic gene cluster (SMBGC) Annotation using Neural Networks Trained on Interpro Signatures
The workflow requires the user to provide:
- ENSEMBL link address of the annotation GFF3 file
- ENSEMBL link address of the assembly FASTA file
- NCBI taxonomy ID
- BUSCO lineage
- OMArk database
Thw workflow will produce statistics of the annotation based on AGAT, BUSCO and OMArk.
Assembly Evaluation for ERGA-BGE Reports
One Assmebly, HiFi WGS reads + HiC reads
The workflow requires the following:
- Species Taxonomy ID number
- NCBI Genome assembly accession code
- BUSCO Lineage
- WGS accurate reads accession code
- NCBI HiC reads accession code
The workflow will get the data and process it to generate genome profiling (genomescope, smudgeplot -optional-), assembly stats (gfastats), merqury stats (QV, completeness), BUSCO, snailplot, contamination blobplot, and HiC ...
Assembly Evaluation for ERGA-BGE Reports
One Assmebly, Illumina WGS reads + HiC reads
The workflow requires the following:
- Species Taxonomy ID number
- NCBI Genome assembly accession code
- BUSCO Lineage
- WGS accurate reads accession code
- NCBI HiC reads accession code
The workflow will get the data and process it to generate genome profiling (genomescope, smudgeplot -optional-), assembly stats (gfastats), merqury stats (QV, completeness), BUSCO, snailplot, contamination blobplot, and ...