157 items tagged with 'Bioinformatics'.

Teams: usegalaxy-eu, GalaxyProject SARS-CoV-2, EOSC4Cancer, EuroScienceGateway, Intergalactic Workflow Commission (IWC), Galaxy Training Network

Organizations: European Galaxy Team, Albert-Ludwigs-Universität Freiburg

https://orcid.org/0000-0002-9464-6640

Expertise: Bioinformatics, Genetics

Tools: Galaxy, Genomics, Virology

Izaskun Mallona

Teams: omnibenchmark

Organizations: University of Zurich

https://orcid.org/0000-0002-2853-7526

Expertise: science, Bioinformatics, Genetics

Tools: methods development, biology, Computer Science, open science, linux

Current FOSS projects at Bitbucket, GitHub.

Academic CV, LinkedIn.

Last known whereabouts: Mark Robinson's lab at Univ. Zurich.

I've been doing research in bioinformatics since ~2007. I am not a "real nice guy". I advocate for open science, sound methods, and respectful working environments.

Gemy Kaithakottil

Teams: EI Core Bioinformatics Group

Organizations: Earlham Institute

https://orcid.org/0000-0003-1360-7808

Expertise: Bioinformatics, Data Management, Genomics, High Performance Computing, NGS, Python, R, Scientific workflow development, Software Engineering, Workflows

Tools: CWL, Conda, Databases, Galaxy, Genomics, Git, Java, Jupyter notebook, Machine Learning, Nextflow, Perl, Python, R, Single Cell analysis, Snakemake, Transcriptomics, WDL, Workflows, nf-core

Gemy Kaithakottil is a Senior Bioinformatician / Developer at the Earlham Institute.

Bundit Boonyarit

Teams: BAID Team

Organizations: Vidyasirimedhi Institute of Science and Technology

https://orcid.org/0000-0003-4425-2608

Expertise: Bioinformatics, Cheminformatics, Machine Learning, Proteomics, Molecular Modelling, Biomolecular Dynamics Simulations

Tools: Machine Learning, Biochemistry and protein analysis, Computational and theoretical biology

I'm Bundit, a current Ph.D. candidate in Information Science and Technology at the Natural Language Processing and Representation Learning Lab (NRL), Vidyasirimedhi Institute of Science and Technology (VISTEC), Thailand. I received a B.Sc. degree in Chemistry from Prince of Songkla University, Thailand and an M.S. degree in Biochemistry from Kasetsart University, Thailand.

My interdisciplinary research bridges the disciplines of Biochemistry, Molecular Biology, Chemistry, Pharmaceutical Sciences, ...

Dan Parsons

Teams: iBOL Europe Museum Skimming, iBOL Europe Barcoding, iBOL Europe Barcode Data, Biodiversity Genomics Europe (general)

Organizations: The Natural History Museum London

https://orcid.org/0000-0002-5246-0725

Expertise: Bioinformatics, Molecular Biology

Bioinformatician on Biodiversity Genomics Europe

Mitchell O'Brien

Teams: Sydney Informatics Hub

Organizations: The University of Sydney

https://orcid.org/0000-0003-0662-9101

Expertise: Bioinformatics, Genomics, Genetics

Senior Bioinformatics Engineer - Sydney Informatics Hub | Australian BioCommons

Toni Hermoso Pulido

Teams: Not specified

Organizations: Not specified

https://orcid.org/0000-0003-2016-6465

Expertise: Bioinformatics

Sora Yonezawa

Teams: bonohulab

Organizations: Hiroshima University

https://orcid.org/0009-0004-1874-3117

Expertise: Bioinformatics

Tools: CWL, Genomics, Python, R, Transcriptomics, Jupyter notebook

Ph.D. student, Laboratory of Genome Informatics, Graduate School of Integrated Sciences for Life, Hiroshima University. GitHub: https://github.com/yonesora56

Aldar Cabrelles

Teams: EGA

Organizations: EGA

https://orcid.org/0000-0002-1131-3139

Expertise: Bioinformatics, Biochemistry

Tools: Python

Mahesh Binzer-Panchal

Teams: NBIS, ERGA Assembly

Organizations: NBIS – National Bioinformatics Infrastructure Sweden

https://orcid.org/0000-0003-1675-0677

Expertise: Bioinformatics, Genomics, Scientific workflow development, Workflows

Tools: Nextflow, nf-core

I'm a bioinformatician for the National Bioinformatics Infrastructure Sweden. I specialise in de novo genome assembly and workflow development with Nextflow. I'm also a Nextflow ambassador and nf-core maintainer.

仕卓张

Teams: Protein-protein and protein-nucleic acid binding site prediction research

Organizations: Shandong University

https://orcid.org/0009-0003-5182-3533

Expertise: Bioinformatics

Saim Momin

Teams: Not specified

Organizations: Not specified

https://orcid.org/0009-0003-9935-828X

Expertise: Bioinformatics, Metagenomics, NGS, Scientific workflow development, Software Engineering

Tools: Conda, Jupyter notebook, Python, R, Single Cell analysis, Snakemake

Zsolt Balázs

Teams: KrauthammerLab

Organizations: University of Zurich

https://orcid.org/0000-0003-3537-7441

Expertise: Bioinformatics

Tools: Genomics, Molecular Biology, Single Cell analysis

Clea Siguret

Teams: Not specified

Organizations: Not specified

https://orcid.org/0009-0005-6140-0379

Expertise: Bioinformatics, Computer Science, Data Management, Genomics, Python, R, Scientific workflow development, Workflows, phylogenomics

Tools: Galaxy, Genomics, Git, Python, R, Workflows

DINESH Ravindra Raju

Teams: Not specified

Organizations: Not specified

https://orcid.org/0000-0003-0657-1560

Expertise: Bioinformatics, Cheminformatics

Computational Biologist @ UT Southwestern

Raül G Veiga

Teams: EGA

Organizations: EGA

https://orcid.org/0000-0002-8517-1559

Expertise: Bioinformatics

Helge Hecht

Teams: RECETOX SpecDatRI, RECETOX, usegalaxy-eu, ELIXIR Metabolomics, Intergalactic Workflow Commission (IWC)

Organizations: Masaryk University, RECETOX

https://orcid.org/0000-0001-6744-996X

Expertise: Bioinformatics, Cheminformatics, Metabolomics, Python, R, Software Engineering, Workflows

Tools: Metabolomics, Python, R, Workflows, Mass spectrometry, Chromatography

Xiaolong Luo

Teams: Genome Data Compression Team

Organizations: Shenzhen University

https://orcid.org/0009-0007-9672-6728

Expertise: Bioinformatics

Tools: Workflows

Gaurav Sablok

Teams: Not specified

Organizations: Not specified

https://orcid.org/0000-0002-4157-9405

Expertise: Bioinformatics, Data Management, Machine Learning, Scientific workflow development, Software Engineering, high-performance computing

Research Interest: Bioinformatics | Deep Learning | DevOps | Generative AI | Knowledge Graphs. Highly communicative, task oriented, feature responsive, time oriented, approachable, solution seeker and initiative taker focussed professional working across a wide variety of topics which includes bioinformatics involving genomes, transcriptomes, metagenomes and metatranscriptomes focussing on datasets coming from the plant, bacterial and fungal genome (Illumina Miseq, NextSeq, NovaSeq, PacBio, Oxford ...

Rafael Terra

Teams: HP2NET - Framework for construction of phylogenetic networks on High Performance Computing (HPC) environment

Organizations: National Laboratory of Scientific Computing

https://orcid.org/0000-0002-3811-4527

Expertise: Bioinformatics, Scientific workflow development, High Performance Computing

Tools: C/C++, Java, Python, R, Git, Parsl

Phuong Doan

Teams: ERGA Annotation, Bioinformatics Laboratory for Genomics and Biodiversity (LBGB)

Organizations: Genoscope

https://orcid.org/0000-0002-6621-9908

Expertise: Bioinformatics

Tools: Nextflow, Python, R, Genetic analysis, Single Cell analysis

Assa Yeroslaviz

Teams: Not specified

Organizations: Not specified

https://orcid.org/0000-0001-9638-4026

Expertise: Bioinformatics

Tools: R, Transcriptomics

Naser Elmi

Teams: Cimorgh IT solutions

Organizations: cimorgh IT

Expertise: Bioinformatics, Genomics, Metagenomics, Microbiology, NGS, Python, R, bash, WDL

Tools: Mathematical Modelling, R, WDL

Alvaro Gargantilla Becerra

Teams: Not specified

Organizations: Not specified

https://orcid.org/0000-0003-0429-2365

Expertise: Bioengineering, Bioinformatics, Scientific workflow development, Systems Biology, Workflows, Computer Aided Design

Tools: Databases, Jupyter notebook, Nextflow, Python, Transcriptomics, Computational and theoretical biology, SBML, Matlab

Vincent Hervé

Teams: Metagenomic tools

Organizations: INRAe

https://orcid.org/0000-0002-3495-561X

Expertise: Bioinformatics, Metabarcoding, Metagenomics, Microbiology

Elisabetta Spinazzola

Teams: EOSC-Life WP3 OC Team, cross RI project, EOSC-Life WP3, Euro-BioImaging

Organizations: EOSC-Life, Euro-BioImaging

Expertise: Bioengineering, Bioinformatics, Computer Science, Data Management

Tools: Databases, Jupyter notebook, Python

Biomedical Engineer working on preclinical image dataset repository and cross researching RIs

Priyanka Surana

Teams: Not specified

Organizations: Not specified

https://orcid.org/0000-0002-7167-0875

Expertise: Bioinformatics, Genomics, Scientific workflow development

Tools: Nextflow, nf-core, Python, R

Dongyang Wang

Teams: Not specified

Organizations: Not specified

https://orcid.org/0000-0001-6440-6980

Expertise: Bioinformatics, Genomics, Machine Learning

Tools: Python, R, Machine Learning

I am a Ph.D. student in Gong lab. I am interested in cancer genomics, including the mining of genetic risk determinants in cancer, functional prediction of genetic variants, tumor-associated molecular epidemiology, large-scale data integration, analysis, and mining, as well as the construction of bioinformatical data platforms.

Jasper Ouwerkerk

Teams: Not specified

Organizations: Not specified

https://orcid.org/0000-0003-2556-2125

Expertise: Bioinformatics

Nils Hoffmann

Teams: de.NBI Cloud

Organizations: de.NBI Cloud

https://orcid.org/0000-0002-6540-6875

Expertise: Bioinformatics, Cheminformatics, Software Engineering, Metabolomics, Lipidomics

Juan Caballero

Teams: MGnify

Organizations: EMBL-EBI

https://orcid.org/0000-0002-6160-3644

Expertise: Bioinformatics, Genomics, Metagenomics, Data Management

Tools: CWL, Jupyter notebook, Nextflow, Molecular Biology, Workflows, Microbiology, Transcriptomics, Perl, Python, R

Jeanette Reinshagen

Teams: EU-Openscreen

Organizations: Fraunhofer Institute for Translational Medicine and Pharmacology ITMP

https://orcid.org/0000-0002-8080-9170

Expertise: Bioinformatics, Cheminformatics, Machine Learning

Tools: Workflows

Kary Ocaña

Teams: ParslRNA-Seq: an efficient and scalable RNAseq analysis workflow for studies of differentiated gene expression

Organizations: National Laboratory of Scientific Computing

https://orcid.org/0000-0002-2151-7418

Expertise: Bioinformatics, Scientific workflow development, high-performance computing, phylogenomics

I am a bioinformatician and phylogeneticist. I really love working on problems at the intersection of high-performance computing and scientific workflows applied to omics.

Hrishikesh Dhondge

Teams: CAPSID

Organizations: CNRS - Centre Est

https://orcid.org/0000-0002-7025-2241

Expertise: Bioinformatics, Molecular Modelling, Biomolecular Dynamics Simulations, Database Development, Protein Structural Alignment

Stephen Moss

Teams: Not specified

Organizations: Not specified

https://orcid.org/0000-0002-1399-293X

Expertise: Bioinformatics, Computer Science, Data Management, Genetics, Genomics, Machine Learning, Metagenomics, NGS, Scientific workflow development, Software Engineering

Tools: Databases, Galaxy, Genomics, Jupyter notebook, Machine Learning, Nextflow, nf-core, PCR, Perl, Python, R, rtPCR, Snakemake, Transcriptomics, Virology, Web, Web services, Workflows

Dad, husband and PhD. Scientist, technologist and engineer. Bibliophile. Philomath. Passionate about science, medicine, research, computing and all things geeky!

Philip Kensche

Teams: Not specified

Organizations: Not specified

https://orcid.org/0000-0003-1299-9600

Expertise: Bioinformatics, Molecular Biology, Computer Science, NGS, Software Engineering

Tools: Nextflow, Roddy

Andrea Zaliani

Teams: EU-Openscreen, OME

Organizations: Fraunhofer Institute for Translational Medicine and Pharmacology ITMP

https://orcid.org/0000-0002-1740-8390

Expertise: Cheminformatics, Bioinformatics

Tools: R, Python, Workflows

Pranavathiyani G

Teams: Bioinformatics Innovation Lab

Organizations: Pondicherry University

https://orcid.org/0000-0003-4854-8238

Expertise: Bioinformatics, Systems Biology, Machine Learning

Tools: Galaxy, Cytoscape, Databases, Jupyter notebook, R, Python

Assistant Professor (Research) at Department of Bioinformatics, School of Chemical & Biotechnology, SASTRA Deemed to be University

Julie Ripoll

Teams: MAB - ATGC

Organizations: Centre National de la Recherche Scientifique (CNRS)

https://orcid.org/0000-0003-1590-8313

Expertise: Bioinformatics, Biostatistics, Bioengineering, Metagenomics, Scientific workflow development, Data Management, Systems Biology, Ecophysiology

Tools: Transcriptomics, Python, Workflows, Snakemake, R, Conda, Jupyter notebook, Java, Web, Machine Learning, Databases

Eric Rivals

Teams: MAB - ATGC

Organizations: Centre National de la Recherche Scientifique (CNRS)

https://orcid.org/0000-0003-3791-3973

Expertise: Bioinformatics, Genomics, algorithm, Machine Learning, Metagenomics, NGS, Computer Science

Tools: Transcriptomics, Genomics, Python, C/C++, Web services, Workflows

Cyril Noel

Teams: SeBiMER

Organizations: IFREMER

https://orcid.org/0000-0002-7139-4073

Expertise: Bioinformatics, Biostatistics, Metabarcoding, Metagenomics

Tools: Workflows, Nextflow, R

Evgenii Tretiakov

Teams: Harkany Lab

Organizations: Medical University of Vienna

https://orcid.org/0000-0001-5920-2190

Expertise: Systems Biology, Bioengineering, Bioinformatics, Neuroscience

Tools: Workflows, Machine Learning, Transcriptomics

Laura Rodriguez-Navas

Teams: GalaxyProject SARS-CoV-2, nf-core viralrecon, EOSC-Life - Demonstrator 7: Rare Diseases, iPC: individualizedPaediatricCure, EJPRD WP13 case-studies workflows, TransBioNet, OpenEBench, ELIXIR Proteomics

Organizations: Barcelona Supercomputing Center, ELIXIR

https://orcid.org/0000-0003-4929-1219

Expertise: Bioinformatics, Computer Science, AI, Machine Learning

Computer Engineer in Barcelona Supercomputing Center (BSC)

Sangram Keshari Sahu

Teams: nf-core

Organizations: IISER, Mohali

https://orcid.org/0000-0001-5010-9539

Expertise: Bioinformatics, Scientific workflow development, Software Engineering

Tools: Workflows, Nextflow, nf-core, R, Python

Phil Ewels

Teams: nf-core

Organizations: SciLifeLab

https://orcid.org/0000-0003-4101-2502

Expertise: Bioinformatics

Tools: Nextflow, nf-core, Workflows, Python

Bioinformatician in Stockholm, Sweden. Lead for nf-core and MultiQC projects.

Nicola Soranzo

Teams: GalaxyProject SARS-CoV-2, EI Papatheodorou Group, BioFAIR

Organizations: Earlham Institute

https://orcid.org/0000-0003-3627-5340

Expertise: Bioinformatics

Tools: Galaxy

Ivan Topolsky

Teams: V-Pipe

Organizations: SIB - Swiss Institute of Bioinformatics

https://orcid.org/0000-0002-7561-0810

Expertise: Bioinformatics, Software Engineering

Tools: Workflows, C/C++, Perl

Medical doctor and bioinformatician

Developer from the Swiss Institute of Bioinformatics (SIB), working at the Computational Biology Group (CBG) of ETH Zurich.

Diplom in Medicine. MSc in Bioinformatics and Proteomics.

I am also a ski teacher as a hobby.

Jean-Loup Faulon

Teams: IBISBA Workflows

Organizations: INRAe

https://orcid.org/0000-0003-4274-2953

Expertise: Synthetic Biology, Computer Aided Design, Scientific workflow development, Retrosynthesis, Systems Biology, Bioinformatics, Cheminformatics

Tools: Machine Learning, Galaxy

Research Director @ INRAe

Foivos Gypas

Teams: IBISBA Workflows

Organizations: Unspecified

Expertise: Bioinformatics

Tools: Workflows, Web services, Python

Dan Fornika

Teams: GalaxyProject SARS-CoV-2

Organizations: BC Centre for Disease Control

https://orcid.org/0000-0002-6178-3585

Expertise: Bioinformatics, Data Management, Molecular Biology

Tools: Databases, PCR, Workflows, Web services

Systems and Synthetic Biology

The Laboratory of Systems and Synthetic Biology (SSB) helps elucidate the mechanisms underlying basic cellular processes, evolution, and interactions among microbes and between microbes and their environment (including the human host). We do so in the context of entire biological systems. We translate the acquired knowledge into biotechnological, medical and environmental applications.

Space: Independent Teams

Public web page: https://www.wur.nl/en/research-results/chair-groups/agrotechnology-and-food-sciences/biomolecular-sciences/laboratory-of-systems-and-synthetic-biology.htm

Organisms: Not specified

Bioinformatics Unit IIS-FJD

Bioinformatics Unit of the Health Research Institute Fundación Jiménez Díaz (IIS-FJD).

Space: Independent Teams

Public web page: https://www.translationalbioinformaticslab.es/

Organisms: Not specified

Senckenberg Digital Collection and Biodiversity Information Technologies

FAIR data specialists and research software engineers working towards interlinking, harmonization and facilitation of access to geobiodiversity research data.

Space: BioDT

Public web page: https://www.senckenberg.de/

Organisms: Not specified

UM BRCF Bioinformatics Core

The Bioinformatics Core helps researchers identify and interpret patterns in RNA and DNA by placing sequencing data into a biologically meaningful context. This encompasses assisting with experimental design, developing reproducible workflows, analyzing next-generation sequencing data, and supporting manuscript development and publication.

Space: University of Michigan BRCF Bioinformatics Core

Public web page: https://medresearch.umich.edu/office-research/about-office-research/biomedical-research-core-facilities/bioinformatics-core

Organisms: Not specified

bonohulab

Toward data-driven genome breeding (digital breeding), we are developing data analysis infrastructure technology essential for genome editing, focusing on gene function analysis using bioinformatics called BioDX.

Space: Hiroshima workflow community

Public web page: https://bonohu.hiroshima-u.ac.jp/index_en.html

Organisms: Not specified

Taudière group

No description specified

Space: Independent Teams

Public web page: Not specified

Organisms: Not specified

Illumina Protocol - Testing

COPO

No description specified

Creator: Felix Shaw

Submitter: Felix Shaw

Created: 9th Oct 2024 at 08:20

Multi-task analysis of gene expression data on cancer public datasets

yPublish - Bioinfo tools

Abstract

Background Omics, and often multi-omics, cancer datasets are available in public databases such as Gene Expression Omnibus (GEO), the International Cancer Genome Consortium and The Cancer Genome Atlas Program. Most of these databases provide at least the gene expression data for the samples contained in the project. Multi-omics has been an advantageous strategy for advancing personalized medicine, but few works explore strategies to extract knowledge relying only on gene expression levels for tasks such as disease outcome prediction and drug response simulation. The models and information acquired from projects based only on expression data could provide a decision-making background for future projects that have other levels of omics data, such as DNA methylation or miRNAs.
Results We extended previous methodologies for predicting disease outcome from the combination of protein interaction networks and gene expression profiling by proposing an automated pipeline that performs graph feature encoding and then classifies the outcome of patient networks derived from RNA-Seq. We integrated biological networks from protein interactions and gene expression profiling to assess patient specificity, combining the treatment/control ratio with the patient-normalized counts of the differentially expressed genes. We also tackled disease outcome prediction from the gene set enrichment perspective, combining gene expression with pathway gene set information as the feature source for this task. We also explored the drug response perspective of cancer, evaluating the relationship between gene expression profiling and single sample gene set enrichment analysis (ssGSEA), and proposing a workflow to perform drug response screening according to the patient's enriched pathways.
Conclusion We showed the importance of patient network modeling for the clinical task of disease outcome prediction using a graph kernel matrix strategy, and showed how ssGSEA improved the prediction using only transcriptomic data combined with pathway scores. We also presented a detailed screening analysis showing the impact of pathway-based gene sets and normalization types on drug response simulation. We deployed two fully automated screening workflows following the FAIR principles for the disease outcome prediction and drug response simulation tasks.

Author: Yasmmin Martins

Date Published: 28th Sep 2023

Publication Type: Journal

DOI: 10.1101/2023.09.27.23296213

Citation: medrxiv;2023.09.27.23296213v1,[Preprint]

Created: 23rd Oct 2023 at 15:33, Last updated: 23rd Oct 2023 at 15:34

survInTime - Exploring surveillance methods and data analysis on Brazilian respiratory syndrome dataset and community mobility changes

yPublish - Bioinfo tools

Abstract

Background The COVID-19 pandemic had negative impacts in almost every country in the world. These impacts were observed mainly in the public health sphere, with a rapid rise and spread of the disease and failed attempts to contain it while there was no treatment. In developing countries, however, the impacts were also severe in other respects, such as the intensification of social inequality, poverty and food insecurity. In Brazil specifically, miscommunication among the layers of government led the control measures into complete chaos in a country of continental dimensions. Brazil made an effort to register granular, informative data about case reports and their outcomes. Although this data is available and can be consumed freely, there are issues concerning its integrity and inconsistencies between the real number of cases and the number of notifications in the dataset.
Results We designed and implemented four types of analysis to explore the Brazilian public dataset of Severe Acute Respiratory Syndrome notifications (srag dataset) and the Google dataset of community mobility change (mobility dataset). These analyses provide a diagnosis of data integration issues, strategies to integrate the data, and experiments in surveillance analysis. The first type of analysis describes and explores the data contained in both datasets, starting by assessing data quality with respect to missing data and then summarizing the patterns found in these datasets. The second type is a statistical experiment to estimate cases from mobility patterns organized in periods of time. As the third type, we developed an algorithm that helps in understanding the disease waves by detecting them and comparing the time periods across cities. Lastly, we built time series datasets of deaths, overall cases and residential mobility change in regular time periods, and used them as features to group cities with similar behavior.
Conclusion The exploratory data analysis showed the underrepresentation of COVID-19 cases in many small cities in Brazil, which were either absent from the srag dataset or had case counts much lower than real projections. We also assessed the availability of data for Brazilian cities in the mobility dataset in each state, finding that not all states were represented and that the best coverage occurred in Rio de Janeiro state. We compared how well combinations of mobility change across place categories estimate the number of cases, measuring the errors and identifying the mobility components that could most affect the cases. To target specific strategies at groups of cities, we compared strategies for clustering cities with similar outcome behavior over time, highlighting the divergence in how the disease was handled.

Authors: Yasmmin Côrtes Martins, Ronaldo Francisco da Silva

Date Published: 27th Sep 2023

Publication Type: Journal

DOI: 10.1101/2023.09.26.559599

Citation: biorxiv;2023.09.26.559599v1,[Preprint]

Created: 23rd Oct 2023 at 15:30, Last updated: 23rd Oct 2023 at 15:32

The impact of non-lineage defining mutations in the structural stability for variants of concern of SARS-CoV-2

yPublish - Bioinfo tools

Abstract

Motivation Identifying the most important mutations that lead to structural and functional changes in highly transmissible virus variants is essential to understand the impacts and the …

Authors: Yasmmin Martins, Ronaldo Francisco da Silva

Date Published: 22nd Jun 2023

Publication Type: Journal

DOI: 10.1101/2023.06.22.546079

Citation: biorxiv;2023.06.22.546079v1,[Preprint]

Created: 23rd Oct 2023 at 15:25, Last updated: 23rd Oct 2023 at 15:28

Analysis of Protein-Protein Interactions networks and cross-species transfer learning comparison for seven organisms

yPublish - Bioinfo tools

Abstract

Motivation Protein-protein interactions (PPIs) can be used for plenty of applications, such as inferring protein functions or even helping the drug discovery process. For the human species, there is a lot of validated information and functional annotation for the proteins in its interactome. In other species, the known interactome is much smaller than the human one, and many proteins have few or no annotations by specialists. Understanding the interactomes of other species helps to trace evolutionary characteristics, compare important biological processes and build interactomes for new organisms from more closely related organisms instead of relying only on the human interactome.
Results In this study, we evaluate the performance of the PredPrIn workflow in predicting the interactomes of seven organisms in terms of scalability and precision, showing that PredPrIn achieves over 70% precision and takes less than three days even on the largest datasets. We performed a transfer learning analysis, predicting each organism's interactome from each of the other organisms, and then showed an implication of their evolutionary relationship in terms of the number of ortholog proteins shared between these organisms. We also present a functional enrichment analysis showing the proportion of shared annotations between predicted positive and false interactions, and extract topological features of each organism's interactome, such as proteins acting as hubs and as bridges between modules. For each organism, one of the most frequent biological processes was selected, and the number of proteins and pairs present in it was compared between the interactome available in the HINT database for that organism and the one predicted by PredPrIn. In this comparison, we showed that we covered the proteins and pairs covered in HINT and also enriched these processes for almost all organisms.
Conclusions In this work, we demonstrated the efficiency of the PredPrIn workflow for protein interaction prediction in seven different organisms using scalability, performance and transfer learning analyses. We also made cross-species interactome comparisons showing the most frequent biological processes for each organism, as well as the topological features of each organism's interactome, showing consistency with hypotheses about biological networks. Finally, we described the enrichment made by PredPrIn in selected biological processes, showing that its predictions were important for enhancing the information about these organisms' interactomes.

Author: Yasmmin C Martins

Date Published: 7th Jun 2023

Publication Type: Journal

DOI: 10.1101/2023.06.05.543725

Citation: biorxiv;2023.06.05.543725v1,[Preprint]

Created: 23rd Oct 2023 at 15:23, Last updated: 23rd Oct 2023 at 15:24

EpiCurator: an immunoinformatic workflow to predict and prioritize SARS-CoV-2 epitopes

yPublish - Bioinfo tools

Abstract

The ongoing coronavirus 2019 (COVID-19) pandemic, triggered by the emerging SARS-CoV-2 virus, represents a global public health challenge. Therefore, the development of effective vaccines is an urgent …

Authors: Cristina S. Ferreira, Yasmmin C. Martins, Rangel Celso Souza, Ana Tereza R. Vasconcelos

Date Published: 2021

Publication Type: Journal

DOI: 10.7717/peerj.12548

Citation: PeerJ 9:e12548

Created: 23rd Oct 2023 at 15:04, Last updated: 23rd Oct 2023 at 15:06

R workflow for RNA-seq analysis in unexplained recurrent pregnancy loss

Transcriptomics in unexplained recurrent pregnancy loss

Stable

The bioinformatic workflow presented here enables the analysis of RNA sequencing data obtained from human reproductive tissues in unexplained recurrent pregnancy loss (uRPL) research. This pipeline requires a sample sheet containing the sample information (example_input_data.csv) and gene expression matrices generated using the Salmon tool in the nf-core/rnaseq bioinformatics pipeline (example_count_data.csv). For more information on how to use the nf-core/rnaseq pipeline including the required ...

Type: R markdown

Creators: Isabella M Brown, Paul Whatmore, Kylie Munyard

Submitter: Isabella Brown

DOI: 10.48546/workflowhub.workflow.1966.1

Created: 3rd Oct 2025 at 00:48, Last updated: 6th Oct 2025 at 03:47

Snakemake workflow for PacBio WGS short and long variant calling, phasing and much more

WGGC

Work-in-progress

pb_variants

A Snakemake 9-based pipeline for HiFi SNP, SV and CNV calling, phasing and more

Only PacBio data for now

!!THIS PIPELINE IS IN DEVELOPMENT AND EXPERIMENTAL, USE AT YOUR OWN RISK!!

what this tool aims to deliver:

newest and best tools suited for HiFi data (only for now)
singletons and trio analysis (trio is coming sometime...)
human-first (hg38 for now), others should be possible (untested...)

included tools:

deepvariant or bcftools for snp calling
snps get used for ...

Type: Snakemake

Creator: daniel rickert

Submitter: dan rick

DOI: 10.48546/workflowhub.workflow.1965.2

Created: 30th Sep 2025 at 13:42, Last updated: 17th Oct 2025 at 14:17

Soil Metagenome Pipeline

ZiemertLab

Stable

Soil Metagenome Pipeline

Soil Metagenome Pipeline is a modular, Nextflow DSL2 workflow for assembling, polishing, binning, annotating, and functionally characterizing complex soil metagenomes. It orchestrates state-of-the-art tools for long- and short-read metagenomics, generates high-quality MAGs, assigns taxonomy, and screens for biosynthetic gene clusters (BGCs).

What it does

Assembles long-read metagenomes (e.g., ONT) with Flye and optionally polishes with Medaka and/or NextPolish using ...

Type: Nextflow

Creator: Caner Bagci

Submitter: Caner Bağcı

DOI: 10.48546/workflowhub.workflow.1960.1

Created: 19th Sep 2025 at 19:34

nf-core/proteinfamilies

MGnify

Stable

Type: Nextflow

Creators: Evangelos Karatzas, Martin Beracochea

Submitter: Evangelos Karatzas

DOI: 10.48546/workflowhub.workflow.1954.2

Created: 18th Sep 2025 at 14:41, Last updated: 22nd Oct 2025 at 09:49

Longread 16S classification workflow

UNLOCK

Work-in-progress

Workflow for quality assessment and taxonomic classification of amplicon long-read sequences. In addition, files are exported to their respective subfolders for easier data management at a later stage.

Inputs are expected to be basecalled fastq files.

Steps:

NanoPlot read quality control, before and after filtering
fastplong read quality and length filtering
Emu abundance; species-level taxonomic abundance for full-length 16S reads

Type: Common Workflow Language

Creators: Bart Nijsse, Jasper Koehorst

Submitter: Bart Nijsse

Created: 10th Sep 2025 at 13:30, Last updated: 10th Sep 2025 at 14:40

Processed fastq QC

Bioinformatics Core CEITEC

Stable

The workflow's main goal is to quality-trim the reads of input fastq files and remove adaptors. It can also run BioBloom Tools and a species detector to check for contamination. Finally, it runs fastq QC to check quality after trimming. The workflow was designed to be run in the SeqUIa (http://cfb.ceitec.muni.cz/sequia) application.

Type: Snakemake

Creators: None

Submitter: Nicolas Blavet

Created: 5th Sep 2025 at 13:38

Raw fastq QC

Bioinformatics Core CEITEC

Stable

The workflow's main goal is to check the quality of input fastq files after sequencing and demultiplexing. Optionally, it can check for the most abundant adaptor sequences. It can also merge fastq files belonging to the same samples after resequencing, prior to the quality check. The workflow was designed to be run in the SeqUIa (http://cfb.ceitec.muni.cz/sequia) application.

Type: Snakemake

Creators: None

Submitter: Nicolas Blavet

Created: 3rd Sep 2025 at 10:17

cfDNA-Flow

KrauthammerLab

Stable

cfDNA-Flow

1. Overview

cfDNA-Flow facilitates the accurate and reproducible analysis of cfDNA WGS data. It offers various preprocessing options to accommodate different experimental setups and research needs in the field of liquid biopsies.

2. Preprocessing options

2.1 Trimming Options

cfDNA-Flow provides the flexibility to either trim or not trim the input reads based on the user's requirements. Trimming removes low-quality bases, which can impact downstream analyses.
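As an illustration of what removing low-quality bases means, here is a minimal sketch of a trailing-base trim in plain Python. The excerpt does not name the trimming tool cfDNA-Flow uses, so this is not its implementation; the function name, the threshold min_q and the Phred+33 encoding are assumptions.

```python
def trim_3prime(seq, qual, min_q=20, offset=33):
    """Drop bases from the 3' end while their Phred score is below min_q.

    This is a deliberately simple rule; production trimmers typically use
    sliding windows or running-sum algorithms instead.
    """
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


# 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
trimmed_seq, trimmed_qual = trim_3prime("ACGTACGT", "IIIII###")
print(trimmed_seq, trimmed_qual)
```

Trimming changes fragment end positions, which is one reason a cfDNA workflow might make it optional: fragment length and end-motif analyses are sensitive to where reads end.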

2.2 Reference

...

Type: Snakemake

Creators: Ivna Ivankovic, Todor Gitchev, Zsolt Balázs

Submitter: Zsolt Balázs

DOI: 10.48546/workflowhub.workflow.1900.1

Created: 2nd Sep 2025 at 15:56

Pipeface

Deveson Lab

Stable

Pipeface

Overview

Pipefaceee.

Nextflow pipeline to process long read ONT and/or pacbio HiFi data.

Pipeface's future holds mitochondrial, STR, CNV and tandem repeat calling.

Workflow

Singleton

%%{init: {'theme':'dark'}}%% 
flowchart LR 

input_data("Input data: ONT fastq.gz and/or ONT fastq and/or ONT uBAM and/or pacbio HiFi uBAM") 
merging{{"Merge runs (if needed)"}} 
alignment{{"bam to fastq conversion (if needed), alignment,
...

Type: Nextflow

Creators: Leah Kemp, Andre Reis, Ira Deveson, Kisaru Liyanage, Matthew Downton, Hardip Patel, Kirat Alreja. This is a highly collaborative project, with many contributions from the Genomic Technologies Lab. Notably, Dr Andre Reis and Dr Ira Deveson are closely involved in the development of this pipeline. Optimisations involving DeepVariant and DeepTrio have been contributed by Dr Kisaru Liyanage and Dr Matthew Downton from the National Computational Infrastructure, with support from Australian BioCommons as part of the Workflow Commons project. Haploid-aware mode has been contributed by Dr Hardip Patel & Kirat Alreja from the National Centre for Indigenous Genomics. The installation and hosting of software used in this pipeline has been and continues to be supported by the Australian BioCommons Tools and Workflows project (if89).

Submitter: Leah Kemp

DOI: 10.48546/workflowhub.workflow.1888.1

Created: 1st Sep 2025 at 00:16, Last updated: 7th Sep 2025 at 20:21

VIsoQLR

Bioinformatics Unit IIS-FJD

Stable

VIsoQLR: an interactive tool for the detection, quantification and fine-tuning of isoforms using long-read sequencing

VIsoQLR is an interactive analyzer, viewer and editor for the semi-automated identification and quantification of known and novel isoforms using long-read sequencing data. VIsoQLR is tailored to thoroughly analyze mRNA expression and maturation in low-throughput splicing assays. This tool takes sequences aligned to a reference, defines consensus splice sites, and quantifies ...

Type: Docker

Creators: None

Submitter: Yolanda Benítez Quesada

Created: 12th Aug 2025 at 13:48

Long Read WGS pipeline

Systems and Synthetic Biology

Work-in-progress

Workflow for long read quality control, contamination filtering, assembly, variant calling and annotation.

Steps:

Preprocessing of reference file: https://workflowhub.eu/workflows/1818
LongReadSum before and after filtering (read quality control)
Filtlong filter on quality and length
Flye assembly
Minimap2 mapping of reads and assembly
Clair3 variant calling of reads
Freebayes variant calling of assembly
Optional Bakta annotation of genomes with no reference
SnpEff building ...

Type: Common Workflow Language

Creator: Martijn Melissen

Submitter: Martijn Melissen

Created: 12th Aug 2025 at 13:00

PriorR

Bioinformatics Unit IIS-FJD

Stable

PriorR

Priorr is a prioritization program for disease-linked genetic variants developed within the Genetics & Genomics Department of La Fundacion Jimenez Diaz University Hospital. Priorr is designed to analyse the output of the FJD-pipeline of SNVs or CNVs. It offers a number of useful functionalities for variant analysis, such as filtering by a virtual panel of genes, manual control of different population frequencies or pathogenicity predictors, or filtering out variants ...

Type: Docker

Creators: None

Submitter: Yolanda Benítez Quesada

Created: 12th Aug 2025 at 11:06

GLOWgenes

Bioinformatics Unit IIS-FJD

Stable

GLOWgenes

Prioritization of candidate disease genes by disease-aware evaluation of heterogeneous evidence networks. Visit www.glowgenes.org for more information.

Citing

de la Fuente L, Del Pozo-Valero M, Perea-Romero I, Blanco-Kelly F, Fernández-Caballero L, Cortón M, Ayuso C, Mínguez P. Prioritization of New Candidate Genes for Rare Genetic Diseases by a Disease-Aware Evaluation of Heterogeneous Molecular Networks. International Journal of Molecular Sciences. 2023; 24(2):1661. ...

Type: Python

Creators: None

Submitter: Yolanda Benítez Quesada

Created: 12th Aug 2025 at 10:59, Last updated: 12th Aug 2025 at 11:01

WHALE

Bioinformatics Unit IIS-FJD

Work-in-progress

WHALE: (W)orkflow for (H)uman-genome (A)nalysis of (L)ong-read (E)xperiments

Introduction

WHALE is a bioinformatics pipeline based on Nextflow and nf-core for long-read DNA sequencing analysis. It takes a samplesheet as input and performs quality control, alignment, variant calling and annotation.

Pipeline summary

Read QC (FastQC)
Present QC for raw reads (MultiQC)
Alignment ...

Type: Nextflow

Creators: None

Submitter: Yolanda Benítez Quesada

Created: 12th Aug 2025 at 10:56

PARROT-FJD

Bioinformatics Unit IIS-FJD

Work-in-progress

PARROT-FJD

Pipeline of Analysis and Research of Rare diseases Optimized in Tblab - Fundación Jiménez Díaz. This is a germline variant calling pipeline implemented in Nextflow which performs mapping, SNV/INDEL calling and annotation, and CNV calling and annotation for targeted sequencing (gene panels and WES) and whole genome sequencing.

How to run this pipeline

The tasks mentioned above are divided into different workflows, which are specified using the --analysis flag followed ...

Type: Nextflow

Creators: None

Submitter: Yolanda Benítez Quesada

Created: 12th Aug 2025 at 10:51

nf-CBRA-snvs

Bioinformatics Unit IIS-FJD

Work-in-progress

Introduction

nf-CBRA-snvs (nf-core - CIBERER Bioinformatics for Rare diseases Analysis - Small Nucleotide Variant) is a workflow optimized for the analysis of rare diseases, designed to detect SNVs and INDELs in targeted sequencing data (CES/WES) as well as whole genome sequencing (WGS).

This pipeline is developed using Nextflow, a workflow management system that enables easy execution across various computing environments. It uses Docker or Singularity containers, simplifying setup and ...

Type: Nextflow

Creators: None

Submitter: Yolanda Benítez Quesada

Created: 12th Aug 2025 at 10:45

A nextflow pipeline to run the end-to-end image-based in-situ sequencing decoding and RNAScope-like analysis

Euro-BioImaging

Work-in-progress

PaSTa is a Nextflow-based end-to-end image analysis pipeline for decoding image-based spatial transcriptomics data. It performs imaging cycle registration, cell segmentation and transcript peak decoding. It currently supports analysis of three types of ST technology:

in-situ sequencing-like encoding
MERFISH-like encoding
RNAScope-like labelling

Prerequisites:

Nextflow. Installation guide: https://www.nextflow.io/docs/latest/getstarted.html
Docker or Singularity. Installation guide: ...

Type: Nextflow

Creator: Tong LI

Submitter: Tong LI

Created: 29th Jul 2025 at 09:29

reference (and plasmid) preprocessing workflow

Systems and Synthetic Biology

**Workflow for preprocessing a reference file.**

Steps:

When a GenBank file is not provided, it is downloaded from NCBI based on an accession number.
When multiple plasmid GenBank files are provided, they are merged into one file.
When any number of plasmid GenBank files are provided, the reference is merged with the plasmid GenBank file(s) into one file. A FASTA file is also extracted.
When no plasmid GenBank files are provided, a FASTA file is extracted from the reference GenBank file.
A ...
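The branching described in these steps can be sketched as a small planning function. This is only an illustration of the decision logic as listed, not the CWL workflow itself; the function name, its parameters and the step wording are hypothetical, and any steps beyond the truncated part of the description are not covered.

```python
def plan_reference_preprocessing(reference_gbk, accession, plasmid_gbks):
    """Return the ordered actions implied by the inputs.

    reference_gbk: path to a reference GenBank file, or None if not provided.
    accession: NCBI accession used only when reference_gbk is None.
    plasmid_gbks: list of plasmid GenBank file paths (may be empty).
    """
    steps = []
    if reference_gbk is None:
        steps.append(f"download GenBank record {accession} from NCBI")
    if len(plasmid_gbks) > 1:
        steps.append("merge plasmid GenBank files into one file")
    if plasmid_gbks:
        steps.append("merge reference with plasmid GenBank file(s)")
        steps.append("extract FASTA from merged GenBank file")
    else:
        steps.append("extract FASTA from reference GenBank file")
    return steps


# Hypothetical inputs: no local reference, two plasmid files.
for step in plan_reference_preprocessing(None, "ACCESSION_PLACEHOLDER", ["p1.gbk", "p2.gbk"]):
    print(step)
```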

Type: Common Workflow Language

Creator: Martijn Melissen

Submitter: Martijn Melissen

Created: 22nd Jul 2025 at 16:22, Last updated: 12th Aug 2025 at 11:40

Exploring the role of normalization and feature selection in microbiome disease classification pipelines

Machine Learning Techniques in Microbiome

Stable

Code and supporting data for the article: "Exploring the role of normalization and feature selection in microbiome disease classification pipelines."

The repository contains the following folders:

1. data: contains OTU/ASV tables and class annotations for the 15 curated datasets considered.
2. src: code written to perform the analyses from the article and the statistical tests
3. results: tables containing global nested cross validation results
4. figures

License: This

...

Type: Python

Creator: Ignacio Garach

Submitter: Ignacio Garach Vélez

DOI: 10.48546/workflowhub.workflow.1807.1

Created: 13th Jul 2025 at 17:40

metaBIOMx: Metagenomics pipeline for Microbial shot-gun sequencing data

CMG-GUTS

Stable

...

Type: Nextflow

Creators: Alem Gusinac, Thomas Ederveen, Jos Boekhorst, Annemarie Boleij

Submitter: Alem Gusinac

DOI: 10.48546/workflowhub.workflow.1787.6

Created: 3rd Jul 2025 at 15:46, Last updated: 8th Oct 2025 at 09:19

SynProtX

BAID Team

Stable

SynProtX

An official implementation of our research paper "SynProtX: A Large-Scale Proteomics-Based Deep Learning Model for Predicting Synergistic Anticancer Drug Combinations".

SynProtX is a deep learning model that integrates large-scale proteomics data, molecular graphs, and chemical fingerprints to predict synergistic effects of anticancer drug combinations. It provides robust ...

Type: Python

Creators: Bundit Boonyarit, Matin Kositchutima, Tisorn Na Phattalung, Nattawin Yamprasert, Chanitra Thuwajit, Thanyada Rungrotmongkol, Sarana Nutanong

Submitter: Bundit Boonyarit

DOI: 10.48546/workflowhub.workflow.1726.3

Created: 5th Jun 2025 at 17:29, Last updated: 5th Jun 2025 at 20:44

Click-qPCR: An interactive Shiny application for qPCR data analysis

Click-qPCR

Work-in-progress

🧬 Click-qPCR 🧬

An ultra-simple tool for interactive qPCR data analysis developed with R and Shiny.

A Japanese version of the user guide is also available (Read this document in Japanese)

Overview

Click-qPCR is a user-friendly Shiny web application designed for the straightforward analysis of real-time quantitative PCR (qPCR) data.

This tool is readily accessible via a web browser, requiring no local installation for end-users.

It allows users to upload their Cq (quantification cycle) values, perform ΔCq ...
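For readers unfamiliar with ΔCq, the calculation can be sketched as follows. This is the textbook formula, not Click-qPCR's own code (which is written in R); the function names are hypothetical, and a perfect amplification efficiency (a doubling per cycle) is assumed.

```python
from statistics import mean


def delta_cq(target_cq, reference_cq):
    """ΔCq = mean Cq of the target gene minus mean Cq of the reference gene."""
    return mean(target_cq) - mean(reference_cq)


def relative_expression(dcq):
    """Expression relative to the reference gene, assuming doubling per cycle."""
    return 2 ** (-dcq)


# Hypothetical technical replicates for one sample.
dcq = delta_cq([24.0, 24.0], [20.0, 20.0])
print(dcq, relative_expression(dcq))
```

A target that crosses the threshold four cycles later than the reference is therefore estimated at one sixteenth of the reference's abundance.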

Type: Unrecognized workflow type

Creators: Azusa Kubota, Atsushi Tajima

Submitter: Azusa Kubota

Created: 4th Jun 2025 at 06:10, Last updated: 1st Sep 2025 at 09:24

SeuratExtend

Applied Computational Cancer Research

SeuratExtend: An Enhanced Toolkit for scRNA-seq Analysis

Overview

SeuratExtend is an R package designed to provide an improved and easy-to-use toolkit for scRNA-seq analysis and visualization, built upon the Seurat object. While Seurat is a widely-used tool in the R community that offers a foundational framework for scRNA-seq analysis, it has limitations when it comes to more advanced analysis and customized visualization. SeuratExtend expands upon Seurat by offering an array of ...

Type: Unrecognized workflow type

Creator: Yichao Hua

Submitter: Yichao Hua

DOI: 10.48546/workflowhub.workflow.1385.1

Created: 29th May 2025 at 16:04, Last updated: 30th May 2025 at 10:12

Consensus Virtual Drug Screening Workflow

Scipion CNB

This workflow contains a pipeline in Scipion that performs the following steps:

1.1) Import small molecules: introduces a set of small molecular structures in the pipeline as prospective ligands

1.2) Import atomic structure: introduces a protein atomic structure in the pipeline as receptor.

2.1) Ligand preparation: uses RDKit to prepare the small molecules optimizing their 3D structure.

2.2) Receptor preparation: uses bioPython to prepare the receptor structure, removing waters, adding hydrogens ...

Type: Scipion

Creators: None

Submitter: Daniel Del Hoyo

Created: 14th May 2025 at 14:25, Last updated: 14th May 2025 at 14:26

Basic Virtual Drug Screening Workflow

Scipion CNB

Work-in-progress

This workflow performs the most basic Virtual Drug Screening Pipeline to import a set of small molecules and dock them to an imported protein structure.

Type: Scipion

Creators: None

Submitter: Daniel Del Hoyo

Created: 14th May 2025 at 12:22, Last updated: 14th May 2025 at 14:04

CWL4IncorporateTSSintoGXF (paired-end file)

bonohulab

CWL4IncorporateTSSintoGXF

This ...

Type: Common Workflow Language

Creators: Ryo Nozu, Sora Yonezawa

Submitter: Sora Yonezawa

Created: 18th Apr 2025 at 03:51, Last updated: 22nd Apr 2025 at 02:20

Gene Fetch

iBOL Europe Museum Skimming, Biodiversity Genomics Europe (general)

Stable

Gene_fetch

This tool fetches gene sequences from NCBI databases based on taxonomy IDs (taxids) or taxonomic information. It can retrieve both protein and nucleotide sequences for various genes, including protein-coding genes (e.g., cox1, cytb, rbcl, matk) and rRNA genes (e.g., 16S, 18S).

Feature highlight

Fetch protein and/or nucleotide sequences from NCBI GenBank database.
Handles both direct nucleotide sequences and protein-linked nucleotide searches (CDS extraction includes fallback ...

Type: Python

Creators: Dan Parsons, Ben Price

Submitter: Dan Parsons

Created: 17th Apr 2025 at 13:57, Last updated: 28th May 2025 at 14:49

gSpreadComp

Kasmanas

gSpreadComp: Streamlining Microbial Community Analysis for Resistance, Virulence, and Plasmid-Mediated Spread

Overview

gSpreadComp is a UNIX-based, modular bioinformatics toolkit designed to streamline comparative genomics for analyzing microbial communities. It integrates genome annotation, gene spread calculation, plasmid-mediated horizontal gene transfer (HGT) detection and resistance-virulence ranking within the analysed microbial community to help researchers identify potential ...

Type: Shell Script

Creator: Jonas Kasmanas

Submitter: Jonas Kasmanas

DOI: 10.48546/workflowhub.workflow.1340.3

Created: 15th Apr 2025 at 11:29

AnnoAudit - Annotation Auditor

Bioinformatics Laboratory for Genomics and Biodiversity (LBGB)

Work-in-progress

AnnoAudit - Annotation Auditor

AnnoAudit is a robust Nextflow pipeline designed to evaluate the quality of genomic annotations through a multifaceted approach.

Overview of the workflow

The workflow assesses the annotation quality based on different criteria:

Protein evidence support
RNASeq evidence support
Statistics of the predictions (i.e., gene length, exon number, etc.)
Ortholog analysis (BUSCO, OMArk)

Input data

Reference genome genome.[.fna, .fa, .fasta]
Annotation ...

Type: Nextflow

Creator: Phuong Doan

Submitter: Phuong Doan

DOI: 10.48546/workflowhub.workflow.1330.1

Created: 3rd Apr 2025 at 09:50

sanger-tol/curationpretext

Tree of Life Genome Assembly, Tree of Life Genome Analysis

Work-in-progress

sanger-tol/curationpretext

Type: Nextflow

Creators: Damon-Lee Pointon, Mahesh Panchel, Yumi Sims, Will Eagles, Matthieu Muffato, Solenne Correard, Josie Paris

Submitter: Damon-Lee Pointon

Created: 12th Mar 2025 at 10:23

sanger-tol/curationpretext

Tree of Life Genome Assembly, Tree of Life Genome Analysis

Work-in-progress

Type: Nextflow

Creators: Damon-Lee Pointon, Mahesh Panchel

Submitter: Damon-Lee Pointon

Created: 12th Mar 2025 at 10:19

tcga-data-nf

QuackenbushLab

Work-in-progress

Workflow to download and prepare TCGA data.

The workflow divides the process of generating gene regulatory networks from TCGA cancer data into three steps:

Downloading the raw data from GDC and saving the rds/tables needed later
Preparing the data. This step includes filtering the data, normalizing it...
Analysis of gene regulatory networks

Type: Nextflow

Creator: Viola Fanfani

Submitter: Viola Fanfani

Created: 18th Feb 2025 at 20:38

ONT Artificial Deletion Filter-Delter

NkuyfqLab

Stable

ONT Artificial Deletion Filter-Delter

A tool to filter short artificial deletion variations by Oxford Nanopore Technologies (ONT) R9 and R10 flow cells and chemistries.

Requirements

The tool has been tested on Ubuntu 20.04 with 256GB RAM, 64 CPU cores and an NVIDIA GPU with 48GB RAM. The minimal requirements should be >= 64GB RAM and an NVIDIA GPU with >= 8GB RAM. Other operating systems like Windows or Mac were not tested.

ONT software like Guppy, ...

Type: Snakemake

Creator: Qiang Ye

Submitter: Qiang Ye

DOI: 10.48546/workflowhub.workflow.1205.2

Created: 14th Nov 2024 at 14:27, Last updated: 17th Dec 2024 at 14:57

deepvariant-nextflow

National Computational Infrastructure (NCI) WorkflowHub team

Stable

Nextflow Pipeline for DeepVariant

This repository contains a Nextflow pipeline for Google’s DeepVariant, optimised for execution on NCI Gadi.

Quickstart Guide

Edit the pipeline_params.yml file to include:

samples: a list of samples, where each sample includes the sample name, BAM file path (ensure corresponding .bai is in the same directory), path to an optional regions-of-interest BED file (set to '' if not required), and the model type.
ref: path to the reference FASTA (ensure ...

Type: Nextflow

Creators: Kisaru Liyanage, Matthew Downton

Submitter: Kisaru Liyanage

Created: 5th Dec 2024 at 01:16

plant2human workflow

bonohulab

Work-in-progress

plant2human workflow 🌾 ↔ 🕺

Type: Common Workflow Language

Creator: Sora Yonezawa

Submitter: Sora Yonezawa

DOI: 10.48546/workflowhub.workflow.1206.8

Created: 16th Nov 2024 at 04:56, Last updated: 28th Sep 2025 at 05:51

GALOP - Genome Assembly using Long reads Pipeline

Bioinformatics Laboratory for Genomics and Biodiversity (LBGB), ERGA Assembly

Work-in-progress

GALOP - Genome Assembly using Long reads Pipeline

This repository contains an exact copy of the standard Genoscope long reads assembly pipeline.

At the moment, this is not intended for users to download as it uses grid submission commands that will only work at Genoscope. As time goes on, we intend to make this pipeline available to a broader audience. However, genome assembly and polishing commands are accessible in the lib/assembly.py and lib/polishing.py files.

galop.py -h 
Mandatory
...

Type: Python

Creators: Benjamin Istace, Jean-Marc Aury, Caroline Belser

Submitter: Benjamin Istace

DOI: 10.48546/workflowhub.workflow.1200.2

Created: 12th Nov 2024 at 07:37, Last updated: 14th Nov 2024 at 06:55

skim2mito

NHM Clark group

Stable

skim2mito

skim2mito is a snakemake pipeline for the batch assembly, annotation, and phylogenetic analysis of mitochondrial genomes from low coverage genome skims. The pipeline was designed to work with sequence data from museum collections. However, it should also work with genome skims from recently collected samples.

Setup
Example data
Input
Output
Filtering contaminants
[Assembly and ...

Type: Snakemake

Creators: None

Submitter: Oliver White

Created: 12th Mar 2024 at 15:03, Last updated: 7th Oct 2024 at 13:24

SAPP conversion Workflow

UNLOCK

Work-in-progress

Workflow for converting (genome) annotation tool output into a GBOL RDF file (TTL/HDT) using SAPP

Current formats / tools:

EMBL format
InterProScan (JSON/TSV)
eggNOG-mapper (TSV)
KoFamScan (TSV)

git: https://gitlab.com/m-unlock/cwl

SAPP (Semantic Annotation Platform with Provenance):
https://gitlab.com/sapp
https://academic.oup.com/bioinformatics/article/34/8/1401/4653704

Type: Common Workflow Language

Creators: Bart Nijsse, Jasper Koehorst

Submitter: Bart Nijsse

Created: 1st Oct 2024 at 14:46, Last updated: 3rd Oct 2024 at 10:43

Microbial (meta-) genome annotation

UNLOCK

Work-in-progress

Workflow for microbial (meta-)genome annotation

Input is a (meta)genome sequence in fasta format.

bakta
KoFamScan (optional)
InterProScan (optional)
eggNOG mapper (optional)
To RDF conversion with SAPP (optional, default on) --> SAPP conversion Workflow in WorkflowHub

git: https://gitlab.com/m-unlock/cwl

Type: Common Workflow Language

Creators: Jasper Koehorst, Bart Nijsse

Submitter: Bart Nijsse

Created: 1st Oct 2024 at 14:16, Last updated: 3rd Oct 2024 at 10:47

ECTI Atopic Dermatitis

IDUN - Drug Delivery and Sensing

Stable

Stratum corneum nanotexture feature detection using deep learning and spatial analysis: a non-invasive tool for skin barrier assessment

This repository presents an objective, quantifiable method for assessing atopic dermatitis (AD) severity. The program integrates deep learning object detection with spatial analysis algorithms to accurately calculate the density of circular nano-size objects (CNOs), termed the Effective Corneocyte Topographical Index (ECTI). The ECTI demonstrates remarkable ...

Type: Python

Creator: Jen-Hung Wang

Submitter: Jen-Hung Wang

DOI: 10.48546/workflowhub.workflow.1161.1

Created: 11th Sep 2024 at 12:09

GADES reproducibility workflow

Medvedeva Lab

Stable

Article-GADES

This repository contains the code for generating and benchmarking the results of the GADES package for distance matrix calculation.

Installation

git lfs install 
git clone https://github.com/lab-medvedeva/Article-GADES.git 
cd Article-GADES

Put the real datasets, in MEX format, in the folder Datasets/Real.

Running benchmark using Docker Deployment

docker run --gpus all \ 
-v $PWD/Datasets:/workspace/Article-GADES/Datasets
...

Type: Docker

Creator: Pavel Akhtyamov

Submitter: Pavel Akhtyamov

DOI: 10.48546/workflowhub.workflow.1125.1

Created: 5th Sep 2024 at 11:35, Last updated: 5th Sep 2024 at 11:36

Swedish Earth Biogenome Project Genome Assembly Workflow

NBIS, ERGA Assembly

Work-in-progress

Swedish Earth Biogenome Project - Genome Assembly Workflow

The primary genome assembly workflow for the Earth Biogenome Project at NBIS.

Workflow overview

General aim:

flowchart LR 
hifi[/ HiFi reads /] --> data_inspection 
ont[/ ONT reads /] --> data_inspection 
hic[/ Hi-C reads /] --> data_inspection 
data_inspection[[ Data inspection ]] --> preprocessing 
preprocessing[[ Preprocessing ]] --> assemble 
assemble[[ Assemble ]] --> validation 
validation[[ Assembly
...

Type: Nextflow

Creators: Mahesh Binzer-Panchal, Martin Pippel

Submitter: Mahesh Binzer-Panchal

Created: 23rd Aug 2024 at 14:16

Protein-protein and protein-nucleic acid binding site prediction via interpretable hierarchical geometric deep learning

Protein-protein and protein-nucleic acid binding site prediction research

Stable

GraphRBF is a state-of-the-art protein-protein/nucleic acid interaction site prediction model built from enhanced graph neural networks and prioritized radial basis function neural networks. This project lets users apply our software to directly predict protein binding sites or train our model on a new database. Identification of protein-protein and protein-nucleic acid binding sites provides insights into biological processes related to protein functions and technical guidance for disease ...

Type: BioCompute Object

Creator: 仕卓张

Submitter: 仕卓张

DOI: 10.48546/workflowhub.workflow.1107.1

Created: 23rd Aug 2024 at 15:12

cfDNA UniFlow: A unified preprocessing pipeline for cell-free DNA data from liquid biopsies

KircherLab

Stable

cfDNA UniFlow is a unified, standardized, and ready-to-use workflow for processing whole genome sequencing (WGS) cfDNA samples from liquid biopsies. It includes essential steps for pre-processing raw cfDNA samples, quality control and reporting. Additionally, several optional utility functions like GC bias correction and estimation of copy number state are included. Finally, we provide specialized methods for extracting coverage derived signals and visualizations comparing cases and controls. ...

Type: Snakemake

Creator: Sebastian Röner

Submitter: Sebastian Röner

DOI: 10.48546/workflowhub.workflow.1091.2

Created: 7th Aug 2024 at 12:56, Last updated: 11th Nov 2024 at 08:21

Deepconsensus for Sequel2/2e subreads

WGGC

deepconsensus 1.2 snakemake pipeline

This snakemake-based workflow takes in a subreads.bam and results in a deepconsensus.fastq

No methylation calls!

The metadata id of the subreads file needs to be: "m[numeric][numeric][numeric].subreads.bam"

Chunking (how many subjobs) and ccs min quality filter can be adjusted in the config.yaml

The checkpoint model for deepconsensus1.2 should be accessible like this: gsutil cp -r gs://brain-genomics-public/research/deepconsensus/models/v1.2/model_checkpoint/* ...

Type: Snakemake

Creators: None

Submitter: dan rick

Created: 12th Jul 2024 at 09:59

mettannotator

MGnify

Stable

mettannotator

Introduction
Workflow and tools
Installation and dependencies ...

Type: Nextflow

Creators: Tatiana Gurbich, Martin Beracochea

Submitter: Martin Beracochea

Created: 1st Jul 2024 at 17:34, Last updated: 21st Jan 2025 at 15:32

MOLGENIS/VIP: Variant Interpretation Pipeline

MOLGENIS

Stable

Variant Interpretation Pipeline (VIP) that annotates, filters and reports prioritized causal variants in humans, see https://github.com/molgenis/vip for more information.

Type: Unrecognized workflow type

Creators: None

Submitter: Dennis Hendriksen

Created: 21st Jun 2021 at 09:33, Last updated: 12th Jun 2024 at 10:50

Porto-Sinusoidal Vascular Disease transcriptomics analysis workflow

EJPRD WP13 case-studies workflows

Work-in-progress

Workflow for gene set enrichment analysis (GSEA) and co-expression analysis (WGCNA) on transcriptomics data to analyze pathways affected in Porto-Sinusoidal Vascular Disease.

Type: Common Workflow Language

Creators: Aishwarya Iyer, Friederike Ehrhart

Submitter: Aishwarya Iyer

DOI: 10.48546/workflowhub.workflow.1040.1

Created: 11th Jun 2024 at 15:13, Last updated: 14th Jun 2024 at 13:23

CNVand

Institute for Human Genetics and Genomic Medicine Aachen

Stable

CNVand

Type: Snakemake

Creator: Carlos Classen

Submitter: Carlos Classen

DOI: 10.48546/workflowhub.workflow.1039.1

Created: 10th Jun 2024 at 16:56

IMPaCT-Data quality control workflow implementation in nf-core/Sarek

EGA

...

Type: Nextflow

Creators: Arnau Soler Costa, Amy Curwin, Jordi Rambla, All the Sarek team, nf-core community and people in the IMPaCT-Data project.

Submitter: Arnau Soler Costa

DOI: 10.48546/workflowhub.workflow.1030.2

Created: 5th Jun 2024 at 16:05, Last updated: 25th Jun 2024 at 08:30

Theoretical fragment substructure generation and in silico mass spectral library high-resolution upcycling workflow

RECETOX SpecDatRI

Work-in-progress

Galaxy Workflow Documentation: MS Finder Pipeline

This document outlines an MSFinder Galaxy workflow designed for peak annotation. The workflow consists of several steps aimed at preprocessing MS data, filtering, enhancing, and running MSFinder.

Step 1: Data Collection and Preprocessing

Collect the InChI and SMILES if they are missing from the dataset, and subsequently filter out the spectra that still lack an InChI and SMILES.

1.1 MSMetaEnhancer: Collect InChi, Isomeric_smiles, and Nominal_mass

...

Type: Galaxy

Creators: Zargham Ahmad, Helge Hecht, Elliott J. Price, Research Infrastructure RECETOX RI (No LM2018121) financed by the Ministry of Education, Youth and Sports, and Operational Programme Research, Development and Innovation - project CETOCOEN EXCELLENCE (No CZ.02.1.01/0.0/0.0/17_043/0009632).

Submitters: Helge Hecht, Zargham Ahmad

DOI: 10.48546/workflowhub.workflow.888.2

Created: 20th May 2024 at 11:05, Last updated: 6th Jun 2024 at 10:58

GSC (Genotype Sparse Compression)

Genome Data Compression Team

Stable

GSC (Genotype Sparse Compression)

Genotype Sparse Compression (GSC) is an advanced tool for lossless compression of VCF files, designed to efficiently store and manage VCF files in a compressed format. It accepts VCF/BCF files as input and utilizes advanced compression techniques to significantly reduce storage requirements while ensuring fast query capabilities. In our study, we successfully compressed the VCF files from the 1000 Genomes Project (1000Gpip3), consisting of 2504 samples and 80 ...

Type: Docker

Creator: Xiaolong Luo

Submitter: Xiaolong Luo

DOI: 10.48546/workflowhub.workflow.887.1

Created: 18th May 2024 at 14:18

GSC (Genotype Sparse Compression)

Genome Data Compression Team

Stable

GSC (Genotype Sparse Compression)

Type: Common Workflow Language

Creators: None

Submitter: Xiaolong Luo

Created: 17th May 2024 at 17:51

Training a CNN model for classification of transcriptional subtypes and survival prediction in glioblastoma

BRAIN - Biomedical Research on Adult Intracranial Neoplasms

Work-in-progress

GBMatch_CNN

Work in progress... Predicting TS & risk from glioblastoma whole slide images

Reference

Upcoming paper: stay tuned...

Dependencies

python 3.7.7

randaugment by Khrystyna Faryna: https://github.com/tovaroe/pathology-he-auto-augment

tensorflow 2.1.0

scikit-survival 0.13.1

pandas 1.0.3

lifelines 0.25.0

Description

The pipeline implemented here predicts transcriptional subtypes and survival of glioblastoma patients based on H&E stained whole slide scans. Sample data is ...

Type: Python

Creator: Thomas Roetzer-Pejrimovsky

Submitter: Thomas Roetzer-Pejrimovsky

DOI: 10.48546/workflowhub.workflow.883.1

Created: 13th May 2024 at 08:10

JAX NGS Operations Nextflow DSL2 Pipelines

Jackson Laboratory NGS-Ops

Stable

JAX NGS Operations Nextflow DSL2 Pipelines

This repository contains production bioinformatic analysis pipelines for a variety of bulk 'omics data analysis. Please see the Wiki documentation associated with this repository for all documentation and available analysis workflows.

Type: Nextflow

Creators: Michael Lloyd, Brian Sanderson, Barry Guglielmo, Sai Lek, Peter Fields, Harshpreet Chandok, Carolyn Paisie, Gabriel Rech, Ardian Ferraj, Anuj Srivastava

Submitter: Michael Lloyd

DOI: 10.48546/workflowhub.workflow.874.1

Created: 3rd May 2024 at 13:55, Last updated: 3rd May 2024 at 13:58

ProGFASTAGen - Protein-Graph FASTA Generation (and Identification) Workflows

Medizinisches Proteom-Center, Medical Bioinformatics

Stable

ProGFASTAGen

The ProGFASTAGen (Protein-Graph-FASTA-Generator or ProtGraph-FASTA-Generator) repository contains workflows to generate so-called precursor-specific-FASTAs (using the precursors from MGF-files) including feature-peptides, like VARIANTs or CONFLICTs if desired, or global-FASTAs (as described in ProtGraph). The single workflow scripts have been implemented with Nextflow-DSL-2 ...

Type: Nextflow

Creators: Dominik Lux, Julian Uszkoreit

Submitter: Dominik Lux

DOI: 10.48546/workflowhub.workflow.837.1

Created: 26th Apr 2024 at 10:54

Parabricks-Genomics-nf

Sydney Informatics Hub

Parabricks-Genomics-nf is a GPU-enabled pipeline for alignment and germline short variant calling for short read sequencing data. The pipeline utilises NVIDIA's Clara Parabricks toolkit to dramatically speed up the execution of best practice bioinformatics tools. Currently, this pipeline is configured specifically for NCI's Gadi HPC.

NVIDIA's Clara Parabricks can deliver a significant ...

Type: Nextflow

Creator: Georgina Samaha

Submitter: Georgina Samaha

DOI: 10.48546/workflowhub.workflow.836.1

Created: 26th Apr 2024 at 00:19

sanger-tol/treeval v1.1.0 - Ancient Aurora

Tree of Life Genome Assembly

Stable

...

Type: Nextflow

Creators: Damon-Lee Pointon, William Eagles, Ying Sims

Submitter: Damon-Lee Pointon

Created: 9th Apr 2024 at 10:22

HiC contact map generation

ERGA Assembly, Biodiversity Genomics Europe (general)

Stable

HiC contact map generation

Snakemake pipeline for the generation of .pretext and .mcool files for visualisation of HiC contact maps with PretextView and HiGlass, respectively.

Prerequisites

This pipeline has been tested using Snakemake v7.32.4 and requires conda to install the required tools. To run the pipeline, use the command:

snakemake --use-conda

A set of configuration and run scripts is provided for execution on a Slurm queueing system. After configuring ...

Type: Snakemake

Creator: Tom Brown

Submitter: Tom Brown

DOI: 10.48546/workflowhub.workflow.795.2

Created: 14th Mar 2024 at 09:50, Last updated: 14th Mar 2024 at 09:52

Bactria: BarCode TRee Inference and Analysis

Biodiversity Genomics Europe (general)

Work-in-progress

Bactria: BarCode TRee Inference

...

Type: Snakemake

Creators: None

Submitter: Rutger Vos

Created: 24th Jan 2024 at 10:38, Last updated: 5th Feb 2024 at 10:09

HP2NET - Framework for Construction of Phylogenetic Networks on High Performance Computing (HPC) Environment

HP2NET - Framework for construction of phylogenetic networks on High Performance Computing (HPC) environment

Framework for construction of phylogenetic networks on High Performance Computing (HPC) environment

Introduction

Phylogeny refers to the evolutionary history and relationship between biological lineages related by common descent. Reticulate evolution refers to the origination of lineages through the complete or partial merging of ancestor lineages. Networks may be used to represent lineage independence events in non-treelike phylogenetic processes.
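As a minimal sketch of the idea above (not HP2NET's own data model), a phylogenetic network can be held as a list of directed parent-to-child edges. A node with more than one parent then marks a reticulation, that is, a lineage formed by the merging of ancestor lineages. The function name and the toy edge list are assumptions made for illustration.

```python
from collections import defaultdict


def reticulation_nodes(edges):
    """Return the nodes that have more than one parent in a directed network."""
    parents = defaultdict(set)
    for parent, child in edges:
        parents[child].add(parent)
    return sorted(node for node, found in parents.items() if len(found) > 1)


# Hypothetical toy network: lineage H arises from the merging of A and B.
edges = [
    ("root", "A"),
    ("root", "B"),
    ("A", "H"),
    ("B", "H"),
    ("A", "C"),
]

print(reticulation_nodes(edges))
```

In a strict tree every node has at most one parent, so the function returns an empty list; here it reports `H` as the only reticulation node.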

The methodology for reconstructing networks ...

Type: Python

Creators: Rafael Terra, Diego Carvalho

Submitter: Rafael Terra

DOI: 10.48546/workflowhub.workflow.703.1

Created: 9th Jan 2024 at 13:04, Last updated: 18th Jan 2024 at 17:50

Somatic-ShortV-nf

Sydney Informatics Hub, Australian BioCommons

Work-in-progress

This is a Nextflow implementation of the GATK Somatic Short Variant Calling workflow. This workflow can be used to discover somatic short variants (SNVs and indels) from tumour and matched normal BAM files following GATK's Best Practices Workflow. The workflow is currently optimised to run efficiently and at scale on the National Compute Infrastructure, Gadi.

Type: Nextflow

Creators: Nandan Deshpande, Tracy Chew, Cali Willet, Georgina Samaha

Submitter: Georgina Samaha

DOI: 10.48546/workflowhub.workflow.691.1

Created: 20th Dec 2023 at 01:12, Last updated: 20th Dec 2023 at 01:16

dna-seq-varlociraptor

Snakemake-Workflows

Stable

Snakemake workflow: dna-seq-varlociraptor

A ...

Type: Snakemake

Creators: Felix Mölder, David Lähnemann, Johannes Köster

Submitter: Johannes Köster

Created: 14th Dec 2023 at 08:15

ONTViSc (ONT-based Viral Screening for Biosecurity)

QCIF Bioinformatics

Stable

ONTViSc (ONT-based Viral Screening for Biosecurity)

Introduction

eresearchqut/ontvisc is a Nextflow-based bioinformatics pipeline designed to support the diagnosis of virus and viroid pathogens for biosecurity. It takes fastq files generated from either amplicon or whole-genome sequencing using Oxford Nanopore Technologies as input.

The pipeline can either: 1) perform a direct search on the sequenced reads, 2) generate clusters, 3) assemble the reads to generate longer contigs or 4) directly ...

Type: Nextflow

Creators: Marie-Emilie Gauthier, Craig Windell, Magdalena Antczak, Roberto Barrero

Submitter: Magdalena Antczak

DOI: 10.48546/workflowhub.workflow.683.3

Created: 4th Dec 2023 at 01:42, Last updated: 18th Dec 2024 at 04:29

Inclusion Body Myositis Active Subnetwork Identification Workflow

EJPRD WP13 case-studies workflows

Workflow for creating a large disease network from various datasets and databases for IBM, and applying the active subnetwork identification method MOGAMUN.

Type: Common Workflow Language

Creators: Daphne Wijnbergen, Mridul Johari

Submitter: Daphne Wijnbergen

DOI: 10.48546/workflowhub.workflow.681.7

Created: 27th Nov 2023 at 12:52, Last updated: 1st Feb 2024 at 11:26

ANNOTATO - ERGA Genome Annotation Workflow in Nextflow

ERGA Annotation, Bioinformatics Laboratory for Genomics and Biodiversity (LBGB)

Stable

ANNOTATO - Annotation workflow To Annotate Them Oll

ANNOTATO - Annotation workflow To Annotate Them Oll
Overview of the workflow
Input data
Pipeline steps
Output data
Prerequisites
Installation
Running ANNOTATO
Before running the pipeline (IMPORTANT) ...

Type: Nextflow

Creator: Phuong Doan

Submitters: Tom Brown, Phuong Doan

DOI: 10.48546/workflowhub.workflow.654.2

Created: 9th Nov 2023 at 09:43, Last updated: 24th Nov 2023 at 15:24

sanger-tol/insdcdownload v1.1.0 - Deciduous ent

Tree of Life Genome Analysis

Stable

...

Type: Nextflow

Creators: Matthieu Muffato, Priyanka Surana

Submitter: Matthieu Muffato

Created: 2nd Nov 2023 at 11:59, Last updated: 14th Nov 2023 at 11:58

Workflow 4: Staramr

Seq4AMR

Work-in-progress

Correlation between Phenotypic and In Silico Detection of Antimicrobial Resistance in Salmonella enterica in Canada Using Staramr.

Doi: 10.3390/microorganisms10020292

tool	version	license
staramr	0.8.0	Apache-2.0 license

Type: Galaxy

Creators: None

Submitter: Dennis Dollée

Created: 11th May 2023 at 09:29, Last updated: 9th Sep 2024 at 09:06

Workflow 3: AMR - SeqSero2/SISTR

Seq4AMR

Work-in-progress

With this Galaxy pipeline you can use Salmonella sp. next generation sequencing results to predict bacterial AMR phenotypes and compare the results against gold standard Salmonella sp. phenotypes obtained from food.

This pipeline is based on the work of the National Food Agency of Canada. Doi: 10.3389/fmicb.2020.00549

tool	version	license
SeqSero2	1.2.1	GNU GPL v2.0
...

Type: Galaxy

Creators: None

Submitter: Dennis Dollée

Created: 24th Nov 2022 at 13:42, Last updated: 9th Sep 2024 at 09:23

ScreenDOP - Screening of strategies for disease outcome prediction

yPublish - Bioinfo tools

Stable

Summary

The data preparation pipeline contains tasks for two distinct scenarios: leukaemia that contains microarray data for 119 patients and ovarian cancer that contains next generation sequencing data for 380 patients.

The disease outcome prediction pipeline offers two strategies for this task:

Graph kernel method: It starts generating personalized networks for ...

Type: Python

Creator: Yasmmin Martins

Submitter: Yasmmin Martins

Created: 22nd Oct 2023 at 01:18, Last updated: 22nd Oct 2023 at 01:19

PipePatExp - Pipeline to aggregate gene expression correlation information for PPI

yPublish - Bioinfo tools

Stable

Summary

The PPI information aggregation pipeline starts by retrieving all datasets in the GEO database whose material was generated using expression profiling by high-throughput sequencing. For each dataset identifier, it extracts the supplementary files that contain the counts table. Once the download step finishes, it identifies which tables were already normalized and which hold raw counts that still need to be normalized. It also identifies and maps the gene ids to UniProt (the ids found usually ...

Type: Python

Creator: Yasmmin Martins

Submitter: Yasmmin Martins

Created: 22nd Oct 2023 at 01:02

PPIVPro - PPI Validation Process

yPublish - Bioinfo tools

Stable

Summary

The proposed validation process has two pipelines for filtering PPIs predicted by an in silico detection method; the two pipelines can be executed separately. The first pipeline (i) filters according to association rules of cellular locations extracted from the HINT database. The second pipeline (ii) filters according to scientific papers in which both proteins of a PPI appear in an interaction context within a sentence.

The pipeline (i) starts extracting cellular component annotations from ...

Type: Python

Creator: Yasmmin Martins

Submitter: Yasmmin Martins

Created: 22nd Oct 2023 at 00:43, Last updated: 22nd Oct 2023 at 00:45

PredPrIn - Scientific workflow to predict protein-protein interactions based in a combined analysis of multiple protein characteristics.

yPublish - Bioinfo tools

Summary

PredPrIn is a scientific workflow to predict Protein-Protein Interactions (PPIs) using machine learning to combine multiple PPI detection methods of proteins according to three categories: structural, based on primary amino acid sequence, and functional annotations.

PredPrIn contains three main steps: (i) acquirement and treatment of protein information, (ii) feature generation, and (iii) classification and analysis.

(i) The first step builds a knowledge base with the available annotations ...

Type: Python

Creator: Yasmmin Martins

Submitter: Yasmmin Martins

Created: 22nd Oct 2023 at 00:35, Last updated: 22nd Oct 2023 at 00:37

HPPIDiscovery - Scientific workflow to augment, predict and evaluate host-pathogen protein-protein interactions

yPublish - Bioinfo tools

Stable

Summary

HPPIDiscovery is a scientific workflow to augment, predict and perform an in silico curation of host-pathogen Protein-Protein Interactions (PPIs), using graph theory to build new candidate PPIs and machine learning to predict and evaluate them by combining multiple PPI detection methods of proteins according to three categories: structural, based on primary amino acid sequence, and functional annotations.

HPPIDiscovery contains three main steps: (i) acquirement of pathogen and host proteins ...

Type: Snakemake

Creator: Yasmmin Martins

Submitter: Yasmmin Martins

Created: 20th Oct 2023 at 00:56

VVV2_align_PE

ANSES-Ploufragan

Deprecated

PAIRED-END workflow. Align reads on fasta reference/assembly using bwa mem, get a consensus, variants, mutation explanations.

IMPORTANT:

For the "bcftools call" consensus step, the --ploidy file is in "Données partagées" (Shared Data) and must be imported into your history and provided to the workflow (it tells bcftools to perform haploid variant calling).
SELECT THE MOST ADAPTED VADR MODEL for annotation (see vadr parameters).
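
The contents of the shared ploidy file are not shown here. As a hedged sketch only, assuming it follows the five-column layout that the bcftools documentation describes for `--ploidy-file` (CHROM, FROM, TO, SEX, PLOIDY, with `*` as a wildcard), a file that requests haploid calling everywhere could be written as below. The function name and output path are hypothetical, and the real shared file may differ.

```python
from pathlib import Path


def write_haploid_ploidy_file(path):
    """Write a ploidy file that sets ploidy 1 for every region and both sexes."""
    # Columns: CHROM FROM TO SEX PLOIDY; "*" records act as the default.
    # Assumption: this mirrors the format in the bcftools documentation.
    lines = ["* * * M 1", "* * * F 1"]
    Path(path).write_text("\n".join(lines) + "\n")
    return lines


if __name__ == "__main__":
    # Hypothetical file name; upload the result to the Galaxy history.
    write_haploid_ploidy_file("haploid_ploidy.txt")
```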

Type: Galaxy

Creator: Fabrice Touzain

Submitter: Fabrice Touzain

Created: 28th Jun 2023 at 10:52, Last updated: 19th Jun 2025 at 11:23

MLme: Machine Learning Made Easy

This workflow represents the Default ML Pipeline for AutoML feature from MLme. Machine Learning Made Easy (MLme) is a novel tool that simplifies machine learning (ML) for researchers. By integrating four essential functionalities, namely data exploration, AutoML, CustomML, and visualization, MLme fulfills the diverse requirements of researchers while eliminating the need for extensive coding efforts. MLme serves as a valuable resource that empowers researchers of all technical levels to leverage ...

Type: Workflow Description Language

Creator: Akshay Akshay

Submitter: Akshay Akshay

DOI: 10.48546/workflowhub.workflow.571.1

Created: 15th Sep 2023 at 15:36, Last updated: 15th Sep 2023 at 15:39

CLAWS (CNAG's long-read assembly workflow in Snakemake)

ERGA Assembly

Stable

CLAWS (CNAG's Long-read Assembly Workflow in Snakemake)

Snakemake Pipeline used for de novo genome assembly @CNAG. It has been developed for Snakemake v6.0.5.

It accepts Oxford Nanopore Technologies (ONT) reads, PacBio HiFi reads, Illumina paired-end data, Illumina 10X data and Hi-C reads. It performs read preprocessing, assembly, polishing, purge_dups, scaffolding and various evaluation steps. By default it will preprocess the reads, run Flye + Hypo + purge_dups + yahs and evaluate ...

Type: Snakemake

Creators: Jessica Gomez-Garrido, Fernando Cruz (CNAG), Francisco Camara (CNAG), Tyler Alioto (CNAG)

Submitter: Jessica Gomez-Garrido

DOI: 10.48546/workflowhub.workflow.567.2

Created: 12th Sep 2023 at 14:23, Last updated: 2nd Feb 2024 at 12:24

SnakeMAGs: a simple, efficient, flexible and scalable workflow to reconstruct prokaryotic genomes from metagenomes

Metagenomic tools

Stable

About SnakeMAGs

SnakeMAGs is a workflow to reconstruct prokaryotic genomes from metagenomes. The main purpose of SnakeMAGs is to process Illumina data from raw reads to metagenome-assembled genomes (MAGs). SnakeMAGs is efficient, easy to handle and flexible to different projects. The workflow is CeCILL licensed, implemented in Snakemake (run on multiple cores) and available ...

Type: Snakemake

Creators: Nachida Tadrent, Franck Dedeine, Vincent Hervé

Submitter: Vincent Hervé

Created: 2nd Aug 2023 at 12:41

GERONIMO

Mendel Centre for Plant Genomics and Proteomics

GERONIMO

Introduction

GERONIMO is a bioinformatics pipeline designed to conduct high-throughput homology searches of structural genes using covariance models. These models are based on the alignment of sequences and the consensus of secondary structures. The pipeline is built using Snakemake, a workflow management tool that allows for the reproducible execution of analyses on various computational platforms.

The idea for developing GERONIMO emerged from a comprehensive search for [telomerase ...

Type: Snakemake

Creator: Agata Kilar

Submitter: Agata Kilar

DOI: 10.48546/workflowhub.workflow.547.1

Created: 1st Aug 2023 at 02:34, Last updated: 3rd Aug 2023 at 19:15

prepareChIPs:

Black Ochre Data Labs

Work-in-progress

prepareChIPs

This is a simple snakemake workflow template for preparing single-end ChIP-Seq data. The steps implemented are:

Download raw fastq files from SRA
Trim and Filter raw fastq files using AdapterRemoval
Align to the supplied genome using bowtie2
Deduplicate Alignments using Picard MarkDuplicates
Call Macs2 Peaks using macs2

A pdf of the rulegraph is available here
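
The five steps listed above form a simple chain, each consuming the output of the previous one. The sketch below illustrates that ordering with Python's standard `graphlib` module; it is not the workflow's actual Snakefile, and the step names are hypothetical labels chosen for this example.

```python
from graphlib import TopologicalSorter

# Tool used at each step, as named in the list above.
tools = {
    "download": "SRA",
    "trim": "AdapterRemoval",
    "align": "bowtie2",
    "deduplicate": "Picard MarkDuplicates",
    "call_peaks": "macs2",
}

# Each step maps to the set of steps it depends on (hypothetical step names).
dependencies = {
    "download": set(),
    "trim": {"download"},
    "align": {"trim"},
    "deduplicate": {"align"},
    "call_peaks": {"deduplicate"},
}

# A linear chain has exactly one valid execution order.
order = list(TopologicalSorter(dependencies).static_order())

for step in order:
    print(f"{step}: {tools[step]}")
```

Snakemake derives this kind of ordering itself from the input and output files declared in each rule, so the explicit dependency table here only stands in for what the rulegraph shows.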

Full details for each step are given below. Any additional ...

Type: Snakemake

Creator: Stevie Pederson

Submitter: Stevie Pederson

DOI: 10.48546/workflowhub.workflow.528.1

Created: 9th Jul 2023 at 09:54

VVV2_align_SE

ANSES-Ploufragan

Deprecated

SINGLE-END workflow. Align reads on fasta reference/assembly using bwa mem, get a consensus, variants, mutation explanations.

IMPORTANT:

For the "bcftools call" consensus step, the --ploidy file is in "Données partagées" (Shared Data) and must be imported into your history and provided to the workflow (it tells bcftools to perform haploid variant calling).
SELECT THE MOST ADAPTED VADR MODEL for annotation (see vadr parameters).

Type: Galaxy

Creator: Fabrice Touzain

Submitter: Fabrice Touzain

Created: 27th Jun 2023 at 15:41, Last updated: 19th Jun 2025 at 11:24

Metabolome Annotation Workflow (MAW)

Metabolomics-Reproducibility

Stable

This repository hosts the Metabolome Annotation Workflow (MAW). The workflow takes MS2 data files in .mzML format as input in R. It performs spectral database dereplication using the R package Spectra and compound database dereplication using SIRIUS or MetFrag. Final candidate selection is done in Python using RDKit and PubChemPy.

Type: Common Workflow Language

Creators: Mahnoor Zulfiqar, Michael R. Crusoe, Luiz Gadelha, Christoph Steinbeck, Maria Sorokina, Kristian Peters

Submitter: Mahnoor Zulfiqar

DOI: 10.48546/workflowhub.workflow.510.2

Created: 19th Jun 2023 at 21:09, Last updated: 1st Aug 2023 at 15:21

Purge retained haplotypes using Purge-Dups

ERGA Assembly, Biodiversity Genomics Europe (general)

Purge dups

This Snakemake pipeline takes a contig-level genome and PacBio reads as input. It has been tested with Snakemake v7.32.4. Raw long-read sequencing files and the input contig genome assembly must be given in the config.yaml file. To execute the workflow, run:

snakemake --use-conda --cores N

Or configure the cluster.json and run using the ./run_cluster command
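
For scripted runs, the local command quoted above can be assembled programmatically. The following is a minimal sketch that only uses the two flags shown in the text (`--use-conda` and `--cores`); the helper names and the `workdir` parameter are assumptions for illustration, and the cluster route via `./run_cluster` is not covered.

```python
import subprocess


def build_snakemake_command(cores):
    """Assemble the Snakemake command line for a given number of cores."""
    if cores < 1:
        raise ValueError("cores must be a positive integer")
    return ["snakemake", "--use-conda", "--cores", str(cores)]


def run_workflow(cores, workdir="."):
    """Run Snakemake in the directory holding the Snakefile and config.yaml."""
    # check=True raises CalledProcessError if Snakemake exits with an error.
    subprocess.run(build_snakemake_command(cores), cwd=workdir, check=True)


print(" ".join(build_snakemake_command(4)))
```

Calling `run_workflow(4)` from the repository root would launch the same command that the printed line shows.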

Type: Snakemake

Creator: Tom Brown

Submitter: Tom Brown

DOI: 10.48546/workflowhub.workflow.506.2

Created: 16th Jun 2023 at 14:56, Last updated: 16th Mar 2024 at 07:49

MGnify genomes catalogue pipeline

MGnify

Stable

MGnify genomes catalogue pipeline

A pipeline to perform taxonomic and functional annotation and to generate a catalogue from a set of isolate and/or metagenome-assembled genomes (MAGs) using the workflow described in the following publication:

Gurbich TA, Almeida A, Beracochea M, Burdett T, Burgin J, Cochrane G, Raj S, Richardson L, Rogers AB, Sakharova E, Salazar GA and Finn RD. (2023) [MGnify Genomes: A Resource for Biome-specific Microbial Genome ...

Type: Nextflow

Creators: Ekaterina Sakharova, Tatiana Gurbich, Martin Beracochea

Submitter: Martin Beracochea

Created: 28th Apr 2023 at 10:36, Last updated: 3rd Dec 2024 at 15:41

ZARP: An automated workflow for processing of RNA-seq data

Zavolan Lab

ZARP

...

Type: Snakemake

Creator: Zavolan Lab

Submitter: Zavolan Lab

DOI: 10.48546/workflowhub.workflow.447.1

Created: 21st Mar 2023 at 13:07, Last updated: 12th May 2023 at 16:33

GRAVI: Gene Regulatory Analysis using Variable Inputs

Black Ochre Data Labs

Work-in-progress

GRAVI: Gene Regulatory Analysis using Variable Inputs

This is a snakemake workflow for:

Performing sample QC
Calling ChIP peaks
Performing Differential Binding Analysis
Comparing results across ChIP targets

The minimum required input is one ChIP target with two conditions.
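
The minimum input requirement can be checked before launching the workflow. The sketch below assumes a sample sheet held as a list of records with `target` and `condition` keys; these key names and the function are hypothetical and are not taken from GRAVI's own configuration format.

```python
from collections import defaultdict


def targets_meeting_minimum(samples):
    """Return ChIP targets that were profiled under at least two conditions."""
    conditions_by_target = defaultdict(set)
    for sample in samples:
        conditions_by_target[sample["target"]].add(sample["condition"])
    return sorted(
        target
        for target, conditions in conditions_by_target.items()
        if len(conditions) >= 2
    )


# Hypothetical sample sheet; the keys "target" and "condition" are illustrative.
samples = [
    {"sample": "s1", "target": "ER", "condition": "control"},
    {"sample": "s2", "target": "ER", "condition": "treated"},
    {"sample": "s3", "target": "H3K27ac", "condition": "control"},
]

print(targets_meeting_minimum(samples))
```

Here only `ER` qualifies, because `H3K27ac` appears under a single condition; an empty result would mean the sheet does not meet the stated minimum.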

Full documentation can be found here

Snakemake Implementation

The basic workflow is written in Snakemake, requiring at least v7.7, and can be called using the following ...

Type: Snakemake

Creator: Stevie Pederson

Submitter: Stevie Pederson

DOI: 10.48546/workflowhub.workflow.443.1

Created: 21st Mar 2023 at 05:48

GermlineStructuralV-nf

Sydney Informatics Hub, Australian BioCommons

GermlineStructuralV-nf is a pipeline for identifying structural variant events in human Illumina short read whole genome sequence data. GermlineStructuralV-nf identifies structural variant and copy number events from BAM files using Manta, Smoove, and TIDDIT. Variants are then merged using SURVIVOR, ...

Type: Nextflow

Creators: Georgina Samaha, Marina Kennerson, Tracy Chew, Sarah Beecroft

Submitter: Georgina Samaha

DOI: 10.48546/workflowhub.workflow.431.1

Created: 31st Jan 2023 at 23:40, Last updated: 18th Dec 2023 at 05:36

TronFlow BAM preprocessing pipeline

TRON gGmbH

Stable

TronFlow BAM preprocessing pipeline

Type: Nextflow

Creators: None

Submitter: Pablo Riesgo Ferreiro

Created: 17th Jan 2023 at 16:54

TronFlow alignment pipeline

TRON gGmbH

Stable

TronFlow alignment pipeline

Type: Nextflow

Creators: None

Submitter: Pablo Riesgo Ferreiro

Created: 17th Jan 2023 at 16:51

CoVigator pipeline: variant detection pipeline for Sars-CoV-2 (and other viruses...)

TRON gGmbH

Stable

CoVigator pipeline: variant detection pipeline for Sars-CoV-2

Type: Nextflow

Creators: Pablo Riesgo Ferreiro, Thomas Bukur, Patrick Sorn

Submitter: Pablo Riesgo Ferreiro

Created: 17th Jan 2023 at 15:06

IndexReferenceFasta-nf

Sydney Informatics Hub, Australian BioCommons

Stable

IndexReferenceFasta-nf

Description
Diagram
User guide
Benchmarking
Workflow summaries
Metadata
Component tools
Required (minimum) inputs/parameters
Additional notes
Help/FAQ/Troubleshooting
Acknowledgements/citations/credits ...

Type: Nextflow

Creator: Georgina Samaha

Submitter: Georgina Samaha

DOI: 10.48546/workflowhub.workflow.393.1

Created: 12th Oct 2022 at 03:34

Interactive Jupyter Notebooks for FAIR and reproducible biomolecular simulation workflows

Interactive Jupyter Notebooks in combination with Conda environments can be used to generate FAIR (Findable, Accessible, Interoperable and Reusable/Reproducible) biomolecular simulation workflows. The interactive programming code accompanied by documentation, and the possibility to inspect intermediate results with versatile graphical charts and data visualization is very helpful, especially in iterative processes, where parameters might be adjusted to a particular system of interest. This work ...

Maintainers: Genís Bayarri, Adam Hospital

Number of items: 17

Tags: Bioinformatics, BioBB

Created: 5th Mar 2024 at 09:29, Last updated: 5th Mar 2024 at 09:34

ERGA Assembly Galaxy ONT+Illumina & HiC Pipelines (Flye-HyPo + Purge_Dups + YaHS)

Collection of de-novo genome assembly workflows written for implementation in Galaxy

Input data should be Oxford Nanopore raw reads plus Illumina WGS reads and Illumina 3-dimensional Chromatin Conformation Capture (HiC) reads

Executing all workflows will output one scaffolded collapsed assembly and the complete QC analyses

Please run the workflows in order: WF0 (there are two, one for ONT, and another one for Illumina that can be used independently for the WGS and HiC reads), WF1, WF2, WF3, WF4

Maintainers: Diego De Panis

Number of items: 6

Tags: Assembly, Bioinformatics, Galaxy, Genomics, Genome assembly, ONT, illumina, Hi-C

Created: 8th Jan 2024 at 09:54, Last updated: 11th Mar 2024 at 12:42

ERGA Assembly Galaxy ONT+Illumina & HiC Pipelines (NextDenovo-HyPo + Purge_Dups + YaHS)

Collection of de-novo genome assembly workflows written for implementation in Galaxy

Input data should be Oxford Nanopore raw reads plus Illumina WGS reads and Illumina 3-dimensional Chromatin Conformation Capture (HiC) reads

Executing all workflows will output one scaffolded collapsed assembly and the complete QC analyses

Please run the workflows in order: WF0 (there are two, one for ONT, and another one for Illumina that can be used independently for the WGS and HiC reads), WF1, WF2, WF3, WF4

Maintainers: Diego De Panis

Number of items: 6

Tags: Assembly, Bioinformatics, Galaxy, Genomics, Genome assembly, ONT, illumina, Hi-C

Created: 8th Jan 2024 at 09:51, Last updated: 11th Mar 2024 at 14:45

ERGA Assembly Galaxy HiFi & HiC Pipelines (Hifiasm-HiC + Purge_Dups + YaHS)

Collection of de-novo genome assembly workflows written for implementation in Galaxy

Input data should be PacBio HiFi reads and Illumina 3-dimensional Chromatin Conformation Capture (HiC) reads

Executing all workflows will output two scaffolded haplotype assemblies and the complete QC analyses

Please run the workflows in order: WF0 (there are two, one for HiFi and one for Illumina HiC), WF1, WF2, WF3, WF4

Maintainers: Tom Brown, Diego De Panis

Number of items: 6

Tags: Assembly, Bioinformatics, Galaxy, Genomics, Genome assembly, HiFi, Hi-C

Created: 16th Jun 2023 at 15:07, Last updated: 20th Nov 2023 at 16:20
