

Assembly with Hifi reads and Trio Data

Generates a phased assembly from PacBio HiFi reads, using parental Illumina data for phasing.


  1. HiFi long reads [fastq]
  2. Concatenated Illumina reads : Paternal [fastq]
  3. Concatenated Illumina reads : Maternal [fastq]
  4. K-mer database [meryldb]
  5. Paternal hapmer database [meryldb]
  6. Maternal hapmer database [meryldb]
  7. Genome profile summary generated by Genomescope [txt]
  8. Bloom Filter
  9. Name of first haplotype
  10. Name of second haplotype ...

Type: Galaxy

Creator: Galaxy, VGP

Submitter: WorkflowHub Bot


Name: Matmul GPU Case 1 Cache-ON
Contact Person:
Access Level: public
License Agreement: Apache2
Platform: COMPSs
Machine: Minotauro-MN4

Matmul running on the GPU, leveraging the COMPSs GPU Cache for deserialization speedup. Launched using 32 GPUs (16 nodes). Performs C = A @ B, where:

  A: shape (320, 56_900_000), block_size (10, 11_380_000)
  B: shape (56_900_000, 10), block_size (11_380_000, 10)
  C: shape (320, 10), block_size ...

Type: COMPSs

Creators: Cristian Tatu, The Workflows and Distributed Computing Team (

Submitter: Cristian Tatu

DOI: 10.48546/workflowhub.workflow.798.1


Name: Matmul GPU Case 1 Cache-OFF
Contact Person:
Access Level: public
License Agreement: Apache2
Platform: COMPSs 3.3
Machine: Minotauro-MN4

Matmul running on the GPU without Cache. Launched using 32 GPUs (16 nodes). Performs C = A @ B, where:

  A: shape (320, 56_900_000), block_size (10, 11_380_000)
  B: shape (56_900_000, 10), block_size (11_380_000, 10)
  C: shape (320, 10), block_size (10, 10)

Total dataset size 291 ...

Type: COMPSs

Creators: Cristian Tatu, The Workflows and Distributed Computing Team (

Submitter: Cristian Tatu

DOI: 10.48546/workflowhub.workflow.797.1
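
The shapes and block sizes listed for the two Matmul entries above imply a fixed grid of blocks per array. The short sketch below works that arithmetic out. It uses only the numbers quoted in the listing (the block size of C is taken from the Cache-OFF entry); the helper name `block_grid` is hypothetical and is not part of COMPSs or dislib.

```python
def block_grid(shape, block_size):
    """Number of blocks along each dimension, using integer ceiling division."""
    return tuple(-(-n // b) for n, b in zip(shape, block_size))

# (shape, block_size) pairs as quoted in the Matmul entries.
arrays = {
    "A": ((320, 56_900_000), (10, 11_380_000)),
    "B": ((56_900_000, 10), (11_380_000, 10)),
    "C": ((320, 10), (10, 10)),
}

for name, (shape, block_size) in arrays.items():
    rows, cols = block_grid(shape, block_size)
    print(f"{name}: {rows} x {cols} blocks ({rows * cols} in total)")
```

Under these numbers A splits into a 32 x 5 grid, B into 5 x 1 and C into 32 x 1. The 32 block rows of A and C match the 32 GPUs mentioned in the entries, although the listing does not say how blocks are mapped to devices.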


Name: K-Means GPU Cache OFF
Contact Person:
Access Level: public
License Agreement: Apache2
Platform: COMPSs
Machine: Minotauro-MN4

K-Means running on GPUs. Launched using 32 GPUs (16 nodes). Parameters used: K=40 and 32 blocks of size (1_000_000, 1200). It creates a block for each GPU. Total dataset shape is (32_000_000, 1200). Version dislib-0.9

Average task execution time: 194 seconds

Type: COMPSs

Creators: Cristian Tatu, The Workflows and Distributed Computing Team (

Submitter: Cristian Tatu

DOI: 10.48546/workflowhub.workflow.799.1


Name: K-Means GPU Cache ON
Contact Person:
Access Level: public
License Agreement: Apache2
Platform: COMPSs
Machine: Minotauro-MN4

K-Means running on the GPU leveraging COMPSs GPU Cache for deserialization speedup. Launched using 32 GPUs (16 nodes). Parameters used: K=40 and 32 blocks of size (1_000_000, 1200). It creates a block for each GPU. Total dataset shape is (32_000_000, 1200). Version dislib-0.9

Average task execution time: 16 seconds

Type: COMPSs

Creators: Cristian Tatu, The Workflows and Distributed Computing Team (

Submitter: Cristian Tatu

DOI: 10.48546/workflowhub.workflow.800.1
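
The K-Means entries above state that the dataset is split into 32 blocks, one per GPU. As a rough illustration of how one K-Means iteration can be organised over blocks, the sketch below computes per-block partial sums and then merges them into new centers. This is a plain-Python sketch under assumptions, not the dislib or COMPSs implementation; `partial_sums` and `kmeans_step` are hypothetical names, and the toy data is made up purely for the example.

```python
def partial_sums(block, centers):
    """Per-block work: sum of points and point count for each nearest center."""
    k, dim = len(centers), len(centers[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for point in block:
        # Nearest center by squared Euclidean distance.
        nearest = min(
            range(k),
            key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centers[c])),
        )
        counts[nearest] += 1
        for d in range(dim):
            sums[nearest][d] += point[d]
    return sums, counts


def kmeans_step(blocks, centers):
    """One iteration: process each block independently, then merge the results."""
    k, dim = len(centers), len(centers[0])
    total_sums = [[0.0] * dim for _ in range(k)]
    total_counts = [0] * k
    # In a task-based runtime, each loop iteration could run as a separate task.
    for block in blocks:
        sums, counts = partial_sums(block, centers)
        for c in range(k):
            total_counts[c] += counts[c]
            for d in range(dim):
                total_sums[c][d] += sums[c][d]
    # A center with no assigned points keeps its previous position.
    return [
        [s / total_counts[c] for s in total_sums[c]] if total_counts[c] else list(centers[c])
        for c in range(k)
    ]


# Toy data: two blocks of two 2-D points each, and two starting centers.
blocks = [[[0.0, 0.0], [0.0, 2.0]], [[10.0, 10.0], [10.0, 12.0]]]
centers = [[1.0, 1.0], [9.0, 9.0]]
new_centers = kmeans_step(blocks, centers)
print(new_centers)
```

Because each block only contributes sums and counts, the per-block work is independent and the merge step is small, which is why a one-block-per-GPU layout is a natural fit for this algorithm.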


Name: Dislib Distributed Training - Cache ON
Contact Person:
Access Level: public
License Agreement: Apache2
Platform: COMPSs
Machine: Minotauro-MN4

PyTorch distributed training of a CNN on GPUs, leveraging the COMPSs GPU Cache for deserialization speedup. Launched using 32 GPUs (16 nodes). Dataset: ImageNet. Versions: dislib-0.9, PyTorch 1.7.1+cu101.

Average task execution time: 36 seconds

Type: COMPSs

Creators: Cristian Tatu, The Workflows and Distributed Computing Team (

Submitter: Cristian Tatu

DOI: 10.48546/workflowhub.workflow.802.1


Name: Dislib Distributed Training - Cache OFF
Contact Person:
Access Level: public
License Agreement: Apache2
Platform: COMPSs
Machine: Minotauro-MN4

PyTorch distributed training of a CNN on GPUs. Launched using 32 GPUs (16 nodes). Dataset: ImageNet. Versions: dislib-0.9, PyTorch 1.7.1+cu101.

Average task execution time: 84 seconds

Type: COMPSs

Creators: Cristian Tatu, The Workflows and Distributed Computing Team (

Submitter: Cristian Tatu

DOI: 10.48546/workflowhub.workflow.801.1
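
The average task execution times listed in the K-Means and distributed training entries above allow a rough cache-on versus cache-off comparison. The sketch below only divides the figures quoted in those entries. The `speedup` helper is a hypothetical name, and the ratio describes average task time only, not end-to-end runtime.

```python
# Average task execution times in seconds, as quoted in the entries above.
avg_task_seconds = {
    "K-Means": {"cache_off": 194, "cache_on": 16},
    "Distributed training": {"cache_off": 84, "cache_on": 36},
}


def speedup(cache_off, cache_on):
    """Ratio of the cache-off average task time to the cache-on average task time."""
    return cache_off / cache_on


for name, times in avg_task_seconds.items():
    print(f"{name}: {speedup(**times):.1f}x shorter average task time with the cache on")
```

With these figures the ratio is roughly 12x for K-Means and roughly 2.3x for distributed training.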


HiC scaffolding pipeline

Snakemake pipeline for scaffolding a genome with HiC reads using yahs.


This pipeline has been tested using Snakemake v7.32.4 and requires conda to install the required tools. To run the pipeline, use the command:

snakemake --use-conda --cores N

where N is the number of cores to use. A set of configuration and run scripts is provided for execution on a Slurm queueing system. After configuring the cluster.json file, run:

./run_cluster ...

Type: Snakemake

Creator: Tom Brown

Submitter: Tom Brown

DOI: 10.48546/workflowhub.workflow.796.1
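
For scripted runs, the local command quoted in this entry can be assembled programmatically. The sketch below builds the same `snakemake --use-conda --cores N` invocation and defaults N to the number of CPUs reported by the operating system. The helper name `snakemake_command` is hypothetical, and the default for N is an assumption made for this example rather than something the pipeline prescribes.

```python
import os
import shlex


def snakemake_command(cores=None):
    """Build the argument list for: snakemake --use-conda --cores N."""
    # Assumption: fall back to the machine's CPU count when N is not given.
    n = cores if cores is not None else (os.cpu_count() or 1)
    return ["snakemake", "--use-conda", "--cores", str(n)]


cmd = snakemake_command(cores=8)
print(shlex.join(cmd))
# To launch it from the pipeline directory, one option is:
#   import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems if it is later passed to `subprocess.run`.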

Purge dups

This Snakemake pipeline takes a contig-level genome assembly and PacBio reads as input. It has been tested with Snakemake v7.32.4. The raw long-read sequencing files and the input contig genome assembly must be given in the config.yaml file. To execute the workflow, run:

snakemake --use-conda --cores N

Alternatively, configure cluster.json and run the workflow with the ./run_cluster command.

Type: Snakemake

Creator: Tom Brown

Submitter: Tom Brown

DOI: 10.48546/workflowhub.workflow.506.2


HiC contact map generation

Snakemake pipeline for generating .pretext and .mcool files for visualising HiC contact maps with PretextView and HiGlass, respectively.


This pipeline has been tested using Snakemake v7.32.4 and requires conda to install the required tools. To run the pipeline, use the command:

snakemake --use-conda

A set of configuration and run scripts is provided for execution on a Slurm queueing system. After configuring ...

Type: Snakemake

Creator: Tom Brown

Submitter: Tom Brown

DOI: 10.48546/workflowhub.workflow.795.2
