HiC scaffolding pipeline

Snakemake pipeline for scaffolding a genome assembly with HiC reads using YaHS.

Prerequisites

This pipeline has been tested with Snakemake v7.32.4 and requires conda to install the required tools. To run the pipeline, use the command:

snakemake --use-conda --cores N

where N is the number of cores to use. A set of configuration and run scripts is provided for execution on a Slurm queueing system. After configuring the cluster.json file, run:

./run_cluster

Before starting

You need to create a temporary folder and specify its path in the config.yaml file. The folder must be able to hold the temporary files created when sorting the .pairsam file (hundreds of GB, or even many TB).
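Because the sort step can fill a disk, it may be worth checking the free space on the temporary folder before launching. The following is a minimal sketch, not part of the pipeline: the function name `has_free_space`, the example path, and the 500 GB threshold are all assumptions chosen for illustration.

```python
import shutil


def has_free_space(path, required_gb):
    """Return True if the filesystem holding `path` has at least `required_gb` GiB free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1024**3


if __name__ == "__main__":
    # Hypothetical values: replace with the temporary folder set in config.yaml
    # and a threshold suited to the size of your HiC data.
    tmp_dir = "."
    threshold_gb = 500
    if not has_free_space(tmp_dir, threshold_gb):
        print(f"Warning: less than {threshold_gb} GiB free in {tmp_dir}")
```

The right threshold depends on the volume of HiC reads, so treat the number above as a placeholder.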

The path to the genome assembly must be given in the config.yaml file.

The HiC reads should be paired and named as follows: Library_1.fastq.gz and Library_2.fastq.gz. The pipeline accepts any number of paired HiC read files, but the naming must be consistent. The folder containing these files must be given in the config.yaml file.
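The naming convention above can be checked before a run with a short script. This is a sketch under the assumption that every library follows the `<Library>_1.fastq.gz` / `<Library>_2.fastq.gz` pattern; the function `find_read_pairs` is hypothetical and is not taken from the pipeline itself.

```python
from pathlib import Path

R1_SUFFIX = "_1.fastq.gz"
R2_SUFFIX = "_2.fastq.gz"


def find_read_pairs(reads_dir):
    """Map each library name to its (R1, R2) paths; raise if any mate is missing."""
    reads_dir = Path(reads_dir)
    pairs = {}
    unpaired = []

    # Every forward file must have a matching reverse file.
    for r1 in sorted(reads_dir.glob("*" + R1_SUFFIX)):
        library = r1.name[: -len(R1_SUFFIX)]
        r2 = reads_dir / (library + R2_SUFFIX)
        if r2.exists():
            pairs[library] = (r1, r2)
        else:
            unpaired.append(r1.name)

    # Every reverse file must have a matching forward file.
    for r2 in sorted(reads_dir.glob("*" + R2_SUFFIX)):
        library = r2.name[: -len(R2_SUFFIX)]
        if library not in pairs:
            unpaired.append(r2.name)

    if unpaired:
        raise ValueError("Files without a mate: " + ", ".join(unpaired))
    return pairs
```

Calling `find_read_pairs` on the reads folder returns one entry per library, or raises an error listing the files that have no mate.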

Version History

Version 2 (latest) Created 21st Jun 2024 at 10:42 by Tom Brown

Add cluster json for execution on slurm

Frozen Version-2 efc9e4b

Version 1 (earliest) Created 16th Mar 2024 at 09:01 by Tom Brown

Initial commit

Frozen Version-1 cd486a3
