Workflows
This workflow takes as input a collection of single-read fastqs. Adapters are removed with cutadapt and reads are mapped with bowtie2. Only reads with a MAPQ of at least 30 are kept. Peaks are called with MACS2 on the BAM, using either a fixed extension or a model.
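As an illustration of the MAPQ criterion only (not the workflow's own filtering step, which is not shown in this listing), a minimal Python sketch could filter SAM-formatted lines on the mapping-quality column. The function name `keep_high_mapq` and the example records are hypothetical.

```python
def keep_high_mapq(sam_lines, min_mapq=30):
    """Yield SAM header lines and alignment records with MAPQ >= min_mapq."""
    for line in sam_lines:
        if line.startswith("@"):
            # Header lines pass through unchanged.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        # Column 5 of a SAM record (index 4) holds the mapping quality.
        if int(fields[4]) >= min_mapq:
            yield line


# Hypothetical records: one header line, one read above and one below the cutoff.
example = [
    "@SQ\tSN:chr1\tLN:1000",
    "read1\t0\tchr1\t100\t42\t4M\t*\t0\t0\tACGT\tIIII",
    "read2\t0\tchr1\t200\t12\t4M\t*\t0\t0\tACGT\tIIII",
]
kept = list(keep_high_mapq(example))
```

Here `kept` holds the header line and `read1`; `read2` is dropped because its MAPQ of 12 is below the cutoff.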
A workflow for the analysis of pox virus genomes sequenced as half-genomes (for ITR resolution) in a tiled-amplicon approach
Virtual screening of the SARS-CoV-2 main protease with rDock and pose scoring
This workflow takes as input a collection of paired fastqs. Adapters are removed with cutadapt and pairs are mapped with bowtie2, allowing dovetailing. Only concordant pairs with a MAPQ of at least 30 are kept. The BAM is converted to BED and peaks are called with MACS2 using "ATAC" parameters.
This workflow takes as input a collection of paired fastqs. It uses HiCUP to go from fastq to a validPair file. The pairs are filtered by MAPQ and sorted by cooler to generate a tabix dataset. Cooler is then used to generate a balanced cool file at the desired resolution.
This workflow takes as input a collection of paired fastqs. It removes low-quality bases and adapters with cutadapt, then maps with Bowtie2 in end-to-end mode. It removes reads on MT, non-concordant pairs, pairs with mapping quality below 30, and PCR duplicates. It computes the pile-up on the 5' end +- 100 bp, calls peaks, and counts the number of reads falling in the 1 kb region centered on each summit. Finally, it plots the number of reads for each fragment length.
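The summit-centered counting step can be pictured with a small Python sketch. It assumes 0-based, half-open (BED-style) coordinates and counts a read when it overlaps the window by at least one base. Both choices, and the names `summit_window` and `count_overlapping`, are assumptions made for illustration rather than details of the workflow.

```python
def summit_window(summit, width=1000):
    """Return a (start, end) window of `width` bp centered on `summit`."""
    half = width // 2
    # Clamp at 0 so windows near the start of a chromosome stay valid.
    return max(0, summit - half), summit + half


def count_overlapping(reads, window):
    """Count (start, end) intervals that overlap the window by >= 1 bp."""
    w_start, w_end = window
    return sum(1 for r_start, r_end in reads if r_start < w_end and r_end > w_start)


# Hypothetical summit and read intervals on a single chromosome.
window = summit_window(5000)
reads = [(4400, 4450), (4490, 4520), (5000, 5050), (5499, 5550), (5500, 5600)]
n_reads = count_overlapping(reads, window)
```

With these inputs the window is `(4500, 5500)` and three of the five intervals overlap it.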
This workflow takes as input a list of single-read fastqs. Adapters and bad quality bases are removed with cutadapt. Reads are mapped with STAR with ENCODE parameters and genes are counted simultaneously. The counts are reprocessed to be similar to HTSeq-count output. FPKM values are computed with cufflinks. Coverage (per million mapped reads) is computed with bedtools on uniquely mapped reads.
This workflow takes as input a list of paired-end fastqs. Adapters and bad quality bases are removed with cutadapt. Reads are mapped with STAR with ENCODE parameters and genes are counted simultaneously. The counts are reprocessed to be similar to HTSeq-count output. FPKM values are computed with cufflinks. Coverage (per million mapped reads) is computed with bedtools on uniquely mapped reads (with R2 orientation inverted).
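The "per million mapped reads" normalization amounts to multiplying raw depth by 1,000,000 divided by the number of uniquely mapped reads. A minimal Python sketch, assuming bedGraph-like rows of `(chrom, start, end, depth)`; the function names and example numbers are hypothetical and do not come from the workflow:

```python
def per_million_scale(n_uniquely_mapped):
    """Return the factor that converts raw depth to depth per million reads."""
    if n_uniquely_mapped <= 0:
        raise ValueError("number of uniquely mapped reads must be positive")
    return 1_000_000 / n_uniquely_mapped


def normalize_coverage(rows, n_uniquely_mapped):
    """Scale the depth column of (chrom, start, end, depth) rows."""
    scale = per_million_scale(n_uniquely_mapped)
    return [(chrom, start, end, depth * scale) for chrom, start, end, depth in rows]


# Hypothetical library with 2 million uniquely mapped reads.
raw = [("chr1", 0, 100, 4), ("chr1", 100, 200, 10)]
normalized = normalize_coverage(raw, 2_000_000)
```

With 2 million uniquely mapped reads the scale factor is 0.5, so raw depths of 4 and 10 become 2.0 and 5.0.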
COVID-19: consensus construction
This workflow aims at generating reliable consensus sequences from variant calls according to transparent criteria that capture at least some of the complexity of variant calling.
It takes a collection of VCFs (with DP and DP4 INFO fields) and a collection of the corresponding aligned reads (for the purpose of calculating genome-wide coverage), such as those produced by any of the variant calling workflows in ...
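To show how the DP and DP4 INFO fields can feed a consensus decision, here is a minimal Python sketch. DP4 is read as the counts of ref-forward, ref-reverse, alt-forward and alt-reverse bases. The thresholds `min_depth=10` and `min_fraction=0.7` and all function names are hypothetical placeholders, not the workflow's actual criteria.

```python
def parse_info(info):
    """Split a VCF INFO string such as 'DP=100;DP4=10,5,50,35' into a dict."""
    fields = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        fields[key] = value
    return fields


def alt_fraction(info):
    """Fraction of DP4 bases that support the alternate allele."""
    ref_fwd, ref_rev, alt_fwd, alt_rev = (
        int(x) for x in parse_info(info)["DP4"].split(",")
    )
    total = ref_fwd + ref_rev + alt_fwd + alt_rev
    return (alt_fwd + alt_rev) / total if total else 0.0


def is_consensus_candidate(info, min_depth=10, min_fraction=0.7):
    """Apply hypothetical depth and allele-fraction cutoffs (assumed values)."""
    depth = int(parse_info(info)["DP"])
    return depth >= min_depth and alt_fraction(info) >= min_fraction


# Hypothetical INFO strings: one well-supported call, one with low depth.
well_supported = is_consensus_candidate("DP=100;DP4=10,5,50,35")
low_depth = is_consensus_candidate("DP=8;DP4=0,0,4,4")
```

The first call has 85 of 100 bases supporting the alternate allele and passes both placeholder cutoffs; the second is rejected on depth alone.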
ChIP-seq paired-end Workflow
Input datasets
- The workflow needs a single input, which is a list of dataset pairs in fastqsanger format.
Input values
- adapters sequences: this depends on the library preparation. If you don't know, use FastQC to determine whether it is TruSeq or Nextera.
- reference_genome: this field will be adapted to the genomes available for bowtie2.
- effective_genome_size: this is used by MACS2 and may be entered manually (indications are provided for heavily used genomes).
...