ONTViSc (ONT-based Viral Screening for Biosecurity)

Introduction

eresearchqut/ontvisc is a Nextflow-based bioinformatics pipeline designed to support the diagnosis of viral and viroid pathogens for biosecurity. It takes as input fastq files generated by either amplicon or whole-genome sequencing on Oxford Nanopore Technologies platforms.

The pipeline can: 1) perform a direct search on the sequenced reads, 2) cluster the reads, 3) assemble the reads into longer contigs, or 4) map reads directly to a known reference.

Reads originating from a plant host can optionally be filtered out before downstream analysis.

Pipeline overview

Data quality check (QC) and preprocessing
- Merge fastq files (fastcat, optional)
- Raw fastq file QC (NanoPlot)
- Trim adaptors (Porechop ABI, optional)
- Filter reads based on length and/or quality (Chopper, optional)
- Reformat fastq files so read names are trimmed after the first whitespace (BBMap)
- Processed fastq file QC (NanoPlot; only if Porechop ABI and/or Chopper is run)
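
The read-name trimming and length/quality filtering steps above can be illustrated with a minimal Python sketch. This is not how the pipeline implements them (it delegates to BBMap and Chopper); the function names, the four-line fastq assumption, and the definition of mean quality (averaging per-base error probabilities rather than raw Phred scores) are assumptions made here for illustration only.

```python
import math
from typing import Iterable, Iterator, NamedTuple


class FastqRecord(NamedTuple):
    name: str      # header line without the leading '@'
    sequence: str
    quality: str   # Phred+33 encoded quality string


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield records from four-line fastq text (assumes unwrapped sequences)."""
    stripped = (line.rstrip("\n") for line in lines if line.strip())
    # Consume the same iterator four times to group lines into records.
    for header, seq, _plus, qual in zip(*[stripped] * 4):
        yield FastqRecord(header[1:], seq, qual)


def trim_name(record: FastqRecord) -> FastqRecord:
    """Keep only the part of the read name before the first whitespace."""
    return record._replace(name=record.name.split(maxsplit=1)[0])


def mean_quality(quality: str) -> float:
    """Mean Phred quality computed via the mean error probability (assumed definition)."""
    if not quality:
        return 0.0
    probs = [10 ** (-(ord(char) - 33) / 10) for char in quality]
    return -10 * math.log10(sum(probs) / len(probs))


def filter_reads(records: Iterable[FastqRecord],
                 min_length: int = 0,
                 min_quality: float = 0.0) -> Iterator[FastqRecord]:
    """Yield reads that meet both the length and the mean-quality threshold."""
    for rec in records:
        if len(rec.sequence) >= min_length and mean_quality(rec.quality) >= min_quality:
            yield rec
```

For example, `filter_reads(map(trim_name, parse_fastq(handle)), min_length=500, min_quality=10)` would stream reads with shortened names that pass both hypothetical thresholds.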
Host read filtering
- Align reads to host reference provided (Minimap2)
- Extract reads that do not align for downstream analysis (seqtk)
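
As a rough illustration of the host filtering logic, the sketch below selects read names that have no alignment to the host. It assumes the alignments are available as PAF text, where column 1 is the query (read) name and column 6 is the target name; the pipeline itself may work from a different alignment format and uses seqtk for the extraction, so the function names and input layout here are hypothetical.

```python
from typing import Iterable, List, Set


def mapped_read_names(paf_lines: Iterable[str]) -> Set[str]:
    """Collect names of reads that have at least one alignment in PAF text."""
    names: Set[str] = set()
    for line in paf_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        # A target name of '*' marks a read reported without a hit.
        if fields[5] != "*":
            names.add(fields[0])
    return names


def unaligned_reads(read_names: Iterable[str], paf_lines: Iterable[str]) -> List[str]:
    """Return read names absent from the host alignments, in input order."""
    mapped = mapped_read_names(paf_lines)
    return [name for name in read_names if name not in mapped]
```

The resulting list of names is the kind of input a subsetting tool needs to pull the non-host reads out of the fastq file.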
QC report
- Derive read counts recovered before and after data processing and after host filtering
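
A QC summary of this kind boils down to a few counts and percentages. The following sketch shows one possible layout; the field names and the choice to express percentages relative to the raw read count are assumptions, not the pipeline's actual report format.

```python
from typing import Dict, Union


def read_count_summary(raw: int, processed: int, host_filtered: int) -> Dict[str, Union[int, float]]:
    """Summarise read counts at each stage, with percentages of the raw total."""

    def percent(count: int) -> float:
        # Guard against an empty input file.
        return round(100 * count / raw, 2) if raw else 0.0

    return {
        "raw_reads": raw,
        "processed_reads": processed,
        "percent_processed": percent(processed),
        "host_filtered_reads": host_filtered,
        "percent_host_filtered": percent(host_filtered),
    }
```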
Read classification analysis mode
Clustering mode
- Read clustering (RATTLE)
- Convert fastq to fasta format (seqtk)
- Cluster scaffolding (CAP3)
- Megablast homology search against NCBI or custom database (BLAST)
- Derive top candidate viral hits
- Align reads back to top reference and derive coverage statistics (mosdepth and CoverM)
De novo assembly mode
- De novo assembly (Canu or Flye)
- Megablast homology search against NCBI or custom database or reference (BLAST)
- Derive top candidate viral hits
- Align reads back to top reference and derive coverage statistics (mosdepth and CoverM)
Read classification mode
- Option 1: Nucleotide-based taxonomic classification of reads (Kraken2, Bracken)
- Option 2: Protein-based taxonomic classification of reads (Kaiju, Krona)
- Option 3: Convert fastq to fasta format (seqtk) and perform direct homology search using megablast (BLAST)
Map to reference mode
- Align reads to reference fasta file (Minimap2) and derive a BAM file and alignment statistics (Samtools)
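
The coverage statistics mentioned in the clustering and de novo assembly modes can be understood through a small sketch. It computes per-base depth from alignment intervals and then derives mean depth and breadth of coverage. The pipeline relies on mosdepth and CoverM for this; the functions below, the 0-based half-open interval convention, and the default depth threshold are assumptions used only to make the two quantities concrete.

```python
from typing import Dict, Iterable, List, Tuple


def depth_from_alignments(ref_length: int,
                          intervals: Iterable[Tuple[int, int]]) -> List[int]:
    """Per-base depth from 0-based, half-open (start, end) alignment intervals."""
    diff = [0] * (ref_length + 1)
    for start, end in intervals:
        # Clamp to the reference bounds, then mark where coverage rises and falls.
        start, end = max(0, start), min(ref_length, end)
        if start < end:
            diff[start] += 1
            diff[end] -= 1
    depths, running = [], 0
    for change in diff[:ref_length]:
        running += change
        depths.append(running)
    return depths


def coverage_stats(depths: List[int], min_depth: int = 1) -> Dict[str, float]:
    """Mean depth and fraction of positions covered at or above min_depth."""
    if not depths:
        return {"mean_depth": 0.0, "breadth": 0.0}
    covered = sum(1 for depth in depths if depth >= min_depth)
    return {
        "mean_depth": sum(depths) / len(depths),
        "breadth": covered / len(depths),
    }
```

A candidate reference with high mean depth but low breadth is covered unevenly, which is one reason both values are useful when judging a viral hit.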

Code and detailed instructions can be found in the eresearchqut/ontvisc repository. A comprehensive, step-by-step guide to setting up and running the ONTViSc pipeline on three high-performance computing systems hosted by Australian research and computing facilities, namely Lyra (Queensland University of Technology), Gadi (National Computational Infrastructure), and Setonix (Pawsey), using the Australian Nextflow Seqera Service, can be found here.

Authors

Marie-Emilie Gauthier
Craig Windell
Magdalena Antczak
Roberto Barrero

Version History

main @ 2274c83 (latest) Created 18th Dec 2024 at 04:26 by Magdalena Antczak

update pipeline figure

Frozen main 2274c83

v1.3 Created 18th Dec 2024 at 04:23 by Magdalena Antczak

update test command

Frozen v1.3 049bd72

main @ d333445 (earliest) Created 4th Dec 2023 at 01:42 by Magdalena Antczak

update conditions in preprocessing steps

Frozen main d333445
