Workflow Type: Python
            Work-in-progress
        
        
GALOP - Genome Assembly using Long reads Pipeline
This repository contains an exact copy of the standard Genoscope long reads assembly pipeline.
At the moment, it is not intended for download by external users, because it relies on grid submission commands that only work at Genoscope. We intend to make the pipeline available to a broader audience over time. In the meantime, the genome assembly and polishing commands can be read in the lib/assembly.py and lib/polishing.py files.
galop.py -h
Mandatory arguments:
  --step {assembly,polishing}
                        Defines if the program will launch assembly or polishing scripts (default: None)
Assembly step arguments:
  --proj PROJECT_CODE, -p PROJECT_CODE
                        Project and material codes, can be given multiple times (eg. -p BCM,A,B -p BWW,AB)
                        (default: None)
  -i INPUT_FILE         Nanopore reads fastq file (default: )
  --size GENOME_SIZE, -s GENOME_SIZE
                        Estimated size of the genome in Mb (default: None)
  --cov READSET_COVERAGE, -c READSET_COVERAGE
                        Coverage to use for longest and filtlong subsets (default: 30)
  --assemblers ASSEMBLER_LIST
                        Comma-separated list of assemblers to use (e.g. '--assemblers
                        Smartdenovo,Raven,Wtdbg2' will not launch Flye nor Necat). Choices: Flye, Hifiasm, Necat,
                        Nextdenovo, Raven, Shasta, Smartdenovo, Wtdbg2 (default:
                        Smartdenovo,Wtdbg2,Flye,Necat,Nextdenovo)
  --readsets READSET_LIST
                        Comma-separated list of readsets to use (e.g. '--readsets Filtlong,Longest' will not
                        launch assemblies with all reads) (default: Full,Filtlong,Longest)
  --no-readset          Disables readset creation (default: False)
  --all-readsets        Disables the use of lsRunProj to check for readset validity and instead use all available
                        readsets (default: False)
  --force               Skips directory creation (default: False)
  --nano-raw            Use --nano-raw instead of --nano-hq in Flye (default: False)
  --pacbio              Look for PacBio runs when building readsets. (default: False)
Polishing step arguments:
  --model MEDAKA_MODEL, -m MEDAKA_MODEL
                        Model to use for medaka polishing (default: r941_prom_sup_g507)
  --pe1 PE1_PATH        Path to the Illumina R1 file (.gz or .fastq) (default: None)
  --pe2 PE2_PATH        Path to the Illumina R2 file (.gz or .fastq) (default: None)
  --assembly ASSEMBLY, -a ASSEMBLY
                        FULL PATH to the assembly to polish (default: )
  --assembly_dir ASSEMBLY_DIR
                        FULL PATH to the directory output of the 'nanopore_assembly_pipeline --step assembly'
                        (default: )
  --racon               Enables the racon step (default: False)
  --no_medaka           Skip the medaka step (default: False)
Optional arguments:
  --dir OUTPUT_DIRECTORY, -d OUTPUT_DIRECTORY
                        Output directory (default: None)
  --help, -h            Show this help message and exit
Submission arguments:
  --submode {msub,local}
                        Either submit using ccc_msub or run in local mode (default: msub)
  --nolaunch            Creates submission scripts but does not launch them (default: False)
  --account ACCOUNT     Account to use for submission (default: bistace)
  --qos {long,week,nolimit,xlarge,xxlarge}
                        QoS to use for submission (default: )
  --assembly_queue {normal,xlarge,small,broadwell,xxlarge}
                        Cluster queue to use for the assembly step (default: normal)
  --assembly_core ASSEMBLY_CORE_NUMBER
                        Number of cores to use for the assembly step (default: 36)
  --polishing_queue {normal,xlarge,small,broadwell,xxlarge}
                        Cluster queue to use for the polishing step (default: normal)
  --polishing_core POLISHING_CORE_NUMBER
                        Number of cores to use for the polishing step (default: 36)
  --wait                Wait for all jobs to finish before exiting (default: False)
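To illustrate how the options above fit together, here is a minimal Python sketch that builds argument lists for the two steps. It only uses flags listed in the help output. The function names, file paths, and the chosen combination of options are assumptions made for illustration, and it has not been checked against the pipeline itself, which currently relies on Genoscope-specific infrastructure. By default the sketch adds --nolaunch, so submission scripts would be created but not launched.

```python
import shlex


def build_assembly_command(reads_fastq, genome_size_mb, output_dir,
                           assemblers=("Flye", "Raven"), coverage=30,
                           launch=False):
    """Build a galop.py argument list for the assembly step (hypothetical helper)."""
    cmd = [
        "galop.py",
        "--step", "assembly",
        "-i", reads_fastq,                      # Nanopore reads fastq file
        "--size", str(genome_size_mb),          # estimated genome size in Mb
        "--cov", str(coverage),                 # coverage for Longest/Filtlong subsets
        "--assemblers", ",".join(assemblers),   # comma-separated assembler list
        "--dir", output_dir,
        "--submode", "local",                   # run locally instead of ccc_msub
    ]
    if not launch:
        cmd.append("--nolaunch")                # only write the submission scripts
    return cmd


def build_polishing_command(assembly_path, pe1, pe2, output_dir,
                            racon=False, launch=False):
    """Build a galop.py argument list for the polishing step (hypothetical helper)."""
    cmd = [
        "galop.py",
        "--step", "polishing",
        "--assembly", assembly_path,            # the help text asks for a FULL PATH
        "--pe1", pe1,                           # Illumina R1 file
        "--pe2", pe2,                           # Illumina R2 file
        "--dir", output_dir,
        "--submode", "local",
    ]
    if racon:
        cmd.append("--racon")                   # enable the optional racon step
    if not launch:
        cmd.append("--nolaunch")
    return cmd


if __name__ == "__main__":
    # All paths below are placeholders, not files shipped with the repository.
    assembly_cmd = build_assembly_command("/path/to/reads.fastq", 120, "/path/to/out")
    polishing_cmd = build_polishing_command(
        "/path/to/out/assembly.fasta",
        "/path/to/R1.fastq.gz",
        "/path/to/R2.fastq.gz",
        "/path/to/polish",
        racon=True,
    )
    print(shlex.join(assembly_cmd))
    print(shlex.join(polishing_cmd))
```

The resulting lists could be passed to subprocess.run once the pipeline is usable outside Genoscope; printing them first makes it easy to review the command before anything is submitted.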
Version History
main @ a1c22db (latest), created 14th Nov 2024 at 06:55 by Benjamin Istace: Update README.md (Frozen)
main @ aa63fa8 (earliest), created 12th Nov 2024 at 07:37 by Benjamin Istace: Add files via upload (Frozen)
    Citation
  
  
  Istace, B., Aury, J.-M., & Belser, C. (2024). GALOP - Genome Assembly using Long reads Pipeline. WorkflowHub. https://doi.org/10.48546/WORKFLOWHUB.WORKFLOW.1200.2
    Activity
Views: 2375 Downloads: 712
Created: 12th Nov 2024 at 07:37
Last updated: 14th Nov 2024 at 06:55

 https://orcid.org/0000-0003-1718-3010



