Workflow Type:  Galaxy
Stable
Secondary metabolite biosynthetic gene cluster (SMBGC) Annotation using Neural Networks Trained on Interpro Signatures
Associated Tutorial
This workflow is part of the tutorial Marine Omics identifying biosynthetic gene clusters, available in the GTN.
Features
- Includes Galaxy Workflow Tests
- Includes a Galaxy Workflow Report
- Uses Galaxy Workflow Comments
Thanks to...
Workflow Author(s): Marie Jossé
Tutorial Author(s): Marie Jossé
Tutorial Contributor(s): Björn Grüning, Saskia Hiltemann
Grant(s): Fair-Ease, EuroScienceGateway
Inputs
| ID | Name | Description | Type |
|---|---|---|---|
| Fasta nucelotide file | Fasta nucelotide file | BGC0001472.fna | |
Steps
| ID | Name | Description | Tool |
|---|---|---|---|
| 1 | Prodigal Gene Predictor | Create the protein FASTA file | toolshed.g2.bx.psu.edu/repos/iuc/prodigal/prodigal/2.6.3+galaxy0 |
| 2 | Sanntis: Build Genbank | Build the GenBank file with Sanntis | toolshed.g2.bx.psu.edu/repos/ecology/sanntis_marine/sanntis_marine/0.9.3.5+galaxy1 |
| 3 | Regex Find And Replace | Remove stray `*` characters from the protein FASTA file | toolshed.g2.bx.psu.edu/repos/galaxyp/regex_find_replace/regex1/1.0.3 |
| 4 | InterProScan | Create the TSV file for Sanntis | toolshed.g2.bx.psu.edu/repos/bgruening/interproscan/interproscan/5.59-91.0+galaxy3 |
| 5 | Sanntis: identify biosynthetic gene clusters | | toolshed.g2.bx.psu.edu/repos/ecology/sanntis_marine/sanntis_marine/0.9.3.5+galaxy1 |
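Step 3 can be illustrated outside Galaxy with a few lines of Python. This is only a minimal sketch: the source does not give the pattern configured in the Regex Find And Replace tool, so the logic below is an assumption, and `strip_stop_asterisks` is a hypothetical helper name. It assumes the `*` characters are the stop-codon markers that Prodigal writes into its protein translations, and it removes them from sequence lines only, leaving header lines untouched.

```python
def strip_stop_asterisks(fasta_text: str) -> str:
    """Remove '*' from sequence lines of a protein FASTA string.

    Header lines (starting with '>') are kept unchanged.
    Hypothetical helper; not the workflow's actual regex configuration.
    """
    cleaned = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # Header line: keep as-is so record identifiers stay intact.
            cleaned.append(line)
        else:
            # Sequence line: drop every asterisk.
            cleaned.append(line.replace("*", ""))
    return "\n".join(cleaned) + "\n"


# Made-up example records, for illustration only.
example = ">gene_1 hypothetical header\nMKTAYIAKQR*\n>gene_2\nMSL*\n"
print(strip_stop_asterisks(example), end="")
```

Restricting the replacement to sequence lines is a design choice of this sketch; a plain find-and-replace over the whole file would behave the same whenever the headers contain no asterisks.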
Outputs
| ID | Name | Description | Type |
|---|---|---|---|
| Protein fasta file | Protein fasta file | n/a | |
| Genbank file | Genbank file | n/a | |
| Clean protein fasta file | Clean protein fasta file | n/a | |
| Tabular file (.tsv) | Tabular file (.tsv) | n/a | |
| SMBGC Annotation | SMBGC Annotation | n/a | |
Citation
Jossé, M. (2025). Marine Omics identifying biosynthetic gene clusters. WorkflowHub. https://doi.org/10.48546/WORKFLOWHUB.WORKFLOW.1663.1
Activity
Views: 1323 Downloads: 195 Runs: 1
Created: 2nd Jun 2025 at 11:03
Last updated: 28th Jul 2025 at 11:14
https://orcid.org/0009-0008-0622-604X