Preparing genomic data for phylogeny reconstruction (GTN)
Version 1

Workflow Type: Galaxy

This workflow begins from a set of genome assemblies of different samples, strains, or species. Each genome is first annotated with Funannotate. The predicted proteins are further annotated with BUSCO. Next, Proteinortho finds orthologs across the samples and groups them into orthogroups. Orthogroups in which all samples are represented are extracted, and the orthologs in each orthogroup are aligned with ClustalW. Test dataset: https://zenodo.org/record/6610704#.Ypn3FzlBw5k
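
The step that keeps only orthogroups with every sample represented can be sketched in Python as follows. This is an illustration only, not the workflow's actual Filter expression, which is not shown on this page. It assumes the tab-separated Proteinortho table layout: a `#`-prefixed header line, then species count, gene count, connectivity, and one column per input proteome, with `*` marking a sample that has no gene in the group. The function name `full_orthogroups` and the `single_copy` option are hypothetical.

```python
def full_orthogroups(table_text, single_copy=False):
    """Return orthogroups in which every sample contributes at least one gene.

    table_text  : contents of a Proteinortho-style orthology table (assumed layout)
    single_copy : if True, also require exactly one gene per sample
    """
    samples = None
    kept = []
    for line in table_text.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        if line.startswith("#"):
            # The first comment line names the columns; proteomes start at column 4.
            if samples is None:
                samples = fields[3:]
            continue
        if samples is None:
            raise ValueError("no header line found before the first data row")
        n_species, n_genes = int(fields[0]), int(fields[1])
        if n_species != len(samples):
            continue  # at least one sample is missing from this orthogroup
        if single_copy and n_genes != len(samples):
            continue  # some sample has paralogs in this orthogroup
        # Map each sample to its comma-separated gene list.
        kept.append({s: genes.split(",") for s, genes in zip(samples, fields[3:])})
    return kept


if __name__ == "__main__":
    demo = (
        "# Species\tGenes\tAlg.-Conn.\tA.faa\tB.faa\tC.faa\n"
        "3\t3\t1\ta1\tb1\tc1\n"
        "2\t2\t1\ta2\t*\tc2\n"
        "3\t4\t0.5\ta3,a4\tb3\tc3\n"
    )
    for group in full_orthogroups(demo):
        print(group)
```

Setting `single_copy=True` is a stricter choice that is common when building phylogenies, because paralogs complicate per-gene alignments; whether the workflow applies it is not stated here.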

Inputs

ID | Name | Description | Type
Input genomes as collection | Input genomes as collection | n/a | File[]

Steps

ID | Name | Description
1 | Replace Text | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_replace_in_line/1.1.2
2 | RepeatMasker | toolshed.g2.bx.psu.edu/repos/bgruening/repeat_masker/repeatmasker_wrapper/4.1.2-p1+galaxy0
3 | Funannotate predict annotation | toolshed.g2.bx.psu.edu/repos/iuc/funannotate_predict/funannotate_predict/1.8.9+galaxy2
4 | Extract ORF | toolshed.g2.bx.psu.edu/repos/bgruening/glimmer_gbk_to_orf/glimmer_gbk_to_orf/3.02
5 | Regex Find And Replace | toolshed.g2.bx.psu.edu/repos/galaxyp/regex_find_replace/regex1/1.0.1
6 | Collapse Collection | toolshed.g2.bx.psu.edu/repos/nml/collapse_collections/collapse_dataset/4.2
7 | Proteinortho | toolshed.g2.bx.psu.edu/repos/iuc/proteinortho/proteinortho/6.0.14+galaxy2.9.1
8 | Busco | toolshed.g2.bx.psu.edu/repos/iuc/busco/busco/4.1.4
9 | Filter | Filter1
10 | Proteinortho grab proteins | toolshed.g2.bx.psu.edu/repos/iuc/proteinortho_grab_proteins/proteinortho_grab_proteins/6.0.14+galaxy2.9.1
11 | Regex Find And Replace | toolshed.g2.bx.psu.edu/repos/galaxyp/regex_find_replace/regex1/1.0.1
12 | ClustalW | toolshed.g2.bx.psu.edu/repos/devteam/clustalw/clustalw/2.1
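
Each step above is identified by a Tool Shed tool ID that encodes the host, repository owner, repository, tool, and pinned version; built-in tools such as `Filter1` carry a bare ID instead. When checking which versions a Galaxy server needs to have installed, a small parser like the one below may help. It is a minimal sketch that assumes the `host/repos/owner/repo/tool/version` layout seen in the list; `ToolRef` and `parse_tool_id` are hypothetical names, not part of any Galaxy library.

```python
from typing import NamedTuple, Optional


class ToolRef(NamedTuple):
    host: Optional[str]     # Tool Shed host, None for built-in tools
    owner: Optional[str]    # repository owner
    repo: Optional[str]     # repository name
    tool: str               # tool identifier
    version: Optional[str]  # pinned tool version, None for built-in tools


def parse_tool_id(tool_id: str) -> ToolRef:
    """Split a Tool Shed style tool ID into its components."""
    parts = tool_id.split("/")
    if len(parts) == 6 and parts[1] == "repos":
        host, _, owner, repo, tool, version = parts
        return ToolRef(host, owner, repo, tool, version)
    # Anything else is treated as a built-in tool with a bare ID.
    return ToolRef(None, None, None, tool_id, None)


if __name__ == "__main__":
    for tid in (
        "toolshed.g2.bx.psu.edu/repos/iuc/busco/busco/4.1.4",
        "toolshed.g2.bx.psu.edu/repos/devteam/clustalw/clustalw/2.1",
        "Filter1",
    ):
        print(parse_tool_id(tid))
```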

Outputs

ID | Name | Description | Type
headers_shortened | headers_shortened | n/a | File
repeat_masked | repeat_masked | n/a | File
funannotate_predicted_proteins | funannotate_predicted_proteins | n/a | File
extracted_ORFs | extracted_ORFs | n/a | File
_anonymous_output_1 | _anonymous_output_1 | n/a | File
sample_names_to_headers | sample_names_to_headers | n/a | File
proteomes_to_one_file | proteomes_to_one_file | n/a | File
_anonymous_output_2 | _anonymous_output_2 | n/a | File
Proteinortho on input dataset(s): orthology-groups | Proteinortho on input dataset(s): orthology-groups | n/a | File
_anonymous_output_3 | _anonymous_output_3 | n/a | File
_anonymous_output_4 | _anonymous_output_4 | n/a | File
_anonymous_output_5 | _anonymous_output_5 | n/a | File
_anonymous_output_6 | _anonymous_output_6 | n/a | File
_anonymous_output_7 | _anonymous_output_7 | n/a | File
Proteinortho_extract_by_orthogroup | Proteinortho_extract_by_orthogroup | n/a | File
fasta_header_cleaned | fasta_header_cleaned | n/a | File
ClustalW on input dataset(s): clustal | ClustalW on input dataset(s): clustal | n/a | File
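
Output names such as headers_shortened, sample_names_to_headers, and fasta_header_cleaned suggest that the workflow rewrites FASTA headers so that each sequence stays traceable to its sample once the proteomes are merged. The regular expressions the workflow actually uses are not shown on this page, so the following Python is only a sketch of the general idea, under the assumption that a header is reduced to its first token and prefixed with the sample name. The function name `prefix_fasta_headers` and the `_` separator are hypothetical.

```python
def prefix_fasta_headers(fasta_text, sample, sep="_"):
    """Shorten each FASTA header to its first token and prefix it with the sample name."""
    out = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # Keep only the sequence ID, dropping any free-text description.
            tokens = line[1:].split()
            seq_id = tokens[0] if tokens else ""
            out.append(f">{sample}{sep}{seq_id}")
        else:
            out.append(line)
    return "\n".join(out) + "\n" if out else ""


if __name__ == "__main__":
    demo = ">g1 hypothetical protein\nMKV\n>g2\nMAA\n"
    print(prefix_fasta_headers(demo, "strainA"), end="")
```

Short, unique headers of this kind also help with alignment formats that truncate long sequence names.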

Version History

Version 1 (earliest) Created 6th Jun 2022 at 15:05 by Miguel Roncoroni

Initial commit


Frozen Version-1 a3e26fb
Creators and Submitter

Creators: Not specified
Additional credit: Miguel Roncoroni
