Workflow Type: Galaxy
Open
Stable
This workflow applies text mining to a museum collection in tabular format to find the year from which most objects derive and what those objects are. The first steps filter and clean the data to put it in the correct format. Datamash then counts how many catalogue entries the museum holds from each year. The result is a chronological table, which is visualised as a bar chart. From this table, the year with the most items is extracted, and the next step keeps only the items from that year. Finally, the object descriptions of those items are extracted to visualise what the museum owns from that year.
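The counting and filtering logic described above can be sketched outside Galaxy in a few lines of Python. This is only an illustration, not the workflow itself: the column names (`object_id`, `year`, `description`), the helper function names, and the sample rows are all made up for the example, and the real catalogue may be laid out differently.

```python
import csv
import io
from collections import Counter

# Made-up sample standing in for the cleaned museum table.
# The column names are hypothetical, not taken from the real catalogue.
SAMPLE_TSV = (
    "object_id\tyear\tdescription\n"
    "1\t1901\tbronze medal\n"
    "2\t1905\tglass plate negative\n"
    "3\t1905\tsteam engine model\n"
    "4\t1910\twooden loom\n"
    "5\t1905\tglass lantern slide\n"
)


def read_rows(text):
    """Parse tab-separated text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def count_per_year(rows, year_col="year"):
    """Count how many rows fall into each year (the group-and-count step)."""
    return Counter(row[year_col] for row in rows)


def busiest_year(counts):
    """Return the year with the highest row count."""
    return counts.most_common(1)[0][0]


def descriptions_for_year(rows, year, year_col="year", text_col="description"):
    """Keep only rows from the given year and return their descriptions."""
    return [row[text_col] for row in rows if row[year_col] == year]


rows = read_rows(SAMPLE_TSV)
counts = count_per_year(rows)
top_year = busiest_year(counts)
top_descriptions = descriptions_for_year(rows, top_year)

# Chronological table, analogous to the data behind the bar chart.
for year in sorted(counts):
    print(year, counts[year])
print("most common year:", top_year)
```

If several years tie for the highest count, `most_common` returns only one of them, so a real analysis may want to handle ties explicitly.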
Inputs
ID | Name | Description | Type |
---|---|---|---|
Input | Input | Upload a TSV file with multiple rows and columns to compute on. | |
stop_words_english | stop_words_english | Upload a list of English stop words in .txt format. | |
Steps
ID | Name | Description |
---|---|---|
2 | Cut | Cut1 |
3 | Filter Tabular | toolshed.g2.bx.psu.edu/repos/iuc/filter_tabular/filter_tabular/3.3.1 |
4 | Column Regex Find And Replace | toolshed.g2.bx.psu.edu/repos/galaxyp/regex_find_replace/regexColumn1/1.0.3 |
5 | Sort | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_sort_header_tool/9.5+galaxy2 |
6 | Remove beginning | Remove beginning1 |
7 | Datamash | toolshed.g2.bx.psu.edu/repos/iuc/datamash_ops/datamash_ops/1.9+galaxy0 |
8 | Sort | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_sort_header_tool/9.5+galaxy2 |
9 | Bar chart | barchart_gnuplot |
10 | Sort | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_sort_header_tool/9.5+galaxy2 |
11 | Select first | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_head_tool/9.5+galaxy2 |
12 | Cut | Cut1 |
13 | Parse parameter value | param_value_from_file |
14 | Compose text parameter value | toolshed.g2.bx.psu.edu/repos/iuc/compose_text_param/compose_text_param/0.1.1 |
15 | Search in textfiles | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_grep_tool/9.5+galaxy2 |
16 | Cut | Cut1 |
17 | Generate a word cloud | toolshed.g2.bx.psu.edu/repos/bgruening/wordcloud/wordcloud/1.9.4+galaxy2 |
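The final steps feed the extracted object descriptions, together with the stop word list, into the word cloud tool. A word cloud is essentially a picture of word frequencies, so the underlying idea can be sketched as follows. This is a minimal sketch under stated assumptions: the function `word_frequencies`, the simple letters-only tokenisation, and the example data are hypothetical, and the Galaxy tool applies its own tokenisation and rendering rules.

```python
import re
from collections import Counter


def word_frequencies(descriptions, stop_words):
    """Count words across descriptions, ignoring stop words.

    Tokenisation here is a simple lowercase, letters-only split,
    chosen for illustration; it is not the word cloud tool's own logic.
    """
    stop = {w.strip().lower() for w in stop_words if w.strip()}
    counts = Counter()
    for text in descriptions:
        for token in re.findall(r"[a-z]+", text.lower()):
            if token not in stop:
                counts[token] += 1
    return counts


# Made-up descriptions and a tiny stop word list for demonstration.
example_descriptions = [
    "Glass plate negative of the harbour",
    "A glass lantern slide",
]
example_stop_words = ["a", "of", "the"]

freqs = word_frequencies(example_descriptions, example_stop_words)
print(freqs.most_common(3))
```

Words with higher counts would be drawn larger in the resulting cloud, which is why removing stop words matters: otherwise words such as "the" would dominate the image.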
Outputs
ID | Name | Description | Type |
---|---|---|---|
out_file1 | out_file1 | n/a | |
output | output | n/a | |
Version History
Version 1 (earliest) Created 25th Aug 2025 at 13:53 by Daniela Schneider
Initial commit
master
25a0297
