Exploring the role of normalization and feature selection in microbiome disease classification pipelines

Workflow Type: Python
Stable

Code and supporting data for the article: "Exploring the role of normalization and feature selection in microbiome disease classification pipelines."

The repository contains the following folders:

  • 1. data: OTU/ASV tables and class annotations for the 15 curated datasets considered.
  • 2. src: code written to perform the analyses and statistical tests reported in the article.
  • 3. results: tables containing the global nested cross-validation results.
  • 4. figures

License: This project is licensed under the GNU GPL 3.0; see the LICENSE file for details.

Dataset  Samples (Cases, Controls)  Features  IR    Project ID
ART      114 (86, 28)               10733     3.07  PRJNA203810
CDI      336 (93, 243)              3456      2.61  10.1128/mbio.01021-14 (DOI)
CRC1     490 (229, 261)             6920      1.14  PRJNA290926
CRC2     102 (46, 56)               837       1.22  SRP005150
HIV      350 (293, 57)              14425     5.14  PRJNA307231
CD1      140 (78, 62)               3547      1.26  PRJNA237362
CD2      160 (68, 92)               3547      1.35  PRJNA237362
IBD1     91 (67, 24)                2742      2.79  PRJNA82109
IBD2     114 (68, 46)               1496      1.48  10.1053/j.gastro.2010.08.049 (DOI)
CIR      77 (51, 26)                3104      1.96  PRJNA174838
MHE      77 (26, 51)                3104      1.96  PRJNA174838
OB       281 (220, 61)              6386      3.61  PRJNA32089
PAR1     148 (74, 74)               10232     1.00  PRJEB4927
PAR2     333 (201, 132)             6844      1.52  PRJNA601994
PAR3     507 (323, 184)             12198     1.76  PRJNA601994
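
The IR column is not defined in this README. It appears to be the class imbalance ratio, that is, the size of the larger class divided by the size of the smaller one, since that reading reproduces the tabulated values. The following minimal sketch works under that assumption; the function name imbalance_ratio is hypothetical and is not taken from the repository's src folder.

```python
def imbalance_ratio(cases: int, controls: int) -> float:
    """Return majority class size divided by minority class size.

    Assumption: this is what the IR column reports; the README does not say so.
    """
    if cases <= 0 or controls <= 0:
        raise ValueError("both classes must contain at least one sample")
    return max(cases, controls) / min(cases, controls)


# (cases, controls) pairs copied from four rows of the dataset table
datasets = {
    "ART": (86, 28),
    "CDI": (93, 243),
    "HIV": (293, 57),
    "PAR1": (74, 74),
}

# Round to two decimals to match the precision used in the table
ratios = {name: round(imbalance_ratio(*counts), 2) for name, counts in datasets.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}")
```

Under this reading, a value of 1.00 (PAR1) means perfectly balanced classes, while HIV at 5.14 is the most imbalanced dataset in the collection.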

Notes:

  • ART: Arthritis; CDI: Clostridium difficile Infection; CRC1 and CRC2: Colorectal Cancer; HIV: Human Immunodeficiency Virus; CD1 and CD2: Crohn's Disease; IBD1 and IBD2: Inflammatory Bowel Disease; CIR: Cirrhosis; MHE: Minimal Hepatic Encephalopathy; OB: Obesity; PAR1, PAR2, and PAR3: Parkinson's Disease.
  • CD1 and CD2 were taken from MLRepo, PAR2 and PAR3 were retrieved from their respective article sources, and the remaining datasets were obtained from MicrobiomeHD.
  • Project IDs are NCBI identifiers for the raw data. IBD2 data are available only via the MicrobiomeHD repository; CDI raw data are available from mothur (https://mothur.org/CDI_MicrobiomeModeling/).

Version History

main @ b9472f1 (earliest) Created 13th Jul 2025 at 17:40 by Ignacio Garach Vélez

Add files via upload


Creator: Ignacio Garach

Citation
Garach, I. (2025). Exploring the role of normalization and feature selection in microbiome disease classification pipelines. WorkflowHub. https://doi.org/10.48546/WORKFLOWHUB.WORKFLOW.1807.1
Total size: 459 MB