📄 Generalizable machine learning models for rapid antimicrobial resistance prediction in unseen healthcare settings
This repository contains the code used for the experiments in the paper:
Generalizable machine learning models for rapid antimicrobial resistance prediction in unseen healthcare settings
by Diane Duroux, Paul P. Meyer, Giovanni Visonà, and Niko Beerenwinkel.
⚙️ Install the dependencies
Clone the repository, unzip OriginalData.zip, and install the dependencies listed in requirements.txt:
pip install -r requirements.txt
💻 AMR Classifier Training with ResMLP and inference
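The script code/ResAMR_classifier.py trains a ResMLP model for AMR classification on the preprocessed DRIAMS data and writes its predictions for the test set. A full example command is given in the Example section below.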
📦 Output
In output//_results/, the script generates:
test_set_seed0.csv
➤ Contains predictions: species, sample_id, drug, response, and Prediction.
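As a rough illustration of how the prediction file could be inspected, the sketch below computes a per-drug accuracy from rows with the columns listed above. It assumes that response is a 0/1 label and that Prediction is a score in [0, 1] that can be thresholded at 0.5; the README does not confirm either point, and the function name per_drug_accuracy and the demo rows are made up for this example.

```python
import csv
import io
from collections import defaultdict


def per_drug_accuracy(rows, threshold=0.5):
    """Return {drug: accuracy} from prediction rows.

    Assumption (not confirmed by the README): `response` is a 0/1 label
    and `Prediction` is a score in [0, 1].
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for row in rows:
        label = int(float(row["response"]))
        predicted = int(float(row["Prediction"]) >= threshold)
        total[row["drug"]] += 1
        correct[row["drug"]] += int(label == predicted)
    return {drug: correct[drug] / total[drug] for drug in total}


# Tiny made-up table standing in for test_set_seed0.csv.
demo_csv = io.StringIO(
    "species,sample_id,drug,response,Prediction\n"
    "Escherichia coli,s1,Ciprofloxacin,1,0.91\n"
    "Escherichia coli,s2,Ciprofloxacin,0,0.62\n"
    "Klebsiella pneumoniae,s3,Ceftriaxone,0,0.08\n"
)
demo_accuracy = per_drug_accuracy(csv.DictReader(demo_csv))
print(demo_accuracy)
```

To apply this to a real run, open the generated test_set_seed0.csv with open(path, newline="") and pass csv.DictReader(handle) to the function.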
🛠 Required Arguments
Argument | Description |
---|---|
--driams_long_table | Path to the metadata file for the current dataset. |
--spectra_matrix | Path to the input mass spectra (either raw or MAE-encoded). |
--sample_embedding_dim | Dimension of the spectra input (6000 for raw, or same as for MAE). |
--drugs_df | Path to the antimicrobial compound encoding file. |
--fingerprint_class | Type of encoding: 'morgan_1024', 'molformer_github', or 'selfies_flattened_one_hot'. |
--fingerprint_size | Size of the encoding: 1024 (Morgan), 768 (Molformer), or 24160 (SELFIES). |
--split_type | Set to specific if splits are pre-defined, else random. |
--split_ids | Path to the data_splits.csv file. |
--experiment_group | Name of the output folder. |
--experiment_name | Name of the output subfolder. |
--seed | Random seed for reproducibility. |
--n_epochs | Number of epochs for classifier training. |
--learning_rate | Learning rate for the optimizer. |
--patience | Number of epochs to wait before early stopping. |
--batch_size | Batch size for classifier training. |
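Because --fingerprint_class and --fingerprint_size must agree, a small pre-flight check can catch mismatches before a long training run. The following is a minimal sketch built only from the values in the table above; the helper check_arguments is hypothetical and is not part of the repository.

```python
# Encoding sizes as listed in the argument table.
FINGERPRINT_SIZES = {
    "morgan_1024": 1024,
    "molformer_github": 768,
    "selfies_flattened_one_hot": 24160,
}

# Values the table names for --split_type.
SPLIT_TYPES = ("specific", "random")


def check_arguments(fingerprint_class, fingerprint_size, split_type):
    """Return a list of problems; an empty list means the values agree."""
    problems = []
    expected = FINGERPRINT_SIZES.get(fingerprint_class)
    if expected is None:
        problems.append(f"unknown fingerprint_class: {fingerprint_class!r}")
    elif fingerprint_size != expected:
        problems.append(
            f"{fingerprint_class} expects fingerprint_size {expected}, "
            f"got {fingerprint_size}"
        )
    if split_type not in SPLIT_TYPES:
        problems.append(f"unknown split_type: {split_type!r}")
    return problems


print(check_arguments("morgan_1024", 1024, "specific"))
print(check_arguments("molformer_github", 1024, "random"))
```

The first call reports no problems; the second flags the size mismatch for the Molformer encoding.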
🚀 Example: ResMLP Training on DRIAMS B2018 with Raw Spectra + Morgan Fingerprints
ulimit -Sn 10000 # Optional: increase file descriptor limit if needed
python3 code/ResAMR_classifier.py \
--driams_long_table ProcessedData/B2018/combined_long_table.csv \
--spectra_matrix ProcessedData/B2018/rawSpectra_data.npy \
--sample_embedding_dim 6000 \
--drugs_df OriginalData/drug_fingerprints_Mol_selfies.csv \
--fingerprint_class morgan_1024 \
--fingerprint_size 1024 \
--split_type specific \
--split_ids ProcessedData/B2018/data_splits.csv \
--experiment_group rawMS_MorganFing \
--experiment_name ResMLP \
--seed 0 \
--n_epochs 2 \
--learning_rate 0.0003 \
--patience 10 \
--batch_size 128
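To repeat the run above with several seeds, the command line can be assembled programmatically. This sketch only builds and prints the commands; the helper build_command is hypothetical, and whether each seed writes its own output file (as the name test_set_seed0.csv suggests) is an assumption that the README does not confirm.

```python
import shlex

# Argument values copied from the example command above.
BASE_ARGS = {
    "driams_long_table": "ProcessedData/B2018/combined_long_table.csv",
    "spectra_matrix": "ProcessedData/B2018/rawSpectra_data.npy",
    "sample_embedding_dim": 6000,
    "drugs_df": "OriginalData/drug_fingerprints_Mol_selfies.csv",
    "fingerprint_class": "morgan_1024",
    "fingerprint_size": 1024,
    "split_type": "specific",
    "split_ids": "ProcessedData/B2018/data_splits.csv",
    "experiment_group": "rawMS_MorganFing",
    "experiment_name": "ResMLP",
    "n_epochs": 2,
    "learning_rate": 0.0003,
    "patience": 10,
    "batch_size": 128,
}


def build_command(seed):
    """Return the training command for one seed as an argument list."""
    command = ["python3", "code/ResAMR_classifier.py"]
    for name, value in {**BASE_ARGS, "seed": seed}.items():
        command += [f"--{name}", str(value)]
    return command


for seed in range(3):
    print(shlex.join(build_command(seed)))
```

Each list can be passed to subprocess.run(command, check=True) from the repository root to launch the run.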
💰 Funding
This research was primarily supported by the ETH AI Center.
Version History
main @ 5fb358d (earliest) Created 21st Jul 2025 at 14:22 by Diane Duroux
Create LICENSE