📄 Generalizable machine learning models for rapid antimicrobial resistance prediction in unseen healthcare settings

This repository contains the code used for the experiments in the paper:

Generalizable machine learning models for rapid antimicrobial resistance prediction in unseen healthcare settings
by Diane Duroux, Paul P. Meyer, Giovanni Visonà, and Niko Beerenwinkel.

⚙️ Install the dependencies

You can set up the project with either pip or uv.

Option A - pip:

Install the dependencies listed in requirements.txt:

pip install -r requirements.txt

Option B - uv:

We provide pyproject.toml and uv.lock for macOS, Windows, and Linux.

Note: On Linux or on a machine without Apple silicon, please use the pyproject.toml file for Mac and regenerate uv.lock after installation.

# 0) Install uv (one-time)
# mac/linux:
curl -LsSf https://astral.sh/uv/install.sh | sh
# windows (PowerShell):
iwr https://astral.sh/uv/install.ps1 -UseBasicParsing | iex
 
# 1) Ensure the pinned Python is available (adjust if your pyproject pins a different version)
uv python install 3.11
 
# 2) Create the exact environment from the lockfile
uv sync --frozen
 
# 3) Run your code within the env
uv run python -V
uv run python your_script.py

💻 AMR Classifier Training with ResMLP and inference

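The script code/ResAMR_classifier.py trains a ResMLP model for AMR classification on the preprocessed DRIAMS data and writes its predictions on the test set. A complete example command is given in the Example section below.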

📦 Output

In output//_results/, the script generates:

test_set_seed0.csv
➤ Contains predictions: species, sample_id, drug, response, and Prediction.
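
As a quick sanity check on the predictions file, the columns listed above can be summarized per drug. The following is a minimal sketch, not part of the repository: the helper name `per_drug_accuracy` is hypothetical, and it assumes that `response` is a 0/1 label and that `Prediction` is a score in [0, 1] thresholded at 0.5. The rows are toy values for illustration only, not real results.

```python
import csv
import io
from collections import defaultdict


def per_drug_accuracy(rows, threshold=0.5):
    """Return {drug: accuracy} computed from prediction rows.

    Assumption: `response` is a 0/1 label and `Prediction` is a score
    in [0, 1]; adjust the parsing if the file stores them differently.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for row in rows:
        label = int(float(row["response"]))
        predicted = int(float(row["Prediction"]) >= threshold)
        correct[row["drug"]] += int(label == predicted)
        total[row["drug"]] += 1
    return {drug: correct[drug] / total[drug] for drug in total}


# Toy rows for illustration only; these are not real results.
toy_csv = io.StringIO(
    "species,sample_id,drug,response,Prediction\n"
    "Escherichia coli,s1,Ciprofloxacin,1,0.91\n"
    "Escherichia coli,s2,Ciprofloxacin,0,0.62\n"
    "Klebsiella pneumoniae,s3,Ceftriaxone,0,0.08\n"
)
toy_accuracy = per_drug_accuracy(csv.DictReader(toy_csv))
print(toy_accuracy)
```

To apply it to a real run, pass a `csv.DictReader` opened on the generated test_set_seed0.csv instead of the in-memory toy data.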

🛠 Required Arguments

| Argument | Description |
| --- | --- |
| `--driams_long_table` | Path to the metadata file for the current dataset. |
| `--spectra_matrix` | Path to the input mass spectra (either raw or MAE-encoded). |
| `--sample_embedding_dim` | Dimension of the spectra input (6000 for raw spectra, or the MAE embedding dimension for MAE-encoded spectra). |
| `--drugs_df` | Path to the antimicrobial compound encoding file. |
| `--fingerprint_class` | Type of encoding: `'morgan_1024'`, `'molformer_github'`, or `'selfies_flattened_one_hot'`. |
| `--fingerprint_size` | Size of the encoding: 1024 (Morgan), 768 (Molformer), or 24160 (SELFIES). |
| `--split_type` | Set to `specific` if splits are pre-defined; otherwise random. |
| `--split_ids` | Path to the `data_splits.csv` file. |
| `--experiment_group` | Name of the output folder. |
| `--experiment_name` | Name of the output subfolder. |
| `--seed` | Random seed for reproducibility. |
| `--n_epochs` | Number of epochs for classifier training. |
| `--learning_rate` | Learning rate for the optimizer. |
| `--patience` | Number of epochs to wait before early stopping. |
| `--batch_size` | Batch size for classifier training. |
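
Because `--fingerprint_class` and `--fingerprint_size` must agree, it can help to check the pairing before launching a long run. The sketch below is illustrative and is not the argument parser used by the script: `build_parser` and `check_fingerprint` are hypothetical helpers, only a subset of the arguments is covered, and the argument types are assumptions. The class-to-size mapping is taken from the table above.

```python
import argparse

# Encoding sizes as listed in the arguments table.
FINGERPRINT_SIZES = {
    "morgan_1024": 1024,
    "molformer_github": 768,
    "selfies_flattened_one_hot": 24160,
}


def build_parser():
    """Hypothetical parser covering a subset of the documented arguments."""
    parser = argparse.ArgumentParser(description="Sketch of the classifier CLI")
    parser.add_argument(
        "--fingerprint_class", choices=sorted(FINGERPRINT_SIZES), required=True
    )
    parser.add_argument("--fingerprint_size", type=int, required=True)
    parser.add_argument("--sample_embedding_dim", type=int, required=True)
    parser.add_argument("--seed", type=int, default=0)
    return parser


def check_fingerprint(args):
    """Raise ValueError if the size does not match the chosen encoding."""
    expected = FINGERPRINT_SIZES[args.fingerprint_class]
    if args.fingerprint_size != expected:
        raise ValueError(
            f"{args.fingerprint_class} expects --fingerprint_size {expected}, "
            f"got {args.fingerprint_size}"
        )
    return args


example_args = check_fingerprint(
    build_parser().parse_args(
        [
            "--fingerprint_class", "morgan_1024",
            "--fingerprint_size", "1024",
            "--sample_embedding_dim", "6000",
        ]
    )
)
print(example_args)
```

A mismatched pair, such as `molformer_github` with size 1024, raises a `ValueError` before any data is loaded.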

🚀 Example: ResMLP Training on DRIAMS B2018 with Raw Spectra + Morgan Fingerprints

ulimit -Sn 10000  # Optional: increase file descriptor limit if needed

python3 code/ResAMR_classifier.py \
    --driams_long_table ProcessedData/B2018/combined_long_table.csv \
    --spectra_matrix ProcessedData/B2018/rawSpectra_data.npy \
    --sample_embedding_dim 6000 \
    --drugs_df OriginalData/drug_fingerprints_Mol_selfies.csv \
    --fingerprint_class morgan_1024 \
    --fingerprint_size 1024 \
    --split_type specific \
    --split_ids ProcessedData/B2018/data_splits.csv \
    --experiment_group rawMS_MorganFing \
    --experiment_name ResMLP \
    --seed 0 \
    --n_epochs 2 \
    --learning_rate 0.0003 \
    --patience 10 \
    --batch_size 128
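
To repeat the example across several random seeds, the same command can be assembled programmatically. This is a rough sketch under stated assumptions: `build_command` is a hypothetical helper, the choice of three seeds is arbitrary, and the argument values are copied from the example command above. The code only prints the commands; nothing is executed.

```python
import shlex

# Argument values copied from the example command; --seed is added per run.
BASE_ARGS = {
    "--driams_long_table": "ProcessedData/B2018/combined_long_table.csv",
    "--spectra_matrix": "ProcessedData/B2018/rawSpectra_data.npy",
    "--sample_embedding_dim": 6000,
    "--drugs_df": "OriginalData/drug_fingerprints_Mol_selfies.csv",
    "--fingerprint_class": "morgan_1024",
    "--fingerprint_size": 1024,
    "--split_type": "specific",
    "--split_ids": "ProcessedData/B2018/data_splits.csv",
    "--experiment_group": "rawMS_MorganFing",
    "--experiment_name": "ResMLP",
    "--n_epochs": 2,
    "--learning_rate": 0.0003,
    "--patience": 10,
    "--batch_size": 128,
}


def build_command(seed):
    """Return the training command for one seed as a list of strings."""
    command = ["python3", "code/ResAMR_classifier.py"]
    for flag, value in BASE_ARGS.items():
        command += [flag, str(value)]
    command += ["--seed", str(seed)]
    return command


# Three seeds chosen arbitrarily for illustration.
commands = [build_command(seed) for seed in range(3)]
for command in commands:
    print(shlex.join(command))
```

To actually launch the runs, each list can be passed to `subprocess.run(command, check=True)` from the repository root.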

💰 Funding

This research was primarily supported by the ETH AI Center.

Version History

main @ 3ce9c42 (earliest) Created 17th Oct 2025 at 12:16 by Diane Duroux
