CryoDataBot: a pipeline to curate cryoEM datasets for AI-driven structural biology
main @ 39cc8bf

Workflow Type: Python

CryoDataBot

CryoDataBot is an automated pipeline designed to streamline dataset generation for cryogenic electron microscopy (cryoEM)-based atomic model building. It supports large-scale AI training and benchmarking by providing standardized tools for data retrieval, preprocessing, labeling, and quality control. CryoDataBot enables flexible configurations for diverse biomolecular structures, improves modeling reproducibility, and facilitates retraining of AI models such as U-Net and CryoREAD. It is open-source and optimized for structural biology applications in RNA and protein modeling.

Installation

To install CryoDataBot:

one-liner:

git clone --branch temp https://github.com/t00shadow/CryoDataBot.git && cd CryoDataBot && pip install -r requirements.txt

step-by-step:

git clone --branch temp https://github.com/t00shadow/CryoDataBot.git
cd CryoDataBot
pip install -r requirements.txt

Usage

You can run the main GUI for CryoDataBot with the following command:

python -m cryodatabot
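
If the command above fails with a "No module named" error, the package is probably not visible from the current directory or environment. The check below is a minimal sketch of how to confirm that before launching. It assumes the package name cryodatabot from the command above and that it is run from the repository root. The helper name is_importable is illustrative and is not part of CryoDataBot.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if Python can locate a top-level module with this name."""
    # find_spec returns None when a top-level module cannot be found.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    name = "cryodatabot"  # package name taken from the run command above
    if is_importable(name):
        print(f"{name} found; 'python -m {name}' should be able to start.")
    else:
        print(f"{name} not found; run this from the CryoDataBot directory "
              "after installing the requirements.")
```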

Version History

main @ 39cc8bf (earliest) Created 10th Jul 2025 at 01:45 by Qibo Xu

Add feature: discard all-zero sub-volumes

Changed the function split_to_npy: added an argument, test_zero_ratio, that serves as a threshold for discarding sub-volumes with a high ratio of zero-valued voxels.

Also integrated sub-volume generation for the map and the model.
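
The zero-ratio filter described in this entry can be illustrated with a short sketch. This is not the repository's split_to_npy. It is a hypothetical stand-in, split_with_zero_filter, that assumes non-overlapping cubic sub-volumes and assumes a sub-volume is discarded when its fraction of zero voxels is greater than or equal to test_zero_ratio. The actual comparison, default value, and tiling strategy in CryoDataBot may differ.

```python
import numpy as np


def split_with_zero_filter(volume, box_size, test_zero_ratio=1.0):
    """Split a 3D array into non-overlapping cubes and drop mostly-empty ones.

    Hypothetical illustration only: a cube is discarded when its fraction of
    zero-valued voxels is >= test_zero_ratio (assumed semantics).
    """
    kept = []
    nz, ny, nx = volume.shape
    # Step through the volume in strides of box_size; partial edge cubes are skipped.
    for z in range(0, nz - box_size + 1, box_size):
        for y in range(0, ny - box_size + 1, box_size):
            for x in range(0, nx - box_size + 1, box_size):
                cube = volume[z:z + box_size, y:y + box_size, x:x + box_size]
                zero_ratio = np.count_nonzero(cube == 0) / cube.size
                if zero_ratio >= test_zero_ratio:
                    continue  # too many zeros: discard this sub-volume
                kept.append(cube)
    return kept


if __name__ == "__main__":
    demo = np.zeros((4, 4, 4))
    demo[:2, :2, :2] = 1.0  # one fully populated cube
    demo[2, 2, 2] = 1.0     # one cube with a single non-zero voxel
    print(len(split_with_zero_filter(demo, 2, test_zero_ratio=1.0)))
    print(len(split_with_zero_filter(demo, 2, test_zero_ratio=0.5)))
```

With the threshold at 1.0, only sub-volumes made up entirely of zeros are dropped. Lowering it also removes sub-volumes that are mostly empty.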


Creators
  • Qibo Xu
  • Hong Zhou
  • Leon Wu
  • Michael Rebelo
  • Shi Feng
  • Star Yu
  • Farhanaz Farheen
  • Daisuke Kihara
Citation
Xu, Q. (2025). CryoDataBot: a pipeline to curate cryoEM datasets for AI-driven structural biology. WorkflowHub. https://doi.org/10.48546/WORKFLOWHUB.WORKFLOW.1796.1