Workflow Type: Docker
Article-GADES
This repository contains the scripts for generating the datasets and benchmarking the GADES package for distance matrix calculation.
Installation
git lfs install
git clone https://github.com/lab-medvedeva/Article-GADES.git
cd Article-GADES
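The steps in this README rely on git, Git LFS and (for the Docker deployment) docker being available. A minimal sketch of a pre-flight check follows; the helper name missing_tools is hypothetical and not part of the repository.

```shell
# Report every required command-line tool that is not on PATH.
# Prints the missing names (one per line) and returns non-zero if any is missing.
# missing_tools is a hypothetical helper, not part of the repository.
missing_tools() (
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    exit "$status"
)

# Example (not run automatically); git-lfs is the binary behind `git lfs`:
# missing_tools git git-lfs docker || echo "install the tools listed above first"
```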
Put the real datasets in MEX format into the folder Datasets/Real.
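Before benchmarking, it can help to confirm that the files in Datasets/Real look like Matrix Market files. The sketch below assumes each real dataset is a single .mtx file, as in the benchmark example paths in this README; the helper name check_mtx_dir is hypothetical.

```shell
# List .mtx files in a directory whose first line is not a Matrix Market banner.
# Returns non-zero if the directory has no .mtx files or any file fails the check.
# check_mtx_dir is a hypothetical helper, not part of the repository.
check_mtx_dir() (
    dir=$1
    status=0
    found=0
    for f in "$dir"/*.mtx; do
        [ -e "$f" ] || continue      # the glob matched nothing
        found=1
        # Matrix Market files start with a "%%MatrixMarket" banner line.
        if ! head -n 1 "$f" | grep -q '^%%MatrixMarket'; then
            echo "not a Matrix Market file: $f"
            status=1
        fi
    done
    [ "$found" -eq 1 ] || { echo "no .mtx files in $dir"; status=1; }
    exit "$status"
)

# Example, from the repository root (not run automatically):
# check_mtx_dir Datasets/Real
```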
Running benchmark using Docker Deployment
docker run --gpus all \
-v $PWD/Datasets:/workspace/Article-GADES/Datasets \
-v $PWD/results:/workspace/Article-GADES/results \
akhtyamovpavel/article-gades
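The docker run command above bind-mounts Datasets and results from the current directory. If a host directory passed to -v does not exist, Docker creates it, typically owned by root, so it may be more convenient to create both directories beforehand. A minimal sketch, with prepare_mounts as a hypothetical helper name:

```shell
# Create the host-side directories that the docker run command bind-mounts.
# The optional first argument is the repository root (defaults to the current directory).
# prepare_mounts is a hypothetical helper, not part of the repository.
prepare_mounts() (
    root=${1:-$PWD}
    mkdir -p "$root/Datasets/Real" "$root/results"
)

# Example, from the repository root (not run automatically):
# prepare_mounts && docker run --gpus all \
#     -v "$PWD/Datasets:/workspace/Article-GADES/Datasets" \
#     -v "$PWD/results:/workspace/Article-GADES/results" \
#     akhtyamovpavel/article-gades
```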
Step 01. Generation of the datasets
Step 01.1. Generated Dense Datasets
cd ./scripts/MatricesGeneration
./generate_dense.sh ../../Datasets/
Step 01.2. Generated Sparse Datasets
cd ./scripts/MatricesGeneration
./generate_sparse.sh ../../Datasets/
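Each step starts with a cd relative to the repository root, so running two steps in the same shell session fails at the second cd unless you return to the root first. One way to avoid this is to run each step in a subshell. A minimal sketch, with run_in_dir as a hypothetical helper name:

```shell
# Run a command inside a directory without changing the caller's working directory.
# The function body is a subshell, so the cd does not leak out.
# run_in_dir is a hypothetical helper, not part of the repository.
run_in_dir() (
    cd "$1" || exit 1
    shift
    "$@"
)

# Example, from the repository root (not run automatically):
# run_in_dir scripts/MatricesGeneration ./generate_dense.sh ../../Datasets/
# run_in_dir scripts/MatricesGeneration ./generate_sparse.sh ../../Datasets/
```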
Step 02. Benchmarking
Step 02.1. Generated Dense Datasets
cd ./scripts/Benchmarking
./run_benchmark_generated_dense.sh ../../
./run_benchmark_python_dense.sh ../../
Step 02.2. Generated Sparse Datasets
cd ./scripts/Benchmarking/
./run_benchmark_generated_sparse.sh ../../
Step 02.3 Real Datasets
cd ./scripts/Benchmarking/
./run_benchmark_real_python.sh ../../results/RealDatasets//
./run_benchmark_real_R.sh ../../results/RealDatasets//
Example:
./run_benchmark.sh ../../Datasets/Real/HLCA_marrow.mtx ../results/RealDatasets/HLCA_marrow/
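To benchmark several real datasets, the per-dataset output directory can be derived from the .mtx file name. The sketch below only prints the commands it would run. It assumes the argument order shown in the example above (input .mtx file, then output directory); the script to call is passed in as a parameter, and both function names are hypothetical.

```shell
# Derive the per-dataset output directory from an .mtx path,
# e.g. .../HLCA_marrow.mtx -> <results_root>/HLCA_marrow/
# Both functions are hypothetical helpers, not part of the repository.
result_dir_for() {
    printf '%s/%s/\n' "${2%/}" "$(basename "$1" .mtx)"
}

# Print (do not execute) one benchmark command per .mtx file in a directory.
# Assumed argument order of the benchmark script: <input.mtx> <output_dir>.
print_real_benchmarks() (
    script=$1 data_dir=$2 results_root=$3
    for f in "${data_dir%/}"/*.mtx; do
        [ -e "$f" ] || continue      # the glob matched nothing
        echo "$script $f $(result_dir_for "$f" "$results_root")"
    done
)

# Example, from scripts/Benchmarking (not run automatically):
# print_real_benchmarks ./run_benchmark_real_python.sh ../../Datasets/Real ../../results/RealDatasets
```

Review the printed commands, then pipe them to sh to execute them.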
Step 02.4. Ablation Study for the Batch Size Usage
Step 02.5. Ablation Study for the Memory Usage
cd ./scripts/Benchmarking
./run_benchmark_real_python_memory_usage.sh ../../results/RealDatasetsBatchSizeFixedMemory//500/
Example:
./run_benchmark_real_python_memory_usage.sh ../../Datasets/CellLines.mtx ../../results/RealDatasetsBatchSizeFixedMemory/CellLines/500/
Step 03. Drawing charts
The reproducibility notebooks are split into two parts:
- Aggregation over datasets
- Plotting charts
Aggregation
- For generated dense datasets, use the GeneratedDatasetsCollector notebook.
- For generated sparse datasets, use the GeneratedSparseCollector notebook.
Analyzing datasets
- Generated datasets are analyzed in the GeneratedDatasetAnalysis notebook.
- Real datasets are analyzed in the RealDatasetAnalysis notebook.
- The analysis of the ablation study can be found in the reproducibility notebook.
Version History
main @ 39b937a (earliest) Created 5th Sep 2024 at 11:35 by Pavel Akhtyamov
Added links to ablation study
Citation
Akhtyamov, P. (2024). GADES reproducibility workflow. WorkflowHub. https://doi.org/10.48546/WORKFLOWHUB.WORKFLOW.1125.1
Activity
Created: 5th Sep 2024 at 11:35
Last updated: 5th Sep 2024 at 11:36