open source · cloud-agnostic · bioinformatics-first

The right machine for every workload

cloudfit scores and ranks cloud instances across AWS, GCP, and Azure against your workload profile, and stays current as providers release new machine types and deprecate old ones.

Try the demo ↗ API docs ↗ GitHub ↗ PyPI ↗

terminal

$ pip install cloudfit-core

Successfully installed cloudfit-core-0.4.0

$ python

>>> from cloudfit import rank, WorkloadProfile, MachineType

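>>> profile = WorkloadProfile(vcpu=60, ram_gb=224, archetype="io", optimize_for="balanced")

>>> candidates = [MachineType(id="c2-standard-60", provider="gcp", vcpu=60, ram_gb=240, price_hr=3.13)]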

>>> rank(profile, candidates)[0].instance.id

'c2-standard-60'

01

Try it now

Two ways to try cloudfit right now. The one-click Gradio UI takes a workload profile through a form and returns ranked instance recommendations. The FastAPI service serves the same scoring engine over HTTP for programmatic use. Both run on a bundled snapshot of 875 GCP machine types across 5 regions (us-central1, us-east1, us-west1, europe-west4, asia-southeast1) with realistic asymmetric availability. No credentials needed.

Open the UI ↗ Swagger UI ↗ /health ↗

The Space sleeps when idle, so the first request may take 30 to 60 seconds to wake the container. Once it is awake, subsequent requests return without that delay.
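A minimal sketch of waiting out the cold start before sending real requests. It assumes the /health link above resolves to https://chaitanyakasaraneni-cloudfit-api.hf.space/health and answers HTTP 200 once the container is up. wait_until_ready and health_ok are hypothetical helper names, not part of cloudfit.

```python
import time
import urllib.request
from typing import Callable

# Assumption: the /health link points at this URL on the demo host.
HEALTH_URL = "https://chaitanyakasaraneni-cloudfit-api.hf.space/health"


def health_ok(url: str = HEALTH_URL, timeout_s: float = 10.0) -> bool:
    """Return True if the health endpoint answers with HTTP 200."""
    with urllib.request.urlopen(url, timeout=timeout_s) as resp:
        return resp.status == 200


def wait_until_ready(
    check: Callable[[], bool],
    timeout_s: float = 90.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll `check` until it returns True or `timeout_s` has elapsed."""
    deadline = clock() + timeout_s
    while True:
        try:
            if check():
                return True
        except OSError:
            # URLError and socket timeouts are OSError subclasses; they are
            # expected while the container is still waking up.
            pass
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

Calling wait_until_ready(health_ok) once before the first request is one way to keep the wake-up delay out of the calls that matter. The sleep and clock parameters exist only so the loop can be exercised without real waiting.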

POST /recommend · rank machine types

$ curl -sX POST https://chaitanyakasaraneni-cloudfit-api.hf.space/recommend \

-H 'content-type: application/json' \

-d '{

"workload": {"vcpu": 32, "ram_gb": 128, "optimize_for": "balanced"},

"region": "us-central1",

"top_k": 3

}'

→ ranked list of 3 instances, filtered to instances available in us-central1
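The same request can be sent from Python with only the standard library. This is a sketch: the payload mirrors the curl call above field for field, but build_recommend_request and recommend are hypothetical helper names, and the response is returned as parsed JSON without assuming its schema.

```python
import json
import urllib.request

BASE_URL = "https://chaitanyakasaraneni-cloudfit-api.hf.space"


def build_recommend_request(vcpu, ram_gb, region, top_k=3, optimize_for="balanced"):
    """Build (but do not send) a POST /recommend request."""
    payload = {
        "workload": {"vcpu": vcpu, "ram_gb": ram_gb, "optimize_for": optimize_for},
        "region": region,
        "top_k": top_k,
    }
    return urllib.request.Request(
        f"{BASE_URL}/recommend",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def recommend(vcpu, ram_gb, region, top_k=3, optimize_for="balanced", timeout_s=90.0):
    """Send the request and return the decoded JSON body.

    The generous default timeout allows for a sleeping Space to wake up.
    """
    req = build_recommend_request(vcpu, ram_gb, region, top_k, optimize_for)
    with urllib.request.urlopen(req, timeout=timeout_s) as resp:
        return json.load(resp)
```

For example, recommend(32, 128, "us-central1") sends the same body as the curl command.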

GET /instances · browse the catalog

$ # filter by region, vCPU, GPU, status

$ curl 'https://chaitanyakasaraneni-cloudfit-api.hf.space/instances?region=europe-west4&min_vcpu=64&limit=5'

$ # asia-southeast1 has fewer families in the bundled snapshot

$ curl 'https://chaitanyakasaraneni-cloudfit-api.hf.space/instances?region=asia-southeast1&limit=5'

→ matching instances with full specs and pricing
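When the filters come from variables, building the query string by hand gets error-prone. The sketch below assembles the same URLs with urllib.parse.urlencode. It uses only the query parameters that appear in the curl examples (region, min_vcpu, limit); instances_url is a hypothetical helper name.

```python
from urllib.parse import urlencode

BASE_URL = "https://chaitanyakasaraneni-cloudfit-api.hf.space"


def instances_url(base_url: str = BASE_URL, **filters) -> str:
    """Return a GET /instances URL, skipping filters whose value is None."""
    query = {name: value for name, value in filters.items() if value is not None}
    if not query:
        return f"{base_url}/instances"
    return f"{base_url}/instances?{urlencode(query)}"


url = instances_url(region="europe-west4", min_vcpu=64, limit=5)
```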

GET /providers · snapshot summary

$ curl 'https://chaitanyakasaraneni-cloudfit-api.hf.space/providers'

→ per-provider instance count, regions present, and status breakdown (active / deprecated / tombstoned)

$ # useful for "what's in the snapshot right now?"
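The response schema is not shown here, so the sketch below does not parse it. Instead it illustrates the kind of aggregation such a summary describes, computed over hypothetical instance records. The record keys (provider, regions, status) and the sample rows are assumptions made for illustration, not the service's data model.

```python
from collections import Counter, defaultdict


def summarize(instances):
    """Group records by provider: instance count, regions seen, status counts."""
    acc = defaultdict(lambda: {"count": 0, "regions": set(), "status": Counter()})
    for inst in instances:
        entry = acc[inst["provider"]]
        entry["count"] += 1
        entry["regions"].update(inst["regions"])
        entry["status"][inst["status"]] += 1
    return {
        provider: {
            "count": entry["count"],
            "regions": sorted(entry["regions"]),
            "status": dict(entry["status"]),
        }
        for provider, entry in acc.items()
    }


# Made-up records, only to show the shape of the result.
sample = [
    {"provider": "gcp", "regions": ["us-central1", "us-east1"], "status": "active"},
    {"provider": "gcp", "regions": ["us-central1"], "status": "deprecated"},
    {"provider": "gcp", "regions": ["europe-west4"], "status": "tombstoned"},
]
summary = summarize(sample)
```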

POST /diff · compare two workloads

$ curl -sX POST https://chaitanyakasaraneni-cloudfit-api.hf.space/diff \

-H 'content-type: application/json' \

-d '{

"a": {"workload": {"vcpu": 16, "ram_gb": 64}},

"b": {"workload": {"vcpu": 64, "ram_gb": 256}}

}'

→ top pick for each + price/hr, monthly cost, vCPU, RAM deltas
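The arithmetic behind such a comparison can be sketched as follows. The field names, the 730-hour month, and the sample prices are assumptions for illustration; the live endpoint may compute or round its deltas differently.

```python
HOURS_PER_MONTH = 730  # assumption: average number of hours in a month


def diff_picks(a: dict, b: dict) -> dict:
    """Return b minus a for price, monthly cost, vCPU, and RAM."""
    price_delta = b["price_hr"] - a["price_hr"]
    return {
        "price_hr_delta": round(price_delta, 4),
        "monthly_cost_delta": round(price_delta * HOURS_PER_MONTH, 2),
        "vcpu_delta": b["vcpu"] - a["vcpu"],
        "ram_gb_delta": b["ram_gb"] - a["ram_gb"],
    }


# Made-up top picks for the two workloads in the request above.
pick_a = {"price_hr": 1.00, "vcpu": 16, "ram_gb": 64}
pick_b = {"price_hr": 4.00, "vcpu": 64, "ram_gb": 256}
deltas = diff_picks(pick_a, pick_b)
```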

02

How it works

1

Describe your workload

Declare vCPU, RAM, archetype, disk requirements, GPU needs, and spot tolerance in a WorkloadProfile or a YAML file. cloudfit understands five resource archetypes: I/O, CPU, memory, GPU, and burst-parallel.

2

Score every candidate

Hard floor filters eliminate instances that can't meet your minimum requirements. The scoring engine then weights cost, performance, and availability according to your optimize_for mode. No cloud credentials needed.

3

Get a ranked recommendation

Receive a ranked list with composite scores and sub-scores for each candidate. Feed the result into Terraform, Nextflow, or any IaC pipeline. As provider plugins come online, they keep the registry current when instance families are released or deprecated.
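The hard floor step in the flow above can be sketched in a few lines. This is an illustration of the idea, not cloudfit's implementation: Floor, Candidate, and split_by_floor are hypothetical names, and only the vCPU and RAM floors are modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Floor:
    vcpu: int
    ram_gb: float


@dataclass(frozen=True)
class Candidate:
    id: str
    vcpu: int
    ram_gb: float


def floor_violations(floor: Floor, cand: Candidate) -> list:
    """Return human-readable reasons why `cand` misses the floor (empty if none)."""
    reasons = []
    if cand.vcpu < floor.vcpu:
        reasons.append(f"{cand.vcpu} vCPU < {floor.vcpu} required")
    if cand.ram_gb < floor.ram_gb:
        reasons.append(f"{cand.ram_gb:g} GB RAM < {floor.ram_gb:g} required")
    return reasons


def split_by_floor(floor: Floor, candidates):
    """Separate candidates into those that qualify and those rejected, with reasons."""
    qualified, rejected = [], {}
    for cand in candidates:
        reasons = floor_violations(floor, cand)
        if reasons:
            rejected[cand.id] = reasons
        else:
            qualified.append(cand)
    return qualified, rejected
```

Only the qualified list would move on to scoring. Keeping the rejection reasons makes it possible to explain why an instance dropped out instead of silently omitting it.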

03

Bioinformatics pipelines

I/O · I/O bound · disk-saturating

CPU · CPU bound · thread-parallel

MEM · Memory bound · large index

GPU · GPU / ML · inference

BURST · Burst parallel · scatter-gather

04

Scoring model

score = w_cost × cost_score + w_perf × perf_score + w_avail × avail_score

cost_score is relative to the candidate set (cheapest qualifying instance = 1.0, dearest = 0.0), so a real price gap moves the score. archetype is a classification and disk-sizing label in this release; it does not change ranking, which is driven by optimize_for.
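One way to get a cost_score with those endpoints is min-max normalization over the qualifying candidates. The text fixes only the two ends (cheapest = 1.0, dearest = 0.0); the linear interpolation in between and the handling of ties are assumptions in this sketch, and cost_scores is a hypothetical function name.

```python
def cost_scores(prices: dict) -> dict:
    """Map {instance_id: price_per_hour} to scores in [0, 1], cheapest = 1.0."""
    if not prices:
        return {}
    lowest, highest = min(prices.values()), max(prices.values())
    if highest == lowest:
        # Assumption: if every candidate costs the same, none is penalized.
        return {instance_id: 1.0 for instance_id in prices}
    spread = highest - lowest
    return {
        instance_id: (highest - price) / spread
        for instance_id, price in prices.items()
    }


scores = cost_scores({"small": 1.0, "medium": 2.0, "large": 3.0})
```

Because the score is relative, adding or removing a candidate can change every other candidate's cost_score.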

Mode	w_cost	w_perf	w_avail	Best for
cost	0.70	0.20	0.10	Batch jobs, dev environments
balanced	0.33	0.34	0.33	Default for production workloads
performance	0.10	0.80	0.10	Latency-sensitive, GPU inference
availability	0.10	0.20	0.70	Long-running, deprecation-sensitive
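The weighted sum and the mode table translate directly into code. In this sketch the weights are copied from the table above; composite_score is a hypothetical name, and the sub-score values in the example are assumed inputs rather than values produced by cloudfit.

```python
# (w_cost, w_perf, w_avail) per optimize_for mode, copied from the table.
WEIGHTS = {
    "cost": (0.70, 0.20, 0.10),
    "balanced": (0.33, 0.34, 0.33),
    "performance": (0.10, 0.80, 0.10),
    "availability": (0.10, 0.20, 0.70),
}


def composite_score(mode: str, cost_score: float, perf_score: float, avail_score: float) -> float:
    """score = w_cost * cost_score + w_perf * perf_score + w_avail * avail_score"""
    w_cost, w_perf, w_avail = WEIGHTS[mode]
    return w_cost * cost_score + w_perf * perf_score + w_avail * avail_score


# Assumed sub-scores: the priciest qualifying instance (cost_score 0.0)
# with full marks on performance and availability.
balanced = composite_score("balanced", cost_score=0.0, perf_score=1.0, avail_score=1.0)
cost_mode = composite_score("cost", cost_score=0.0, perf_score=1.0, avail_score=1.0)
```

With these inputs, balanced mode gives 0.34 + 0.33 = 0.67, while cost mode gives 0.20 + 0.10 = 0.30: the same instance is penalized far more for its price when optimize_for is cost.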

python · example.py

from cloudfit import rank, WorkloadProfile, MachineType

profile = WorkloadProfile(
    vcpu=60,
    ram_gb=224,
    workload="io-intensive",
    archetype="io",
    optimize_for="balanced",
)

# supply candidates yourself, or fetch them live with cloudfit-provider-gcp
candidates = [
    MachineType(id="c2-standard-60",       provider="gcp", vcpu=60, ram_gb=240, price_hr=3.13),
    MachineType(id="c3d-standard-60-lssd", provider="gcp", vcpu=60, ram_gb=240, price_hr=3.39),
    MachineType(id="c7i.24xlarge",         provider="aws", vcpu=96, ram_gb=192, price_hr=4.28),
]

for r in rank(profile, candidates):
    print(f"{r.instance.id:<30} score={r.score:.2f}  ${r.instance.price_hr:.2f}/hr")

#1  c2-standard-60         score=1.00  $3.13/hr
#2  c3d-standard-60-lssd   score=0.67  $3.39/hr
#3  c7i.24xlarge           score=0.00  $4.28/hr  disqualified: 192 GB RAM < 224 required

05

Ecosystem

● live

cloudfit-core

Scoring engine, workload profiles, hard floor filters, dynamic disk sizing. Pure Python, no cloud credentials needed.

pip install cloudfit-core

● live

cloudfit-provider-gcp

Fetches GCP Compute Engine machine types, pricing, and availability and normalizes them for cloudfit-core. Can run on a daily cron to feed the registry.

pip install cloudfit-provider-gcp

○ coming soon

cloudfit-provider-aws

Fetches AWS EC2 instance specs and on-demand pricing. Handles deprecation tombstoning so IaC configs warn instead of break.

pip install cloudfit-provider-aws

● live

cloudfit-api

REST API (service, not a library). /recommend · /instances · /providers · /diff. Multi-region snapshot (5 regions). FastAPI with OpenAPI docs. Self-host with Docker, or use the live demo on Hugging Face Spaces.

→ open /docs to try it

● live

cloudfit-ui

One-click Gradio demo over the scoring engine. Workload profile in, ranked machine types out. Five example workloads (BWA-MEM2, Cell Ranger, AlphaFold, Nextflow burst, Spark ETL) built in.

→ open the demo

samplesheet-parser

Format-agnostic Illumina SampleSheet parser. Auto-detection, V1↔V2 conversion, Hamming distance validation.

pip install samplesheet-parser

clinops

Clinical ML data quality library. MIMIC-III dataset support and reproducible quality checks for clinical research.

pip install clinops

Author

Chaitanya Krishna Kasaraneni

Software engineer focused on cloud-based data infrastructure for batch and bioinformatics workloads. Published researcher across AI/ML, medical imaging, and computational drug discovery.

ORCID 0000-0001-5792-1095

linkedin.com/in/chaitanyakasaraneni

github.com/cloudfit-io

Publications

IEEE CIACON 2025