In 2019 I had a chat with the DeepSeek team, in the hope of selling them an AI cloud solution. I was trying to convince them of a few things:
- you don't need complicated cloud virtualization, you just need containers and an efficient scheduler.
- you will need really fast,
DeepSeek (a Chinese AI company) is making it look easy today with an open-weights release of a frontier-grade LLM trained on a joke of a budget (2048 GPUs for 2 months, $6M).
For reference, this level of capability is supposed to require clusters of closer to 16K GPUs, the ones being






