Modal
1,523 posts
AI infrastructure that developers love 💚
Run inference, sandboxes, batch processing, training, and many other things on Modal
- Modal reposted: You no longer have to pick between the performance of a black box API and the flexibility and control of @modal. Auto Endpoints give you both. We're unlocking frontier performance for everyone without having to talk to sales or an FDE. More cooking here, stay tuned.
- Modal reposted: Managed private LLM endpoints, now available for everyone in @modal. Deploy in a few clicks with the UI or a few keystrokes with our CLI. The coolest thing is that these are not black boxes – customers have full access to the code underneath.
- We're hosting an art show with @GrayAreaorg in San Francisco! 💚 Submissions are open till July 15: modal.art
  📢 We're partnering with @modal to offer a new development and exhibition opportunity for artists with sustained engagements in artificial intelligence and the arts. This global open call seeks proposals for creative projects that demonstrate the intentional use of AI to further
- Sandbox startup latency and scaling can make or break your RL training run. Great post breaking this down, shown using Modal Sandboxes.
  RL Systems Mind the Gap: Matching Trainer and Generator Throughput
  RL Training Infrastructure, GRPO, PipelineRL, Async RL, Policy Staleness, RL Sandbox Infra, CPU Requirements, TCO Analysis, Thinking Machines Tinker
  newsletter.semianalysis.com/p/rl-systems-m…
- Modal reposted: Our sandbox team has been on a crusade against every millisecond of latency and it's paying off. More cool results coming very soon!
- We worked with @lmsysorg and z-lab.ai to:
  - integrate DFlash spec into @sgl_project
  - make it faster with overlap
  - train a DFlash drafter for @Alibaba_Qwen 397B-A17B

  The result: up to 4.3x greater throughput over baseline and 1.5x over native MTP.

  You can find the drafter on @huggingface, where we've each released an identical copy of the weights. Kinda like getting matching tats with your bestie. Our copy is here: huggingface.co/modal-labs/Qwe…

  The repos include scripts that reproduce our benchmark showing superiority over MTP:

  You can read about DFlash, the SGLang Spec V2 overlap scheduler, and how it all came together on the @lmsysorg blog: