Fireworks AI has raised $52M in Series B funding led by @sequoia !
This round propels our mission to enhance our inference platform and lead the shift to compound AI systems. Huge thanks to our investors @nvidia , @AMD , @MongoDB , @benchmark , Sheryl Sandberg , Frank
Cofounder and CEO of @FireworksAI_HQ
- 🔥 Introducing Fireworks f1 🔥 f1 is the first reasoning system over open models to beat GPT-4o and Claude 3.5 Sonnet across hard coding, chat and math benchmarks. 💥 two variants now available in preview: f1 and f1-mini 💥 access the preview on Fireworks AI Playground for
- 🔥 Structure is all you need. 🔥 We’re excited to announce:
  - FireFunction V1 - our new, open-weights function calling model:
    - GPT-4-level structured output and decision-routing at 4x lower latency
    - open-weights, commercially usable
  - Blog post:
- I'm very excited to share that @FireworksAI_HQ has raised $250M in Series C funding co-led by @lightspeedvp and @IndexVentures , with participation from @sequoia Capital and @EvanticCapital, bringing our valuation to $4 billion. In total, we raised $327M from prior rounds led by
- 🔥 Excited to announce FireAttention: a breakthrough in the speed vs quality tradeoffs of LLMs. We wrote a custom CUDA kernel optimized for MQA, and FP8/FP16 support on H100. This also lets us run Mixtral at FP8 with negligible impact to quality benchmarks (as opposed to GPTQ,
  What's the most performant way to serve Mixtral and other open-source MoE models? Fireworks investigated this topic and came up with our proprietary serving stack with 4x the speed compared to vLLM and negligible quality impact! Read about our findings here
- Fireworks passed ASIC speed! First time, GPU based inference crossed an ASIC provider. Benchmark credit to AA. Model: GPT-OSS-120B. Speed: 540 TPS. Legend: Purple - Fireworks on B200; Orange - Groq
- 🔥🔥Announcing FireLLaVA -- the first multi-modal LLaVA model, trained by @FireworksAI_HQ , with a commercially permissive license. It’s also our first open source model! While the industry heavily uses text-based foundation models to generate responses, in real-world
- 🔥 Firefunction-v2, new open-weights function-calling model 🔥 I'm super excited to announce Firefunction-v2, our latest open-weights!
  - Competitive with GPT-4o at function-calling
  - 1/10 of GPT-4o cost and 2x the speed
  - Retains both conversation and function-calling
- 🚀 Fireworks Reinforcement Fine-Tuning (RFT) launched! After many months of iteration with real world use cases, we are excited to launch Fireworks RFT public preview. It’s a managed RL service that turns open frontier models (e.g. DeepSeek V3, Kimi K2) into custom agents for
- 🔥 FireAttention v3 -- enabling viable alternatives in the GPU inference serving market 🔥 Engineers at @FireworksAI_HQ have successfully ported FireAttention to AMD MI300s, resulting in 80% more throughput and 60% faster latency than NIM on Nvidia H100s. With these
- 🔥 FireAttention v2: speed and long context, pick two! 🔥 At @FireworksAI_HQ , we challenge ourselves constantly -- can we develop a dramatically more efficient serving for LLMs with RAG in a long context window with negligible quality tradeoff? We built our next generation
- 🔥 We’re excited to announce @FireworksAI_HQ fine-tuning service! Fine-tune and run models like Mixtral at 300 tokens/sec with no extra inference cost! 🔁 Tune and iterate rapidly - Go from dataset to querying a fine-tuned model in minutes covering 100+ models. 🤝 Seamless
- Congrats @mntruell @amanrsanger @sualehasif996! There are so many sleepless nights stretching and scaling Cursor’s explosive workload together. Onwards!
  We've raised $2.3B in Series D funding from Accel, Andreessen Horowitz, Coatue, Thrive, Nvidia, and Google. We're also happy to share that Cursor has grown to over $1B in annualized revenue and now produces more code than any other agent in the world. This funding will allow














