Wow.
@Zai_org GLM 5.2 is a marvel! It is *at least* as good as Opus 4.8 and GPT 5.5. It's super fast, inexpensive, and not too verbose.
It responds with nuance and judgement, & handles long context VERY well.
I've never experienced an open weights model like this before.
Recently, @BoardroomClub1 and @Yossigarmazi hosted @the_bunny_chen to talk open source, RFT, and why the infrastructure layer matters as much as the model itself.
Listen now:
We teamed up with @FireworksAI_HQ to answer the following question…
How can we cost-effectively mine important signals from every single trace, while maintaining frontier performance?
Read our LangChain Labs study ⤵️
We're just getting photos back from New York @Techweek_ and we can't get enough of them.
Thanks to everyone who came up to the roof to party with us. We'll see you next year!
Kimi 2.7 is now fully trainable on Fireworks.
Feed your data into Kimi and build a moat that beats frontier models at lower costs. SFT. DPO. RL.
Managed clicks or raw API with huge context, giant LoRA ranks & all the pro knobs.
Get started: app.fireworks.ai/dashboard/fine…
GLM 5.2 is live on Fireworks, day zero. 1M-token context, coding‑first frontier model, independently validated on SWE‑bench, Terminal‑Bench, GPQA and AIME.
Served on our own infrastructure the moment @Zai_org opened the weights.
What’s under the hood ↓
You may have seen other platforms announce day-zero support within minutes of weights dropping, but there is a key difference between inference providers and routers.
Routers forward your GLM 5.2 calls to someone else’s endpoint (typically the model lab’s own API).
Fireworks
Benchmarks are just the starting point.
The only eval that matters is the one on your codebase, your prompts, your latency SLOs.
Ready to build?
Drop accounts/fireworks/models/glm-5p2 into your existing OpenAI/Anthropic‑compatible client and try it on real work.
Learn
This is stepping stone for enabling customers to generate training data from traces and the lean into continuous post training and own their AI with their own data moat.
We tip our hat to the @LangChain team for the incredible work.
public benchmarks are saturated. every frontier model has trained against them, and the leaderboard tells you near nothing.
we built ours from inside ramp — code no model has seen, graded against the bar our engineers ship to.
every company running on AI needs its own.
Moonshot released K2.7 Code, the latest in their K2 line of coding models, and it's live on Fireworks Day 0, on serverless and the API.
It produces roughly 30% fewer reasoning tokens than K2.6 while scoring higher on Moonshot’s coding benchmarks.
For agentic coding work, that
In long agent loops, reasoning tokens get reused as context on every following turn.
Shorter reasoning means smaller contexts downstream, faster generations, and fewer retries.
K2.7 Code reduces that overhead without giving up quality, which lowers the real cost per completed