Chaos Labs (@chaoslabs) / X

Chaos Labs

2,844 posts

@chaoslabs

Building intelligence that compounds.

NYC & TLV

Joined October 2021

Pinned
Chaos Labs
@chaoslabs
Mar 13, 2025
1/ Introducing Chaos AI—The World’s First AI-Powered Crypto Researcher. Built on years of proprietary data from securing trillions in trading volume, Chaos AI transforms fragmented market data into institutional-grade financial intelligence. Get Early Access:
Chaos Labs
@chaoslabs
2h
Context eats up over 80% of agentic tokens. Multi-step AI workflows consume ~1,000x more tokens than chat, and most of this consumption occurs pre-generation through retrieval, state management, and context assembly.
Chaos Labs
@chaoslabs
2h
2/ Pre-generation overhead scales with workflow complexity.
> Each agent step requires retrieval, tool-output ingestion, context reconstruction, and constraint evaluation before execution can continue.
> If ~80% of token budgets are concentrated in these operations, the
Chaos Labs
@chaoslabs
Jun 19
Did Claude Code create the first tokenmaxxing workforce? In less than a year, Claude Code references grew to 135K+ GitHub commits per day. Meanwhile, the @FT reports that companies like Uber and Cisco are grappling with the costs of scaling AI usage across their workforce.
Chaos Labs
@chaoslabs
Jun 19
2/ AI introduces marginal costs into knowledge work.
> Every additional analysis, investigation, recommendation, or code review consumes compute.
> At scale, the economics increasingly favors organizations that retain and reuse the intelligence generated by their AI workflows.
Chaos Labs
@chaoslabs
Jun 18
1/ Every interaction between humans, agents, and models leaves behind a record of how an organization operates. Decisions, evaluations, outcomes, and workflows encode expertise, judgment, and results; over time, this record becomes organizational intelligence.
Chaos Labs
@chaoslabs
Jun 18
2/ Two companies can deploy the same frontier model and compound knowledge at different rates. One converts interactions into organizational learning, the other does not:
> Expertise becomes reusable infra
> Decisions become training signals that improve future workflows
>
Chaos Labs reposted
0xGeeGee
@0xGeeGee
Jun 17
As we keep accelerating on AI, I expect us to head through a series of "DeepSeek" moments where, rather than exponentially improving the intelligence of the model forever, we will demolish the denominator (costs) and get the improvement in terms of both efficiency and
Chaos Labs
@chaoslabs
Jun 17
1/ Foundation models are a commodity. With access to frontier models expanding across enterprises, accumulated intelligence is emerging as a primary source of competitive advantage.
Chaos Labs
@chaoslabs
Jun 17
2/ Every AI interaction leaves behind a record of how work is performed within an organization.
> Accepted outputs, rejected outputs, workflow traces, evaluations, corrections, decisions, and outcomes capture which approaches succeed, which fail, and how expertise is applied.
>
Chaos Labs reposted
Omer Goldberg
@omeragoldberg
Jun 16
Article
From Tokenmaxxing to Token Yield
The rise and fall of Tokenmaxxing. For most of the past year, I asked everyone at Chaos Labs to use AI in their day-to-day work: product dev, design, writing, debugging, GTM work, risk analysis,...
Chaos Labs reposted
Omer Goldberg
@omeragoldberg
Jun 16
Tokenmaxxing is easy when you're getting started. Modeled @chaoslabs AI usage at scale and saw a projected $150M/yr in spend. While I knew what it would cost, I had no real visibility into its impact, efficiency, and value. Sharing our journey on measuring AI spend ROI.
Chaos Labs
@chaoslabs
Jun 15
1/ Identical AI outputs conceal entirely different workflows. Equivalent outputs often rely on a wide range of data sources, assumptions, permissions, and tool actions, each carrying a distinct risk profile.
Chaos Labs
@chaoslabs
Jun 15
2/ Organizations will increasingly need to determine which AI outputs warrant action. As AI-generated work scales, how an output was derived will carry as much weight as the output itself. Without this visibility, each output will require independent validation, and every
Chaos Labs
@chaoslabs
Jun 12
Article
The Distance Between Output & Action
A year ago, most discussions around enterprise AI focused on model capabilities. Are models reliable enough to write production-level code? Can they produce research sophisticated enough to influence...
Chaos Labs
@chaoslabs
Jun 11
1/ Persistent memory reduces the need to tokenmax. Researchers found that AI agents with persistent memory completed similar workloads with ~22% lower cost, 18–28% fewer agent turns, and 11–32% faster execution.
Chaos Labs
@chaoslabs
Jun 11
2/ What this means. A meaningful share of agent computation is spent rebuilding context. Researchers found that persistent memory reduces repeated architecture discovery, file re-reading, context reconstruction, planning overhead, and exploratory dead ends. Every instance of
Chaos Labs
@chaoslabs
Jun 10
Frontier models burn millions of extra tokens on identical software engineering workloads. Sonnet 4.5 and Kimi-K2 consumed 1.5M+ more tokens vs GPT-5 on average while resolving the same SWE-bench Verified issues.
Chaos Labs
@chaoslabs
Jun 10
2/ What does this mean?
> The efficiency gap persisted even on SWE-bench issues every model solved successfully, suggesting token consumption reflects model-specific behavior
> Higher token consumption did not reliably translate into better task performance
> The same workload
Chaos Labs
@chaoslabs
Jun 9
1/ Token yield > tokenmaxxing. The highest-leverage AI workflows maximize information extracted per token; beyond a threshold, additional tokens generate diminishing returns.
Chaos Labs
@chaoslabs
Jun 9
2/ For enterprises, the implication is straightforward: scaling AI usage does not guarantee proportional gains in productivity, decision quality, or business outcomes. Bai et al. (2026) analyzed execution traces from eight frontier models on SWE-bench Verified and found a