Inspiration
Real markets are efficient because millions of participants with different priors continuously reveal private information through trading. Aumann's agreement theorem proves that rational agents with common priors cannot agree to disagree: once their posteriors are common knowledge, they must be identical. In practice, markets achieve this convergence without explicit information disclosure: prices aggregate private signals through the mechanism of trade itself.
The recent intelligence explosion has sparked a race to build a single "superforecaster," but we believe this misses a fundamental insight: ensembles of differentiated agents outperform individual oracles. As in human prediction markets, accuracy emerges from diversity: not from a single best model, but from the competitive synthesis of specialized perspectives.
Cassandra stages a two-act experience that unifies research and trading into a single transparent system. Differentiated AI traders (each reflecting distinct spheres of influence on X) compete in a live market, putting "skin in the game" by trading with real currency. The market price converges toward a consensus that aggregates public and private information, exactly as Aumann predicted, while exposing the reasoning behind each trade.
Name Inspiration: Cassandra, the Greek prophetess cursed with foresight but unable to communicate it. If only she had access to prediction markets...
What it does
Act I: The Superforecaster Room
Cassandra orchestrates a 24-agent, 4-phase research pipeline inspired by the best human forecasters at the Good Judgment Project:
Phase 1: Factor Discovery (10 agents, parallel)
- Independent discovery agents explore up to 5 factors each across economic, social, political, technical, and environmental domains
- Output: Up to 50 candidate factors for validation
Phase 2: Validation (2 agents, sequential)
- Deduplication and validation of discovered factors
- Importance scoring (1-10) and consensus selection of top 5 factors
Phase 3: Research (10 agents, parallel)
- 5 historical pattern analysts (one per top factor)
- 5 current data researchers (one per top factor)
- All run in parallel for maximum speed
Phase 4: Synthesis (1 agent)
- Combines all research into a calibrated prediction with confidence score
- Generates traceable reasoning and key factors
This pipeline produces an anchor prediction that initializes the market with sufficient liquidity, similar to how sophisticated trading firms like SIG seed real prediction markets.
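The four phases above map naturally onto an asyncio fan-out/fan-in. This is a minimal sketch of that orchestration, not our production pipeline: `discover`, `validate`, and `research` are illustrative stand-ins for the actual Grok agents, and the placeholder scoring replaces LLM calls.

```python
import asyncio
from dataclasses import dataclass

DOMAINS = ["economic", "social", "political", "technical", "environmental"]

@dataclass
class Factor:
    name: str
    domain: str
    importance: int = 0  # 1-10, assigned during validation

async def discover(agent_id: int) -> list[Factor]:
    # Phase 1 stand-in: one discovery agent returns up to 5 candidate factors.
    return [Factor(f"factor-{agent_id}-{i}", DOMAINS[i]) for i in range(5)]

async def validate(candidates: list[Factor]) -> list[Factor]:
    # Phase 2 stand-in: deduplicate by name, score importance, keep the top 5.
    unique = list({f.name: f for f in candidates}.values())
    for i, f in enumerate(unique):
        f.importance = (i % 10) + 1  # placeholder for LLM importance scoring
    return sorted(unique, key=lambda f: f.importance, reverse=True)[:5]

async def research(factor: Factor, role: str) -> str:
    # Phase 3 stand-in: one historical analyst or current-data researcher per factor.
    return f"{role} notes on {factor.name}"

async def run_pipeline() -> dict:
    # Phase 1: 10 discovery agents in parallel -> up to 50 candidates.
    batches = await asyncio.gather(*(discover(i) for i in range(10)))
    candidates = [f for batch in batches for f in batch]
    # Phase 2: sequential deduplication, scoring, and consensus selection.
    top5 = await validate(candidates)
    # Phase 3: 10 researchers (2 roles x 5 factors) in parallel.
    notes = await asyncio.gather(
        *(research(f, role) for f in top5 for role in ("historical", "current"))
    )
    # Phase 4: a single synthesis agent folds notes into an anchor prediction.
    return {"probability": 0.5, "confidence": 0.6, "evidence": list(notes)}

anchor = asyncio.run(run_pipeline())
```

The key structural property is that phases 1 and 3 are embarrassingly parallel while phase 2 is a sequential bottleneck by design: validation needs to see all candidates at once to deduplicate.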
Act II: The Trading Floor
The anchor prediction seeds a continuous market simulation with 18 differentiated agents:
- 5 Fundamental Analysts: Slow, methodical web researchers who update predictions based on deep research
- 9 X-User Personas: Reactionary agents modeled after distinct spheres of influence on X (tech VCs, crypto traders, AI researchers, etc.), each with access to filtered social signals
- 4 User-Linked Traders: Agents that reflect your own opinions and can be directly controlled
These agents:
- Post bids and asks based on their predictions and confidence
- React to each other's trades in real time
- Clear orders via an atomic market-matching algorithm
- Converge toward a consensus market price that aggregates all available information
All of this unfolds in the Exchange, a pixel-art office floor where you can read agent reasoning and observe trades occur. The market price serves as a live probability estimate that updates continuously as new information arrives, exactly as theory predicts.
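The bid/ask posting and atomic clearing described above amount to price-time-priority matching. Here is a dependency-free sketch of that mechanism; the `Book` class and its API are our illustration of the idea, not Cassandra's actual matching engine.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count

_seq = count()  # global sequence number: earlier orders win ties (time priority)

@dataclass(order=True)
class Order:
    key: float                           # -price for bids, +price for asks
    seq: int
    price: float = field(compare=False)
    qty: int = field(compare=False)
    agent: str = field(compare=False)

class Book:
    def __init__(self):
        self.bids: list[Order] = []  # max-heap on price via negated key
        self.asks: list[Order] = []  # min-heap on price

    def submit(self, agent: str, side: str, price: float, qty: int) -> list[tuple]:
        heap = self.bids if side == "buy" else self.asks
        key = -price if side == "buy" else price
        heapq.heappush(heap, Order(key, next(_seq), price, qty, agent))
        return self._clear()

    def _clear(self) -> list[tuple]:
        # Cross the book while the best bid meets the best ask.
        fills = []
        while self.bids and self.asks and self.bids[0].price >= self.asks[0].price:
            bid, ask = self.bids[0], self.asks[0]
            qty = min(bid.qty, ask.qty)
            # The resting (earlier) order sets the execution price.
            price = ask.price if ask.seq < bid.seq else bid.price
            fills.append((bid.agent, ask.agent, price, qty))
            bid.qty -= qty
            ask.qty -= qty
            if bid.qty == 0:
                heapq.heappop(self.bids)
            if ask.qty == 0:
                heapq.heappop(self.asks)
        return fills

book = Book()
book.submit("analyst", "sell", 0.62, 10)
fills = book.submit("vc_persona", "buy", 0.65, 4)
# fills == [("vc_persona", "analyst", 0.62, 4)]; 6 shares rest on the ask
```

Because `_clear` runs to completion inside each `submit`, the book is never left in a crossed state between calls, which is the single-process analogue of the atomic clearing the edge function performs.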
How we built it
Human prediction markets rely on sophisticated trading firms like SIG to initialize markets with enough liquidity for retail traders to engage with. To anchor our market, we start with a 24-agent, 4-phase research pipeline (discovery → validation → research → synthesis) grounded in the state of the art of human forecasting, which logs factors, reasoning, and predictions into Supabase tables. Those outputs seed a continuous market simulation with 18 agents (5 fundamental, 9 X-users, 4 user-linked) that receive information, make markets, and clear orders via a Supabase edge function. The backend uses FastAPI, a custom X search MCP, and a more-than-generous number of Grok agents; the frontend is Next.js 14 + Tailwind with Supabase Realtime hooks for trades, order book, and agent state, rendered in a pixel-office UI inspired by Stardew Valley.
Challenges we ran into
- Models hallucinate as context grows. We wired 24 agents together in a research framework modeled on the best human forecasters at GJP, producing an accurate anchor point that initializes our markets with sufficient liquidity.
- Human trade timing depends on confidence, latency, and new information, but LLM rollout speed introduces uncontrollable variance. We used a round-robin batching approach to keep trading equitable across models.
- To make trading meaningful, we needed realistic distinctions between our agents. We grouped people and influencers on X into distinct spheres of influence, replicating realistic social networks and providing differentiated information to our traders.
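The round-robin batching mentioned above can be sketched as follows. All names here are illustrative, and `decide` is a stand-in for an LLM trading call with variable rollout latency; the point is that every agent decides against the same price snapshot and no order applies until the whole batch returns.

```python
import asyncio
import random

async def decide(agent: str, price: float) -> dict:
    # Stand-in for an LLM trading call; sleep simulates rollout-speed variance.
    await asyncio.sleep(random.uniform(0.0, 0.05))
    return {"agent": agent, "side": random.choice(["buy", "sell"]), "limit": price}

async def trading_round(agents: list[str], price: float) -> list[dict]:
    # Every agent sees the SAME snapshot price, and nothing is applied
    # until the full batch returns: fast models cannot front-run slow
    # ones within a round.
    decisions = await asyncio.gather(*(decide(a, price) for a in agents))
    decisions = list(decisions)
    random.shuffle(decisions)  # randomize application order within the batch
    return decisions

agents = ["fundamental-1", "x-user-crypto", "user-linked-1"]
orders = asyncio.run(trading_round(agents, price=0.58))
```

Shuffling within the batch matters: without it, whatever fixed iteration order `gather` preserves would systematically advantage the same agents every round.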
Accomplishments that we're proud of
- Prediction markets require live updating. We built a prediction market platform from scratch, complete with short-selling, public order books, and a market-clearing algorithm that avoids race conditions while preventing deadlock.
- We (preliminarily) benchmarked our Superforecaster on a subset of ForecastBench, where it outperformed base Grok-4.1-Fast-Thinking.
- Forecasts are only as useful as their explanations. By segmenting its research into the same phases used by humans, our Superforecaster makes verifiable predictions with traceable logic.
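Short-selling in a binary market reduces to allowing negative share positions in the ledger. This sketch shows the accounting involved; the `Ledger` class is our illustration, not the platform's actual settlement code.

```python
from collections import defaultdict

class Ledger:
    """Tracks cash and (possibly negative) share positions per agent.

    A negative position is a short: the agent sold shares it did not
    hold and must cover them at resolution.
    """
    def __init__(self, starting_cash: float):
        self.cash = defaultdict(lambda: starting_cash)
        self.shares = defaultdict(int)

    def settle(self, buyer: str, seller: str, price: float, qty: int):
        self.cash[buyer] -= price * qty
        self.cash[seller] += price * qty
        self.shares[buyer] += qty
        self.shares[seller] -= qty  # may go negative: a short position

    def resolve(self, outcome: float) -> dict:
        # Binary market resolution: each share pays `outcome` (1.0 or 0.0).
        return {a: self.cash[a] + self.shares[a] * outcome
                for a in set(self.cash) | set(self.shares)}

ledger = Ledger(starting_cash=100.0)
ledger.settle("optimist", "skeptic", price=0.6, qty=10)
# skeptic is now short 10 shares; if the event resolves NO, the short pays off
payouts = ledger.resolve(outcome=0.0)
# payouts == {"optimist": 94.0, "skeptic": 106.0}
```

Note that cash and shares are conserved across every `settle` call, which is the invariant a clearing algorithm must preserve even under concurrent order flow.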
What we learned
- Ensembles almost always outperform individual models, but they require specific scaffolding in how they interact. While designing the Superforecaster, we dug into the state-of-the-art literature on LLM research pipelines and human forecasting approaches.
- Grok's default X Search tool generates a lot of noise. We built and optimized a custom X Search MCP and a semantic filter pipeline to make more controllable searches that isolate trading signals without polluting model context (unless the agent's sphere of influence calls for noise...)
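The semantic filter idea above can be sketched with a toy relevance score. The real pipeline sits behind our X Search MCP and uses model-based filtering; the bag-of-words cosine here is a dependency-free stand-in, and all names are illustrative.

```python
from collections import Counter
from math import sqrt

def vectorize(text: str) -> Counter:
    # Toy bag-of-words vector; the production filter uses semantic signals.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def filter_posts(posts: list[str], question: str, threshold: float = 0.2) -> list[str]:
    # Keep only posts close to the market question, so noise never
    # reaches the model context (unless a persona is meant to see it).
    q = vectorize(question)
    return [p for p in posts if cosine(vectorize(p), q) >= threshold]

posts = [
    "Fed signals rate cut likely in March after CPI print",
    "lol my cat just knocked over my coffee again",
]
signal = filter_posts(posts, "will the Fed cut rates in March")
# only the first post survives the filter
```

The threshold is the lever that personas tune: a crypto-trader persona might run with a low threshold and absorb noise by design, while a fundamental analyst keeps it high.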
What's next for Cassandra Inc.
More sanity checks. We're exploring Chain-Pattern Interrupts to avoid pattern inertia, and better noise filters (e.g. weighting informative X posts over shitposts while providing both) to guard against agent manipulation and anchor more strongly to the prior.
Built With
- css
- fastapi
- grok
- next.js
- node.js
- pnpm/npm
- react
- redis
- supabase
- tailwind
- typescript
- xapi