Antim Labs (@AntimLabs) / X

Antim Labs

@AntimLabs

World creation layer for robotics sims

San Francisco, CA

Joined July 2025

Antim Labs
@AntimLabs
6h
We’ve been exploring how interactive 3D worlds can serve as scalable training environments for embodied agents. As a first step, we trained a VLM agent to play Elden Ring in real time. Read about it here:
Embodied Agents in Simulated Worlds
From antimlabs.com
412
Antim Labs reposted
Gokul Srinivasan
@gokul8967
May 18
Simulation is a core part of how we build and evaluate robots. I spoke about simulation, robotics, and the role of interactive worlds at AI Engineer Singapore this past weekend! Full talk: youtube.com/watch?v=_xQnSN… Huge shoutout to @SherryYanJiang @unprofeshme @agrimsingh @swyx
7.1K
Antim Labs reposted
Shrey Kothari
@Shreyko
May 16
Excited to partner with @hud_evals on their RL environments hackathon! Come build with us
hud
@hud_evals
May 16
Announcing HUD's RL environments for RSI hackathon! 🎉 Join us June 20–21 in SF if you're interested in RL and want to push the frontier forward! (w/$100,000+ in prizes and compute credits 👀)
7.4K
Antim Labs reposted
Shrey Kothari
@Shreyko
May 11
Article
Simulating the Physical World
Language models had access to cheap, already-existing data on the internet. As larger models were trained on more text, general capabilities started to emerge. Robotics has a much harder data problem....
27K
Antim Labs reposted
Shrey Kothari
@Shreyko
Apr 30
We're launching early access to Gizmo, our automated sim creation tool. From text and/or image inputs, our agent generates SimReady assets and scenes from dimensioned primitives, with correct affordances and articulation.
29K
Antim Labs
@AntimLabs
Apr 22
Replying to @AntimLabs
We'll keep evaluating frontier models as they are released, and expanding the leaderboards to smaller VLMs with better latency. VGBench is live at antimlabs.com/vgbench.
antimlabs.com
Antim Labs — Simulation infrastructure for Physical AI
Antim Labs builds Gizmo, an automated scene generation tool for robotics simulations. Train policies on infinite, diverse worlds.
1K
Antim Labs
@AntimLabs
Apr 22
Replying to @AntimLabs
The gap is clear: models show flashes of vision-driven competence, then break on navigation loops, ambiguous controls, pathfinding, or basic drag-and-drop.
1K
Antim Labs
@AntimLabs
Apr 22
Replying to @AntimLabs
Even strong models struggle across many games, especially when tasks require long-horizon planning, spatial reasoning, and goal persistence. Scores use a checkpointing system, and the best performers barely cross 5% in the Full setting.
619
Antim Labs
@AntimLabs
Apr 22
Replying to @AntimLabs
VGBench has two settings: Full = real-time play; Lite = the game pauses while the model thinks. Lite was added because inference latency is still a real bottleneck for agents in live environments.
627
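A toy sketch of why the Full/Lite distinction matters, assuming a made-up 60 fps game, 2 seconds of inference per decision, and 0.5 seconds of action execution. `ToyGame` and `run_episode` are hypothetical names for illustration only; this is not VGBench's harness or scoring code.

```python
from dataclasses import dataclass


@dataclass
class ToyGame:
    """Hypothetical stand-in for an emulator; it only counts elapsed frames."""
    fps: int = 60
    frames_elapsed: int = 0

    def advance(self, seconds: float) -> None:
        # Move the game clock forward by the given wall-clock time.
        self.frames_elapsed += round(seconds * self.fps)


def run_episode(game: ToyGame, steps: int, inference_latency_s: float,
                action_duration_s: float, pause_while_thinking: bool) -> int:
    """Return how many game frames pass over `steps` agent decisions."""
    for _ in range(steps):
        if not pause_while_thinking:
            # Full-style: the world keeps running while the model thinks.
            game.advance(inference_latency_s)
        # In both settings the chosen action takes game time to execute.
        game.advance(action_duration_s)
    return game.frames_elapsed


# Assumed numbers, chosen only to make the gap visible.
full_frames = run_episode(ToyGame(), steps=10, inference_latency_s=2.0,
                          action_duration_s=0.5, pause_while_thinking=False)
lite_frames = run_episode(ToyGame(), steps=10, inference_latency_s=2.0,
                          action_duration_s=0.5, pause_while_thinking=True)
print(full_frames, lite_frames)
```

Under these assumed numbers, most of the frames in the real-time run pass while the agent is not acting at all, which is the latency bottleneck the Lite setting is meant to factor out.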
Antim Labs
@AntimLabs
Apr 22
Replying to @AntimLabs
The setup: agents play from raw visual input plus a high-level system prompt describing the objective and controls. No game-specific scaffolding or auxiliary information is provided. This is a video of Gemini 3.1 Pro Preview playing The Legend of Zelda: Link's Awakening (sped
1K
Antim Labs
@AntimLabs
Apr 22
Replying to @AntimLabs
VGBench evaluates VLM-based agents on a suite of curated video games, scoring them on game progression through visual understanding alone. We've updated the leaderboard with the latest frontier models: @OpenAI GPT-5.4, @AnthropicAI Claude Opus 4.6, and @GoogleDeepMind Gemini 3.1
1.1K
Antim Labs
@AntimLabs
Apr 22
Replying to @AntimLabs
Video games are created to be intuitive for humans to learn and master by leveraging innate inductive biases, making them an ideal testbed for evaluating those same capabilities in VLMs.
7.1K
Antim Labs
@AntimLabs
Apr 22
We are excited to launch VideoGameBench on Antim Labs, created by @a1zhang, Thomas L. Griffiths (@cocosci_lab), @karthik_r_n, and @OfirPress at Princeton.
40K