A Very Big Video Reasoning Suite

We bet on a future in which video reasoning is the next fundamental intelligence paradigm after language reasoning, one where spatiotemporal, embodied world experience can be captured more naturally.

ball_bounces_given_time
Knowledge in-domain testset
A ball is placed at the initial position with a direction arrow indicating its movement direction. Simulate the ball bouncing 2 times off the boundary walls following elastic collision physics (angle of incidence equals angle of reflection). The ball stops after the 2nd bounce, with its final position at the wall where the last collision occurs.
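The expected trajectory for a task like this can be sketched in a few lines. This is only an illustration, not the suite's generator: the function name `simulate_bounces`, the axis-aligned box `[0, width] x [0, height]`, the point-sized ball, and the start position strictly inside the box are all assumptions made for the example.

```python
def simulate_bounces(pos, direction, width, height, n_bounces):
    """Return the wall-collision points of a point ball in [0, width] x [0, height].

    Hypothetical helper: assumes `pos` is strictly inside the box and
    `direction` is a non-zero (dx, dy) vector.
    """
    x, y = pos
    dx, dy = direction
    if dx == 0 and dy == 0:
        raise ValueError("direction must be non-zero")
    hits = []
    for _ in range(n_bounces):
        # Travel time to the vertical and horizontal walls the ball is heading for.
        tx = (width - x) / dx if dx > 0 else (-x / dx if dx < 0 else float("inf"))
        ty = (height - y) / dy if dy > 0 else (-y / dy if dy < 0 else float("inf"))
        t = min(tx, ty)
        x, y = x + dx * t, y + dy * t
        # Elastic reflection: flip the velocity component normal to the wall hit.
        # A corner hit flips both components and is counted as one bounce here.
        if tx <= ty:
            dx = -dx
        if ty <= tx:
            dy = -dy
        hits.append((x, y))
    return hits


# The ball stops at the last collision point.
print(simulate_bounces((2, 2), (1, 1), width=10, height=6, n_bounces=2))
```

With these inputs the ball first hits the top wall and then the right wall, so the final position is the second entry of the returned list.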
symmetry_completion
Abstraction out-of-domain testset
Complete this pattern by filling in the missing cells on the right side. The left half shows a pattern that should be mirrored to create a symmetric pattern. Mirror the left half across the vertical center line to complete the symmetric pattern. Keep the camera view fixed in the top-down perspective and maintain all existing cells unchanged. Stop the video when the pattern is complete.
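The mirroring rule in this prompt is simple to state in code. The sketch below assumes the pattern is a list of rows and that only the left half matters; the name `complete_symmetry` and the use of `None` for missing cells are hypothetical choices for illustration.

```python
def complete_symmetry(grid):
    """Mirror the left half of each row across the vertical center line.

    Hypothetical helper: `grid` is a list of equal-length rows; cells in the
    right half are overwritten, cells in the left half are kept unchanged.
    """
    width = len(grid[0])
    half = width // 2
    completed = []
    for row in grid:
        left = row[:half]
        middle = row[half:width - half]  # the center column when width is odd
        completed.append(left + middle + left[::-1])
    return completed


pattern = [
    ["R", "G", None, None],
    ["B", "B", None, None],
]
print(complete_symmetry(pattern))  # [['R', 'G', 'G', 'R'], ['B', 'B', 'B', 'B']]
```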
grid_number_sequence
Spatiality in-domain testset
The scene shows a 10x10 grid with a green start point, a red end point, and yellow cells marked with numbers 1, 2, and 3. An orange circular agent is positioned at the green start point. The agent can move to adjacent cells (up, down, left, right). Starting from the green start point, the agent must visit the numbered yellow cells in numerical order (1, then 2, then 3), taking the shortest path between each consecutive pair of numbered cells. The agent is allowed to pass through the red end point when visiting the numbered cells if needed. After visiting all numbered cells in sequence, the agent must reach the red end point, also following the shortest path.
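One way to compute a reference route for this kind of task is to chain breadth-first searches between consecutive stops. The sketch below is an illustration under stated assumptions: the grid has no obstacles (the prompt mentions none), cells are `(row, column)` tuples inside the grid, and the names `shortest_path` and `plan_route` are hypothetical.

```python
from collections import deque


def shortest_path(start, goal, size=10):
    """BFS on an obstacle-free size x size grid with 4-neighbor moves."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            break
        r, c = cell
        for nxt in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nxt[0] < size and 0 <= nxt[1] < size and nxt not in parents:
                parents[nxt] = cell
                queue.append(nxt)
    # Walk back from the goal to the start to recover the path.
    path = []
    cell = goal
    while cell is not None:
        path.append(cell)
        cell = parents[cell]
    return path[::-1]


def plan_route(start, numbered, end, size=10):
    """Visit the numbered cells in order, then the end cell, leg by leg."""
    stops = [start, *numbered, end]
    route = [start]
    for a, b in zip(stops, stops[1:]):
        route.extend(shortest_path(a, b, size)[1:])  # skip the repeated stop
    return route


route = plan_route((0, 0), [(0, 2), (2, 2), (2, 0)], (4, 0))
print(len(route) - 1)  # number of moves: 8
```

Because each leg is planned independently, the route may pass through the end cell early, which the prompt explicitly allows.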
rotation_puzzle
Transformation in-domain testset
Solve this rotation puzzle by rotating the four squares to connect the pipe paths. Each square can be rotated 90 degrees clockwise or counterclockwise. Rotate the squares so that all pipe paths connect to form a continuous path. Keep the camera view fixed in the top-down perspective and maintain all square positions unchanged. Stop the video when all pipes are connected and the puzzle is solved.
pigment_color_mixing_subtractive
Perception out-of-domain testset
The scene has two pigment colors positioned on the left and right sides, and a mixing zone marked by a black rectangular border in the center. In subtractive color mixing (pigment/paint mixing), when two pigments combine, convert RGB to CMY, add CMY components, then convert back: convert RGB to CMY (CMY = 255 - RGB), mix in CMY space (result_CMY = min(CMY1 + CMY2, 255)), convert back to RGB (RGB = 255 - CMY_result). First identify the RGB values of the left pigment (an RGB(69, 238, 140) colored pigment) and the right pigment (an RGB(47, 80, 187) colored pigment), then calculate the mixed color using the CMY conversion process. Fill the black-bordered mixing zone in the center with the resulting mixed color and show the full calculation process step by step.
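The calculation described in this prompt can be checked directly. The sketch below follows the three steps exactly as the prompt states them (RGB to CMY, clamped addition in CMY, CMY back to RGB); only the function name `mix_subtractive` is an assumption.

```python
def mix_subtractive(rgb1, rgb2):
    """Mix two pigments using the CMY rule given in the prompt."""
    # Step 1: convert RGB to CMY (CMY = 255 - RGB).
    cmy1 = tuple(255 - v for v in rgb1)
    cmy2 = tuple(255 - v for v in rgb2)
    # Step 2: add the CMY components, clamping each channel at 255.
    mixed_cmy = tuple(min(a + b, 255) for a, b in zip(cmy1, cmy2))
    # Step 3: convert back to RGB (RGB = 255 - CMY).
    return tuple(255 - v for v in mixed_cmy)


# Left pigment RGB(69, 238, 140), right pigment RGB(47, 80, 187):
#   CMY1 = (186, 17, 115), CMY2 = (208, 175, 68)
#   sum  = (394, 192, 183) -> clamped to (255, 192, 183)
print(mix_subtractive((69, 238, 140), (47, 80, 187)))  # (0, 63, 72)
```

For the two pigments in this sample, the mixing zone should therefore be filled with RGB(0, 63, 72).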

Inference Results

Task Domains
Domino Chain Gap Analysis
Knowledge in-domain testset
Shape Outline Fill
Abstraction in-domain testset
LEGO Construction
Spatiality in-domain testset
Shape Sorter
Transformation out-of-domain testset
Connecting Color
Perception out-of-domain testset
Model Outputs
VBVR-Wan2.2
CogVideoX 1.5
Kling 2.6
LTX-2
Runway Gen-4
Sora 2
Veo 3
Wan 2.2 I2V
Hunyuan I2V

Leaderboard

2026-04-14 13 models