A Very Big Video Reasoning Suite
We bet that video reasoning will be the next fundamental intelligence paradigm after language reasoning, one in which spatiotemporal, embodied experience of the world can be captured more naturally.
identify_chinese_character
Prompt
Find and circle the Chinese character among the displayed characters. Only one character is Chinese. Draw a red circle around it.
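As a rough illustration of what a reference answer for this task could look like, the sketch below picks the single Chinese character out of a list of displayed characters. It assumes "Chinese" means a code point in the basic CJK Unified Ideographs block (U+4E00 to U+9FFF). The function names are hypothetical and are not taken from the suite's code.

```python
def is_cjk_ideograph(ch: str) -> bool:
    """True if ch is one character in the basic CJK Unified Ideographs block."""
    return len(ch) == 1 and 0x4E00 <= ord(ch) <= 0x9FFF


def find_chinese_character(chars):
    """Return the index of the only Chinese character in chars.

    Raises ValueError when the list does not contain exactly one match,
    since the prompt guarantees a single Chinese character.
    """
    hits = [i for i, ch in enumerate(chars) if is_cjk_ideograph(ch)]
    if len(hits) != 1:
        raise ValueError(f"expected exactly one Chinese character, found {len(hits)}")
    return hits[0]


if __name__ == "__main__":
    displayed = ["A", "Ω", "水", "7"]
    print(find_chinese_character(displayed))
```

The returned index identifies which displayed character the red circle should surround. Mapping it to pixel coordinates depends on the frame layout, which this sketch does not model.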
shape_outline_fill
Prompt
Complete the A:B :: C:? shape-style analogy. Show how the right shape in the second row changes its fill or outline so that it follows the same style transformation used between the first two shapes.
grid_go_through_block
Prompt
The scene shows a 10x10 grid with a green start square (containing an orange circular agent), a red end square, and multiple purple and yellow rectangular blocks. Starting from the green start square, the agent can move to adjacent cells (up, down, left, right). The goal is to move the agent to the red end square along the shortest path that passes through all purple and yellow blocks (the agent must visit every purple and yellow block before reaching the red end square).
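The path constraint in this prompt can be sketched as a small brute-force search over visiting orders. The sketch below makes several assumptions that the prompt does not confirm: each block is treated as a single cell, blocks do not obstruct movement (so the distance between two cells is the Manhattan distance), and crossing the end square early carries no penalty. The name `shortest_tour` is hypothetical.

```python
from itertools import permutations


def manhattan(a, b):
    """Number of 4-connected grid steps between cells a and b (no obstacles)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def shortest_tour(start, end, blocks):
    """Return (length, order) of the shortest start -> all blocks -> end route.

    Tries every visiting order, which is only practical for a handful of blocks.
    """
    best_len, best_order = None, None
    for order in permutations(blocks):
        stops = [start, *order, end]
        length = sum(manhattan(p, q) for p, q in zip(stops, stops[1:]))
        if best_len is None or length < best_len:
            best_len, best_order = length, list(order)
    return best_len, best_order


if __name__ == "__main__":
    # (row, col) cells on a small grid; values are illustrative only.
    print(shortest_tour((0, 0), (3, 3), [(3, 0), (0, 2)]))
```

A generated video could then be checked by comparing the number of cells the agent traverses against the returned length, under the same assumptions.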
track_object_movement
Prompt
The object marked with a green border is the only object that moves. It moves horizontally to align directly below the object with a red star at its center. Track the movement with the green border as the object moves.
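One way to picture the expected motion is as a per-frame list of positions for the green-bordered object. The sketch below assumes constant-speed (linear) motion and pixel coordinates with a fixed vertical position. Neither is stated in the prompt, and `horizontal_track` is a hypothetical helper name.

```python
def horizontal_track(obj_xy, star_x, num_frames):
    """Positions of the moving object for each frame.

    The object keeps its y coordinate and slides horizontally until its
    x coordinate matches star_x, i.e. it ends up vertically aligned with
    the starred object.
    """
    if num_frames < 2:
        raise ValueError("need at least a first and a last frame")
    x0, y = obj_xy
    step = star_x - x0
    return [(x0 + step * t / (num_frames - 1), y) for t in range(num_frames)]


if __name__ == "__main__":
    for pos in horizontal_track((10, 50), 40, 4):
        print(pos)
```

The first entry corresponds to the first frame and the last entry to the final frame. The green border would be drawn at each intermediate position.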
color_triple_intersection_red
Prompt
A Venn diagram of circles is shown. Identify the region that lies in all three of the first three circles (triple intersection) and color that region red. Do not change anything else.
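The target region in this prompt can be expressed as a simple membership test: a pixel belongs to the triple intersection when it lies inside each of the first three circles. The sketch below assumes circles are given as hypothetical `(cx, cy, r)` tuples in pixel coordinates and that boundary pixels count as inside. It is not the suite's own generator.

```python
def in_circle(x, y, circle):
    """True if point (x, y) lies inside or on the circle (cx, cy, r)."""
    cx, cy, r = circle
    return (x - cx) ** 2 + (y - cy) ** 2 <= r ** 2


def triple_intersection_pixels(circles, width, height):
    """List the (x, y) pixels that fall inside all of the first three circles.

    Any circles after the third are ignored, matching the prompt's wording.
    """
    first_three = circles[:3]
    return [
        (x, y)
        for y in range(height)
        for x in range(width)
        if all(in_circle(x, y, c) for c in first_three)
    ]


if __name__ == "__main__":
    circles = [(4, 4, 3), (6, 4, 3), (5, 6, 3), (20, 20, 1)]
    region = triple_intersection_pixels(circles, 12, 12)
    print(len(region), (5, 5) in region)
```

Coloring exactly these pixels red, and leaving every other pixel unchanged, gives a candidate final frame under the stated assumptions.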
Glass Refraction - Samples
Samples: 00, 01, 02, 03, 04
Ground Truth: First and Final frames
Model Outputs: VBVR-Wan2.2, CogVideoX 1.5, Kling 2.6, LTX-2, Runway Gen-4, Sora 2, Veo 3, Wan 2.2 I2V, Hunyuan I2V, Seedance 2.0
Leaderboard
Modality
Split
Type
Category