Generates synthetic datasets for training and evaluating vision models on constrained path planning tasks. Each sample contains a grid where an agent must visit all colored blocks (blue, yellow, pink, or purple) before reaching the red endpoint, requiring optimal route planning under collection constraints.
Each sample pairs a task (first frame + prompt describing what needs to happen) with its ground truth solution (final frame showing the result + video demonstrating how to achieve it). This structure enables both model evaluation and training.
| Property | Value |
|---|---|
| Task ID | G-16 |
| Task | Grid Go Through Block |
| Category | Spatiality |
| Resolution | 1024×1024 px |
| FPS | 16 fps |
| Duration | ~3-5 seconds |
| Output | PNG images + MP4 video |
# 1. Clone the repository
git clone https://github.com/VBVR-DataFactory/G-16_grid_go_through_block_data-generator.git
cd G-16_grid_go_through_block_data-generator
# 2. Create and activate virtual environment
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# 3. Install dependencies
pip install --upgrade pip
pip install -r requirements.txt
pip install -e .

# Generate 50 samples
python examples/generate.py --num-samples 50
# Custom output directory
python examples/generate.py --num-samples 100 --output data/my_dataset
# Reproducible generation with seed
python examples/generate.py --num-samples 50 --seed 42
# Without videos (faster)
python examples/generate.py --num-samples 50 --no-videos

| Argument | Description |
|---|---|
| `--num-samples` | Number of tasks to generate (required) |
| `--output` | Output directory (default: `data/questions`) |
| `--seed` | Random seed for reproducibility |
| `--no-videos` | Skip video generation (images only) |
The scene shows a 10x10 grid with a green start square (containing an orange circular agent), a red end square, and multiple blue and pink rectangular blocks. Starting from the green start square, the agent can move to adjacent cells (up, down, left, right). The goal is to move the agent to the red end square along the shortest path that passes through all blue and pink blocks (the agent must visit every blue and pink block before reaching the red end square).
Note: the color names in the prompt are filled in per sample. The `{color}` placeholder in the prompt template is replaced with the actual block colors (e.g., "blue", "yellow", "pink", "purple", or combinations like "blue, yellow and pink").
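The color substitution described in the note can be sketched as follows. This is a minimal illustration, not the generator's actual code: `format_colors`, `render_prompt`, and the `TEMPLATE` fragment are hypothetical names, and the joining rule (commas, then "and" before the last color) is inferred from the examples in the note.

```python
# Hypothetical template fragment; the real prompt template lives in the generator.
TEMPLATE = "the shortest path that passes through all {color} blocks"


def format_colors(colors):
    """Join color names as in the note: 'blue', 'blue and pink', 'blue, yellow and pink'."""
    # Drop duplicates while keeping first-seen order, since several
    # blocks may share a color.
    unique = list(dict.fromkeys(colors))
    if not unique:
        raise ValueError("at least one color is required")
    if len(unique) == 1:
        return unique[0]
    return ", ".join(unique[:-1]) + " and " + unique[-1]


def render_prompt(colors):
    """Fill the {color} placeholder with the formatted color list."""
    return TEMPLATE.format(color=format_colors(colors))
```

Deduplicating before joining is an assumption about the generator's behavior: it keeps the prompt readable when two of the three blocks receive the same color.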
| Initial Frame | Animation | Final Frame |
|---|---|---|
| Agent at start, 3 colored blocks scattered on grid | Agent visits all blocks in optimal order | Agent at red endpoint, all blocks visited |
Navigate from green start to red end point while visiting all colored blocks, finding the shortest path that satisfies the collection constraint.
- Grid: 10×10 grid of cells
- Start point: Green filled cell with orange agent
- End point: Red filled cell (final destination)
- Colored blocks: 3 blocks (blue, yellow, pink, or purple) that must be collected before reaching end
- Agent: Orange circular character
- Movement: Can move up, down, left, right to adjacent cells
- Background: White grid with black borders
- Goal: Visit all colored blocks via shortest path, then reach red endpoint
- Collection constraint: must visit all colored blocks before endpoint
- Path optimization with constraint satisfaction (a traveling salesman problem variant with fixed start and end points)
- Dynamic programming to find optimal visitation order
- Manhattan distance-based pathfinding
- Clear visual markers (green=start, red=end, colored blocks=collectibles)
- Grid-based movement (no diagonal)
- Multiple block colors: blue, yellow, pink, purple (randomly assigned per block)
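The dynamic-programming search over visitation orders listed above can be sketched as a bitmask (Held-Karp style) recurrence over Manhattan distances. This is an illustration under stated assumptions, not the repository's implementation: `manhattan` and `best_visit_order` are hypothetical names, cells are `(row, col)` tuples, and the grid is assumed to be obstacle-free so that the shortest walk between two cells equals their Manhattan distance. It also does not check whether a straight-line leg happens to cross the end cell early.

```python
def manhattan(a, b):
    """Number of up/down/left/right moves between two cells on an open grid."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def best_visit_order(start, blocks, end):
    """Return (total_moves, order), where order lists block indices in visit order."""
    n = len(blocks)
    if n == 0:
        return manhattan(start, end), []
    full = (1 << n) - 1
    inf = float("inf")
    # cost[mask][j]: fewest moves to leave start, visit exactly the blocks
    # in mask, and currently stand on block j.
    cost = [[inf] * n for _ in range(1 << n)]
    parent = [[None] * n for _ in range(1 << n)]
    for j in range(n):
        cost[1 << j][j] = manhattan(start, blocks[j])
    # Masks only grow when a block is added, so increasing order is safe.
    for mask in range(1, full + 1):
        for j in range(n):
            if cost[mask][j] == inf:
                continue
            for k in range(n):
                if mask & (1 << k):
                    continue  # block k already visited
                new_mask = mask | (1 << k)
                cand = cost[mask][j] + manhattan(blocks[j], blocks[k])
                if cand < cost[new_mask][k]:
                    cost[new_mask][k] = cand
                    parent[new_mask][k] = j
    # Add the final leg to the end cell and pick the best last block.
    last = min(range(n), key=lambda j: cost[full][j] + manhattan(blocks[j], end))
    total = cost[full][last] + manhattan(blocks[last], end)
    # Walk the parent pointers back to recover the order.
    order, mask, j = [], full, last
    while j is not None:
        order.append(j)
        prev = parent[mask][j]
        mask ^= 1 << j
        j = prev
    order.reverse()
    return total, order


# Example: blocks lie on one row, so the nearer block (index 1) is visited first.
total, order = best_visit_order((0, 0), [(0, 6), (0, 3)], (0, 9))
```

With only 3 blocks per sample, trying all 6 permutations would work just as well. The subset recurrence is shown because it matches the dynamic-programming approach named in the list and scales to more blocks.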
data/questions/grid_go_through_block_task/grid_go_through_block_00000000/
├── first_frame.png # Initial grid with colored blocks
├── final_frame.png # Agent at end, all blocks collected
├── prompt.txt # Collection task instruction (with dynamic color names)
├── ground_truth.mp4 # Animation of optimal collection path
└── question_metadata.json # Task metadata
File specifications:
- Images: 1024×1024 PNG format
- Video: MP4 format, 16 fps
- Duration: ~3-5 seconds (varies by path length)
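A generated sample directory can be sanity-checked against the layout above with a few lines of Python. This is a sketch, not part of the repository: `missing_files` is a hypothetical helper, and the `require_video` switch assumes that runs with `--no-videos` simply omit `ground_truth.mp4`.

```python
from pathlib import Path

# File names taken from the directory listing above.
EXPECTED_FILES = (
    "first_frame.png",
    "final_frame.png",
    "prompt.txt",
    "ground_truth.mp4",
    "question_metadata.json",
)


def missing_files(sample_dir, require_video=True):
    """Return the expected file names that are absent from sample_dir."""
    sample_dir = Path(sample_dir)
    wanted = [
        name for name in EXPECTED_FILES
        if require_video or name != "ground_truth.mp4"
    ]
    return [name for name in wanted if not (sample_dir / name).is_file()]
```

An empty list means the sample is complete. Pass `require_video=False` for datasets generated without videos.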
spatiality path-planning constraint-satisfaction collection-task grid-navigation optimization


