Generates synthetic object manipulation tasks where specific objects must be removed from a scene based on various criteria such as color, shape, position, or uniqueness. All other objects remain unchanged in their original positions.
Each sample pairs a task (first frame + prompt describing what needs to happen) with its ground truth solution (final frame showing the result + video demonstrating how to achieve it). This structure enables both model evaluation and training.
| Property | Value |
|---|---|
| Task ID | O-43 |
| Task | Object Subtraction |
| Category | Abstraction |
| Resolution | 1024×1024 px |
| FPS | 16 fps |
| Duration | varies |
| Output | PNG images + MP4 video |
# Clone the repository
git clone https://github.com/VBVR-DataFactory/O-43_object_subtraction_data-generator.git
cd O-43_object_subtraction_data-generator
# Install dependencies
pip install -r requirements.txt

# Generate 100 samples
python examples/generate.py --num-samples 100
# Generate with specific seed
python examples/generate.py --num-samples 100 --seed 42
# Generate without videos
python examples/generate.py --num-samples 100 --no-videos
# Custom output directory
python examples/generate.py --num-samples 100 --output data/my_output

| Argument | Type | Description | Default |
|---|---|---|---|
| `--num-samples` | int | Number of samples to generate | 100 |
| `--seed` | int | Random seed for reproducibility | Random |
| `--output` | str | Output directory | data |
| `--no-videos` | flag | Skip video generation | False |
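
The flags in the table above map naturally onto an `argparse` parser. The sketch below is an assumption about how such a command line could be declared, not the actual contents of `examples/generate.py`. The function name `build_parser` is hypothetical, and `None` stands in for the documented "Random" seed default.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented flags (hypothetical reconstruction)."""
    parser = argparse.ArgumentParser(description="Generate object subtraction samples")
    parser.add_argument("--num-samples", type=int, default=100,
                        help="Number of samples to generate")
    # Assumption: None means "no fixed seed", matching the documented "Random" default.
    parser.add_argument("--seed", type=int, default=None,
                        help="Random seed for reproducibility")
    parser.add_argument("--output", type=str, default="data",
                        help="Output directory")
    # store_true makes the flag default to False, as in the table.
    parser.add_argument("--no-videos", action="store_true",
                        help="Skip video generation")
    return parser


# Parse an explicit argument list instead of sys.argv so the example is self-contained.
args = build_parser().parse_args(["--num-samples", "10", "--seed", "42", "--no-videos"])
print(args.num_samples, args.seed, args.output, args.no_videos)
```

Passing the same `--seed` twice should reproduce the same samples, provided the generator routes all randomness through that seed.
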
Remove the leftmost objects from the scene. Keep all other objects unchanged.
| Initial Frame | Animation | Final Frame |
|---|---|---|
| Multiple colored objects in scene | Specified objects disappearing | Remaining objects unchanged |
Remove specific objects from a scene based on selection criteria (color, shape, position, or uniqueness) while keeping all other objects stationary and unchanged.
- Object Count: 5-8 geometric objects per scene
- Object Types: Cubes, spheres, pyramids, cones
- Colors: Various colors (red, green, blue, yellow, orange, purple)
- Four Task Types:
  - Type 1: Remove by attribute (all objects of a specific color or shape)
  - Type 2: Remove specific objects (one or more named objects like "red cube")
  - Type 3: Remove by position (leftmost, rightmost, N leftmost, etc.)
  - Type 4: Remove the different one (object that looks different from others)
- Attribute-based selection: Tests understanding of color and shape properties
- Specific object identification: Requires matching exact object descriptions
- Spatial reasoning: Identifies objects by relative position
- Outlier detection: Finds objects that differ from the group
- Selective removal: Only specified objects disappear
- Position preservation: Remaining objects stay in exact original locations
- Complete elimination: Removed objects vanish entirely from the scene
- Multiple criteria types: Tests different forms of object selection logic
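
The four task types listed above can each be read as a filter over the scene's objects. The following is a minimal sketch under stated assumptions, not the generator's actual code. `SceneObject` and every function name are hypothetical, position is reduced to a single `x` coordinate, and "the different one" is taken to mean an object whose color and shape combination appears only once.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class SceneObject:
    color: str
    shape: str
    x: float  # horizontal position; smaller means further left (assumption)


def remove_by_attribute(objects, attr, value):
    """Type 1: drop every object whose `attr` ("color" or "shape") equals `value`."""
    return [o for o in objects if getattr(o, attr) != value]


def remove_specific(objects, targets):
    """Type 2: drop objects whose (color, shape) pair is in `targets`."""
    return [o for o in objects if (o.color, o.shape) not in targets]


def remove_leftmost(objects, n=1):
    """Type 3: drop the n objects with the smallest x, keeping the original order."""
    doomed = {id(o) for o in sorted(objects, key=lambda o: o.x)[:n]}
    return [o for o in objects if id(o) not in doomed]


def remove_different(objects):
    """Type 4: drop any object whose (color, shape) pair occurs only once."""
    counts = Counter((o.color, o.shape) for o in objects)
    return [o for o in objects if counts[(o.color, o.shape)] > 1]


scene = [
    SceneObject("red", "cube", 100),
    SceneObject("blue", "sphere", 300),
    SceneObject("red", "sphere", 500),
    SceneObject("green", "cone", 700),
    SceneObject("blue", "cube", 900),
]

# The surviving objects keep their coordinates, mirroring the
# "position preservation" property described above.
for kept in remove_leftmost(scene, n=2):
    print(kept)
```

Each function returns a new list and never mutates the remaining objects, so the untouched objects keep their exact original positions.
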
data/questions/object_subtraction_task/object_subtraction_00000000/
├── first_frame.png # Initial state (all objects present)
├── final_frame.png # Final state (specified objects removed)
├── prompt.txt # Task instructions with selection criteria
├── ground_truth.mp4 # Solution video (16 fps)
└── question_metadata.json # Task metadata
File specifications: Images are 1024×1024 PNG. Videos are MP4 at 16 fps; duration varies with the number of objects removed.
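
A consumer of the dataset might load one sample directory along these lines. This is a sketch assuming only the file names shown in the layout above. The helper `load_sample` is hypothetical, the metadata is returned as a plain dictionary because its schema is not documented here, and the `require_video` switch assumes that runs with `--no-videos` simply omit `ground_truth.mp4`.

```python
import json
from pathlib import Path

# File names taken from the documented sample layout.
EXPECTED_FILES = (
    "first_frame.png",
    "final_frame.png",
    "prompt.txt",
    "ground_truth.mp4",
    "question_metadata.json",
)


def load_sample(sample_dir, require_video=True):
    """Check that a sample directory is complete and return its contents.

    Image and video entries are returned as paths rather than decoded data,
    so this helper needs nothing beyond the standard library.
    """
    sample_dir = Path(sample_dir)
    required = [f for f in EXPECTED_FILES if require_video or f != "ground_truth.mp4"]
    missing = [f for f in required if not (sample_dir / f).is_file()]
    if missing:
        raise FileNotFoundError(f"{sample_dir} is missing: {', '.join(missing)}")

    metadata_text = (sample_dir / "question_metadata.json").read_text(encoding="utf-8")
    return {
        "prompt": (sample_dir / "prompt.txt").read_text(encoding="utf-8").strip(),
        "metadata": json.loads(metadata_text),
        "first_frame": sample_dir / "first_frame.png",
        "final_frame": sample_dir / "final_frame.png",
        "video": (sample_dir / "ground_truth.mp4") if require_video else None,
    }
```

For example, `load_sample("data/questions/object_subtraction_task/object_subtraction_00000000")` would return the prompt text and metadata for the first sample, or raise `FileNotFoundError` naming whichever files are absent.
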
object-manipulation selective-removal attribute-matching spatial-selection outlier-detection object-identification scene-editing


