Generates synthetic datasets for training and evaluating vision models on chart analysis and extreme value identification tasks. Each sample contains a chart without explicit data labels where the extreme point (maximum or minimum) must be identified visually and highlighted.
Each sample pairs a task (first frame + prompt describing what needs to happen) with its ground truth solution (final frame showing the result + video demonstrating how to achieve it). This structure enables both model evaluation and training.
| Property | Value |
|---|---|
| Task ID | G-30 |
| Task | Chart Extreme Without Data |
| Category | Knowledge |
| Resolution | 1024×1024 px |
| FPS | 16 fps |
| Duration | ~3 seconds |
| Output | PNG images + MP4 video |
# 1. Clone the repository
git clone https://github.com/VBVR-DataFactory/G-30_chart_extreme_without_data_data-generator.git
cd G-30_chart_extreme_without_data_data-generator
# 2. Create and activate virtual environment
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# 3. Install dependencies
pip install --upgrade pip
pip install -r requirements.txt
pip install -e .
# Generate 50 samples
python examples/generate.py --num-samples 50
# Custom output directory
python examples/generate.py --num-samples 100 --output data/my_dataset
# Reproducible generation with seed
python examples/generate.py --num-samples 50 --seed 42
# Without videos (faster)
python examples/generate.py --num-samples 50 --no-videos

| Argument | Description |
|---|---|
| `--num-samples` | Number of tasks to generate (required) |
| `--output` | Output directory (default: `data/questions`) |
| `--seed` | Random seed for reproducibility |
| `--no-videos` | Skip video generation (images only) |
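The flags above could be declared with `argparse` roughly as follows. This is a minimal sketch, not the repository's actual parser: the function name `build_parser`, the argument types, and the `None` default for `--seed` are assumptions; only the flag names, the required status of `--num-samples`, and the `data/questions` default come from the table.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags (not the repo's code)."""
    parser = argparse.ArgumentParser(description="Generate chart-extreme samples")
    parser.add_argument("--num-samples", type=int, required=True,
                        help="Number of tasks to generate")
    parser.add_argument("--output", default="data/questions",
                        help="Output directory")
    # Assumption: an omitted seed means non-deterministic generation.
    parser.add_argument("--seed", type=int, default=None,
                        help="Random seed for reproducibility")
    parser.add_argument("--no-videos", action="store_true",
                        help="Skip video generation (images only)")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--num-samples", "50", "--seed", "42"])
    print(args.num_samples, args.output, args.seed, args.no_videos)
```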
The scene shows an area chart. Find the minimum value point and draw a red rectangular border around the corresponding point to highlight it.
| Initial Frame | Animation | Final Frame |
|---|---|---|
| Bar chart without data labels | Red border appears around longest bar | Maximum value bar highlighted |
Analyze charts without explicit data labels to identify extreme values (maximum or minimum) visually by comparing bar lengths or point positions, then highlight the target.
- Chart types: Bar charts (horizontal/vertical), line charts, pie charts, scatter plots, area charts
- Data visibility: No explicit numerical labels on data points
- Visual comparison: Must compare bar lengths or point heights visually
- Axes: Basic axes without detailed value labels
- Color palette: 30 distinct colors for chart elements
- Extreme type: Maximum (longest/highest) or minimum (shortest/lowest)
- Highlight method: Red rectangular border around target bar/point
- Background: White with minimal chart elements
- Goal: Identify correct extreme element purely by visual comparison
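The selection and highlighting logic described above can be illustrated for the vertical bar case. This is a sketch under stated assumptions, not the generator's implementation: the helper names, the 100 px margin, the 20% gap between bars, positive values, and breaking ties by taking the first index are all hypothetical choices. Only the 1024 px canvas and the max/min rule come from this document.

```python
from typing import List, Tuple


def find_extreme_index(values: List[float], extreme_type: str) -> int:
    """Return the index of the maximum or minimum value (first index on ties)."""
    if extreme_type not in ("max", "min"):
        raise ValueError(f"unknown extreme_type: {extreme_type!r}")
    pick = max if extreme_type == "max" else min
    return pick(range(len(values)), key=lambda i: values[i])


def bar_bbox(values: List[float], index: int, canvas: int = 1024,
             margin: int = 100, gap: float = 0.2) -> Tuple[int, int, int, int]:
    """Pixel box (left, top, right, bottom) of one vertical bar.

    Assumed layout: bars share the plot width equally, the tallest bar
    fills the plot height, and all values are positive.
    """
    plot = canvas - 2 * margin            # side length of the plotting area
    slot = plot / len(values)             # horizontal space per bar
    bar_w = slot * (1 - gap)              # bar width after removing the gap
    left = margin + index * slot + (slot - bar_w) / 2
    height = plot * values[index] / max(values)
    bottom = canvas - margin              # y grows downward in image space
    return (round(left), round(bottom - height), round(left + bar_w), bottom)


if __name__ == "__main__":
    values = [3, 9, 5, 1]
    target = find_extreme_index(values, "max")
    # The red border would be drawn around this box.
    print(target, bar_bbox(values, target))
```

The returned box is what a drawing library would receive as the rectangle for the red border; the same index lookup applies to the other chart types, with a different geometry for the box.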
- Visual-only data comparison (no numerical labels)
- Extreme value identification through spatial reasoning
- Multiple chart type support (horizontal bar, vertical bar, line, pie, scatter, area)
- 30 distinct colors for enhanced visual diversity
- Tests visual comparison and spatial reasoning skills
- More challenging than with-data version
- Red border annotation for highlighting
- Complete metadata with per-point color and value information
data/questions/chart_extreme_without_data_task/chart_extreme_without_data_00000000/
├── first_frame.png # Chart without data labels or highlighting
├── final_frame.png # Chart with extreme element highlighted
├── prompt.txt # Instruction to find and highlight extreme value
├── ground_truth.mp4 # Animation of highlight appearing
└── question_metadata.json # Task parameters and metadata
File specifications:
- Images: 1024×1024 PNG format
- Video: MP4 format, 16 fps
- Duration: ~3 seconds
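A generated sample directory can be sanity-checked against the layout above. The sketch below is not part of the repository: `missing_files` and `expected_frame_count` are hypothetical helpers, and it assumes that `ground_truth.mp4` is simply absent when `--no-videos` is used. The frame count is an estimate, since the duration is only given as approximately 3 seconds.

```python
from pathlib import Path
from typing import List

# File names taken from the directory listing above.
REQUIRED_FILES = ["first_frame.png", "final_frame.png",
                  "prompt.txt", "question_metadata.json"]
VIDEO_FILE = "ground_truth.mp4"


def missing_files(sample_dir: str, expect_video: bool = True) -> List[str]:
    """Return the expected file names that are not present in sample_dir."""
    names = REQUIRED_FILES + ([VIDEO_FILE] if expect_video else [])
    return [name for name in names if not (Path(sample_dir) / name).is_file()]


def expected_frame_count(fps: int = 16, seconds: float = 3.0) -> int:
    """Approximate number of video frames: 16 fps x ~3 s is about 48."""
    return round(fps * seconds)


if __name__ == "__main__":
    sample = ("data/questions/chart_extreme_without_data_task/"
              "chart_extreme_without_data_00000000")
    print("missing:", missing_files(sample))
    print("approx. frames:", expected_frame_count())
```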
Metadata structure:
Each sample includes question_metadata.json with the following parameters:
- `chart_type`: Type of chart (e.g., "horizontal_bar", "bar", "line", "pie", "scatter", "area")
- `values`: List of data values
- `data_points`: List of objects, each containing:
  - `index`: Data point index (0-based)
  - `value`: Numerical value
  - `color`: RGB color tuple `[r, g, b]` assigned to this data point
- `extreme_type`: "max" or "min"
- `target_index`: Index of the extreme value
- `target_value`: The extreme value itself
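The metadata fields are redundant enough to cross-check one another. The sketch below shows one way to do that. It is illustrative only: `check_metadata` is a hypothetical helper, the example record is made up rather than taken from a real sample, and the check assumes `data_points` is stored in the same order as `values`.

```python
import json
from typing import Any, Dict, List


def check_metadata(meta: Dict[str, Any]) -> List[str]:
    """Return a list of inconsistencies found in one metadata record."""
    problems: List[str] = []
    values = meta["values"]
    pick = {"max": max, "min": min}.get(meta["extreme_type"])
    if pick is None:
        return [f"unknown extreme_type: {meta['extreme_type']!r}"]
    if meta["target_value"] != pick(values):
        problems.append("target_value is not the extreme of values")
    idx = meta["target_index"]
    if not 0 <= idx < len(values):
        problems.append("target_index out of range")
    elif values[idx] != meta["target_value"]:
        problems.append("values[target_index] differs from target_value")
    # Assumption: data_points follows the order of values.
    for pos, point in enumerate(meta["data_points"]):
        if pos >= len(values) or point["index"] != pos or point["value"] != values[pos]:
            problems.append(f"data_points[{pos}] does not match values")
        if len(point["color"]) != 3:
            problems.append(f"data_points[{pos}] color is not [r, g, b]")
    return problems


# Made-up record for illustration; real files may contain additional fields.
example = json.loads("""
{
  "chart_type": "bar",
  "values": [12, 30, 7],
  "data_points": [
    {"index": 0, "value": 12, "color": [31, 119, 180]},
    {"index": 1, "value": 30, "color": [255, 127, 14]},
    {"index": 2, "value": 7, "color": [44, 160, 44]}
  ],
  "extreme_type": "min",
  "target_index": 2,
  "target_value": 7
}
""")

if __name__ == "__main__":
    print(check_metadata(example))
```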
logic-symbols chart-analysis visual-comparison extreme-values spatial-reasoning visual-highlighting


