VeriGraph: Scene Graphs for Execution Verifiable Robot Planning

This repository contains the code for the paper "VeriGraph: Scene Graphs for Execution Verifiable Robot Planning" (ICRA 2026).

This code is released under the MIT License. See LICENSE for details.

Project Page | arXiv

Overview

VeriGraph is a framework that integrates Vision-Language Models (VLMs) for robotic planning while verifying action feasibility. It uses scene graphs as an intermediate representation to capture key objects and spatial relationships, enabling reliable plan verification and refinement.

Key features:

Scene graph generation from images using VLMs (GPT-4V, Gemini, LLaVA)
Iterative task planning with constraint validation
Support for language-based and image-based goal specification

Models. The paper reports GPT-4V for scene graph generation and GPT-4 for planning. This codebase defaults to gpt-4o for both (vlm_model, llm_model).

Iterative planner hyperparameters (Algorithm 1). The implementation counts consecutive constraint failures and resets the count to zero after a successful step, matching the paper. Defaults are error_threshold=5 and num_steps_per_call=3.
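The failure counting described above can be illustrated with a minimal sketch. It is not the repository's implementation: the callables propose, is_feasible, and is_goal_reached and the max_calls bound are hypothetical stand-ins, and stopping once the counter reaches error_threshold is an assumption about how the threshold is used.

```python
from typing import Callable, List, Sequence


def run_iterative_planning(
    propose: Callable[[List[str], int], Sequence[str]],
    is_feasible: Callable[[str], bool],
    is_goal_reached: Callable[[List[str]], bool],
    error_threshold: int = 5,
    num_steps_per_call: int = 3,
    max_calls: int = 50,  # hypothetical safety bound, not a documented setting
) -> List[str]:
    """Accumulate verified actions until the goal, the bound, or too many failures."""
    plan: List[str] = []
    consecutive_failures = 0
    for _ in range(max_calls):
        if is_goal_reached(plan):
            break
        # Ask the planner for up to num_steps_per_call new actions.
        for action in propose(plan, num_steps_per_call):
            if is_feasible(action):
                plan.append(action)
                consecutive_failures = 0  # a successful step resets the count
            else:
                consecutive_failures += 1
                break  # discard the rest of the batch and re-plan
        if consecutive_failures >= error_threshold:
            break  # assumed stopping rule
    return plan
```

Because the counter resets on every verified step, a planner that alternates between valid and invalid actions never reaches the threshold; only an unbroken run of failures ends the loop early.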

Scene graph generation (Table I). Run eval_sgg.py with sgg.scene=blocks|kitchen|tableware and sgg.api=gpt|google|ollama. The script uses the same prompt and global object list as in the paper.

Installation

Install the verigraph package in editable mode from the repository root (recommended):

git clone https://github.com/daniekpo/verigraph.git
cd verigraph
conda create -n verigraph python=3.10 -y
conda activate verigraph
pip install -e .

Alternatively, install dependencies only:

pip install -r requirements.txt

System Dependencies

Graphviz and pygraphviz are required only when graph image plotting is enabled (visualization.plot_graph_images=true, the default):

# Ubuntu/Debian
sudo apt-get install graphviz graphviz-dev

# macOS
brew install graphviz

pip install pygraphviz

API Keys

Create a .env file in the project root with your API keys:

OPENAI_API_KEY=your_openai_api_key
GOOGLE_API_KEY=your_google_api_key

Use OPENAI_API_KEY when running with OpenAI-backed models. Use GOOGLE_API_KEY when running with Gemini (sgg.api=google for scene-graph evaluation, or models.api=google for planning). Set sgg.model or models.llm_model to a Gemini model name when using Gemini; if the default OpenAI model name is left in place, the Gemini client falls back to its default model. Use local Ollama models without a cloud API key by setting sgg.api=ollama or models.api=ollama as appropriate. The Ollama client connects to localhost:11434.
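The mapping from API option to required key can be checked before launching a run. The helper below is a hypothetical sketch, not part of the package; it only inspects environment variables, so it assumes the values from .env have already been loaded into the process environment by whatever mechanism you use.

```python
import os
from typing import Mapping, Optional

# API option -> environment variable it needs (None: no cloud key required).
API_KEY_VARS = {
    "gpt": "OPENAI_API_KEY",
    "google": "GOOGLE_API_KEY",
    "ollama": None,
}


def missing_api_key(api: str, env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the name of the required but unset variable, or None if nothing is missing."""
    if api not in API_KEY_VARS:
        raise ValueError(f"unknown api option: {api!r}")
    env = os.environ if env is None else env
    variable = API_KEY_VARS[api]
    if variable is None or env.get(variable):
        return None
    return variable
```

For example, missing_api_key("google") returns "GOOGLE_API_KEY" when that variable is unset or empty, and None otherwise.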

Data Layout

To run the code, place your data under data/ following the structure below, or point the config paths at an equivalent directory tree.

data
|_ global_objects.txt
|_ scene_graphs
|  |_ gt_sg.json
|_ tasks
|  |_ all_tasks.json
|_ scenes
|  |_ blocks
|  |  |_ <scene_name>.jpeg
|  |_ kitchen
|  |  |_ <scene_name>.jpeg
|  |_ tableware
|  |  |_ <scene_name>.jpeg
|  |_ demo
|     |_ <scene_name>.jpeg

Notes:

A scene key such as blocks/blocks_2 maps to data/scenes/blocks/blocks_2.jpeg when present, otherwise to data/scenes/blocks/blocks_2_side.jpeg.
The same scene key must be used in scene_graphs/gt_sg.json.
Task entries reference those same keys through initial and, when applicable, goal.
data/scene_graphs/cached.json is optional. If present, iterative planning reuses cached predicted initial scene graphs; otherwise it generates them on demand.
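The scene-key lookup rule in the first note can be sketched as follows. The function name resolve_scene_image and its scenes_root parameter are hypothetical; the package may resolve paths differently internally.

```python
from pathlib import Path


def resolve_scene_image(scene_key: str, scenes_root: Path = Path("data/scenes")) -> Path:
    """Map a key such as 'blocks/blocks_2' to its image path.

    Prefers <key>.jpeg and falls back to <key>_side.jpeg when the
    primary file does not exist.
    """
    primary = scenes_root / f"{scene_key}.jpeg"
    if primary.exists():
        return primary
    return scenes_root / f"{scene_key}_side.jpeg"
```

The fallback path is returned without checking that it exists, so callers that need a hard guarantee should test the result themselves.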

Required file contents:

// data/scene_graphs/gt_sg.json
{
  "<scene_type>/<scene_name>": {
    "nodes": ["object_a", "object_b", "table"],
    "relations": [
      "object_a, on, table",
      "object_b, in, object_a"
    ]
  }
}
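Each relation string in the format above splits into a (subject, predicate, object) triple. The sketch below is a hypothetical helper for sanity-checking a ground-truth entry; requiring both endpoints to appear in nodes is an assumption made here, not a documented rule.

```python
from typing import Dict, List, Tuple


def parse_relations(graph: Dict[str, List[str]]) -> List[Tuple[str, str, str]]:
    """Split 'subject, predicate, object' strings into triples."""
    nodes = set(graph["nodes"])
    triples = []
    for relation in graph["relations"]:
        parts = [part.strip() for part in relation.split(",")]
        if len(parts) != 3:
            raise ValueError(f"expected three comma-separated fields: {relation!r}")
        subject, predicate, obj = parts
        # Assumed consistency check: both endpoints are declared nodes.
        for name in (subject, obj):
            if name not in nodes:
                raise ValueError(f"{name!r} in {relation!r} is not listed in nodes")
        triples.append((subject, predicate, obj))
    return triples
```

Running it over every entry of gt_sg.json before an evaluation catches malformed relation strings early.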

// data/tasks/all_tasks.json
[
  {
    "id": 0,
    "initial": "<scene_type>/<scene_name>",
    "task_type": "rearrange",
    "instruction": "Natural-language task instruction",
    "difficulty": "easy",
    "goal": "<scene_type>/<other_scene_name>"
  }
]

For stacking tasks, omit goal. Valid task_type values in this release are rearrange, language, stacking, and kitchen_match.
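A task file can be linted against these rules before a run. This is a hypothetical sketch rather than a check the package is known to perform; in particular, reporting a goal on a stacking task simply mirrors the instruction above and is not confirmed behavior of the code.

```python
from typing import Any, Dict, List, Set

VALID_TASK_TYPES = {"rearrange", "language", "stacking", "kitchen_match"}


def check_task(task: Dict[str, Any], scene_keys: Set[str]) -> List[str]:
    """Return a list of human-readable problems; empty means the entry looks fine."""
    problems = []
    task_type = task.get("task_type")
    if task_type not in VALID_TASK_TYPES:
        problems.append(f"unknown task_type: {task_type!r}")
    if task.get("initial") not in scene_keys:
        problems.append(f"initial scene not in gt_sg.json: {task.get('initial')!r}")
    goal = task.get("goal")
    if goal is not None:
        if task_type == "stacking":
            problems.append("stacking tasks should omit goal")
        elif goal not in scene_keys:
            problems.append(f"goal scene not in gt_sg.json: {goal!r}")
    return problems
```

Pass the keys of gt_sg.json as scene_keys so that initial and goal are checked against the same scene keys used elsewhere in the data layout.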

# data/global_objects.txt
object_a, object_b, table, bowl, cup

Usage

Scene Graph Generation Evaluation

Evaluate VLM-based scene graph generation against ground truth. Populate data/ first, then run from the repository root. Settings live in verigraph/config/config.yaml under paths and sgg (override on the CLI).

python -m verigraph.scripts.eval_sgg sgg.scene=tableware sgg.api=gpt
python -m verigraph.scripts.eval_sgg sgg.scene=kitchen sgg.api=google
python -m verigraph.scripts.eval_sgg sgg.scene=blocks sgg.api=gpt sgg.model=gpt-4o

Set sgg.model=<model_name> to use a different VLM. API options are gpt, google, and ollama. Set visualization.plot_graph_images=false to skip saving scene-graph image renderings on systems without Graphviz/pygraphviz.
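To sweep several scenes or APIs, the command lines above can be assembled programmatically. The helper sgg_command below is a hypothetical convenience wrapper that only builds the argument list from the overrides documented in this section; it does not run anything itself.

```python
import sys
from typing import List

SCENES = ("blocks", "kitchen", "tableware")
APIS = ("gpt", "google", "ollama")


def sgg_command(scene: str, api: str, model: str = "", plot: bool = True) -> List[str]:
    """Build the argv for one eval_sgg run using the documented overrides."""
    if scene not in SCENES:
        raise ValueError(f"unknown scene: {scene!r}")
    if api not in APIS:
        raise ValueError(f"unknown api: {api!r}")
    command = [
        sys.executable, "-m", "verigraph.scripts.eval_sgg",
        f"sgg.scene={scene}", f"sgg.api={api}",
    ]
    if model:
        command.append(f"sgg.model={model}")
    if not plot:
        # Skip graph renderings on machines without Graphviz/pygraphviz.
        command.append("visualization.plot_graph_images=false")
    return command
```

Each returned list can be passed to subprocess.run(command, check=True) from the repository root.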

Planning Evaluation

Run from the repository root so paths like data/scenes/ resolve correctly.

One-shot planning ("Ours (Direct)" in the paper) uses eval_planning.py:

python -m verigraph.scripts.eval_planning task.task_name=rearrange
python -m verigraph.scripts.eval_planning task.task_name=language
python -m verigraph.scripts.eval_planning task.task_name=stacking
python -m verigraph.scripts.eval_planning task.task_name=kitchen_match

Examples of overrides: models.api=gpt, models.vlm_model=gpt-4o, models.llm_model=gpt-4o, task.n_tasks=10, task.shuffle=true, task.tasks_file=data/tasks/all_tasks.json.

Iterative planning (VeriGraph's main method):

python -m verigraph.scripts.eval_iterative_planning task.task_name=rearrange
python -m verigraph.scripts.eval_iterative_planning task.task_name=language
python -m verigraph.scripts.eval_iterative_planning task.task_name=stacking
python -m verigraph.scripts.eval_iterative_planning task.task_name=kitchen_match

To run iterative planning with online robot execution after each newly verified action batch:

python -m verigraph.scripts.eval_iterative_planning \
  task.task_name=language \
  execution.online=true \
  execution.robot_server_address=tcp://ROBOT_IP:5555

When a task finishes with valid_solution=true, the planning scripts also write a plain-text verified plan file under the run directory:

results/.../plans/task_<task_id>_successful_plan.txt

Baseline Evaluations

SayCan baseline (uses models.llm_model only; no images):

python -m verigraph.scripts.eval_saycan task.task_name=language

ViLA baseline (uses models.vlm_model for vision-language planning):

python -m verigraph.scripts.eval_vila task.task_name=rearrange
python -m verigraph.scripts.eval_vila task.task_name=language

Real-World Robot Execution

VeriGraph supports real-world execution through a ZeroMQ-based client-server architecture. The released code includes the client that sends high-level commands to a robot execution server.

The paper’s deployed system additionally used perception and grasp components (e.g., LangSAM-style segmentation, AnyGrasp) and calibrated cameras; those are not part of this repository due to copyright restrictions. If you have compatible robot code, you only need a process that speaks the JSON/ZMQ protocol below.

Architecture

The planning pipeline generates high-level actions (e.g., move(object, source, destination)). These are sent to a robot server via the ZeroMQ REQ/REP pattern:

Planning Pipeline  -->  ZMQ Client  -->  Robot Server
                                              |
                                    Perception + Grasp Planning
                                              |
                                        Robot Execution

Running Robot Execution

1. Start your robot execution server (it must implement the ZMQ REP interface).
2. Choose one of the execution modes below.

Offline execution from a saved results JSON

python -m verigraph.scripts.execute_plan \
  execution.plan_path=path/to/results.json \
  execution.task_id=1 \
  execution.robot_server_address=tcp://ROBOT_IP:5555

Offline execution from a plain text verified plan

python -m verigraph.scripts.execute_plan \
  execution.plan_path=path/to/task_1_successful_plan.txt \
  execution.robot_server_address=tcp://ROBOT_IP:5555

When execution.plan_path points to a results.json file with multiple task entries, set execution.task_id to choose which verified plan to execute.

Robot Server Interface

The robot server must implement a ZMQ REP socket that handles JSON messages:

pick_and_place:

{"command": "pick_and_place", "target": "apple", "destination": "bowl", "relationship": "in"}

go_home:

{"command": "go_home"}

get_image:

{"command": "get_image"}

All responses should be JSON with at least {"success": true/false}. On failure, include {"success": false, "error": "description"}.
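A server satisfying this interface could be structured roughly as below. This is a minimal sketch assuming pyzmq is installed; handle_request and serve are hypothetical names, the robot-specific work is left as a placeholder comment, and get_image returns an error because the response payload for images is not specified here.

```python
from typing import Any, Dict

REQUIRED_PICK_FIELDS = ("target", "destination", "relationship")


def handle_request(message: Dict[str, Any]) -> Dict[str, Any]:
    """Dispatch one JSON request and return a JSON-serializable reply."""
    command = message.get("command")
    if command == "pick_and_place":
        missing = [name for name in REQUIRED_PICK_FIELDS if name not in message]
        if missing:
            return {"success": False, "error": f"missing fields: {', '.join(missing)}"}
        # Placeholder: call your own perception and grasp code here.
        return {"success": True}
    if command == "go_home":
        # Placeholder: move the arm to its home pose here.
        return {"success": True}
    if command == "get_image":
        return {"success": False, "error": "get_image is not implemented in this sketch"}
    return {"success": False, "error": f"unknown command: {command!r}"}


def serve(bind_address: str = "tcp://*:5555") -> None:
    """Run a blocking REP loop: exactly one reply per received request."""
    import zmq  # pyzmq; imported here so handle_request is usable without it

    context = zmq.Context()
    socket = context.socket(zmq.REP)
    socket.bind(bind_address)
    while True:
        request = socket.recv_json()
        socket.send_json(handle_request(request))
```

Keeping the dispatch logic in a plain function separate from the socket loop makes it possible to unit-test command handling without a running robot or network connection.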

The released execution utilities support verified move(...) actions. If your planner produces open(...) or close(...), you will need to extend the robot client/server protocol before using online or offline execution.
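As an illustration of the restriction to move(...) actions, the sketch below converts a verified action string into a pick_and_place message and rejects anything else. The function move_to_message is hypothetical, and the relationship value is passed in by the caller because this README does not describe how the released client derives it.

```python
import re
from typing import Dict

# Matches move(object, source, destination) with optional whitespace.
_MOVE_PATTERN = re.compile(
    r"^\s*move\(\s*([^,()]+?)\s*,\s*([^,()]+?)\s*,\s*([^,()]+?)\s*\)\s*$"
)


def move_to_message(action: str, relationship: str) -> Dict[str, str]:
    """Turn 'move(apple, table, bowl)' into a pick_and_place request."""
    match = _MOVE_PATTERN.match(action)
    if match is None:
        # open(...), close(...), and malformed strings all end up here.
        raise ValueError(f"unsupported action for execution: {action!r}")
    target, _source, destination = match.groups()
    return {
        "command": "pick_and_place",
        "target": target,
        "destination": destination,
        "relationship": relationship,  # caller-supplied assumption
    }
```

Supporting open(...) or close(...) would mean adding new patterns here and matching commands on the server side.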

Project Structure

├── verigraph/               # Installable Python package
│   ├── config/              # Hydra `config.yaml` (grouped paths, task, sgg, execution, …)
│   ├── core/                # Planning, parsing, scene-graph verification
│   ├── prompts/             # LLM prompt templates (paths in config.yaml)
│   ├── utils/               # Integrations (LLM clients, robot client, …)
│   └── scripts/             # Runnable entry points (Hydra evals, execution)
├── data/                    # Expected location for user-supplied datasets
│   ├── scene_graphs/        # Ground truth scene graphs
│   ├── tasks/               # Task definitions
│   └── scenes/              # Scene images
├── pyproject.toml           # Package metadata (`pip install -e .`)
└── requirements.txt         # Optional flat dependency list

Citation

@inproceedings{ekpo2026verigraph,
        title={VeriGraph: Scene Graphs for Execution Verifiable Robot Planning},
        author={Ekpo, Daniel and Levy, Mara and Suri, Saksham and Huynh, Chuong and Swaminathan, Archana and Shrivastava, Abhinav},
        booktitle={Proceedings of the IEEE International Conference on Robotics and Automation (ICRA)},
        year={2026}
}

Acknowledgements

This work was partially supported by NSF CAREER Award #2238769 to AS.
