🎨 VCode: SVG as Symbolic Visual Representation


TL;DR: SVG code as a Visual Representation

Overview

See our demo video for fun!

VCode_demo_video.mp4

📣 News

  • [2025.12.20] 🌟 Added GPT-5.2 to our benchmark; it performs solidly, below Gemini-3-Pro but ahead of Claude-4.5-Sonnet.
  • [2025.11.21] 🔥 Added Gemini-3-Pro to our benchmark, showing excellent performance.
  • [2025.11.08] 🎥 Released our demo video featuring lots of fun memes and reaction images converted into SVGs.
  • [2025.11.08] 🚀 We now offer a free trial API on our 🤗 HuggingFace Space.
  • [2025.11.05] 🔥 We are honored to be featured as 🤗 HuggingFace Daily Paper #1.


🛠️ Installation

Environment

git clone -b main --single-branch https://github.com/CSU-JPG/VCode.git
cd VCode
conda create -n vcode python=3.10.2 -y
conda activate vcode
conda install pytorch=2.5.1 torchvision=0.20.1 torchaudio=2.5.1 pytorch-cuda=12.4 -c pytorch -c nvidia
pip install -r requirements.txt
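After installing, a quick sanity check can confirm that the core dependencies are importable. This helper is not part of VCode; the package names are taken from the conda install line above.

```python
import importlib.util

def check_env(packages=("torch", "torchvision", "torchaudio")):
    """Report which of the expected packages are importable in this environment."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}

if __name__ == "__main__":
    for name, ok in check_env().items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```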

🚀 Quick Start

🧩 VCode-suite

VCode-suite is a comprehensive toolkit that automates the full image-to-SVG-to-render workflow. It includes both integrated pipelines and independent modules for generation, rendering, and revision. Users can either run the end-to-end pipelines for batch processing, or execute individual scripts for customized control.

📁 vcode-suite/
├── filter.py
├── img2svg.py
├── img2svgthinking.py
├── img2svg-w-visual-tool.py
├── img2text2svg.py
├── pipeline.sh
├── revision_pipeline.sh
├── revision.py
└── svg_render_img.py

💡 Tip: The pipelines (pipeline.sh, revision_pipeline.sh) perform fully automated batch processing, while the Python scripts (img2svg.py, img2text2svg.py, revision.py, etc.) can be run independently to support flexible and modular experimentation within the VCode framework.

⚙️ Usage

1️⃣ Generate and render SVGs

pipeline.sh orchestrates the full image-to-SVG-to-render workflow. It can connect to different generation modules — img2svg, img2text2svg, or img2svgthinking — to convert images into SVGs, then filter and render them into pixel images.

chmod +x pipeline.sh
./pipeline.sh
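Conceptually, the image-to-SVG step of the pipeline iterates over an image folder and writes one .svg per input. The sketch below uses a stub generator in place of the model call, so the function names and filtering rule are illustrative assumptions, not the scripts' actual code.

```python
from pathlib import Path

def generate_svg(image_path: Path) -> str:
    """Stub for the model call (img2svg.py would query an API here)."""
    return (f'<svg xmlns="http://www.w3.org/2000/svg">'
            f'<!-- {image_path.name} --></svg>')

def run_pipeline(images_dir: str, svg_dir: str) -> list:
    """Image -> SVG step of the workflow: one .svg per input image."""
    out = Path(svg_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for img in sorted(Path(images_dir).iterdir()):
        if img.suffix.lower() not in {".png", ".jpg", ".jpeg"}:
            continue  # skip non-image files (an assumed filtering rule)
        svg_path = out / (img.stem + ".svg")
        svg_path.write_text(generate_svg(img))
        written.append(svg_path)
    return written
```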

2️⃣ Optimize generated SVGs

revision_pipeline.sh automates the revision and optimization process. It takes the previously generated SVGs (generated_svgs/) and rendered images (generated_imgs/), calls the API-based revision module, and outputs the optimized SVGs and renders to optimized_svgs/ and optimized_imgs/.

chmod +x revision_pipeline.sh
./revision_pipeline.sh

3️⃣ Run scripts independently

Both generation and revision scripts can be executed independently for flexible and customized workflows.

Each core generation script — img2svg.py, img2text2svg.py, img2svgthinking.py, and img2svg-w-visual-tool.py — can directly convert input images into SVG code. Similarly, revision.py can be run independently to optimize previously generated SVGs through visual feedback.


Run img2svg.py

python vcode-suite/img2svg.py \
/path/to/input_images \
./generated_svgs \
--model gpt-5 \
--base-url https://openrouter.ai/api/v1 \
--api-key <OPENROUTER_API_KEY> \
--max-tokens 16384
Argument           Type  Default                        Description
images_folder      str   -                              Path to the input folder containing image files.
svg_output_folder  str   -                              Directory to save the generated SVG files.
--model            str   gpt-5                          API model name used for conversion.
--base-url         str   https://openrouter.ai/api/v1   Base URL of the API endpoint.
--api-key          str   -                              API key for authentication.
--sleep            int   5                              Seconds to wait between consecutive API calls.
--max-tokens       int   16384                          Maximum number of tokens allowed in the model's response.
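Since --base-url defaults to an OpenAI-compatible endpoint, the request img2svg.py sends presumably follows the common chat-completions shape: the image is base64-encoded into a data URL inside the message content. The payload builder below is a sketch of that format, not the script's exact code.

```python
import base64

def build_request(image_bytes: bytes, model: str = "gpt-5",
                  max_tokens: int = 16384) -> dict:
    """Build a chat-completions payload asking the model to emit SVG code."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": "Reproduce this image as SVG code."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
    }
```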

Run revision.py

python vcode-suite/revision.py \
--svg-folder ./generated_svgs \
--original-folder ./input_images \
--rendered-folder ./generated_imgs \
--output-folder ./optimized_svgs \
--analysis-folder ./visual_analysis \
--base-url https://openrouter.ai/api/v1 \
--api-key <OPENROUTER_API_KEY> \
--model gpt-5 \
--max-tokens 16384
Argument           Type  Default                        Description
--svg-folder       str   -                              Root directory containing the SVG files to optimize.
--original-folder  str   -                              Directory of the original reference images.
--rendered-folder  str   -                              Directory of rendered images corresponding to the SVGs.
--output-folder    str   -                              Directory to save the optimized SVG files.
--analysis-folder  str   -                              Directory to save visual comparison and analysis .txt files.
--base-url         str   https://openrouter.ai/api/v1   Base URL of the API endpoint.
--api-key          str   -                              API key for authentication.
--model            str   gpt-5                          Model used for revision.
--max-tokens       int   16384                          Maximum number of tokens allowed in the model response.

💡 Tip: The revision.py script refines existing SVGs based on visual comparison feedback, while generation scripts (img2svg.py, img2text2svg.py, img2svgthinking.py, img2svg-w-visual-tool.py) create SVGs from input images_folder. You can flexibly mix and match these tools depending on your pipeline needs.
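Before revising, each SVG must be matched with its original and rendered image across the three input folders. A minimal sketch of that pairing step is below; the matching rule (shared filename stem) is an assumption based on the directory layouts shown in this README, not revision.py's actual logic.

```python
from pathlib import Path

def pair_inputs(svg_folder, original_folder, rendered_folder):
    """Match each SVG with its original and rendered image by filename stem."""
    originals = {p.stem: p for p in Path(original_folder).rglob("*") if p.is_file()}
    rendered = {p.stem: p for p in Path(rendered_folder).rglob("*") if p.is_file()}
    triples = []
    for svg in sorted(Path(svg_folder).rglob("*.svg")):
        # only revise SVGs that have both reference images available
        if svg.stem in originals and svg.stem in rendered:
            triples.append((svg, originals[svg.stem], rendered[svg.stem]))
    return triples
```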


🔮 Evaluation

⚙️ Usage

1️⃣ Generate rendered images for all three datasets

Use the VCode-suite pipeline (or standalone scripts) to render images for each dataset. Original images are already in data/:

  • MM-Vet: data/mm-vet/images
  • CV-Bench: data/cv-bench
  • MMMU: data/mmmu/mmmu_dev_processed_single_img_subset

Running your pipeline will produce, per dataset, a folder like:

generated_svgs/
generated_imgs/  ← used by the evaluators

2️⃣ Run each dataset’s evaluator

Each evaluator is a shell script under evaluation/…. They all follow the same usage:

chmod +x evaluation/mm-vet/mmvet_eval.sh
./evaluation/mm-vet/mmvet_eval.sh
chmod +x evaluation/cv-bench/cvbench_eval.sh
./evaluation/cv-bench/cvbench_eval.sh
chmod +x evaluation/mmmu/mmmu_eval.sh
./evaluation/mmmu/mmmu_eval.sh

These scripts will read your generated_imgs/ and compute scores.

💡 Reference: For directory organization and example script configuration, see example_results/ (it shows a working layout you can mirror).


3️⃣ Calculate each dataset’s metrics

Full Command with Options

python metrics.py \
--folder1 /path/to/reference_images \
--folder2 /path/to/model_outputs/gpt-4o \
--ckpt google/siglip2-so400m-patch14-384

Command Line Arguments

Argument   Required  Default                             Description
--folder1  ✅ Yes    -                                   Path to the reference images folder.
--folder2  ✅ Yes    -                                   Path to the model output folder (containing generated_imgs/ and generated_svgs/).
--ckpt     ❌ No     google/siglip2-so400m-patch14-384   SigLIP model checkpoint.
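Given the default --ckpt, the score presumably compares SigLIP embeddings of each reference/render pair, which is commonly done via cosine similarity. The final comparison step can be sketched in plain Python (embedding extraction with the transformers checkpoint is omitted; treating the metric as cosine similarity is an assumption about metrics.py's internals):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```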

Expected Directory Layout:

Reference Images Folder (--folder1)

Location: data/mm-vet/images (example path - can be customized)

folder1/
├── category1/
│   ├── image001.png
│   ├── image002.jpg
│   └── ...
├── category2/
│   ├── image003.png
│   └── ...
└── ...

Model Output Folder (--folder2)

Location: example_results/mm-vet/Gemini-2.5-Pro (example path - can be customized)

folder2/
├── generated_imgs/           # Generated/rendered images
│   ├── category1/
│   │   ├── image001.png
│   │   ├── image002.jpg
│   │   └── ...
│   ├── category2/
│   │   ├── image003.png
│   │   └── ...
│   └── ...
│
└── generated_svgs/           # SVG source files
   ├── category1/
   │   ├── image001.svg
   │   ├── image002.svg
   │   └── ...
   ├── category2/
   │   ├── image003.svg
   │   └── ...
   └── ...
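A quick way to catch layout mistakes before running metrics.py is to check that every rendered image under generated_imgs/ has a same-stem SVG under generated_svgs/. This validator is an illustration only, not part of the VCode suite.

```python
from pathlib import Path

def validate_folder2(folder2):
    """Return the expected-but-missing SVG paths for a --folder2 layout."""
    imgs = Path(folder2, "generated_imgs")
    svgs = Path(folder2, "generated_svgs")
    missing = []
    for img in imgs.rglob("*"):
        if not img.is_file():
            continue
        # mirror the category subfolder structure, swapping the extension
        svg = svgs / img.relative_to(imgs).with_suffix(".svg")
        if not svg.is_file():
            missing.append(str(svg))
    return missing
```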

📌 Citation

If you find our work useful, please cite:

@article{vcode,
  title={VCode: a Multimodal Coding Benchmark with SVG as Symbolic Visual Representation},
  author={Lin, Kevin Qinghong and Zheng, Yuhao and Ran, Hangyu and Zhu, Dantong and Mao, Dongxing and Li, Linjie and Torr, Philip and Wang, Alex Jinpeng},
  journal={arXiv preprint arXiv:2511.02778},
  year={2025}
}
