Cadsys: Parametric 3D CAD Generation from Multi-View Technical Drawings

Inspiration

Traditional CAD modeling is time-consuming and requires expert knowledge.
We wanted to explore AI-assisted, rapid parametric CAD reconstruction from standard technical drawing inputs (PNG, DXF), especially for tri-view orthographic projections common in engineering.

Our inspiration came from:

  • TriView2CAD dataset for structured DXF + PNG + JSON + STEP examples.
  • CReFT-CAD’s curriculum fine-tuning and post-SFT recipe.
  • The need for rapid CAD prototyping among makers, engineers, (industrial) designers, and hardware projects.

What it does

  • Takes three orthographic views (Front, Top, Side) as PNG or DXF input.
  • Outputs ordered parametric values (JSON) describing the part.
  • Can automatically generate Build123d / CADQuery scripts for 3D reconstruction.
  • Fine-tuned on 100 curated tri-view samples prepared in tri_view_ready.

Input → Output Example:

  1. Input: PNG tri-view + DXF
  2. Output: JSON params → Build123d script → STEP/B-Rep 3D model
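
To make this concrete, here is a minimal hand-written sketch of the last two steps using build123d. The part and its params (length, width, height, hole_d) are hypothetical stand-ins for illustration; the real scripts are generated from the model's predicted JSON.

    import json
    from build123d import BuildPart, Box, Cylinder, Mode, export_step

    # Hypothetical predicted params: a plate with a centered through-hole.
    params = json.loads('{"length": 40.0, "width": 20.0, "height": 10.0, "hole_d": 5.0}')

    with BuildPart() as part:
        Box(params["length"], params["width"], params["height"])
        Cylinder(radius=params["hole_d"] / 2,
                 height=params["height"],
                 mode=Mode.SUBTRACT)  # subtract the hole from the box

    export_step(part.part, "reconstructed.step")  # B-Rep model out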

How we built it

  1. Dataset Preparation (tri_view_ready/)

    • Extracted and matched PNG, DXF, and JSON param files from the TriView2CAD subset.
    • Created manifest.csv linking inputs to labels (see the manifest sketch after this list).
    • Defined KEY_ORDER.txt to keep param output sequence consistent.
    • Wrote PROMPT.txt for model instruction consistency.
  2. Model Training

    • Used Modal Labs GPU compute (<$30 usage target).
    • Mounted the dataset volume into /data for training scripts (see the Modal sketch after this list).
    • Fine-tuned a vision-language model (LoRA) to map tri-view images to ordered param strings.
    • Loss function: numeric token regression over ordered params.
  3. Inference Pipeline

    • Accepts PNG/DXF uploads.
    • Runs VLM inference to predict ordered params (see the endpoint sketch after this list).
    • Converts params into Build123d script → generates CAD (STEP/B-Rep).
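
To make step 1 concrete, here is a sketch of how such a manifest can be assembled, assuming matched files share a numeric stem (e.g. 0042.png / 0042.dxf / 0042.json); the exact directory layout is an assumption.

    import csv
    from pathlib import Path

    root = Path("tri_view_ready")
    rows = []
    for png in sorted((root / "png").glob("*.png")):
        stem = png.stem
        dxf = root / "dxf" / f"{stem}.dxf"
        lbl = root / "json" / f"{stem}.json"
        if dxf.exists() and lbl.exists():  # keep only fully matched triples
            rows.append({"id": stem, "png": str(png), "dxf": str(dxf), "label": str(lbl)})

    with open(root / "manifest.csv", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["id", "png", "dxf", "label"])
        writer.writeheader()
        writer.writerows(rows)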
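
For step 2, a minimal Modal sketch of the /data volume mount; the volume name, GPU type, and image packages are assumptions, and the function body only verifies that the manifest is visible:

    import modal

    app = modal.App("cadsys-train")
    dataset = modal.Volume.from_name("tri-view-ready", create_if_missing=True)
    image = modal.Image.debian_slim().pip_install("torch", "transformers", "peft")

    @app.function(gpu="A10G", image=image, volumes={"/data": dataset}, timeout=2 * 60 * 60)
    def train():
        import csv
        # The volume appears as an ordinary filesystem under /data.
        with open("/data/manifest.csv") as f:
            rows = list(csv.DictReader(f))
        print(f"training on {len(rows)} samples")  # LoRA fine-tuning happens here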
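
And for step 3, a hedged sketch of the upload endpoint; predict_params() is a hypothetical stand-in for the fine-tuned VLM call:

    from fastapi import FastAPI, UploadFile

    api = FastAPI()

    def predict_params(views: list[bytes]) -> dict:
        # Hypothetical stand-in: a real version runs tri-view VLM inference
        # and parses the ordered param string into a dict.
        return {"length": 40.0, "width": 20.0, "height": 10.0}

    @api.post("/reconstruct")
    async def reconstruct(front: UploadFile, top: UploadFile, side: UploadFile):
        views = [await v.read() for v in (front, top, side)]
        params = predict_params(views)
        return {"params": params}  # next: params -> Build123d script -> STEP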

Challenges we ran into

  • Aligning PNG and DXF IDs with JSON labels — original dataset paths were inconsistent.
  • Creating a manifest that works both locally and inside Modal’s /data mount.
  • Keeping training within hackathon time and budget constraints; quantization was needed.
  • Ensuring LoRA fine-tuning converged with only 100 samples under limited VRAM.

Accomplishments that we're proud of

  • End-to-end working PNG/DXF → CAD param → 3D script pipeline.
  • Successfully fine-tuned a model within hackathon time limits.
  • Reusable dataset format (tri_view_ready/) that others can adopt.

What we learned

  • Small, well-prepared datasets can outperform larger messy datasets for targeted tasks.
  • KEY_ORDER.txt and consistent prompts are critical for output stability (illustrated below).
  • Modal Volumes are a quick way to share datasets between jobs without rebuilding containers.
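
A tiny illustration of the KEY_ORDER point, assuming one key per line in KEY_ORDER.txt and a label file with matching keys (the file names here are placeholders):

    import json
    from pathlib import Path

    # Fixing the key sequence once means every training target (and every
    # prediction) serializes in exactly the same order.
    key_order = Path("tri_view_ready/KEY_ORDER.txt").read_text().split()
    label = json.loads(Path("tri_view_ready/json/0001.json").read_text())
    target = " ".join(f"{k}={label[k]}" for k in key_order if k in label)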

What's next for us

  • Scale dataset beyond 100 samples for higher accuracy.
  • Add RLHF using CAD validation as the feedback signal (script renders compared against DXF overlays).
  • Support additional CAD script outputs (OpenSCAD, FreeCAD).
  • Explore multi-modal prompting: PNG + DXF + natural language specs.

Built With

  • build123d
  • dxf
  • fastapi
  • json
  • modal
  • png
  • python
  • qwen
  • step
  • vision