Cadsys: Parametric 3D CAD Generation from Multi-View Technical Drawings
Inspiration
Traditional CAD modeling is time-consuming and requires expert knowledge.
We wanted to explore AI-assisted, rapid parametric CAD reconstruction from standard technical drawing inputs (PNG, DXF), especially for tri-view orthographic projections common in engineering.
Our inspiration came from:
- The TriView2CAD dataset, which provides structured DXF + PNG + JSON + STEP examples.
- CReFT-CAD’s curriculum fine-tuning and post-SFT recipe.
- The need for rapid CAD prototyping among makers, engineers, (industrial) designers, and hardware projects.
What it does
- Takes three orthographic views (Front, Top, Side) as PNG or DXF input.
- Outputs ordered parametric values (JSON) describing the part.
- Can automatically generate Build123d / CADQuery scripts for 3D reconstruction.
- Fine-tuned on 100 curated tri-view samples prepared in tri_view_ready/.
Input → Output Example:
- Input: PNG tri-view + DXF
- Output: JSON params → Build123d script → STEP/B-Rep 3D model (see the illustration below)
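As a rough illustration of the output format (the real key names and ordering come from KEY_ORDER.txt; the ones below are hypothetical), the model predicts a flat, ordered set of parameters that a script template can consume:

```python
import json

# Hypothetical ordered params for a simple plate with a centered hole.
# Real key names and ordering are defined by KEY_ORDER.txt in tri_view_ready/.
params = {
    "length": 80.0,        # mm, front view
    "width": 50.0,         # mm, top view
    "height": 12.0,        # mm, side view
    "hole_diameter": 8.0,  # mm, through-hole
}
print(json.dumps(params, indent=2))
```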
How we built it
Dataset Preparation (tri_view_ready/)
- Extracted and matched PNG, DXF, and JSON param files from the TriView2CAD subset.
- Created manifest.csv linking inputs to labels (see the sketch after this list).
- Defined KEY_ORDER.txt to keep the param output sequence consistent.
- Wrote PROMPT.txt for model instruction consistency.
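A minimal sketch of how the manifest and ordered labels can be produced, assuming tri_view_ready/ contains png/, dxf/, and json/ subfolders (the folder layout and label string format here are assumptions, not the exact files shipped in the repo):

```python
import csv
import json
from pathlib import Path

ROOT = Path("tri_view_ready")  # assumed layout: png/, dxf/, json/ subfolders
KEY_ORDER = [line.strip() for line in (ROOT / "KEY_ORDER.txt").read_text().splitlines() if line.strip()]

def params_to_target(json_path: Path) -> str:
    """Serialize a param JSON into the fixed KEY_ORDER sequence used as the training label."""
    params = json.loads(json_path.read_text())
    return " ".join(f"{key}={params[key]}" for key in KEY_ORDER if key in params)

# Build manifest.csv by matching files that share the same stem across modalities.
with open(ROOT / "manifest.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["id", "png", "dxf", "label"])
    for png in sorted((ROOT / "png").glob("*.png")):
        dxf = ROOT / "dxf" / f"{png.stem}.dxf"
        js = ROOT / "json" / f"{png.stem}.json"
        if dxf.exists() and js.exists():
            writer.writerow([png.stem, str(png), str(dxf), params_to_target(js)])
```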
Model Training
- Used Modal Labs GPU compute (<$30 usage target).
- Mounted the dataset volume at /data for the training scripts (see the sketch below).
- Fine-tuned a vision-language model with LoRA to map tri-view images to ordered param strings.
- Loss function: numeric token regression over ordered params.
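A minimal sketch of the Modal wiring, assuming a Volume named tri-view-data and a PEFT LoRA config; the base model, GPU type, and hyperparameters below are placeholders rather than our exact setup:

```python
import modal

app = modal.App("cadsys-finetune")
vol = modal.Volume.from_name("tri-view-data", create_if_missing=True)  # assumed volume name
image = modal.Image.debian_slim().pip_install("torch", "transformers", "peft", "accelerate")

@app.function(gpu="A10G", volumes={"/data": vol}, image=image, timeout=3600)
def train():
    from peft import LoraConfig, get_peft_model
    from transformers import AutoModelForVision2Seq, AutoProcessor

    base = "Qwen/Qwen2-VL-2B-Instruct"  # placeholder base VLM
    model = AutoModelForVision2Seq.from_pretrained(base)
    processor = AutoProcessor.from_pretrained(base)  # used to tokenize image + prompt pairs

    lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                      target_modules=["q_proj", "v_proj"])
    model = get_peft_model(model, lora)

    # ... read /data/manifest.csv, build image + prompt pairs, run a standard SFT loop ...
    model.save_pretrained("/data/checkpoints/lora")
    vol.commit()  # persist the checkpoint back to the Volume
```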
Inference Pipeline
- Accepts PNG/DXF upload.
- Runs VLM inference to predict ordered params.
- Converts params into a Build123d/CadQuery script → generates CAD (STEP/B-Rep); see the sketch below.
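A hedged sketch of this last step, reusing the hypothetical plate-with-hole parameters from above and CadQuery (the Build123d path is analogous); the geometry recipe is a toy example, not our generated script:

```python
import cadquery as cq

def params_to_solid(p: dict) -> cq.Workplane:
    """Turn predicted ordered params into a solid; the recipe here is a toy plate with a hole."""
    return (
        cq.Workplane("XY")
        .box(p["length"], p["width"], p["height"])
        .faces(">Z").workplane()
        .hole(p["hole_diameter"])
    )

solid = params_to_solid({"length": 80.0, "width": 50.0, "height": 12.0, "hole_diameter": 8.0})
cq.exporters.export(solid, "part.step")  # STEP / B-Rep output
```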
Challenges we ran into
- Aligning PNG and DXF IDs with JSON labels — original dataset paths were inconsistent.
- Creating a manifest that works both locally and inside Modal's /data mount.
- Keeping training within hackathon time and cost constraints; quantization was needed (see the 4-bit sketch below).
- Ensuring LoRA fine-tuning converges with only 100 samples on limited VRAM.
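For the VRAM constraint, one option is a QLoRA-style setup: load the frozen base model in 4-bit and train only the LoRA adapters. A minimal, illustrative sketch with Hugging Face's BitsAndBytesConfig (the base model name and settings are placeholders):

```python
import torch
from transformers import AutoModelForVision2Seq, BitsAndBytesConfig

# Illustrative: 4-bit NF4 quantization for the frozen base; LoRA adapters train in higher precision.
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForVision2Seq.from_pretrained(
    "Qwen/Qwen2-VL-2B-Instruct",  # placeholder base VLM
    quantization_config=bnb,
)
```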
Accomplishments that we're proud of
- End-to-end working PNG/DXF → CAD param → 3D script pipeline.
- Successfully fine-tuned a model within hackathon time limits.
- Reusable dataset format (tri_view_ready/) that others can adopt.
What we learned
- Small, well-prepared datasets can outperform larger messy datasets for targeted tasks.
- KEY_ORDER.txt and consistent prompts are critical for output stability.
- Modal Volumes are a quick way to share datasets between jobs without rebuilding containers.
What's next for us
- Scale dataset beyond 100 samples for higher accuracy.
- Enhance RLHF with CAD validation (script renders vs. DXF overlays).
- Support additional CAD script outputs (OpenSCAD, FreeCAD).
- Explore multi-modal prompting: PNG + DXF + natural language specs.
