Cadsys: Parametric 3D CAD Generation from Multi-View Technical Drawings

Inspiration

Traditional CAD modeling is time-consuming and requires expert knowledge.
We wanted to explore AI-assisted, rapid parametric CAD reconstruction from standard technical drawing inputs (PNG, DXF), especially for tri-view orthographic projections common in engineering.

Our inspiration came from:

  • TriView2CAD dataset for structured DXF + PNG + JSON + STEP examples.
  • CReFT-CAD’s curriculum fine-tuning and post-SFT recipe.
  • The need for rapid CAD prototyping among makers, engineers, (industrial) designers, and hardware projects.

What it does

  • Takes three orthographic views (Front, Top, Side) as PNG or DXF input.
  • Outputs ordered parametric values (JSON) describing the part.
  • Can automatically generate Build123d / CADQuery scripts for 3D reconstruction.
  • Fine-tuned on 100 curated tri-view samples prepared in tri_view_ready.

Input → Output Example:

  1. Input: PNG tri-view + DXF
  2. Output: JSON params → Build123d script → STEP/B-Rep 3D model
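
To make this concrete, here is a minimal hand-written sketch of the last two steps using build123d. The part and its params (length, width, height, hole_d) are hypothetical stand-ins for illustration; the real scripts are generated from the model's predicted JSON.

    import json
    from build123d import BuildPart, Box, Cylinder, Mode, export_step

    # Hypothetical predicted params: a plate with a centered through-hole.
    params = json.loads('{"length": 40.0, "width": 20.0, "height": 10.0, "hole_d": 5.0}')

    with BuildPart() as part:
        Box(params["length"], params["width"], params["height"])
        Cylinder(radius=params["hole_d"] / 2,
                 height=params["height"],
                 mode=Mode.SUBTRACT)  # subtract the hole from the box

    export_step(part.part, "reconstructed.step")  # B-Rep model out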

How we built it

  1. Dataset Preparation (tri_view_ready/)

    • Extracted and matched PNG, DXF, and JSON param files from the TriView2CAD subset.
    • Created manifest.csv linking inputs to labels (see the manifest sketch after this list).
    • Defined KEY_ORDER.txt to keep param output sequence consistent.
    • Wrote PROMPT.txt for model instruction consistency.
  2. Model Training

    • Used Modal Labs GPU compute (<$30 usage target).
    • Mounted the dataset volume into /data for training scripts (see the Modal sketch after this list).
    • Fine-tuned a vision-language model (LoRA) to map tri-view images to ordered param strings.
    • Loss function: numeric token regression over ordered params.
  3. Inference Pipeline

    • Accepts PNG/DXF uploads.
    • Runs VLM inference to predict ordered params (see the endpoint sketch after this list).
    • Converts params into Build123d script → generates CAD (STEP/B-Rep).
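
To make step 1 concrete, here is a sketch of how such a manifest can be assembled, assuming matched files share a numeric stem (e.g. 0042.png / 0042.dxf / 0042.json); the exact directory layout is an assumption.

    import csv
    from pathlib import Path

    root = Path("tri_view_ready")
    rows = []
    for png in sorted((root / "png").glob("*.png")):
        stem = png.stem
        dxf = root / "dxf" / f"{stem}.dxf"
        lbl = root / "json" / f"{stem}.json"
        if dxf.exists() and lbl.exists():  # keep only fully matched triples
            rows.append({"id": stem, "png": str(png), "dxf": str(dxf), "label": str(lbl)})

    with open(root / "manifest.csv", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["id", "png", "dxf", "label"])
        writer.writeheader()
        writer.writerows(rows)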
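
For step 2, a minimal Modal sketch of the /data volume mount; the volume name, GPU type, and image packages are assumptions, and the function body only verifies that the manifest is visible:

    import modal

    app = modal.App("cadsys-train")
    dataset = modal.Volume.from_name("tri-view-ready", create_if_missing=True)
    image = modal.Image.debian_slim().pip_install("torch", "transformers", "peft")

    @app.function(gpu="A10G", image=image, volumes={"/data": dataset}, timeout=2 * 60 * 60)
    def train():
        import csv
        # The volume appears as an ordinary filesystem under /data.
        with open("/data/manifest.csv") as f:
            rows = list(csv.DictReader(f))
        print(f"training on {len(rows)} samples")  # LoRA fine-tuning happens here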
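
And for step 3, a hedged sketch of the upload endpoint; predict_params() is a hypothetical stand-in for the fine-tuned VLM call:

    from fastapi import FastAPI, UploadFile

    api = FastAPI()

    def predict_params(views: list[bytes]) -> dict:
        # Hypothetical stand-in: a real version runs tri-view VLM inference
        # and parses the ordered param string into a dict.
        return {"length": 40.0, "width": 20.0, "height": 10.0}

    @api.post("/reconstruct")
    async def reconstruct(front: UploadFile, top: UploadFile, side: UploadFile):
        views = [await v.read() for v in (front, top, side)]
        params = predict_params(views)
        return {"params": params}  # next: params -> Build123d script -> STEP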

Challenges we ran into

  • Aligning PNG and DXF IDs with JSON labels — original dataset paths were inconsistent.
  • Creating a manifest that works both locally and inside Modal’s /data mount.
  • Keeping training within hackathon time and budget constraints; quantization was needed.
  • Ensuring LoRA fine-tuning converged with only 100 samples under limited VRAM.

Accomplishments that we're proud of

  • End-to-end working PNG/DXF → CAD param → 3D script pipeline.
  • Successfully fine-tuned a model within hackathon time limits.
  • Reusable dataset format (tri_view_ready/) that others can adopt.

What we learned

  • Small, well-prepared datasets can outperform larger messy datasets for targeted tasks.
  • KEY_ORDER.txt and consistent prompts are critical for output stability (illustrated below).
  • Modal Volumes are a quick way to share datasets between jobs without rebuilding containers.
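
A tiny illustration of the KEY_ORDER point, assuming one key per line in KEY_ORDER.txt and a label file with matching keys (the file names here are placeholders):

    import json
    from pathlib import Path

    # Fixing the key sequence once means every training target (and every
    # prediction) serializes in exactly the same order.
    key_order = Path("tri_view_ready/KEY_ORDER.txt").read_text().split()
    label = json.loads(Path("tri_view_ready/json/0001.json").read_text())
    target = " ".join(f"{k}={label[k]}" for k in key_order if k in label)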

What's next for us

  • Scale dataset beyond 100 samples for higher accuracy.
  • Add RLHF using CAD validation as the feedback signal (script renders compared against DXF overlays).
  • Support additional CAD script outputs (OpenSCAD, FreeCAD).
  • Explore multi-modal prompting: PNG + DXF + natural language specs.

Built With

  • build123d
  • dxf
  • fastapi
  • json
  • modal
  • png
  • python
  • qwen
  • step
  • vision