Inspiration

Field inspections are critical to keeping heavy equipment operational and safe for human use. Yet the workflow around this crucial operation still relies on antiquated methods that make little use of modern technology. Technicians move around massive machines, scanning for subtle signs of wear or failure, while simultaneously trying to document findings in rigid forms or mobile checklists. Engineers have to juggle far too many things at once: eyes on the equipment, then down to a screen, then back again. In that back-and-forth, small details can be missed exactly where they matter most.

Just as crucial, engineers aren't always available. Highly trained inspectors represent years of experience and technical judgment, and their time is both limited and expensive. Every hour spent manually documenting findings or moving between sites is time that could be dedicated to deeper diagnostics, preventative maintenance, or system optimization. In many parts of the world, access to fully trained inspectors may not exist at all. Heavy machinery powers essential industries in developing regions, yet advanced engineering support can be scarce or centralized far away.

We believe modern technology should extend expertise: capturing knowledge, reinforcing best practices, and ensuring that every inspection benefits from consistent, high-quality insight regardless of location.

What it does

  • CATalyst Inspect transforms a short walk-around video into a structured, checklist-aligned inspection report in minutes, eliminating the need for manual form entry.

  • It uses a multimodal pipeline that combines vision models with LLM reasoning to analyze footage, detect potential defects, and map findings directly to standardized inspection items.

  • The system includes a specialized, fine-tuned corrosion detection model that identifies early signs of structural degradation, one of the most costly and safety-critical failure modes in heavy equipment.

  • When the specialized model flags a concern, its findings are prioritized in the final report to ensure that domain-specific risks are not overlooked.

  • Each inspection generates clear PASS, MONITOR, or FAIL assessments accompanied by concise, actionable notes suitable for real-world maintenance workflows.

  • The platform maintains inspection history per user and per machine, enabling traceability, trend analysis, and long-term knowledge retention across sessions.
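One way to picture how flagged corrosion findings could be prioritized in the report is the minimal sketch below. It is an illustration rather than the project's actual code: the `Finding` fields, the `"corrosion"` source label, and the ordering rule (flagged specialized findings first, then everything else by severity) are all assumptions.

```python
from dataclasses import dataclass
from enum import IntEnum


class Status(IntEnum):
    PASS = 0
    MONITOR = 1
    FAIL = 2


@dataclass(frozen=True)
class Finding:
    item: str       # checklist item name (hypothetical naming)
    status: Status
    note: str
    source: str     # "general" or "corrosion" (assumed labels)


def order_report(findings):
    """Sort findings so flagged specialized results come first, then by severity."""
    def key(f):
        flagged_specialist = f.source == "corrosion" and f.status > Status.PASS
        # False sorts before True, so flagged specialized findings lead the report.
        return (not flagged_specialist, -f.status, f.item)
    return sorted(findings, key=key)


findings = [
    Finding("hydraulics", Status.FAIL, "Hose leaking at fitting", "general"),
    Finding("frame", Status.MONITOR, "Early surface rust", "corrosion"),
    Finding("tires", Status.PASS, "", "general"),
    Finding("boom", Status.PASS, "", "corrosion"),
]
report_order = [f.item for f in order_report(findings)]
```

In this sketch both models' findings are kept, and prioritization only affects the order in which they appear.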

How we built it

CATalyst Inspect is a full-stack multimodal inspection system built with an Expo and React Native mobile frontend and a FastAPI backend designed for real-world field use. The mobile app captures short walk-around videos and streams them to a backend pipeline that performs frame sampling, visual inference, and structured reasoning in parallel. A GPU-backed VLM layer, combining YOLO for defect localization and Qwen for multimodal understanding, maps detected issues directly to standardized checklist items, while a specialized fine-tuned corrosion model prioritizes high-risk structural degradation. The structured findings are then passed to the OpenAI API, which generates an engineer-style report with PASS, MONITOR, or FAIL classifications and concise, actionable commentary. Each inspection is stored with per-user and per-machine context using Supermemory, enabling historical traceability and knowledge retention across sessions, while SQLite provides lightweight local persistence for rapid retrieval and demo reliability.
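The frame sampling step could look something like the sketch below. The function name, the one-frame-per-second default, and the cap of 32 frames are assumptions chosen for illustration, not values taken from the project.

```python
def sample_frame_indices(total_frames: int, fps: float,
                         interval_s: float = 1.0, max_frames: int = 32) -> list[int]:
    """Pick frame indices roughly every `interval_s` seconds, capped at `max_frames`."""
    if total_frames <= 0 or fps <= 0 or interval_s <= 0 or max_frames <= 0:
        return []
    step = max(1, round(fps * interval_s))
    indices = list(range(0, total_frames, step))
    if len(indices) > max_frames:
        # Thin the list evenly so the whole clip is still covered.
        stride = len(indices) / max_frames
        indices = [indices[int(i * stride)] for i in range(max_frames)]
    return indices


short_clip = sample_frame_indices(90, 30.0)                 # 3-second clip at 30 fps
capped = sample_frame_indices(300, 30.0, max_frames=5)      # 10-second clip, capped
```

Capping the number of frames keeps inference cost bounded regardless of how long the walk-around video is.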

Challenges we ran into

One of our primary challenges was ensuring that the system performed reliably in real-world field conditions rather than only in controlled testing environments. Mobile video capture and upload introduced complexities related to large file sizes, variable network quality, and intermittent connectivity, particularly during live demonstrations. Simultaneously, we needed to translate probabilistic and often unstructured model outputs into a strict, standardized inspection checklist format. This required careful schema design, validation layers, and deterministic fallback logic to guarantee consistent and defensible PASS, MONITOR, and FAIL classifications. As we incorporated persistent memory, we also faced trade-offs between retaining valuable historical context and managing payload size and storage constraints. Finally, orchestrating a multi-stage pipeline, from VLM inference to LLM reasoning to memory persistence, demanded robust error handling to ensure that failures in any component would not interrupt the inspection process. As a result, we designed the system to degrade gracefully, maintaining continuity and reliability even when individual intelligence layers encounter issues.
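The validation and deterministic fallback described above might be sketched as follows. The JSON shape (`status` and `note` keys), the function name, and the choice of MONITOR as the fallback status are assumptions made for this example.

```python
import json

VALID_STATUSES = ("PASS", "MONITOR", "FAIL")


def normalize_item(raw, item_id: str) -> dict:
    """Turn raw model output into a checklist entry, falling back deterministically."""
    # Assumed fallback: unvalidated output becomes MONITOR so a human takes a look.
    fallback = {
        "item": item_id,
        "status": "MONITOR",
        "note": "Model output could not be validated; manual review recommended.",
    }
    try:
        data = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return fallback
    if not isinstance(data, dict):
        return fallback
    status = str(data.get("status", "")).strip().upper()
    if status not in VALID_STATUSES:
        return fallback
    return {"item": item_id, "status": status, "note": str(data.get("note", "")).strip()}


good = normalize_item('{"status": "fail", "note": " rust on frame "}', "frame")
bad = normalize_item("not json", "frame")
```

Because every path returns the same three keys, downstream report generation never has to handle a malformed entry.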

Accomplishments that we're proud of

We are most proud that CATalyst Inspect functions as a true end-to-end inspection engine rather than a collection of disconnected AI features. From the moment a short walk-around video is captured, the system carries information through defect detection, structured checklist mapping, LLM reasoning, and persistent history in a single continuous workflow. Instead of producing vague summaries, it generates deterministic, schema-aligned PASS, MONITOR, and FAIL outputs that are directly usable in real maintenance processes, complete with supporting evidence and traceable logic. We also designed the mobile experience with real field conditions in mind, delivering a dark, field-ready interface that prioritizes clarity and speed without adding friction. Just as importantly, we built robust fallback paths so the system continues to produce reliable outputs even if individual AI services are unavailable. The result is not simply an AI wrapper, but a cohesive, production-ready reasoning pipeline that demonstrates how intelligent inspection systems can operate reliably in real-world environments.
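A fallback path of this kind can be illustrated with a small stage runner. This is a hedged sketch only: the stage names, the payload dictionary, and the `run_pipeline` helper are hypothetical and stand in for whatever the real services do.

```python
def run_pipeline(stages, payload):
    """Run stages in order; a failing stage is recorded and skipped, not fatal."""
    degraded = []
    for name, stage in stages:
        try:
            payload = stage(payload)
        except Exception as exc:  # deliberately broad: any stage failure is non-fatal
            degraded.append(f"{name}: {exc}")
    return payload, degraded


def detect(p):
    return {**p, "findings": ["corrosion"]}


def reason(p):
    # Stand-in for an unavailable AI service.
    raise RuntimeError("service unavailable")


def persist(p):
    return {**p, "saved": True}


result, degraded = run_pipeline(
    [("detect", detect), ("reason", reason), ("persist", persist)],
    {"video": "clip.mp4"},
)
```

The inspection still ends with stored findings, and the `degraded` list records which layer was skipped so the report can say so.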

What we learned

We learned that building real-world AI systems is far more about orchestration and reliability than raw model performance. Integrating video capture, multimodal vision models, LLM reasoning, and persistent memory into a single inspection pipeline forced us to think deeply about schema design, failure handling, and graceful degradation. We learned how to translate probabilistic model outputs into strict, defensible checklist formats and how to design systems that continue working even when individual components fail. Ultimately, we learned that production-ready AI relies on consistency, usability, and trust in the field.

What's next for CATalyst Inspect

CATalyst Inspect is built on a modular inspection engine, which means expanding its capabilities does not require rethinking the core pipeline. Our immediate focus is broadening defect coverage beyond corrosion to include cracks, leaks, and progressive wear, extending the same detection, checklist mapping, and reasoning framework to additional high-impact failure modes. Each new defect class plugs into the existing architecture, inheriting the same structured outputs and validation guarantees.

We also plan to introduce human-in-the-loop review and annotation tools, allowing engineers to refine findings, validate edge cases, and continuously strengthen the specialized models. This feedback loop turns every inspection into training data, steadily improving performance while preserving accountability.

At the systems level, we aim to deepen long-term trend analysis across inspection histories, enabling proactive maintenance alerts based on degradation patterns over time rather than isolated findings. Finally, we will productionize the inference service for scalable deployment and introduce fleet-level dashboards that provide centralized visibility into asset health, inspection consistency, and emerging risk signals across entire operations.
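As a rough illustration of trend-based alerts, the sketch below flags checklist items whose status has worsened over the most recent inspections without ever improving. The history format (oldest-first list of item-to-status mappings), the window of three inspections, and the function name are assumptions.

```python
SEVERITY = {"PASS": 0, "MONITOR": 1, "FAIL": 2}


def degrading_items(history: list[dict[str, str]], window: int = 3) -> list[str]:
    """Return items that got worse across the last `window` inspections."""
    recent = history[-window:]
    if len(recent) < 2:
        return []
    alerts = []
    for item in recent[-1]:
        levels = [SEVERITY[report[item]] for report in recent if item in report]
        never_improves = all(a <= b for a, b in zip(levels, levels[1:]))
        if len(levels) >= 2 and never_improves and levels[-1] > levels[0]:
            alerts.append(item)
    return sorted(alerts)


history = [
    {"frame": "PASS", "tires": "PASS"},
    {"frame": "MONITOR", "tires": "MONITOR"},
    {"frame": "MONITOR", "tires": "PASS"},
]
alerts = degrading_items(history)
```

Here the frame is flagged because it moved from PASS to MONITOR and stayed there, while the tires recovered and are not.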
