persistent homology
node-based filtration techniques
obtaining digestible insights

🧠 Inspiration

Traditional vendor and client onboarding systems are still trapped in the manual age: fragmented, redundant, and vulnerable. Each new client brings not just paperwork but potential risk, hidden ownership webs, and compliance uncertainty. We wanted to create a system that thinks, not just processes: one that uses Nemotron intelligence, topological data analysis, and graph theory to understand relationships, automate due diligence, and accelerate onboarding while strengthening fraud detection.

⚙️ What It Does

Our solution is an intelligent, Nemotron-powered multi-tier onboarding agent that autonomously extracts, analyzes, and validates vendor or client data in real time.

Tier 1: AI Parsing & Data Capture. Using Nemotron’s OCR and document parsing models, our system reads and structures data from uploaded onboarding documents. Integrated OAuth authentication provides secure access and user identity management on both the client and server side.
Tier 2: Topological AI Risk Assessment. We model the extracted information as a weighted transaction graph and apply advanced Topological Data Analysis (TDA). Through Persistent Homology, node and edge filtration, Forman–Ricci curvature, and Betti vector computation, the system uncovers latent patterns, ownership cycles, and hidden linkages associated with fraud or shell structures.
Tier 3: Human-AI Compliance Validation Layer. The system provides interpretable outputs ("validated" or "suspected fraud") with supporting graph visualizations and reasoning traces, enabling compliance teams to make data-driven final decisions.
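
The Betti vector idea in Tier 2 can be illustrated with a minimal sketch. It assumes an undirected networkx graph whose edges enter the filtration in order of increasing weight; the toy graph, the threshold values, and the function name `betti_vector` are hypothetical and not taken from the project. For a plain graph, b0 is the number of connected components and b1 = edges - nodes + b0 counts independent cycles. A full persistent homology pipeline would also track higher-dimensional simplices, which this sketch does not.

```python
import networkx as nx


def betti_vector(graph, thresholds, weight="weight"):
    """Return [(b0, b1), ...], one pair per threshold of an edge-weight filtration."""
    vector = []
    for t in thresholds:
        sub = nx.Graph()
        sub.add_nodes_from(graph.nodes)  # node set stays fixed across the filtration
        sub.add_edges_from(
            (u, v) for u, v, w in graph.edges(data=weight) if w <= t
        )
        b0 = nx.number_connected_components(sub)
        # Cycle rank of a graph: |E| - |V| + number of components.
        b1 = sub.number_of_edges() - sub.number_of_nodes() + b0
        vector.append((b0, b1))
    return vector


# Hypothetical toy graph: A-B-C closes a cycle once the weight-3 edge appears.
g = nx.Graph()
g.add_weighted_edges_from(
    [("A", "B", 1.0), ("B", "C", 2.0), ("C", "A", 3.0), ("C", "D", 4.0)]
)
print(betti_vector(g, thresholds=[0.5, 1.0, 2.0, 3.0, 4.0]))
# -> [(4, 0), (3, 0), (2, 0), (2, 1), (1, 1)]
```

In this toy case the jump of b1 from 0 to 1 at threshold 3.0 marks the point where the A-B-C loop closes, which is the kind of cycle signal the tier description refers to.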

🏗️ How We Built It

Frontend: HTML5, CSS, Flask, JavaScript, OAuth (for secure authentication), and early integration scaffolds for Google Firestore for future scalability.
Backend: Deployed multiple Brev Console instances and NVIDIA Virtual Machines for parallel Nemotron workloads.
Topological ML: Python stack with networkx, sklearn, numpy, pandas, subprocess, and pickle for data persistence and orchestration.
Mathematical Layer: Embedded Persistent Homology pipelines, Betti number computation, multi-persistence filtrations, and eigenvalue-based anomaly detection to construct topological signatures of entity behavior.
GenAI Deployment: Used generative AI to understand the core functionality and fundamentals of Firestore and OAuth, smooth out the UI, make the frontend cohesive, and resolve merge conflicts during internal deployment.

Together, this created a reasoning pipeline where Nemotron performs orchestration, while our topological ML models provide fraud intuition grounded in mathematics.
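
The writeup does not say how eigenvalues were used for anomaly detection. One common reading, sketched here purely as an assumption, is to compare the graph Laplacian spectrum of a candidate entity graph against a baseline. The function names, the choice of the `k` largest eigenvalues, and the example graphs are all hypothetical.

```python
import networkx as nx
import numpy as np


def laplacian_spectrum(graph):
    """Eigenvalues of the unweighted graph Laplacian L = D - A, in ascending order."""
    adjacency = nx.to_numpy_array(graph, weight=None)  # every edge counts as 1
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
    return np.linalg.eigvalsh(laplacian)  # symmetric matrix, real eigenvalues


def spectral_distance(g1, g2, k=3):
    """Euclidean distance between the k largest Laplacian eigenvalues of two graphs."""
    s1 = laplacian_spectrum(g1)[-k:]
    s2 = laplacian_spectrum(g2)[-k:]
    return float(np.linalg.norm(s1 - s2))


# Hypothetical example: a chain of entities versus the same chain closed into a loop.
baseline = nx.path_graph(4)
candidate = nx.cycle_graph(4)
print(spectral_distance(baseline, candidate))
```

A larger distance suggests the candidate's connectivity differs more from the baseline. Any threshold for calling that difference anomalous would have to be calibrated on real data.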

🚧 Challenges We Ran Into

Nemotron Instance Deployment: Initial deployments on Windows consistently failed to run GPU-accelerated Nemotron environments. We overcame this by shifting to Brev Console on Mac-based NVIDIA VMs, achieving stable multi-instance communication.
Topological Vector Instability: Early Betti vector results were inconsistent across samples; normalization and multi-filtration alignment were introduced to stabilize metrics.
Dataset Balancing: Simulating realistic onboarding data across banking, asset management, and taxation required extensive fine-tuning for feature diversity.
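
One way to picture the Betti vector stabilization mentioned above is shown below, as a sketch under assumptions rather than the project's actual method: thresholds are rescaled to [0, 1], the Betti curve is resampled onto a shared grid so vectors from different graphs line up, and values are divided by node count. The function name `normalize_betti_curve` and the grid size are hypothetical.

```python
import numpy as np


def normalize_betti_curve(thresholds, betti_values, n_nodes, grid_size=5):
    """Resample a Betti curve onto a shared [0, 1] grid and scale it by graph size.

    Assumes thresholds are sorted ascending and contain at least two distinct values.
    """
    t = np.asarray(thresholds, dtype=float)
    b = np.asarray(betti_values, dtype=float)
    t_scaled = (t - t.min()) / (t.max() - t.min())  # map thresholds to [0, 1]
    grid = np.linspace(0.0, 1.0, grid_size)
    # Treat the curve as a step function: take the value at the last threshold <= grid point.
    idx = np.searchsorted(t_scaled, grid, side="right") - 1
    return b[idx] / n_nodes


# Two hypothetical graphs with different weight scales yield vectors of equal length.
print(normalize_betti_curve([0.5, 1.0, 2.0, 3.0, 4.0], [4, 3, 2, 2, 1], n_nodes=4))
print(normalize_betti_curve([10.0, 20.0, 30.0], [3, 2, 1], n_nodes=3))
```

Because both outputs share the same grid, they can be stacked into a feature matrix for a downstream classifier without further alignment.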

🏅 Accomplishments We're Proud Of

Achieved autonomous multi-step onboarding workflow orchestration using Nemotron, in line with our "intelligent agent" vision.
Implemented real-time topological fraud detection leveraging Persistent Homology and Forman–Ricci curvature analysis.
Established a secure OAuth-based onboarding flow integrating human oversight and AI reasoning.
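
For readers unfamiliar with Forman–Ricci curvature, here is a minimal sketch using the simplest combinatorial form for unweighted graphs, F(u, v) = 4 - deg(u) - deg(v). Which variant the project used (weighted, or augmented with triangle terms) is not stated, so this form and the function name `forman_ricci` are assumptions for illustration.

```python
import networkx as nx


def forman_ricci(graph):
    """Forman-Ricci curvature per edge of an unweighted graph: 4 - deg(u) - deg(v)."""
    return {
        (u, v): 4 - graph.degree(u) - graph.degree(v)
        for u, v in graph.edges
    }


# Hypothetical example: a hub connected to four otherwise isolated entities.
hub = nx.star_graph(4)
print(forman_ricci(hub))
```

Edges attached to high-degree nodes receive more negative values, so strongly negative curvature points to hub-like structures that may deserve a closer look.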

📚 What We Learned

How to integrate reasoning-capable LLMs like Nemotron with graph-based machine learning in financial compliance.
Understanding fraud not as isolated data points but as evolving topological shapes in transaction graphs.
The importance of orchestration-level design, enabling AI systems to perform workflows autonomously rather than reactively.

🚀 What's Next

Integrate Firestore for persistent onboarding histories and cross-client pattern learning.
Expand topological feature extraction using multi-dimensional persistence diagrams for richer fraud signatures.