Inspiration

In today's digital ecosystem, users constantly exchange personal data for access to online services — often without realizing the scale or value of what they are giving away. Consent has become routine, but understanding has not.

We were driven by a central question:
If personal data fuels billion-dollar business models, why are users unaware of its value and impact?

This motivated us to build a system that brings transparency to the data economy — helping users see, quantify, and understand how their data is being collected and monetized.


What We Built

A Chrome extension with an embedded AI and machine learning engine that audits websites in real time and converts complex privacy signals into simple, actionable insights. All ML inference and browsing data stay on the device; the only outbound call sends privacy policy text to the Claude API.

Core capabilities include:

  • Real-time detection of trackers, cookies, and third-party requests
  • ML-based estimation of the economic value of user data, with confidence intervals
  • Reconstruction of the audience profile ad systems have built on the user's browsing history
  • AI-powered summarization of privacy policies into concise, human-readable insights
  • Contradiction detection — flags when a site's policy claims contradict its observed tracker behavior
  • Policy change detection — alerts users when a privacy policy is quietly updated
  • A dashboard that aggregates and visualizes data exposure across the month

We model the estimated value of user data per tracker event as:

$$ V = \sum_{i=1}^{n} (w_i \cdot d_i) $$

where:

  • $d_i$ represents detected signals (page category, tracker density, device type, geo signal, time of day)
  • $w_i$ represents weights learned from IAB CPM benchmarks and WhoTracksMe tracker data
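As a minimal sketch of this weighted sum (the feature names, values, and weights below are illustrative, not the trained values):

```python
# Illustrative weighted-sum valuation for one tracker event.
# Signal values and weights are made up for demonstration; the real
# weights are learned from IAB CPM benchmarks and WhoTracksMe data.
def estimate_event_value(signals: dict, weights: dict) -> float:
    """V = sum_i w_i * d_i over the signals both dicts share."""
    return sum(weights[k] * signals[k] for k in signals if k in weights)

signals = {"tracker_density": 12, "page_category_cpm": 2.5, "is_mobile": 1}
weights = {"tracker_density": 0.002, "page_category_cpm": 0.01, "is_mobile": 0.005}
value = estimate_event_value(signals, weights)  # dollars per event
```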

How We Built It

The system is entirely self-contained within the Chrome extension: there is no backend server of our own. All ML inference, storage, and event processing happen on-device; the one external dependency is the Claude API, which receives only privacy policy text.

Browser Extension Layer (Chrome MV3)

  • Content scripts detect third-party tracker requests using PerformanceObserver and resource timing APIs
  • The background service worker orchestrates all inference, storage, and API calls
  • declarativeNetRequest powers the block mode with dynamic rule injection
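A dynamic block rule passed to chrome.declarativeNetRequest.updateDynamicRules looks roughly like this (the tracker domain and rule id are placeholders):

```json
{
  "id": 1001,
  "priority": 1,
  "action": { "type": "block" },
  "condition": {
    "urlFilter": "||tracker.example^",
    "resourceTypes": ["script", "image", "xmlhttprequest"]
  }
}
```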

Machine Learning Layer (ONNX Runtime Web — runs in the service worker)

Valuation model:

  • PyTorch regression network: 17 features → 64 → 32 → 2 outputs
  • Trained on WhoTracksMe tracker density data and IAB CPM benchmarks
  • Outputs a low/mid/high confidence interval per tracker event
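A PyTorch sketch of that architecture follows. The layer sizes come from the description above; the interpretation of the two outputs as a mean and log-variance (from which a low/mid/high interval is derived) is our assumption about how the head works, not a confirmed detail:

```python
import torch
import torch.nn as nn

# Sketch of the valuation network: 17 input features, hidden layers of
# 64 and 32 units, and 2 outputs. We assume the two outputs are a
# predicted mean value and a log-variance; the actual head may differ.
class ValuationNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(17, 64), nn.ReLU(),
            nn.Linear(64, 32), nn.ReLU(),
            nn.Linear(32, 2),
        )

    def forward(self, x):
        mean, log_var = self.net(x).unbind(dim=-1)
        std = torch.exp(0.5 * log_var)
        # 95% interval under a Gaussian assumption.
        return mean - 1.96 * std, mean, mean + 1.96 * std

model = ValuationNet()
low, mid, high = model(torch.randn(4, 17))
```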

Mirror model:

  • PyTorch multi-label classifier:
    URL → BoW → LSA (64-dim) → 128 → 64 → 8 sigmoid outputs
  • Trained on 1.27M URL samples from the Curlie dataset
  • Reconstructs which of 8 IAB audience segments an ad system would assign to the user
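The classifier head can be sketched as below. We assume the BoW-plus-LSA featurizer (a fixed vocabulary followed by a truncated-SVD projection to 64 dimensions) runs upstream, so the model here takes the 64-dim vector directly:

```python
import torch
import torch.nn as nn

# Sketch of the mirror classifier head: a 64-dim LSA embedding of the
# URL's bag-of-words goes through 128 -> 64 hidden units to 8 sigmoid
# outputs, one per IAB audience segment (multi-label).
class MirrorNet(nn.Module):
    def __init__(self, n_segments: int = 8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(64, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, n_segments),
        )

    def forward(self, lsa_vec):
        # Independent per-segment probabilities, not a softmax.
        return torch.sigmoid(self.net(lsa_vec))

probs = MirrorNet()(torch.randn(2, 64))
```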

Both models are exported to ONNX and run entirely via WASM inside the service worker — no GPU, no server.


AI Layer (Claude API)

  • Privacy policy text is fetched, hashed with SHA-256, and analyzed by Claude on first visit or when the policy changes
  • Structured JSON output includes:
    • Plain-English summary
    • Boolean policy claims
    • Consent dark pattern score
    • Change summary
  • All results are cached in chrome.storage.local — Claude is only called when necessary
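The hash-gated caching logic can be sketched in a few lines. Here `cache` stands in for chrome.storage.local and `policy_needs_analysis` is an illustrative helper, not the extension's actual function name:

```python
import hashlib

# Minimal sketch of the call-Claude-only-when-needed logic: hash the
# policy text, and trigger analysis only on first visit or when the
# stored digest no longer matches (a quiet policy update).
def policy_needs_analysis(policy_text: str, cache: dict, site: str) -> bool:
    digest = hashlib.sha256(policy_text.encode("utf-8")).hexdigest()
    if cache.get(site) == digest:
        return False          # unchanged: reuse cached summary
    cache[site] = digest      # first visit or changed policy: re-analyze
    return True

cache = {}
first = policy_needs_analysis("We collect X.", cache, "example.com")
again = policy_needs_analysis("We collect X.", cache, "example.com")
changed = policy_needs_analysis("We collect X and Y.", cache, "example.com")
```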

Processing pipeline per page visit:

  1. Content script detects tracker requests and scores the cookie consent banner for dark patterns
  2. Service worker runs the ONNX valuation model to assign a dollar value with confidence interval
  3. Service worker fetches and hashes the site's privacy policy; calls Claude if the hash has changed
  4. Contradiction engine cross-references policy claims against observed tracker categories
  5. All events are stored locally; dashboard aggregates them into the monthly statement
  6. Mirror model runs against the full browsing history to reconstruct the user's ad profile
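Step 4 above can be sketched as a simple cross-reference. The claim names and the claim-to-tracker-category mapping below are illustrative, not the extension's actual schema:

```python
# Sketch of the contradiction engine: Claude's boolean policy claims
# (e.g., "shares_with_advertisers": False) are checked against tracker
# categories actually observed on the page.
CLAIM_TO_CATEGORY = {
    "shares_with_advertisers": "advertising",
    "uses_analytics": "analytics",
}

def find_contradictions(policy_claims: dict, observed_categories: set) -> list:
    """Return claims the policy denies but observed trackers contradict."""
    return [
        claim
        for claim, category in CLAIM_TO_CATEGORY.items()
        if policy_claims.get(claim) is False and category in observed_categories
    ]

flags = find_contradictions(
    {"shares_with_advertisers": False, "uses_analytics": True},
    {"advertising", "analytics"},
)  # flags only the denied-but-observed claim
```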

What We Learned

  • Running ONNX models inside a Chrome MV3 service worker requires specific WASM configuration (wasm-unsafe-eval CSP, single-threaded execution, explicit WASM path resolution)
  • Privacy-related data is highly unstructured; preprocessing and feature engineering matter more than model complexity
  • Combining language models with local ML creates a strong hybrid system — Claude handles open-ended text understanding, ONNX handles low-latency structured inference
  • Users respond more strongly to quantitative, dollar-denominated insights than abstract privacy warnings

Challenges We Faced

  • Chrome MV3's Content Security Policy blocks WebAssembly by default — required adding wasm-unsafe-eval and configuring ONNX Runtime Web for single-threaded WASM execution
  • Lack of labeled ground-truth datasets for data valuation required building a training set from WhoTracksMe tracker density statistics combined with IAB CPM benchmarks
  • Class imbalance in the mirror training dataset (some audience segments 30× more common than others) required weighted loss functions and per-class confidence thresholds
  • Privacy policy formats vary wildly across sites, making reliable text extraction and consistent LLM output structure a significant engineering challenge
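One way to implement the weighted loss for the segment imbalance is a per-class positive weight, as sketched below. The counts are illustrative; the real ones come from the 1.27M-sample training set:

```python
import torch
import torch.nn as nn

# Sketch of a weighted multi-label loss: rarer segments get a larger
# positive-class weight (negatives-to-positives ratio per segment), so a
# 30x rarer segment contributes roughly 30x more per positive example.
positives = torch.tensor([300_000.0, 10_000.0, 150_000.0])  # per-segment counts
total = 1_270_000.0
pos_weight = (total - positives) / positives
loss_fn = nn.BCEWithLogitsLoss(pos_weight=pos_weight)

logits = torch.zeros(2, 3)
targets = torch.tensor([[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
loss = loss_fn(logits, targets)
```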

Built With

  • chrome-extension-mv3
  • chrome.storage
  • claude-api
  • declarativenetrequest
  • iab
  • javascript
  • onnx-runtime-web
  • python
  • pytorch
  • react
  • tailwind-css
  • vite
  • web-crypto-api
  • webassembly
  • whotracksme