PrivacyLens is a small web + backend app that analyzes mobile app privacy policies and Play Store metadata, enhances them using an LLM (Azure OpenAI), and surfaces a concise 4-bullet privacy summary in the UI.
- Server-side scraping of Google Play metadata and privacy policy.
- LLM-based extraction and summarization (Azure OpenAI) with server-side normalization.
- Caching of LLM results to control cost (24h TTL).
- Robust fallback when policy scraping is blocked: developer website and Play Store page text are analyzed by the LLM.
The diagram above shows the request flow: Frontend → Backend → (Play Store scraper / privacy policy fetcher). If the privacy policy can be fetched directly, it is parsed and optionally enhanced with the LLM; if the fetch is blocked, the backend uses the developer site or Play Store text as fallback input to the LLM. Results are normalized and cached before being returned to the frontend.
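The fallback order in this flow can be sketched as follows. This is a minimal illustration, not the backend's actual code: `pickPolicySource`, the three injected fetcher functions, and the `source` / `fallback` fields are hypothetical names, and treating every non-policy source as a fallback is an assumption.

```javascript
// Hypothetical sketch of the fallback order described above.
// The fetchers are injected so the flow can run without network access.
async function pickPolicySource({ fetchPolicy, fetchDeveloperSite, fetchPlayStoreText }) {
  // 1. Prefer the privacy policy itself.
  try {
    const text = await fetchPolicy();
    if (text) return { source: 'policy', fallback: false, text };
  } catch (err) {
    // Blocked (e.g. HTTP 403) or unreachable: continue with the fallbacks.
  }

  // 2. Try the developer website.
  try {
    const text = await fetchDeveloperSite();
    if (text) return { source: 'developer-site', fallback: true, text };
  } catch (err) {
    // Not available: continue with the last fallback.
  }

  // 3. Last resort: the Play Store page text.
  const text = await fetchPlayStoreText();
  return { source: 'play-store', fallback: true, text };
}

// Demo with stubbed fetchers: the policy fetch is blocked, the developer site has no page.
pickPolicySource({
  fetchPolicy: async () => { throw new Error('HTTP 403'); },
  fetchDeveloperSite: async () => null,
  fetchPlayStoreText: async () => 'Play Store description text',
}).then((result) => console.log(result.source, result.fallback));
```

Injecting the fetchers keeps the ordering logic separate from the scraping code, which also makes it easy to test with stubs.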
A concise list of the main technologies, libraries and developer tools used across the project.
- Backend
  - Language / runtime: Node.js (CommonJS)
  - Web framework: Express
  - Key libraries: axios, cheerio (HTML parsing), google-play-scraper (Play Store metadata), node-cache (simple in-memory caching), express-rate-limit, cors
  - LLM integration: Azure OpenAI (see backend/azureOpenAI.js for usage)
  - Dev: nodemon for development hot-reload
- Frontend (mobile-first / Expo)
  - Framework: React + React Native using Expo
  - Router: expo-router
  - Key libraries: @expo/vector-icons, react-native-gesture-handler, react-native-reanimated, react-native-safe-area-context, react-native-screens, react-native-svg, react-native-web (web compatibility)
  - Entry: Expo app (see frontend/app and app.json)
- Tools & Services
  - Local development: Node.js, npm/yarn, Expo CLI (for running the frontend mobile/web app)
  - Source control: Git
  - LLM provider: Azure OpenAI (endpoint, deployment and API key stored in backend env)
- Languages / Formats
  - JavaScript (ES) across frontend and backend
  - JSON for configuration (package.json, app.json)
This section is kept intentionally short — see backend/package.json and frontend/package.json for exact dependency versions used in the project.
- Clone and open the repo (you already have it):
cd D:\PrivacyLens
- Backend
  - Install dependencies and set environment variables. Do not commit .env; it is already ignored.
cd backend
npm install
# Set required environment variables (PowerShell example). Replace with your values:
$env:AZURE_OPENAI_KEY = "<your-key>"
$env:AZURE_OPENAI_ENDPOINT = "https://your-resource.openai.azure.com"
$env:AZURE_OPENAI_DEPLOYMENT = "your-deployment-name"
# (optional) $env:AZURE_OPENAI_API_VERSION = "2023-05-15" or as configured
# Start backend
npm run dev # or: node server.js
# Backend runs on http://localhost:3000 by default
# If port 3000 conflicts with another service, use a different port (the author runs it on http://localhost:4002) and adjust the FRONTEND env var accordingly
- Frontend
cd ..\frontend
npm install
npm run start
# Open the Expo/React web UI (usually http://localhost:8081, or the URL shown in the terminal)
- AZURE_OPENAI_KEY: API key (server-only secret)
- AZURE_OPENAI_ENDPOINT: Azure OpenAI endpoint URL
- AZURE_OPENAI_DEPLOYMENT: deployment name to use for chat/completions
- (Optional) AZURE_OPENAI_API_VERSION: API version; defaults are present in code
Keep these values in backend/.env locally (already ignored by git).
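A startup check along these lines can surface a missing variable early instead of failing on the first LLM call. This is a sketch under assumptions: `loadAzureConfig` and the returned field names are hypothetical, and the real handling lives in backend/azureOpenAI.js. The API version is left undefined when unset, so the caller can apply whatever default the code already has.

```javascript
// Variable names come from the list above; the helper itself is hypothetical.
const REQUIRED_VARS = [
  'AZURE_OPENAI_KEY',
  'AZURE_OPENAI_ENDPOINT',
  'AZURE_OPENAI_DEPLOYMENT',
];

function loadAzureConfig(env = process.env) {
  // Collect every missing name so the error reports all of them at once.
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(', ')}`);
  }
  return {
    apiKey: env.AZURE_OPENAI_KEY,
    // Strip trailing slashes so URL joins stay predictable.
    endpoint: env.AZURE_OPENAI_ENDPOINT.replace(/\/+$/, ''),
    deployment: env.AZURE_OPENAI_DEPLOYMENT,
    // Optional: undefined means "use the default already present in code".
    apiVersion: env.AZURE_OPENAI_API_VERSION || undefined,
  };
}
```

Passing `env` as a parameter keeps the function testable without touching the real process environment.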
- GET /health — service health
- GET /api/app/:query — returns app metadata + dataSafety. Example response highlights:
  - metadata.privacyScore (Number)
  - dataSafety.dataCollected (array)
  - dataSafety.dataShared (array)
  - dataSafety.securityPractices.__llmSummary: array of exactly 4 summary strings (LLM-derived)
Example: /api/app/facebook → JSON with metadata and dataSafety.securityPractices.__llmSummary
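As a rough illustration of consuming this response, the helper below picks out the highlighted fields defensively. `readHighlights` is a hypothetical name, and the sample payload contains made-up values that only mirror the documented field paths.

```javascript
// Hypothetical client-side helper: extract the documented highlights from a response body.
function readHighlights(payload) {
  const metadata = payload.metadata || {};
  const dataSafety = payload.dataSafety || {};
  const practices = dataSafety.securityPractices || {};
  const asArray = (value) => (Array.isArray(value) ? value : []);
  return {
    privacyScore: metadata.privacyScore,
    dataCollected: asArray(dataSafety.dataCollected),
    dataShared: asArray(dataSafety.dataShared),
    summary: asArray(practices.__llmSummary),
  };
}

// Made-up sample that follows the documented shape (values are illustrative only).
const samplePayload = {
  metadata: { privacyScore: 42 },
  dataSafety: {
    dataCollected: ['Location'],
    dataShared: [],
    securityPractices: {
      __llmSummary: ['Bullet one', 'Bullet two', 'Bullet three', 'Bullet four'],
    },
  },
};

console.log(readHighlights(samplePayload));
```

With the backend running locally, the payload could come from a request such as `fetch('http://localhost:3000/api/app/facebook')` followed by `res.json()`, adjusting the port to match your setup.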
- LLM enrichment is performed server-side and normalized so the frontend receives a consistent shape (summary = 4 bullet strings).
- LLM outputs are cached for 24 hours to reduce costs and rate pressure.
- When direct policy scraping is blocked (HTTP 403 or similar), the backend attempts, in order:
  - Fetch the developer website privacy page, if available
  - If not available, send the Play Store page text to the LLM as fallback and mark the result as inferred
- The frontend shows a small attribution or fallback message when the summary was inferred from fallback content.
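The 24-hour caching of LLM results noted above can be sketched like this. The project lists node-cache for caching; this illustration deliberately uses a plain `Map` instead so that it runs without dependencies. `createTtlCache` and `cachedSummary` are hypothetical names, and keying the cache by app id is an assumption.

```javascript
const TTL_MS = 24 * 60 * 60 * 1000; // 24 hours

// Minimal in-memory TTL cache. The clock is injectable so expiry can be tested.
function createTtlCache(ttlMs = TTL_MS, now = () => Date.now()) {
  const entries = new Map();
  return {
    get(key) {
      const entry = entries.get(key);
      if (!entry) return undefined;
      if (now() >= entry.expiresAt) {
        entries.delete(key); // expired: drop it and report a miss
        return undefined;
      }
      return entry.value;
    },
    set(key, value) {
      entries.set(key, { value, expiresAt: now() + ttlMs });
    },
  };
}

// Only call the (expensive) summarizer when there is no fresh cached result.
async function cachedSummary(cache, appId, summarize) {
  const hit = cache.get(appId);
  if (hit !== undefined) return hit;
  const result = await summarize(appId);
  cache.set(appId, result);
  return result;
}
```

A repeated request for the same app within the TTL then returns the stored result without another LLM call, which is where the cost saving comes from.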
- If git push is rejected, rebase onto origin/main:
git fetch origin
git pull --rebase origin main
# resolve any conflicts, then:
git push origin main
- Keep your backend/.env local and never commit it. The repository already adds backend/.env to .gitignore.
- Add an explicit opt-in flag for LLM enrichment (e.g. ?enhance=true) to control costs and consent.
- Add unit/integration tests that mock LLM responses to verify normalization (summary array length, fields present).
- Add attribution UI enhancements when summaries are inferred (not direct policy text).
This project is provided as-is for educational and prototyping use.
