## Inspiration

The chaotic yet vibrant traffic of Casablanca was our main muse. As Morocco moves towards Smart Cities, we wanted to answer a critical question: "Can we build a metropolitan-scale Digital Twin using zero-budget, recycled hardware?" We didn't have access to expensive cloud clusters (AWS/Azure). Instead, we looked at the dusty desktops in our lab—an old Dell Optiplex (2011) and an HP ProDesk. The challenge wasn't just to simulate traffic; it was to prove that sophisticated Big Data architectures (Kafka, Spark) could run efficiently on constrained, heterogeneous hardware to solve real urban problems like traffic violations and police dispatching.

## What it does

CasaTwin is a real-time distributed simulation and monitoring system for traffic management in Casablanca.

* Simulates Traffic: It tracks 500+ taxis moving across the actual coordinates of Casablanca in real time.
* Detects Infractions: The system automatically detects anomalies such as speeding (>60 km/h), red light violations, and illegal parking.
* Intelligent Processing (The Brain): It doesn't just log the violation; it "investigates" it. It retrieves the driver's identity and license points from a 10-million-row database, then decides whether the fine is payable immediately or requires court action (based on point balance).
* Smart Dispatch: It calculates the distance between the violation and active police units using geospatial formulas to assign the nearest officer.
* Live Visualization: A real-time dashboard updates every few milliseconds via WebSockets, giving authorities a "God's eye view" of the city.
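The Smart Dispatch step can be illustrated with a minimal sketch. The write-up only says "geospatial formulas", so the haversine great-circle distance used here is an assumption, and `PoliceUnit`, `haversine_km`, and `nearest_unit` are hypothetical names chosen for illustration rather than the project's actual code.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


@dataclass(frozen=True)
class PoliceUnit:
    """Hypothetical record for one police unit (not the project's schema)."""
    unit_id: str
    lat: float
    lon: float
    active: bool = True


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def nearest_unit(lat, lon, units):
    """Return the closest active unit to the violation, or None if none is active."""
    active = [u for u in units if u.active]
    if not active:
        return None
    return min(active, key=lambda u: haversine_km(lat, lon, u.lat, u.lon))


# Illustrative coordinates around central Casablanca (made up for the example).
units = [
    PoliceUnit("A", 33.5800, -7.6000),
    PoliceUnit("B", 33.6000, -7.5000),
    PoliceUnit("C", 33.5731, -7.5898, active=False),  # off duty, must be skipped
]
chosen = nearest_unit(33.5731, -7.5898, units)
```

A linear scan like this is fine for a handful of units; a spatial index would only matter if the number of units grew large.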

## How we built it

We built a Distributed Data Pipeline running on a "Garage Cluster" of 3 networked machines connected via a Tailscale Mesh VPN.

  1. The Hardware (The "Frankenstein" Cluster)
     * Node 1 (Lenovo Laptop): The Simulator (Generator).
     * Node 2 (Dell Optiplex 790 - i5 Sandy Bridge): The Message Broker (Kafka).
     * Node 3 (HP Pro - i5 Coffee Lake): The Processing Unit (Spark).
  2. The Software Stack
     * Ingestion: We used Apache Kafka in KRaft mode (removing Zookeeper to save RAM) to stream thousands of telemetry events per second.
     * Processing: Apache Spark Structured Streaming (PySpark) acts as the brain. It performs stream-static joins between the Kafka stream (speed/location) and MySQL (driver records/vehicle data).
     * Storage:
       * MySQL: Stores relational data (Vehicles, Drivers, Police), indexed for $O(\log n)$ lookups.
       * MongoDB: Stores the processed JSON logs of infractions for the frontend.
     * Backend & Frontend: A FastAPI server reads from MongoDB and pushes updates to a React dashboard via WebSockets.

## Challenges we ran into

* Resource Starvation: Running JVM (Java Virtual Machine) processes for Spark and Kafka on machines with only 8 GB of RAM caused constant crashes. We had to fine-tune the heap size (-Xmx) and optimize the Spark partitions to keep the system stable.

## Accomplishments that we're proud of

* Latency under 200 ms: Despite the old hardware and the VPN, the time from "Infraction Occurred" to "Dashboard Update" stays under 200 ms.
* Self-Healing Architecture: Using Docker Compose with restart policies meant that if a service crashed, its container would restart automatically without breaking the pipeline.
* Complex Stream-Static Joins: We successfully joined a high-velocity stream (Kafka) with a high-volume static table (MySQL) in Spark without creating a bottleneck.

## What we learned

* Network is everything: In distributed systems, 90% of bugs are networking. We learned deep details about Docker networking, bind addresses, and VPN routing.
* KRaft vs. Zookeeper: We learned that for smaller, resource-constrained clusters, Kafka's new KRaft mode is a lifesaver compared to the heavy legacy Zookeeper setup.
* The value of Logging: We learned to stop assuming code works and start implementing robust callbacks (Success/Error handlers) in our producers to catch silent failures.
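The success/error handlers mentioned under "The value of Logging" could look roughly like the sketch below. It assumes the kafka-python client, where `KafkaProducer.send()` returns a future that accepts `add_callback` and `add_errback`; the write-up does not say which client library was used. `DeliveryStats`, `send_with_handlers`, the logger name, and the topic name are all hypothetical.

```python
import logging
from types import SimpleNamespace

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("casatwin.producer")  # hypothetical logger name


class DeliveryStats:
    """Counts delivered and failed sends so silent failures become visible."""

    def __init__(self):
        self.delivered = 0
        self.failed = 0

    def on_success(self, metadata):
        # kafka-python passes record metadata carrying topic, partition and offset.
        self.delivered += 1
        log.info("delivered to %s[%d] at offset %d",
                 metadata.topic, metadata.partition, metadata.offset)

    def on_error(self, exc):
        # Called with the exception when the send ultimately fails.
        self.failed += 1
        log.error("delivery failed: %r", exc)


def send_with_handlers(producer, topic, payload, stats):
    """Send one record and attach both handlers to the returned future.

    `producer` is assumed to be a kafka-python KafkaProducer (or anything whose
    send() returns an object with add_callback / add_errback).
    """
    future = producer.send(topic, payload)
    future.add_callback(stats.on_success)
    future.add_errback(stats.on_error)
    return future


# Dry run without a broker: call the handlers directly with stand-in values.
stats = DeliveryStats()
stats.on_success(SimpleNamespace(topic="telemetry", partition=0, offset=42))
stats.on_error(TimeoutError("broker unreachable"))
```

With a real broker, the same `stats` object would be passed to `send_with_handlers` for every telemetry event, and a rising `failed` count would flag the kind of silent failure described above.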
## What's next for CasaTwin

* Predictive AI: Feeding the historical data into a Machine Learning model (LSTM) to predict traffic jams before they happen.
* Edge Computing: Moving the "Simulator" logic to actual IoT devices placed in cars so the system uses real GPS data.
* Mobile App: A specialized interface for police officers to receive notifications (e.g., "Speeding vehicle 1022-A-1 approaching your sector").