CodeBlue | Devpost

Inspiration

Clinical trials are in a quiet crisis. Coming from families with medical backgrounds, we were shocked to discover that up to 40% of participants drop out before their trial is complete, derailing critical medical research. This isn't just a number; it's a cascade of failure. About 20% of trials fail entirely because they can't retain enough participants, and 80% face delays that cost anywhere from $600,000 to $8 million for each lost month. The financial burden is staggering: replacing a single dropout costs nearly $20,000, meaning a standard 500-patient trial can lose up to $2.9 million just replacing the people who leave. Worse, participants often do not understand a trial's preparation and eligibility requirements. Without that understanding and transparency, they can inadvertently do things before or during the trial that compromise it and produce erroneous data, costing $500k to $1M in damages and adding 3 to 6 months per mistrial. This broken system delays life-saving treatments and shatters patient trust. We knew there had to be a better way.

What it does

CodeBlue is a comprehensive platform designed to fight participant dropout and promote transparency by transforming the clinical trial experience from confusing and isolating to personalized and engaging. For patients, we provide an intuitive dashboard with a personalized trial timeline, making complex protocols easy to understand through tailored checklists that account for their underlying conditions and habits. An AI assistant is available 24/7 to answer questions about their electronic medical record (EMR) and trial procedures, providing instant, reliable support. Patients can even update their EMR through natural language conversations or file uploads, which we extract and process. For Clinical Research Coordinators (CRCs), we offer a powerful dashboard to monitor patient progress in real time, helping them maintain healthy provider-patient relationships and intervene before a patient considers dropping out. We bridge the communication gap that costs the industry billions and delays medical breakthroughs.

How we built it

We built CodeBlue on a robust and secure tech stack, featuring a React Native frontend for a seamless mobile experience and a FastAPI backend for high-performance data processing. All data is managed in real time using Google's Firebase. As an added safeguard, we also verify every clinical trial against the U.S. government's ClinicalTrials.gov API.
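As a rough illustration of the trial verification step, here is a minimal sketch. It assumes the public ClinicalTrials.gov v2 REST endpoint for looking up a single study and the standard NCT identifier format ("NCT" followed by eight digits). The function names `study_url` and `trial_exists` are hypothetical and are not taken from our codebase.

```python
import re
import urllib.error
import urllib.request

# Assumption: registry identifiers look like "NCT" followed by eight digits.
NCT_PATTERN = re.compile(r"NCT\d{8}")
# Assumption: the ClinicalTrials.gov v2 endpoint for fetching one study by ID.
API_BASE = "https://clinicaltrials.gov/api/v2/studies"


def study_url(nct_id: str) -> str:
    """Normalize an NCT ID and build the lookup URL, rejecting malformed IDs."""
    normalized = nct_id.strip().upper()
    if not NCT_PATTERN.fullmatch(normalized):
        raise ValueError(f"not a valid NCT identifier: {nct_id!r}")
    return f"{API_BASE}/{normalized}"


def trial_exists(nct_id: str, timeout: float = 10.0) -> bool:
    """Return True if the registry has a study with this ID (network call)."""
    try:
        with urllib.request.urlopen(study_url(nct_id), timeout=timeout) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False  # well-formed ID, but no such study in the registry
        raise  # any other HTTP error is surfaced to the caller
```

Checking the ID format locally before calling the registry keeps obviously malformed input from ever reaching the network.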

To support HIPAA compliance, we engineered a dual-token authentication system. When a user signs up, we generate two distinct identifiers: a "mainID" for their general user profile and a separate, securely hashed "emrID". The "emrID" is a one-way hash of the "mainID", meaning the hash cannot be inverted to recover the ID it came from. This architecture creates a firewall between the two data stores, isolating sensitive Protected Health Information (PHI) from Personally Identifiable Information (PII) so that medical records are never stored alongside the identity they belong to.
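The dual-identifier idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not our production code: it uses a keyed hash (HMAC-SHA256) rather than a bare hash, so that someone who only sees a "mainID" cannot recompute the matching "emrID" without the server secret. The secret, the helper names, and the in-memory stores are all hypothetical.

```python
import hashlib
import hmac
import secrets
import uuid

# Hypothetical server-side secret; a real deployment would load this from a
# secrets manager rather than generating it at import time.
SERVER_SECRET = secrets.token_bytes(32)


def new_main_id() -> str:
    """Create a random identifier for the general user profile."""
    return uuid.uuid4().hex


def derive_emr_id(main_id: str, secret: bytes = SERVER_SECRET) -> str:
    """Derive the EMR identifier as a keyed one-way hash of the main ID."""
    return hmac.new(secret, main_id.encode("utf-8"), hashlib.sha256).hexdigest()


main_id = new_main_id()
emr_id = derive_emr_id(main_id)

# Two disjoint stores (plain dicts standing in for separate databases):
# profile data is keyed by mainID, medical data only by emrID.
profile_store = {main_id: {"display_name": "Example Patient"}}
emr_store = {emr_id: {"conditions": ["asthma"]}}
```

The derivation is deterministic, so the backend can always get from a logged-in user's "mainID" to their records, while the EMR store alone contains nothing that points back to a profile.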

Our timeline generation system uses MedGemma-27B, Google's specialized open-source medical model, with our own fine-tuning. It takes trial data from CRCs and generates a timeline broken into stages, then takes in information about the patient to tailor that timeline into terms that are relevant and understandable to them. We also optimized our hosting of the model on Runpod by finding the optimal number of workers and the optimal spread across GPUs.

The heart of our platform is a sophisticated multi-agent AI architecture built with LangChain's newest technology: DeepAgents (released only 2 to 3 weeks ago). Our "deepagent" consists of specialized sub-agents, each with multi-tool calling capabilities and able to act linearly, non-linearly, and even simultaneously. Our main "router" agent analyzes incoming user queries and delegates them to three specialized sub-agents, which build context and help each other via Redis subsession caching: 1) EMR Manager: Manages and interprets patient health records. 2) Trial Info Specialist: Explains complex trial protocols in simple terms. 3) Clinical Org Assistant: Helps trial staff manage administrative tasks and updates.
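To make the routing pattern concrete, here is a deliberately simplified sketch in plain Python. It does not use DeepAgents, Gemini, or Redis: a keyword match stands in for the LLM router, and a bounded in-memory deque stands in for the shared subsession cache. Every name and keyword set below is hypothetical.

```python
import re
from collections import deque
from typing import Callable, Deque, Dict

# Rolling history shared by all sub-agents; an in-memory stand-in for the
# Redis subsession cache. Only the most recent MAX_ENTRIES lines are kept.
MAX_ENTRIES = 4
History = Deque[str]


def emr_manager(query: str, history: History) -> str:
    return f"[EMR Manager] {query} (context entries: {len(history)})"


def trial_info_specialist(query: str, history: History) -> str:
    return f"[Trial Info Specialist] {query} (context entries: {len(history)})"


def clinical_org_assistant(query: str, history: History) -> str:
    return f"[Clinical Org Assistant] {query} (context entries: {len(history)})"


SUBAGENTS: Dict[str, Callable[[str, History], str]] = {
    "emr_manager": emr_manager,
    "trial_info_specialist": trial_info_specialist,
    "clinical_org_assistant": clinical_org_assistant,
}

# Hypothetical keyword sets; the real router is an LLM, not a keyword match.
KEYWORDS = {
    "emr_manager": {"record", "records", "allergy", "medication", "emr"},
    "clinical_org_assistant": {"enroll", "roster", "staff", "coordinator"},
}
DEFAULT_AGENT = "trial_info_specialist"


def route(query: str) -> str:
    """Pick the sub-agent whose keywords appear in the query."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    for name, keywords in KEYWORDS.items():
        if words & keywords:
            return name
    return DEFAULT_AGENT


def handle(query: str, history: History) -> str:
    """Route the query, run the sub-agent, and record both sides in history."""
    name = route(query)
    reply = SUBAGENTS[name](query, history)
    history.append(f"user: {query}")
    history.append(f"{name}: {reply}")
    return reply


history: History = deque(maxlen=MAX_ENTRIES)
handle("Please add my penicillin allergy to my record", history)
handle("What happens at my week 4 visit?", history)
handle("Which staff member is my coordinator?", history)
```

Because every sub-agent reads and writes the same bounded history, a later agent can see what an earlier one handled, and the oldest entries fall off automatically once the cap is reached.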

This architecture of Gemini sub-agents ensures that every query is handled collaboratively by specialized agents, leading not only to fast, accurate, and context-aware responses, but also to autonomous tool calling that runs CRUD operations on databases, records, and profiles.

Challenges we ran into

Integrating such a complex system came with significant hurdles. Once we had settled on our problem, we faced high walls to clear in order to be HIPAA compliant and to avoid violating any FDA rules, such as those around displaying diagnoses. Hence, we built our dual-token authentication system, which anonymizes all of our EMR data while still giving us separate access to the user's profile information. Our run of technical hurdles began with attempting to translate CedarOS React components into React Native, which eventually led us to build our own AI backend for similar frontend components. Then all development stopped when our dual-token authentication system locked everyone out of the application until we fixed it. A variety of React Native library installation and cache issues also stalled our development. Having a mix of hackathon experts and first-time hackers made our collaboration slower at first, but we turned it into an opportunity for extreme clarity between teammates, leading to amazing four-way pair programming sessions with seamless back-to-back git pushes and pulls. Another big challenge was getting MedGemma to output our timeline precisely and in a reasonable amount of time, which meant spending a lot of time optimizing GPUs on Runpod, since we had to self-host the open-source model. Finally, the core of our problems lay in developing our Gemini deepagent framework, which required extensive tool-by-tool and agent-by-agent testing to work. It was even held up at the end, when we ran out of Gemini credits and had to swap back and forth with OpenAI (the free tier's per-minute rate limit was not enough for our deepagent). OpenAI's API was quite slow and buggy with our current version, so there is room for improvement with more Gemini credits.

Accomplishments that we're proud of

We are incredibly proud of building a fully functional, end-to-end platform that addresses a critical real-world problem. Our greatest technical achievements include our HIPAA-compliant dual-token authentication with disjoint anonymized databases; GPU optimizations that reduce runtime for Google's large open-source medical model (MedGemma-27B) and let us set it up effectively for timeline generation; our Gemini multi-tool deepagent architecture; and our Redis subsession caching for cross-context protocols and rolling conversation histories. But above all, we're most proud that we were able to build something with actionable impact, something that truly helps clinical trials around the world come to fruition and gives them the opportunity to make a difference.

What we learned

This project, for us, was a seminar in bridging ambitious ideas with real-world execution. Technically, we learned that HIPAA compliance isn't just a feature; it's a core design principle that demands rigorous solutions like our dual-token architecture. We dove deep into the practicalities of MLOps, optimizing Runpod GPU workers for MedGemma to balance speed and cost. Working with DeepAgents, a bleeding-edge framework only about two weeks old and with minimal documentation (and hence without any help from AI), put us into full critical-thinking mode. We learned the immense value of methodical, tool-by-tool testing and how to debug a system that few have ever touched. Beyond the code, we learned how to transform a diverse team of sophomores, juniors, and seniors, ranging from first-time hackers to hackathon veterans, into a cohesive unit. The initial friction from our varied experience levels forced us to adopt radical transparency, leading to incredible four-way pair programming sessions where knowledge flowed seamlessly. The newcomers learned advanced development workflows, while the experienced members learned to articulate complex ideas with absolute clarity. We discovered that a team with diverse skills isn't a bottleneck, but instead a catalyst for mentorship and superior communication.

What's next for CodeBlue

The future is bright for CodeBlue. Given the time constraints of the hackathon, we were unable to add many of the features we wanted, such as a human-in-the-loop system in which clinicians approve patients joining clinical trials. We also hope to make our chat a lot more engaging by adding conversational elements, potentially an active-passive agentic workflow that extracts information from conversations more effectively. Our timelines currently use static visuals, but eventually we would like to try dynamic, AI-generated visuals. We experimented with using Veo 3 to create videos of people explaining the clinical trial to promote accessibility, but time constraints forced us to halt this in favor of higher-priority features. We also discussed adding our own recommendation system to match patients and CRCs, but ultimately left it out, both to stay focused on our problem statement and, again, because of time constraints. Towards the end, we had to give up quite a few deepagent features in the frontend and backend because we ran out of Gemini credits and had to keep switching back and forth with OpenAI. We also didn't have time for notification reminders, but that would be super cool.