🧠 Auralysis

AI-powered medical image diagnosis, reporting, explanation, and doctor-style voice output.

💡 Inspiration

Healthcare reports are often slow, unclear, and difficult for patients to understand. MRI and X-ray interpretation requires expert review, which isn’t always instantly available.

Auralysis started with a simple question:

What if an AI could diagnose medical images and also explain the results clearly in a doctor’s voice?

This led to building a system that diagnoses, reports, explains, and speaks medical insights automatically.

⚙️ What It Does

Auralysis performs a complete end-to-end medical AI pipeline:

  • Predicts disease class using a TensorFlow model
  • Generates a structured clinical report using Groq LLaMA
  • Simplifies the explanation for patients using Gemini
  • Converts the summary into a doctor-style audio output using ElevenLabs
  • Tracks latency, errors, requests, and confidence using Datadog

Pipeline: Diagnose → Report → Explain → Speak
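The four-stage flow above can be sketched in a few lines of Python. This is an illustrative outline only: the function names, payload shapes, and placeholder return values are assumptions, not Auralysis's actual code (in the real system each stage calls a remote service).

```python
# Hypothetical sketch of the Diagnose → Report → Explain → Speak pipeline.
# All names and payloads here are illustrative assumptions.

def diagnose(image_bytes: bytes) -> dict:
    # In Auralysis this would call the Render-hosted TensorFlow service.
    return {"prediction": "pneumonia", "confidence": 0.91}

def report(diagnosis: dict) -> str:
    # Groq LLaMA turns the diagnosis into a structured clinical report.
    return f"Findings consistent with {diagnosis['prediction']}."

def explain(clinical_report: str) -> str:
    # Gemini rewrites the report in patient-friendly language.
    return "In plain terms: " + clinical_report

def speak(summary: str) -> bytes:
    # ElevenLabs converts the summary into doctor-style MP3 audio.
    return summary.encode("utf-8")  # placeholder for real TTS bytes

def run_pipeline(image_bytes: bytes) -> dict:
    dx = diagnose(image_bytes)
    rpt = report(dx)
    summary = explain(rpt)
    audio = speak(summary)
    return {"diagnosis": dx, "report": rpt, "summary": summary, "audio": audio}
```

Each stage consumes the previous stage's output, which is why failures anywhere in the chain have to be handled explicitly (see Challenges below).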

🛠️ How I Built It

🧪 Model Training

  • Dataset: 5400 MRI & X-ray images
  • 6 medical classes
  • Model: EfficientNetV2B0 (TensorFlow/Keras)
  • Validation accuracy: 86.94%
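A transfer-learning setup like the one described above might look as follows in Keras. This is a minimal sketch under assumptions: the input size, head layers, and hyperparameters are illustrative, not the project's exact training code (and `weights=None` is used here only to stay offline; in practice you would start from `"imagenet"` weights).

```python
# Hedged sketch of an EfficientNetV2B0 transfer-learning model for 6 classes.
# Hyperparameters and layer choices are assumptions, not the project's code.
import tensorflow as tf

NUM_CLASSES = 6          # six medical classes (from the write-up)
IMG_SIZE = (224, 224)    # assumed input resolution

def build_model() -> tf.keras.Model:
    base = tf.keras.applications.EfficientNetV2B0(
        include_top=False,
        weights=None,            # "imagenet" in practice; None avoids a download
        input_shape=IMG_SIZE + (3,),
    )
    base.trainable = False       # freeze the backbone for initial training
    model = tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```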

☁️ Render ML Inference API

  • Exported TensorFlow SavedModel
  • Built FastAPI inference service
  • Returns prediction, confidence, and probabilities
  • Deployed on Render as a scalable inference microservice
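The service's response shape (prediction, confidence, probabilities) can be sketched with a small stdlib-only helper that maps raw model logits to JSON. The class names here are placeholders, not the real label set:

```python
# Minimal sketch of how an inference endpoint might shape its response.
# Class names are hypothetical; the real service uses the 6 medical labels.
import math

CLASSES = ["class_a", "class_b", "class_c", "class_d", "class_e", "class_f"]

def to_response(logits: list) -> dict:
    """Convert raw logits to {prediction, confidence, probabilities}."""
    m = max(logits)                              # stabilise the softmax
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = probs.index(max(probs))
    return {
        "prediction": CLASSES[best],
        "confidence": round(probs[best], 4),
        "probabilities": dict(zip(CLASSES, probs)),
    }
```

In the deployed service this dict would be returned by a FastAPI route after the SavedModel produces the logits.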

🧩 Railway Backend (Pipeline Brain)

Handles the full pipeline:

  • Groq LLaMA → structured JSON medical report
  • Gemini → simplified patient-friendly summary
  • ElevenLabs → doctor-style MP3 voice report
  • Datadog → live monitoring of every request
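The Datadog monitoring above can be illustrated with the plain-text DogStatsD datagram format, which the Datadog agent accepts over UDP (port 8125 by default). The metric names and tags below are assumptions, not Auralysis's actual dashboard metrics:

```python
# Hedged sketch of emitting request/latency metrics via DogStatsD datagrams.
# Metric names and tags are illustrative assumptions.
import socket

def dogstatsd_datagram(name: str, value, kind: str, tags=None) -> str:
    """Format a DogStatsD line, e.g. 'pipeline.latency:812|h|#stage:report'."""
    line = f"{name}:{value}|{kind}"
    if tags:
        line += "|#" + ",".join(tags)
    return line

def send_metric(name, value, kind="c", tags=None,
                host="127.0.0.1", port=8125):
    # The Datadog agent listens for these datagrams on UDP 8125 by default.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.sendto(dogstatsd_datagram(name, value, kind, tags).encode(),
                (host, port))
    sock.close()
```

In practice the official `datadog` Python client wraps this protocol; the raw format is shown here only to make the mechanism concrete.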

📱 Flutter Mobile App

Users can:

  • Upload MRI/X-ray images
  • Trigger the pipeline
  • Receive diagnosis + detailed report
  • Listen to a doctor-style spoken explanation

🚧 Challenges

  • Managing flow between Groq → Gemini → ElevenLabs → Render
  • Keeping LLM outputs stable and strictly JSON
  • Reducing latency across cloud services
  • Integrating Datadog metrics
  • Securing API keys across platforms
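The "strictly JSON" challenge above is a common LLM-integration problem: models often wrap their JSON in markdown fences or drop required fields. One defensive pattern is to strip fences, parse, and validate against a required schema before passing the report downstream. The key names here are illustrative assumptions, not the project's actual report schema:

```python
# Sketch of hardening LLM output into strict JSON; schema keys are assumed.
import json
import re

REQUIRED_KEYS = {"diagnosis", "findings", "recommendation"}  # hypothetical schema

def parse_llm_json(raw: str) -> dict:
    # LLMs often wrap JSON in ```json ... ``` fences; remove them first.
    cleaned = re.sub(r"^```(?:json)?\s*|\s*```$", "", raw.strip())
    data = json.loads(cleaned)          # raises ValueError on malformed JSON
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"report missing keys: {sorted(missing)}")
    return data
```

Rejecting malformed output early lets the backend retry the LLM call instead of sending a broken report to Gemini or ElevenLabs.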

🏆 Accomplishments

  • Complete real-time medical AI pipeline
  • Dual-LLM architecture for clarity and accuracy
  • Automated voice-based medical reporting
  • Full Datadog observability dashboard
  • A mobile app delivering results in seconds

📚 What I Learned

  • Production-grade ML deployment
  • Multi-service AI orchestration
  • Why observability matters for reliability
  • AI explainability improves patient understanding
  • Voice output greatly improves accessibility

🔮 What’s Next

🚀 Short Term

  • Add more disease classes
  • Improve severity scoring
  • Add multilingual voice support

🏥 Long Term

  • Doctor review portal
  • Steps toward medical compliance
  • CT/ultrasound image support
  • Stronger LLM reasoning and validation

❤️ Final Thoughts

Auralysis began as an attempt to make medical AI easier to understand.
Now it diagnoses, explains, and speaks medical insights clearly.

Clear diagnosis. Clear explanation. Clear voice.

Built With

  • api-handling
  • datadog
  • efficientnetv2b0
  • elevenlabs
  • fastapi
  • flutter
  • gemini
  • google
  • groq
  • android
  • dotenv
  • numpy
  • pil
  • requests
  • llama
  • llm
  • metrics
  • railway
  • render
  • tensorflow
  • uvicorn

Updates


Auralysis has evolved from an initial medical image classification prototype into a fully deployed, end-to-end medical AI system. What started as a model that could classify MRI and X-ray images has now grown into a complete diagnostic pipeline.

The current version not only analyzes MRI and X-ray scans, but also generates structured medical reports using LLMs, simplifies the results into patient-friendly explanations, and delivers doctor-style voice output. The entire workflow is cloud-deployed and continuously monitored using Datadog to track requests, latency, confidence, and error signals in real time.

A mobile app now brings everything together, allowing users to upload images and receive both written and voice-based explanations within seconds. This evolution has transformed Auralysis from a standalone ML model into a production-ready, observable, and user-focused medical AI application.

Next steps include expanding disease coverage, adding multilingual voice support, and further improving clinical accuracy and reliability. More updates coming soon.
