Inspiration
The recurring challenge of disease outbreaks across Africa, from malaria and cholera to the more recent COVID-19 pandemic, highlighted the critical need for more proactive and responsive public health systems. We witnessed how delays in detection and information sharing can lead to devastating consequences for communities.
Existing surveillance methods often rely on manual reporting, leading to lags and incomplete data. Social media platforms, while rich in real-time information, are unstructured and difficult to analyze for public health trends. Weather patterns also play a significant role in vector-borne diseases. We were inspired to create a solution that could weave together these disparate data sources using the power of AI and modern data platforms to provide early warnings and actionable insights, ultimately saving lives and protecting communities across the continent.
What it does
Wavi is an intelligent disease surveillance platform designed for public health officials in Africa. It integrates diverse data streams and employs AI to identify early signs of potential disease outbreaks, enabling faster and more targeted interventions. Specifically, Wavi:
- Aggregates Heterogeneous Data: Wavi seamlessly collects and integrates data from various sources, including:
  - Hospital Records: Anonymized patient data on reported illnesses.
  - Social Media: Publicly available posts and trends related to health concerns, filtered and analyzed for relevant keywords and sentiment.
  - Weather Data: Real-time and historical weather patterns that can influence disease vectors (e.g., rainfall for malaria-carrying mosquitoes).
  - News Reports: Information on potential outbreaks or unusual health events reported in the media.
  - Mobile Phone Surveys (Future Integration): Opt-in surveys to gather real-time health information from communities.
- AI-Powered Anomaly Detection: Using machine learning algorithms on Google Cloud AI Platform, Wavi analyzes the integrated data to identify unusual spikes or patterns in disease indicators that might signal an impending outbreak. This includes:
  - Time Series Analysis: Detecting deviations from historical trends in reported cases.
  - Natural Language Processing (NLP): Analyzing social media and news reports for early mentions of symptoms or disease clusters.
  - Spatial Analysis: Identifying geographic hotspots where disease indicators are elevated.
- Predictive Modeling: Wavi employs AI models to forecast the potential spread and intensity of outbreaks based on historical data, weather patterns, and population density.
- Real-time Visualization and Alerting: The platform provides public health officials with an intuitive dashboard on Google Cloud, displaying real-time maps, charts, and key indicators. Automated alerts are triggered when anomalies or high-risk areas are detected, enabling rapid response.
- MongoDB Integration for Enhanced Search and Analysis:
  - Flexible Data Storage: MongoDB's document model allows for easy ingestion and querying of diverse and evolving data structures from various sources.
  - Powerful Search Capabilities: MongoDB's full-text search enables efficient analysis of textual data from social media and news reports.
  - Vector Search for Semantic Understanding: We utilize MongoDB's vector search capabilities (integrated with embeddings generated by AI models on Google Cloud) to understand the semantic meaning of text data, allowing for more nuanced identification of disease-related conversations and emerging symptoms even with variations in language.
- Google Cloud Integrations: Leveraging MongoDB Atlas on Google Cloud provides scalability, reliability, and seamless integration with other Google Cloud services like AI Platform and BigQuery for advanced analytics.
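The time series analysis step listed above can be illustrated with a minimal sketch. It assumes a simple trailing-window z-score rule, which is only one of many possible approaches and not necessarily the model Wavi deploys. The function name `flag_anomalies`, the 7-day window, and the threshold of 3 are hypothetical choices made for illustration.

```python
from statistics import mean, stdev


def flag_anomalies(daily_cases, window=7, threshold=3.0):
    """Return indices of days whose count is far above the trailing window.

    Illustrative only: window and threshold are assumed values, not
    parameters taken from the Wavi prototype.
    """
    flagged = []
    for i in range(window, len(daily_cases)):
        history = daily_cases[i - window:i]
        mu = mean(history)
        # Treat a perfectly flat window as having unit spread (an assumption)
        # so that the z-score below never divides by zero.
        sigma = stdev(history) or 1.0
        z = (daily_cases[i] - mu) / sigma
        if z > threshold:
            flagged.append(i)
    return flagged


# Simulated daily case counts with a sudden spike on the last day.
cases = [10, 12, 11, 9, 10, 11, 12, 10, 11, 45]
print(flag_anomalies(cases))  # → [9]
```

A rule like this only reacts to upward spikes relative to the recent past. Seasonal baselines, reporting delays, and weekday effects would need a richer model.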
How we built it
Wavi was built on Google Cloud and MongoDB Atlas:
- Data Ingestion: We utilized Google Cloud Functions and Cloud Dataflow to create scalable pipelines for ingesting and processing data from various sources. APIs were developed to pull data from hospital databases (with appropriate anonymization), social media platforms (using public APIs), and weather services.
- Data Storage: MongoDB Atlas on Google Cloud serves as the central data repository, leveraging its flexible schema to accommodate the diverse data formats. Collections were designed to store hospital records, social media posts, weather data, and aggregated analytical results.
- AI/ML Modeling: We employed Google Cloud AI Platform to train and deploy machine learning models for anomaly detection and predictive forecasting. This involved:
  - Feature Engineering: Creating relevant features from the integrated data (e.g., daily case counts, sentiment scores, weather patterns).
  - Model Selection: Experimenting with time series models (such as ARIMA and Prophet), NLP models (for sentiment analysis and topic modeling), and classification models (for outbreak prediction).
  - Model Training and Evaluation: Training models on historical datasets and evaluating their performance using appropriate metrics.
- Vector Embeddings and Search: We used AI models (deployed on Google Cloud) to generate vector embeddings for the textual data from social media and news reports. These embeddings were stored in MongoDB, enabling semantic search using MongoDB's vector search capabilities. This allows us to find relevant information even if the exact keywords are not present.
- Backend API: A RESTful API, deployed on Google Cloud Run, acts as an intermediary between the frontend and the backend services (data ingestion, AI models, MongoDB).
- Frontend Dashboard: A user-friendly web application, built with React and hosted on Google Cloud Storage, provides public health officials with interactive visualizations, real-time alerts, and reporting capabilities. It consumes data from the backend API.
- Google Cloud Integrations: We leveraged Google Cloud IAM for secure access control, Cloud Monitoring for platform health and performance monitoring, and BigQuery for potential large-scale data analysis and exploration.
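The semantic search step described above can be sketched as an aggregation pipeline built around MongoDB Atlas's `$vectorSearch` stage. This is a sketch under assumptions, not the project's actual query: the index name `post_embedding_index`, the fields `embedding`, `text`, and `region`, and the helper `build_vector_search_pipeline` are all hypothetical.

```python
def build_vector_search_pipeline(query_vector, limit=5, num_candidates=100):
    """Build an aggregation pipeline for an Atlas Vector Search query.

    Index and field names are hypothetical placeholders; a real deployment
    would use whatever names its own collection and search index define.
    """
    if num_candidates < limit:
        # Atlas needs at least as many candidates as results requested.
        raise ValueError("num_candidates must be >= limit")
    return [
        {
            "$vectorSearch": {
                "index": "post_embedding_index",  # hypothetical index name
                "path": "embedding",              # hypothetical vector field
                "queryVector": list(query_vector),
                "numCandidates": num_candidates,
                "limit": limit,
            }
        },
        {
            # Keep only the fields a dashboard might need, plus the
            # similarity score that Atlas attaches to each match.
            "$project": {
                "_id": 0,
                "text": 1,
                "region": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]


# Toy 3-dimensional vector; real embeddings have hundreds of dimensions.
pipeline = build_vector_search_pipeline([0.1, 0.2, 0.3], limit=3)
```

With PyMongo, a pipeline like this would be passed to `collection.aggregate(pipeline)` on an Atlas cluster where a matching vector search index already exists. The query vector must come from the same embedding model that produced the stored vectors.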
Challenges we ran into
- Data Availability and Quality: Accessing consistent and high-quality data from diverse sources, especially hospital records, presented a significant challenge due to varying data formats, completeness, and privacy regulations. We focused on establishing secure and compliant data sharing mechanisms.
- Handling Noisy Social Media Data: Filtering out irrelevant information and noise from social media streams while still capturing valuable early signals required careful development of NLP models and keyword filtering strategies.
- Language Barriers in Social Media: Analyzing social media content in multiple African languages posed a challenge. We explored using multilingual NLP models and translation services.
- Building Trust and Adoption: Ensuring that public health officials trust and adopt the platform requires clear communication, user training, and demonstrating the accuracy and reliability of the AI-driven insights.
- Scalability and Cost Optimization: Designing a platform that can handle potentially massive data volumes and scale efficiently on Google Cloud while optimizing costs was a key consideration.
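The keyword filtering mentioned for noisy social media data could look something like the sketch below. It assumes a plain term-matching pre-filter that runs before any NLP model. The term list, the function names, and the choice to strip URLs and @mentions are illustrative assumptions, and a real deployment would need vocabularies for each target language.

```python
import re

# Hypothetical English-only term list, for illustration.
HEALTH_TERMS = {"fever", "cholera", "malaria", "diarrhea", "vomiting"}

# URLs and @mentions are treated as noise here; hashtags are kept because
# a tag such as #cholera can itself be a useful signal.
NOISE_PATTERNS = [re.compile(r"https?://\S+"), re.compile(r"@\w+")]


def tokenize(text):
    """Lowercase the text, strip noise patterns, and split into word tokens."""
    text = text.lower()
    for pattern in NOISE_PATTERNS:
        text = pattern.sub(" ", text)
    return re.findall(r"\w+", text)


def is_health_related(text, min_hits=1):
    """Return True if the post mentions at least min_hits distinct health terms."""
    tokens = set(tokenize(text))
    return len(tokens & HEALTH_TERMS) >= min_hits


posts = [
    "Many kids with FEVER and vomiting in our area http://x.co/abc",
    "Great match last night @fever_fan",
]
relevant = [post for post in posts if is_health_related(post)]
```

Exact matching like this misses misspellings, slang, and inflected forms, which is one reason to pair it with the embedding-based search rather than rely on it alone.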
Accomplishments that we're proud of
- Successful Integration of Diverse Data Sources: We built a functional prototype that successfully integrates data from simulated hospital records, social media feeds (using relevant keywords), and weather APIs into MongoDB.
- Implementation of AI-Powered Anomaly Detection: We developed and deployed machine learning models on Google Cloud AI Platform that can identify simulated disease outbreaks with a promising level of accuracy.
- Demonstration of MongoDB's Vector Search for Semantic Analysis: We successfully implemented vector search in MongoDB to analyze social media text, demonstrating its ability to identify relevant health-related discussions based on meaning, not just keywords.
- User-Friendly Dashboard Prototype: We created an intuitive web dashboard that provides real-time visualizations of key disease indicators and alerts.
- Strong Foundation for Scalability and Reliability: By leveraging Google Cloud and MongoDB Atlas, we built a platform with a solid foundation for handling large datasets and scaling to serve multiple regions.
What we learned
- The Power of Data Integration for Public Health: Combining seemingly disparate data sources can provide a much more comprehensive and timely picture of public health risks.
- The Synergistic Potential of AI and Modern Databases: AI algorithms on Google Cloud, coupled with MongoDB's flexible data handling and advanced search capabilities (including vector search), create a powerful toolkit for intelligent data analysis.
- The Importance of Context-Specific AI Development: Building effective AI models for disease surveillance requires careful consideration of the specific context, including local languages, cultural nuances in health reporting, and the unique challenges of data collection in Africa.
- The Critical Role of Collaboration: Developing a successful public health platform requires close collaboration with public health officials, data scientists, and technology experts.
- The Ethical Considerations of Data Usage: Handling sensitive health data requires strict adherence to privacy regulations and ethical guidelines.
What's next for Wavi
Our vision for Wavi extends beyond this hackathon. We plan to:
- Expand Data Source Integration: Incorporate data from mobile phone surveys, news outlets in more local languages, and environmental data.
- Enhance AI Models: Develop more sophisticated predictive models that can forecast the severity and geographic spread of outbreaks with greater accuracy.
- Implement Real-time Alerting Systems: Develop robust and timely alert mechanisms for public health officials at national and local levels.
- Strengthen Multilingual Support: Integrate more African languages into the platform's analysis and user interface.
- Pilot Testing and User Feedback: Conduct pilot testing with public health organizations in select African countries to gather real-world feedback and refine the platform based on their needs.
- Explore Integration with Existing Public Health Systems: Investigate how Wavi can be seamlessly integrated with existing national health information systems.
- Develop Community Engagement Features: Explore ethical and privacy-preserving ways to engage communities in reporting health concerns through the platform.
