Inspiration
I’ve always loved nature, but I noticed that many citizen science apps, where regular people help collect data for scientists, felt a bit like digital homework. You upload a photo, and that’s it. I wanted to build something that gives back to the user. I wanted a kid or a hobbyist to take a photo of a species and immediately get an AI-powered digital textbook, created just for them, that helps them understand more about the species.
What it does
WildLearn is an AI-powered citizen science app that turns a species sighting into an immersive digital encyclopedia. When you upload a sighting, the app plots it on a map and uses Gemini AI to instantly build a "Learn" page featuring habitat videos with natural sounds, information such as fun facts, and infographic images. WildLearn transforms every discovery in nature into a high-tech, interactive lesson.
How we built it
I built WildLearn using Flutter within Android Studio to ensure a high-performance mobile experience. The backend is hosted on Google Cloud, which manages the logic for calling Gemini APIs to generate species-specific videos, images, and text in real time. To handle data management, I integrated Firebase to securely store and sync all species sighting information, ensuring that every user upload is instantly saved and plotted on the map.
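To make the data flow above concrete, here is a minimal Python sketch of what a single sighting might look like before it is stored and plotted. It is not the actual WildLearn code: the `Sighting` class, its field names, and the helpers `new_sighting`, `to_record`, and `to_marker` are all hypothetical, and the sketch stops short of any real Firebase or map call.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class Sighting:
    """Hypothetical shape of one sighting; all field names are assumptions."""
    species: str
    latitude: float
    longitude: float
    photo_url: str
    observed_at: str  # ISO 8601 timestamp in UTC

    def __post_init__(self):
        # Reject coordinates that could not be plotted on a map.
        if not -90.0 <= self.latitude <= 90.0:
            raise ValueError(f"latitude out of range: {self.latitude}")
        if not -180.0 <= self.longitude <= 180.0:
            raise ValueError(f"longitude out of range: {self.longitude}")


def new_sighting(species, latitude, longitude, photo_url, now=None):
    """Build a sighting stamped with the given time, or the current UTC time."""
    now = now or datetime.now(timezone.utc)
    return Sighting(species.strip(), latitude, longitude, photo_url, now.isoformat())


def to_record(sighting):
    """Plain dict that a document store could save as-is."""
    return asdict(sighting)


def to_marker(sighting):
    """Minimal data a map view would need to plot the sighting."""
    return {
        "title": sighting.species,
        "position": (sighting.latitude, sighting.longitude),
    }
```

Validating coordinates when the object is created means a bad upload fails before it reaches storage or the map, rather than showing up later as a misplaced marker.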
Challenges we ran into
The main challenge was designing the UI for the Learn page so that video, text, and images are arranged cleanly on the phone screen without looking cluttered.
Accomplishments that we're proud of
I'm incredibly proud of how WildLearn takes a simple user photo and instantly generates a high-quality educational suite. Most apps just store data, but WildLearn creates a personalized documentary experience that includes video, natural soundscapes, and AI-generated infographics.
What we learned
I gained significant technical expertise in cloud architecture by learning how to host and manage custom backends on Google Cloud to bridge my mobile client with advanced AI services. I also mastered Gemini’s interleaved and mixed output capabilities, which allowed me to orchestrate a single, cohesive response containing a blend of text, images, and video data.
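As an illustration of working with a mixed response, the Python sketch below groups an ordered list of text, image, and video parts into sections for a Learn page. It does not use the real Gemini response schema: the `Part` and `Section` types and the `build_learn_page` function are hypothetical stand-ins, and the grouping rule (each media item attaches to the text that precedes it) is an assumption about how such a page could be assembled.

```python
from dataclasses import dataclass, field


@dataclass
class Part:
    """Hypothetical response part; not the actual Gemini schema."""
    kind: str     # "text", "image", or "video"
    payload: str  # text content, or a URL/reference for media


@dataclass
class Section:
    """One block of the Learn page: a text passage plus its media."""
    text: str
    media: list = field(default_factory=list)


def build_learn_page(parts):
    """Group interleaved parts into sections, preserving their order."""
    sections = []
    for part in parts:
        if part.kind == "text":
            # Each text part starts a new section.
            sections.append(Section(text=part.payload))
        elif part.kind in ("image", "video"):
            # Media that arrives before any text gets an untitled section.
            if not sections:
                sections.append(Section(text=""))
            sections[-1].media.append((part.kind, part.payload))
        else:
            raise ValueError(f"unknown part kind: {part.kind}")
    return sections
```

Keeping the parts in arrival order preserves whatever pairing of text and media the model intended, which also gives the UI a simple unit (one section) to lay out on screen.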
