- Web page listing upcoming and past free food events.
- (GIF) DAIN using our functions as tools to inform users about upcoming free food events!
- (Static Image Version) DAIN using our functions as tools to inform users about upcoming free food events!
- Web scraper screenshotting the Instagram feed for Gemini to process.
- Exploration of the Network tab in DevTools to identify Instagram internal APIs.
FREE FOOD!!!!
To tackle food insecurity on campus, our project scrapes dozens of UCSD Instagram accounts to create a single feed of events with free food. This is harder than it looks: there is no usable Instagram API, and Instagram is actively designed to resist automated web scraping. We found an approach that makes our web scraper behave more like a regular Instagram user, which keeps it within rate limits.
Inspiration
UC San Diego (UCSD) has a lot of free food events, yet some fail to attract enough attendees, leading to food waste. Meanwhile, almost half of UCSD students (53% among underrepresented minorities) have experienced food insecurity. Our dining halls are not all-you-can-eat, so when a student relying on financial aid runs out of dining dollars, they no longer have guaranteed meals. Finally, UCSD also has a reputation for being "socially dead," especially compared to its sister school SDSU.
We've created a way for students to quickly learn about free food events, tackling food insecurity, food waste, and the lack of social life on campus in a single sweep.
What it does
Our project consists of an event scraper and multiple ways of disseminating these events. Our Instagram web scraper automatically logs into Instagram with an account set up to follow dozens of UCSD student organizations, allowing its home page to be a feed of all events on campus. It then uses Gemini 2.0 Flash, which has proven to be good enough for this job, to extract event information, such as what food is available and when and where the event is held. These events are then stored in a MongoDB database.
To make navigating these events easier for students, we made two tools:
- We created a website that lists upcoming free food events.
- We also implemented a DAIN service that informs users about upcoming and past free food events. DAIN's reasoning capabilities allow the user to interface with the events in a more personalized manner, such as filtering by events with boba or vegetarian options.
How we built it
Our project uses Playwright to spawn a headless instance of Firefox. It visits Instagram's website with cookies pre-loaded so that it starts off logged into our project's Instagram account. It simulates user behavior, scrolling down to view posts and clicking through stories. While doing this, it also inspects the website's network activity. Instagram makes GraphQL calls for getting post and story data, so we eavesdrop on these requests and make copies of the data, which is in human-readable JSON. For debugging purposes, our web scraper can also take screenshots of what it sees.
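The eavesdropping step described above can be sketched as a response listener. This is a minimal illustration, not our exact code: `ResponseLike` mirrors only the `url()` and `json()` methods of a Playwright `Response`, the class and function names are hypothetical, and matching on `/graphql` in the URL path is an assumption about how the calls of interest can be recognized.

```typescript
// Minimal shape of the parts of a Playwright Response used in this sketch.
interface ResponseLike {
  url(): string;
  json(): Promise<unknown>;
}

// Assumption: the calls of interest have "/graphql" in the URL path.
// The query string is dropped first so it cannot cause a false match.
function isGraphQLUrl(url: string): boolean {
  const withoutQuery = url.split("?")[0];
  return withoutQuery.indexOf("/graphql") !== -1;
}

// Hypothetical collector: keeps a copy of every GraphQL JSON payload it sees.
class GraphQLCollector {
  readonly payloads: unknown[] = [];

  // Arrow property so the method can be passed directly as a listener.
  handle = async (response: ResponseLike): Promise<void> => {
    if (!isGraphQLUrl(response.url())) return;
    try {
      this.payloads.push(await response.json());
    } catch {
      // The body was not JSON or was unavailable; skip it.
    }
  };
}

// With Playwright, the collector would be attached roughly like this:
//   const collector = new GraphQLCollector();
//   page.on("response", collector.handle);
```

Because the listener only copies responses the page already requested, it adds no extra traffic beyond what a normal browsing session would produce.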
Then, we use Gemini 2.0 Flash to process post and story images and captions, and extract key details like free food, location, and time into a JSON object. We selected this model because it is fast and already good enough at extracting event information from images. Below is the JSON schema included as part of the prompt to Gemini. We then store this data inside a MongoDB database for future use.
{
"freeFood": string[], // List only free consumable items, using the original phrasing from the post (e.g. "Dirty Birds", "Tapex", "boba", "refreshments", "snacks", "food"). Empty if no free consumables.
"location": string,
"date": { "year": number; "month": number; "date": number }, // Month is between 1 and 12
"start": { "hour": number; "minute": number }, // 24-hour format
"end": { "hour": number; "minute": number } // 24-hour format, optional and omitted if no end time specified
}
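Since the model's reply is free text, it can help to validate each object against this schema before storing it. The sketch below is illustrative only: the names `FreeFoodEvent`, `isFreeFoodEvent`, and `parseEvents` are hypothetical, and the numeric range checks are assumptions based on the schema comments.

```typescript
interface FreeFoodEvent {
  freeFood: string[];
  location: string;
  date: { year: number; month: number; date: number };
  start: { hour: number; minute: number };
  end?: { hour: number; minute: number };
}

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// True when v is an integer within [min, max].
function isInt(v: unknown, min: number, max: number): v is number {
  return typeof v === "number" && Number.isInteger(v) && v >= min && v <= max;
}

// 24-hour time, as the schema comments require.
function isTime(v: unknown): v is { hour: number; minute: number } {
  return isRecord(v) && isInt(v.hour, 0, 23) && isInt(v.minute, 0, 59);
}

function isFreeFoodEvent(v: unknown): v is FreeFoodEvent {
  if (!isRecord(v)) return false;
  const { freeFood, location, date, start, end } = v;
  return (
    Array.isArray(freeFood) &&
    freeFood.every((item) => typeof item === "string") &&
    typeof location === "string" &&
    isRecord(date) &&
    isInt(date.year, 1970, 9999) && // assumed sanity range for the year
    isInt(date.month, 1, 12) &&
    isInt(date.date, 1, 31) &&
    isTime(start) &&
    (end === undefined || isTime(end)) // "end" is optional
  );
}

// Parses a reply and keeps only the entries that match the schema.
function parseEvents(modelOutput: string): FreeFoodEvent[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(modelOutput);
  } catch {
    return []; // the reply was not valid JSON
  }
  return Array.isArray(parsed) ? parsed.filter(isFreeFoodEvent) : [];
}
```

Dropping invalid entries instead of throwing means one badly formatted event does not discard the rest of a post.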
DAIN Service
We created a service for DAIN with tools for getting all free-food events and listing events on a specific day. We return the event data as both LLM-readable JSON objects and a human-readable table, which allows the user to see the events that DAIN sees, as well as allowing DAIN to perform further data processing on the events.
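The two return shapes can be sketched without any DAIN-specific code. In this minimal example the function names are hypothetical and the Markdown table layout is an assumption about what reads well for a human; `eventsOnDay` stands in for the "events on a specific day" tool and `toTable` for the human-readable view of the same data.

```typescript
interface FreeFoodEvent {
  freeFood: string[];
  location: string;
  date: { year: number; month: number; date: number };
  start: { hour: number; minute: number };
  end?: { hour: number; minute: number };
}

// Keeps only the events that fall on the given calendar day.
function eventsOnDay(
  events: FreeFoodEvent[],
  year: number,
  month: number,
  date: number
): FreeFoodEvent[] {
  return events.filter(
    (e) => e.date.year === year && e.date.month === month && e.date.date === date
  );
}

// Zero-pads to two digits, e.g. 7 becomes "07".
function pad2(n: number): string {
  return n < 10 ? "0" + n : String(n);
}

function formatTime(t: { hour: number; minute: number }): string {
  return `${pad2(t.hour)}:${pad2(t.minute)}`;
}

// Renders the events as a Markdown table, one row per event.
function toTable(events: FreeFoodEvent[]): string {
  const header =
    "| Date | Time | Location | Free food |\n| --- | --- | --- | --- |";
  const rows = events.map((e) => {
    const day = `${e.date.year}-${pad2(e.date.month)}-${pad2(e.date.date)}`;
    const time = e.end
      ? `${formatTime(e.start)}-${formatTime(e.end)}`
      : formatTime(e.start);
    return `| ${day} | ${time} | ${e.location} | ${e.freeFood.join(", ")} |`;
  });
  return [header, ...rows].join("\n");
}
```

A tool handler could then return both `eventsOnDay(...)` as JSON for the model and `toTable(...)` of the same list for the user, so both see identical data.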
Challenges we ran into
Initially, we explored tools like Instaloader as a way to gather data from Instagram. However, these tools quickly ran into rate limits, which made them infeasible given the number of student orgs at UCSD.
Accomplishments that we're proud of
We identified relevant Instagram internal APIs that gave us data from Instagram stories and a user profile's timeline.
What we learned
Gemini is quite useful for generating structured data from non-standardized images like those on Instagram. By giving Gemini an example of what we want the structured output to look like, we get much more consistent results. We settled on the following prompt:
Using the following flyers and caption, output only a JSON array of event objects without any explanation or formatting, whose contents each conform to the following schema.
{
"freeFood": string[], // List only free consumable items, using the original phrasing from the post (e.g. "Dirty Birds", "Tapex", "boba", "refreshments", "snacks", "food"). Empty if no free consumables.
"location": string,
"date": { "year": number; "month": number; "date": number }, // Month is between 1 and 12
"start": { "hour": number; "minute": number }, // 24-hour format
"end": { "hour": number; "minute": number } // 24-hour format, optional and omitted if no end time specified
}
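As a rough sketch of how this prompt might be assembled and its reply read back, the helpers below use only the instruction and schema text shown above. The function names and the way the caption is appended are assumptions, and the tolerance for a Markdown code fence around the reply is a defensive guess rather than something we observed Gemini require.

```typescript
const INSTRUCTION =
  "Using the following flyers and caption, output only a JSON array of event objects without any explanation or formatting, whose contents each conform to the following schema.";

const EVENT_SCHEMA = `{
"freeFood": string[], // List only free consumable items, using the original phrasing from the post (e.g. "Dirty Birds", "Tapex", "boba", "refreshments", "snacks", "food"). Empty if no free consumables.
"location": string,
"date": { "year": number; "month": number; "date": number }, // Month is between 1 and 12
"start": { "hour": number; "minute": number }, // 24-hour format
"end": { "hour": number; "minute": number } // 24-hour format, optional and omitted if no end time specified
}`;

// Hypothetical helper: joins the instruction, the schema, and the post caption.
// The flyer images would be passed to the model separately.
function buildPrompt(caption: string): string {
  return [INSTRUCTION, EVENT_SCHEMA, `Caption: ${caption}`].join("\n\n");
}

// Reads a reply that should be a bare JSON array, tolerating an optional
// Markdown code fence (three backticks) around it.
function parseJsonArrayReply(reply: string): unknown[] {
  const stripped = reply
    .trim()
    .replace(/^`{3}(?:json)?\s*/i, "")
    .replace(/\s*`{3}$/, "");
  try {
    const parsed: unknown = JSON.parse(stripped);
    return Array.isArray(parsed) ? parsed : [];
  } catch {
    return []; // the reply was not valid JSON
  }
}
```

Returning an empty array on any parse failure lets the scraper skip a post instead of crashing on an unexpected reply.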
DAIN is also very impressive and surprisingly easy to set up.
What's next for FREE FOOD!!!!
We would love to expand to serving other campuses! Additionally, we could consider combining our scraped events with attendance data to identify under-attended events, and then highlight those events more during user queries to our DAIN agent.
Built With
- dain
- eslint
- express.js
- gemini
- mongodb
- node.js
- playwright
- python
- react
- selenium
- typescript
- vite