Inspiration
Our inspiration for this project came from the increasing awareness of healthy eating and the desire for instant access to nutritional information. We realized that many people struggle to understand the nutritional value of their meals, especially when eating out or trying unfamiliar foods. By leveraging computer vision and AI, we aimed to make this process as simple as pointing your phone at the food you want to know more about. Additionally, we wanted to offer a more interactive experience by adding a chat feature, allowing users to ask for personalized food suggestions, recipe ideas, or dietary advice. Our goal was to create a tool that empowers people with nutritional knowledge and makes it easy to make personalized, healthier food choices.
What it does
The webcam recognizes food in frame (using a computer vision model) and labels it. Detected foods are stored internally, so when you go to the Information section, the website shows nutritional information for those foods pulled from USDA data. The chat option also acts as a personal assistant that answers any questions you have about the food and offers suggestions, such as ingredient substitutions that reduce or increase calories to work towards specific goals.
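As a rough sketch of the nutrition lookup step, the snippet below queries the public USDA FoodData Central search endpoint for each detected food label and collects the reported nutrients. The `lookup_nutrition` helper, the placeholder API key, and the example food labels are illustrative, not taken from our actual code.

```python
import requests

FDC_SEARCH_URL = "https://api.nal.usda.gov/fdc/v1/foods/search"

def lookup_nutrition(food_label: str, api_key: str) -> dict:
    """Return nutrient name -> amount for the best USDA match of a detected food label."""
    resp = requests.get(
        FDC_SEARCH_URL,
        params={"query": food_label, "pageSize": 1, "api_key": api_key},
        timeout=10,
    )
    resp.raise_for_status()
    foods = resp.json().get("foods", [])
    if not foods:
        return {}
    # Each matched food lists its nutrients with a name, amount, and unit.
    return {
        n["nutrientName"]: f'{n["value"]} {n["unitName"]}'
        for n in foods[0].get("foodNutrients", [])
    }

# Labels produced by the CV model are stored in a simple list and looked up on demand.
detected_foods = ["sushi", "miso soup"]  # example labels
nutrition = {food: lookup_nutrition(food, api_key="YOUR_FDC_KEY") for food in detected_foods}
```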
How we built it
We built our project by first drawing out a basic wireframe of the user interface and listing clear goals for what the application should be able to do. Then, we split up the work between the computer vision model, the user interface, and the API calls. While the CV model was still a work in progress, the API calls (USDA and OpenAI) were made to work independently, with TODO segments left for future integration. Lastly, when the CV model was ready, it was integrated into the website, with passthrough to the USDA API and the OpenAI chatbot.
Challenges we ran into
Since our computer vision model needs to be trained on a dataset of food images, we needed a machine powerful enough to train it in time. At first, we tried to use our personal laptops, but we ended up using Intel Cloud with an AI accelerator to train the model on our dataset of 20k+ food images.
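The writeup doesn't name the detection framework we used, so purely as a sketch, training an off-the-shelf object detector (Ultralytics YOLO assumed here) on a food image dataset might look like the following; the dataset YAML path and hyperparameters are placeholders.

```python
from ultralytics import YOLO  # assumed framework; not specified in this writeup

# Start from a pretrained checkpoint and fine-tune on the food dataset.
model = YOLO("yolov8n.pt")

model.train(
    data="food_dataset.yaml",  # hypothetical YAML listing train/val paths and class names
    epochs=50,
    imgsz=640,
    batch=16,
    device=0,  # GPU / accelerator index; CPU training on a 20k-image set is impractically slow
)

metrics = model.val()  # evaluate on the validation split
```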
On the UI front, we had issues placing items where we wanted with Streamlit. Often, we had to install Streamlit extensions or use markdown to get objects where we wanted them. We initially tried using Figma to wireframe how we wanted our front end to look, but were unable to figure out how to connect the Figma wireframe to the actual code.
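A minimal sketch of the kind of layout workarounds this involved: Streamlit columns for coarse placement, plus raw HTML/CSS injected through markdown for finer control. The specific CSS selector and spacing values are illustrative, not from our actual front end.

```python
import streamlit as st

# Columns give coarse control over horizontal placement.
left, right = st.columns([2, 1])
with left:
    st.header("Live camera feed")
with right:
    st.header("Detected foods")

# For finer positioning, raw HTML/CSS can be injected via markdown.
st.markdown(
    """
    <style>
    .block-container { padding-top: 1rem; }  /* illustrative tweak */
    </style>
    """,
    unsafe_allow_html=True,
)
```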
Accomplishments that we're proud of
We built a food detection CV model from the ground up in under 12 hours! The model can detect a large selection of foods with decent confidence, with a particular focus on Japanese cuisine. Additionally, the information and LLM chat features work a lot better than we initially expected. In the end, we met every objective we had when we started the project.
What we learned
Our team learned a lot over the hackathon weekend. We learned how to build, train, test, and deploy a CV model from the ground up. The model took a lot longer to train than we expected, even on a dedicated machine with built-in AI acceleration. Drawing bounding boxes around the objects the CV model detected was also a major challenge, since we had to overlay them on top of the live video feed. Lastly, we had difficulty finding a sufficiently large and clean dataset of food images to train the model on.
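For the bounding-box overlay, the idea is to draw the boxes and labels directly onto each video frame before it is displayed. The sketch below uses OpenCV; the detection tuple format is an assumption, since the actual shape depends on the model's output.

```python
import cv2

def draw_detections(frame, detections):
    """Overlay boxes and labels on a video frame.

    `detections` is assumed to be a list of (label, confidence, (x1, y1, x2, y2))
    tuples in pixel coordinates; the real format depends on the CV model's output.
    """
    for label, conf, (x1, y1, x2, y2) in detections:
        # Green rectangle around the detected food.
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
        # Label and confidence just above the box.
        cv2.putText(
            frame,
            f"{label} {conf:.2f}",
            (x1, max(y1 - 8, 0)),
            cv2.FONT_HERSHEY_SIMPLEX,
            0.6,
            (0, 255, 0),
            2,
        )
    return frame
```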
Other skills we learned included calling OpenAI's GPT-4o API, building an LLM chatbot, using Streamlit to build our front end, and managing data extracted from the USDA API (pre-processing, displaying, modifying, and storing it).
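As a hedged sketch of the chatbot side, the snippet below sends the user's question plus the currently detected foods to GPT-4o through OpenAI's chat completions API. The `ask_food_assistant` helper and the system prompt are illustrative, not our exact implementation.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask_food_assistant(question: str, detected_foods: list[str]) -> str:
    """Send the user's question plus the detected foods to GPT-4o."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": "You are a nutrition assistant. The user's plate contains: "
                + ", ".join(detected_foods),
            },
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content

# Example usage with hypothetical detections:
print(ask_food_assistant("How could I cut calories in this meal?", ["ramen", "gyoza"]))
```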
What's next for plaite
We want to train the model on a larger dataset of images, with cuisines from various countries. Additionally, we want to increase the inference speed and the frame rate of the camera, which is currently limited by the computer we run it on. Lastly, we would like to make the front end a lot smoother, particularly on the Information page.
Built With
- openai
- python
- streamlit
- usda-nutrition-data