Inspiration
We were inspired by NutroPNG to make an app that helps people track their nutrition, and we also wanted to add portion-size estimates.
What it does
Snap a pic of your meal, and the app breaks down its calories, fat, protein, and carbs. Plus, it keeps tabs on your goals and throws in some consequences if you don't hit your daily calorie targets.
How we built it
When we receive a food picture, we kick things off by breaking down the meal into its individual ingredients using FastSAM for masking. For each food segment, we determine its area, then use DepthAnything to estimate the depth and derive the volume of the food. Next, we caption each food item so we can look up its density (g/cm^3) and nutritional data like calories per 100g, courtesy of CalorieNinja. We then crunch the numbers to compute the nutritional information and present it to you.
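The last step of that pipeline is plain arithmetic: volume times density gives mass, and mass scales the per-100g values. Here is a minimal sketch of that step, not our actual code. The names `Per100g` and `nutrition_from_volume` are made up for illustration, and the sample numbers in the usage lines are placeholders, not values returned by CalorieNinja.

```python
from dataclasses import dataclass


@dataclass
class Per100g:
    """Nutritional values for 100 g of a food (hypothetical container)."""
    calories: float
    fat_g: float
    protein_g: float
    carbs_g: float


def nutrition_from_volume(volume_cm3: float,
                          density_g_per_cm3: float,
                          per_100g: Per100g) -> dict:
    """Convert an estimated volume into mass, then scale the per-100g values."""
    mass_g = volume_cm3 * density_g_per_cm3  # cm^3 * g/cm^3 = g
    scale = mass_g / 100.0                   # how many "100 g units" we have
    return {
        "mass_g": mass_g,
        "calories": per_100g.calories * scale,
        "fat_g": per_100g.fat_g * scale,
        "protein_g": per_100g.protein_g * scale,
        "carbs_g": per_100g.carbs_g * scale,
    }


# Placeholder numbers, for illustration only.
example_food = Per100g(calories=130.0, fat_g=0.3, protein_g=2.7, carbs_g=28.0)
example = nutrition_from_volume(volume_cm3=200.0,
                                density_g_per_cm3=0.5,
                                per_100g=example_food)
```

Any error in the volume or density estimate carries straight through to the calorie count, which is why the depth step mattered so much.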
Changes were made
That's what we initially thought would work, but GPT-4 Vision does this much better because of the amazing engineering at OpenAI. So we ended up just using GPT-4 Vision.
Challenges we ran into
We hit a major hurdle while working with the DepthAnything model, particularly in handling background values. The model struggled to produce accurate background values, which skewed our food density calculations. We fixed this by removing outlier values from the data, which gave us more precise results.
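To give an idea of what removing outliers can look like, here is a minimal sketch that drops depth samples far from the median. The rule used (median absolute deviation with a cutoff `k`) is an assumption picked for illustration, not necessarily the filter we ran, and `drop_depth_outliers` is a hypothetical name.

```python
from statistics import median


def drop_depth_outliers(depths, k=3.0):
    """Keep depth samples within k median-absolute-deviations of the median.

    Assumption: `depths` is a non-empty list of numbers in any consistent unit.
    """
    med = median(depths)
    mad = median(abs(d - med) for d in depths)
    if mad == 0:
        # More than half the samples are identical, so there is no spread
        # to measure against; return the samples unchanged.
        return list(depths)
    return [d for d in depths if abs(d - med) <= k * mad]


# Five plausible samples plus one stray value.
samples = [10.0, 10.2, 9.9, 10.1, 10.0, 55.0]
cleaned = drop_depth_outliers(samples)
```

A median-based rule is handy here because a few extreme values barely move the median, whereas they would pull a mean-based threshold toward themselves.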
Another challenge arose during the captioning of segmented food ingredients. Without the context of the entire image, models faced difficulty identifying specific ingredients solely based on the mask. To overcome this, we devised a solution by incorporating GPT vision. This approach provided the model with a holistic view of the entire image alongside the mask, facilitating the accurate identification of ingredients.
After seeing how good GPT vision was, we ended up using GPT to run all our estimations for us, which greatly improved our accuracy.
Accomplishments that we're proud of
We worked through a series of engineering challenges and still delivered a functional product within the hackathon's tight timeline. Along the way, we integrated machine learning tools that were new to us to solve specific technical problems as they came up.
We also pivoted to GPT-4 Vision, which greatly improved our accuracy.
What we learned
We gained valuable insights into computing the areas of items within a picture, relying solely on pixel data, depth information, and the field of view (FOV) of the camera.
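As a rough illustration of that idea, the sketch below uses a simple pinhole camera model: at distance Z, the image spans 2 * Z * tan(FOV / 2) in each direction, so each pixel covers a small rectangle whose size grows with depth. It assumes the depth is already in centimetres and that the food surface roughly faces the camera. The function names are hypothetical and this is not our exact implementation.

```python
import math


def pixel_footprint_cm2(depth_cm, width_px, height_px, hfov_deg, vfov_deg):
    """Real-world area covered by one pixel at the given depth (pinhole model)."""
    # Width and height of the whole view at this depth, divided by pixel counts.
    pixel_w = 2 * depth_cm * math.tan(math.radians(hfov_deg) / 2) / width_px
    pixel_h = 2 * depth_cm * math.tan(math.radians(vfov_deg) / 2) / height_px
    return pixel_w * pixel_h


def mask_area_cm2(mask_depths_cm, width_px, height_px, hfov_deg, vfov_deg):
    """Sum the per-pixel footprints over every pixel inside a food mask.

    `mask_depths_cm` holds one depth value per masked pixel.
    """
    return sum(
        pixel_footprint_cm2(d, width_px, height_px, hfov_deg, vfov_deg)
        for d in mask_depths_cm
    )


# Toy case: 100x100 image, 90 degree FOV both ways, four masked pixels at 50 cm.
# Each pixel then covers about 1 cm x 1 cm.
toy_area = mask_area_cm2([50.0] * 4, 100, 100, 90.0, 90.0)
```

Because the footprint scales with the square of the depth, a small depth error turns into a noticeably larger area error.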
What's next for CalorieJpg
The next steps for CalorieJpg involve refining and expanding the capabilities of our system. Here's what's on the horizon:
Enhanced Depth Computation: Further improve our depth computation algorithms to ensure more accurate volume calculations for food items, addressing any remaining issues with background values.
Built With
- calorieninja
- depthanything
- fastsam
- python
- svelte
- yolo