
NutriBench

Welcome to the official repository for NutriBench.

A Dataset for Evaluating Large Language Models on Nutrition Estimation from Meal Descriptions

🌐 Project Page | 📝 Paper (ICLR 2025) | 📊 Dataset | 🔗 GitHub


News

  • [2025/04/08] NutriBench v2 is released! Now supports 24 countries with improved diversity in meal descriptions.

  • [2025/03/16] We’ve launched LLM-Based Carb Estimation via Text Message!

    • For US phone numbers, text your meal description to +1 (866) 698-9328.
    • For WhatsApp, send a message to +1 (555) 730-0221.
  • [2025/02/11] 🎉 Our NutriBench paper has been accepted at ICLR 2025!

  • [2024/10/16] Released NutriBench v1, the first benchmark for evaluating nutrition estimation from meal descriptions.


Dataset

Please refer to our 📊 Dataset.
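
If the dataset is hosted on Hugging Face (see the 📊 Dataset link above), it can typically be loaded with the datasets library. A minimal sketch; the dataset id below is an assumption, so check the dataset card for the exact path and field names:

from datasets import load_dataset

# NOTE: the dataset id is an assumption; see the 📊 Dataset link
# for the authoritative Hugging Face path and column names.
ds = load_dataset("dongx1997/NutriBench")

# Inspect the available splits and one example record.
print(ds)
first_split = next(iter(ds.values()))
print(first_split[0])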


Inference

python inference.py

This script takes meal descriptions as input and returns estimated carbohydrate values using a pretrained LLM.
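
The repository's inference.py is the authoritative implementation; the sketch below only illustrates the idea using a Hugging Face text-generation pipeline. The model name and prompt wording are assumptions, not necessarily what the script uses:

from transformers import pipeline

# NOTE: model choice and prompt wording are illustrative assumptions,
# not what inference.py necessarily does.
generator = pipeline("text-generation", model="meta-llama/Meta-Llama-3.1-8B-Instruct")

meal = "Two scrambled eggs with a slice of whole-wheat toast and a glass of orange juice."
prompt = f"Estimate the total carbohydrates in grams for this meal: {meal}\nAnswer:"

output = generator(prompt, max_new_tokens=64, do_sample=False)
print(output[0]["generated_text"])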


Benchmark

We currently use lm-evaluation-harness to benchmark models on NutriBench.

🛠️ We are working on merging our NutriBench task into the main repo via a pull request.

To benchmark your model on NutriBench before the merge, follow these steps:

  1. Clone the lm-evaluation-harness repository:
git clone https://github.com/EleutherAI/lm-evaluation-harness.git
cd lm-evaluation-harness
pip install -e .
  2. Copy the NutriBench task folder into lm_eval/tasks:
cp -r [path_to_nutribench_repo]/nutribench ./lm_eval/tasks/
  3. Run the benchmark command (example for vLLM):
lm_eval \
  --model vllm \
  --model_args pretrained=[model_path],tensor_parallel_size=1,dtype=auto,gpu_memory_utilization=0.8,data_parallel_size=1 \
  --batch_size auto \
  --tasks nutribench_v2_cot \
  --output_path results \
  --seed 42 \
  --log_samples \
  --apply_chat_template

You can change nutribench_v2_cot to other tasks (e.g., nutribench_v2_base) depending on your use case. Please refer to lm-evaluation-harness for detailed documentation.
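
Recent releases of lm-evaluation-harness also expose a Python entry point, which can be more convenient for scripted sweeps. A minimal sketch, assuming lm-eval >= 0.4 (keep the [model_path] placeholder as in the CLI example; exact keyword arguments may vary across versions):

import lm_eval

# Programmatic equivalent of the CLI call above.
results = lm_eval.simple_evaluate(
    model="vllm",
    model_args="pretrained=[model_path],tensor_parallel_size=1,dtype=auto",
    tasks=["nutribench_v2_cot"],
    batch_size="auto",
    apply_chat_template=True,
)
print(results["results"])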

Reference Results

Task                  Acc@7.5   MAE (g)
nutribench_v2_base    0.3301    36.18
nutribench_v2_cot     0.3527    37.17

These reference results were obtained using Meta-Llama-3.1-8B-Instruct.
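
Here Acc@7.5 is read as the fraction of predictions within ±7.5 g of the ground-truth carbohydrate value, and MAE as the mean absolute error in grams. A self-contained sketch of both metrics under that reading:

# Sketch of the two metrics, assuming Acc@7.5 = fraction of
# predictions within +/- 7.5 g of the true carbohydrate value.
def evaluate(preds, targets, threshold=7.5):
    errors = [abs(p - t) for p, t in zip(preds, targets)]
    acc = sum(e <= threshold for e in errors) / len(errors)
    mae = sum(errors) / len(errors)
    return acc, mae

acc, mae = evaluate([30.0, 52.5, 10.0], [35.0, 45.0, 12.0])
print(f"Acc@7.5: {acc:.4f}, MAE: {mae:.2f} g")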


Citation

If you find NutriBench helpful, please consider citing:

@article{hua2024nutribench,
  title={NutriBench: A Dataset for Evaluating Large Language Models on Nutrition Estimation from Meal Descriptions},
  author={Hua, Andong and Dhaliwal, Mehak Preet and Burke, Ryan and Pullela, Laya and Qin, Yao},
  journal={arXiv preprint arXiv:2407.12843},
  year={2024}
}
