NutriBench: A Dataset for Evaluating Large Language Models in Carbohydrate Estimation from Meal Descriptions

Accurate nutrition estimation helps people make informed dietary choices and is essential in the prevention of serious health complications. We present NutriBench, the first publicly available natural language meal description nutrition benchmark. NutriBench consists of 11,857 meal descriptions generated from real-world global dietary intake data. The data is human-verified and annotated with macro-nutrient labels, including carbohydrates, proteins, fats, and calories. We conduct an extensive evaluation of NutriBench on the task of carbohydrate estimation, testing twelve leading Large Language Models (LLMs), including GPT-4o, Llama3.1, Qwen2, Gemma2, and OpenBioLLM models, using standard, Chain-of-Thought and Retrieval-Augmented Generation strategies. Additionally, we present a study involving professional nutritionists, finding that LLMs can provide more accurate and faster estimates. Finally, we perform a real-world risk assessment by simulating the effect of carbohydrate predictions on the blood glucose levels of individuals with diabetes. Our work highlights the opportunities and challenges of using LLMs for nutrition estimation, demonstrating their potential to aid professionals and laypersons and improve health outcomes. Our benchmark is publicly available at: https://mehak126.github.io/nutribench.html

翻译：准确的营养估算有助于人们做出明智的饮食选择，对于预防严重健康并发症至关重要。我们提出了NutriBench，这是首个公开可用的自然语言膳食描述营养基准。NutriBench包含11,857条基于真实世界全球膳食摄入数据生成的膳食描述。该数据经过人工验证，并标注了宏量营养素标签，包括碳水化合物、蛋白质、脂肪和热量。我们在碳水化合物估算任务上对NutriBench进行了广泛评估，测试了十二个领先的大语言模型（LLMs），包括GPT-4o、Llama3.1、Qwen2、Gemma2以及OpenBioLLM系列模型，并采用了标准、思维链和检索增强生成策略。此外，我们开展了一项涉及专业营养师的研究，发现LLMs能够提供更准确、更快速的估算结果。最后，我们通过模拟碳水化合物预测对糖尿病患者血糖水平的影响，进行了实际风险评估。我们的工作凸显了使用LLMs进行营养估算的机遇与挑战，证明了其在辅助专业人士与普通用户、改善健康结果方面的潜力。我们的基准已公开提供：https://mehak126.github.io/nutribench.html

相关内容

MoDELS

关注 45

ACM/IEEE第23届模型驱动工程语言和系统国际会议，是模型驱动软件和系统工程的首要会议系列，由ACM-SIGSOFT和IEEE-TCSE支持组织。自1998年以来，模型涵盖了建模的各个方面，从语言和方法到工具和应用程序。模特的参加者来自不同的背景，包括研究人员、学者、工程师和工业专业人士。MODELS 2019是一个论坛，参与者可以围绕建模和模型驱动的软件和系统交流前沿研究成果和创新实践经验。今年的版本将为建模社区提供进一步推进建模基础的机会，并在网络物理系统、嵌入式系统、社会技术系统、云计算、大数据、机器学习、安全、开源等新兴领域提出建模的创新应用以及可持续性。官网链接：http://www.modelsconference.org/

O’Reilly报告：知识图谱崛起——面向现代数据集成和数据结构体系，“The Rise of the Knowledge Graph——Toward Modern Data Integration and the Data Fabric Architecture”

专知会员服务

49+阅读 · 2022年2月18日

【CHI2020-微软】解释可解释性:理解数据科学家使用机器学习的可解释性工具，Interpreting Interpretability: Understanding Data Scientists’Use of Interpretability Tools for Machine Learning

专知会员服务

55+阅读 · 2020年3月8日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

34+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日