Image-based dietary assessment is an important strategy for monitoring an individual's health accurately and conveniently, making it a vital mechanism in the prevention and care of chronic diseases and obesity. However, it faces a fundamental difficulty: estimating the three-dimensional size of food from 2D image inputs. Many strategies have been devised to overcome this critical limitation, such as the use of auxiliary inputs like depth maps, multi-view inputs, or model-based approaches such as template matching. Deep learning also helps bridge the gap, using either monocular images alone or combinations of the image and auxiliary inputs to precisely predict the food portion from the image input. In this paper, we explore the different strategies employed for accurate portion estimation.