Foodborne illnesses are gastrointestinal conditions caused by consuming contaminated food. Restaurants are critical venues for investigating outbreaks because the foods they serve share sourcing, preparation, and distribution. Few illnesses are reported through formal channels, whereas social media platforms host abundant user-generated content that can provide timely public health signals. This paper analyzes signals produced by a Hierarchical Sigmoid Attention Network (HSAN) classifier applied to Yelp reviews and compares them with official restaurant inspection outcomes issued by the New York City Department of Health and Mental Hygiene (NYC DOHMH) in 2023. We evaluate correlations at the Census tract level, compare distributions of HSAN scores by prevalence of C-graded restaurants, and map spatial patterns across NYC. We find minimal correlation between HSAN signals and inspection scores at the tract level and no significant differences in HSAN score distributions by number of C-graded restaurants. We discuss the implications and outline next steps toward address-level analyses.