In this paper, we develop an LLM-powered framework for the curation and evaluation of emerging opinion mining in online health communities. We formulate emerging opinion mining as a pairwise stance detection problem over (title, comment) pairs sourced from Reddit, where post titles contain emerging health-related claims on a topic that is not predefined. The claims may be expressed by the user either explicitly or implicitly. We detail (i) a method for claim identification (the task of identifying whether a post title contains a claim) and (ii) an opinion mining-driven evaluation framework for stance detection using LLMs. We facilitate our exploration by releasing a novel test dataset, Long COVID-Stance, or LC-Stance, which can be used to evaluate LLMs on the tasks of claim identification and stance detection in online health communities. Long COVID is an emerging post-COVID disorder with uncertain and complex treatment guidelines, which makes it a suitable use case for our task. LC-Stance contains long COVID treatment-related discourse sourced from a Reddit community. Our evaluation shows that GPT-4 significantly outperforms prior work on zero-shot stance detection. We then perform thorough LLM diagnostics, identifying claim type (i.e., implicit vs. explicit claims) and comment length as sources of model error.