Eating disorders (EDs) are severe mental health conditions with high rates of mortality and morbidity that affect millions of people globally, especially adolescents. The proliferation of online communities that promote and normalize EDs has been linked to this public health crisis. However, identifying harmful communities is challenging due to their use of coded language and other obfuscations. To address this challenge, we propose a novel framework to surface the implicit attitudes of online communities by adapting large language models (LLMs) to the language of each community. We describe an alignment method and evaluate its results along multiple dimensions of semantics and affect. We then use the community-aligned LLM to respond to psychometric questionnaires designed to identify EDs in individuals. We demonstrate that LLMs can effectively adopt community-specific perspectives and reveal significant variation in eating disorder risk across online communities. These findings highlight the utility of LLMs for revealing the implicit attitudes and collective mindsets of communities, offering new tools for mitigating harmful content on social media.
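The questionnaire step above can be sketched minimally. The snippet below assumes EAT-26-style Likert scoring (Always=3, Usually=2, Often=1, all other responses=0, with a conventional risk cutoff of 20); the actual step of eliciting responses from a community-aligned LLM is stubbed out, and the example responses are purely illustrative, not data from the paper.

```python
# Hypothetical sketch of scoring psychometric questionnaire responses
# elicited from a community-aligned LLM. EAT-26-style scoring assumed:
# Always=3, Usually=2, Often=1, Sometimes/Rarely/Never=0; total >= 20
# is a conventional cutoff indicating elevated eating disorder risk.

SCORING = {"always": 3, "usually": 2, "often": 1,
           "sometimes": 0, "rarely": 0, "never": 0}

def score_responses(responses, threshold=20):
    """Map Likert answers to scores, sum them, and flag risk."""
    total = sum(SCORING.get(r.strip().lower(), 0) for r in responses)
    return total, total >= threshold

# Illustrative answers, standing in for two community-aligned models'
# responses to 26 questionnaire items (not real study data).
pro_ed_answers = ["always"] * 10 + ["usually"] * 10 + ["never"] * 6
recovery_answers = ["never"] * 20 + ["sometimes"] * 6

print(score_responses(pro_ed_answers))    # → (50, True): flagged as at-risk
print(score_responses(recovery_answers))  # → (0, False): not flagged
```

Aggregating such per-item scores into a total lets community-level responses be compared against the same clinical cutoffs used for individuals.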