Chat-based large language models could give individuals who lack access to high-quality healthcare personalized information across a variety of topics. However, users may ask underspecified questions that a model can answer correctly only with additional context. We study how large language model biases surface in such contextual questions in the healthcare domain. To this end, we curate a dataset of sexual and reproductive healthcare questions whose answers depend on the user's age, sex, and location. We compare models' outputs with and without demographic context to measure group alignment across our contextual questions. Our experiments reveal biases in each of these attributes, with young adult female users favored.