Large Language Models (LLMs), with their flexible generation abilities, can be powerful data sources in domains with few or no available corpora. However, problems like hallucinations and biases limit such applications. In this case study, we pick nutrition counselling, a domain lacking any public resource, and show that high-quality datasets can be gathered by combining LLMs, crowd workers and nutrition experts. We first crowd-source and cluster a novel dataset of diet-related issues, then work with experts to prompt ChatGPT into producing related supportive text. Finally, we let the experts evaluate the safety of the generated text. We release HAI-coaching, the first expert-annotated nutrition counselling dataset, containing ~2.4K dietary struggles from crowd workers and ~97K related supportive texts generated by ChatGPT. Extensive analysis shows that ChatGPT, while producing highly fluent and human-like text, also manifests harmful behaviours, especially in sensitive topics like mental health, making it unsuitable for unsupervised use.