Large language models have shown unprecedented abilities in generating linguistically coherent and syntactically correct natural language output. However, they often return incorrect and inconsistent answers to input questions. Because their internally learned representations are complex and hard to interpret, it is challenging to modify language models so that they provide correct and consistent results. The data management community has developed various methods and tools for providing consistent answers over inconsistent datasets. In these methods, users specify the desired properties of data in a domain in the form of high-level declarative constraints. This approach has yielded usable and scalable methods for delivering consistent information from inconsistent datasets. We aim to build upon this success and leverage these methods to modify language models so that they deliver consistent and accurate results. We investigate the challenges of using these ideas to obtain consistent and relevant answers from language models and report some preliminary empirical studies.
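To make the data management idea concrete, the following is a minimal sketch of consistent answering under one declarative constraint. It assumes a single key constraint (each entity has exactly one value) and returns only the answers that hold no matter how the conflicts are resolved. The names `consistent_answers` and `facts`, and the sample data, are hypothetical and chosen for illustration; this is not the method studied in this work.

```python
from collections import defaultdict


def consistent_answers(facts):
    """Return {key: value} for keys whose facts all agree on one value.

    Assumed constraint: each key maps to exactly one value. A key with
    conflicting values has no answer that holds in every possible repair
    of the data, so it is left out of the result.
    """
    values_by_key = defaultdict(set)
    for key, value in facts:
        values_by_key[key].add(value)
    return {
        key: next(iter(values))
        for key, values in values_by_key.items()
        if len(values) == 1
    }


# Hypothetical inconsistent dataset: "France" violates the key constraint.
facts = [
    ("France", "Paris"),
    ("France", "Lyon"),
    ("Japan", "Tokyo"),
    ("Japan", "Tokyo"),
]
answers = consistent_answers(facts)
print(answers)
```

Here only the entry for "Japan" is returned, because the two conflicting facts about "France" cannot both satisfy the constraint and neither is favored over the other.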