Large Language Models (LLMs) bring transformative benefits alongside unique challenges, including intellectual property (IP) and ethical concerns. This position paper explores a novel angle to mitigate these risks, drawing parallels between LLMs and established web systems. We identify "citation", the acknowledgement of or reference to a source or evidence, as a crucial yet missing component in LLMs. Incorporating citation could enhance content transparency and verifiability, thereby helping to address the IP and ethical issues in the deployment of LLMs. We further propose that a comprehensive citation mechanism for LLMs should account for both non-parametric and parametric content. Despite the complexity of implementing such a citation mechanism and its potential pitfalls, we advocate for its development. Building on this foundation, we outline several research problems in this area, aiming to guide future work towards building more responsible and accountable LLMs.