Large Language Models (LLMs) bring transformative benefits alongside unique challenges, including intellectual property (IP) and ethical concerns. This position paper explores a novel angle for mitigating these risks by drawing parallels between LLMs and established web systems. We identify "citation" (the acknowledgement of, or reference to, a source or piece of evidence) as a crucial yet missing component in LLMs. Incorporating citation could enhance content transparency and verifiability, thereby helping to address the IP and ethical issues that arise in the deployment of LLMs. We further propose that a comprehensive citation mechanism for LLMs should account for both non-parametric and parametric content. Despite the complexity of implementing such a mechanism and its potential pitfalls, we advocate for its development. Building on this foundation, we outline several research problems in this area, aiming to guide future work towards more responsible and accountable LLMs.