With the advent of accessible interfaces for interacting with large language models (LLMs), there has been an associated explosion of both commercial and academic interest in them. Consequently, there has also been a sudden burst of novel attacks on LLMs, jeopardizing user data on a massive scale. Situated at a comparable crossroads in its development, and growing as rapidly as LLMs, blockchain has emerged in recent years as a disruptive technology with the potential to redefine how we approach data handling. In particular, owing to its strong guarantees of data immutability and irrefutability, as well as its inherent data provenance assurances, blockchain has attracted significant attention as a means to better defend against the array of attacks affecting LLMs and to further improve the quality of their responses. In this survey, we holistically evaluate current research on how blockchains are being used to help protect against LLM vulnerabilities, and we analyze how they may be used in novel applications. To these ends, we introduce a taxonomy of blockchain for large language models (BC4LLM) and develop definitions that precisely capture the nature of the different bodies of research in this area. Moreover, throughout the paper, we present frameworks to contextualize broader research efforts, and, to motivate the field further, we identify future research goals as well as open challenges in the BC4LLM space.