Generative Artificial Intelligence Consensus in a Trustless Network

We performed a billion locality-sensitive hash comparisons between artificially generated data samples to answer a critical question: can we verify the "correctness" of generative AI output in a non-deterministic, trustless, decentralized network? We generate millions of data samples from a variety of open-source diffusion and large language models, and describe the procedures and trade-offs involved in generating more versus less deterministic output in a heterogeneous, stochastic network. Further, we analyze the outputs to provide empirical evidence for different parameterizations of tolerance and error bounds for verification. Finally, given that we have generated an enormous amount of simulated data, we also release a new training dataset called ImageNet-Gen for use in augmenting existing training pipelines. For our results, we show that with a majority vote between three independent verifiers, we can detect perceptual collisions between generated images with over 99.89% probability and less than a 0.0267% chance of intra-class collision. For large language models (LLMs), we reach 100% consensus using greedy methods or n-way beam searches, demonstrated on different LLMs. In the context of generative AI training, we pinpoint and minimize the major sources of stochasticity and present gossip and synchronization training techniques for verifiability. Thus, this work provides a practical, solid foundation for AI verification and consensus for the minimization of trust in a decentralized network.
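The verification idea in the abstract (compare perceptual hashes within a tolerance, then take a majority vote among three verifiers) can be illustrated with a minimal sketch. This is not the authors' implementation: the average-hash function below stands in for whatever locality-sensitive hash the paper uses, and the names `average_hash`, `verifier_accepts`, `majority_accepts`, and the tolerance value are all assumptions made for illustration.

```python
from typing import Sequence


def average_hash(pixels: Sequence[Sequence[int]]) -> int:
    """Pack a grayscale grid into an integer, one bit per pixel.

    A bit is 1 when the pixel is brighter than the grid mean. This is a
    simple stand-in for a perceptual / locality-sensitive hash.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def verifier_accepts(claimed: int, recomputed: int, tolerance: int) -> bool:
    """One verifier accepts if its own hash is within `tolerance` bits."""
    return hamming(claimed, recomputed) <= tolerance


def majority_accepts(claimed: int, recomputed: Sequence[int], tolerance: int) -> bool:
    """Accept the claimed output if a strict majority of verifiers accept."""
    votes = sum(verifier_accepts(claimed, h, tolerance) for h in recomputed)
    return votes * 2 > len(recomputed)


# Hypothetical 2x2 "images": two verifiers reproduce nearly the same output,
# a third produces something perceptually different.
claimed_image = [[10, 200], [220, 15]]
verifier_images = [
    [[12, 198], [221, 14]],   # small numeric drift, same structure
    [[9, 205], [215, 20]],    # small numeric drift, same structure
    [[200, 10], [15, 220]],   # different image
]

claimed_hash = average_hash(claimed_image)
verifier_hashes = [average_hash(img) for img in verifier_images]
TOLERANCE_BITS = 1  # assumed value, chosen only for this toy example

accepted = majority_accepts(claimed_hash, verifier_hashes, TOLERANCE_BITS)
forged = majority_accepts(average_hash(verifier_images[2]), verifier_hashes, TOLERANCE_BITS)
```

In this toy setup the honest claim is accepted by two of the three verifiers, while a claim matching only the outlier is rejected. The tolerance controls the trade-off the abstract refers to: a looser bound absorbs more hardware-induced drift but raises the chance that distinct outputs collide.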


