SocialGenPod: Privacy-Friendly Generative AI Social Web Applications with Decentralised Personal Data Stores

We present SocialGenPod, a decentralised and privacy-friendly way of deploying generative AI Web applications. Unlike centralised Web and data architectures that keep user data tied to application and service providers, we show how one can use Solid -- a decentralised Web specification -- to decouple user data from generative AI applications. We demonstrate SocialGenPod using a prototype that allows users to converse with different Large Language Models, optionally leveraging Retrieval Augmented Generation to generate answers grounded in private documents stored in any Solid Pod that the user is allowed to access, directly or indirectly. SocialGenPod uses Solid access control mechanisms to give users full control over who has access to the data stored in their Pods. SocialGenPod keeps all user data (chat history, app configuration, personal documents, etc.) securely in the user's personal Pod, separate from specific model or application providers. Besides better privacy controls, this approach also enables portability across different services and applications. Finally, we discuss challenges posed by the large compute requirements of state-of-the-art models that future research in this area should address. Our prototype is open-source and available at: https://github.com/Vidminas/socialgenpod/.
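The idea of grounding answers only in documents the user is allowed to read can be illustrated with a minimal Python sketch. It is not the SocialGenPod implementation and does not use any Solid client library: `ToyPod`, `Document`, `retrieve`, `build_prompt`, the example WebIDs, and the document URLs are all hypothetical names chosen for illustration. The Pod is an in-memory stand-in, access control is reduced to a set of reader WebIDs per document, and retrieval is plain bag-of-words cosine similarity rather than an embedding model.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Document:
    url: str
    text: str
    readers: set = field(default_factory=set)  # WebIDs allowed to read (toy ACL)


class ToyPod:
    """In-memory stand-in for a Pod (hypothetical; not a real Solid client)."""

    def __init__(self, owner):
        self.owner = owner
        self.documents = []

    def add(self, url, text, readers=()):
        # The owner can always read; other readers are granted explicitly.
        self.documents.append(Document(url, text, {self.owner, *readers}))

    def readable_by(self, webid):
        return [d for d in self.documents if webid in d.readers]


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, pods, webid, k=2):
    """Return up to k documents, drawn only from what `webid` may read."""
    q = Counter(tokenize(query))
    scored = []
    for pod in pods:
        for doc in pod.readable_by(webid):  # access check happens before ranking
            score = cosine(q, Counter(tokenize(doc.text)))
            if score > 0:
                scored.append((score, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]


def build_prompt(query, docs):
    """Assemble a grounded prompt; sending it to a model is left out."""
    context = "\n".join(f"[{d.url}] {d.text}" for d in docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical example data.
ALICE = "https://alice.example/profile/card#me"
BOB = "https://bob.example/profile/card#me"

alice_pod = ToyPod(ALICE)
alice_pod.add("https://alice.example/notes/trip.txt",
              "The trip to Edinburgh is booked for March.", readers=[BOB])
alice_pod.add("https://alice.example/notes/diary.txt",
              "Private diary: the trip budget is a secret.")
bob_pod = ToyPod(BOB)
bob_pod.add("https://bob.example/notes/todo.txt",
            "Buy train tickets for the trip.")

bob_docs = retrieve("When is the trip?", [alice_pod, bob_pod], BOB, k=3)
bob_prompt = build_prompt("When is the trip?", bob_docs)
print(bob_prompt)
```

In this sketch the access check runs before any ranking, so a document Bob cannot read never reaches the prompt, while documents shared with him from another Pod can. A real deployment would presumably rely on the Pod server to enforce access and on an embedding-based retriever; both are outside the scope of this illustration.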

