The rapid growth of large-scale AI models, particularly large language models, has brought significant challenges in data privacy, computational resources, and accessibility. Traditional centralized architectures often struggle to meet data security and scalability requirements, which hinders the democratization of AI systems. Nesa introduces a model-agnostic sharding framework designed for decentralized AI inference. Our framework uses blockchain-based sequential deep neural network sharding to distribute computational tasks across a diverse network of nodes, guided by a personalized heuristic and routing mechanism. This enables efficient distributed training and inference of recent large-scale models, even on consumer-grade hardware. We employ compression techniques such as dynamic blockwise quantization and mixed matrix decomposition to reduce data-transfer and memory requirements. We also integrate robust security measures, including hardware-based trusted execution environments, to ensure data integrity and confidentiality. Evaluations across a range of natural language processing and vision tasks show that these compression strategies do not compromise model accuracy. Our results highlight the potential to democratize access to cutting-edge AI technologies by enabling secure, efficient inference on a decentralized network.
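To make the compression idea concrete, the following is a minimal sketch of the per-block absmax scheme underlying dynamic blockwise quantization: each block of values is scaled by its own maximum absolute value before rounding to int8, so an outlier in one block cannot degrade precision elsewhere. The function names and the block size are illustrative, not part of Nesa's implementation.

```python
import numpy as np

def blockwise_quantize(x, block_size=64):
    """Quantize a 1-D float array to int8 in fixed-size blocks,
    storing one absmax scale per block (illustrative sketch)."""
    pad = (-len(x)) % block_size
    blocks = np.pad(x, (0, pad)).reshape(-1, block_size)
    scales = np.abs(blocks).max(axis=1, keepdims=True)
    scales[scales == 0] = 1.0  # avoid division by zero in empty blocks
    q = np.round(blocks / scales * 127).astype(np.int8)
    return q, scales, len(x)

def blockwise_dequantize(q, scales, n):
    """Invert the quantization; the reconstruction error per value
    is bounded by roughly scale / 254 for that value's block."""
    return (q.astype(np.float32) / 127 * scales).reshape(-1)[:n]
```

Transmitting `q` (one byte per value) plus one scale per block in place of float32 activations is what cuts inter-node data transfer by close to 4x in schemes of this kind.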