The remarkable success of foundation models has been driven by scaling laws, which show that model performance improves predictably as training data and model size increase. However, this scaling trajectory faces two critical challenges: the depletion of high-quality public data, and the prohibitive computational power required to train larger models, which has been monopolized by tech giants. These two bottlenecks pose significant obstacles to the further development of AI. In this position paper, we argue that leveraging massive numbers of distributed edge devices can break through these barriers. We reveal the vast untapped data and computational resources on edge devices, and review recent technical advances in distributed and federated learning that make this new paradigm viable. Our analysis suggests that, through collaboration among edge devices, everyone can participate in training large language models using small edge devices. This paradigm shift towards distributed training on the edge has the potential to democratize AI development and foster a more inclusive AI community.