Open source projects have made incredible progress in producing widely usable machine learning models and systems, but open source alone will face challenges in fully democratizing access to AI. Unlike previous generations of open source software, open source and open weight AI models require substantial resources to activate and maintain (e.g., data and compute for pre-training, post-training, and deployment), which only a few actors can currently provide. This position paper argues that open source AI must be complemented by public AI: infrastructure and institutions that ensure models are accessible, sustainable, and governed in the public interest. To achieve the full promise of AI models as prosocial public goods, we need to build public infrastructure to power and deliver open source software and models.