Open source projects have made remarkable progress in producing widely usable machine learning models and systems, but open source alone will face challenges in fully democratizing access to AI. Unlike previous generations of open source software, open source and open weight AI models require substantial resources to activate and maintain (e.g., data and compute for pre-training, post-training, and deployment), which only a few actors can currently provide. This position paper argues that open source AI must be complemented by public AI: infrastructure and institutions that ensure models are accessible, sustainable, and governed in the public interest. To achieve the full promise of AI models as prosocial public goods, we need to build public infrastructure to power and deliver open source software and models.