Open-sourcing foundation models (FMs) enables broad reuse but also exposes model trainers to economic and safety risks from unrestricted downstream fine-tuning. We address this problem by building non-fine-tunable foundation models: models that remain broadly usable in their released form while yielding limited adaptation gains under task-agnostic unauthorized fine-tuning. We propose Private Mask Pre-Training (PMP), a pre-training framework that concentrates representation learning into a sparse subnetwork identified early in training. The binary mask defining this subnetwork is kept private, and only the final dense weights are released. This forces unauthorized fine-tuning, which lacks access to the mask, to update parameters misaligned with the pre-training subspace, inducing an intrinsic mismatch between the fine-tuning objective and the pre-training geometry. We provide theoretical analysis showing that this mismatch destabilizes gradient-based adaptation and bounds fine-tuning gains. Empirical results on large language models demonstrate that PMP preserves base model performance while consistently degrading unauthorized fine-tuning across a wide range of downstream tasks, with the strength of non-fine-tunability controlled by the mask ratio.
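To make the masked-update mechanism concrete, below is a minimal PyTorch sketch of the core PMP idea: gradients are confined to a private binary subnetwork mask during pre-training, while the released artifact is the final dense weights. The function names, the random mask selection, and the `mask_ratio` default are illustrative assumptions; the paper's actual procedure for identifying the subnetwork early in training is not reproduced here.

```python
import torch

def sample_private_mask(model, mask_ratio=0.1, seed=0):
    """Sample a binary mask keeping `mask_ratio` of each weight tensor.

    Uniform random selection is a stand-in assumption; PMP identifies the
    subnetwork early in training rather than sampling it at random.
    """
    gen = torch.Generator().manual_seed(seed)
    return {
        name: (torch.rand(param.shape, generator=gen) < mask_ratio).float()
        for name, param in model.named_parameters()
        if param.dim() >= 2  # mask weight matrices only (assumption)
    }

def pmp_masked_step(model, loss, masks, optimizer):
    """One pre-training step that confines learning to the private subnetwork.

    `masks` maps parameter name -> binary tensor (1 = inside the private
    subnetwork, 0 = frozen for this step).
    """
    optimizer.zero_grad()
    loss.backward()
    with torch.no_grad():
        for name, param in model.named_parameters():
            if name in masks and param.grad is not None:
                # Zero gradients outside the private subnetwork so that
                # representation learning concentrates in the masked subspace.
                param.grad.mul_(masks[name])
    optimizer.step()
```

Under this sketch, the trainer keeps `masks` secret and releases only `model.state_dict()` after pre-training; a downstream party fine-tuning the dense weights without the mask updates coordinates outside the subspace where representations were learned, which is the mismatch the abstract describes.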