Can language models improve their accuracy without external supervision? Methods such as debate, bootstrapping, and internal coherence maximization achieve this surprising feat, in some cases matching the performance of finetuning on gold labels. Yet why these methods work has remained theoretically unclear. We show that they are all special cases of coherence optimization: finding the context-to-behavior mapping that is most compressible and jointly predictable. We prove that coherence optimization is equivalent to description-length regularization, and that among all such regularization schemes it is optimal for semi-supervised learning when the regularizer is derived from a pretrained model. Our theory, supported by preliminary experiments, explains why feedback-free self-improvement works and predicts when it should succeed or fail.
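The claimed equivalence can be sketched in minimum-description-length terms. The notation below (a mapping $f$, unlabeled contexts $D$, and a pretrained model $p_\theta$ used as both prior and predictor) is an illustrative formalization we introduce here, not necessarily the paper's own:

```latex
% Hedged sketch: coherence optimization as description-length regularization.
% f : context -> behavior mapping;  D : a set of unlabeled contexts x;
% p_theta : pretrained model, serving as prior over mappings and as predictor.
\begin{equation}
  f^{\star} \;=\; \arg\min_{f}\;
  \underbrace{-\log p_{\theta}(f)}_{\substack{\text{description length of } f \\ \text{(compressibility)}}}
  \;+\;
  \underbrace{\sum_{x \in D} -\log p_{\theta}\bigl(f(x) \mid x, f\bigr)}_{\substack{\text{joint predictability} \\ \text{of behaviors}}}
\end{equation}
```

Under this reading, the first term is a regularizer measuring how cheaply the mapping can be encoded under the pretrained prior, and the second rewards mappings whose behaviors are mutually predictable given the contexts; "coherence" corresponds to minimizing their sum.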