Large language models (LLMs) can be adapted either through numerical updates that alter model parameters or through symbolic manipulations that operate on discrete prompts or logical constraints. While numerical fine-tuning excels at injecting new factual knowledge, symbolic updates offer flexible control over style and alignment without retraining. We introduce a neurosymbolic LoRA framework that dynamically combines these two complementary strategies. Specifically, we present a unified monitoring signal and a reward-based classifier to decide when to employ LoRA for deeper factual reconstruction and when to apply TextGrad for token-level edits. Our approach remains memory-efficient by offloading symbolic transformations to an external LLM only when needed. Additionally, the refined prompts produced during symbolic editing serve as high-quality, reusable training data, an important benefit in data-scarce domains such as mathematical reasoning. Extensive experiments across multiple LLM backbones show that neurosymbolic LoRA consistently outperforms purely numerical and purely symbolic baselines, demonstrating superior adaptability. Our findings highlight the value of interleaving numerical and symbolic updates to unlock a new level of versatility in language model fine-tuning.
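To make the routing mechanism concrete, the following is a minimal Python sketch of how a reward-based decision rule might dispatch between a numerical (LoRA) update and a symbolic (TextGrad-style) prompt edit. All names, fields, and thresholds here (`Feedback`, `route_update`, `reward_threshold`, `factual_error`) are hypothetical illustrations of the idea, not the paper's actual interface.

```python
# A minimal, hypothetical sketch of the routing idea described above.
# The field names and the threshold rule are illustrative assumptions.

from dataclasses import dataclass


@dataclass
class Feedback:
    """Unified monitoring signal for one training example (assumed fields)."""
    loss: float           # task loss on the example
    reward: float         # scalar reward from an external evaluator
    factual_error: bool   # whether the failure looks like missing knowledge


def route_update(fb: Feedback, reward_threshold: float = 0.5) -> str:
    """Reward-based decision rule (illustrative threshold).

    Low reward combined with a factual failure suggests the model lacks
    knowledge, so we fall back to a numerical (LoRA) weight update;
    otherwise a cheap symbolic (TextGrad-style) prompt edit, offloaded
    to an external LLM, is attempted first.
    """
    if fb.factual_error and fb.reward < reward_threshold:
        return "lora"      # deeper factual reconstruction via weight updates
    return "textgrad"      # token-level prompt edit via an external LLM


# Usage: decide per example and dispatch to the corresponding optimizer.
decision = route_update(Feedback(loss=2.3, reward=0.2, factual_error=True))
assert decision == "lora"
```

Under this sketch, the prompts refined on the "textgrad" branch could additionally be logged as a dataset for later numerical fine-tuning, matching the reuse of symbolic edits as training data described above.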