Reasoning Large Language Models (RLLMs), which excel at complex tasks, present unique challenges for digital watermarking: existing methods often disrupt logical coherence or incur high computational cost. Token-based watermarking techniques can corrupt the reasoning flow by applying pseudo-random biases, while semantic-aware approaches improve quality but introduce significant latency or require auxiliary models. This paper introduces ReasonMark, a novel watermarking framework designed specifically for reasoning-intensive LLMs. Our approach decouples generation into an undisturbed Thinking Phase and a watermarked Answering Phase. We propose a Criticality Score to identify semantically pivotal tokens in the reasoning trace, which are distilled into a Principal Semantic Vector (PSV). The PSV then guides a semantically adaptive mechanism that modulates watermark strength according to each token's alignment with the PSV, ensuring robustness without compromising logical integrity. Extensive experiments show that ReasonMark surpasses state-of-the-art methods, reducing text Perplexity by 0.35, increasing translation BLEU score by 0.164, and raising mathematical accuracy by 0.67 points. These gains are achieved alongside a 0.34% higher watermark detection AUC and stronger robustness to attacks, all with a negligible increase in latency. This work enables the traceable and trustworthy deployment of reasoning LLMs in real-world applications.
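The PSV-guided modulation described above can be illustrated with a minimal sketch. The formulation below is an assumption for illustration only: it treats the PSV as a criticality-weighted average of pivotal-token embeddings and scales a green-list logit bias (in the style of token-level watermarking) by cosine alignment with the PSV, weakening the bias on tokens most aligned with the reasoning semantics. The function names, the weighting scheme, and the `base_delta` parameter are all hypothetical, not ReasonMark's actual equations.

```python
import numpy as np

def principal_semantic_vector(token_embeddings, criticality_scores):
    """Hypothetical PSV: criticality-weighted mean of pivotal-token embeddings."""
    w = np.asarray(criticality_scores, dtype=float)
    w = w / w.sum()  # normalize Criticality Scores into weights
    return (w[:, None] * np.asarray(token_embeddings, dtype=float)).sum(axis=0)

def watermark_strength(candidate_embedding, psv, base_delta=2.0):
    """Scale the watermark logit bias by (1 - cosine alignment with the PSV).

    Tokens strongly aligned with the reasoning semantics receive a weaker
    bias, so the watermark avoids perturbing semantically pivotal choices.
    """
    cos = float(np.dot(candidate_embedding, psv)) / (
        np.linalg.norm(candidate_embedding) * np.linalg.norm(psv) + 1e-8
    )
    return base_delta * (1.0 - max(cos, 0.0))
```

Under this sketch, a candidate token whose embedding points along the PSV gets a near-zero bias, while an orthogonal (semantically unimportant) token receives the full `base_delta`, matching the paper's stated goal of adapting watermark strength to token-PSV alignment.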