Chain-of-Thought (CoT) reasoning is widely used to improve LLM performance, and recent foundation recommender models adopt it by generating textual reasoning before predicting target items represented as Semantic IDs (SIDs). However, we observe that enabling thinking mode in models such as OpenOneRec can degrade recommendation quality by up to 25%. We investigate this failure and identify its cause, which we call Linguistic Inertia: when a textual CoT segment is inserted before SID generation, the model relies more on the natural-language context and less on the historical SID evidence. Further analyses show that the effect grows stronger as access to historical information is reduced and as the CoT becomes longer. To mitigate it, we propose Linguistic-Inertia-Calibrated Decoding (LICD), a training-free framework that combines Reasoning-Chain Compression with Bias-Subtracted Contrastive Inference. Experiments on three large-scale benchmarks show that LICD consistently outperforms both the no-thinking and the original-thinking baselines. Our code is available at https://anonymous.4open.science/r/LICD-4573.
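To make the idea of subtracting a bias at decoding time concrete, the following is a minimal sketch of generic contrastive logit subtraction, not the LICD formulation itself, which the abstract does not specify. It assumes, hypothetically, that a "bias" distribution is obtained from a forward pass that sees the CoT text but not the SID history, and that a weight `alpha` controls how strongly it is subtracted. The function names, `alpha`, and the toy logits are all illustrative assumptions.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def bias_subtracted_scores(full_logits, bias_logits, alpha=0.5):
    """Generic contrastive scoring (an assumption, not the paper's formula).

    full_logits: next-token logits given history + CoT (hypothetical pass).
    bias_logits: next-token logits given the CoT only (hypothetical pass).
    alpha:       strength of the subtraction; 0 recovers ordinary decoding.
    """
    full = log_softmax(full_logits)
    bias = log_softmax(bias_logits)
    return [f - alpha * b for f, b in zip(full, bias)]


def pick_token(scores):
    """Greedy choice: index of the highest-scoring candidate."""
    return max(range(len(scores)), key=scores.__getitem__)


# Toy values over three candidate SID tokens (made up for illustration).
full_logits = [2.0, 1.8, 0.1]  # token 0 narrowly wins under plain decoding
bias_logits = [2.5, 0.2, 0.1]  # token 0 is favoured by the text-only pass

plain_choice = pick_token(full_logits)
calibrated_choice = pick_token(
    bias_subtracted_scores(full_logits, bias_logits, alpha=0.5)
)
```

In this toy setting, token 0 owes most of its score to the text-only pass, so subtracting the bias shifts the greedy choice to token 1, whose support comes from the history-conditioned pass. How the bias pass is actually constructed, and how it interacts with Reasoning-Chain Compression, is left to the full method description.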