Capturing complex user preferences from sparse behavioral sequences remains a fundamental challenge in sequential recommendation. Recent latent reasoning methods have shown promise by extending test-time computation through multi-step reasoning, yet they exclusively rely on depth-level scaling along a single trajectory, suffering from diminishing returns as reasoning depth increases. To address this limitation, we propose \textbf{Parallel Latent Reasoning (PLR)}, a novel framework that pioneers width-level computational scaling by exploring multiple diverse reasoning trajectories simultaneously. PLR constructs parallel reasoning streams through learnable trigger tokens in continuous latent space, preserves diversity across streams via global reasoning regularization, and adaptively synthesizes multi-stream outputs through mixture-of-reasoning-streams aggregation. Extensive experiments on three real-world datasets demonstrate that PLR substantially outperforms state-of-the-art baselines while maintaining real-time inference efficiency. Theoretical analysis further validates the effectiveness of parallel reasoning in improving generalization capability. Our work opens new avenues for enhancing reasoning capacity in sequential recommendation beyond existing depth scaling.
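To make the three components named above concrete (parallel streams seeded by learnable trigger tokens, a diversity-preserving regularizer, and mixture-of-reasoning-streams aggregation), the following is a minimal PyTorch sketch. It is not the paper's implementation: the class name `ParallelLatentReasoner`, the hyperparameters (`num_streams`, `reasoning_steps`), the shared transformer layer used as the per-stream latent reasoner, the cosine-similarity diversity penalty, and the softmax gate are all illustrative assumptions.

```python
# Minimal sketch of the parallel latent reasoning idea, under assumed shapes
# and module choices; not the paper's exact formulation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ParallelLatentReasoner(nn.Module):
    def __init__(self, d_model=64, num_streams=4, reasoning_steps=2, n_heads=4):
        super().__init__()
        self.num_streams = num_streams
        self.reasoning_steps = reasoning_steps
        # Learnable trigger tokens: one per stream, each seeding a distinct
        # reasoning trajectory in the continuous latent space.
        self.trigger_tokens = nn.Parameter(torch.randn(num_streams, d_model))
        # A shared transformer layer, rolled out for `reasoning_steps` steps,
        # stands in for depth-level latent reasoning within each stream.
        self.reasoner = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, dim_feedforward=4 * d_model,
            batch_first=True)
        # Gating network for mixture-of-reasoning-streams aggregation.
        self.gate = nn.Linear(d_model, 1)

    def forward(self, seq_emb):
        """seq_emb: (batch, seq_len, d_model) embedded behavior sequence."""
        B, L, D = seq_emb.shape
        stream_outputs = []
        for s in range(self.num_streams):
            trigger = self.trigger_tokens[s].expand(B, 1, D)
            h = torch.cat([seq_emb, trigger], dim=1)   # append the stream's trigger token
            for _ in range(self.reasoning_steps):      # depth scaling within the stream
                h = self.reasoner(h)
            stream_outputs.append(h[:, -1])            # latent state at the trigger position
        streams = torch.stack(stream_outputs, dim=1)   # (B, num_streams, D)

        # Width-level aggregation: softmax gate over the parallel streams.
        weights = F.softmax(self.gate(streams).squeeze(-1), dim=-1)  # (B, num_streams)
        user_repr = torch.einsum("bs,bsd->bd", weights, streams)

        # Diversity regularizer (assumed form): penalize pairwise cosine
        # similarity between stream outputs so trajectories do not collapse.
        normed = F.normalize(streams, dim=-1)
        sim = torch.matmul(normed, normed.transpose(1, 2))            # (B, S, S)
        off_diag = sim - torch.eye(self.num_streams, device=sim.device)
        diversity_loss = off_diag.pow(2).mean()
        return user_repr, diversity_loss


if __name__ == "__main__":
    model = ParallelLatentReasoner()
    fake_seq = torch.randn(8, 20, 64)                  # batch of 8 user sequences
    user_repr, div_loss = model(fake_seq)
    print(user_repr.shape, div_loss.item())            # torch.Size([8, 64]) ...
```

In this sketch, `user_repr` would feed the recommendation head (e.g., scored against candidate item embeddings), and `diversity_loss` would be added to the training objective with a small weight; both design choices are assumptions made only to keep the example self-contained.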