Personalization is well studied in search and recommendation, but personalized question answering remains underexplored due to the challenges of inferring preferences from long, noisy, implicit contexts and of generating responses that are both accurate and aligned with user expectations. To address this, we propose Pathways of Thoughts (PoT), an inference-stage method that applies to any large language model (LLM) without task-specific fine-tuning. PoT models thinking as an iterative decision process in which the model dynamically selects among cognitive operations such as reasoning, revision, personalization, and clarification. This enables exploration of multiple reasoning trajectories, producing diverse candidate responses that capture different perspectives. PoT then aggregates and reweights these candidates according to inferred user preferences, yielding a final personalized response that benefits from the complementary strengths of diverse reasoning paths. Experiments on the LaMP-QA benchmark show that PoT consistently outperforms competitive baselines, achieving up to a 10.8\% relative improvement. Human evaluation further validates these gains: annotators preferred PoT over the best-performing baseline in 66\% of cases and reported ties in 15\% of cases.