LLM-Assisted Reranking to Operationalize Nuanced Objectives in Recommender Systems

Recommender systems have grown from content-organization tools into sophisticated systems that shape daily behavior. By controlling what we see, they shape what we perceive, raising concerns about filter bubbles, radicalization, polarization, and social inequality. Large language models (LLMs) enable more powerful personalization, intensifying these dynamics. Yet most recommenders are tuned for engagement or a narrow set of accuracy metrics, with little attention to broader social implications, for example, how personalization reshapes exposure in socially consequential domains. We investigate whether LLM-assisted reranking, while improving personalization, inadvertently amplifies exposure to ideologically extreme or conspiratorial political content, a risk that has been theorized but not empirically characterized in news recommendation. Using real news-consumption histories, we rerank YouTube's sidebar candidates through zero-shot, instruction-based prompting. We compare a baseline prompt with a constrained variant that preserves topical relevance and broadens ideological exposure while reducing conspiratorial or extreme content. Without constraints, reranking strengthened personalization but increased exposure to conspiratorial and extremist material for users whose histories contained such content. Lightweight prompt-level regularization reduced promotion of extreme content and increased ideological diversity, at a modest cost in relevance. Synthetic experiments suggest that LLMs rerank via statistical regularities in language rather than semantic understanding of ideology, clarifying why naive prompts amplify these patterns and why regularization can reshape them. Together, our results highlight the power of LLMs to operationalize contextual nuance in high-stakes recommendation, and the need to evaluate LLM-assisted personalization beyond accuracy and to treat prompt design as a value-laden choice rather than a neutral default.
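To make the baseline-versus-constrained comparison concrete, the following is a minimal sketch of how such zero-shot reranking prompts might be assembled and how a model's reply might be parsed. The abstract does not give the authors' prompt wording, so the instruction strings, the function names `build_rerank_prompt` and `parse_ranking`, and the comma-separated reply format are all assumptions made for illustration; no particular LLM API is assumed, and the model call itself is left out.

```python
import re

# Hypothetical instruction text: the paper's actual prompts are not given in the abstract.
BASELINE_INSTRUCTION = (
    "Rerank the candidate videos so that those most relevant to the "
    "user's watch history come first."
)
CONSTRAINT_INSTRUCTION = (
    "Keep the ranking topically relevant, broaden the range of ideological "
    "perspectives near the top, and rank conspiratorial or extreme content lower."
)


def build_rerank_prompt(history, candidates, constrained=False):
    """Assemble a zero-shot reranking prompt from titles (illustrative format)."""
    lines = [BASELINE_INSTRUCTION]
    if constrained:
        # Prompt-level regularization: one extra instruction, no model changes.
        lines.append(CONSTRAINT_INSTRUCTION)
    lines.append("Watch history:")
    lines += [f"- {title}" for title in history]
    lines.append("Candidates:")
    lines += [f"[{i}] {title}" for i, title in enumerate(candidates, start=1)]
    lines.append("Answer with the candidate numbers in ranked order, separated by commas.")
    return "\n".join(lines)


def parse_ranking(response, n_candidates):
    """Turn a reply such as '3, 1, 2' into a full 0-based permutation."""
    order = []
    for token in re.findall(r"\d+", response):
        idx = int(token) - 1
        # Ignore out-of-range numbers and repeated candidates.
        if 0 <= idx < n_candidates and idx not in order:
            order.append(idx)
    # Candidates the model omitted keep their original relative order at the end.
    order += [i for i in range(n_candidates) if i not in order]
    return order
```

In this sketch the two conditions differ only by a single added instruction, which matches the abstract's point that the constraint is applied at the prompt level rather than through retraining. The parser always returns a complete permutation, so downstream exposure or relevance measurements would not break on malformed replies.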
