Large Language Models (LLMs) increasingly act as gateways to web content, shaping how millions of users encounter online information. Unlike traditional search engines, whose retrieval and ranking mechanisms are well studied, the selection processes of web-connected LLMs add layers of opacity to how answers are generated. By determining which news outlets users see, these systems can influence public opinion, reinforce echo chambers, and pose risks to civic discourse and public trust. This work extends two decades of research in algorithmic auditing to examine how LLMs function as news engines. We present the first audit comparing three leading agents, GPT-4o-Mini, Claude-3.7-Sonnet, and Gemini-2.0-Flash, against Google News, asking: \textit{How do LLMs differ from traditional aggregators in the diversity, ideology, and reliability of the media they expose to users?} Across 24 global topics, we find that, compared to Google News, LLMs surface significantly fewer unique outlets and allocate attention more unevenly. Specifically, GPT-4o-Mini emphasizes more factual and right-leaning sources; Claude-3.7-Sonnet favors institutional and civil-society domains and slightly amplifies right-leaning exposure; and Gemini-2.0-Flash exhibits a modest left-leaning tilt without significant changes in factuality. These patterns remain robust under prompt variations and alternative reliability benchmarks. Together, our findings show that LLMs already enact \textit{agentic editorial policies}, curating information in ways that diverge from conventional aggregators. Understanding and governing their emerging editorial power will be critical for ensuring transparency, pluralism, and trust in digital information ecosystems.