Conversational search engines such as YouChat and Microsoft Copilot use large language models (LLMs) to generate responses to queries. It is only a small step to also let the same technology insert ads within the generated responses, instead of placing ads separately next to a response. Inserted ads would be reminiscent of native advertising and product placement, both of which are very effective forms of subtle and manipulative advertising. Considering the high computational costs associated with LLMs, for which providers need to develop sustainable business models, users of conversational search engines may very well be confronted with generated native ads in the near future. In this paper, we thus take a first step to investigate whether LLMs can also be used as a countermeasure, i.e., to block generated native ads. We compile the Webis Generated Native Ads 2024 dataset of queries and generated responses with automatically inserted ads, and evaluate whether LLMs or fine-tuned sentence transformers can detect the ads. In our experiments, the investigated LLMs struggle with the task, but the fine-tuned sentence transformers achieve precision and recall values above 0.9.
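For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how precision and recall could be computed for a binary ad detector. It assumes each response is labeled 1 if it contains an inserted ad and 0 otherwise. The function name `precision_recall` and the toy labels are illustrative assumptions, not taken from the dataset or from the evaluation code used in the experiments.

```python
def precision_recall(gold, predicted):
    """Return (precision, recall) for binary labels, where 1 = response contains an ad."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    # Count true positives, false positives, and false negatives.
    tp = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, predicted) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 0)

    # Precision: share of flagged responses that really contain an ad.
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    # Recall: share of ad-containing responses that were flagged.
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return precision, recall


# Toy labels (hypothetical): six responses, four of which contain an ad.
gold = [1, 1, 1, 0, 0, 1]
predicted = [1, 1, 0, 0, 1, 1]
precision, recall = precision_recall(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Under this reading, a detector with both values above 0.9 flags few ad-free responses by mistake and misses few responses that do contain an ad.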