The ubiquity of multimedia content is reshaping online information spaces, particularly in social media environments. At the same time, search is being rapidly transformed by generative AI, with large language models (LLMs) routinely deployed as intermediaries between users and multimedia content to retrieve and summarize information. Despite the growing influence of LLMs, the impact of their inaccuracies and potential vulnerabilities on multimedia information-seeking tasks remains largely unexplored. We investigate how generative AI affects accuracy, efficiency, and confidence in information retrieval from videos. We conduct an experiment with around 900 participants on 8,000+ video-based information-seeking tasks, comparing behavior across three conditions: (1) access to videos only, (2) access to videos with LLM-based AI assistance, and (3) access to videos with a deceiving AI assistant designed to provide false answers. We find that AI assistance increases accuracy by 3-7% when participants view the relevant video segment, and by 27-35% when they do not. Efficiency increases by 10% for short videos and by 25% for longer ones. However, participants tend to over-rely on AI outputs, resulting in accuracy drops of up to 32% when they interact with the deceiving AI. Alarmingly, self-reported confidence in answers remains stable across all three conditions. Our findings expose fundamental safety risks in AI-mediated video information retrieval.