
AgenticShop: Benchmarking Agentic Product Curation for Personalized Web Shopping


Sunghwan Kim, Ryang Heo, Yongsik Seo, Jinyoung Yeo, Dongha Lee

From arXiv. Accepted at WWW 2026.

The proliferation of e-commerce has made web shopping platforms key gateways for customers navigating the vast digital marketplace. Yet this rapid expansion has led to a noisy and fragmented information environment, increasing the cognitive burden on shoppers as they explore and purchase products online. With promising potential to alleviate this challenge, agentic systems have garnered growing attention for automating user-side tasks in web shopping. Despite significant advancements, existing benchmarks fail to comprehensively evaluate how well agentic systems can curate products in open-web settings. Specifically, they have limited coverage of shopping scenarios, focusing only on simplified single-platform lookups rather than exploratory search. Moreover, they overlook personalization in evaluation, leaving it unclear whether agents can adapt to diverse user preferences in realistic shopping contexts. To address this gap, we present AgenticShop, the first benchmark for evaluating agentic systems on personalized product curation in open-web environments. Crucially, our approach features realistic shopping scenarios, diverse user profiles, and a verifiable, checklist-driven personalization evaluation framework. Through extensive experiments, we demonstrate that current agentic systems remain largely insufficient, emphasizing the need for user-side systems that effectively curate tailored products across the modern web.

