The Alpha Illusion: Reported Alpha from LLM Trading Agents Should Not Be Treated as Deployment Evidence

Yuxuan Ye, Jun Han, Ao Hu, Juncheng Bu, Yiyi Chen, Liangjian Wen, Danilo Mandic, Danny Dongning Sun, Xu Yinghui, Zenglin Xu

End-to-end LLM trading agents have moved quickly from research curiosity to a small ecosystem of named systems, including FinCon, FinMem, TradingAgents, FinAgent, QuantAgent, and FLAG-Trader. Several of these report headline Sharpe ratios that would be material if read at face value on a deployment desk, and associated benchmarks such as FinBen report trading-task Sharpe statistics in the same range. The gap between architecture research and deployment claim has been crossed too freely on both sides of the academia-industry divide. We take a position on that gap: reported alpha from end-to-end LLM trading agents should not be treated as deployment evidence. Before such returns can support claims of deployable trading capability, they must survive structural validity tests for temporal integrity, real-world frictions, counterfactual robustness, predictive calibration, numerical execution, and multi-agent disaggregation. Current public evidence cannot yet distinguish robust predictive ability from temporal contamination, unmodeled frictions, short-window Sharpe uncertainty, narrative fitting, and parametric priors. The problem is not only evaluative but structural. Language confidence is not tradable probability, narrative reasoning is not numerical execution, and model priors may become undisclosed implicit factor exposures. We contribute a minimum reporting protocol suite, P1-P6, with tiered applicability by claim strength, and a conservative modular alternative that uses LLMs as auditable information interfaces upstream of independent calibration, risk, and execution modules. Code and reproduction harness: https://github.com/hj1650782738/Trading.
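
To make two of the concerns named above concrete (short-window Sharpe uncertainty and unmodeled frictions), the following is a minimal sketch and not the paper's protocol or its reproduction harness. It assumes IID per-period returns and uses the standard large-sample standard error for the Sharpe ratio, sqrt((1 + SR^2 / 2) / T), together with a simple linear cost model. The function names `sharpe_with_ci` and `net_returns`, the 10 basis point cost, and the synthetic return series are all illustrative assumptions.

```python
import math
import statistics


def sharpe_with_ci(returns, periods_per_year=252, z=1.96):
    """Annualized Sharpe ratio with an approximate confidence interval.

    Assumes IID returns and uses the large-sample standard error
    sqrt((1 + SR^2 / 2) / T) for the per-period Sharpe ratio.
    Autocorrelation and fat tails, which usually widen the interval,
    are ignored in this sketch.
    """
    t = len(returns)
    if t < 2:
        raise ValueError("need at least two return observations")
    mean = statistics.fmean(returns)
    std = statistics.stdev(returns)  # sample standard deviation
    if std == 0:
        raise ValueError("returns have zero volatility")
    sr = mean / std  # per-period Sharpe ratio (risk-free rate taken as 0)
    se = math.sqrt((1.0 + 0.5 * sr * sr) / t)
    scale = math.sqrt(periods_per_year)
    return sr * scale, (sr - z * se) * scale, (sr + z * se) * scale


def net_returns(gross_returns, turnover, cost_bps=10.0):
    """Subtract a linear transaction cost of turnover * cost_bps per period.

    cost_bps is a hypothetical all-in cost in basis points, not a
    figure taken from the paper.
    """
    if len(gross_returns) != len(turnover):
        raise ValueError("gross_returns and turnover must have equal length")
    return [r - tv * cost_bps / 1e4 for r, tv in zip(gross_returns, turnover)]


# Synthetic six-month backtest: 126 daily returns alternating +1.1% / -0.9%.
gross = [0.011, -0.009] * 63
sharpe, low, high = sharpe_with_ci(gross)

# Same series after costs, assuming full portfolio turnover every day.
net = net_returns(gross, [1.0] * len(gross))
net_sharpe, net_low, net_high = sharpe_with_ci(net)

print(f"gross Sharpe {sharpe:.2f}, interval [{low:.2f}, {high:.2f}]")
print(f"net Sharpe   {net_sharpe:.2f}, interval [{net_low:.2f}, {net_high:.2f}]")
```

On this synthetic series the point estimate of the gross Sharpe ratio is above 1.5, yet the approximate 95% interval still straddles zero, and with daily full turnover the assumed 10 basis point cost removes the entire mean return. Both effects are the kind that a headline Sharpe figure from a short backtest window does not reveal on its own.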
