Janus-Q: End-to-End Event-Driven Trading via Hierarchical-Gated Reward Modeling

Financial market movements are often driven by discrete financial events conveyed through news, whose impacts are heterogeneous, abrupt, and difficult to capture under purely numerical prediction objectives. These limitations have motivated growing interest in using textual information as the primary source of trading signals in learning-based systems. Two key challenges hinder existing approaches: (1) the absence of large-scale, event-centric datasets that jointly model news semantics and statistically grounded market reactions, and (2) the misalignment between language model reasoning and financially valid trading behavior under dynamic market conditions. To address these challenges, we propose Janus-Q, an end-to-end event-driven trading framework that elevates financial news events from auxiliary signals to primary decision units. Janus-Q unifies event-centric data construction and model optimization under a two-stage paradigm. Stage I focuses on event-centric data construction, building a large-scale financial news event dataset comprising 62,400 articles annotated with 10 fine-grained event types, associated stocks, sentiment labels, and event-driven cumulative abnormal return (CAR). Stage II performs decision-oriented fine-tuning, combining supervised learning with reinforcement learning guided by a Hierarchical Gated Reward Model (HGRM), which explicitly captures trade-offs among multiple trading objectives. Extensive experiments demonstrate that Janus-Q achieves more consistent, interpretable, and profitable trading decisions than market indices and LLM baselines, improving the Sharpe Ratio by up to 102.0% while increasing direction accuracy by over 17.5% compared to the strongest competing strategies.
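As a rough illustration of the event-driven cumulative abnormal return (CAR) label mentioned above, the sketch below sums daily abnormal returns over a window around the event day. The abstract does not state which expected-return model or event window Janus-Q uses, so the market-adjusted model (abnormal return = stock return minus market return), the window parameters `pre` and `post`, and the function name `cumulative_abnormal_return` are all assumptions made for this example.

```python
from typing import Sequence


def cumulative_abnormal_return(
    stock_returns: Sequence[float],
    market_returns: Sequence[float],
    event_index: int,
    pre: int = 1,
    post: int = 1,
) -> float:
    """Market-adjusted CAR over the window [event_index - pre, event_index + post].

    Assumption: the expected return on each day is the market return for that
    day, so the abnormal return is simply stock return minus market return.
    """
    if len(stock_returns) != len(market_returns):
        raise ValueError("stock and market return series must have equal length")

    start = event_index - pre
    end = event_index + post
    if start < 0 or end >= len(stock_returns):
        raise ValueError("event window falls outside the return series")

    # Sum the daily abnormal returns across the inclusive event window.
    return sum(
        stock_returns[t] - market_returns[t] for t in range(start, end + 1)
    )


if __name__ == "__main__":
    # Hypothetical daily returns; the news event occurs on day index 2.
    stock = [0.01, 0.00, 0.05, 0.02, -0.01, 0.00]
    market = [0.00, 0.01, 0.01, 0.01, 0.00, 0.00]
    car = cumulative_abnormal_return(stock, market, event_index=2)
    print(f"CAR over [-1, +1]: {car:.4f}")
```

A positive CAR suggests the stock outperformed the market around the event, and a negative CAR suggests the opposite. Other expected-return models, such as a market model fitted on a pre-event estimation window, would slot into the same structure by replacing the per-day subtraction.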
