Dynamic Objective Selection with Safeguards and LLM Oversight for Financial Decision-Making

Financial decision-making tasks such as stock recommendation and portfolio allocation typically estimate future return and risk and then select trades or allocations for an investor; the optimization objective used for that selection often determines realized performance. Because market conditions evolve over time, a fixed objective can be suboptimal across regimes. Regime-switching pipelines respond to this by relying on latent regime estimates, but those estimates can be noisy or delayed, and frequent switching can increase turnover and operational instability. In this paper, we propose DOSS (Dynamic Objective Selection with Safeguards), a learning-based selector that, at each time point, directly chooses the decision-relevant objective function from a small set of candidates (e.g., return-seeking, loss-averse, and risk-adjusted) using interpretable statistical summaries of recent returns, without introducing intermediate regime variables. DOSS formulates objective selection as a classification problem over the candidate objectives and performs sequential updates on a rolling window, so that each selection is forward-looking and free of temporal leakage; it also outputs a confidence score for each proposal. To mitigate misselection and excessive switching in deployment, DOSS applies confidence-aware gating with a fail-safe that overrides low-confidence proposals to a conservative default, and it enforces explicit controls on switching frequency. We further integrate governance by positioning a Large Language Model (LLM) as an oversight component rather than a generator of new objectives: the LLM may only accept a proposed objective or override it to a predefined safe default, and deterministic rule-based constraints trigger overrides when needed.
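The safeguard layer described above can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `ObjectiveGate`, the confidence threshold, the switch budget, the window length, the choice of `loss_averse` as the safe default, and the `overseer` callback that stands in for the LLM are all assumptions made for illustration. The sketch only shows the accept-or-override structure: low confidence, an exhausted switch budget, or an overseer veto each send the decision to the predefined default.

```python
# Hypothetical sketch of confidence-aware gating with a fail-safe default.
# All names and numeric values are illustrative assumptions.

OBJECTIVES = ("return_seeking", "loss_averse", "risk_adjusted")
SAFE_DEFAULT = "loss_averse"  # assumed conservative default


class ObjectiveGate:
    def __init__(self, min_confidence=0.6, max_switches=2, window=20):
        self.min_confidence = min_confidence  # assumed confidence threshold
        self.max_switches = max_switches      # assumed switch budget per window
        self.window = window                  # assumed window length in steps
        self.current = SAFE_DEFAULT
        self.switch_steps = []                # steps at which the objective changed

    def decide(self, step, proposal, confidence, overseer=None):
        """Return the objective to use at `step`.

        `overseer` is a stand-in for the LLM: a callable that returns
        "accept" or "override" and cannot propose a new objective.
        """
        if proposal not in OBJECTIVES:
            raise ValueError(f"unknown objective: {proposal!r}")

        # Keep only the switches that fall inside the rolling window.
        recent = [s for s in self.switch_steps if step - s < self.window]

        if confidence < self.min_confidence:
            chosen = SAFE_DEFAULT  # fail-safe for low-confidence proposals
        elif proposal != self.current and len(recent) >= self.max_switches:
            chosen = SAFE_DEFAULT  # deterministic rule: switch budget exhausted
        elif overseer is not None and overseer(proposal, confidence) != "accept":
            chosen = SAFE_DEFAULT  # oversight veto
        else:
            chosen = proposal

        if chosen != self.current:
            recent.append(step)
            self.current = chosen
        self.switch_steps = recent
        return chosen


if __name__ == "__main__":
    gate = ObjectiveGate(min_confidence=0.6, max_switches=1, window=10)
    for step, (proposal, conf) in enumerate(
        [("return_seeking", 0.9), ("risk_adjusted", 0.9), ("return_seeking", 0.3)]
    ):
        print(step, proposal, conf, "->", gate.decide(step, proposal, conf))
```

In this sketch every override path leads to the same predefined default, so neither the deterministic rules nor the overseer can introduce an objective outside the candidate set.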
