While Large Language Model (LLM) agents show promise in automated trading, they still face critical limitations. Prominent multi-agent frameworks often suffer from inefficiency, produce inconsistent signals, and lack the end-to-end optimization required to learn a coherent strategy from market feedback. To address these limitations, we introduce AlphaQuanter, a single-agent framework that uses reinforcement learning (RL) to learn a dynamic policy over a transparent, tool-augmented decision workflow. This policy lets a single agent autonomously orchestrate tools and proactively acquire information on demand, so that its reasoning process remains open to inspection. Extensive experiments demonstrate that AlphaQuanter achieves state-of-the-art performance on key financial metrics. Moreover, its interpretable reasoning reveals sophisticated strategies, offering novel and valuable insights for human traders. Our code and data can be found at https://github.com/horizon-llm/AlphaQuanter.