Alpha mining, which aims to discover predictive return signals, is typically formulated as a symbolic regression problem. Traditional symbolic methods suffer from inefficient search and biased prior knowledge. Recently, Large Language Models (LLMs) have emerged as a promising alternative: they automatically generate textual thoughts and executable code, making alpha mining both efficient and interpretable. However, existing approaches mostly focus on leveraging LLMs' reasoning and reflection capabilities and largely neglect the positional bias introduced by flat thought representations, which restricts the efficiency and diversity of the search process. This paper introduces Tree-structured thought Evolution (TreEvo), which evolves hierarchically decomposed thoughts to expand the effective search space. In addition, we propose a set of evolutionary operators tailored to structured thoughts. Experiments on four real-market datasets demonstrate that TreEvo not only obtains alphas competitive with those of traditional methods using up to 200 times fewer evaluations, but also consistently outperforms LLM-driven evolutionary algorithms (EAs) across all datasets, by $14.31\%$ on average.
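To make the idea of evolving hierarchically decomposed thoughts more concrete, the following is a minimal sketch, not TreEvo's actual implementation. It assumes a thought can be stored as a tree of short text nodes and shows one plausible structure-aware operator, a subtree crossover. The `Thought` class, the `subtree_crossover` function, and the example thought texts are all hypothetical names introduced here for illustration; the abstract does not specify the operators.

```python
import copy
import random
from dataclasses import dataclass, field
from typing import List


@dataclass
class Thought:
    """One node of a hierarchically decomposed thought (hypothetical structure)."""
    text: str
    children: List["Thought"] = field(default_factory=list)

    def nodes(self) -> List["Thought"]:
        """Return this node and all of its descendants in pre-order."""
        out = [self]
        for child in self.children:
            out.extend(child.nodes())
        return out

    def flatten(self, depth: int = 0) -> str:
        """Render the tree as an indented outline, e.g. to place in an LLM prompt."""
        lines = ["  " * depth + "- " + self.text]
        lines += [child.flatten(depth + 1) for child in self.children]
        return "\n".join(lines)


def subtree_crossover(a: Thought, b: Thought, rng: random.Random) -> Thought:
    """Copy parent `a` and overwrite one random non-root subtree with one from `b`.

    This is an assumed, generic tree crossover, not an operator taken from the paper.
    Both parents are left unmodified.
    """
    child = copy.deepcopy(a)
    targets = child.nodes()[1:]  # skip the root so the top-level idea of `a` survives
    donors = b.nodes()[1:]
    if not targets or not donors:
        return child
    target = rng.choice(targets)
    donor = copy.deepcopy(rng.choice(donors))
    target.text, target.children = donor.text, donor.children
    return child


# Illustrative parent thoughts (invented for this example).
momentum = Thought("Momentum signal", [
    Thought("Use 20-day return"),
    Thought("Normalize by volatility"),
])
reversal = Thought("Reversal signal", [
    Thought("Use 5-day return"),
    Thought("Rank cross-sectionally"),
])

offspring = subtree_crossover(momentum, reversal, random.Random(0))
print(offspring.flatten())
```

Because the operator swaps whole subtrees rather than editing a flat string, a sub-idea keeps its internal structure when it moves between candidates. In an LLM-driven loop, the flattened outline of `offspring` would then be passed to the model to generate executable alpha code, a step omitted here.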