Automatically discovering formulaic alpha factors is a central problem in quantitative finance. Existing methods often ignore syntactic and semantic constraints, relying on exhaustive search over unstructured and unbounded spaces. We present AlphaCFG, a grammar-based framework for defining and discovering alpha factors that are syntactically valid, financially interpretable, and computationally efficient. AlphaCFG uses an alpha-oriented context-free grammar to define a tree-structured, size-controlled search space, and formulates alpha discovery as a tree-structured linguistic Markov decision process, which is then solved using a grammar-aware Monte Carlo Tree Search guided by syntax-sensitive value and policy networks. Experiments on Chinese and U.S. stock market datasets show that AlphaCFG outperforms state-of-the-art baselines in both search efficiency and trading profitability. Beyond trading strategies, AlphaCFG serves as a general framework for symbolic factor discovery and refinement across quantitative finance, including asset pricing and portfolio construction.
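To make the idea of a grammar-defined, size-controlled search space concrete, the following is a minimal sketch in Python. It is not the grammar or the search procedure used by AlphaCFG: the feature names, operator names, window values, and the `expressions` and `search_space` helpers are all hypothetical and chosen only for illustration. The sketch assumes a toy grammar `Expr -> Feature | Unary(Expr) | TsOp(Expr, Window) | Binary(Expr, Expr)` and measures size as the number of `Expr` nodes in the derivation tree, so every generated formula is syntactically valid by construction and the space is finite for any size bound.

```python
from functools import lru_cache

# Hypothetical toy vocabulary; NOT the grammar from the paper.
FEATURES = ("open", "close", "volume")   # terminal price/volume fields
WINDOWS = (5, 20)                        # look-back windows for time-series ops
UNARY_OPS = ("Abs", "Log")               # Expr -> Op(Expr)
TS_OPS = ("Mean", "Std")                 # Expr -> Op(Expr, Window)
BINARY_OPS = ("Add", "Sub", "Div")       # Expr -> Op(Expr, Expr)


@lru_cache(maxsize=None)
def expressions(size):
    """Return all formulas whose derivation tree has exactly `size` Expr nodes."""
    if size < 1:
        return ()
    if size == 1:
        return FEATURES  # only the rule Expr -> Feature fits in one node
    out = []
    # Rules with a single Expr child: the child uses the remaining size - 1 nodes.
    for sub in expressions(size - 1):
        out += [f"{op}({sub})" for op in UNARY_OPS]
        out += [f"{op}({sub}, {w})" for op in TS_OPS for w in WINDOWS]
    # Rules with two Expr children: split the remaining size - 1 nodes between them.
    for left_size in range(1, size - 1):
        right_size = size - 1 - left_size
        for left in expressions(left_size):
            for right in expressions(right_size):
                out += [f"{op}({left}, {right})" for op in BINARY_OPS]
    return tuple(out)


def search_space(max_size):
    """Return all formulas with at most `max_size` Expr nodes."""
    return [e for n in range(1, max_size + 1) for e in expressions(n)]


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, len(expressions(n)))   # prints: 1 3, then 2 18, then 3 135
    print(len(search_space(3)))         # prints: 156
```

Even in this toy setting the number of valid formulas grows quickly with the size bound (3, 18, 135 for sizes 1 to 3), which illustrates why exhaustive enumeration does not scale and why a guided search such as the grammar-aware Monte Carlo Tree Search described above is needed. In such a search, each production rule would play the role of an action applied to a partially derived tree.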