Causally-Guided Automated Feature Engineering with Multi-Agent Reinforcement Learning

Automated feature engineering (AFE) enables AI systems to autonomously construct high-utility representations from raw tabular data. However, existing AFE methods rely on statistical heuristics, yielding brittle features that fail under distribution shift. We introduce CAFE, a framework that reformulates AFE as a causally-guided sequential decision process, bridging causal discovery with reinforcement learning-driven feature construction. Phase I learns a sparse directed acyclic graph over features and the target to obtain soft causal priors, grouping features as direct, indirect, or other based on their causal influence with respect to the target. Phase II uses a cascading multi-agent deep Q-learning architecture to select causal groups and transformation operators, with hierarchical reward shaping and causal group-level exploration strategies that favor causally plausible transformations while controlling feature complexity. Across 15 public benchmarks (classification with macro-F1; regression with inverse relative absolute error), CAFE achieves up to 7% improvement over strong AFE baselines, reduces episodes-to-convergence, and delivers competitive time-to-target. Under controlled covariate shifts, CAFE reduces performance drop by ~4x relative to a non-causal multi-agent baseline, and produces more compact feature sets with more stable post-hoc attributions. These findings underscore that causal structure, used as a soft inductive prior rather than a rigid constraint, can substantially improve the robustness and efficiency of automated feature engineering.
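The Phase I grouping step can be illustrated with a minimal sketch. The abstract does not give the exact grouping rule, so this assumes one plausible reading: "direct" features are parents of the target in the learned DAG, "indirect" features are its remaining ancestors, and "other" covers everything else. The function name `causal_groups`, the edge-list input, and the toy feature names are hypothetical, and hard edges stand in for the soft causal priors the framework actually uses.

```python
from collections import deque


def causal_groups(edges, features, target):
    """Partition features into direct / indirect / other relative to target.

    Assumption (not from the paper): direct = parents of the target,
    indirect = other ancestors of the target, other = all remaining features.
    `edges` is an iterable of (cause, effect) pairs from a learned DAG.
    """
    # Map each node to the set of its parents.
    parents = {}
    for cause, effect in edges:
        parents.setdefault(effect, set()).add(cause)

    direct = set(parents.get(target, set()))

    # Walk the graph backwards from the target to collect all ancestors.
    ancestors, queue = set(), deque(direct)
    while queue:
        node = queue.popleft()
        if node in ancestors:
            continue
        ancestors.add(node)
        queue.extend(parents.get(node, ()))

    return {
        "direct": sorted(f for f in features if f in direct),
        "indirect": sorted(
            f for f in features if f in ancestors and f not in direct
        ),
        "other": sorted(f for f in features if f not in ancestors),
    }


# Toy DAG: age -> income -> default <- debt, and default -> calls.
edges = [
    ("age", "income"),
    ("income", "default"),
    ("debt", "default"),
    ("default", "calls"),
]
features = ["age", "income", "debt", "calls", "zip"]
groups = causal_groups(edges, features, target="default")
print(groups)
```

In this toy graph, `calls` is a descendant of the target and `zip` is disconnected, so both fall into the "other" group. A group-level agent in Phase II could then choose among these three groups before an operator is selected.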
