We present an interpretable machine learning pipeline that decomposes cross-sectional equity return predictability into auditable factor contributions. We apply an XGBoost model with TreeSHAP attribution, together with stress testing, to 3,632 Chinese A-share stocks from 2009 to 2019. Using 60-month rolling windows over 55 months of out-of-sample data, XGBoost obtains a mean AUC of 0.547 and a long-short spread between the top and bottom quintiles of +2.38% per month (Newey-West t = 5.94; annualized Sharpe ratio 2.23). This alpha persists after adjusting for the Carhart four-factor model (+2.31% per month; t = 7.48). SHAP decomposition indicates that, averaged across 55 industry groups, behavioral signals (turnover and momentum) account for 58.2% of predictive attribution, compared with 10.7% for valuation ratios. Ablation analysis cross-validates this ranking and shows that SHAP and ablation diverge in ways that reveal a feature-substitutability structure largely invisible to either method used in isolation.
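The portfolio statistics quoted above (top-minus-bottom quintile spread, Newey-West t-statistic, annualized Sharpe ratio) can be illustrated with a minimal sketch. It is not the paper's implementation: the equal weighting within quintiles, the Bartlett-kernel lag of 4, the function names, and the synthetic data are all assumptions made here for illustration.

```python
import numpy as np


def long_short_spread(scores, returns):
    """Equal-weighted top-minus-bottom quintile return for one month."""
    scores = np.asarray(scores, dtype=float)
    returns = np.asarray(returns, dtype=float)
    lo, hi = np.quantile(scores, [0.2, 0.8])
    return returns[scores >= hi].mean() - returns[scores <= lo].mean()


def newey_west_tstat(x, lags):
    """t-statistic of the mean of x with a Bartlett-kernel long-run variance."""
    x = np.asarray(x, dtype=float)
    n = x.size
    e = x - x.mean()
    lrv = e @ e / n  # lag-0 autocovariance
    for k in range(1, lags + 1):
        weight = 1.0 - k / (lags + 1.0)  # Bartlett weight
        lrv += 2.0 * weight * (e[k:] @ e[:-k]) / n
    return x.mean() / np.sqrt(lrv / n)


def annualized_sharpe(monthly):
    """Annualized Sharpe ratio of a monthly long-short return series."""
    monthly = np.asarray(monthly, dtype=float)
    return np.sqrt(12.0) * monthly.mean() / monthly.std(ddof=1)


if __name__ == "__main__":
    # Synthetic stand-in for model scores and next-month returns
    # (hypothetical data, not the A-share sample).
    rng = np.random.default_rng(0)
    n_months, n_stocks = 55, 500
    spreads = []
    for _ in range(n_months):
        scores = rng.normal(size=n_stocks)
        rets = 0.01 * scores + rng.normal(scale=0.10, size=n_stocks)
        spreads.append(long_short_spread(scores, rets))
    spreads = np.array(spreads)
    print("mean monthly spread:", spreads.mean())
    print("Newey-West t (lags=4, assumed):", newey_west_tstat(spreads, lags=4))
    print("annualized Sharpe:", annualized_sharpe(spreads))
```

With `lags=0` the Newey-West statistic reduces to the ordinary t-statistic of the mean (using the population variance), which gives a quick sanity check on the estimator.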