Predicting company growth is a critical yet challenging task because observed dynamics blend an underlying structural growth trend with volatile fluctuations. Here, we propose a Scaling-Theory-Informed Machine Learning (STIML) framework that combines a scaling-based growth model, which captures the mechanism-driven average trend, with a data-driven forecasting model that learns the residual fluctuations. Using Compustat annual financial statement data (1950–2019) for 31,553 North American companies, we extend the growth model beyond assets to multiple financial indicators and evaluate STIML against growth-model-only and purely data-driven baselines. Across 16 target variables, we show that company growth exhibits a clear separation between trend-driven and fluctuation-driven predictability, with their relative importance depending strongly on company size and volatility. Interpretability analyses further show that STIML captures multivariate dependencies beyond simple autocorrelation, and that macroeconomic variables contribute significantly less to predictive performance on average. Moreover, we find that the scaling-based growth model overlooks asymmetric deviations, which instead contain structured and learnable signals, suggesting a path to refine mechanistic models.
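The trend-plus-residual decomposition described above can be illustrated with a minimal sketch. This is not the STIML implementation: the linear map in log space is only a stand-in for the scaling-based growth model, the AR(1) fit is only a stand-in for the data-driven forecasting model, and the names `fit_line` and `hybrid_forecast` are hypothetical. The series is synthetic, not Compustat data.

```python
import numpy as np

def fit_line(x, y):
    """Ordinary least squares for y ≈ intercept + slope * x."""
    slope, intercept = np.polyfit(x, y, 1)
    return intercept, slope

def hybrid_forecast(log_size):
    """One-step-ahead forecast as trend prediction plus predicted residual.

    Assumption: a linear map x[t+1] ≈ a + b * x[t] in log space replaces
    the mechanistic trend model, and an AR(1) model without intercept
    replaces the learned residual model.
    """
    log_size = np.asarray(log_size, dtype=float)
    x_now, x_next = log_size[:-1], log_size[1:]

    # Step 1: fit the average trend and compute what it leaves unexplained.
    a, b = fit_line(x_now, x_next)
    resid = x_next - (a + b * x_now)

    # Step 2: fit the residual model on consecutive residual pairs.
    denom = float(resid[:-1] @ resid[:-1])
    phi = float(resid[:-1] @ resid[1:]) / denom if denom > 0 else 0.0

    # Step 3: combine both components for the next period.
    trend_pred = a + b * log_size[-1]
    resid_pred = phi * resid[-1]
    return trend_pred, resid_pred, trend_pred + resid_pred

# Synthetic log-size series: steady growth plus noise (illustrative only).
rng = np.random.default_rng(0)
series = np.cumsum(0.05 + 0.02 * rng.standard_normal(40))
trend, fluctuation, total = hybrid_forecast(series)
print(f"trend={trend:.4f} fluctuation={fluctuation:.4f} total={total:.4f}")
```

Keeping the two components separate makes it possible to inspect how much of a forecast comes from the average trend and how much from the learned fluctuations.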