Growing instability in global and domestic economic environments has increased the risk of financial distress at the household level. Traditional econometric models, however, often rely on delayed and aggregated data, which limits their usefulness for early warning. This study introduces a machine learning-based early warning system that uses digital and macroeconomic signals to identify household financial distress in near real time. Using a panel dataset of 750 households tracked over three monitoring rounds spanning 13 months, the framework combines socioeconomic attributes, macroeconomic indicators (such as GDP growth, inflation, and foreign exchange fluctuations), and digital economy measures (including ICT demand and market volatility). During preprocessing and feature engineering, we construct lagged variables, volatility measures, and interaction terms to capture both gradual and sudden changes in financial stability. We benchmark baseline classifiers (logistic regression and decision trees) against ensemble models (random forests, XGBoost, and LightGBM). Our results indicate that the engineered digital economy features significantly enhance predictive accuracy. The system performs reliably for both binary distress detection and multi-class severity classification, and SHAP-based explanations identify inflation volatility and ICT demand as key predictors. The framework is also designed for scalable deployment in national agencies and low-bandwidth regional offices, so that it remains accessible to policymakers and practitioners. By applying machine learning in a transparent and interpretable manner, this study demonstrates the feasibility and impact of near-real-time early warnings of financial distress, offering actionable insights that can strengthen household resilience and guide preemptive intervention strategies.
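The feature engineering step described above (lagged variables, volatility measures, and interaction terms on a household panel) can be illustrated with a minimal pandas sketch. This is not the study's pipeline: the column names `household_id`, `round`, `inflation`, and `ict_demand`, the choice of a one-round lag, the expanding standard deviation as the volatility measure, and the toy values are all assumptions made for illustration. It also assumes each household row carries the indicator value relevant to it in that round.

```python
import pandas as pd


def engineer_features(panel: pd.DataFrame) -> pd.DataFrame:
    """Add lag, volatility, and interaction features to a household panel.

    All column names are hypothetical; the abstract does not specify a schema.
    """
    out = panel.sort_values(["household_id", "round"]).copy()
    by_household = out.groupby("household_id")

    # Lagged variable: the previous round's inflation for the same household.
    out["inflation_lag1"] = by_household["inflation"].shift(1)

    # Volatility measure: standard deviation of inflation over all rounds
    # observed so far for the household (undefined until two rounds exist).
    out["inflation_vol"] = by_household["inflation"].transform(
        lambda s: s.expanding(min_periods=2).std()
    )

    # Interaction term: inflation combined with ICT demand.
    out["inflation_x_ict"] = out["inflation"] * out["ict_demand"]
    return out


# Toy panel: two households observed over three monitoring rounds.
panel = pd.DataFrame(
    {
        "household_id": [1, 1, 1, 2, 2, 2],
        "round": [1, 2, 3, 1, 2, 3],
        "inflation": [2.0, 4.0, 6.0, 1.0, 1.0, 3.0],
        "ict_demand": [10.0, 12.0, 11.0, 5.0, 6.0, 7.0],
    }
)
features = engineer_features(panel)
print(features)
```

With only three rounds per household, the first round has no lag or volatility value, so a downstream model would need to either drop those rows or use a learner that tolerates missing values.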