Survival analysis has become a standard approach for modelling time to default with time-varying covariates in credit risk. Most existing methods implicitly assume a stationary data-generating process; in practice, however, mortgage portfolios are exposed to various forms of data drift caused by changing borrower behaviour, macroeconomic conditions and policy regimes, among other factors. This study investigates the impact of data drift on survival-based credit risk models and proposes a dynamic joint modelling framework to improve robustness in non-stationary environments. The proposed model integrates a longitudinal behavioural marker derived from balance dynamics with a discrete-time hazard formulation, combined with landmark one-hot encoding and isotonic calibration. Three types of data drift (sudden, incremental and recurring) are simulated and analysed on mortgage loan datasets from Freddie Mac. Experiments show that the proposed landmark-based joint model consistently outperforms classical survival models, tree-based drift-adaptive learners and gradient boosting methods in terms of discrimination and calibration across all drift scenarios, which supports the proposed model design.
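To make the three drift types concrete, the following is a minimal sketch of how sudden, incremental and recurring drift could be injected into a single covariate and passed through a logistic discrete-time hazard. It is not the study's simulation procedure: the function names (`drift_shift`, `hazard`, `survival_curve`), the drift start month, magnitude, ramp length, period and the hazard coefficients are all illustrative assumptions.

```python
import math


def drift_shift(t, kind, start=24, magnitude=1.0, ramp=12, period=12):
    """Return the covariate shift applied at month t under a given drift type.

    All default parameter values are illustrative assumptions.
    """
    if kind == "sudden":
        # Step change: the full shift applies from `start` onwards.
        return magnitude if t >= start else 0.0
    if kind == "incremental":
        # Linear ramp from 0 to `magnitude` over `ramp` months, then flat.
        return magnitude * min(max((t - start) / ramp, 0.0), 1.0)
    if kind == "recurring":
        # After `start`, the shift switches on and off in alternating
        # blocks of `period` months.
        if t < start:
            return 0.0
        return magnitude if ((t - start) // period) % 2 == 0 else 0.0
    raise ValueError(f"unknown drift type: {kind}")


def hazard(x, intercept=-4.0, beta=0.8):
    """Logistic discrete-time hazard for one loan-month with covariate x.

    The intercept and slope are hypothetical, chosen only for illustration.
    """
    return 1.0 / (1.0 + math.exp(-(intercept + beta * x)))


def survival_curve(kind, horizon=60):
    """Probability of no default through each month under the given drift."""
    surv, curve = 1.0, []
    for t in range(horizon):
        # Survival is the running product of (1 - hazard) over months.
        surv *= 1.0 - hazard(drift_shift(t, kind))
        curve.append(surv)
    return curve


if __name__ == "__main__":
    for kind in ("sudden", "incremental", "recurring"):
        print(kind, round(survival_curve(kind)[-1], 4))
```

Under these assumed settings the sudden scenario keeps the elevated hazard for the longest stretch, so its survival curve ends lowest; a model fitted only on pre-drift months would see none of the three patterns.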