We introduce new approaches for forecasting IBNR (Incurred But Not Reported) frequencies from individual claims data, which include the accident date, the reporting delay, and possibly additional features for every reported claim. A key element of our proposal is the computation of development factors, which may depend on both the accident date and other features. These development factors serve as the basis for predictions. While we assume close-to-continuous observation of accident date and reporting delay, the development factors can be expressed at any level of granularity, such as months, quarters, or years, and predictions across different granularity levels are coherent. The calculation of development factors relies on the estimation of a hazard function in reverse development time, and we present three distinct methods for estimating this function: the Cox proportional hazards model, a feed-forward neural network, and eXtreme gradient boosting. In all three cases, estimation is based on the same partial likelihood, which accommodates left truncation and ties in the data. While the first is a semi-parametric model that partly assumes a log-linear structure, the two machine learning approaches assume only that the baseline and the other factors are multiplicatively separable. The approach shows promising results in an extensive simulation study and a real-world data application.
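To illustrate how a reverse-time hazard yields development factors, the following is a minimal sketch under simplifying assumptions that are not taken from the paper: delays are discrete, there are no covariates, and the hazard is estimated by a plain nonparametric ratio rather than by the Cox, neural network, or gradient boosting estimators described above. The function names `development_factors` and `factors_to_ultimate` and the toy data are hypothetical. In this simplified setting, right truncation of the reporting delay (a claim is seen only if accident period plus delay does not exceed the evaluation date `tau`) becomes left truncation in reverse time, and the factor from delay `t-1` to `t` is `1 / (1 - hazard(t))`.

```python
import numpy as np


def development_factors(accident, delay, tau, max_delay):
    """Estimate discrete development factors from individual claims.

    factors[t] is the factor from delay t-1 to delay t (factors[0] = 1).
    Assumes every claim passed in is observed, i.e. accident + delay <= tau.
    """
    accident = np.asarray(accident)
    delay = np.asarray(delay)
    max_obs = tau - accident  # largest delay observable for each claim
    factors = np.ones(max_delay + 1)
    for t in range(1, max_delay + 1):
        # Claims reported exactly at delay t (events in reverse time).
        events = np.sum(delay == t)
        # Reverse-time risk set: reported by delay t, and delay t is observable.
        at_risk = np.sum((delay <= t) & (max_obs >= t))
        hazard = events / at_risk if at_risk > 0 else 0.0
        factors[t] = 1.0 / (1.0 - hazard) if hazard < 1.0 else np.inf
    return factors


def factors_to_ultimate(factors):
    """to_ult[t] = product of factors[t+1:], the factor from delay t to ultimate."""
    return np.append(np.cumprod(factors[::-1])[::-1][1:], 1.0)


# Hypothetical toy data: (accident period, delay) -> number of reported claims.
counts = {(0, 0): 4, (0, 1): 2, (0, 2): 1, (1, 0): 6, (1, 1): 3, (2, 0): 5}
repeats = list(counts.values())
accident = np.repeat([a for a, _ in counts], repeats)
delay = np.repeat([d for _, d in counts], repeats)
tau = 2

factors = development_factors(accident, delay, tau=tau, max_delay=2)
to_ult = factors_to_ultimate(factors)

# IBNR per accident period: reported count times (factor to ultimate - 1).
reported = np.bincount(accident, minlength=3)
ibnr = reported * (to_ult[tau - np.arange(3)] - 1.0)
print(factors, ibnr)
```

On aggregated data of this kind the factors coincide with classical chain-ladder factors, which is why the toy example is built from a small run-off triangle; the individual-level, feature-dependent estimators in the paper go beyond this sketch.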